4 June 2026
16:30 – 18:00
ŠIBENIK II
Presentation title
Implementing an Innovative Web-Based Data Collection Application for the Household Budget Survey
This contribution describes the development, testing, and initial implementation of a progressive web-based application at the Statistical Office of the Republic of Slovenia, designed to complement the expenditure diary in the Household Budget Survey.
The application was developed in spring 2025 within the EU-funded grant project HBS 2026 Innovative Tools and Sources and is specifically intended for the collection of digital receipts, while the traditional paper-based diary remains in use for recording expenditures supported by paper receipts.
The application enables respondents to submit digital receipts directly from retailers’ online applications. It does not replace the paper diary but represents an additional digital data collection channel within a mixed-mode diary design. The application was developed with a strong emphasis on simplicity, low respondent burden, and process standardisation, incorporating embedded logic and basic validation rules to support data quality at the point of collection.
Following development, the application underwent several testing phases. Internal testing focused on technical stability, data transmission, and validation mechanisms. In autumn 2025, cognitive testing and pilot testing were conducted to assess usability, respondent understanding, and interaction between the paper-based and digital components of the diary. Results indicated limited awareness and routine use of digital receipts among respondents. However, once introduced, the application was assessed as easy to use and clearly structured. Respondents emphasised the importance of interviewer support at the initial stage, particularly short demonstrations of application use. Simplicity and clarity were consistently identified as key strengths from a quality perspective.
In 2026, the application is deployed in regular fieldwork as part of the Household Budget Survey. In parallel, additional methodological components are being developed and tested. These include the implementation of optical character recognition for processing paper receipts collected via the paper diary, building on experience from the 2022 survey cycle, and the development of a new machine learning model for expenditure classification based on previous findings. While these components are not yet fully operational, the contribution discusses quality assurance considerations related to their integration, including error detection, validation strategies, transparency, and process control.
The described mixed-mode diary approach illustrates how digital tools can be integrated into established survey instruments to support quality modernisation. By embedding quality management directly into data collection processes and enabling methodological extensions, the approach contributes to more standardised and quality-focused household expenditure statistics production.
Urška Pirnat
Statistical Office of the Republic of Slovenia
Urška Pirnat is a senior adviser at the Statistical Office of the Republic of Slovenia. She works primarily on the Household Budget Survey, focusing on survey methodology, data collection processes, and quality assurance. Her recent work includes the development and implementation of innovative digital tools for expenditure data collection, integration of new data sources, and preparation of methodological solutions for the HBS 2026 cycle. Her research interests include quality management in official statistics, mixed-mode survey designs, use of administrative and alternative data sources, and the application of modern technologies to improve data quality and efficiency in statistical production. She is also involved in methodological work related to the use of machine learning techniques for expenditure classification.
Presentation title
Data quality in diary-based household surveys (HBS & TUS) in times of digitalization
The digital transformation of official statistics is fundamentally changing the way diary-based household surveys are conducted and processed.
This work describes recent developments in the Time Use Survey (TUS) and the Household Budget Survey (HBS), focusing on the introduction of smartphone app-based data collection, the use of machine learning (ML) for classification tasks, and emerging artificial intelligence (AI) applications in survey production.
Both TUS and HBS rely on respondent-recorded diary information and free-text entries, which have traditionally required extensive manual coding and editing from the perspective of official statistics. App-based diary instruments are not only introduced to improve processing efficiency but also to better support respondents by providing an easy-to-use, readily available tool that aligns more closely with everyday recording behavior and aims to reduce respondent burden. The digital mode enables real-time plausibility checks, contextual prompts, and guidance during data collection. However, introducing such a new mode next to an existing pen-and-paper approach can result in differences and biases between modes. Using empirical evidence from parallel pen-and-paper and app-based data collection, results on mode differences between diary instruments will be presented, focusing on achieved diary completeness and structural differences in reporting as proxies for data quality.
In addition to changes in data collection modes, ML-based classification systems have been implemented to support the coding of free-text diary entries into standard classifications such as the Classification of Individual Consumption by Purpose (COICOP). These systems aim to increase efficiency and consistency while reducing manual workload. In HBS, AI-based receipt scanning technology is used to extract and classify expenditure information directly from receipts. Comparable AI-supported approaches are planned for TUS, particularly for the integration of geolocation (GEO) tracking data to enhance contextual information on activities and mobility patterns.
While the current contribution focuses on observed mode effects, it also highlights an emerging challenge for quality management in official statistics: the need to systematically assess how ML- and AI-based systems themselves influence data quality. Dedicated evaluations are currently under preparation and will be addressed in future analyses.
Jerome Olsen
Federal Statistical Office of Germany
Jerome Olsen is a psychologist and holds a doctorate from the University of Vienna. He subsequently worked as a Senior Research Fellow at the Max Planck Institute for Research on Collective Goods and joined the Federal Statistical Office of Germany in 2021. Since 2022, he has headed the unit “Methodology of Voluntary Household Surveys”.
Presentation title
Generalizing Multimode Data Collection in Official Statistics: Implementation, Risks, and Methodological Safeguards at Insee
Over the past decade, Insee, the French Public Statistical System, and survey-producing institutions more broadly have increasingly adopted new data collection modes, driven by efficiency, cost reduction, and modernization objectives.
In particular, the expansion of web-based data collection—often combined with more traditional modes such as face-to-face interviews or paper questionnaires—has become a strategic priority. This shift aims to counter declining response rates, meet the expectations of certain respondent groups, and align national practices with international standards.
The transition accelerated sharply during the COVID-19 crisis in 2020, when social distancing measures made face-to-face interviews impossible. Many surveys were therefore conducted by telephone, internet, or using mixed-mode designs. This period demonstrated that these approaches can provide an effective and relevant means of collecting information, even for long and complex questionnaires. As a result, they are now fully integrated into the survey statistician’s methodological toolkit when designing data collection protocols.
However, fully self-administered modes such as web questionnaires can increase the risk of strong selection bias among respondents, as well as changes in the indicators measured. Their development has therefore gone hand in hand with advances in methods for addressing mode effects and endogeneity risks—an area of work that began in the 2010s through testing and experimental designs. Alongside the rollout of multimode data collection, Insee has developed methodological guidelines to safeguard data quality, with particular attention to temporal consistency. Numerous experiments are conducted upstream of data collection to assess the impact of introducing new modes on key indicators. The results of these tests are used to refine and adapt survey protocols, thereby limiting measurement effects and ensuring data reliability. This approach has been applied, for example, to the Housing Survey, the Time Use Survey, and the Household Budget Survey, which underwent a dedicated test in 2024 ahead of its full-scale collection in 2026.
This article presents the deployment of multimode data collection in Insee surveys and the systematic monitoring put in place to better document the associated results. It then reviews the methodological developments designed to ensure the quality of the data collected. Finally, drawing on the Household Budget Survey, it illustrates the introduction of new data collection modes in a social survey, the risks they may entail, and the methodological adaptations implemented to preserve the robustness and quality of the results.
Claire-Lise Dubost, Julie Solard
Insee
Claire-Lise Dubost works at Insee on the Household Budget Survey within the Household Living Conditions Division. Her work consists of designing the survey, preparing data collection, analysing methodological results and processing the final data.
Presentation title
Consumption expenditure categorization through dictionary-aided smart search in the mobile app for HBS respondents
The Italian Household Budget Survey (HBS) focuses on the consumption expenditure behaviour of households residing in Italy (levels, composition) and analyses it according to the households' main social, economic and geographical characteristics.
The HBS is traditionally performed in Italy using two CAPI interviews (initial and final) and a 14-day expenditure paper diary.
The main collection techniques are therefore PAPI and CAPI, and data collection extends over two or three calendar months.
Response rates for the HBS are slowly declining over time, especially because of the rather heavy burden on respondents.
In particular, households seem to perceive filling in the 14-day expenditure diary as “time consuming”, which argues for more attractive modes of data collection. To this end, ISTAT has developed a mobile app for respondents as an alternative tool for collecting data traditionally gathered only through a paper diary.
Such an app can support the traditional manual recording of expenditure items, but also offers the opportunity to drastically reduce the average time needed for this task.
This is achieved by implementing a smart search based on three main pillars:
1. the definition of a comprehensive dictionary of synonyms and related terms in the app metadata;
2. the use of speech-to-text technology to allow users to provide input more quickly;
3. the use of fuzzy text search, allowing for more flexible matching between user input and metadata, particularly useful to smooth out any inaccuracies due to incorrect voice transcription.
The system still allows for easy control and correction by the respondent if the suggestions prove to be incorrect.
Such a tool holds significant promise for transforming official statistics: it provides a robust approach to managing the growing volume and complexity of information and reduces the impact of non-sampling errors, thereby enabling statistical institutes to collect higher-quality data.
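The three pillars listed above can be illustrated with a minimal sketch. It is not the ISTAT implementation: the dictionary entries, the category labels, the function name `smart_search` and the similarity cutoff are all hypothetical, and Python's standard `difflib` matcher stands in for whatever fuzzy search the app actually uses. The speech-to-text step is assumed to have already produced the query string.

```python
from difflib import get_close_matches

# Hypothetical dictionary of synonyms and related terms (pillar 1).
# Keys are search terms; values are illustrative expenditure category labels.
SYNONYMS = {
    "bread": "Bread and cereals",
    "loaf": "Bread and cereals",
    "baguette": "Bread and cereals",
    "milk": "Milk, cheese and eggs",
    "cheese": "Milk, cheese and eggs",
    "petrol": "Fuels for transport",
    "gasoline": "Fuels for transport",
    "diesel": "Fuels for transport",
}


def smart_search(query, max_results=3, cutoff=0.6):
    """Return candidate categories for a typed or transcribed query (pillar 2 input)."""
    term = query.strip().lower()

    # An exact dictionary hit needs no fuzzy matching.
    if term in SYNONYMS:
        return [SYNONYMS[term]]

    # Fuzzy matching (pillar 3) tolerates typos and transcription slips.
    close_terms = get_close_matches(term, list(SYNONYMS), n=max_results, cutoff=cutoff)

    # Several close terms may share a category; keep each category once, in rank order.
    categories = []
    for close_term in close_terms:
        category = SYNONYMS[close_term]
        if category not in categories:
            categories.append(category)
    return categories


if __name__ == "__main__":
    for spoken in ["Loaf", "gasolin", "xyzzy"]:
        print(spoken, "->", smart_search(spoken))
```

In this sketch the function returns a short ranked list of suggestions, and an empty list when nothing is close enough, so that the respondent can still confirm, correct or enter the item manually.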
Marco Silipo
ISTAT
Head of Unit, Senior IT Officer at the Italian National Institute of Statistics (IT Department).
Responsible for the design and implementation of statistical business tools in compliance with UNECE international standards like CSPA, GSBPM and GSIM.
Previously, team leader in the development of several ISTAT apps, among which: the 2011 Italian Population Census app, the 2012 Industry Census app, the Consumer Price survey app (2008-2010).
Author of most of the mobile apps available for Android and iOS on ISTAT official stores.
20+ years of experience with Java and JavaScript, fostering the adoption of full stack development and microservices patterns.
Freelance instructor in full stack development topics like Angular and Java technologies.
Promoter of DevOps practices.
Practitioner in the field of AI.
Former STC for the World Bank Group, involved in the design and implementation of an innovative mobile app for Household and Budget Income survey data collection.
Presentation title
Using Administrative Data to Improve Survey Quality: Evidence from the EU-SILC Survey in Luxembourg
The EU Statistics on Income and Living Conditions (EU-SILC) has been conducted annually in Luxembourg since 2003.
Each year, the survey collects detailed microdata on income, poverty, and social exclusion from representative samples of the country’s resident population, and serves as the main reference source for official poverty and inequality indicators. Since its launch, EU-SILC has faced significant methodological and operational challenges, largely driven by the length and complexity of the questionnaire. These factors have contributed to high non-response rates and may have affected the survey’s representativeness.
In an effort to further enhance the overall quality of the survey, STATEC launched a collaboration with the General Inspectorate of Social Security (IGSS) to replace income-related survey questions with information sourced directly from administrative databases. This approach substantially reduced the length of the questionnaire and is expected to improve the quality of survey responses, especially for income measures. In addition, income aggregates derived from administrative records were integrated into the sample weighting procedure using a penalised calibration method, as implemented in the R package icarus.
In this contribution, we present the first results from the 2025 EU-SILC in Luxembourg, in which administrative sources were used for the first time. We examine the impact of several weighting scenarios on the survey estimates and compute standard error measures for the main EU-SILC indicators to assess the potential gains in accuracy compared with the fully survey-based approach. Overall, the findings point to a significant improvement in both the robustness and precision of the resulting indicators.
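The idea behind penalised calibration can be sketched as follows. This is a generic ridge-type formulation with a linear distance, written in plain Python; it does not use or reproduce the icarus API, and the function names, the cost values and the toy data are all assumptions made for illustration. Each calibration constraint gets a cost: a very large cost forces the weighted total to hit its target, while a small cost lets it deviate in exchange for weights that stay closer to the design weights.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def penalised_calibration(design_weights, aux, targets, costs):
    """Return calibrated weights w_i = d_i * (1 + x_i . lam).

    lam solves (sum_i d_i x_i x_i^T + diag(1 / costs)) lam = targets - sum_i d_i x_i,
    which minimises sum_i (w_i - d_i)^2 / d_i plus the cost-weighted squared
    deviations of the weighted totals from their targets.
    """
    p = len(targets)
    a = [[sum(d * x[j] * x[k] for d, x in zip(design_weights, aux)) for k in range(p)]
         for j in range(p)]
    for j in range(p):
        a[j][j] += 1.0 / costs[j]  # a larger cost means a tighter constraint
    gap = [targets[j] - sum(d * x[j] for d, x in zip(design_weights, aux))
           for j in range(p)]
    lam = solve(a, gap)
    return [d * (1.0 + sum(l * v for l, v in zip(lam, x)))
            for d, x in zip(design_weights, aux)]


if __name__ == "__main__":
    # Toy data (made up): four respondents, auxiliaries = (intercept, income).
    d = [10.0, 10.0, 10.0, 10.0]
    x = [[1.0, 20.0], [1.0, 35.0], [1.0, 50.0], [1.0, 80.0]]
    targets = [42.0, 2000.0]  # hypothetical population count and income total
    for costs in ([1e12, 1e12], [1e12, 1e-4]):
        w = penalised_calibration(d, x, targets, costs)
        income_total = sum(wi * xi[1] for wi, xi in zip(w, x))
        print(costs, round(sum(w), 3), round(income_total, 3))
```

With both costs very large the sketch behaves like ordinary linear calibration; lowering the cost on the income constraint relaxes that margin only, which is the kind of trade-off a comparison of weighting scenarios would explore.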
Guillaume Osier
STATEC
Guillaume Osier is a statistician at Luxembourg’s National Statistical Office (STATEC), where he heads the Living Conditions Unit within the Social Statistics Department. In this role, he manages major social survey programmes, including the EU Statistics on Income and Living Conditions (EU-SILC) and the Household Budget Survey (HBS). He is also an adjunct lecturer in survey statistics at the University of Luxembourg.