Speed talk session 1

Process and Risk Management

3 June 2026
13:15 – 14:00
ŠIBENIK II

Presentation title
Applying AI to Improve Data Quality in the Household Budget Survey
The automation of receipt processing in the Household Budget Survey (HBS) has delivered a significant advancement in data quality and collection efficiency within this survey.

We have developed and integrated an in-house automated system based on Optical Character Recognition (OCR) and Machine Learning (ML).

This initial system already represents a drastic shift from the former process of entirely manual transcription. With an annual volume of over 60,000 receipts, equivalent to approximately 400,000 products, this automation substantially reduces the burden on interviewers and respondents, saves significant work and minimizes transcription errors. The workflow begins with the upload of the purchase receipt image, which is routed through a Python routine that first applies OCR to extract raw text and then employs an ML model to classify each item automatically with its COICOP code.
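The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the NSI's implementation: the OCR engine is passed in as a function, a keyword lookup stands in for the trained ML model, and the names `process_receipt`, `classify` and `fake_ocr`, as well as the codes in the lookup, are illustrative assumptions rather than an authoritative COICOP mapping.

```python
import re
from typing import Callable, NamedTuple


class Item(NamedTuple):
    description: str
    price: float
    coicop: str


# Hypothetical keyword lookup standing in for the trained ML classifier.
# The codes are illustrative placeholders only.
KEYWORD_TO_CODE = {"bread": "01.1.1", "milk": "01.1.4"}
UNCLASSIFIED = "UNKNOWN"

# An item line: a description followed by a price such as 1.20 or 1,20.
LINE_RE = re.compile(r"^(?P<desc>.+?)\s+(?P<price>\d+[.,]\d{2})$")


def classify(description: str) -> str:
    """Return a code for the item, or UNCLASSIFIED if no keyword matches."""
    text = description.lower()
    for keyword, code in KEYWORD_TO_CODE.items():
        if keyword in text:
            return code
    return UNCLASSIFIED


def process_receipt(image: bytes, ocr: Callable[[bytes], str]) -> list[Item]:
    """Step 1: OCR the image. Step 2: classify each recognised item line."""
    items = []
    for line in ocr(image).splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # header, footer or unreadable line
        desc = match["desc"].strip()
        if desc.upper().startswith("TOTAL"):
            continue  # totals are not products
        price = float(match["price"].replace(",", "."))
        items.append(Item(desc, price, classify(desc)))
    return items


def fake_ocr(image: bytes) -> str:
    """Stand-in for a real OCR engine so that the sketch runs on its own."""
    return "SUPERMARKET XYZ\nWhole milk 1L   0,95\nWhite bread 1.20\nTOTAL 2.15\n"


items = process_receipt(b"", fake_ocr)
```

Keeping the OCR step behind a plain function argument means the same routine could be pointed at a different engine or an external service without touching the classification step.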

While the current ML-based prototype has delivered initial functionality and operates at zero operational API cost, its full utilization presents some quality challenges that require mitigation, specifically related to:

• Input Capture: Non-Standardized Photos (Full receipt vs. Itemized only), requiring excess cropping/data cleaning.

• Source Readability: Retailer Inconsistency (varying performance across shops) due to different layouts.

• Image Quality: Physical Defects (blur, shadows, creases, markings) which directly degrade OCR accuracy.

These drawbacks necessitate manual data cleaning and have a hidden cost in manual labour. To move the project toward the full potential of high-quality data, the immediate future step is the adoption and integration of an Advanced AI System.

To overcome these constraints, this project proposes adopting external, advanced AI services (multimodal LLMs or specialized Document AI APIs). That means evolving from a Python-based ML solution to a commercial, API-driven Artificial Intelligence service, which will be integrated into the NSI data collection platform. This solution is poised to deliver a dramatic quality uplift by:

• Semantic Document Understanding: Utilizing sophisticated AI models to identify and extract all required data from full receipts with high accuracy, regardless of image quality or complex, non-standard layouts.

• Contextual Classification: Leveraging deep NLP to perform robust, contextual COICOP classification.

This technological shift directly addresses the root causes of data processing errors, delivering cleaner data for official statistics. An external validation review will critically check the AI's accuracy, ensuring the cost-for-quality trade-off remains favourable and underpins public trust.

Main author / Presenter
Olga Gonzalez
National Statistics Institute

I am a mathematician at the National Statistics Institute of Spain, where I specialize in data analysis, Python programming and survey monitoring. I hold a degree in Mathematics from the University of Santiago de Compostela and a Master’s in Big Data Analytics from the University of Carlos III in Madrid. My professional background includes four years of experience in the private sector as a Data Scientist and nearly three years in official statistics at the NSI. Currently, I use Python, SAS and JavaScript for both development projects and exploratory trials. In addition, an important part of my work involves monitoring KPIs to ensure the effectiveness of data collection and following up on the performance of various national surveys.

Presentation title
Adapting the Classification of Products by Activity for Administrative Use: An AI-Supported Approach
The introduction of mandatory e-Invoicing and fiscalisation in Croatia from 2026 requires that every product and service listed on an invoice be assigned a correct six-digit code from the Classification of Products by Activity (CPA/KPD).

Although CPA/KPD is a statistical classification designed primarily for the production of harmonised European statistics, it is now being adapted for administrative and fiscal purposes. This shift presents methodological challenges, as statistical classifications are structured to support comparability, analytical consistency and long-term stability, whereas administrative systems require operational precision, immediate applicability and strict regulatory alignment. The discrepancy between statistical and administrative processes highlights the need for innovative solutions that can bridge these two fundamentally different domains without compromising the methodological integrity of the statistical system.

To address these challenges, the Croatian Bureau of Statistics (CBS) is developing a multi-stage, artificial intelligence-supported procedure that assists users in identifying the correct CPA/KPD code for fiscalisation purposes. The procedure is designed to accommodate different levels of user expertise, different business contexts and varying degrees of product complexity.

In the first stage, the system proposes a likely CPA/KPD code based on the main activity of the business entity, as defined by its NACE/NKD classification. This provides an initial orientation by linking the enterprise’s dominant economic activity with the most relevant product categories.

The second stage introduces an interactive, AI-driven chat interface. Users can describe their product or service in natural language, and the system interprets the description, compares it with CPA/KPD definitions and suggests the most appropriate code. This stage is particularly valuable when the product is not directly inferable from the main activity or when additional contextual clarification is needed.

The third stage supports structured, large-scale inquiries. Users complete an Excel template by entering a list of products or services along with detailed descriptions. The completed file is submitted to CBS, where an AI system evaluates each entry and assigns the correct CPA/KPD code. This stage is especially suited for businesses with extensive product catalogues or complex service portfolios.

By integrating statistical methodology with modern AI tools, CBS aims to ensure accurate, consistent and user-friendly coding support while maintaining the conceptual integrity of the CPA/KPD classification. This approach demonstrates how statistical infrastructures can be responsibly adapted to meet emerging administrative needs.

Main author / Presenter
Jasmina Dautbegović
Croatian Bureau of Statistics

Jasmina Dautbegović is a statistical specialist at the Croatian Bureau of Statistics with extensive experience in data processing and the implementation of statistical surveys in line with the Annual Implementation Plan. She has worked across multiple statistical domains, ensuring methodological consistency and high-quality data production. Jasmina is currently engaged in the codification of business entities using the Classification of Products by Activity (CPA/KPD) and has contributed to the development of the artificial intelligence-based application designed to support codification processes. Her work reflects strong analytical skills, precision and a commitment to modernising statistical operations.

Helena Brkanović is a statistical expert at the Croatian Bureau of Statistics with extensive experience in the registration and classification of business entities within the Administrative Business Register. Her deep knowledge of NACE Rev. 2.1 has been essential in ensuring the accurate determination of main activities and in supporting the consistent application of statistical classifications.


CO-AUTHOR:

Helena Brkanović, Croatian Bureau of Statistics

Presentation title
Automatic Extraction of key variables from financial statements
The Statistical Business Register, maintained by the Statistical Service of Cyprus (CYSTAT), requires continuous updates to maintain accurate business information.

Traditional data collection methods, however, can place a high administrative burden on businesses and incur significant manual processing costs for NSIs. This presentation will outline an automatic data extraction algorithm implemented at CYSTAT and designed to lessen the load on both businesses and NSIs.

The developed system processes financial statements submitted by companies in PDF format to the Department of Registrar of Companies and Intellectual Property (Registrar Department), extracting data in a three-step process:

[Step 1] A Robotic Process Automation (RPA) solution monitors and retrieves newly uploaded financial statements for pre-specified businesses, directly from the Registrar Department’s platform.

[Step 2] The downloaded documents are passed via API to a large language model (LLM), where key variables are requested and, if found, extracted.

[Step 3] Because an LLM is used, the results are not error-proof and hence cannot be automatically reflected in the Business Register. Instead, the extracted records enter a human-in-the-loop verification step where reviewers validate and correct results, while suggesting improvements to the LLM prompt as necessary.
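The hand-off between Steps 2 and 3 can be sketched as follows. This is an illustrative outline under stated assumptions, not CYSTAT's implementation: the LLM call is replaced by a stub (`fake_llm`), the variable names in `VARIABLES` are hypothetical, and the register is a plain dictionary. The point it shows is that an extraction starts as pending and can only reach the register after a reviewer has approved it.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical variable names; the abstract does not list the real ones.
VARIABLES = ("turnover", "total_assets", "employees")


@dataclass
class Extraction:
    company_id: str
    values: dict[str, Optional[float]]
    status: str = "pending_review"  # never "approved" straight from the LLM


def extract(company_id: str, pdf: bytes,
            llm: Callable[[bytes, tuple[str, ...]], dict]) -> Extraction:
    """Step 2: request the key variables; anything not found stays None."""
    raw = llm(pdf, VARIABLES)
    values = {name: raw.get(name) for name in VARIABLES}
    return Extraction(company_id, values)


def review(extraction: Extraction, corrections: dict[str, float]) -> Extraction:
    """Step 3: a reviewer applies corrections and approves the record."""
    extraction.values.update(corrections)
    extraction.status = "approved"
    return extraction


def update_register(register: dict, extraction: Extraction) -> None:
    """Only reviewed records may change the register."""
    if extraction.status != "approved":
        raise ValueError("unreviewed extractions must not reach the register")
    found = {k: v for k, v in extraction.values.items() if v is not None}
    register.setdefault(extraction.company_id, {}).update(found)


def fake_llm(pdf: bytes, variables: tuple[str, ...]) -> dict:
    """Stand-in for the real LLM API call so that the sketch runs on its own."""
    return {"turnover": 125000.0, "employees": 12.0}


register: dict = {}
record = extract("C-001", b"%PDF-", fake_llm)
update_register(register, review(record, {"employees": 21.0}))
```

Raising an error for unreviewed records, rather than silently skipping them, makes an accidental bypass of the verification step visible during testing.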

The deployment of the system has reduced manual data collection costs while preserving data quality through automated checks and expert review. This presentation will summarize the architecture of the system and its results, and outline some key takeaways on using LLM-assisted processes in official statistics.

Main author / Presenter
Andreas Hadjittofis
Presenter

Andreas has a background in mathematics and a passion for programming. After working as a data analyst in the private sector, he joined CYSTAT 5 years ago and is currently part of the NSI’s Research and Development Unit, collaborating with a multidisciplinary team of developers to drive innovation.


CO-AUTHOR:

Andreas Soteriou, Developer

Presentation title
Embedding Risk Management in Statistical Production: Conceptual Framework and Lessons Learned
The production of official statistics relies on complex processes that are exposed to operational risks likely to affect data quality, timeliness, business continuity and the institutional credibility of national statistical offices.

In response to these challenges, the Swiss Federal Statistical Office (FSO) launched a pilot project aimed at designing and testing a structured risk management system specifically tailored to statistical production.

The project is embedded within the federal risk management framework while addressing the specific needs of statistical activities.

It is based on a shared risk catalogue structured along the phases of the Generic Statistical Business Process Model (GSBPM), an evaluation matrix adapted to statistical impacts, and a clearly defined process covering roles, escalation mechanisms, monitoring and reporting. The approach emphasizes early risk identification by risk owners, complemented by a centralized, cross-cutting perspective.
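A risk catalogue organised by GSBPM phase and scored with an evaluation matrix could be represented along the following lines. This is a hedged sketch only: the abstract does not describe the FSO's actual matrix, so the 1 to 4 scales, the likelihood-times-impact score, the thresholds in `level` and the escalation rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# The eight phases of the GSBPM.
GSBPM_PHASES = (
    "Specify needs", "Design", "Build", "Collect",
    "Process", "Analyse", "Disseminate", "Evaluate",
)


@dataclass(frozen=True)
class Risk:
    title: str
    phase: str       # one of GSBPM_PHASES
    likelihood: int  # assumed scale: 1 (rare) to 4 (frequent)
    impact: int      # assumed scale: 1 (minor) to 4 (severe)
    owner: str

    def __post_init__(self) -> None:
        if self.phase not in GSBPM_PHASES:
            raise ValueError(f"unknown GSBPM phase: {self.phase}")
        for value in (self.likelihood, self.impact):
            if not 1 <= value <= 4:
                raise ValueError("likelihood and impact must be between 1 and 4")

    @property
    def score(self) -> int:
        """Assumed scoring rule: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def level(risk: Risk) -> str:
    """Map a score to a level; the thresholds are illustrative."""
    if risk.score >= 12:
        return "high"
    if risk.score >= 6:
        return "medium"
    return "low"


def to_escalate(catalogue: list[Risk]) -> list[Risk]:
    """Assumed escalation rule: high risks go to the central perspective."""
    return [risk for risk in catalogue if level(risk) == "high"]


# Hypothetical catalogue entries for demonstration.
catalogue = [
    Risk("Late delivery of register data", "Collect", 3, 4, "Unit A"),
    Risk("Outdated documentation", "Evaluate", 2, 2, "Unit B"),
]
escalated = to_escalate(catalogue)
```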

Tested in several pilot units, the system made it possible to validate its operational feasibility, the relevance of the identified risks and the proportionality of the efforts required. The results show that the system enhances transparency, supports managerial decision-making and acts as a structuring lever for integrating risk considerations into quality management and business continuity activities.

This contribution presents the conceptual framework of the system, the main lessons learned from the pilot project, as well as the initial feedback from the first cycle of risk assessment. It aims to contribute to international discussions on the development of structured and proactive risk management practices in the production of official statistics.

Main author / Presenter
Mathieu Gunzinger
Federal Statistical Office (FSO)

Mathieu Gunzinger has been working at the Federal Statistical Office (Switzerland) for over twenty years. He contributed to the development of the quality management framework for the Swiss Business and Enterprise Register and led, together with his teams, the development in 2013 of the first Statistics on the Structure of Enterprises (STATENT) based on administrative registers. After managing the production of this statistic for nearly ten years, he moved in 2023 to cross-cutting quality management activities, where he leverages his experience as a production manager to support office-wide initiatives. Since then, he has conducted process analyses across various statistical activities, carried out quality reviews, and provided guidance on the optimisation of statistical production and support processes. In 2025, he led the pilot project introducing a risk assessment system in statistical production.

Presentation title
Fostering a Quality Culture in ONS (UK): Embedding Risk Management, Lines of Defence, and Championing Curiosity
The Office for National Statistics (ONS) is committed to embedding a robust quality culture across the organisation, one that goes beyond technical standards to become a shared value and collective responsibility and that upholds the principles of the Code of Practice for Statistics.

This presentation will explore how ONS’s risk-based approach to quality management, underpinned by the lines of defence model, is shaping a culture where quality is everyone’s responsibility. We will outline how ONS has adopted a risk management framework, inspired by cross-government best practice, to identify, assess, and address quality risks at every stage of the data lifecycle. This approach enables proportionate decision-making, supports value for money, and ensures compliance with the Code of Practice for Statistics.

The lines of defence model will be discussed as a practical tool for delegating and coordinating quality responsibilities, ensuring that risk management is embedded at all organisational levels.

A central theme will be the role of the ONS Quality Champions Network. Quality Champions act as catalysts for cultural change, sharing best practices, supporting colleagues, and fostering an environment where everyone feels empowered to raise issues and drive continuous improvement. Their work is integral to mitigating strategic risks and sustaining a healthy quality culture.

The presentation will also draw on lessons learnt from the last three years, including the importance of psychological safety, curiosity, and open dialogue. We will highlight how practical tools—such as quality improvement plans, maturity models, and guidance—are supporting teams to prioritise quality, learn from experience, and innovate.

The presentation will offer insights and practical examples of how ONS is building a culture that not only manages risk but also inspires excellence, collaboration, and public trust in official statistics.

Key themes to be covered:

• ONS’s risk management approach to quality, its benefits and challenges, and its practical applications

• The lines of defence model for quality assurance and governance

• The evolving role and impact of Quality Champions

• Cultural enablers: psychological safety, curiosity, and shared responsibility

• Tools and frameworks supporting quality improvement

• Lessons learned and future directions for sustaining a quality culture

Note: A similar topic was presented by a different colleague at Q2022. However, that presentation and abstract were significantly different, as they focused on a few of ONS's tools for managing quality risks and did not cover the full culture element. Our framework and lessons learnt on culture have substantially evolved since then.

Main author / Presenter
Lucia Barbone
UK Office for National Statistics

Lucia Barbone, PhD, leads the Best Practice Assurance & Improvement team at the UK Office for National Statistics, supporting ONS colleagues in upholding the Code of Practice for Statistics, managing risks to quality and embedding best practice in statistical production. Lucia’s work focuses on developing and delivering quality assurance frameworks, providing expert guidance, and leading continuous improvement initiatives across ONS projects. Lucia has worked at ONS since 2020, having previously worked in a range of roles and industries, including academia, consultancy, and international development, and has a strong technical background in data analysis and economics. Lucia is a quality of data & analysis enthusiast, and she is passionate about translating ideas for improvement into action, in a way that adds value and has impact.

Presentation title
Risk Assessment of Innovative Projects in NSIs
In line with the Scheveningen (2013) and Bucharest (2018) memoranda, the European Statistical System (ESS) has been continuously adopting new technologies and methodologies to improve the quality, timeliness and relevance of official statistics.

Over the past decade, numerous innovative projects have been carried out at both European and national levels, notably in the areas of big data (ESSnets Big Data I and II), smart data collection (ESSnets Trusted Smart Surveys), and more recently artificial intelligence and machine learning (AIML4OS).

To support these developments, many National Statistical Institutes (NSIs) have established dedicated innovation structures, such as data science centres and AI laboratories, to conduct experimental and innovative projects. These projects play a key role in enabling methodological and technological change, but they are also characterised by high levels of uncertainty.

Innovation is inherently associated with uncertainty and risk, affecting both project processes and outcomes. Yet risk analysis for innovative projects remains relatively underdeveloped. Traditional risk management approaches, designed primarily for conventional projects, are often ill-suited to innovation, where uncertainty is frequently of a Knightian nature. As a result, risks specific to innovative projects are not always adequately identified or managed.

The first objective of this presentation is to identify the risks specific to innovative projects carried out within NSIs, as proper risk identification is an essential prerequisite for effective risk and quality management. This analysis builds in part on the typology proposed by Orellano and Gourc (2025), which distinguishes between external and internal risks. While external risks are generally well recognised, internal risks tend to be organisation-specific and less visible.

Lessons learned from innovative projects indicate that internal risks, whether financial, organisational, relational or operational, are particularly challenging to anticipate as they depend on existing processes, organisational culture and the risk aversion of the people involved.

The presentation then examines the relevance and potential limitations of introducing risk management into innovative project management. While excessive risk management may encourage risk-averse behaviour and inhibit innovation, the absence of structured risk management may lead to uncontrolled risk-taking. A balanced, quality-oriented approach to risk management can facilitate the controlled transition of experimental solutions into regular statistical production.

Main author / Presenter
Romain Lesur
INSEE

Romain Lesur is the head of the Data Science Lab (SSP Lab) at INSEE, the French National Institute of Statistics and Economic Studies. Prior to his current role, he was head of the IT innovation division at INSEE, which is responsible for the development of the Onyxia project. Previously, he was deputy head of the Statistical Service of the French Ministry of Justice. Before that, he was an economist at the French National Competition Authority and at the French Ministry of Finance.
