Session 28

Quality in Administrative and Multi-Source Statistics

4 June 2026
16:30 – 18:00
ŠIBENIK IV

Presentation title
The use of a Metadata Questionnaire for assessing the quality of administrative sources used in the Statistical Business Register
In recent years, the Hellenic Statistical Authority (ELSTAT) has established itself as a trustworthy partner to which administrative agencies provide their data, and it has streamlined its continuous access to administrative registers and other data sources that are suitable for feeding the Statistical Business Register.

The existence of common unique identifiers in these data sources and in the Statistical Business Register of ELSTAT multiplies the benefits and insights that these data can provide through their interlinking. Apart from the legal provisions and technical arrangements put in place to make this access possible, ELSTAT has developed a structured internal process aimed at maximising the coverage and quality of the Statistical Business Register by using the various administrative data efficiently. This process is based on prioritising the sources in terms of coverage, accuracy and timeliness, separately for each variable of the Statistical Business Register. The prioritisation of the administrative sources' data for each variable should rely on thorough knowledge of the content of these sources. What was missing, however, was the systematic provision by administrative sources of all the information needed to understand and process the administrative data faster and more easily, as these data sources differ in coverage, accuracy, time lag, granularity etc. In response to this need, ELSTAT is applying a quasi bottom-up data-stewardship approach to gain the necessary insight into administrative sources, evaluate them and finally use their data in compiling the Statistical Business Register. In particular, to acquire this knowledge in a systematic, coordinated and homogeneous manner across the available administrative registers and sources, ELSTAT developed a detailed Questionnaire, similar in content to a typical statistical metadata report, which gathers metadata from each of the administrative sources.
The Metadata Questionnaire addressed to administrative sources includes questions on the units of measure, frequency, timeliness, institutional mandate, coverage and underlying population, relevance, quality management, processing, accuracy, comparability, and revision policy and practices. This paper presents ELSTAT's actions for enhancing the data quality of the Statistical Business Register, as materialised through the prioritisation process for feeding the Statistical Business Register with primary data, and the novelties and gains arising from the application of the quasi bottom-up data-stewardship approach for navigating through

Main author / Presenter
Ioanna Kouti; Dimitra Trivizaki
HELLENIC STATISTICAL AUTHORITY

Ioanna Kouti and Dimitra Trivizaki are statistical officers at the Business Statistics Division of the Hellenic Statistical Authority (ELSTAT). With experience in the Statistical Business Register and Structural Business Statistics (SBS), we focus on the collection, processing, and analysis of data regarding enterprise demography and the economic structure of Greek businesses. We hold a Bachelor's degree in Economics and a Master's degree (MSc) in Statistics, combining economic theory with advanced quantitative methods. In our current role, we contribute to the production of high-quality official statistics that align with European standards, ensuring the accuracy of the business framework used for national and international reporting.

Presentation title
Development of a new multi-sourced production process for STS Employment Indices in Greece, supported by R routines.

The Quarterly Short Term Statistics (STS) on Employment Indices produced by the Hellenic Statistical Authority (ELSTAT), as part of European Business Statistics, measure key labour market variables, namely the number of employed persons, hours worked, and wages. The indices cover twelve major sections of economic activity, corresponding to Manufacturing, Construction, Trade and Services. The purpose of the indices is to monitor, on a quarterly basis, changes in employment levels, working time, and remuneration.

Up to 2024, the indices were compiled based on primary data collected through questionnaires as part of a direct sample survey addressed to enterprises, jointly covering the information needs for STS Employment Indices and Labour Cost Indices (LCI). The main challenges related to this survey were the high unit and item non-response, the burden placed on respondents, the resource-intensive nature of this work for ELSTAT and the different processing needs for LCI and STS.

Since 2025, ELSTAT has been implementing a newly designed production scheme that uses administrative data made available through cooperation protocols with the Ministry of Labour and the Social Security Fund. The new scheme includes the revision of STS Employment Indices by assessing, processing, and integrating the available data from administrative sources.

The main objective of this initiative was to enhance the accuracy and reliability of STS Employment Indices and to efficiently simplify the production process by establishing a comprehensive system for the regular quarterly calculation of these indices, encompassing all stages of production, including data receipt and storage, processing, validation, selection of final microdata, and the calculation of indices.

This paper presents the problems encountered when running the direct joint survey, along with the concrete procedural and IT steps followed to design and materialize the assessment, use and processing of primary administrative data, as the basic input for STS Employment Indices.

Main author / Presenter
Konstantina Tsami
Hellenic Statistical Authority (ELSTAT)

Konstantina Tsami is a graduate of the Department of Statistics at the Athens University of Economics and Business and holds a master’s degree in Applied Statistics from the same university. She has extensive professional experience in data analysis in the private sector, particularly in the consulting and market research industry. Since April 2025, she has been a permanent employee of the Hellenic Statistical Authority, working in the Section of Manufacture-Construction Indices and Industrial Products of the Business Statistics Division. She is currently responsible for the monthly production of the Turnover Index in Industry and participates in the management of data for the compilation of the Short-Term Statistics (STS) Employment Indices.


CO-AUTHORS:

Angelos Xirogiannis, Hellenic Statistical Authority (ELSTAT)
Ioanna Karavou, Hellenic Statistical Authority (ELSTAT)
Elissavet Salpea, Hellenic Statistical Authority (ELSTAT)
Konstantinos Thomas, Hellenic Statistical Authority (ELSTAT)

Presentation title
Ensuring Quality in Official Statistics through Administrative Sources: Lessons from eVisitor
The increasing use of administrative data sources represents a key trend in the development of official statistics within the European Statistical System (ESS).

Such sources enable greater efficiency in statistical production, reduce the burden on reporting units, and provide broader data coverage, but they simultaneously require a systematic approach to quality assurance in accordance with ESS standards. The aim of this paper is to present the methodological approach to using administrative sources in tourism statistics, with a focus on the eVisitor system, including processes of data acquisition, validation, cleaning, and adaptation for the production of official statistics. Challenges in using administrative data were analyzed, including different purposes of collection, levels of detail, varying classifications, and terminology differences. Solutions were developed to harmonize the data methodologically, including unique identifiers, automated data processing applications, and standardized protocols. Data from eVisitor are used for the production of official tourism statistics, including indicators on tourist arrivals and overnight stays, as well as accommodation capacities in commercial establishments, and arrivals and overnight stays in non-commercial accommodation. The extracted and statistically processed eVisitor data are also used for other statistical purposes and surveys, including the Survey on Tourist Activity of the Population of the Republic of Croatia, national accounts, price statistics, trade, and environmental statistics. Working with administrative data has enabled improvements in statistical processes, reduction of the reporting burden, enhancement of data coverage and content of statistical registers, and production of new indicators, including arrivals and overnight stays by tourists’ gender and age. Data quality is ensured through validation, standardization, consistency checks, and inter-institutional coordination, as well as continuous cooperation with data providers (Croatian National Tourist Board), in order to achieve methodological compliance with ESS standards.
Quality dimensions include relevance, accuracy and reliability, timeliness and punctuality of dissemination, comparability, and coherence across different statistical domains. The experience of using the eVisitor system demonstrates that administrative sources can significantly improve the quality of official statistics through systematic application of quality dimensions, a partnership approach, and methodological adaptation of the data. eVisitor represents a good practice example in accordance with the ESS Quality Framework, with key prerequisites for quality including inter-institutional collaboration, clear data exchange protocols, and continuous communication.

Main author / Presenter
Jasna Perko
Croatian Bureau of Statistics

Working in the Tourism Statistics Department since 2007. Good knowledge of statistical methodology for statistical surveys in the field of tourism statistics (20 years of experience related to the development of methodologies for the implementation of statistical research in the field of tourism statistics), coordination of work and organization of monthly research on Tourist arrivals and nights and yearly research on Capacity and Turnover of Ports. Participation in EU projects concerning new data sources in the field of tourism statistics.


CO-AUTHORS:

Ivana Brozović, Croatian Bureau of Statistics
Gordana Šiklić, Croatian Bureau of Statistics

Presentation title
A model for editing and imputation of zero-inflated multivariate data.
Editing and imputation (EI) is an important step of data processing for NSIs.

Existing model-based procedures for the automatic and unsupervised detection of erroneous data do not work properly in the presence of zero-inflation. The problem with zero-inflated data, particularly common in economic variables, emerges when continuous variables show a substantial portion of zero values, and requires a specific modelling approach due to the semi-continuous nature of the distribution.

We propose a parametric model for EI of such data, which enables us to estimate potential unit-level errors in multivariate continuous data, by assessing the magnitude of the errors (the distance between the observed and model values) and their plausibility (the probability estimated by the model), and to evaluate possible imputation values. The resulting models will be implemented in the R package 'mixtobit'.

While in some cases zero values can be attributed to a distinct subpopulation, justifying a separate treatment, in our case, we model different sources of zero values. In fact, we assume that some of the zeros originate from the same process generating positive values, while others are due to erroneous data. Consequently, we propose a joint modelling approach for zeros and non-zero observations based on the Tobit model.

We do not want to restrict error analysis to zero values. To extend the analysis to cover all the data (in practice, both false zeros and positive outliers), we present our Intermittently Contaminated Tobit (ICT) model. The ICT assumes a baseline Tobit distribution for the population and an additional source of variability affecting a subset of units 'contaminated' by intermittent errors, which inflate the baseline Tobit's variance. This corresponds to a mixture of two Tobits, enabling us to estimate:

1) the probability of a unit being contaminated at the unit level;

2) the conditional expectation of the 'true' data given the observed, potentially erroneous, data.

These two quantities serve to define an individual 'error score', which can be used:

- at a unit level, to identify observations with influential errors exceeding a certain error score threshold (selective editing);

- at a macro level, to obtain robust estimates of the total/average values of the response variables in any specific domain by employing the expected 'true value' of all units in the sample.
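
The mixture of two Tobits described above can be illustrated with a minimal univariate sketch. This is not the interface of the 'mixtobit' package nor the authors' estimation procedure: the function names, the parameters (`mu`, `sigma`, `pi_c`, `alpha`) and their values are assumptions made for illustration, the parameters are treated as known rather than estimated, and the contaminated component is assumed to share the baseline mean with its variance inflated by the factor `alpha`. Only the first quantity, the probability of a unit being contaminated, is computed.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function (erfc keeps tail accuracy)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_likelihood(y, mu, sigma):
    """Likelihood of y under a Tobit model left-censored at zero."""
    if y <= 0.0:
        # Point mass at zero: probability that the latent value is <= 0.
        return norm_cdf(-mu / sigma)
    return norm_pdf((y - mu) / sigma) / sigma


def contamination_probability(y, mu, sigma, pi_c, alpha):
    """Posterior probability that y comes from the contaminated component.

    pi_c  : assumed prior share of contaminated units
    alpha : assumed variance inflation factor (> 1) of the contaminated Tobit
    """
    sigma_c = sigma * math.sqrt(alpha)
    clean = (1.0 - pi_c) * tobit_likelihood(y, mu, sigma)
    contaminated = pi_c * tobit_likelihood(y, mu, sigma_c)
    return contaminated / (clean + contaminated)


# Hypothetical parameter values, for illustration only.
params = dict(mu=10.0, sigma=1.0, pi_c=0.05, alpha=25.0)
p_typical = contamination_probability(10.0, **params)  # value at the mean
p_zero = contamination_probability(0.0, **params)      # implausible zero
p_outlier = contamination_probability(20.0, **params)  # large positive value
```

Under these assumed parameters, a value at the baseline mean receives a low contamination probability, while both an implausible zero and a large positive value receive a probability close to one, which is the behaviour the abstract describes for false zeros and positive outliers.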

We show the results of an application to economic data on the added value of Italian enterprises.

Main author / Presenter
Davide Di Cecco
ISTAT

Davide Di Cecco is a researcher at the Italian Statistical Institute. Over the last 15 years, he has worked in the field of statistical methodology at Sapienza University and ISTAT. His main areas of research are mixture models, data integration, capture-recapture analysis and Bayesian statistics.


CO-AUTHORS:

Ugo Guarnera, Agenzia delle entrate
Danila Filipponi, ISTAT

Presentation title
Rapid Estimates for the Swiss Structural Business Statistics using Model-Based Estimation, Bootstrap and Calibration
The Swiss Structural Business Statistics, STATENT (Statistique structurelle des entreprises, in French), provides essential information on the national economy.

It is primarily based on data from the Old Age and Survivors Insurance (OASI) and the Swiss Business and Enterprise Register (SBER), supplemented by ongoing enterprise surveys. The Swiss Federal Statistical Office’s STATENT Flash innovation project aims to provide rapid estimates on enterprises and employment one year ahead of official publication.

We propose a model-based approach to predict the total number of units expected in the STATENT of year t+1 using the SBER units of that year and a model trained on data of year t. The inclusion of each SBER unit in the final STATENT is modeled as a Bernoulli outcome, and the expected total is computed as the sum of the predicted inclusion probabilities for all SBER units. These probabilities are estimated using a decision tree fitted on data of year t, assuming the model’s temporal transferability to the next year. Auxiliary variables include size class, NACE section, and indicators of each unit’s presence in other sources, such as the previous year’s STATENT and the second quarterly OASI data delivery.
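
The idea of summing predicted inclusion probabilities can be sketched as follows. As a loudly stated simplification, the decision tree is replaced here by plain inclusion rates per cell of auxiliary variables (a cell plays the role of a tree leaf); the function names, the cell definition and the toy data are all hypothetical and do not come from the STATENT Flash project.

```python
from collections import defaultdict


def fit_cell_rates(train_units):
    """Estimate an inclusion probability per cell from year-t data.

    train_units: iterable of (cell, included) pairs, where `cell` is any
    hashable combination of auxiliary variables and `included` is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [included, total]
    for cell, included in train_units:
        counts[cell][0] += int(included)
        counts[cell][1] += 1
    return {cell: inc / n for cell, (inc, n) in counts.items()}


def predicted_total(rates, new_cells, default=0.0):
    """Expected number of included units: sum of Bernoulli probabilities."""
    return sum(rates.get(cell, default) for cell in new_cells)


# Toy year-t data: cell = (size class, NACE section), values are made up.
train = (
    [(("small", "C"), True)] * 3
    + [(("small", "C"), False)]
    + [(("large", "F"), True)] * 2
)
rates = fit_cell_rates(train)

# Toy register units of year t+1, described only by their cell.
register_next_year = [("small", "C")] * 8 + [("large", "F")] * 3
total = predicted_total(rates, register_next_year)
```

Applying year-t rates to year-t+1 units mirrors the temporal transferability assumption stated in the abstract; cells unseen in the training data fall back to an assumed `default` probability.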

To quantify the variance of the predicted total, we developed a stratified bootstrap method tailored to the heterogeneity of economic sections. Units are resampled with replacement within NACE sections to preserve the section composition in each replicate. For the estimation we decompose the variance of the predicted total into an expected variance of a sum of independent Bernoulli variables and a variance of the corresponding expected value. Results on the estimates and their variances from 2019 to 2023 show the stability of our approach, with the estimated totals differing from the effective values by only around 0.1%. The variance estimates enable the construction of confidence intervals, which, combined with the effective values, help assess potential bias in our predictions.
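
The variance decomposition can be sketched with the law of total variance: the variance of the predicted total is the expected Bernoulli variance plus the variance of the expected total across bootstrap replicates. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: inclusion probabilities are estimated by simple cell-wise inclusion rates instead of a decision tree, and all names, the number of replicates and the toy data are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, variance


def fit_rates(units):
    """Inclusion rate per (section, size) cell from (section, size, included)."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [included, total]
    for section, size, included in units:
        counts[(section, size)][0] += int(included)
        counts[(section, size)][1] += 1
    return {cell: inc / n for cell, (inc, n) in counts.items()}


def bootstrap_variance(train, new_cells, n_rep=200, seed=0):
    """Return (expected Bernoulli variance, variance of expected total, sum)."""
    rng = random.Random(seed)
    by_section = defaultdict(list)
    for unit in train:
        by_section[unit[0]].append(unit)

    bernoulli_vars, totals = [], []
    for _ in range(n_rep):
        # Resample with replacement within each section, keeping its size.
        replicate = []
        for units in by_section.values():
            replicate.extend(rng.choices(units, k=len(units)))
        rates = fit_rates(replicate)
        # Cells missing from a replicate get probability 0 (an assumption).
        probs = [rates.get(cell, 0.0) for cell in new_cells]
        totals.append(sum(probs))
        bernoulli_vars.append(sum(p * (1.0 - p) for p in probs))

    within = mean(bernoulli_vars)   # E[ Var(total | fitted model) ]
    between = variance(totals)      # Var( E[total | fitted model] )
    return within, between, within + between


# Toy data, values made up: (NACE section, size class, included in year t).
train = (
    [("C", "small", True)] * 30 + [("C", "small", False)] * 10
    + [("F", "small", True)] * 15 + [("F", "small", False)] * 5
)
new_cells = [("C", "small")] * 40 + [("F", "small")] * 20
within, between, total_var = bootstrap_variance(train, new_cells)
```

A confidence interval for the predicted total could then be built from the square root of `total_var`, for example with a normal approximation, which is itself an assumption not stated in the abstract.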

A minor challenge arises from 0.1% under-coverage in the SBER population at the time of model fitting. Calibrating the estimated inclusion probabilities to the final STATENT total yields calibration factors, which make it possible to improve predictions and account for under-coverage.
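
One simple way to read the calibration step is as a ratio adjustment. The abstract does not say at which level the factors are computed, so the sketch below assumes a single overall factor; the function name and the numbers are hypothetical.

```python
import math


def calibration_factor(predicted_probs, final_total):
    """Ratio of the known final total to the sum of predicted probabilities."""
    predicted_total = sum(predicted_probs)
    if predicted_total <= 0:
        raise ValueError("predicted total must be positive")
    return final_total / predicted_total


# Made-up year-t probabilities whose sum (about 999) falls slightly short
# of an assumed final total of 1000, mimicking slight under-coverage.
probs_t = [0.9] * 1000 + [0.5] * 198
factor = calibration_factor(probs_t, final_total=1000)

# The factor reproduces the final total on the data it was computed from.
calibrated_total_t = factor * sum(probs_t)
```

In this reading, the factor obtained for year t would be multiplied with the predicted total of year t+1 to correct for the under-coverage, again assuming that it carries over from one year to the next.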

This study presents a hybrid approach for rapid estimation for the Swiss structural business statistics, combining predictive modeling, variance estimation with design-based resampling and calibration. This approach is computationally feasible for large registers and aims to provide timely estimates for official statistics together with an evaluation of their quality.

Main author / Presenter
Athanassia Chalimourda
Swiss Federal Statistical Office

Athanassia Chalimourda studied mathematics, computer science, and statistics at the Universities of Athens and Bochum in Germany. Her professional experience spans finance, education, and large-scale academic research projects, where her work focused on predictive modeling, statistical learning, and artificial intelligence. Since 2019, she has been part of the Statistical Methods Unit at the Swiss Federal Statistical Office. She is passionate about advancing innovative methods for official statistics.


CO-AUTHORS:

Daniel Assoulin, Swiss Federal Statistical Office
Desislava Nedyalkova, Swiss Federal Statistical Office
Nicolas Vallon, Swiss Federal Statistical Office
