Comparative studies using linked employer-employee (LEE) data can provide strong evidence for labor market patterns that hold across different economies. However, given the challenges of accessing and mastering multiple national datasets, individual researchers rarely work with data from many countries. Instead, cross-country collaborations have emerged to fill this gap, often using a distributed coding approach, where each research team analyzes its own national data using a common set of methods.
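The distributed approach can be illustrated with a minimal Python sketch. It assumes each team holds records of (firm identifier, log wage) and runs one shared function locally, so that only aggregate statistics leave the national data environment. The function names, the toy records, and the choice of statistic (the between-firm share of log-wage variance) are illustrative assumptions, not the method of any specific initiative.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def between_firm_share(records):
    """Share of the variance of log wages that lies between firms.

    `records` is a list of (firm_id, log_wage) pairs. The layout is a
    hypothetical simplification of a national LEE extract.
    """
    wages = [wage for _, wage in records]
    grand_mean = fmean(wages)
    total_var = pvariance(wages)

    by_firm = defaultdict(list)
    for firm_id, wage in records:
        by_firm[firm_id].append(wage)

    # Employment-weighted variance of firm mean wages.
    between_var = sum(
        len(firm_wages) * (fmean(firm_wages) - grand_mean) ** 2
        for firm_wages in by_firm.values()
    ) / len(wages)
    return between_var / total_var


def run_national_analysis(country, records):
    """Run by each team on its own data; returns aggregates only."""
    return {
        "country": country,
        "n_workers": len(records),
        "n_firms": len({firm_id for firm_id, _ in records}),
        "between_firm_share": between_firm_share(records),
    }


# Toy records standing in for two national datasets (made-up values).
country_a = [("f1", 1.0), ("f1", 3.0), ("f2", 5.0), ("f2", 7.0)]
country_b = [("g1", 2.0), ("g1", 2.0), ("g2", 4.0), ("g2", 4.0)]

# The coordinating team only ever sees the aggregated results.
pooled_results = [
    run_national_analysis("A", country_a),
    run_national_analysis("B", country_b),
]

for result in pooled_results:
    print(result)
```

Because every team runs the same function, the resulting statistics are comparable across countries even though no microdata are exchanged.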
Three major initiatives have played a key role in advancing this type of research:
LINKEED (OECD): This OECD-led initiative works with harmonized LEE data from 20 countries, making it one of the most ambitious cross-country research efforts using administrative data. Its initial focus was on the role of firms in wage inequality, but its research agenda has since expanded to cover topics such as job mobility, career progression, firm-level wage-setting practices, gender pay gaps, and wage disparities between immigrants and native-born workers.
Global Repository of Income Dynamics (GRID): GRID provides an open-access database with harmonized microstatistics on income inequality and income dynamics, derived from administrative earnings records. Originally covering 13 countries, the project is now expanding to more than 25. Alongside making data available, GRID researchers have highlighted key trends in income inequality at both the national and global levels.
Comparative Organizational Inequality Network (COIN): Established in 2015, COIN started as a collaboration of researchers from seven countries studying workplace inequality. It has since grown to include sociologists, economists, and management scholars from more than 16 countries. The network studies several aspects of organizational inequality, including earnings disparities, immigration, gender differences, and mobility within firms.
Beyond these large initiatives, smaller-scale comparative studies using LEE data have also emerged. One study analyzed the effects of job loss on workers’ labor market outcomes across seven countries. Another examined the labor market impact of the Öresund Bridge’s construction, using linked registry data from Sweden and Denmark to track employment and wage changes on both sides of the border.
A common limitation of administrative databases is that they mainly contain information that public institutions collect for administrative purposes, so variables outside that scope are missing. This drawback can be addressed by integrating additional data sources, a practice increasingly common in international research. Wage surveys, household budget data, and time-use surveys are among the most frequently linked sources. To merge these datasets with LEE data, researchers must obtain authorization from data providers and follow strict anonymization procedures. Once these conditions are met, new or existing data can be incorporated into an LEE framework.
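The linkage step can be sketched in Python as follows. The sketch assumes that both sources share a worker identifier and that a trusted party replaces it with a keyed hash before the files are merged; actual authorization and anonymization procedures differ across data providers. All names, field layouts, and values are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(worker_id, secret_key):
    """Replace a raw identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(secret_key, worker_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_ids(rows, secret_key):
    """Map each pseudonym to its row with the raw identifier removed."""
    out = {}
    for row in rows:
        fields = {key: value for key, value in row.items() if key != "worker_id"}
        out[pseudonymize(row["worker_id"], secret_key)] = fields
    return out


def link(lee_rows, survey_rows, secret_key):
    """Inner join of the two sources on the pseudonymized identifier."""
    lee = strip_ids(lee_rows, secret_key)
    survey = strip_ids(survey_rows, secret_key)
    return [
        {"pid": pid, **lee[pid], **survey[pid]}
        for pid in sorted(lee.keys() & survey.keys())
    ]


# Made-up example rows; the key would be held by the trusted party only.
secret_key = b"example-key-held-by-trusted-party"
lee_rows = [
    {"worker_id": "w1", "firm_id": "f1", "monthly_wage": 3100},
    {"worker_id": "w2", "firm_id": "f1", "monthly_wage": 2800},
    {"worker_id": "w3", "firm_id": "f2", "monthly_wage": 3500},
]
survey_rows = [
    {"worker_id": "w2", "weekly_housework_hours": 12},
    {"worker_id": "w3", "weekly_housework_hours": 7},
    {"worker_id": "w9", "weekly_housework_hours": 20},
]

linked = link(lee_rows, survey_rows, secret_key)
for row in linked:
    print(row)
```

Only workers who appear in both sources are retained, and the linked file carries the pseudonym rather than the original identifier. Keyed hashing alone does not guarantee anonymity, which is one reason data providers typically impose further safeguards.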
Academic institutions, research centers, and statistical agencies in various countries are also working to expand LEE datasets by integrating new data types. Many have started linking education, health, and business network data to employer-employee records. Countries such as Germany, Denmark, and France have pioneered the use of firm-level surveys, and even experimental data, in conjunction with administrative work history records. These efforts are conducted with participant consent and strict compliance with data protection regulations.
The German Institute for Employment Research (IAB) has played a leading role in developing linked datasets that merge employer surveys with administrative registers. One example links a detailed firm-level survey on job vacancies with the administrative Integrated Employment Biographies (IEB) to track how vacancies are filled. Other projects link firm-level surveys on hiring practices, wage-setting behavior, and automation to LEE data, enabling studies on topics such as gender pay gaps, labor market transitions, and workforce composition.
Beyond direct data integration, innovative methods are also being explored to enhance LEE datasets. For instance, researchers have used name classification algorithms to estimate workers’ country or region of origin, allowing for the study of migration patterns. As modern data science techniques evolve, additional opportunities for improving LEE datasets may emerge, provided that security protocols remain in place and data providers are willing to collaborate.
LEE data is valuable not only for academic research but also for policymaking. These datasets are widely used to analyze labor market trends, assess the effects of major policy changes, and provide insights into economic behavior.
One way to enhance the usefulness of LEE data is to increase the frequency of data updates. In many countries, LEE datasets are updated annually, typically incorporating data from the previous year. However, some countries construct their LEE datasets only on an irregular basis. Since administrative records, such as employer tax filings, are often generated monthly, making them available with minimal delay could greatly improve their value for policy analysis.
Timely access to high-quality administrative data would be particularly useful in assessing the labor market impact of sudden economic shocks, such as the COVID-19 pandemic, or in tracking the consequences of large-scale immigration due to armed conflict. More frequent data updates would allow policymakers to make evidence-based decisions in near real time while also giving researchers the opportunity to analyze emerging labor market trends more effectively.