Leximetric datasets

The CBR Leximetric Datasets are the product of work carried out at the Centre for Business Research (CBR) in Cambridge, beginning in 2005 when the Centre received funding from the Economic and Social Research Council (ESRC) to carry out a research project on law, development and finance. Further funding from the ESRC, the European Union’s FP5 and FP6 programmes, the Isaac Newton Trust, the Cambridge Political Economy Society and the International Labour Organization made it possible to expand the original datasets. Current work on the datasets is funded by the ESRC via its Digit Research Centre (Digital Futures at Work) and by the NORFACE consortium through its support for the POPBACK project (Populist Backlash, Democratic Backsliding, and the Future of the Rule of Law in Europe).

In 2017 three datasets were published, coding, respectively, for labour laws in 117 countries between 1970 and 2013 (the CBR Labour Regulation Index, since updated; see below), shareholder protection in 30 countries between 1990 and 2013 (the CBR Extended Shareholder Protection Index), and creditor protection in 30 countries between 1990 and 2013 (the CBR Extended Creditor Protection Index). The legal data are coded using a "leximetric" coding methodology developed in the CBR and explained more fully in the codebooks which accompany each of the datasets.

Taken together, the datasets provide a unique time series which enables researchers and other research users to track changes in labour, company and insolvency law over long periods of time for many countries. A distinguishing feature of these datasets is that all legal sources for the data coding are fully described in the relevant codebooks, thereby assisting transparency, external validity and replicability of results. The work of further developing the datasets on shareholder and creditor rights, so that they match the labour regulation index in terms of years and countries covered, is ongoing. A new version of the Labour Regulation Index, updated to the end of 2022, was published in December 2023.

Each dataset takes the form of an Excel spreadsheet containing the data and a Codebook containing the sources of the coding and an explanation of the coding methodology.

Every effort has been made to ensure that the laws coded in the datasets are accurately sourced and coded, but the scale of the data collected as part of this exercise means that errors or omissions are possible. We are pleased to receive clarifications and corrections and will periodically report on any necessary updates to the datasets to reflect information received.

If you wish to provide a clarification, and/or have an inquiry relating to any of the datasets, please contact Simon Deakin.

The datasets

  • CBR Labour Regulation Index Dataset 1970-2013 (117 countries) (updated in 2023, see below)
  • CBR Extended Shareholder Protection Index 1990-2013 (30 countries)
  • CBR Extended Creditor Protection Index 1990-2013 (30 countries)

They can be accessed by following this link to the University of Cambridge data repository:

Access the University of Cambridge repository

The combined database may be cited as:

Armour, J., Deakin, S. and Siems, M. (2016) CBR Leximetric Datasets

CBR Labour Regulation Index Dataset 2023

An updated version of the CBR-LRI was published in December 2023. This codes for 117 countries between 1970 and 2022.

Download the CBR Labour Regulation Index 2023 Dataset

Download the CBR Labour Regulation Index 2023 Codes and Sources

Former versions of the datasets

The current datasets build on and largely incorporate earlier versions of the indices, which have been published at intervals since 2007. The original datasets can be viewed here:

Law, Finance and Development datasets

ESRC-funded projects

The ESRC-funded projects of which the datasets formed part can be viewed here:

Law, Finance and Development

Law, Development and Finance in Rising Powers

Labour Law and Poverty Alleviation in Low and Middle-Income Countries

View further works relating to the CBR datasets