An Effort to Encode Rules for Machine Readability in Administrative Data
On March 31, 2026, the "Rules for Machine Readability of Administrative Data" were adopted at the joint meeting of the 4th Inter-Ministerial Digital Transformation Promotion Liaison Meeting and the 21st Meeting for the Promotion of a Digital Society Executive Meeting.
As an organization working on Evidence-Based Policy Making (EBPM) and the visualization of policy effects, the Digital Agency aims to improve machine readability by developing its own check systems, AI tools, and similar means. To that end, it provides data files (CSV and JSON formats) that enable accurate interpretation of the Rules for Machine Readability (Cabinet Secretariat) (PDF format). These rules and files can be used not only by government agencies but also by private companies and others in their efforts to improve the machine readability of data.
Code for Rules for Machine Readability in Administrative Data (JSON Format) (GitHub)
Code for Rules for Machine Readability in Administrative Data (CSV Format) (GitHub)
Table of Contents
- 1. Rules for machine readability in administrative data
- 2. Encoding Machine-Readability Rules
- 3. Download the rule list and code
- 4. Contact
1. Rules for machine readability in administrative data
Background of the efforts
With the progress of AI and digital technologies, databases have become fundamental resources for economic growth and the resolution of social issues, and the quality improvement of administrative databases is an important challenge for the whole country. Traditionally, some administrative databases have adopted formats (such as cell merging and blank line insertion) that are intended to be printed on paper and viewed by humans, but these formats are difficult for data analysis software and AI to process.
In general, when information is not machine-readable, users must cleanse the data themselves before using it. Appropriate cleansing on the data provider side reduces costs for society as a whole: it reduces transcription errors and facilitates visualization with BI tools, AI utilization, and the promotion of EBPM. In particular, because information disclosed by the government has many users (other government agencies, the private sector, researchers, etc.), reducing the burden of data cleansing benefits society more widely than it would for other information.
Three levels of data quality
The "Rules for Machine Readability of Administrative Data" (Cabinet Secretariat) (PDF format) organize the quality of machine readability into three levels so that officials in each government agency can work on it step by step. Level 1 is positioned as "the minimum basic rules for data to be readable by machines," and Levels 2 and 3 show the technical path toward ensuring higher levels of machine readability.
Examples of Specific Rules
Specific rules for each level are as follows:
| | Level 1 "Can view and post" | Level 2 "Capable of aggregation and analysis" | Level 3 "Can be linked and automated" |
|---|---|---|---|
| Overview | A tabular format, divided into rows and columns, in which letters and numbers are stored in a form recognizable by computers and readable by machines as intended by humans. | A statistical data format in which the rules are consistent for each column, and each row can be treated as a sample and each column as a variable. The data type (quantitative or qualitative) is clear, and the handling of missing values is unified. | A standardized coding system and metadata (units, definitions, history of changes, etc.) are in place, and the data can be compared across time series or with other statistical surveys. |
| Rule Example | "Use Excel or CSV as the file format." "Do not use multiple tables per sheet." "Ensure that data is not divided." "Do not include information that is irrelevant to the main body of data." "Enter one data item per cell." | "Numeric data shall be numeric attributes." "Do not omit item names, etc. in data." "Enter item names that can be uniquely identified in each column." "Standardize answers to choices." | "The code table for the response shall be attached." "Enter the data unit." "Standardize the time axis." "Enter the data definition and update history." "Use the vertical format for the data." |
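To make a few of these rule examples concrete, the following is a minimal Python sketch of how they might be checked against a CSV file. It is not the official check logic: the function name `check_table` and the finding messages are hypothetical, and treating "Ensure that data is not divided." as "no fully blank rows" is an assumption based on the blank line insertion mentioned in the background above.

```python
import csv
import io


def check_table(csv_text: str) -> list[str]:
    """Return findings for a few illustrative rules (not the official checks)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    findings = []

    # Level 2 example: "Enter item names that can be uniquely identified in each column."
    names = [name.strip() for name in header]
    if "" in names:
        findings.append("header contains an empty item name")
    if len(set(names)) != len(names):
        findings.append("header contains duplicate item names")

    # Assumed reading of "Ensure that data is not divided.": no fully blank rows.
    for line_no, row in enumerate(body, start=2):
        if all(cell.strip() == "" for cell in row):
            findings.append(f"row {line_no} is blank")

    return findings


# Hypothetical sample: a duplicated item name and a blank row in the middle.
sample = "year,value,value\n2020,100,1\n\n2021,105,2\n"
for finding in check_table(sample):
    print(finding)
```

For the sample above, the sketch reports the duplicated item name and the blank third row. Rules that depend on meaning, such as "Do not include information that is irrelevant to the main body of data.", would likely need human or AI judgment rather than a simple check like this.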
2. Encoding Machine-Readability Rules
Steps to Apply Rules
The following steps help ensure that your data complies with the machine-readability rules:
- 1. Collate: Check the data against the rules and record the conformity status.
- 2. Point out: List the items that need to be modified to reach the target level.
- 3. Fix: Correct the file manually or mechanically.
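The three steps above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the `Rule` class, the rule identifiers, and the two sample checks are hypothetical and are not taken from the published rule files.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Table = list[list[str]]  # first row is the header


@dataclass
class Rule:
    rule_id: str                                     # hypothetical identifier
    level: int                                       # 1, 2 or 3
    check: Callable[[Table], bool]                   # True when the table conforms
    fix: Optional[Callable[[Table], Table]] = None   # None means "fix manually"


def no_blank_rows(table: Table) -> bool:
    return all(any(cell.strip() for cell in row) for row in table)


def drop_blank_rows(table: Table) -> Table:
    return [row for row in table if any(cell.strip() for cell in row)]


def unique_item_names(table: Table) -> bool:
    return not table or len(set(table[0])) == len(table[0])


RULES = [
    Rule("no-blank-rows", 1, no_blank_rows, drop_blank_rows),
    Rule("unique-item-names", 2, unique_item_names),
]


def collate(table: Table) -> dict[str, bool]:
    """Step 1: record the conformity status for every rule."""
    return {rule.rule_id: rule.check(table) for rule in RULES}


def point_out(status: dict[str, bool], target_level: int) -> list[Rule]:
    """Step 2: list violated rules at or below the target level."""
    return [r for r in RULES if r.level <= target_level and not status[r.rule_id]]


def fix_table(table: Table, violations: list[Rule]) -> tuple[Table, list[str]]:
    """Step 3: apply mechanical fixes; return rule ids left for manual work."""
    manual = []
    for rule in violations:
        if rule.fix is None:
            manual.append(rule.rule_id)
        else:
            table = rule.fix(table)
    return table, manual


table = [["year", "value", "value"], ["2020", "100", "1"], ["", "", ""], ["2021", "105", "2"]]
status = collate(table)
violations = point_out(status, target_level=2)
fixed, manual = fix_table(table, violations)
print(status, [r.rule_id for r in violations], manual)
```

Keeping the check and the optional fix together in one rule object makes it easy to see which violations can be corrected mechanically and which must go back to a person.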
Coding Rules
Repetitive routine tasks, such as the rule application steps, are well suited to automation by machine, including AI. For in-house check tools and AI tools to refer to these rules, it is useful to write the rules in a machine-readable format rather than in a PDF document created for humans. The rules themselves are therefore published in a highly machine-readable code format.

3. Download the rule list and code
A Code Version of Rules for Machine Readability in Administrative Data
This code organizes the rule structure so that it can be read and interpreted directly by machines. It is intended to be used as a "configuration file" that supplies the rules to in-house check systems, AI tools, and similar software. For reference, a sample implementation of a function that judges rule conformity based on the configuration file has also been created and published.
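As a rough illustration of the "configuration file" idea, the Python sketch below loads rules from JSON and dispatches each one to a check function. It is not the published sample implementation, and the JSON layout here (`id`, `level`, `check`) is a hypothetical schema invented for this example; consult the files on GitHub for the actual structure. The two check functions are also assumed readings of the rules quoted in their comments.

```python
import json

# Hypothetical rule configuration; NOT the schema of the published files.
RULES_JSON = """
[
  {"id": "example-1", "level": 1, "check": "one_value_per_cell"},
  {"id": "example-2", "level": 2, "check": "numbers_not_stored_as_text"},
  {"id": "example-3", "level": 3, "check": "unit_is_documented"}
]
"""


def one_value_per_cell(table):
    # Assumed reading of "Use one data per cell.": no line breaks inside a cell.
    return all(
        "\n" not in cell
        for row in table[1:]
        for cell in row
        if isinstance(cell, str)
    )


def _looks_numeric(text):
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False


def numbers_not_stored_as_text(table):
    # Assumed reading of "Numeric data shall be numeric attributes.":
    # no text cell that is really a number, such as "1,234".
    return not any(
        isinstance(cell, str) and _looks_numeric(cell)
        for row in table[1:]
        for cell in row
    )


# Registry that maps check names in the configuration to functions.
CHECKS = {
    "one_value_per_cell": one_value_per_cell,
    "numbers_not_stored_as_text": numbers_not_stored_as_text,
}


def judge(rules, table, target_level):
    """Map rule id to True, False, or None (None: no automated check exists)."""
    results = {}
    for rule in rules:
        if rule["level"] > target_level:
            continue
        check = CHECKS.get(rule["check"])
        results[rule["id"]] = None if check is None else check(table)
    return results


rules = json.loads(RULES_JSON)
# Cells carry types, as they would when read from a spreadsheet.
table = [["city", "population"], ["A", "1,234"], ["B", 5678]]
print(judge(rules, table, target_level=3))
```

Because the rule list lives in data rather than in code, a tool built this way could pick up revised rules by reloading the file, and rules without an automated check remain visible as `None` instead of being silently skipped.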
- Code for Rules for Machine Readability in Administrative Data (JSON Format) (GitHub)
- Code for Rules for Machine Readability in Administrative Data (CSV Format) (GitHub)
4. Contact
Please send any questions or ideas via "Comments and Requests".