This guide is for existing and potential Open Data publishers, in particular Government Departments and public bodies who wish to publish Open Data on the data.gov.ie portal. In the following sections, we look at some of the stages of publishing Open Data: reviewing what data the organisation manages, identifying what data should be published as Open Data, ensuring the data is compliant with the recommendations in the Technical Framework, and publishing data as Open Data on data.gov.ie.
If you have any questions about publishing Open Data, please contact the Department of Public Expenditure and Reform at email@example.com
When considering the publication of Open Data, it is important for a public body to develop a clear understanding of all the data it holds.
A data audit provides a mechanism to discover what datasets an organisation holds. This enables improved knowledge management, data sharing and evidence-based decision-making. It also helps identify data that is unnecessary but still consuming resources, and data that could be improved.
The aim of a data audit is to identify:
A simple data audit method is:
An Open Data Audit and Publication Planning Guide is available here.
A Data Audit Tool, which supports public bodies who want to start publishing Open Data but are unsure which datasets are suitable for publication, can be accessed at http://audit.data.gov.ie/. A user guide for the Data Audit Tool is available here.
Once an organisation has an overview of the data it manages, the next consideration is which data it should publish as Open Data. According to the Foundation Document for the development of the Public Service Open Data Strategy, all appropriate data should be published as Open Data.
Open Data is considered the default option for appropriate new datasets. Where requested datasets are not released as Open Data, the responsible public body will provide reasons why not.
Therefore, a better question is which data to publish first, with a view to the continuous publication of all appropriate data. The most useful Open Data is high-value data. High-value data can be defined as data that increases accountability, improves public knowledge, furthers the mission of the public body, or creates economic opportunity.
There are many ways an organisation can identify high-value datasets to prioritise for Open Data publication. These include considering:
It is important to note that some data will never be appropriate to publish as Open Data, for example, data whose publication would violate the fundamental right to privacy under data protection legislation, or data that is classified for security reasons. The Central Statistics Office can provide support on the statistical anonymisation of data for publication purposes. Further information and guidance on anonymisation is available on the website of the Office of the Data Protection Commissioner.
The Open Data Technical Framework provides guidance on the practical aspects of publishing Open Data. This ensures that publication of datasets on data.gov.ie is done in a consistent, persistent and truly open way. The Technical Framework comprises five key components:
Data and metadata published on data.gov.ie must be associated with the Creative Commons Attribution (CC-BY) Licence, at a minimum.
Data published on data.gov.ie must be machine-readable and in an open format (3-star Open Data), e.g. CSV, JSON or XML.
Data published on data.gov.ie must be compliant with DCAT-AP, the international Open Data metadata standard.
Data published on data.gov.ie should use national and international data standards where possible.
Data published on data.gov.ie should use Uniform Resource Identifiers (URIs) where possible.
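As a minimal sketch of how the licence, format and metadata components fit together, the following Python builds a DCAT-style metadata record as JSON-LD for a hypothetical dataset. The dataset title, the example.org URL and the helper name `build_dataset_record` are illustrative assumptions, and the record covers only a handful of properties; the DCAT-AP specification lists the full set of mandatory and recommended fields.

```python
import json

# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Creative Commons Attribution 4.0 licence URI.
CC_BY_4 = "https://creativecommons.org/licenses/by/4.0/"


def build_dataset_record(title, description, access_url, media_type):
    """Return a minimal DCAT-style dataset description (hypothetical helper)."""
    return {
        "@context": CONTEXT,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                # Pointer to where the data is actually hosted.
                "dcat:accessURL": {"@id": access_url},
                # Simplified here as a plain string for readability.
                "dcat:mediaType": media_type,
                "dct:license": {"@id": CC_BY_4},
            }
        ],
    }


# Illustrative values only; this is not a real published dataset.
record = build_dataset_record(
    title="Example road traffic counts",
    description="Illustrative dataset; not a real publication.",
    access_url="https://example.org/data/traffic-counts.csv",
    media_type="text/csv",
)
print(json.dumps(record, indent=2))
```

The record describes the dataset and points to a CSV file hosted elsewhere, which mirrors the way a metadata catalogue references data rather than storing it.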
The data is now ready to be published as Open Data on data.gov.ie. Publication on an Open Data Portal opens the door to innovative data reuse. Data.gov.ie does not host the datasets themselves; it is a catalogue of metadata, with pointers to the data hosted elsewhere on the Web, for example, on the website of a public organisation.
Data can be manually published on data.gov.ie via the 'Add Dataset' online form. Alternatively, data can be programmatically published on data.gov.ie via the API or a data harvester.
A guidance note on 'Add a New Dataset' is available here.
Data.gov.ie 'Add a New Dataset' Page
In order to add a dataset to data.gov.ie, the organisation requires a user account. An account can be created for Public Sector Bodies by contacting firstname.lastname@example.org
Once logged in, the organisation can access the New Dataset page, as shown in the figure above. By stepping through the online form, the organisation is prompted to enter all the necessary metadata. The user-friendly interface makes it easy for non-technical users to publish datasets one at a time. However, the web interface is not an efficient way to publish multiple datasets, or to periodically update existing datasets. For this, programmatic publication via the API should be used.
Data.gov.ie is built using CKAN, which provides a powerful API that allows developers to add datasets programmatically. Using the API to create or update datasets is quicker than using the web interface when dealing with multiple datasets or dynamic datasets. More details on how to use the API to push data to the portal are available here.
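To illustrate what programmatic publication can look like, here is a minimal sketch that prepares a request to CKAN's `package_create` action using only the Python standard library. The dataset fields, the organisation name `example-public-body`, the `cc-by` licence identifier and the token placeholder are assumptions for illustration; the fields that data.gov.ie actually requires, and how API tokens are issued, should be confirmed with the portal's own API guidance. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_package_create_request(base_url, api_token, dataset):
    """Build (but do not send) a CKAN package_create request."""
    url = base_url.rstrip("/") + "/api/3/action/package_create"
    body = json.dumps(dataset).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # CKAN reads the API token from the Authorization header.
            "Authorization": api_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical metadata for illustration only.
dataset = {
    "name": "example-traffic-counts",
    "title": "Example road traffic counts",
    "notes": "Illustrative dataset; not a real publication.",
    "owner_org": "example-public-body",
    "license_id": "cc-by",
    "resources": [
        {
            "name": "Traffic counts (CSV)",
            "url": "https://example.org/data/traffic-counts.csv",
            "format": "CSV",
        }
    ],
}

request = build_package_create_request(
    "https://data.gov.ie", "YOUR-API-TOKEN", dataset
)
print(request.get_method(), request.full_url)

# To send the request with a real token:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Separating request construction from sending makes it straightforward to inspect or log the payload before anything is written to the portal.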
Building a harvester that fetches data automatically into data.gov.ie makes sense if a lot of data is sourced from one place, for example, another data catalogue, or if data needs to be updated frequently, for example, daily. A harvester pulls the data from a predefined source periodically, adding and updating it automatically in data.gov.ie. It may utilise the portal's API or be built as a custom CKAN extension.
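The core decision a harvester makes on each run can be sketched as follows: compare what the source offers with what the portal already lists, then create what is missing and update what has changed. This is a simplified illustration rather than the way data.gov.ie's harvesters are implemented; the function name `plan_harvest` and the sample records are hypothetical. In a CKAN-based setup, the two resulting lists would typically map to the `package_create` and `package_update` API actions.

```python
def plan_harvest(source_records, portal_records):
    """Decide which source datasets to create or update on the portal.

    Both arguments map a dataset name to its last-modified timestamp,
    given as ISO 8601 strings in the same format and timezone so that
    plain string comparison orders them correctly.
    """
    to_create, to_update = [], []
    for name, modified in sorted(source_records.items()):
        if name not in portal_records:
            # Dataset is new to the portal.
            to_create.append(name)
        elif modified > portal_records[name]:
            # Source copy is newer than the portal's metadata.
            to_update.append(name)
    return to_create, to_update


# Hypothetical snapshot of a source catalogue and of the portal.
source = {
    "air-quality": "2024-03-02T06:00:00",
    "bus-stops": "2024-01-15T09:30:00",
    "new-survey": "2024-03-01T12:00:00",
}
portal = {
    "air-quality": "2024-03-01T06:00:00",
    "bus-stops": "2024-01-15T09:30:00",
}

create, update = plan_harvest(source, portal)
print("create:", create)  # ['new-survey']
print("update:", update)  # ['air-quality']
```

Unchanged datasets, such as `bus-stops` here, are skipped, which keeps frequent harvesting runs cheap.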
Data.gov.ie currently harvests data from: