What does it take to licence data?


Hi all,

I’m currently working on the OpenSpending project and dealing with budget datasets that come from official government sources but have not been licensed yet. I was wondering how best to advise those governments on how to proceed with licensing.

To give a constructive answer, I was wondering what it takes to licence such a dataset: who does what, and at which level? What are the roles of the different actors, and who takes action?

At some point the question was also raised of what it would imply for us at Open Knowledge to publish data that is not openly licensed. In other words, what are the legal implications?

Maybe @Stephen and @jpmckinney could advise on this?

Many many thanks!

Licensing for datasets in OpenSpending Next

I thought the French Budget data was open - at least according to the Global Open Data Census.

If the government has opened some budget data, then it should be easy to convince them to open supporting data.

The release process in Australia is typically:

  1. Determine the data custodian (the person/role responsible for collecting the data)
  2. Discuss the need/value in releasing the data (this may be in compliance with a policy or law)
  3. Discuss the licence (often CC-BY is used as a default)
  4. Discuss the file format (e.g. CSV) and data structure
  5. Discuss what documentation and metadata should accompany the data
  6. Seek permission to release (usually involves privacy and sensitive data checks, risk assessment, cost to publish, approval to publish from a senior manager)
  7. Release may be prioritised against releasing other datasets (governments are often resource constrained, so the desire/requirement to release everything may be slowed by lack of resources)
  8. Deploy resources to extract, package and release the data.
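
As a rough illustration, the steps above could be tracked as a simple checklist. This is only a sketch in Python: the step labels paraphrase the list, and `ReleaseChecklist` and its methods are made-up names for illustration, not part of any government portal or OpenSpending tooling.

```python
from dataclasses import dataclass, field

# Step labels paraphrase the eight-step list above (hypothetical wording).
RELEASE_STEPS = [
    "identify data custodian",
    "agree need/value of release",
    "choose licence",
    "choose file format and structure",
    "prepare documentation and metadata",
    "obtain permission to release",
    "prioritise against other datasets",
    "extract, package and release",
]


@dataclass
class ReleaseChecklist:
    """Tracks which release steps are finished for one dataset."""

    dataset: str
    done: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        # Reject labels that are not in the agreed list of steps.
        if step not in RELEASE_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def outstanding(self) -> list:
        # Remaining steps, in the order of the original list.
        return [s for s in RELEASE_STEPS if s not in self.done]

    def ready_to_publish(self) -> bool:
        return not self.outstanding()


# Example: everything is done except the licensing decision.
checklist = ReleaseChecklist("example budget dataset")
for step in RELEASE_STEPS:
    if step != "choose licence":
        checklist.complete(step)

print(checklist.outstanding())
print(checklist.ready_to_publish())
```

A real process would also record who signed off each step and when, but even a flat list like this makes it obvious which step is blocking publication.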

Often a government will have a formal process and checklist to follow to release data onto their data portal. Ideally these should be public documents.

Often a data portal will provide an option to request data to be released. As a data re-user, I’d start there.

[quote=“CecileLG, post:1, topic:1701”]
At some point the issue raised of what would it imply also for us at Open Knowledge to publish data that is not openly licensed, meaning, what are the legal implications behind?
[/quote]

I don’t think you could publish the data if it was not openly licensed. You have not been given the right to do that by the data owner. @badapple - your thoughts?

Lastly, the licence applied to the French Budget data does not appear to have been assessed against the Open Definition. Perhaps this is something to explore.


If the government already has an open data initiative, then it’s like @Stephen describes.

In most cases, the question of which license to use is already answered, because many catalogs use the same license for all datasets.

For other datasets, there’s often a step of validating that there are no third-party rights over the dataset, but my understanding is that any government has all the rights to their own budget data.

The budget is prepared by the legislature, but at least in Canada, there’s a Treasury Board Secretariat (on the government side) that has the rights to publish this data. In other countries, it’s possible that the budget is entirely under the jurisdiction of the legislature without any government department having the rights to publish this data. This can make data release more complicated, because in general legislatures are not very aware of open data.

In terms of the legal implications of publishing unlicensed data, it’s often a legal gray area. In some countries, data is not subject to copyright, so you would be able to publish it. In other countries, like Canada, there are specific criteria a dataset must meet for it to be subject to copyright. Anyhow, I know many organizations that republish unlicensed government data, and I’ve never heard of anything dramatic happening, mainly because it would reflect very poorly on a government to sue a website for publishing publicly available, public-interest data. In some cases, this unlicensed republication leads to the government applying an explicit license to the data.


Hey @Stephen and @jpmckinney, thanks a lot for these very detailed answers. Very helpful. Love the checklist idea!

I actually wasn’t talking specifically about the French budget dataset, but in a more general way I was seeking advice on how to better advise governments on the process of releasing / publishing their budget data, especially about the licensing and related procedures.

As the OpenSpending project manager, I have to face cases where some governments reach out to us when they want to use OpenSpending as an open data platform to publish their fiscal data but have no clue how to deal with related legal issues on their end.

I’m currently working with several governments who have completed almost all steps of the release process described by Stephen except the one related to licensing, which is why I came up with those questions before publishing anything.

Part of our work is also to support their initiatives to implement processes for opening and publishing budget data, which is why I was trying to see whether there is a pattern or some existing procedures that they could benefit from.



I must follow up on this as well. :slight_smile:
After the publication of the Index we get questions from governments about how to modify their licenses so they would be open.
Even though I point them over and over to this page - http://opendefinition.org/licenses/ , I still get questions that I don’t feel expert enough to answer. There is the ODI guide for licensing, but I feel it is targeted at businesses rather than governments.

Is there any guide from this group that governments can use? Any support that we can give to governments that reach out to us? @Luis @rufuspollock


Hi, I missed this when it first came out.

In terms of the basic practicalities and background I recommend:

And, of course, the general instructions on opening up data in the Open Data Handbook.

On this specific point, Open Knowledge should always apply classic safe-harbour provisions such as the following (we do this in our standard terms of use, which is why every OK service is supposed to reference them both on signup and in the footer):

  • We do not guarantee the IP status of any data
  • We are hosts of the data not its providers
  • Use at your own risk etc etc