Applying licenses, waivers or public domain marks


#1

I’ve written a guide about applying licenses, waivers or public domain marks using the Frictionless Data specification. Your feedback is very welcome.

Applying licenses, waivers or public domain marks

Applying licenses, waivers or public domain marks to data packages and data resources helps people understand how they can use, modify and share the contents of a data package.

In this guide, “license”, can be read as “license, waiver or public domain mark”.

It is recommended to apply a license to a data package. This license applies to all the data, files and metadata in the data package unless specified otherwise.

You can optionally apply a license to a data resource. This allows a license that differs from the data package license to be applied to the data resource. If a license is not specified, the data resource inherits the license from the data package.

Specifying a license

The Frictionless Data specification states that a license must contain a name property and/or a path property and may contain a title property.

You can specify the location of a license using a URL or a Path.

Specify a license using a URL

To specify a license using a URL, use the fully qualified HTTP address as the value in the path property, e.g.

"licenses": [{
  "path": "https://cdla.io/sharing-1-0/",
  "title": "Community Data License Agreement – Sharing, Version 1.0"
}]

Specify a license using a Path

To specify a license using a path, use a relative POSIX path to the file in the data package as the value in the path property, e.g.

"licenses": [{
  "path": "LICENSE.pdf"
}]

In this example, LICENSE.pdf would be in the root of the data package folder, e.g.

folder
  |- datapackage.json
  |- LICENSE.pdf
  |- README.md
  |- data
      |- data.csv
      |- reference-data.csv

It is recommended that the licence is provided in markdown format to simplify its display in data platforms and other software.

The license can be a separate file or included in the README.md file. If license information is included in the README.md file, it is recommended that it follows the guide for formatting a README file.

Applying a license

These scenarios apply to either the data package or a data resource.

  1. Apply an open license
  2. Apply a non-open license
  3. Apply a waiver
  4. Apply a public domain mark
  5. Do not apply a license

Other considerations:

Apply an open license

For an open license, use name, path and title, e.g.

"licenses": [{
  "name": "CC-BY-4.0",
  "path": "https://creativecommons.org/licenses/by/4.0/",
  "title": "Creative Commons Attribution 4.0"
}]

name must be an Open Definition license ID however note that some license IDs are placeholders or have been retired and should not be used, e.g. other-at, other-open, other-pd, notspecified, ukcrown-withrights.

Apply a non-open license

To apply an non-open license, use the path and optionally the title properties. It is preferred that the license is published at a URL (a fully qualified HTTP address), e.g.

"licenses": [{
  "path": "https://creativecommons.org/licenses/by-nc-nd/4.0/",
  "title": "Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)"
}]

If the license is not available at a URL, you can specify a license using a path.

Apply a waiver

You can indicate that copyright has been waived by referencing a waiver at a URL in the path property, e.g.

"licenses": [{
  "name": "CC0-1.0"  
  "path": "https://creativecommons.org/publicdomain/zero/1.0/",
  "title": "CC0 1.0"
}]

If the waiver is not available at a URL, you can specify a waiver using a path.

Apply a public domain mark

You can indicate that there is no copyright in the data or that copyright has expired, using the public domain mark or other public domain dedications, e.g.

"licenses": [{
  "path": "http://creativecommons.org/publicdomain/mark/1.0/",
  "title": "Public Domain Mark"
}]

If the public domain dedication is not available at a URL, you can specify the public domain dedication using a path.

Do not apply a license

If you have not decided what license to apply but still want to publish the data package, describe the situation in a file in the data package, e.g.

"licenses": [{
  "path": "README.md"
}]

Other considerations

Provide additional license information

It can be helpful to data consumers to provide additional copyright or attribution information such as:

  • copyright notice - this property allows a data publisher to specify a short copyright notice that can be used directly
  • copyright statement - this property allows you specify a URL for the copyright statement
  • preferred attribution text - the text to be used when attributing the creator of some data
  • attribution URL - this is the URL to be used when building an attribution link

This is explained in the ODI Publisher’s Guide to the Open Data Rights Statement Vocabulary and Re-users Guide to the Open Data Rights Statement Vocabulary.

Some licenses require that data consumers provide the copyright notice in the attribution (e.g. CC BY 4.0 Section 3).

Some data publishers may waive some of their rights under a license, e.g.

Noosa Wedding Locations data by Noosa Shire Council is licensed under a Creative Commons Attribution 4.0 licence.

Noosa Shire Council waives the requirements of attribution under this licence, for this data.

You can include this information, either:

  • in the file containing license information (e.g. README.md)
  • as metadata properties in the datapackage.json

The data package specification supports adding addition metadata properties to the datapackage.json, e.g.

{
  "name" : "coastal-data-system-near-real-time-wave-data",
  "title" : "Coastal Data System – Near real time wave data",
  "licenses" : [{
    "name": "CC-BY-4.0",
    "path": "https://creativecommons.org/licenses/by/4.0/",
    "title": "Creative Commons Attribution 4.0"
  }],
  "copyrightNotice": "© The State of Queensland 1995–2017",
  "copyrightStatement": "https://www.qld.gov.au/legal/copyright",
  "attributionText": "Science, Information Technology and Innovation, Queensland Government, Coastal Data System – Near real time wave data, licensed under Creative Commons Attribution 4.0 sourced on 26 December 2017",
  "resources": [
    {
      "path": "https://data.qld.gov.au/dataset/coastal-data-system-near-real-time-wave-data",
      ...
    }
  ]
}

Copyright belongs to multiple parties

Sometimes data in a resource may be combined from multiple sources that are licensed in different ways. You can indicate this by placing two or more licenses in the licenses property. Further explaination should be given in the README.md.

"licenses": [{
    "name": "ODC-PDDL-1.0",
    "path": "http://opendatacommons.org/licenses/pddl/",
    "title": "Open Data Commons Public Domain Dedication and License v1.0"
  },
  {
    "name": "CC-BY-SA-4.0",
    "path": "https://creativecommons.org/licenses/by-sa/4.0/",
    "title": "Creative Commons Attribution Share-Alike 4.0"
  }]

License may become legally binding

The specification for licenses states:

This property is not legally binding and does not guarantee the package is licensed under the terms defined in this property.

A data package may be uploaded to a data platform and the licenses applied to the data resources and publicly displayed. This may make the license legally binding. Please check your specific situation before publishing the data.

Software may not fully support the Frictionless Data specification

Be aware that some data platforms or software do not fully support the Frictionless Data specification. This may result in license information being lost or other issues.

Always test your data publication to ensure you communicate the correct license information.

As examples:

  • CKAN Data Package extension:

    • does not upload the README.md file in a data package. If you have described licence information in the README.md file, this will be lost (issue #60)
    • does not display license information in the datapackage.json file correctly (issue #62)
  • Data Curator only allows the user to select from a limited set of open licenses to describe the data package and data resource licenses.


#2

how come wtfpl is not an open license?


#3

To become listed as an Open Data and Open Content license, it needs to be submitted for assessment http://opendefinition.org/licenses/process/

To be listed as an Open Source license it needs to go through a different process https://opensource.org/licenses

The WTFPL license could be used in a data package, without going through any assessment, like:

"licenses" : [{
    "path": "http://www.wtfpl.net/txt/copying/",
    "title": "DO WHAT THE ■■■■ YOU WANT TO PUBLIC LICENSE Version 2"
  }],

Assessment will give it a name property at http://licenses.opendefinition.org

I’ve used the title from the web page as the title for the license. I’m not sure if I need to keep the loud capitalisation. The forum software has blocked out the naughty word.


#4

#5

Published…

https://frictionlessdata.io/docs/applying-licenses/


#6

Interoperability with SPDX

@Stephen Hello. Just wondering what the overlap is between a Linux Foundation SPDX identifier and an Open Definition license ID? The identifiers I saw looked identical. So is there some coordination involved between the two systems? Sure there are some holes in the SPDX list as far as open data licenses are concerned, but these would not be hard to fix. I don’t really understand why there is not a single coordinated approach? With best wishes, Robbie.


#7

Hi @robbiemorrison I’m not the best person to comment on the overlaps between SPDX and the Open Definition license ID but I do note on the repository that there has been attempts to coordinate e.g. https://github.com/okfn/licenses/issues/67

Perhaps @rufuspollock or @mlinksva can comment further


#8

Hello @Stephen. Thanks. I am also interested in two other aspects. Automated parsing of license information in both the software and dataset contexts. For just one example, the FSFE reuse project. Now I am not suggesting the same tools will generalize to both code and data.

The other area is license compatibility. Below is a diagram I drew recently which admittedly needs further development and checking. Any feedback gratefully received.

 

Open data and database license and dedication compatibility

Source: Morrison, Robbie, Tom Brown, and Matteo De Felice (10 December 2017). Submission on the re-use of public sector information: with an emphasis on energy system datasets — Release 09. Berlin, Germany. Published under a Creative Commons CC BY 4.0 license.

There are licenses missing from the diagram such as: ODC-By, CDLA-Sharing, CDLA-Permissive.


#9

License compatibility is not my strong point but based on my understanding your drawing looks correct.

I’m wondering if your question may get a better audience in the Open Definition category. My guide (above) was specifically for applying licenses to Frictionless Data. Let me know if you need help to move your question.


#10

Hello @Stephen Thanks. I am also interested in the mechanics of applying license to frictionless data. See this topic for example. Now I know that has some errors of detail in relation to the frictionless specification. I will migrate my question as you indicate but will do so in a fortnight when I have the time to track the responses. Best. Robbie


#11

The GFDLv1.3->GPLv3 arc is incorrect: they are not compatible.


#12

Hello @mlinksva Thanks for the feedback. Do you have reference for your remark, particularly in relation to GFDLv1.3. The Debian project decided that GFDLv1.2 was not inbound compatible: here is their reasoning:

We consider that the GNU Free Documentation License version 1.2 conflicts with traditional requirements for free software, since it allows for non-removable, non-modifiable parts to be present in documents licensed under it. Such parts are commonly referred to as “invariant sections”, and are described in Section 4 of the GFDL.

Wikipedia also reports that:

The Free Software Foundation-recommended GNU Free Documentation License [7] is incompatible with the GPL license, and text licensed under the GFDL cannot be incorporated into GPL software.

But [7] does not support that statement as best as I can tell. I can update the diagram, but would prefer a reliable secondary source which supports that change. Otherwise I will attempt to hunt down a source that supports the diagram as it currently stands. Thanks again.


#13

Debian’s decision is not about license compatibility, but compatibility with their standards.

As for a source on license incompatibility, it’s in the text of the GFDL itself (emphasis added):

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it.


#14

Hello @mlinksva Version 1.3 of the GFDL also says:

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.

That statement does not support either view, but rather recommends using a software license for substantive software. I have some other contacts that I can ask for their opinion and will repost here in a few days time.


#15

The statement supports the view that GFDL is not compatible with GPL. A parallel license is necessary to permit use in free software. If GFDL was GPL compatible, the recommendation would not be necessary, as permission to use material from a GFDL work would have already been granted.