License compatibility as imperative?

My thanks to the two respondents. I am a little slow in replying, having been on holiday. On the CC0‑1.0 versus CC‑BY‑4.0 question, I am relatively ambivalent. But in many cases, CC‑BY‑4.0 material will be mixed in and that license will then necessarily prevail. Interesting to note (as mentioned earlier) that a literature review suggests that those involved in scientific research favor CC0‑1.0 while those dealing with public interest information provision favor CC‑BY‑4.0 due to its improved ability to retain provenance — more here.

An earlier remark by @SimonPoole that the CC‑BY‑4.0 model distinguishes between existing and adapted material may work for prose but it does not work for data in general. The changes involved are normally too fine-grained to track in this manner. Moreover @SimonPoole writes:

The above statement is simply incorrect. The OGL‑UK‑3.0 states that CC‑BY‑4.0 is inbound compatible, hence the reverse cannot be true unless both licenses are legally equivalent — which they are not. Just one seemingly trivial condition will be sufficient to show why: the OGL‑UK‑3.0 has a choice of law provision and the CC‑BY‑4.0 does not. Indeed that one seemingly minor governing law clause is enough to prevent reverse compatibility. Arguments about how lax or otherwise different licenses might be are simply inadequate in this context — indeed the small details are material.

So that means that the OGL‑UK‑3.0 is a terminus license, meaning that once data is so licensed it cannot be returned to CC‑BY‑4.0 status. Moreover no other license (that I am aware of) will take OGL‑UK‑3.0 material, so that ends the path.
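The notion of a terminus license can be illustrated as a directed graph, where an edge from A to B means that material under A may be relicensed under B. The following is only a minimal sketch: the edge list is an assumption that encodes the reading argued in this post (which other participants dispute), and the names `RELICENSE_EDGES`, `reachable`, and `terminus_licenses` are hypothetical.

```python
# Hypothetical directed graph: an edge A -> B means "material under A may be
# relicensed under B". The edges are assumptions that encode the reading
# argued in this post only; they are not settled legal analysis.
RELICENSE_EDGES = {
    "CC0-1.0": {"CC-BY-4.0"},
    "CC-BY-4.0": {"OGL-UK-3.0"},
    "OGL-UK-3.0": set(),
}


def reachable(start, edges):
    """Return every license reachable from `start` by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def terminus_licenses(edges):
    """Return the licenses that have no outbound edge at all."""
    return {name for name, targets in edges.items() if not targets}


if __name__ == "__main__":
    print(terminus_licenses(RELICENSE_EDGES))
    print(reachable("OGL-UK-3.0", RELICENSE_EDGES))
```

Under these assumed edges, nothing is reachable from OGL‑UK‑3.0, which is what "terminus" is meant to convey. Changing a single edge changes the outcome, so the sketch is only as good as the legal reading behind each edge.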

Continuing to promote licenses like the OGL‑UK‑3.0 or CDLA‑Permissive‑2.0 will necessarily fragment the open data space due to legal incompatibilities. This has relatively little to do with the operational merits of individual licenses and everything to do with how they play together. Moreover I cannot understand how legal siloing could be considered remotely desirable in aggregate. Sure there will be some edge cases where additional constraints are necessary and bespoke terms of use are indicated. But my focus is on general-purpose public interest information, and for that the CC‑BY‑4.0 is perfectly adequate. That license has also been endorsed for data by a number of bodies, including the European Commission and the German energy network regulator.

I would like to put together a position statement explaining these issues for the Open Knowledge Foundation to consider as a matter of license approval policy. But I am only willing to draft such a statement if it would be considered with an open mind. Please let me know, OKF.


I believe that policy was authored in 2007, about five years before the data‑capable CC‑BY‑4.0 license was developed and released. Its relevance to this discussion is near zero, I am afraid to say.

The text of the OGL v3 states the following:

These terms are compatible with the Creative Commons Attribution License 4.0 and the Open Data Commons Attribution License, both of which license copyright and database rights. This means that when the Information is adapted and licensed under either of those licences, you automatically satisfy the conditions of the OGL when you comply with the other licence. The OGLv3.0 is Open Definition compliant.

So unless you are reading a different OGL v3 than I am, they are simply saying that OGL v3 licensed material can be distributed on CC BY 4.0 terms (which is exactly what I pointed out). Not the other way around, as you seem to be implying. This would “technically” be impossible as the OGL has no mechanism to maintain the copyleft aspects of all CC BY licenses.

The choice of law point is interesting, but I don’t believe that it in practice creates an incompatibility.

Hi @SimonPoole. That is really interesting. And we will be reading the same license text — I am using the one that the SPDX project links to. I interpret that statement you quote above in exactly the opposite way. That CC‑BY‑4.0 material can (at least by assertion if not by clause‑by‑clause analysis) be optionally mixed with and relicensed under OGL‑UK‑3.0. In the same way that MIT code can be optionally imported and relicensed under GPL‑3.0‑or‑later.

Furthermore, I am not sure why you regard CC‑BY‑4.0 as being copyleft. The ODbL‑1.0 is an example of a copyleft data‑capable license, for instance. And the choice of law clause in the OGL‑UK‑3.0 naturally adds restrictions that cannot be arbitrarily removed by any given reuser at will. At least not legitimately.

It is worth noting that a license can potentially claim to have inbound or outbound compatibilities that may not stand legal scrutiny. In that case, a court would need to decide which attribute to favor: the claim of directional compatibility or the conflicting terms. That is not something that I can do of course.

In my current exercise of determining directed compatibilities, I try to cite legal analysis wherever possible and will only use my own interpretations as an absolutely last resort and then mark those conclusions as strictly provisional.

So I stand by my earlier assertion that the OGL‑UK‑3.0 is a terminal license. With the proviso that I have some legal analysis on file that I have yet to work through.


Furthermore, I am not sure why you regard CC‑BY‑4.0 as being copyleft.

From the text

Section 2 – Scope.

5. Downstream recipients.

2. No downstream restrictions. You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material.

But not only that, it was clearly the intent of the drafters of the licence too.

Wrt your previous response note that section 4 of CC BY 4.0 clarifies how “Adapted Material” works for data.

My initial response is, as always, that classifying and contesting the attributes of individual licenses largely misses the point. Indeed I wrote this to the European Commission in a recent submission a month back (¶ 51):

The debate on the best choice of data license or set of licenses rarely looks at the question of legal interoperability — meaning can material under one licence be mixed with material under another license and republished. Rather, the merits of individual classes of license are debated and then the merits of individual licenses. This same discussion takes place within our [energy systems analysis] community too. But this approach tackles the problem from the wrong end.

That said, the terms‑of‑use of each particular license are highly material — in relation to what users may do and how that material might be legitimately combined and distributed with material under other forms of public license.

To return to the question posed by @SimonPoole of whether the CC‑BY‑4.0 license can or should be considered copyleft. On first glance, the CC‑BY‑4.0 and CC‑BY‑SA‑4.0 licenses are typically regarded as “permissive” and “copyleft” respectively. While noting that that terminology was developed to classify software licensing — and that Creative Commons instead use the loosely related terms “attribution” and “share‑alike”. And in the software domain, copyleft licenses were designed to keep the covered code forever within the software commons — while permissive licenses intentionally allowed the covered code and any local improvements to migrate into proprietary software without the need to additionally reveal and return those improvements. If you wish, that latter action being a form of permitted enclosure, at least obliquely so.

So the broader question is essentially this: does the CC‑BY‑4.0 license force retention in the information commons? And conversely, does this particular license prevent downstream use in proprietary products? And the answers are yes and no, respectively. Some may consider those responses perverse perhaps? But one should also note that software can potentially exist in source and compiled forms and there is clearly no equivalent to binary‑only distribution for content (“binary” used here in the sense of executable files).

Stepping back, I agree with @SimonPoole that §2.a.5.B imposes copyleft‑like obligations in a similar fashion to the way that the GNU General Public License (GPL) family prohibits additional restrictions. I am going to ask about this clause elsewhere and will report back if I discover anything useful. Also to note that I read this earlier posting carefully and found it particularly instructive.

Regarding specific downstream use‑cases, deeming the CC‑BY‑4.0 to be inbound compatible with the United Kingdom government OGL‑UK‑3.0, as that license does, would seem to contradict §2.a.5.B. We talked earlier about choice of law provisions in this regard. And indeed I observed that deemed interoperability could well be in conflict with the respective terms‑of‑use. That said, importing to OGL‑UK‑3.0 is also a use‑case of limited interest to me, but it will matter for those working with United Kingdom public sector information. Similar claims of inbound‑compatibility are implied by the Linux Foundation in relation to their recent CDLA‑Permissive‑2.0 license. One certainly wonders how those compatibility assertions could have survived analysis by crown law offices and corporate legal departments.

Also needing examination are the notions of “licensed” and “adapted” material, as defined in CC‑BY‑4.0 under §1.f and §1.a respectively. I guess these notions derive from the free software presumption that every contributor retains their own copyright and the so‑called inbound=outbound precept such that the same license implicitly applies. Chestek (2017) examines these doctrines and argues that they should be abandoned in favor of a joint authorship doctrine. Nor has the inbound=outbound concept ever been tested in court, so its legal status remains uncertain in respect of computer code at least.

Regarding data specifically, the CC‑BY‑4.0 under §4 covers only 96/9/EC databases. Much of the data‑related material passed around in my community is just simple‑minded datasets (ranging from ASCII lists to HDF5 high‑performance storage formats) without the necessary seek functionality to attract database protection. Moreover, my community is starting to toy with semantic web technologies and the interaction between 96/9/EC and those technologies will doubtless be a nightmare.

As an aside, the 96/9/EC database directive is currently under review by the European Commission. My pick is that database protection within the European Union will be removed in due course, possibly as soon as next year.

To close, my interest in data licensing is to go no more stringent than CC‑BY‑4.0. So the prohibition on significant downstream restrictions in §2.a.5.B is of no direct interest to me or most of my community.

That said, the free software world has been down the track of license proliferation. It would be very sad to see the OKF do likewise — by effectively promoting new licenses that create Open Definition-compliant legal silos because no one sought to fully analyze the wider context. Is that really the world that open data advocates would like to advance?

Finally, my earlier offer to the OKF to draft a position paper still stands. I remain concerned about no‑retreat or terminus licenses being applied to data and, in particular, the approval of new licenses that fall into this camp. Could the OKF respond either way? The ball is in your court!

References

Chestek, Pamela S (2017). “A theory of joint authorship for free and open source software projects”. Colorado Technology Law Journal. 16: 285–326. Open access.


Hi @robbiemorrison. Thanks for the in-depth explanation of the issue. I completely agree that the open data community should avoid license proliferation and I support your idea to draft a position paper about it. I could even review it if I may.

About the review of the 96/9/EC database directive by the European Commission you mentioned, could you please show me any references about that? The Brazilian Lei de Direito Autoral has a similar provision in Article 7, XIII, and a repeal of the database directive by the EU would go a long way to push advocacy in Brazil to do something similar. That would create a simpler environment for data publishers to do the necessary prior legal clearance and an ecosystem that facilitates data reuse.

Hi @herrmann, the review of the 96/9/EC database directive is part of consultation on a proposed Data Act (this would still be a “Bill” under United Kingdom idiom):

The main background document is this:

The next round of public consultation closes on Friday 3 September 2021, see here. An earlier submission I coordinated is now up on Zenodo:

For some insight into the definitional differences between datasets and legally‑protected databases in relation to German law, please see:


For reference, here is an extract from my inquiry on a legal forum covering mostly open source software:


Following discussions on the Open Knowledge Foundation (OKF) discussion forum (see here), an interesting legal conundrum has arisen for open data licensing.

At least one open‑data‑capable license claims material licensed CC‑BY‑4.0 is inbound‑compatible. Indeed this consideration seems to be a relatively common design criterion. That and other licenses include:

  • United Kingdom government OGL‑UK‑3.0
  • Linux Foundation CDLA‑Permissive‑2.0 (albeit based on hints in the LF press release because not yet listed on the SPDX site and nor has the underpinning legal analysis been made public)

I’ll use the OGL‑UK‑3.0 as an example as the details are fully accessible and a single illustration is sufficient in any case. This discussion is about importing datasets under CC‑BY‑4.0 licensing into other licensing regimes — but excludes objects under or potentially under 96/9/EC database protection for the sake of simplicity.

The OGL‑UK‑3.0 claims that CC‑BY‑4.0 material is inbound-compatible (not possible to cite clauses because nothing is numbered in the legal text) (emphasis added):

These terms are compatible with the Creative Commons Attribution License 4.0 and the Open Data Commons Attribution License, both of which license copyright and database rights. This means that when the Information is adapted and licensed under either of those licences, you automatically satisfy the conditions of the OGL when you comply with the other licence.

In addition, the OGL‑UK‑3.0 provides for a choice of law (emphasis added) — which the CC‑BY‑4.0 does not:

This licence is governed by the laws of the jurisdiction in which the Information Provider has its principal place of business, unless otherwise specified by the Information Provider.

Turning to CC‑BY‑4.0, section §2.a.5.B states that additional restrictions are prohibited (emphasis added):

No downstream restrictions. You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material.

My argument is that constraining the choice of law is a non‑trivial restriction. So although the OGL‑UK‑3.0 claims inbound‑compatibility from CC‑BY‑4.0, there is at least one term‑of‑use that would preclude this action. And there may also be other provisions in conflict too — I did not examine the two licenses side‑by‑side in detail.

So if my analysis stands, the OGL‑UK‑3.0 license will naturally create its own isolated silo. And if my analysis is wrong, the OGL‑UK‑3.0 creates a terminus license in the sense that material transferred from CC‑BY‑4.0 cannot be returned to CC‑BY‑4.0 for more general usage.


Addendum for clarity

In general:

  1. for collections of data under or potentially under copyright and related rights to be inbound‑compatible, the license on the inbound material must be no more onerous than the license on the receiving material in every respect
  2. if the license on the inbound material additionally prohibits further restrictions, then the only way that point 1 may be satisfied is if the inbound and receiving licenses are legally identical in every regard

Moreover, a 96/9/EC database is automatically a collection of data so that point 1 is exhaustive.

In some senses, point 2 could be read as a potentially onerous provision under the rationale of point 1 — but because it is entirely non‑specific, it is necessary to separate it out and accord it its own statement of logic.
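Points 1 and 2 can be expressed as a small check. This is a sketch under stated assumptions rather than legal analysis: each license is reduced to a set of illustrative restriction labels, "legally identical" is approximated by equal label sets, and the labels assigned to CC‑BY‑4.0 and OGL‑UK‑3.0 below simply reflect the argument made in this inquiry. The names `License`, `inbound_compatible`, `CC_BY`, and `OGL` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    # Illustrative restriction labels (assumptions), not actual legal terms.
    restrictions: frozenset
    # True if the license forbids adding further restrictions downstream.
    prohibits_further_restrictions: bool = False


def inbound_compatible(inbound, receiving):
    """Apply points 1 and 2 to decide if `inbound` may enter `receiving`."""
    # Point 1: the inbound license must be no more onerous than the
    # receiving license, modelled here as a subset test on restrictions.
    if not inbound.restrictions <= receiving.restrictions:
        return False
    # Point 2: if the inbound license prohibits further restrictions,
    # the two sets of restrictions must coincide exactly.
    if inbound.prohibits_further_restrictions:
        return inbound.restrictions == receiving.restrictions
    return True


# Assumed labels following the argument in this inquiry.
CC_BY = License("CC-BY-4.0", frozenset({"attribution"}), True)
OGL = License("OGL-UK-3.0", frozenset({"attribution", "choice_of_law"}))

if __name__ == "__main__":
    print(inbound_compatible(CC_BY, OGL))  # fails on point 2
    print(inbound_compatible(OGL, CC_BY))  # fails on point 1
```

With these assumed labels the check fails in both directions, which is the isolated-silo outcome described above. A different assignment of labels, for instance treating choice of law as immaterial, would give a different answer.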

Based on the above reasoning therefore, the United Kingdom government might well be advised to favor the CC‑BY‑4.0 license over the OGL‑UK‑3.0 license for the public interest information it releases and for the scientific research outputs it funds.


I wouldn’t hold my breath. My impression from a meeting with the team working on the Data Act a fortnight ago is that this is not on the table (though some tweaking may be possible).

I interpret that statement you quote above in exactly the opposite way. That CC‑BY‑4.0 material can (at least by assertion if not by clause‑by‑clause analysis) be optionally mixed with and relicensed under OGL‑UK‑3.0.

No, @simonpoole has the correct interpretation here.

OGL is essentially a publishing licence. It is designed for the release of Government documents and data. Incorporating other data into OGL-licensed works is not a key consideration of its design. Maximising reuse of OGL-licensed works is.

Fairly obviously, no licence can unilaterally waive the provisions of another licence, and I cannot believe that the authors of OGL would assert that it can.

Sure my comments were speculation. And could well be wishful thinking on my part. One very useful provision would be to explicitly exclude public sector information. Giannopoulou (2018) provides useful analysis in this regard:

To quote from that chapter (standalone PDF p5, printed book p106):

The Database Directive does not clearly indicate the exclusion of public databases that fall under the PSI Directive from qualifying for the sui generis protection. In principle, since public sector databases are not excluded, branches of state power can benefit from the sui generis right protection when they fulfill the conditions [36]. Absent an ECJ decision, however, courts from some Member States have ruled against the possibility of public bodies asserting sui generis database rights. Namely, courts in Italy and Germany have held that even if public sector databases qualify for the protection, they should be exempt from it.[37] The highest administrative court in Amsterdam has held that the City of Amsterdam cannot hold sui generis rights on a database even if it has made a substantial investment towards its creation because the has not borne the risk for the investment in question. [38] Thus, it cannot impose limitations or charges in the reuse of that database. Finally, French law has been amended [39] to clarify that public bodies cannot invoke a sui generis right in order to refuse the reuse of their data.


Meaning that the OGL‑UK‑3.0 is either strictly less onerous or strictly equivalent to the CC‑BY‑4.0.

There are two lines of argument in play here. One is contextual — based on the stated purpose of the instrument and the caliber of those drafting the legal text. And the other is analytical — based on a comparison of the respective terms‑of‑use and any formal claims of interoperability. I adopted the latter approach and came to different conclusions. As I indicated elsewhere, I normally try to refrain from doing my own legal analysis. But in this case, I elected to fly a kite. I will check with some media lawyers I know and report back if anything notable arises.

Nor did I ever argue that.

Also worth adding, as @SimonPoole and @systemed opine, that the OGL‑UK‑3.0 being inbound‑compatible to the CC‑BY‑4.0 would suit me better. And very certainly my United Kingdom colleagues.

There is another point that I would welcome clarification on. Can anybody apply the OGL‑UK‑3.0 to any class of information (aside from source code of course)? Or are there restrictions on the licensor and on the type of material? After all, the formal title for the license is the “Open Government Licence for public sector information”.

Note that wrt OGD the “new” EU open data directive limits the scenarios in which restrictive licensing terms can be used by public bodies and suggests that sui generis database rights should not be invoked. In practice this naturally all depends on the transformation into national law, but at least in some cases (for example Germany) this has led to a general prohibition on using sui generis database rights in such scenarios. I suspect that there is still considerable wiggle room though.

That’s useful. §1.6 of directive (EU) 2019/1024 states:

The right for the maker of a database provided for in Article 7(1) of Directive 96/9/EC shall not be exercised by public sector bodies in order to prevent the re-use of documents or to restrict re-use beyond the limits set by this Directive.


Can anybody apply the OGL‑UK‑3.0 to any class of information (aside from source code of course)? Or are there restrictions on the licensor and on the type of material?

There are no restrictions, although the licence is expressly intended for public sector use. The National Archives page about the OGL is helpful and also clarifies the compatibility issue (“The OGL terms are compatible with the latest versions of the Creative Commons Attribution License and the Open Data Commons Attribution License. This means that when the information is adapted and licensed under either of those licences, you automatically meet the conditions of the OGL as long as you comply with the terms of the other licence.”)

There is some limited use outside Government - for example, data published by the charity Sustrans.


@systemed The National Archive wording you quote is also provided in the license text of the OGL‑UK‑3.0, but with version numbers cited for the other licenses. I gave my interpretation earlier. And that clearly differs from that by you and @SimonPoole. So I will endeavor to seek more information. Thanks for the other clarifications.

On reflection, the key problem lies with the phrase “automatically satisfy the conditions of the OGL” in the OGL‑UK‑3.0 license text (emphasis added). Moreover the direction of compatibility is unfortunately not specified in that passage either.

And compatibility cannot be two way — one materially different term‑of‑use will stymie that characteristic. And the choice of law provision in the OGL‑UK‑3.0 does just that. In any case, no one is arguing that the two instruments are legally identical.

A public license is made up of “terms‑of‑use” or, equivalently, “conditions” that may be either “permissions” or “restrictions”. That is precisely why I used the phrase “no more onerous” in my earlier analysis, because these two camps do not behave in the same way.

Essentially, additional permissions allow the user to do more and additional restrictions necessarily mean that the user is more constrained. That is pretty obvious.

If the OGL‑UK‑3.0 terms‑of‑use that differ are strictly more permissive than the CC‑BY‑4.0, then the OGL‑UK‑3.0 could potentially be inbound‑compatible to the CC‑BY‑4.0. But the choice of law provision in the OGL‑UK‑3.0 means that further restrictions must apply as well. In which case:

If the OGL‑UK‑3.0 terms‑of‑use that differ are strictly more restrictive than the CC‑BY‑4.0, then the CC‑BY‑4.0 could potentially be inbound‑compatible to the OGL‑UK‑3.0. But then CC‑BY‑4.0 section §2.a.5.B activates and prevents that scenario.

So that is stalemate!

If the passage on compatibility in the OGL‑UK‑3.0 does conflict with actual terms‑of‑use, it is impossible to anticipate whether a court would favor the conflicting terms‑of‑use or the commentary on compatibility. In any case, they would need to adjudicate on that matter. My guess is that individual terms would trump commentary and interpretation, but that is pure speculation on my part. Moreover, neither of the matters I raise would be considered minor. Both governing law and a prohibition of further restrictions can only be major terms.

Moreover, a data analyst who uses publicly licensed material in conflict with the terms‑of‑use of the prevailing license forfeits the entire license and becomes duly liable for copyright infringement. It is also worth noting that much of this data will be used by researchers working for risk‑averse institutions. Moreover, a lack of prior judicial assessment will not prevent civil litigation.

Returning to the question at hand, an analysis must be undertaken using “degree of onerousness” arguments based on term‑by‑term analysis of the two licenses.

Is anyone aware of that kind of detailed analysis being undertaken and, in particular, written up in the academic literature? Moreover, I am not interested in hand‑waving arguments as to where the OGL‑UK‑3.0 was pitched and what the quality of its underpinning legal advice might be. But I would be interested in seeing that advice. Is it likely to be publicly available?

My earlier conclusions largely stand. I now provisionally conclude that material under CC‑BY‑4.0 cannot be inbound‑compatible with the OGL‑UK‑3.0 due to §2.a.5.B. And that material under OGL‑UK‑3.0 cannot be inbound‑compatible with the CC‑BY‑4.0 due to choice of law provisions. So that then creates the perfect legal silo with the OGL‑UK‑3.0 essentially residing in splendid isolation.

I am not asking people to agree with my analysis. I have no legal training and limited knowledge of English law. What I am asking is that these matters be acknowledged and tackled. And if a version 4.0 of the OGL is required to address these problems, then so be it. The other option would be for the United Kingdom to favor CC‑BY‑4.0 as the European Commission has largely done. And the CC‑BY‑4.0 is a genuinely international license.

My sole interest is data interoperability. We need that interoperability in order to confront the myriad of problems we collectively face, both large and small.


I captured my earlier analysis in the following diagram. I believe this to be a serious issue. And I note that I have not received any substantive counter‑argument thus far. That said, this analysis is entirely provisional and the information given should not be relied upon under any circumstances. R


“You’re welcome!” It was my pleasure to help you.