In this video from the ODI Lunchtime Lecture, Woodrow Hartzog from Stanford University compares various definitions of open data. He includes in the discussion different definitions from:
- Open Knowledge
- Open Data Handbook
- Open Definition
These should all be the same.
Perhaps the various OKI sites need some housekeeping so we provide a consistent definition.
(Discussion on different definitions starts about 15 mins into the video but the whole video is worth watching.)
I listened. Hartzog didn’t spend very much time on any of the definitions so I’m not sure he pointed out any inconsistencies among the ones on OKI sites. Does a table of those exist? Maybe you want to create such a thing @Stephen.
Far more interesting to me was Hartzog’s critique that open data does not permit privacy restrictions, is therefore unacceptable to regulators and the public, and is thus unsustainable. I’m not certain he completely appreciates the nuances of open data (as defined by the OD, or similar), particularly around not charging money (freedom, not price; it actually must be ok to sell open data) or around share-alike (which is not a template for other restrictions; its purpose is merely to prevent other restrictions from being added). But he probably does understand, and his critique is worth taking seriously in either case.
Around 45 minutes in, the host, Ellen Broad (I think) of ODI, clearly articulates why the OD and similar definitions focus on limiting the damage of property-based control over data, and why privacy has a different basis and is hard to bundle with property mitigations in one definition, let alone one license or similar legal instrument.
Hartzog backed up his argument with claims that “fire-and-forget data” is not very valuable and wanted open data advocates to back off of such data and thus become more comfortable with data which is not just dumped into the world, and which includes conditions of use around privacy best practices. I think Hartzog vastly overstates the uselessness of data dumps. One of the wins of open data is people using datasets for new purposes long after the researcher or entity that made the dataset is gone. At the end of Q&A someone asks about census data; Hartzog says it’s a hard problem.
Still, I think it’s very worthwhile taking Hartzog’s critique seriously and imagining how the OD might begin to address it. One beginning might be just to point out in the definition that use of data free of IP encumbrances (and thus OD) can be subject to other regulation and this doesn’t necessarily make it non-open.
Also toward the end Hartzog briefly mentions several technical, legal, and procedural protections for privacy in data sharing. If anyone has a relatively exhaustive list of these, I’d enjoy reading it and thinking about how each complements and/or conflicts with openness.
Thanks for posting the video Stephen!
The paper that supports the video is here.
Re: Inconsistencies. If you read the intro text on various sites rather than following the link to the Open Definition, you get:
From the OKI site (with a reference to the Open Definition)
‘Open knowledge’ is any content, information or data that people are free to use, re-use and redistribute — without any legal, technological or social restriction.
From the Handbook (with a reference to the Open Definition)
Open data is data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.
From the Open Definition home page
“Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).”
“Open data and content can be freely used, modified, and shared by anyone for any purpose”
Perhaps this is why they were identified as different sources of a definition for open data even though they all lead to the Open Definition.
Thank you for pointing this out Stephen!
I guess these inconsistencies are because we didn’t update some content on
our site for two years, and the handbook itself just got a cosmetic face
lift, but not a content upgrade, while the Open Definition did publish a
new version already. So I guess this is more of a case of just not paying
enough attention on our side. Also, to be honest, these versions do not
look inherently that different to me.
In any case, I can change these right away to be consistent. Should I just
change it based on the Open Definition 2.0?