GDPR vs. (part of) Open Data

As part of “day to day” Open Data operations in Slovakia we recently stumbled upon the topic of “GDPR vs. Open Data”. Statements like “GDPR will kill your Open Data” or “Unofficially, personal information regulators think Open Data is illegal” were overheard.

We can discard such statements as “unofficial” and “FUD”. But some “risk” has already been identified (see the use-case example below). So we would like to discuss and share know-how, tips and tricks with others around the EU (or world-wide), since GDPR is not Slovak-specific.

To make the debate more grounded, focused and real, here’s a quick summary of what we (=Slovak Open Data initiative) have concluded so far:

  1. GDPR has mostly no effect on Open Data (see point 3 as to why)
  2. GDPR relates to Open Data mostly because both proper Open Data publishing and protection of personal data require (among other things) proper data management, some ETL (anonymization, …), etc.
  3. The most contentious item (GDPR vs. Open Data) so far is the “exceptional” Open datasets which do contain personal information, for example Business Registry or Land Ownership data.
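To make the “some ETL” in point 2 a bit more concrete, here is a minimal Python sketch of one possible publishing step. Everything in it is an assumption for illustration: the field names (`owner_name`, `owner_birth_date`), the function `pseudonymize` and the sample record are hypothetical and not taken from any real registry. Note also that it shows pseudonymization (a keyed hash), which is weaker than true anonymization: as far as we understand, pseudonymized data still counts as personal data under GDPR.

```python
import hashlib
import hmac

# Hypothetical field names; a real registry export will look different.
PERSONAL_FIELDS = ("owner_name", "owner_birth_date")


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with personal fields replaced by keyed hashes."""
    cleaned = dict(record)  # shallow copy, the input record is left untouched
    for field in PERSONAL_FIELDS:
        value = cleaned.get(field)
        if value is None:
            continue
        # Keyed hash (HMAC-SHA256): same input + same key gives the same token,
        # so records can still be linked without exposing the original value.
        digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
        cleaned[field] = digest.hexdigest()[:16]
    return cleaned


# Made-up sample record, for illustration only.
record = {
    "company": "Example s.r.o.",
    "owner_name": "Jana Nováková",
    "owner_birth_date": "1980-01-01",
}
published = pseudonymize(record, secret_key=b"keep-this-out-of-the-dataset")
```

Of course, for registries where the names are the whole point (see point 3), a step like this removes exactly the value people are after, which is why those datasets are the contentious ones.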

So, a few more points on those “exceptional” Open datasets which do contain personal information:

  1. As mentioned above, we believe those are a minor part of the total set of (Open) data there is.
  2. But those tend to be of higher interest/value (precisely because they do contain names, etc.)
  3. But as those were published with a certain purpose in mind (in general, usually, because of “public interest”), we believe their processing/usage should not be in any way affected by GDPR.
  4. But the problem as of now is that agencies responsible for protection of personal information seem to think (assumed based on what we know from them directly or indirectly so far) that such datasets do not have any exemption from GDPR, thus users of such Open Data have to produce the usual paperwork (and other stuff) which is required from “data controllers”.

The desired outcome for the Open Data community would be a state where it is clear that even though a dataset does contain some personal information, the Open license is still valid and thus users of that data are able to use and re-use it “as usual”.

A typical worst-case scenario we would like to avoid:

  1. The Slovak Business Register publishes data about companies and their owners as Open Data.
  2. Some guys create a nice service based on that, and create a company around it once the service proves interesting and commercially viable.
  3. The Personal Data authority then steps in and tells them to either explain why they are processing names of company owners (and prove they have some relationship to all of them, etc.; we do not fully understand what other stuff the authority may require) or cease/delete the names right away (and pay a fine, etc.)
  4. That sets an example country-wide (“legal uncertainty”, “high costs/difficulty in starting the project”, etc.) and we “fast rewind” back a few years and function/live as if no Open Data from the Business Register exists.

That would essentially defeat the purpose of making such data public and publishing it as Open Data (what is the purpose of having some data freely available on the Internet if you can’t legally make a copy and work with it?)

(We can also use an “NGO fighting corruption” in that example; it will still be almost the same. But NGOs can at least “hide” to some degree as “journalists” in that case.)

Side note: I see last year there was something written in that regard in New Organization Request by melina_t but since that one is locked, I do not know what.

8 Likes

The main issue here seems to be that the core value of Open Data lies in it being licensed under Creative Commons-like licenses, i.e. with no specific purpose attached. Publishing and reusing public data that contains personal data (like company owners, beneficial ownership, land registries) and brings great transparency and added value seems to be very problematic under GDPR.

How are opencorporates.com and openownership.org dealing with this?

related: GDPR and ICO considerations - Data Issues - Companies House Developer Forum

1 Like

Thanks for posting this @hanecak - really useful and great to have this discussion.

Just realized that this might even affect future Global Open Data Index ratings for all EU countries. Spending & Procurement data (vendors/suppliers can be physical persons in some cases), company registries and land ownership data can all contain personal info. Under GDPR (effective from 25 May 2018) it is unclear if such data can be freely reused.

I’m not sure much of this is about GDPR, though. Don’t we have these difficulties under existing law?

Most open data is non-personal. But if personal data is published in an open dataset the licence will not exempt the re-user from their responsibility to comply with legal requirements for data protection.

Another side note regarding “defeating the purpose of Open Data”: in cases of a clash with GDPR, the “chilling effects” described in your (OKFN’s) “Avoiding data use silos – How governments can simplify the open licensing landscape” report (New Report: Avoiding data use silos – How governments can simplify the open licensing landscape – Open Knowledge Foundation blog) will happen, given that GDPR will add a huge set of complex rules on top of an open license.

If I understood “the message” from authorities so far, then yes, you are right. According to some, the main novelty of GDPR is that now they are going to actually enforce the rules (which were in force previously), because previously those were not enforced that much (you could just buy some bogus paperwork and be compliant; or simply ignore it, see http://nocookielaw.com/ ).

Depends.

Example: We have a companies registry in Slovakia that publishes statutory bodies (mostly persons). Under current legislation, if you have an open license for this dataset (we have), you can do whatever you want with the data (assuming you don’t break any laws, e.g. spamming is prohibited, etc). But… under GDPR this is a completely different story: you need to show that you are allowed to process personal data (even published public domain data) for a specific purpose. That’s problematic:

  1. You definitely don’t have permission from all statutory bodies (persons) to do that. And it’s impossible that you will ever get that.
  2. You probably don’t have any idea what the original purpose of publishing the data was in the first place. It’s not stated in law anywhere.

Since this is moving quite slowly (or hopefully just silently), a little update from Slovakia:

  1. Filip Glasa from FinStat.sk (one of the good examples of start-ups founded on Open Data) took over and is preparing a legal analysis (of the GDPR and related laws, with lawyers). He is partially working with some Slovak authorities and he will publish and discuss his findings at a public workshop, see Nariadenie GDPR a otvorené dáta - Stretnutie s ÚOOÚ - problémové okruhy - #67 by filipglasa - Spolupráce so štátom a samosprávami - platforma.slovensko.digital .

  2. One of the authorities involved in Slovakia is USV ROS (Ministerstvo vnútra SR - Splnomocnenec vlády SR pre rozvoj občianskej spoločnosti) which is driving OGP in Slovakia, thus also Open Data. They are also involving the Office for Personal Data Protection (Úrad na ochranu osobných údajov Slovenskej republiky).

  3. AFAIK, USV ROS also asked some of its partner organizations in other EU countries for feedback on this.

  4. A little piece of maybe useful advice comes from CUZK (ČÚZK - Home): One of their representatives mentioned (during the ISSS 2018 conference) section 6 of chapter 3 in GDPR, based on which they think special laws should be created to resolve issues around personal information in public registries (and Open Data published from those). I did not have a chance to talk to them, so we’ll try to get in touch later. (Maybe useless reference: https://twitter.com/PHanecak/status/983311998887825408)

Next update will be after Filip’s workshop, i.e. after April 17th 2018.

2 Likes

Clarifying things like company registries is crucial here. Has anyone done any digging or asked any questions about GDPR and datasets like that?

The situation around whois seems related to this, as an infrastructural
dataset/service with utility in tackling corruption and fraud -

Facilitated by Open Data Services Co-operative
Wednesday 25th April from 1pm - 4pm
Location: Space4 in Finsbury Park.

Our agenda will include an overview of GDPR from Darren Wright of Inside Outcomes, followed by group discussions on the particular issues that arise of open data standards, and the challenges and opportunities of GDPR.

We aim to all be able to leave with key actions for meeting the GDPR, and ideas for next steps to continue to have the right balance of transparency and privacy embedded in our work.

A few points from the Trend+FinStat.sk workshop:

  • the workshop mainly targeted the use of personal information obtained from public registries via Open Data
  • from the point of view of a re-user (i.e. some service provider working with Open Data): the key is to show the legitimate interests and do the proportionality test, thus:
    • help from a lawyer is needed almost always
    • if something is already being done (“in production”) with all necessary “due diligence”, disastrous changes may not be probable just due to GDPR
  • there already is a lot of personal information out there, with legitimate use-cases (try googling for one of our former ministers, Mr. Drucker, and you’ll get even his date of birth), but the stance of our Personal Information Protection office is not fully clear and “in line” with the rest of us
  • the Slovak eGov and Open Data strategy is quite strong: it protects personal information yet also defines some exceptions in the case of Open Data
  • work from governments and the EC would still be very welcome, to clarify things, dispel myths, etc.

“Long story”: Open Data a GDPR - Projekt OpenData - Opendata (in Slovak and English, slides in photos in Slovak).

1 Like

Another update (based on hint from Malte Beyer over Twitter): EDPS opinion from 18 April 2012 (https://edps.europa.eu/sites/edp/files/publication/12-04-18_open_data_en.pdf):

Opinion of the European Data Protection Supervisor

2.3. PSI reuse under the current data protection framework

20. In particular, it is not easy to implement the principle of purpose limitation effectively in case of PSI reuse. On the one hand, the very idea and driving force for innovation behind the concept of ‘open data’ and PSI reuse is that the information should be available for reuse for innovative new products and services, and thus, for purposes that are not previously defined and cannot be clearly foreseen. On the other hand, purpose limitation is a key data protection principle and requires that personal data that have been collected for a specific purpose should not at a later stage be used for another, incompatible purpose, unless certain additional conditions have been met. It is not easy to reconcile these two concerns (open data and data protection).

5. CONCLUSIONS

  • require that an assessment be carried out by the public sector body concerned before any PSI containing personal data may be made available for reuse (Section 3.1);

Beware, I’ve picked bits and pieces, so I may have destroyed the main message of the whole document.

Reference: https://twitter.com/BeyerMalte/status/989405554991824897

3 Likes

And more feedback, this time from the European Data Portal:

with the subtitle “General Data Protection Regulation (GDPR) as a supporter for Open Data”

And yet another document from the European Data Portal (EUDP), this time about “revised PSI and GDPR”. A few quotes:

Protection of Data under the PSI directive

This primacy principle is explicitly recognised by the PSI Directive. In other terms, EU member states and PSI re-users must consider the principles and obligations of data protection law when applying or implementing the PSI Directive. This does not imply that PSI that contains personal data cannot be opened, it rather demands a thorough assessment under which conditions the opening is lawful.

The PSI Directive triple assessment

In order to support the opening of PSI while protecting personal data, the PSI directive established a triple assessment: (shortened)

  1. Determine whether the PSI contains personal data.
  2. Determine whether national access regimes restrict access to the PSI. If yes, the same restrictions apply to PSI publication and re-use as well.
  3. PSI containing personal data that is opened for re-use should only be processed in compliance to data protection law.
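For those who think better in code, the three questions above can be kept track of with something like the following Python sketch. It is only a checklist helper under loud assumptions: the `Dataset` class, its flags and the function `triple_assessment` are hypothetical names made up for this post, and the yes/no answers themselves still have to come from a human (most likely a lawyer), not from the code.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags, to be filled in by a human reviewer.
    name: str
    contains_personal_data: bool
    access_restricted_by_national_law: bool


def triple_assessment(ds: Dataset) -> list:
    """Collect the consequences of the three questions for one dataset."""
    notes = []
    # Question 1: does the PSI contain personal data?
    if ds.contains_personal_data:
        notes.append("contains personal data")
    # Question 2: do national access regimes restrict access?
    if ds.access_restricted_by_national_law:
        notes.append("access restrictions also apply to publication and re-use")
    # Question 3: opened PSI with personal data must be processed lawfully.
    if ds.contains_personal_data:
        notes.append("re-users must comply with data protection law")
    return notes


# Made-up example loosely modelled on the Business Register case in this thread.
register = Dataset(
    name="business-register",
    contains_personal_data=True,
    access_restricted_by_national_law=False,
)
notes = triple_assessment(register)
```

An empty list would mean none of the three questions flagged anything for that dataset; it says nothing about whether the flags were set correctly in the first place.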

I.e. we (the Open Data community) most probably won’t get a sort of “blank cheque”, and while PSI still pushes for data with personal info to be published as well (if/as needed/required, etc.), whoever is going to work with that data also needs to do their “GDPR homework” (i.e. not everybody will be able to do whatever with such data). This complicates matters, but might be a necessary sacrifice if we also truly mean it with privacy. IIRC. IANAL.

Source: https://twitter.com/EU_DataPortal/status/1082230423340634113

3 Likes

Thank you for the update, @hanecak!

Historically, “privacy” is used as a “get out of jail for free” card by folks who don’t want to do their homework with open data. The new open data/PSI directive is actually a chance to make the release of a number of datasets mandatory, and under GDPR a legal obligation to release some data is already a sufficient legal basis to process some private data, so most supposed issues with GDPR would immediately vanish.

For anyone wondering about the revised PSI directive, here are the links: