Discussion on data provenance for journalists and NGOs

pudo · September 11, 2014, 9:05am

That seems like a workable path, but I dislike the somewhat arbitrary split between data and metadata: knowing where a given fact comes from is data as well (and it shouldn’t be easy to change the data but not the metadata…). If you were to create a fully-sourced table, your metadata file would be several times as large as your actual data file.

Perhaps it makes more sense to think up a CSV format that holds statements, i.e. to push the problem up a layer in the stack. (And yes, I don’t know what the difference between CSV statements and N-Quads would be… I’ve really been infected.)
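
To make the "CSV of statements" idea concrete, here is a rough sketch of what such a layout might look like: one row per fact, with the source stored alongside the value instead of in a separate metadata file. The column names (`subject`, `attribute`, `value`, `source`), the entity ids, and the example.org URLs are all made up for illustration; nothing here is an agreed format.

```python
import csv
import io

# Hypothetical fully-sourced table: every cell carries its own source.
# Entity ids, values, and URLs are invented for this sketch.
ROWS = [
    {
        "id": "acme",
        "fields": {
            "name": ("ACME Ltd.", "http://example.org/registry"),
            "employees": ("120", "http://example.org/report-2013"),
        },
    },
    {
        "id": "globex",
        "fields": {
            "name": ("Globex", "http://example.org/registry"),
            "employees": ("45", "http://example.org/press-release"),
        },
    },
]

# Assumed column layout for the statement CSV (not a standard).
STATEMENT_COLUMNS = ["subject", "attribute", "value", "source"]


def to_statements(rows):
    """Flatten each sourced cell into one statement dict."""
    for row in rows:
        for attribute, (value, source) in row["fields"].items():
            yield {
                "subject": row["id"],
                "attribute": attribute,
                "value": value,
                "source": source,
            }


def write_statements_csv(rows):
    """Serialise the statements as CSV text, one fact per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=STATEMENT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_statements(rows))
    return buf.getvalue()


def read_statements_csv(text):
    """Parse statement CSV text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    print(write_statements_csv(ROWS))
```

In this layout the provenance travels in the same row as the fact, so a value cannot be edited without its source sitting right next to it. The trade-off is that the file is no longer a plain wide table, and it does end up looking a lot like subject/predicate/object/graph quads.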
