Discussion on data provenance for journalists and NGOs

That seems like a workable path, but I dislike the somewhat arbitrary split between data and metadata: knowing where a given fact comes from is data as well (and it shouldn’t be easy to change the data but not the metadata…). If you were to create a fully sourced table, your metadata file would end up several times as large as your actual data file.

Perhaps it makes more sense to think up a CSV format that holds statements, i.e. to push the problem up a layer in the stack. (And yes, I don’t know what the difference between CSV statements and N-Quads would be… I’ve really been infected.)
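
To make the idea a bit more concrete, here is a rough sketch of what a statement-per-row CSV could look like, and how a normal table could be derived from it. The column names (`entity`, `property`, `value`, `source`), the identifiers and the example.org URLs are all made up for illustration; none of this is an existing standard.

```python
import csv
import io

# Hypothetical layout: one row per statement, with the source carried
# alongside the fact instead of in a separate metadata file.
STATEMENTS_CSV = """entity,property,value,source
acme-ltd,name,ACME Ltd,https://example.org/registry/123
acme-ltd,country,GB,https://example.org/registry/123
acme-ltd,director,Jane Doe,https://example.org/leak/page-7
"""


def read_statements(text):
    """Parse the CSV text into a list of dicts, one per statement."""
    return list(csv.DictReader(io.StringIO(text)))


def to_table(statements):
    """Group statements as entity -> property -> [(value, source), ...]."""
    table = {}
    for s in statements:
        props = table.setdefault(s["entity"], {})
        props.setdefault(s["property"], []).append((s["value"], s["source"]))
    return table


statements = read_statements(STATEMENTS_CSV)
table = to_table(statements)

# Every value in the derived table still knows where it came from.
for prop, values in table["acme-ltd"].items():
    for value, source in values:
        print(f"{prop}: {value} (from {source})")
```

Which, admittedly, is pretty much subject, predicate, object and graph label with commas in between, so the N-Quads comparison stands.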
