CSVY: CSV + YAML


#1

Hello, not sure this is the right category but I wanted to share an interesting idea I found the other day: CSVY, that is, CSV files with a YAML front matter (à la Markdown): http://csvy.org/

So far it seems direct support is limited to the R rio package, but I think there may be some interesting points of contact re table metadata.


#2

@steko thanks for flagging - we have been chatting with Martin Fenner about this since he first proposed a version of it in 2014.

A very early version of Tabular Data Package actually supported putting metadata in the first line of the CSV - though we deprecated that in later versions.


#3

@steko I’ll add that @mfenner recently wrote a blog post about this as a follow-up to CSVConf where the ideal way of providing CSV metadata was a hot :fire: topic of discussion:


#4

Yes, I learned about CSVY from that blog post. So there’s some interest in the general idea but no agreement on the best practical solution …


#5

@steko I’m personally a bit sceptical about inlining metadata into your data files. There have been many proposals for it over the years. The challenges are:

  • You mess up the original data (it is no longer “pure” data) - this will break tools that are used to consuming e.g. plain CSV
    • This also makes “progressive enhancement” difficult, especially when done by third parties (i.e. not the original data creators) - e.g. gradually enhancing data with relevant metadata.
  • You need to agree a convention for your metadata and get that consistently into your data structure (not easy)
  • Data and metadata in one file means that data creators also need to understand metadata

Benefits:

  • data and metadata kept together
  • can use existing tooling e.g. a spreadsheet to add metadata
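
To make the first challenge concrete, here is a minimal Python sketch. It assumes a CSVY-style layout in which the YAML front matter sits between two `---` lines ahead of the CSV body. The sample content and the `split_front_matter` helper are made up for illustration and are not taken from any existing CSVY library. It only separates the two parts; parsing the YAML itself would need a YAML library.

```python
import csv
import io

# Hypothetical CSVY-style text: YAML front matter between '---' lines, then CSV.
SAMPLE = """\
---
name: example
fields:
  - name: id
  - name: value
---
id,value
1,a
2,b
"""


def split_front_matter(text):
    """Return (yaml_text, csv_text); yaml_text is '' when no front matter is found."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the delimiters is metadata, the rest is data.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # No closing delimiter: treat the whole input as data.
    return "", text


# A plain CSV reader knows nothing about the front matter,
# so the metadata lines come back as if they were data rows.
naive_rows = list(csv.reader(io.StringIO(SAMPLE)))

# Splitting first gives the reader only the tabular part.
yaml_text, csv_text = split_front_matter(SAMPLE)
clean_rows = list(csv.reader(io.StringIO(csv_text)))

print(naive_rows[0])
print(clean_rows[0])
```

With the naive read, the first "row" is the `---` delimiter and the header only turns up several rows later, which is the kind of breakage a tool expecting plain CSV would hit. Once the front matter is split off, the header row comes first again.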