Shared table schema

herrmann · July 16, 2018, 1:31pm

Suppose I have a Tabular Data Package with a bunch of CSV files, all of which should use the same schema. Is there a way to share a single schema for validating all of them? Ideally, I would write the schema only once in the datapackage.json file and have it applied to all of its resources.

vitorbaptista · July 17, 2018, 5:10pm

Hi @herrmann,

As we talked over Telegram, the way to reuse table schemas is to create them in a separate JSON file (instead of inside the datapackage.json) and reference them. Instead of:

// datapackage.json
{
  "resources": [{
    "name": "data",
    "schema": {
      // The table schema
    }
  }],
  // ...
}

You’d have:

// datapackage.json
{
  "resources": [{
    "name": "data",
    "schema": "tableschema.json"
  }],
  // ...
}

// tableschema.json
{
  // The table schema
}

So you can simply add "schema": "tableschema.json" to every resource that shares the same schema. There’s an example of this in the Tabular Data Resource specification (Frictionless Standards), but we could probably improve it.
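
To make the lookup concrete, here is a minimal sketch of how a tool might resolve such a reference. It is not the Frictionless library's implementation: the function name `load_resource_schemas` is made up for illustration, and it assumes the string is a local path relative to datapackage.json (it does not handle URLs).

```python
import json
import tempfile
from pathlib import Path


def load_resource_schemas(package_path):
    """Return {resource name: schema dict}, loading schemas given as file paths.

    Hypothetical helper: a string schema is assumed to be a local path
    relative to the directory that holds datapackage.json.
    """
    package_path = Path(package_path)
    package = json.loads(package_path.read_text(encoding="utf-8"))
    schemas = {}
    for resource in package.get("resources", []):
        schema = resource.get("schema")
        if isinstance(schema, str):
            # The schema is a reference: read the shared file it points to.
            schema_file = package_path.parent / schema
            schema = json.loads(schema_file.read_text(encoding="utf-8"))
        # An inline schema (a dict) is kept as it is.
        schemas[resource["name"]] = schema
    return schemas


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        shared = {"fields": [{"name": "id", "type": "integer"}]}
        (root / "tableschema.json").write_text(json.dumps(shared), encoding="utf-8")
        package = {
            "resources": [
                {"name": "data-2017", "path": "2017.csv", "schema": "tableschema.json"},
                {"name": "data-2018", "path": "2018.csv", "schema": "tableschema.json"},
            ]
        }
        (root / "datapackage.json").write_text(json.dumps(package), encoding="utf-8")
        for name, schema in load_resource_schemas(root / "datapackage.json").items():
            print(name, [field["name"] for field in schema["fields"]])
```

Both resources end up with the same field definitions even though the schema is written only once, in tableschema.json.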

In the future, I see us adopting JSON References so that the reused table schema can live in the datapackage.json file itself. Meanwhile, using a separate table schema file works.
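
Purely as an illustration of what that could look like, the sketch below resolves same-document references of the form {"$ref": "#/schemas/common"}. None of this is part of the current spec: the top-level "schemas" key, the use of "$ref" inside a resource, and both function names are assumptions. The pointer syntax follows JSON Pointer (RFC 6901), and percent-encoded fragments are not handled.

```python
def resolve_pointer(document, pointer):
    """Follow a JSON Pointer (RFC 6901) such as '/schemas/common'."""
    if pointer == "":
        return document  # The empty pointer refers to the whole document.
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    target = document
    for token in pointer.split("/")[1:]:
        # Unescape '~1' before '~0', as RFC 6901 requires.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(target, list):
            target = target[int(token)]
        else:
            target = target[token]
    return target


def resolve_local_refs(package):
    """Replace {"$ref": "#/..."} resource schemas with the referenced object.

    Hypothetical: the Data Package spec does not define this behaviour.
    """
    for resource in package.get("resources", []):
        schema = resource.get("schema")
        if isinstance(schema, dict) and "$ref" in schema:
            ref = schema["$ref"]
            if not ref.startswith("#"):
                raise ValueError(f"only same-document references are handled: {ref!r}")
            resource["schema"] = resolve_pointer(package, ref[1:])
    return package


if __name__ == "__main__":
    package = {
        "schemas": {"common": {"fields": [{"name": "id", "type": "integer"}]}},
        "resources": [
            {"name": "data-2017", "schema": {"$ref": "#/schemas/common"}},
            {"name": "data-2018", "schema": {"$ref": "#/schemas/common"}},
        ],
    }
    for resource in resolve_local_refs(package)["resources"]:
        print(resource["name"], resource["schema"])
```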

herrmann · July 18, 2018, 2:34pm

Thank you, @vitorbaptista, I’ll try using it that way.

Indeed, the Tabular Data Resource specification does show it in the examples, but an added explanation would be nice. I suppose I could just send a PR to the GitHub repository to make this point clearer.

About JSON references, do you mean this IETF draft? I think it would be useful as an added option in the future, but I imagine that support in JSON parsing tools would probably be slow to catch up with this feature.

vitorbaptista · July 18, 2018, 3:49pm

I agree we could be clearer. And yes, that’s the standard I’m referring to. I haven’t checked, but even though it’s a draft, it’s somewhat common, so I imagine most libraries support it already. However, this is something for the future, as it would require a spec update.

herrmann · July 20, 2018, 1:11pm

Yes, I agree.
I’ve just made a pull request proposing how to word this in the specs.
