Foreign keys across data packages

The Table Schema: Foreign Keys to Data Packages pattern (repeated below) will be super powerful once implemented.

Imagine having a single shared code-list that many data packages reference :heart_eyes:

Table Schema: Foreign Keys to Data Packages

Purpose: allow users to link from the column of a Tabular Data Resource in one Data Package to a Tabular Data Resource in another Data Package.

To support this:

The foreignKey MAY have a property datapackage. This property is a string being a url pointing to a Data Package or is the name of a datapackage.

I’ve made an example data package that implements this, and PR #37 contributes it to frictionlessdata/example-data-packages.

The foreignKeys bit in the data package looks like:

    "foreignKeys": [{
      "fields": ["code"],
      "reference": {
        "datapackage": "https://raw.githubusercontent.com/frictionlessdata/example-data-packages/master/donation-codes/datapackage.json",
        "resource": "donation-codes",
        "fields": ["donation code"]
      }
    }]
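To make this concrete, here is a minimal Python sketch of how a tool might check a foreign key like the one above. It isn't taken from any Frictionless library: `find_missing_keys` and the `load_package` callable are hypothetical names, and it assumes the referenced resource carries inline `data` rows (a real implementation would also need to read a CSV `path`).

```python
def find_missing_keys(rows, foreign_key, load_package):
    """Return local key values that are absent from the referenced resource.

    rows         -- local table rows as a list of dicts
    foreign_key  -- one entry of the "foreignKeys" array
    load_package -- hypothetical callable: URL -> parsed datapackage.json dict
    """
    reference = foreign_key["reference"]

    # Fetch the other data package named by the "datapackage" property.
    package = load_package(reference["datapackage"])

    # Pick the referenced resource by name.
    resource = next(
        r for r in package["resources"] if r["name"] == reference["resource"]
    )

    # Assumption: the resource has inline "data" (a list of row dicts).
    known = {
        tuple(row[field] for field in reference["fields"])
        for row in resource["data"]
    }

    # Tuples are used so that multi-field keys work the same way.
    local = [tuple(row[field] for field in foreign_key["fields"]) for row in rows]
    return [key for key in local if key not in known]
```

Passing `load_package` in as an argument keeps the check independent of how (or whether) the remote descriptor is downloaded, which also makes it easy to try out with an in-memory package.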

What I’m not sure about in the pattern is the option for "datapackage" to be specified as “the name of a datapackage” (from the last few words in the pattern).

In this case, what would the datapackage.json look like, and how is the URL location resolved?


This was answered by Rufus on GitHub.

The answer was that this would work if there were a canonical data package registry, like DataHub, against which the name is dereferenced. There is more discussion of the context in #237.
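As a rough illustration of what dereferencing a name against a registry could look like, here is a small Python sketch. Everything registry-specific is an assumption: `resolve_datapackage` is a hypothetical helper, `registry.example.org` is a placeholder, and the `<base>/<name>/datapackage.json` layout is invented for the example rather than taken from DataHub or the spec.

```python
from urllib.parse import urlparse


def resolve_datapackage(reference, registry_base="https://registry.example.org/"):
    """Turn a foreignKey "datapackage" value into a URL.

    A value that already is an http(s) URL is returned unchanged; anything
    else is treated as a package name and looked up under the registry base.
    The registry URL layout used here is an assumption for illustration.
    """
    if urlparse(reference).scheme in ("http", "https"):
        return reference
    return f"{registry_base.rstrip('/')}/{reference}/datapackage.json"
```

With something like this, a descriptor could say `"datapackage": "donation-codes"` and leave it to the tool's configured registry to work out where that package lives.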

I’ve updated the PR accordingly.