Data Package View - Tabular format

#1

Hi All,

In the scenario where a data package provides two datasets linked by a foreign key, e.g. https://github.com/frictionlessdata/example-data-packages/tree/master/countries-and-currencies

What is the best way to specify a tabular view that combines data from both sources?
For example, for the countries-and-currencies package, is there a recommended way to specify the particular view of the data I want?

For example, the source data is:

countries-using-usd-and-gbp.csv
name
currency_alphabetic_code

currencies.csv
currency_alphabetic_code
currency
symbol

with the data related by currency_alphabetic_code

The data view I want is
countries-using-usd-and-gbp.name +
currencies.currency +
currencies.symbol

Cheers,
Simon

#2

Great question, @SimonG!

Perhaps @rufuspollock or @Stephen could take a stab at this.

#3

Thanks @todrobbins … I couldn’t see an easy way to do this, and it would still need the package consumer to stitch the data together to present the view. For practical reasons I’ve combined my data into a single resource, rather than using multiple resources with a primary/foreign key relationship. It will do! Cheers
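
For reference, the stitching step a package consumer would have to do can be sketched in plain Node.js with no libraries. This is a minimal sketch, not a data package feature: `joinView` is a hypothetical helper, and the sample rows are illustrative rather than copied from the example package. It assumes both CSV files have already been parsed into arrays of row objects.

```javascript
// Illustrative rows only; in practice these would come from parsing the two CSV resources.
const countries = [
  { name: 'United Kingdom', currency_alphabetic_code: 'GBP' },
  { name: 'United States', currency_alphabetic_code: 'USD' },
];
const currencies = [
  { currency_alphabetic_code: 'GBP', currency: 'Pound Sterling', symbol: '£' },
  { currency_alphabetic_code: 'USD', currency: 'US Dollar', symbol: '$' },
];

// Hypothetical helper: inner-join `left` to `right` on `key`,
// keeping only the requested fields from each side.
function joinView(left, right, key, leftFields, rightFields) {
  // Index the referenced table by key so each lookup is O(1).
  const index = new Map(right.map((row) => [row[key], row]));
  const out = [];
  for (const row of left) {
    const match = index.get(row[key]);
    if (!match) continue; // inner join: skip rows with no matching key
    const combined = {};
    for (const f of leftFields) combined[f] = row[f];
    for (const f of rightFields) combined[f] = match[f];
    out.push(combined);
  }
  return out;
}

// The view asked for in the first post: name + currency + symbol.
const view = joinView(
  countries,
  currencies,
  'currency_alphabetic_code',
  ['name'],
  ['currency', 'symbol']
);
console.log(view);
```

The key and field names passed to `joinView` are the ones from the first post; the same information could be read from the `foreignKeys` entry in the resource schema instead of being hard-coded.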

#4

I haven’t used it myself, but Data Package Pipelines may be worth a look.

See also http://okfnlabs.org/blog/2018/10/18/announcing-datapackage-pipelines-v2.html

#5

Thanks @Stephen - we do quite a lot of data pipelining, but we’re a Node.js shop and Python isn’t on our radar. Thanks for the heads up though.

#6

@SimonG, since you are using JavaScript, are you also using dataframe-js? If so, I suppose you could load each table into a DataFrame and join them as in the example from the documentation:

df.innerJoin(df2, ['column2', 'column3']);

Note: if you are using Goodtables for data validation / continuous integration, please keep in mind that it does not yet support validation of foreign key relations.

#7

@herrmann interesting! No, I’m not using dataframe-js. I do use both lokijs and lunr in my projects, so out of pure familiarity I’d probably defer to these first, but I’ll investigate what dataframe-js offers - thanks. I like the idea of Goodtables, but I use ajv for my validation needs (whilst not data-package aware, ajv plugs into our Node pipeline effectively). It’s great to see this type of tooling emerge - I can see a datapackage-to-ajv adaptor could be useful as well.
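
To make the adaptor idea concrete, here is a rough sketch of what it might look like: a function that turns a Table Schema descriptor into a JSON Schema for a single row, which ajv could then compile. `tableSchemaToJsonSchema` and `TYPE_MAP` are hypothetical names, not part of any existing library, and the constraints on the sample fields are made up for illustration. The sketch covers only a handful of constraints and assumes row values have already been cast from CSV strings to their declared types.

```javascript
// Table Schema types that have a direct JSON Schema counterpart.
const TYPE_MAP = {
  string: 'string',
  number: 'number',
  integer: 'integer',
  boolean: 'boolean',
  object: 'object',
  array: 'array',
};

// Hypothetical adaptor: Table Schema descriptor -> JSON Schema for one row object.
function tableSchemaToJsonSchema(tableSchema) {
  const properties = {};
  const required = [];
  for (const field of tableSchema.fields) {
    const c = field.constraints || {};
    // A missing type means "string" in Table Schema; types without a JSON Schema
    // equivalent (date, geopoint, ...) also fall back to string in this sketch.
    const prop = { type: TYPE_MAP[field.type || 'string'] || 'string' };
    if (c.minLength !== undefined) prop.minLength = c.minLength;
    if (c.maxLength !== undefined) prop.maxLength = c.maxLength;
    if (c.minimum !== undefined) prop.minimum = c.minimum;
    if (c.maximum !== undefined) prop.maximum = c.maximum;
    if (c.enum !== undefined) prop.enum = c.enum;
    // Table Schema patterns match the whole value, so anchor them for JSON Schema.
    if (c.pattern !== undefined) prop.pattern = '^(?:' + c.pattern + ')$';
    if (c.required) required.push(field.name);
    properties[field.name] = prop;
  }
  const schema = { type: 'object', properties };
  if (required.length > 0) schema.required = required;
  return schema;
}

// Field names from the countries resource; the constraints are illustrative.
const schema = tableSchemaToJsonSchema({
  fields: [
    { name: 'name', type: 'string', constraints: { required: true } },
    { name: 'currency_alphabetic_code', type: 'string', constraints: { pattern: '[A-Z]{3}' } },
  ],
});
console.log(JSON.stringify(schema, null, 2));

// With ajv installed, the result could be compiled in the usual way:
//   const Ajv = require('ajv');
//   const validate = new Ajv().compile(schema);
```

Foreign key and uniqueness checks span multiple rows, so they would not fit into a per-row JSON Schema and would need a separate pass.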