Data Package View - Tabular format

Hi All,

In a scenario where a data package provides two datasets linked by a foreign key (e.g. examples/countries-and-currencies in frictionlessdata/examples on GitHub), what is the best way of specifying a tabular view that combines data from both sources?

For example, for the countries-and-currencies package, is there a recommended way to specify the particular view of the data I want?

e.g. the source data is:

countries-using-usd-and-gbp.csv
  name
  currency_alphabetic_code

currencies.csv
  currency_alphabetic_code
  currency
  symbol

with the data related by currency_alphabetic_code.

The data view I want is:
  countries-using-usd-and-gbp.name +
  currencies.currency +
  currencies.symbol

Cheers,
Simon

Great question, @SimonG!

Perhaps @rufuspollock or @Stephen could take a stab at this.

Thanks @todrobbins. I couldn’t see an easy way to do this, and it would still need the package consumer to stitch the data together to present the view. For practical reasons I’ve combined my data into a single resource, rather than multiple resources with a primary/foreign key relationship. It will do! Cheers

I haven’t used it myself, but datapackage-pipelines may be worth a look.

See also: Announcing datapackage-pipelines version 2.0 (Open Knowledge Labs)

Thanks @Stephen - we do quite a lot of data pipelining, but we’re a Node.js shop and Python isn’t on our radar. Thanks for the heads up though.

@SimonG, since you are using JavaScript, are you also using dataframe-js? If so, I suppose you could load each table into a dataframe and join them, as in this example from the documentation:

df.innerJoin(df2, ['column2', 'column3']);
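
For the countries-and-currencies case, the same kind of inner join can also be sketched in plain JavaScript, without any library. This is only a rough illustration: the sample rows and the `joinView` helper are made up for this example, and only the column names come from the question above.

```javascript
// Illustrative rows only; the column names follow the question above.
const countries = [
  { name: 'United Kingdom', currency_alphabetic_code: 'GBP' },
  { name: 'United States', currency_alphabetic_code: 'USD' },
];
const currencies = [
  { currency_alphabetic_code: 'GBP', currency: 'Pound Sterling', symbol: '£' },
  { currency_alphabetic_code: 'USD', currency: 'US Dollar', symbol: '$' },
];

// Hypothetical helper: inner join on the shared key,
// keeping only the three columns of the requested view.
function joinView(countryRows, currencyRows) {
  // Index the currencies by code so each country lookup is a single Map access.
  const byCode = new Map(
    currencyRows.map((c) => [c.currency_alphabetic_code, c])
  );
  return countryRows
    // Inner join: drop countries whose code has no matching currency row.
    .filter((row) => byCode.has(row.currency_alphabetic_code))
    .map((row) => {
      const match = byCode.get(row.currency_alphabetic_code);
      return { name: row.name, currency: match.currency, symbol: match.symbol };
    });
}

const view = joinView(countries, currencies);
console.log(view);
```

The consumer still has to do the stitching, but the view itself is just the three columns picked out after the join on currency_alphabetic_code.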

Note: if you are using Goodtables for data validation / continuous integration, please keep in mind that it does not yet support validation of foreign key relations.

@herrmann interesting! No, I’m not using dataframe-js. I do use both lokijs and lunr in my projects, so out of pure familiarity I’d probably defer to these first, but I’ll investigate what dataframe-js offers - thanks. I like the idea of Goodtables, but I use ajv for my validation needs (whilst not data-package aware, ajv plugs into our Node pipeline effectively). It’s great to see this type of tooling emerge - I can see that a datapackage-to-ajv adaptor could be useful as well.

Thanks again @herrmann, I’ve just found reason to try dataframe-js and now have a better grasp of it. Indeed it is great for tabular data! lokijs as a NoSQL db is powerful, but the clear use-case of dataframes makes perfect sense.

Glad to be of help!

I haven’t used dataframe-js myself yet, as these days I use dataframes in Python with the Pandas library. But if I were to deal with data in JavaScript, that’s what I would try. Please let us know later about your experience with it, if you’d like to share it.