Frictionless data 'intents' for frontend apps consuming data packages


#1

apologies if this has been discussed before - i searched the forum and couldn’t find anything similar, but i am not sure how this idea would usually be framed, so i may have missed something that didn’t look related to my eye.

i am familiar with the production/preparation and publishing aspects of frictionless data, but being involved in the development of frontend data visualization apps that could consume data packages, i am wondering if there is prior work in this domain (consumption of data packages).

what i am imagining is a system akin to Android’s ‘intents’ so that (abstracting here from implementation details) a publisher or aggregator of data packages could provide a ‘visualize this dataset’ button, which would then list [1] possible apps through which the dataset could be visualized.

frontend apps would have to specify somehow what they could accept/what would make sense to try to visualize through them. the app i am currently focusing on (http://pattrn.co/), for example, would only be listed if a dataset provides at least lat, lon, timestamp data, as it provides map-based visualization of events + crossfilter/dimensional charting features. a visualization app that only generates charts ignoring map data would need to specify a minimum set of variables of specific kinds that it would need in order to generate (potentially) meaningful visualizations, and so on.
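A minimal sketch of how such a declaration could be checked against a data package, assuming the app advertises required fields by name and Table Schema type. The names `PATTRN_REQUIREMENTS`, `fields_of` and `can_visualize` are hypothetical, matching on exact field names is a simplification, and treating an untyped field as `string` is also an assumption made here.

```python
# Hypothetical sketch: none of these names come from an existing library.
# An app advertises the fields it needs as {field name: Table Schema type}.
PATTRN_REQUIREMENTS = {"lat": "number", "lon": "number", "timestamp": "datetime"}


def fields_of(descriptor):
    """Collect {name: type} for every field of every resource in a descriptor."""
    found = {}
    for resource in descriptor.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            # Assumption: a field without an explicit type is treated as a string.
            found[field["name"]] = field.get("type", "string")
    return found


def can_visualize(requirements, descriptor):
    """True if the data package provides every field the app requires."""
    available = fields_of(descriptor)
    return all(available.get(name) == kind for name, kind in requirements.items())


events = {
    "name": "events",
    "resources": [{"name": "events", "schema": {"fields": [
        {"name": "lat", "type": "number"},
        {"name": "lon", "type": "number"},
        {"name": "timestamp", "type": "datetime"},
        {"name": "description", "type": "string"},
    ]}}],
}
budget = {
    "name": "budget",
    "resources": [{"name": "budget", "schema": {"fields": [
        {"name": "year", "type": "integer"},
        {"name": "amount", "type": "number"},
    ]}}],
}

print(can_visualize(PATTRN_REQUIREMENTS, events))  # True
print(can_visualize(PATTRN_REQUIREMENTS, budget))  # False
```

A real matcher would probably need something looser than exact names (e.g. `latitude` vs `lat`), which is where richer metadata on either side would help.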

[1] Depending on which data visualization apps would make sense to use, according to the data package’s metadata.


#2

ok - starting to reply to myself: just found out about Data Packages Views, which seems to be quite related to my ‘intents’ idea above, perhaps with a different initial focus.

if i understand correctly, the initial work on DPV is focused on defining possible views from the data package’s side, whereas ‘intents’ would work by matching data requirements ‘advertised’ by visualization apps with metadata exposed by data packages. maybe worth experimenting with both approaches in parallel if there is interest.


#3

In summary, the data package viewer is a tool that lets the end user see what a data package looks like. It also lets curators and managers check the integrity and quality of all data packages.

This is the viewer in action.
AFAIK, this tool just renders a preview of the CSV file, which helps the end user and the managers check package quality. The source code of the data package viewer may help you find a way forward, though.

Thanks!


#4

In OpenSpending we took a different approach than the one in the Data Package Views (sorry @rufuspollock for missing that discussion).

The basic idea is that in the Fiscal Data Package we have more detailed descriptions for each of the columns in the tabular data. Based on these descriptions, we decide which visualisations apply to the data. For example:

  • If we have a ‘location’ column, then a Map visualisation is offered to the user.
  • If there’s a ‘date/time’ column, we offer a timeline visualisation.
  • If there’s a group of columns that form a hierarchy, we offer visualisations that support drilldown (e.g. Treemap or bubbletree).
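As a rough illustration only, the three rules above could be sketched like this. The `kind` and `hierarchy` keys and the function name are invented for the example; they are not the Fiscal Data Package vocabulary or the OpenSpending code.

```python
# Hypothetical sketch: the "kind" and "hierarchy" keys are invented for
# illustration and are not the Fiscal Data Package vocabulary.
from collections import Counter


def offered_visualisations(columns):
    """Return the visualisations that apply, given per-column descriptions."""
    kinds = {col.get("kind") for col in columns}
    offered = []
    if "location" in kinds:
        offered.append("map")
    if "datetime" in kinds:
        offered.append("timeline")
    # A hierarchy needs at least two columns that share a hierarchy label.
    levels = Counter(col["hierarchy"] for col in columns if "hierarchy" in col)
    if any(count >= 2 for count in levels.values()):
        offered.extend(["treemap", "bubbletree"])
    return offered


columns = [
    {"name": "region", "kind": "location"},
    {"name": "year", "kind": "datetime"},
    {"name": "ministry", "hierarchy": "administrative"},
    {"name": "department", "hierarchy": "administrative"},
    {"name": "amount", "kind": "value"},
]
print(offered_visualisations(columns))
# ['map', 'timeline', 'treemap', 'bubbletree']
```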

This is closer to the original ‘intents’ idea which I like very much. We could envision a more intricate view-oriented description for the data, which will allow better matching of visualisations to datasets.
An example (off the top of my head) would be a data column which specifies that it’s best viewed on a logarithmic scale. Then, a visualisation capable of showing logarithmic scales would be considered a perfect match, and one that does not support log scales could be considered a lesser, but still possible match. A visualisation not supporting numerical values (think ‘word cloud’) would not be considered at all.
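A minimal sketch of that three-way outcome (perfect match, lesser match, not considered), assuming each column carries an optional preferred scale and each visualisation lists what it supports. All key names (`preferred_scale`, `numeric`, `scales`) and the numeric scores are hypothetical.

```python
# Hypothetical sketch: "preferred_scale", "numeric" and "scales" are
# illustrative names, not part of any published spec.
def match_score(column, visualisation):
    """Return 2 for a perfect match, 1 for a lesser match, None for no match."""
    if column["type"] == "number" and not visualisation["numeric"]:
        return None
    preferred = column.get("preferred_scale", "linear")
    return 2 if preferred in visualisation["scales"] else 1


def rank(column, visualisations):
    """Drop non-matching visualisations and sort the rest, best match first."""
    scored = [(match_score(column, vis), vis["name"]) for vis in visualisations]
    usable = [(score, name) for score, name in scored if score is not None]
    return [name for score, name in sorted(usable, key=lambda pair: -pair[0])]


income = {"name": "income", "type": "number", "preferred_scale": "log"}
visualisations = [
    {"name": "word cloud", "numeric": False, "scales": []},
    {"name": "bar chart", "numeric": True, "scales": ["linear"]},
    {"name": "scatter plot", "numeric": True, "scales": ["linear", "log"]},
]
print(rank(income, visualisations))
# ['scatter plot', 'bar chart']
```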


#5

I am not sure if that would create a barrier to new contributors? They would need to be relatively proficient in statistics to know which representation would manipulate data the least or, in your terms, which is the most adequate method of representation.


#6

@gsilvapt @adam thanks for sharing your ideas.

there should be scope for some kind of ‘progressive enhancement’-style approach to this, covering everything from simpler scenarios where no view-oriented description of the data is available to more complex scenarios such as the one sketched by @adam.

in a way, as i read more about the various tools and strategies already out there or being discussed, it seems to me that the original ‘intents’ idea could roughly translate to a lightweight system to match metadata exposed by (or inferrable from) data packages with capabilities exposed by frontend apps - roughly corresponding to Intent objects and Intent filters in Android’s Intents system. by default this could just lead e.g. to ReclineJS or to Vega/Vega Lite views, whilst allowing more ad-hoc matches where metadata is available on either side (data packages or frontend apps).
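a rough sketch of that resolution step, assuming each app registers a filter listing the Table Schema types it needs at least once, with the generic viewers always offered as a fallback. the registry format, the app entries and the function names are all hypothetical, and the ReclineJS / Vega-Lite entries are just labels here, not calls into those libraries.

```python
# Hypothetical sketch: the registry format and app entries are invented here.
DEFAULT_VIEWERS = ["ReclineJS", "Vega-Lite"]

# Each "intent filter" lists the Table Schema types an app needs at least once.
INTENT_FILTERS = {
    "map-based app": {"geopoint", "datetime"},
    "time-series charts": {"datetime", "number"},
}


def exposed_types(descriptor):
    """Collect the set of field types exposed by a data package descriptor."""
    types = set()
    for resource in descriptor.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            # Assumption: a field without an explicit type is treated as a string.
            types.add(field.get("type", "string"))
    return types


def resolve(descriptor, filters=INTENT_FILTERS, defaults=DEFAULT_VIEWERS):
    """List apps whose filter is satisfied, followed by the generic defaults."""
    available = exposed_types(descriptor)
    matches = [app for app, needed in filters.items() if needed <= available]
    return matches + list(defaults)


prices = {"resources": [{"schema": {"fields": [
    {"name": "day", "type": "datetime"},
    {"name": "price", "type": "number"},
]}}]}
bare = {"resources": [{"path": "data.csv"}]}  # no schema at all

print(resolve(prices))  # ['time-series charts', 'ReclineJS', 'Vega-Lite']
print(resolve(bare))    # ['ReclineJS', 'Vega-Lite']
```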

i’ll try to add a brief spec and some rough code on a public repo to test whether this makes sense :slight_smile:


#7

@hotzeplotz i wanted to write the moment I saw this but was away.

I am delighted to have this suggestion! As you may have guessed from the Data Package Views spec, something a bit like this has been in the back of my mind. I even remember looking at the intents idea a couple of years ago, so I think you are spot on.

I think you are right that the “inversion” offered by “intents” has some attractive features over adding more and more stuff into the Data Package spec (or as extensions).

:thumbsup:

Basically the “intents” would connect with the Data Package “profiles” - if a Data Package has X profile then this app can use it kind of thing.
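A minimal sketch of that profile-based matching, assuming the descriptor carries a `profile` string and each app lists the profiles it accepts. The app names, the registry and the function name are hypothetical, and falling back to `data-package` when no profile is given is an assumption made for this example.

```python
# Hypothetical sketch: the app names and registry are invented for illustration.
ACCEPTED_PROFILES = {
    "budget explorer": {"fiscal-data-package"},
    "table viewer": {"tabular-data-package", "fiscal-data-package"},
}


def apps_for(descriptor, registry=ACCEPTED_PROFILES):
    """Return the apps that declare support for the descriptor's profile."""
    # Assumption: a descriptor without a profile is a plain "data-package".
    profile = descriptor.get("profile", "data-package")
    return [app for app, profiles in registry.items() if profile in profiles]


print(apps_for({"name": "spending", "profile": "fiscal-data-package"}))
# ['budget explorer', 'table viewer']
print(apps_for({"name": "misc"}))
# []
```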


Finally, I agree with you re progressive enhancement.

Please do go ahead and add an issue / proposal to https://github.com/dataprotocols/dataprotocols/issues

PS: good to hear from you again and pattrn looks super cool.