We are excited to announce that Open Knowledge has received funding from The Alfred P. Sloan Foundation to work on a broad range of activities to enable data-driven research.
I can imagine a much improved publishing pipeline, with metadata generation and data quality checks preceding loading into CKAN. Then, as a consumer, I can imagine getting a better understanding of the data, and whether it’s fit for my purpose, without having to download it.
Is there a clear roadmap of what will be delivered when? The present roadmap shows the parts but not their integration.
Thanks @Stephen! Really great points! You’re right: the current roadmap could use some work. A more useful roadmap will come with a refreshed website soon.
Another thought… many open data publishers are resource constrained; help them get started with data packages by writing a program that:
- for each CSV dataset already published in CKAN
- derive the schema
- use existing metadata to top and tail the schema and create a data package
- publish the data package with the data resource
This will kick-start the use of data packages in portals and, when complemented with other tools like datapackagist, will lift the quality of publishing.
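To make the suggestion a little more concrete, here is a rough sketch of the middle steps (derive a schema, then wrap it with existing metadata). It uses only the Python standard library and is not taken from any existing tool. The function names `infer_type`, `infer_schema` and `to_data_package` are made up for illustration, the type guessing only distinguishes integer, number and string, and the dataset dictionary is assumed to look like what CKAN's `package_show` action returns (`name`, `title`, `notes`, `resources`). Fetching the CSV files and publishing the result back to the portal are left out.

```python
import csv
import io


def infer_type(values):
    """Guess a Table Schema type name from sample string values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "string"
    # Try the strictest type first and fall back to "string".
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text, sample_size=100):
    """Derive a minimal schema from the header and the first rows of a CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for _, row in zip(range(sample_size), reader)]
    fields = []
    for i, name in enumerate(header):
        column = [row[i] for row in rows if i < len(row)]
        fields.append({"name": name, "type": infer_type(column)})
    return {"fields": fields}


def to_data_package(dataset, csv_by_url):
    """Build a data package descriptor from a CKAN-style dataset dict.

    `csv_by_url` is a hypothetical mapping of resource URL to CSV text
    that the caller has already downloaded.
    """
    resources = []
    for res in dataset.get("resources", []):
        if (res.get("format") or "").lower() != "csv":
            continue  # only CSV resources get a schema in this sketch
        resources.append({
            "name": res["name"],  # assumed present; not normalised here
            "path": res["url"],
            "format": "csv",
            "schema": infer_schema(csv_by_url[res["url"]]),
        })
    return {
        "name": dataset["name"],
        "title": dataset.get("title", ""),
        "description": dataset.get("notes", ""),
        "resources": resources,
    }
```

Dumping the returned dictionary with `json.dumps` would give a candidate `datapackage.json` that a publisher could review before attaching it to the dataset. A real tool would want better type detection (dates, booleans) and would need to tidy resource names.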
I have posted on the Labs blog going into a bit more detail on some CKAN tooling we have created: http://okfnlabs.org/blog/2016/03/11/frictionless-data-transport-in-python.html
Specifically, there is a CKAN extension that allows for importing and exporting Data Packages: https://github.com/ckan/ckanext-datapackager