Project to monitor the status of open data portals

Hi all,

I have been working on a side project that monitors the status of open data portals. Unlike dataportals.org, this project aims to present more insight into each listed data portal, including its dataset count, tags, categories, and publishers. This information is collected every week for all supported data portals so that I can keep a history of how data opening progresses over time.

More than 100 CKAN-based data portals are covered so far, and I am working on support for Socrata-based portals (200+).

You can take a look at the result at OpenDataDiscovery.org.

This project is open source on GitHub. If you are interested in the status at past points in time, the historical data can also be downloaded from the project website.

Please let me know what you think about it :slight_smile:

Hi @haoliangyu,

Great project!
Will you develop it into a search engine?

Just a few notes on Italian portals.
The official government website (dati.gov.it) has not worked well for a year. Please have a look at this one, which has a full list of open data portals: Datasets - CKAN

Moreover, there are currently two Socrata-based portals in Italy.

Best

I found it difficult to use the map. We can zoom in and out but not pan / scroll - can you add that facility?

@nelsonmau Yeah, I also feel that searching for data across portals is difficult. It would be great to have a tool for cross-portal search.

But that is also a very big project, and I probably won’t take it on in the near future. The current focus of Open Data Discovery is to support more data portals and present the insights.

Thanks for the two Italian portals. They don’t appear in Socrata’s Discovery API and I almost missed them :pensive:

@Jersey Thanks for testing! You can indeed use the mouse to pan and scroll. The zoom in/out buttons are just the standard map control buttons, and I will add more buttons if more people request them. Currently, dragging the map can trigger the map’s click event; I will fix that bug later.

Hi @haoliangyu
Nice work!

Would you ever add DKAN to your project?
We’re working on Iran’s first open data portal and have some technical issues using CKAN (we are trying to solve them and are looking for CKAN experts). For this reason, we may prefer to use DKAN!
Best.

openprism allows searching across Socrata/CKAN portals.

@mhkhani Glad to know there will be an open data portal for Iran!

I haven’t added DKAN yet, but it is on my radar and I will probably add support for it next. Please let me know when you are able to publish the portal and we can work out how to add it to the map :wink:

@haoliangyu Great job!
We plan to publish the beta version of the data portal in mid-January 2017, or maybe sooner.
I’ll let you know.

@haoliangyu great to hear about this project.

I would emphasize that we are keen to see dataportals.org expanded with additional metadata like those you mention, e.g.

  • data counts
  • tags
  • publishers

Would you be interested in merging this data into dataportals.org?

@rufuspollock Yeah, that’s a very good proposal and I would also like to share the data.

Technically speaking, I don’t think we need to actually merge the data or code, since both are changing constantly as new features and new portals are added. The simplest way is to expose an API that dataportals.org can query for metadata by portal domain or name, so that dataportals.org can fetch and show the metadata when a user clicks to open the details of a portal. In this way, we don’t need to worry about any deep integration, and the changes to both projects would be minimal.

Currently, all information retrieval on my site is API-based; for example, http://opendatadiscovery.org/api/instance/1 is for http://data.gov. It returns the portal’s name, description, URL, location, dataset count, and top 10 tags / categories / organizations.
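
As a rough sketch of how a client such as dataportals.org might consume this, the Python below builds the instance URL and reduces a response to a few display fields. The JSON keys (`name`, `url`, `dataset_count`, `tags`) and the helper names are assumptions for illustration only; the real response layout may differ.

```python
import json
from urllib.request import urlopen

API_BASE = "http://opendatadiscovery.org/api/instance"


def instance_url(instance_id):
    """Build the URL for one portal instance, e.g. 1 for data.gov."""
    return f"{API_BASE}/{int(instance_id)}"


def summarize_instance(payload, top_n=3):
    """Reduce a decoded response to fields a portal detail page might show.

    The keys used here are hypothetical; adjust them to the real response.
    """
    tags = payload.get("tags", [])
    return {
        "name": payload.get("name"),
        "url": payload.get("url"),
        "dataset_count": payload.get("dataset_count"),
        "top_tags": [t.get("name") for t in tags[:top_n]],
    }


def fetch_instance(instance_id, timeout=10):
    """Fetch and decode one instance (needs network access)."""
    with urlopen(instance_url(instance_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the summarizing step separate from the network call means the consuming site only depends on a handful of keys and can be tested without hitting the server.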

Good day!

Thank you for this initiative.
One suggestion: support for DKAN portals. It would be exciting for data.gov.ru and some regional portals in Russia.

I can export the data in CSV format if you only want:

  • data count
  • number of tags
  • number of categories
  • number of publishers

for each portal. As for the names and associated data counts of tags/categories/publishers, I am not sure about the best way to format them in a CSV file, as the details of each portal are very different. Do you have any suggestions?
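
One option might be a "long" layout with one row per portal, kind, and name, so that portals with very different tags still share the same columns. A minimal sketch, assuming a hypothetical in-memory structure (the `portal`, `tag`, `category`, and `publisher` keys are made up for illustration and are not the project's actual schema):

```python
import csv
import io


def to_long_rows(portals):
    """Yield one (portal, kind, name, dataset_count) row per item."""
    for portal in portals:
        for kind in ("tag", "category", "publisher"):
            # A portal may lack a kind entirely; treat that as no rows.
            for item in portal.get(kind, []):
                yield (portal["portal"], kind, item["name"], item["count"])


def write_long_csv(portals):
    """Return the long-format CSV as a string, header row included."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["portal", "kind", "name", "dataset_count"])
    writer.writerows(to_long_rows(portals))
    return buf.getvalue()
```

With this layout, the per-portal differences show up as a different number of rows rather than different columns.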

Hi @haoliangyu!

I’m looking for the whole list of Socrata portals but didn’t find it in this list: https://github.com/haoliangyu/OpenDataDiscovery.org/blob/master/portals.md

Any hints to help me?
Thanks in advance!

@nelsonmau

The Socrata Discovery API provides a list of 200+ websites that run on the Socrata platform. This list is a mix of data portals and government performance websites like the Edmonton Open Performance Portal. I only picked the data portals for my project, and there are 160 of them.

Note that the Discovery API doesn’t seem to provide a full list of Socrata portals, as I didn’t find the Italian portals you mentioned.

I have some technical issues with my server, so the project will be down for a while. But I can generate a list from my database:

socrata.csv (11.3 KB)

@haoliangyu

Interesting! I went through http://api.us.socrata.com/api/catalog/v1/domains
But I only got data from the US, not from the EU, which is more important to me.
Anyway, many thanks!

@nelsonmau

You really inspired me on the issue of location. Try http://api.eu.socrata.com/api/catalog/v1/domains
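
A small sketch of combining the two regional endpoints into one list. It assumes each response is JSON with a `results` list of objects carrying a `domain` key, which should be checked against the actual API output; `merge_domains` and `fetch_domains` are just illustrative helper names.

```python
import json
from urllib.request import urlopen

ENDPOINTS = {
    "us": "http://api.us.socrata.com/api/catalog/v1/domains",
    "eu": "http://api.eu.socrata.com/api/catalog/v1/domains",
}


def merge_domains(responses):
    """Map each domain to the region it was first listed in.

    `responses` maps a region name to its decoded JSON payload.
    """
    merged = {}
    for region, payload in responses.items():
        for entry in payload.get("results", []):
            domain = entry.get("domain")
            if domain:
                # Lowercase so the same host is not counted twice.
                merged.setdefault(domain.lower(), region)
    return merged


def fetch_domains(timeout=10):
    """Download both lists (needs network access) and merge them."""
    responses = {}
    for region, url in ENDPOINTS.items():
        with urlopen(url, timeout=timeout) as resp:
            responses[region] = json.loads(resp.read().decode("utf-8"))
    return merge_domains(responses)
```

The merged list would still need the manual filtering mentioned above, since it mixes data portals with performance websites.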

At this moment (Dec 9th, 2016, 19:00 GMT) the opendatadiscovery.org website just returns a blank page with “Not found” written on it. I hope it gets fixed soon, as I am quite curious to take a look at the project.

@herrmann Yeah, for some unknown reason, the server crashed… I will fix it ASAP.

@herrmann I have fixed the server issue. The website opendatadiscovery.org is up again :slight_smile:

Note that I just added many new portals, so I may need some more time to get all the data updated.
