Dataset no longer found on Datahub


#1

Two days ago I could access a dataset called “twitter-2012-presidential-election” on Datahub.io. This dataset is now gone.

I suspect this is related to the site’s makeover. The old URL where it was available is https://old.datahub.io/dataset/twitter-2012-presidential-election, which now returns a 404. Searching for the dataset’s name with a search engine still leads to this URL.

Where can I find the dataset now?


#2

It still lives on its original servers.
Hitting that URL in the Wayback Machine gives a glimpse of the dataset.
Each dataset has its own URL, and that page provides the dataset’s Google server URL. You can see this for the first dataset, cache-0-json.gz.
That dataset’s profile, linked above, provides this URL: https://ckannet-storage.commondatastorage.googleapis.com/2015-07-09T02:10:43.036Z/cache-0-json.gz which is still alive.
Every dataset provides the same, so you can access them all now.
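For anyone who wants to script the download rather than click through each one, here is a rough sketch in Python using only the standard library. The function names `local_name` and `download` are made up for this example, and I have not run it against the live bucket, so treat it as a starting point. It streams the response to disk instead of loading it into memory, which matters for files this size.

```python
import posixpath
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit


def local_name(url):
    """Return the last path segment of the URL, e.g. 'cache-0-json.gz'."""
    return posixpath.basename(urlsplit(url).path)


def download(url, dest_dir, timeout=60):
    """Stream `url` into `dest_dir` and return the path of the saved file.

    Hypothetical helper for illustration; the timeout value is arbitrary.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / local_name(url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with target.open("wb") as out:
            # copyfileobj copies in fixed-size blocks, so memory use stays small.
            shutil.copyfileobj(response, out)
    return target
```

Calling `download(url, "twitter-2012")` with the cache-0-json.gz URL above should save that file into a `twitter-2012` folder (the folder name is just an example). The other files would each need their own URL taken from their profile pages.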
I’ve downloaded them all, but now I’m looking for somewhere to put 60+ GB of data.
I think this is an ideal case for me to play around with Dat and Beaker Browser, but if I understand it correctly, I’ll still have to host the data on my own box, which doesn’t work for me.
Unless I’m mistaken, I can chop it all up and upload it in files that meet the Datahub/GitHub file size requirements, and that is what I’d normally do. In this case, it seems like much more manual busywork than I want to do. That said, if I can’t find a better solution, at some point I’ll be doing it.
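The chopping-up step does not have to be manual. Below is a minimal sketch of splitting a file into fixed-size parts and joining them back together. `split_file` and `join_files` are hypothetical names, and the part size is a parameter because the right limit depends on wherever the parts end up being uploaded.

```python
import shutil
from pathlib import Path


def split_file(src, part_size, out_dir, block_size=1024 * 1024):
    """Split `src` into parts of at most `part_size` bytes; return the part paths."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open("rb") as f:
        index = 0
        while True:
            block = f.read(min(block_size, part_size))
            if not block:
                break  # end of the source file
            part = out_dir / f"{src.name}.part{index:04d}"
            with part.open("wb") as out:
                written = 0
                while block:
                    out.write(block)
                    written += len(block)
                    remaining = part_size - written
                    if remaining <= 0:
                        break  # this part is full
                    block = f.read(min(block_size, remaining))
            parts.append(part)
            index += 1
    return parts


def join_files(parts, dest):
    """Concatenate `parts` in the given order into `dest`."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                shutil.copyfileobj(f, out)
```

The parts are numbered with zero padding so that sorting the file names gives the right order for `join_files`. Anyone downloading the parts would need to rejoin them before decompressing the `.gz`.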

