Exploring the meaning of "Available in bulk"

Currently the question “Available in bulk?” in the census has this help text:

Data is available in bulk if the whole dataset can be downloaded easily. It is considered non-bulk if the citizens are limited to getting parts of the dataset through an online interface. For example, if restricted to querying a web form and retrieving a few results at a time from a very large database.

If a dataset is published weekly and the current and previous weeks' data are all available for download, is it considered “available in bulk”?

See also Open Data Handbook definition of bulk

I think that if we asked for it to be updated on a weekly basis, then it's bulk. Otherwise, I believe it is not bulk.

Thanks @Mor
The question is being driven by our experience of users complaining about the size of our bulk data and downloads timing out. I think bulk releases should:

  • match the update frequency stated in your metadata.
  • be a size that enables the data to be downloaded by the majority of internet users in your region.
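
To make the size point concrete, here is a rough sketch of the arithmetic. The file size, connection speed, and timeout are made-up illustrative figures, not measurements from our service, and the function names are only for this example:

```python
def download_seconds(size_mb: float, speed_mbps: float) -> float:
    """Estimate transfer time: megabytes * 8 bits per byte / megabits per second."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    return size_mb * 8 / speed_mbps


def fits_timeout(size_mb: float, speed_mbps: float, timeout_s: float) -> bool:
    """Return True if the estimated download finishes within the timeout."""
    return download_seconds(size_mb, speed_mbps) <= timeout_s


# Hypothetical case: a 2000 MB release, a 4 Mbit/s connection, a 30-minute timeout.
print(download_seconds(2000, 4))    # 4000.0 seconds, a little over an hour
print(fits_timeout(2000, 4, 1800))  # False: the download would time out
print(fits_timeout(500, 4, 1800))   # True: a 500 MB slice would fit
```

The estimate ignores protocol overhead and variable speeds, so it is optimistic. Even so, it suggests that splitting a large release into smaller files can make the difference between a download that completes and one that times out.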

In addition, “Historical copies of datasets should be preserved, archived, and kept accessible as long as they retain value” as stated by the draft International Open Data Charter.

It sounds right to me, but I am not an expert on this, so it would be good to hear from other people as well :smile:
@rufuspollock @dirdigeng @ddie @hackyourcity - thoughts?
