Entry for National Statistics / Brazil

herrmann · January 12, 2017, 2:03pm

This is a discussion about the submission for National Statistics / Brazil.

This is another excellent submission. Kudos to @Wagner_Faria_de_Oliv!

Here are some points worth noting.

Formats (question B8) are marked: CSV, TSV and XLS. There may be other formats available on other IBGE websites and pages. However, at the links provided (which are the ones that are supposed to be considered in the evaluation), I could only find XLS.

Ease of use (question B9). Ease of use for open data should also take into account the ease of automation. In the case of this data, all the XLS files are very presentational, i.e. they contain many cells whose purpose is to explain something or that exist for visual design only. One of the common use cases of open data is automation: someone periodically downloads the files, reads the cells and does something with the values, e.g. inserts them into a database. One can do so with such files. However, since they’re presentational, the next iteration of a file might have the structure or position of the table changed, as it’s clearly intended for human consumption and not for automation. For this reason I’d move ease of use down a notch, if considering only the datasets mentioned at the links.
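
To make the automation concern concrete, here is a minimal Python sketch. It is not based on the actual IBGE files: the sheet layout, the labels, the values and the function name `extract_table` are all hypothetical. It assumes the rows have already been read from the spreadsheet into lists of cell values, and it finds the table by searching for its header row instead of relying on fixed cell positions, which is roughly the extra work a presentational file forces on anyone automating it.

```python
def extract_table(rows, header):
    """Return the data rows that follow the first row matching `header`.

    `rows` is a list of lists of cell values, as a spreadsheet reader
    might return them. Reading stops at the first row that has an empty
    cell within the table's width (blank line, footnote, etc.).
    """
    width = len(header)
    for i, row in enumerate(rows):
        # Compare cell text so surrounding whitespace does not matter.
        if [str(c).strip() for c in row[:width]] == header:
            break
    else:
        # Fail loudly instead of silently loading the wrong cells.
        raise ValueError("header row not found; layout may have changed")

    data = []
    for row in rows[i + 1:]:
        cells = row[:width]
        if len(cells) < width or any(
            c is None or str(c).strip() == "" for c in cells
        ):
            break  # end of the table
        data.append(dict(zip(header, cells)))
    return data


# Hypothetical presentational sheet: title, blank row, header, data, footnote.
# The numbers are illustrative only, not real statistics.
sheet = [
    ["Table 1 - Resident population (illustrative values)", None, None],
    [None, None, None],
    ["Year", "Population", None],
    [2015, 100, None],
    [2016, 110, None],
    ["Source: hypothetical", None, None],
]
records = extract_table(sheet, ["Year", "Population"])
```

Even this approach only survives a table being moved around. If the publisher renames a column or splits the table, the script has to be revised, which is why a stable machine-readable format is preferable for this use case.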

Note, however, that some of these are indeed made available by IBGE on other channels for machine-readable consumption through the Sidra API, which was not mentioned as a source in this submission. However, I did not find this API’s documentation easy to follow, e.g. I couldn’t easily find the list of available tables or the documentation on each of those tables.

Wagner_Faria_de_Oliv · January 16, 2017, 2:40pm

Dear @herrmann
Thanks for your comments!

Responding specifically to your points:

.CSV and .TSV are available in SIDRA queries. I think I put the overall link for the SIDRA webpage in question 2.2, didn’t I? My bad if I forgot that.
I totally agree with you concerning ease of use. Actually this question was the hardest for me in every dataset, because I think ease of use can be evaluated in many different dimensions. But I think you’re totally right about that.
I was actually not acquainted with SIDRA’s API, but it really should be included in the evaluation, you’re right about that.

Thanks again!
Wagner

herrmann · January 16, 2017, 4:23pm

@Wagner_Faria_de_Oliv, I’ve just checked the submission again, and indeed you did mention SIDRA and include a link to it. Sorry, my bad in overlooking that! The only thing that could perhaps be improved (that is, if you know this information) is providing the table numbers and/or example API calls for the information required by the survey (i.e. GNP, unemployment and population).

Yes, I too think it is difficult to evaluate ease of use. Nevertheless, I’d argue that SIDRA’s API is not easy to use, as its documentation alone is not enough to figure out the calls necessary for this dataset - one still has to find the tables and their documentation elsewhere.