To fight the pandemic, it is essential to have consistent data to work with. Frictionless Data can help achieve that consistency by providing the means to automatically detect common errors and prompt for them to be corrected.
With that in mind, I have added Tabular Data Packages to Brasil.io, a collaborative platform that has been scraping data from the official State Health Secretariat reports. The data is then made available both as CSV downloads and through an API. It is updated daily and is, AFAIK, the only open data source on Covid-19 in Brazil that has city-level data.
A recent issue on the project discusses whether we should suggest or recommend that authorities use a specific table schema. That would help not only in collecting information directly from CSVs instead of scraping PDFs, but also in making data more uniform across states, and therefore more comparable and easier to aggregate.
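To make the idea of a shared table schema more concrete, here is a minimal sketch of what checking a state's CSV against one could look like. It uses only the Python standard library and a descriptor loosely modeled on the Table Schema layout (a list of fields with a name, a type and constraints). The field names (`date`, `state`, `city`, `confirmed`, `deaths`), the sample rows and the `validate_csv` helper are all assumptions made up for illustration. They are not the schema used by Brasil.io, and this is not how the Frictionless tools are implemented.

```python
import csv
import io
from datetime import date

# Hypothetical schema, loosely following the Table Schema descriptor layout.
# The field names and constraints are illustrative assumptions only.
SCHEMA = {
    "fields": [
        {"name": "date", "type": "date", "constraints": {"required": True}},
        {"name": "state", "type": "string", "constraints": {"required": True}},
        {"name": "city", "type": "string"},
        {"name": "confirmed", "type": "integer", "constraints": {"minimum": 0}},
        {"name": "deaths", "type": "integer", "constraints": {"minimum": 0}},
    ]
}

# How each declared type is parsed from the raw CSV text.
CASTERS = {
    "date": date.fromisoformat,  # expects YYYY-MM-DD
    "integer": int,
    "string": str,
}


def validate_csv(text, schema=SCHEMA):
    """Return a list of (row_number, field_name, message) tuples."""
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    expected = [field["name"] for field in schema["fields"]]
    if reader.fieldnames != expected:
        message = f"header mismatch: expected {expected}, got {reader.fieldnames}"
        return [(1, None, message)]
    # Data rows start on line 2, right after the header.
    for row_number, row in enumerate(reader, start=2):
        for field in schema["fields"]:
            name = field["name"]
            raw = (row[name] or "").strip()
            constraints = field.get("constraints", {})
            if raw == "":
                if constraints.get("required"):
                    errors.append((row_number, name, "missing required value"))
                continue
            try:
                value = CASTERS[field["type"]](raw)
            except ValueError:
                errors.append((row_number, name, f"not a valid {field['type']}: {raw!r}"))
                continue
            minimum = constraints.get("minimum")
            if minimum is not None and value < minimum:
                errors.append((row_number, name, f"{value} is below minimum {minimum}"))
    return errors


# Made-up sample data with a few deliberate mistakes.
SAMPLE = """date,state,city,confirmed,deaths
2020-04-01,SP,Campinas,100,5
01/04/2020,RJ,Niteroi,-3,abc
,MG,Belo Horizonte,10,0
"""

for row_number, name, message in validate_csv(SAMPLE):
    print(f"line {row_number}, field {name}: {message}")
```

With a schema agreed upon in advance, a check like this could run before a report is published, so that a wrongly formatted date or a negative case count is caught at the source rather than by whoever aggregates the data later.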
The local chapter of Open Knowledge has even been evaluating the transparency of states regarding the disclosure of Covid-19 data, but it does not propose a specific schema, nor are the data points required in the “content” part of the score based on any sort of international “standard” for Covid-19 data.
In fact, each international source I have looked into uses a different schema and has different data. For instance, the Johns Hopkins CSSE dataset features daily data on recovered cases, whereas most other international datasets don’t. I don’t think an internationally agreed standard schema for Covid-19 data exists, but it definitely should.
Are you aware of any data standardization efforts for Covid-19 data?