Geo Data Package


#42

Thanks @henrykironde. The example I was thinking of is point data in a CSV:

| facility id | facility name                 | address                | lat   | lon |
|-------------|-------------------------------|------------------------|-------|-----|
| 1           | St. John’s Hall               | 1 jones st, london     | 51.5  | 0.1 |
| 2           | Data Wranglers Community Hall | 2 smith st, west ham   | na    | na  |
| 3           | Coordinate’s Anonymous Hall   | 3 fiona rd, shoreditch | 51.55 |     |
| 4           | Missing Values Drop-in Centre | 4 shrek lane, soho     |       |     |
| 5           | St. John’s Childcare          | 1 jones st, london     | 51.5  | 0.1 |
| 6           | Pear Hall                     | 6 smith st, west ham   | na    |     |

The data is still useful for rows that have no lat, lon.

On validation:

  • row 1 is OK
  • row 2 is OK assuming "missingValues": "na" is defined
  • row 3 is invalid, but how is this tested if the required constraint can’t be used because missing values are allowed? Perhaps "locations": [{ "type": "lat-lon", "fields": {"latitude": "lat", "longitude": "lon"}}] could be used to check that both values are either present or missing?
  • row 4 is OK, default is "missingValues": ""
  • row 5 is OK, but note it is a duplicate of row 1 hence using primaryKey won’t work.
  • row 6 is OK, but not pretty

#43

Thanks for the clarification @Stephen, I like the example. What makes this file a spatial data file is that it has latitude and longitude columns, with at least one row containing both an actual latitude and an actual longitude. There will be many cases of such data, and I think it is totally fine as long as "missingValues" is defined.

On row 5 (described as OK, but a duplicate of row 1, so primaryKey won’t work): I would not call this a duplicate. The two rows may be locations in the same building but on different floors.
If I understand correctly, I am thinking of something like:

{
  "locations": [
    {
      "type": "lat-lon",
      "fields": {
        "latitude": "lat",
        "longitude": "lon"
      },
      "missingValues": "na"
    }
  ]
}
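
As a rough illustration of how a tool might consume such a descriptor, the Python sketch below checks that the columns named under "fields" actually exist in the CSV header. The "locations" property is only the proposal under discussion here, not part of any published spec, and the function name check_location_fields is hypothetical.

```python
import json

# The "type" and "fields" parts of the descriptor proposed above.
# "locations" is a proposal from this thread, not a published spec property.
descriptor = json.loads("""
{
  "locations": [
    {
      "type": "lat-lon",
      "fields": {"latitude": "lat", "longitude": "lon"}
    }
  ]
}
""")

# Header of the example CSV from the first post.
header = ["facility id", "facility name", "address", "lat", "lon"]


def check_location_fields(descriptor, header):
    """Return a list of problems found in the proposed "locations" entries."""
    problems = []
    for i, location in enumerate(descriptor.get("locations", [])):
        if location.get("type") != "lat-lon":
            problems.append(
                f"locations[{i}]: unsupported type {location.get('type')!r}"
            )
            continue
        fields = location.get("fields", {})
        for role in ("latitude", "longitude"):
            column = fields.get(role)
            if column is None:
                problems.append(f"locations[{i}]: no field given for {role}")
            elif column not in header:
                problems.append(
                    f"locations[{i}]: column {column!r} not in header"
                )
    return problems


print(check_location_fields(descriptor, header))  # prints []
```

An empty list means the descriptor and the header agree; the row-level check (both values present or both missing) would be a separate step.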

#44

Hi @henrykironde, you’re correct: using primaryKey for this is an incorrect hack. The primaryKey is facility id.

My point is that software that validates the data (like goodtables.io, tableschema.js or datapackage.js) will need to use the locations property to ensure that either:

  • both latitude and longitude are present
  • both latitude and longitude are missing based on missingValues
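
The rule above could be sketched in Python roughly as follows, using the example rows from the first post. This is not how goodtables.io, tableschema.js or datapackage.js work today; the function names are hypothetical, and the sketch assumes that both "" and "na" count as missing and that in rows 3 and 6 the filled cell is lat.

```python
import csv
import io

# The example rows from the first post, written out as CSV.
# Assumption: in rows 3 and 6 the filled cell is lat and lon is empty.
CSV_TEXT = """facility id,facility name,address,lat,lon
1,St. John's Hall,"1 jones st, london",51.5,0.1
2,Data Wranglers Community Hall,"2 smith st, west ham",na,na
3,Coordinate's Anonymous Hall,"3 fiona rd, shoreditch",51.55,
4,Missing Values Drop-in Centre,"4 shrek lane, soho",,
5,St. John's Childcare,"1 jones st, london",51.5,0.1
6,Pear Hall,"6 smith st, west ham",na,
"""

# Assumption: both the default "" and "na" are treated as missing.
MISSING_VALUES = frozenset({"", "na"})


def pair_is_consistent(lat, lon, missing_values=MISSING_VALUES):
    """True when lat and lon are both present or both missing."""
    lat_missing = lat.strip() in missing_values
    lon_missing = lon.strip() in missing_values
    return lat_missing == lon_missing


def invalid_rows(csv_text, lat_field="lat", lon_field="lon"):
    """Return the facility ids of rows that have exactly one coordinate."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["facility id"]
        for row in reader
        if not pair_is_consistent(row[lat_field], row[lon_field])
    ]


print(invalid_rows(CSV_TEXT))  # prints ['3']
```

Only row 3 is flagged. Row 6 passes because "na" and the empty string are both treated as missing, which matches the "OK, but not pretty" verdict in the first post.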