See initial discussion of this topic here: https://lists.okfn.org/pipermail/data-protocols/2015-July/000101.html
First can I say I am a long-time follower and huge fan of the
At Snowplow we are thinking of using JSON Table Schema in our Iglu schema
First a quick question - I couldn’t find a JSON Schema for the JSON Table
Schema. Has anybody written this yet?
More broadly: I’m not convinced that the current unitary JSON Table Schema
is a viable approach.
Different relational databases have different capabilities - for example, an
idiomatic table definition for Redshift should specify a SORTKEY and DISTKEY, and
indexes are not supported. This is distinct from Postgres DDL, which in
turn is distinct from BigQuery DDL, Vertica DDL etc.
For me, the value of a JSON Table Schema would be in making table DDL
declarative and composable. To be useful though, it must be possible to
generate valid, idiomatic (i.e. database-specific) DDL from a given instance
of a JSON Table Schema.
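
To illustrate the kind of generation meant here, a minimal Python sketch follows. It assumes a schema with a `fields` array of `name`/`type` descriptors; the `generate_ddl` function, the `TYPE_MAP` mappings and the `options` argument carrying `distkey`/`sortkey` are all hypothetical and not part of any spec. It is one possible shape under those assumptions, not a definitive implementation.

```python
# Hypothetical mapping from generic field types to column types per database.
# These choices are illustrative assumptions only.
TYPE_MAP = {
    "postgres": {
        "string": "TEXT",
        "integer": "BIGINT",
        "number": "DOUBLE PRECISION",
        "boolean": "BOOLEAN",
    },
    "redshift": {
        "string": "VARCHAR(65535)",
        "integer": "BIGINT",
        "number": "DOUBLE PRECISION",
        "boolean": "BOOLEAN",
    },
}


def generate_ddl(table, schema, dialect, options=None):
    """Render a CREATE TABLE statement for the given dialect.

    `options` holds database-specific settings (an assumption of this
    sketch), e.g. {"distkey": "id", "sortkey": ["id"]} for Redshift.
    """
    options = options or {}
    types = TYPE_MAP[dialect]
    columns = [
        "  {} {}".format(field["name"], types[field["type"]])
        for field in schema["fields"]
    ]
    ddl = "CREATE TABLE {} (\n{}\n)".format(table, ",\n".join(columns))
    # Only Redshift understands DISTKEY / SORTKEY table attributes.
    if dialect == "redshift":
        if "distkey" in options:
            ddl += "\nDISTKEY ({})".format(options["distkey"])
        if "sortkey" in options:
            ddl += "\nSORTKEY ({})".format(", ".join(options["sortkey"]))
    return ddl + ";"


schema = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "label", "type": "string"},
    ]
}
print(generate_ddl("events", schema, "postgres"))
print(generate_ddl("events", schema, "redshift",
                   {"distkey": "id", "sortkey": ["id"]}))
```

The same schema instance yields different DDL per database, and the Redshift-only settings have nowhere to live in the generic schema itself, which is the gap discussed in this message.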
Based on this, I’m leaning towards a JSON Table Schema which has
database-specific flavors. I think the two options here are:
- Create a separate definition document (in JSON Schema) for each
database that we want to support, or
- Create a unitary JSON Table Schema which uses enums (e.g. of
database-specific field-descriptor types) to support the differences.
The downside of the first option is that nothing guarantees a consistent
schema shape across the different database types. The second
option is a little more fiddly but probably more useful long-term.
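
To make the second option more concrete, here is a rough Python sketch of a single schema shape whose field types are checked against a per-database enum. The `database` key, the `FIELD_TYPES` sets and the `validate_schema` function are hypothetical names introduced for illustration, and the type lists are deliberately incomplete.

```python
# Hypothetical per-database enums of allowed field-descriptor types.
# Deliberately incomplete; a real definition would list every supported type.
FIELD_TYPES = {
    "postgres": {"text", "bigint", "boolean", "timestamp", "jsonb"},
    "redshift": {"varchar", "bigint", "boolean", "timestamp"},
}


def validate_schema(schema):
    """Return a list of error messages; an empty list means the schema is valid.

    The overall shape (a "database" key plus a "fields" array) is shared by
    every database; only the allowed values of "type" vary.
    """
    database = schema.get("database")
    if database not in FIELD_TYPES:
        return ["unsupported database: {!r}".format(database)]
    allowed = FIELD_TYPES[database]
    errors = []
    for field in schema.get("fields", []):
        if field.get("type") not in allowed:
            errors.append(
                "field {!r}: type {!r} is not valid for {}".format(
                    field.get("name"), field.get("type"), database
                )
            )
    return errors


schema = {
    "database": "redshift",
    "fields": [
        {"name": "id", "type": "bigint"},
        {"name": "payload", "type": "jsonb"},  # a Postgres type, rejected here
    ],
}
for error in validate_schema(schema):
    print(error)
```

Because every database shares the same outer shape, tooling can walk any schema the same way, while the enum keeps database-specific types from leaking into the wrong target.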
Does anybody have any thoughts on the above?