Table Schema Constraints UUID


#1

Hello. I am attempting to use goodtables.validate to validate a CSV file. In my JSON schema, I want to apply a constraint to a field that holds one or more UUIDs. The constraint should flag as invalid any row that has more than one UUID in the field. For example, some fields hold two or more UUIDs separated by a newline character.

I have tried using “format”=“uuid”, but that flagged all records as invalid. I have also tried a variety of “constraint” values based on some form of XML regular expression, and each one either rules all rows invalid or passes all rows.

Any ideas on what I am doing wrong?

Thank you!


#2

Forgot to add that I also tried using “maxlength”=36, and that passed all records as well, which seems really odd, so obviously I am doing something wrong.

Thanks!


#3

For a column called uuid, try this as the field in your Table Schema:

        {
            "name": "uuid",
            "type": "string",
            "format": "uuid"
        }

Can you share your data and schema?

Separating UUIDs with a newline doesn’t sound right.
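
To illustrate why a cell holding two newline-separated UUIDs is not itself a UUID, here is a minimal sketch using Python's standard uuid module. This is not the check goodtables performs internally; the helper name is_single_uuid is made up for the example, and the sample values are the ones posted in this thread.

```python
import uuid


def is_single_uuid(cell):
    """Return True if the whole cell parses as exactly one UUID.

    Note: uuid.UUID is lenient about hyphens, braces and a "urn:uuid:"
    prefix, so this is looser than a strict format check.
    """
    try:
        uuid.UUID(cell)
    except ValueError:
        return False
    return True


single = "A9753167-7DD8-4E42-9A4D-6AAEE531D0BB"
double = (
    "14D0B92B-8E40-435E-8E64-4AFFFBAFD2C3\n"
    "49640BDE-05D7-4EBA-8BC5-C27E6149CC32"
)

print(is_single_uuid(single))  # True
print(is_single_uuid(double))  # False
```

A check like this only tells you whether the cell as a whole is one UUID; it says nothing about why a validator might reject a cell that does contain a single UUID.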


#4

Re: maxLength … I realized that I had written maxlength (lowercase “l”), and hence it was not working. Now I can flag all the items that have more than 36 characters, which is one way to go about it, but I am still curious why “format”=“uuid” did not flag them when there were two UUIDs in the specified field.
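
As a rough sketch of the maxLength approach, the snippet below writes the field as a Python dict and applies the length limit by hand. In Table Schema the key is spelled maxLength and sits under "constraints". The helper exceeds_max_length is hypothetical and only mimics the idea; the assumption that a misspelled key is silently ignored matches what was observed here but is not taken from the goodtables source.

```python
# Field descriptor with the constraint spelled as Table Schema expects.
field = {
    "name": "RelUUID",
    "type": "string",
    "constraints": {"maxLength": 36},
}

# Same field with the misspelled key from the earlier attempt.
field_typo = {
    "name": "RelUUID",
    "type": "string",
    "constraints": {"maxlength": 36},
}


def exceeds_max_length(value, descriptor):
    """Hypothetical helper: True if value is longer than maxLength.

    A descriptor without a "maxLength" key imposes no limit, which is
    the assumed reason the misspelled version passed every record.
    """
    limit = descriptor.get("constraints", {}).get("maxLength")
    return limit is not None and len(value) > limit


single = "A9753167-7DD8-4E42-9A4D-6AAEE531D0BB"
double = (
    "14D0B92B-8E40-435E-8E64-4AFFFBAFD2C3\n"
    "49640BDE-05D7-4EBA-8BC5-C27E6149CC32"
)

print(exceeds_max_length(single, field))       # False (exactly 36 characters)
print(exceeds_max_length(double, field))       # True  (73 characters)
print(exceeds_max_length(double, field_typo))  # False (no limit recognised)
```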

Here is the relevant section of the schema

        {
            "name":"RelUUID",
            "title":"RelUUID",
            "type":"string",
            "format": "uuid",
            "description":"The global unique identifier(s) of the item(s) to which this sample belongs."
        },

RelUUID refers to the last field in the row below. You can see the two UUIDs there, separated by a newline. This comes from an application that allows the user to select more than one UUID (hence the problem) and exports them to the CSV with the newline separator. “D111a” and “D111a 5” are the object names (also separated by a newline), and directly after them are the two UUID values.

D111 Sample 36 ochre field 2016-09-03T10:05:20Z CD 3-Sep-16 2016-09-03T10:12:55Z Europe/Athens A9753167-7DD8-4E42-9A4D-6AAEE531D0BB “D111a
D111a 5” "14D0B92B-8E40-435E-8E64-4AFFFBAFD2C3

Thanks for your thoughts.
49640BDE-05D7-4EBA-8BC5-C27E6149CC32"
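
For reference, a quoted field that spans two lines, like the one above, is legal CSV, and Python's csv module reads it as a single cell. The sketch below uses a simplified, hypothetical two-column sample (Name and RelUUID) rather than the real export, and the function rows_with_multiple_values is made up for the example. It lists the rows whose cell holds more than one line.

```python
import csv
import io

# Hypothetical, simplified sample. The last row has a quoted cell that
# contains a newline between two UUIDs.
sample = (
    "Name,RelUUID\r\n"
    "D111,A9753167-7DD8-4E42-9A4D-6AAEE531D0BB\r\n"
    'D111a,"14D0B92B-8E40-435E-8E64-4AFFFBAFD2C3\n'
    '49640BDE-05D7-4EBA-8BC5-C27E6149CC32"\r\n'
)


def rows_with_multiple_values(text, column):
    """Return (row_number, values) for cells that span several lines.

    Row numbers count logical CSV records, with the header as row 1.
    """
    flagged = []
    # newline="" keeps embedded newlines in quoted cells untouched.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=2):
        values = row[column].splitlines()
        if len(values) > 1:
            flagged.append((row_number, values))
    return flagged


for row_number, values in rows_with_multiple_values(sample, "RelUUID"):
    print(row_number, values)
```

With this sample only row 3 is reported, with its two UUIDs split into a list.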


#5

Hmm. I see that the input box inserted my sign-off between the two UUIDs.


#6

Sorry, in the above message it should say: “but I am still curious why “format”=“uuid” flags ALL the rows even when there is only a single UUID in the relevant field.”


#7

@nathanxmeyer here’s an example CSV with UUIDs and a Table Schema, with validation results from try.goodtables.io, that may help.

Can you reformat the data as a valid CSV? You can use Goodtables to validate a CSV without a schema, which may provide some clues. Use goodtables.yml to control your goodtables.io validation, e.g. files: '*.csv'