The (Open) Data Consumer's Checklist

Key considerations for users of open data

Overview of today

  1. an example data set
  2. the general case
  3. open data certificates (if we have time)
  4. questions

Considerations for the general case

  1. Accessibility
  2. Ownership and licensing
  3. Form
  4. Quality
  5. Support

Accessibility (1/5)

  • is the data already available?
    if so, where?
  • how can you access it?
    dumps? an API?
  • in what format is the data published?
    CSV? XML? JSON? PDF?!
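
A minimal sketch of answering the format question for a file you have already downloaded, by trying each parser in turn. The function name guess_format is hypothetical, and the approach assumes the whole file body fits in memory; a file extension or Content-Type header may disagree with what this finds.

```python
import csv
import json
import xml.etree.ElementTree as ET


def guess_format(body: bytes) -> str:
    """Guess the format of a downloaded file from its full contents."""
    # PDF files start with a fixed signature.
    if body.startswith(b"%PDF-"):
        return "pdf"
    text = body.decode("utf-8", errors="replace").strip()
    # Try the strict parsers first; each raises on input it cannot read.
    try:
        json.loads(text)
        return "json"
    except ValueError:
        pass
    try:
        ET.fromstring(text)
        return "xml"
    except ET.ParseError:
        pass
    # csv.Sniffer is a heuristic: it looks for a consistent delimiter.
    try:
        csv.Sniffer().sniff(text)
        return "csv"
    except csv.Error:
        return "unknown"
```

Treat the result as a hint only: a one-column CSV or a bare number can be misread, so a quick look at the file is still worthwhile.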

Ownership and licensing (2/5)

  • who publishes the data?
  • are they the originator of the data?
  • under what licence is the data published?
  • does it contain personal data?

Form (3/5)

  • prior processing
    is the data in raw or summary form?
    how has it been processed?
  • form
    shape, granularity, etc.
    how will these affect your analysis/product/application?
    is the form compatible with other data you are using?
  • transformations
    what syntactic and semantic transformations will you need to make?
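
One way to picture the difference between the two kinds of transformation, as a sketch only. The row, its field names (date, temp_f) and the day/month/year date convention are invented for illustration; a real data set will need its own rules.

```python
from datetime import date, datetime


def syntactic(raw: dict) -> dict:
    """Change representation only: strings become typed values."""
    return {
        # Assumes the publisher writes dates as day/month/year.
        "day": datetime.strptime(raw["date"], "%d/%m/%Y").date(),
        "temp_f": float(raw["temp_f"]),
    }


def semantic(typed: dict) -> dict:
    """Change what the value means: Fahrenheit becomes Celsius."""
    return {
        "day": typed["day"],
        "temp_c": round((typed["temp_f"] - 32) * 5 / 9, 1),
    }


raw_row = {"date": "03/04/2014", "temp_f": "68.0"}  # invented example row
clean_row = semantic(syntactic(raw_row))
```

Keeping the two steps separate makes it easier to see which assumptions are about notation (the date pattern) and which are about meaning (the unit).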

Quality (4/5)

  • currency and regularity
    how current is the data?
    how regularly is it updated?
    for how long will it be published?
    what is the commitment by the publisher?
  • comprehensibility
    do you understand all the fields and their context?
  • accuracy
    what do you know about the accuracy of the data?
    how is missing data handled?
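
A rough sketch of checking how missing data shows up in a CSV file, by counting suspicious cells per column. The token list is an assumption: publishers use different placeholders, so check the documentation before relying on it. The name missing_counts is hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed placeholder tokens for "no value"; adjust per data set.
MISSING_TOKENS = {"", "na", "n/a", "null", "-"}


def missing_counts(csv_text: str) -> dict:
    """Count, per column, the cells that look like missing values."""
    counts = Counter()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        for field, value in row.items():
            if field is None:
                continue  # extra cells beyond the header row
            # Short rows yield None for absent cells; treat those as missing.
            if (value or "").strip().lower() in MISSING_TOKENS:
                counts[field] += 1
    return {field: counts[field] for field in reader.fieldnames or []}
```

For example, a column where every tenth cell is "N/A" tells you something different from one where the last three months are blank.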

Support (5/5)

  • (how) is the data set documented?
  • does the metadata make sense?
  • is there a place you can report errors in the data?
  • does the publisher offer support in any way?

Open Data Certificates

  • the first robust quality badge for open data
  • helps...
    publishers certify their data
    users find and use it
    policy makers benchmark

Quick, what are the three most important questions?

  1. how can you use the data?
  2. is the quality sufficient and appropriate?
  3. will the data be available in the future?



