On March 26th, 2019, thanks to Elena Simperl, I was invited along with her postdoc Luis Ibanez Gonzalez to give a workshop on “Metadata and Data Quality – Practice and Tools” at the GovData office in Hamburg, Germany. The workshop was attended by 25 participants from 12 EU countries. These participants are the EU member state representatives who are responsible for and/or maintain their countries’ data portals.
The workshop started with the theory of data quality: how it is defined, and which dimensions and metrics are used for quality assessment. This was followed by guidelines for managing portals’ data, specifically for interlinking and link quality. After lunch, the participants were introduced to the FAIR data principles and metrics. They were shown specific metrics and model answers for each of them, followed by a discussion on which metrics are most applicable to data portals.
Finally, we discussed techniques for assessing and improving quality, covering both automated methods and ways to involve humans in the loop. This included an overview of tools that can be used for quality assessment, such as spreadsheets, Trifacta Wrangler, Jupyter notebooks and OpenRefine.
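To give a flavour of what an automated check combined with a human-in-the-loop step could look like in a Jupyter notebook, here is a minimal sketch in Python. It is not taken from the workshop material: the list of required fields, the sample records and the 0.75 threshold are all hypothetical, and completeness is only one of many possible quality dimensions.

```python
from typing import Any

# Hypothetical list of required metadata fields (an assumption for illustration,
# not a list used in the workshop or mandated by any portal).
REQUIRED_FIELDS = ("title", "description", "license", "publisher")


def completeness(record: dict[str, Any], fields: tuple[str, ...] = REQUIRED_FIELDS) -> float:
    """Return the fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in fields if str(record.get(f) or "").strip())
    return filled / len(fields)


def flag_for_review(records: list[dict[str, Any]], threshold: float = 0.75) -> list[dict[str, Any]]:
    """Return the records whose completeness is below the threshold,
    so that a person can inspect and fix them (the human-in-the-loop part)."""
    return [r for r in records if completeness(r) < threshold]


# Made-up sample records, for illustration only.
records = [
    {"title": "Air quality 2018", "description": "Hourly readings",
     "license": "CC-BY-4.0", "publisher": "City A"},
    {"title": "Bus stops", "description": "", "license": None, "publisher": "City B"},
]

scores = [completeness(r) for r in records]
to_review = flag_for_review(records)
```

In this sketch the automated part only computes a score per record; deciding what to do with the flagged records is left to a person.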
The participants gave positive feedback and left with a lot of food for thought on the quality criteria and methods that are relevant to, and can be implemented in, their respective data portals.
The slides are available here:
- Slides: Data quality assessment dimensions and metrics
- Slides: Crowdsourcing linked data quality assessment
- Slides: FAIR principles and metrics
#opendata #dataquality #FAIR #LinkedData #EUDataPortal #DataPortal