Peer-Reviewed Publications
-
Sebastian Neumaier, Jürgen Umbrich, and Axel Polleres.
Automated quality assessment of metadata across open data portals.
ACM Journal of Data and Information Quality (JDIQ), 2016.
The Open Data movement has become a driver for publicly available data on the Web. More and more data -- from governments, public institutions but also from the private sector -- is made available online and is mainly published in so-called Open Data portals. However, with the increasing number of published resources, there are a number of concerns with regard to the quality of the data sources and the corresponding metadata, which compromise the searchability, discoverability and usability of resources. In order to get a more complete picture of the severity of these issues, the present work aims at developing a generic metadata quality assessment framework for various Open Data portals: we treat data portals independently from the portal software frameworks by mapping the specific metadata of three widely used portal software frameworks (CKAN, Socrata, OpenDataSoft) to the standardized DCAT metadata schema. We subsequently define several quality metrics, which can be evaluated automatically and in an efficient manner. Finally, we report findings based on monitoring a set of over 260 Open Data portals with 1.1M datasets. This includes the discussion of general quality issues, e.g. the retrievability of data, and the analysis of our specific quality metrics.
-
Jürgen Umbrich, Sebastian Neumaier, and Axel Polleres.
Quality assessment & evolution of open data portals.
In IEEE International Conference on Open and Big Data, Rome, Italy, August 2015. Best paper award.
Despite the enthusiasm caused by the availability of a steadily increasing amount of openly available, structured data, first critical voices appear addressing the emerging issue of low quality in the metadata and data sources of Open Data portals, which is a serious risk that could disrupt the Open Data project. However, there exist no comprehensive reports about the actual quality of Open Data portals. In this work, we present our efforts to monitor and assess the quality of 82 active Open Data portals, powered by organisations across 35 different countries. We discuss our quality metrics and report comprehensive findings by analysing the data and the evolution of the portals since September 2014. Our results include findings about a steady growth of information, a high heterogeneity across the portals for various aspects and also insights on openness, contactability and the availability of metadata.
-
Jürgen Umbrich, Sebastian Neumaier, and Axel Polleres.
Towards assessing the quality evolution of open data portals.
In ODQ2015: Open Data Quality: from Theory to Practice Workshop, Munich, Germany, March 2015.
In this work, we present the Open Data Portal Watch project, a public framework to continuously monitor and assess the (meta-)data quality in Open Data portals. We critically discuss the objectiveness of various quality metrics. Further, we report on early findings based on 22 weekly snapshots of 90 CKAN portals and highlight interesting observations and challenges.
Bachelor Theses
-
Mattias Blaim.
Quality and Compatibility of License Information in Open Data portals.
Bachelor thesis, Vienna University of Economics and Business, Vienna, Austria, December 2014.
-
Norbert Walter.
OpenData@WU. A data-pipeline for the WU Bach-API.
Bachelor thesis, Vienna University of Economics and Business, Vienna, Austria, 2015.