System information

Technology

The core of the Open Data portal watch system is implemented using Python.

Backend

The backend technology stack currently consists of PostgreSQL 9.4, which stores the collected metadata, and several Python modules for harvesting, assessing and analysing that metadata.

Frontend

The current frontend technology stack uses Python Flask with Tornado as the HTTP server, together with jQuery and Semantic UI, a modern front-end development framework powered by LESS and jQuery.

HTTP Lookups

We perform all our requests from the following IP address: 137.208.107.58

Harvesting metadata

The framework accesses the available APIs of the monitored portals once a week. To keep the load on the servers low, we wait 1-5 seconds between two consecutive requests to the same domain and use pagination for the APIs.
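
The following is a minimal sketch of paginated harvesting with a randomised wait between requests. It is an illustration only, not the actual harvester code: the function name harvest_paginated, the fetch_page callable, and the page size of 100 are assumptions made for this example, and only the 1-5 second wait range comes from the description above.

```python
import random
import time
from typing import Callable, Iterator, List


def harvest_paginated(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 100,      # assumed page size, not taken from the real system
    min_wait: float = 1.0,     # lower bound of the wait between requests (seconds)
    max_wait: float = 5.0,     # upper bound of the wait between requests (seconds)
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield all records of one portal, one page at a time.

    fetch_page(offset, limit) is a hypothetical callable that performs a
    single API request and returns at most `limit` records.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size
        # Pause before the next request to the same domain.
        sleep(random.uniform(min_wait, max_wait))
```

Passing the sleep function as a parameter keeps the sketch easy to try out without real delays; in a real run the default time.sleep would be used.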

HTTP HEAD lookups

As part of one of our quality metrics, the system performs HTTP HEAD lookups on the resources to check their availability; the lookups run in multiple threads. Our system respects the policies specified in each server's robots.txt file. To avoid overloading a server in a way that could resemble a denial-of-service attack, we wait between consecutive HEAD lookups on the same domain. The per-domain wait time is either taken from the robots.txt file or set to a default value of 5 seconds.
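
The per-domain wait time described above could be derived roughly as in the sketch below, which uses the robots.txt parser from the Python standard library. This is an assumption-laden illustration, not our production code: the user agent string "example-bot", the helper wait_time_from_robots, and the DomainThrottle class are hypothetical names, and the sketch is single-threaded for brevity.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

DEFAULT_WAIT = 5.0          # default per-domain wait in seconds
USER_AGENT = "example-bot"  # hypothetical user agent, for illustration only


def wait_time_from_robots(robots_txt: str, user_agent: str = USER_AGENT) -> float:
    """Return the Crawl-delay from a robots.txt body, or the default wait."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None if no Crawl-delay applies
    return float(delay) if delay is not None else DEFAULT_WAIT


class DomainThrottle:
    """Remember, per domain, the earliest time the next lookup may start."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._next_allowed = {}
        self._clock = clock
        self._sleep = sleep

    def wait(self, url: str, delay: float) -> None:
        """Block until a lookup on the domain of `url` is allowed again."""
        domain = urlsplit(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(domain, now)
        if ready_at > now:
            self._sleep(ready_at - now)
            now = ready_at
        # The next lookup on this domain must wait at least `delay` seconds.
        self._next_allowed[domain] = now + delay
```

A lookup loop would call wait(url, delay) before each HEAD request, with delay computed once per domain from that domain's robots.txt. A multi-threaded version would additionally need a lock around the shared dictionary.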