The core of the Open Data Portal Watch system is implemented in Python.
The backend technology stack currently consists of Postgres 9.4, which stores the collected metadata, and several Python modules for harvesting, assessing and analysing the metadata.
The current frontend technology stack uses Python Flask with Tornado as the HTTP server, together with jQuery and Semantic UI, a modern front-end development framework powered by LESS and jQuery.
We perform all our requests from the following IP address: 18.104.22.168
The framework accesses the available APIs of the monitored portals once a week.
We try to minimise the number of requests on the servers by applying wait times of 1-5 seconds between two consecutive requests on the same domain and by using pagination for APIs.
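The harvesting behaviour described above could look roughly like the following Python sketch. It is an illustration only, not the system's actual code: the names DomainThrottle, harvest and fetch_page, and the query parameters rows and start, are hypothetical, and picking a random wait within the 1-5 second range is an assumption.

```python
import random
import time
from urllib.parse import urlparse


class DomainThrottle:
    """Keep a 1-5 second gap between consecutive requests to one domain."""

    def __init__(self, min_wait=1.0, max_wait=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_wait = min_wait
        self.max_wait = max_wait
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # domain -> earliest time of the next request

    def wait(self, url):
        """Block until a request to the domain of `url` is allowed."""
        domain = urlparse(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(domain, now)
        if ready_at > now:
            self._sleep(ready_at - now)
            now = max(self._clock(), ready_at)
        # Assumption: the gap is drawn at random from the 1-5 second range.
        gap = random.uniform(self.min_wait, self.max_wait)
        self._next_allowed[domain] = now + gap


def harvest(base_url, fetch_page, throttle, page_size=100):
    """Yield all items of a paginated API, one page per request.

    `fetch_page(url)` is a hypothetical callable that returns the list of
    items on one page; the `rows` and `start` parameters are placeholders
    for whatever pagination scheme the portal API offers.
    """
    offset = 0
    while True:
        url = f"{base_url}?rows={page_size}&start={offset}"
        throttle.wait(url)
        items = fetch_page(url)
        if not items:  # an empty page marks the end of the result set
            break
        yield from items
        offset += len(items)
```

Passing the clock and sleep functions as parameters keeps the throttle easy to test without real waiting.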
HTTP HEAD lookups
As part of one of our quality metrics, the system performs HTTP HEAD lookups on the resources to check their availability, using multiple threads.
Our system respects the policies specified in each domain's robots.txt file.
To avoid causing denial-of-service conditions, we use wait times between consecutive HEAD lookups on the same domain.
The per-domain wait time is either extracted from the robots.txt or set to a default value of 5 seconds.
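A minimal sketch of this lookup policy, using only the Python standard library, is shown below. It is not the system's actual code: the function names, the user agent string "example-bot" and the choice of one worker task per domain are assumptions made for illustration. Note that the standard-library robots.txt parser only recognises whole-number Crawl-delay values.

```python
import time
import urllib.error
import urllib.request
import urllib.robotparser
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

DEFAULT_DELAY = 5.0         # seconds, used when robots.txt gives no delay
USER_AGENT = "example-bot"  # hypothetical user agent name


def parse_robots(robots_txt):
    """Parse the text of a robots.txt file."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def wait_time(parser):
    """Return the Crawl-delay from robots.txt, or the default wait time."""
    delay = parser.crawl_delay(USER_AGENT) if parser is not None else None
    return float(delay) if delay is not None else DEFAULT_DELAY


def http_head(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None on failure."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (OSError, ValueError):
        return None


def check_domain(urls, delay, head, sleep):
    """Check the URLs of one domain in sequence, waiting between lookups."""
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)
        results[url] = head(url)
    return results


def check_all(urls, robots, head=http_head, sleep=time.sleep, workers=8):
    """HEAD-check URLs; `robots` maps a domain to its parsed robots.txt."""
    by_domain = defaultdict(list)
    for url in urls:
        domain = urlparse(url).netloc
        parser = robots.get(domain)
        # Skip URLs that the domain's robots.txt disallows.
        if parser is None or parser.can_fetch(USER_AGENT, url):
            by_domain[domain].append(url)
    results = {}
    # Domains run in parallel; lookups within one domain stay sequential.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(check_domain, group,
                        wait_time(robots.get(domain)), head, sleep)
            for domain, group in by_domain.items()
        ]
        for future in futures:
            results.update(future.result())
    return results
```

Grouping the URLs by domain means that threads never issue two lookups to the same server at once, so the per-domain wait time holds without extra locking.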