Cucumber tests for regulatory data?

16 July 2014

This is a write-up of an idea that came out of the Environment Agency hackday.

How do we know software is working?

We can run the software and look out for bugs. For an open-source project, we can inspect the code.

We can also write and run automated tests against the software, so if it’s broken we know. Something like this:

GIVEN a user is logged in
    WHEN they click on the 'my account' link
    THEN they can view their billing history 

That example is written in a format called Gherkin, the syntax used by the Cucumber testing tool. It is designed to be read and written by a non-technical person, yet run automatically by a machine.

How about a scientific experiment? How is that verified?

Since the 17th century, it’s been via the publication of the results, along with a description of how to replicate the experiment, in a peer-reviewed journal.

Increasingly that needs to include any source code used to run the experiment.

Finally, what about regulatory authorities? How does society verify that regulations are being met? How do we check for breaches?

The relevant authority probably publish reports listing any breaches of regulations. They may also publish raw open data, so that the evidence behind those reports is transparent.

The Environment Agency, for example, publish data on pollution incidents, ground water abstraction and water quality.

So we could look through that open data, and manually check it against the various regulations, legislation and policies.

But what if we applied software testing principles to regulatory data and wrote automated tests to show up any breaches of regulations? Tests that are understandable by people and can be run by machines.

They might look like this:

GIVEN there is a release of Sulphur Dioxide from a power station
    WHEN the concentration of Sulphur Dioxide is greater than 40 ppm
    THEN there has been a breach of regulation E209
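
As a rough sketch of what a machine might do behind a test like that, here is some Python. It assumes the open data arrives as a list of readings, each with a site name and a concentration in ppm. The Reading class, its field names and the find_breaches function are all made up for illustration, the 40 ppm limit and regulation E209 are just the example values from the test, and the sample data is invented.

```python
from dataclasses import dataclass

# Illustrative limit taken from the example test, not a real regulation.
SO2_LIMIT_PPM = 40.0


@dataclass
class Reading:
    site: str        # hypothetical field: name of the power station
    so2_ppm: float   # hypothetical field: measured concentration in ppm


def find_breaches(readings, limit_ppm=SO2_LIMIT_PPM):
    """Return the readings whose concentration is greater than the limit."""
    return [r for r in readings if r.so2_ppm > limit_ppm]


# Invented sample data, for illustration only.
sample = [
    Reading("Station A", 12.5),
    Reading("Station B", 41.0),
    Reading("Station C", 40.0),  # exactly at the limit, so not a breach
]

for breach in find_breaches(sample):
    print(f"Breach of regulation E209 at {breach.site}: {breach.so2_ppm} ppm")
```

The comparison is strictly "greater than", to match the wording of the test. Whether a reading exactly at the limit counts as a breach is the kind of detail that writing the test down forces someone to decide.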

Or this:

GIVEN a company with license R409
    WHEN it abstracts > 100L in a 30 day period
    THEN there has been a breach of license R409
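
The "30 day period" in that test hides a decision about how the period is counted. One possible reading, sketched in Python below, is to start a 30 day window on every day that has an abstraction and add up everything inside it. The breach_windows function, the shape of the records (a date and a number of litres) and the sample figures are assumptions for illustration, not a real Environment Agency data format.

```python
from datetime import date, timedelta

LIMIT_LITRES = 100   # example value from the test, not a real license term
WINDOW_DAYS = 30


def breach_windows(abstractions, limit=LIMIT_LITRES, window_days=WINDOW_DAYS):
    """Return (window_start, total_litres) for every window that starts on a
    day with an abstraction and whose total is greater than the limit."""
    breaches = []
    # Start one window on each distinct day that has an abstraction.
    for start in sorted({day for day, _ in abstractions}):
        end = start + timedelta(days=window_days)
        total = sum(litres for day, litres in abstractions if start <= day < end)
        if total > limit:
            breaches.append((start, total))
    return breaches


# Invented records for license R409: (date, litres abstracted)
records = [
    (date(2014, 6, 1), 40),
    (date(2014, 6, 15), 40),
    (date(2014, 6, 29), 30),
    (date(2014, 8, 1), 50),
]

for start, total in breach_windows(records):
    print(f"Breach of license R409: {total}L in the 30 days from {start}")
```

A regulator might instead count calendar months or fixed reporting periods. Publishing the test would make that choice visible.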

Regulatory bodies could start publishing tests like these alongside the open data they release, showing their workings to demonstrate how they are regulating.

Other interested parties - campaign groups, charities, parliament - could review tests against legislation and run them against the data to independently verify that the regulatory body is doing its job.