
Report of the Digital Government Review

Anonymity: no guarantees


It is important to recognise the risks when dealing with data, especially personal data. Most risks arise when individuals can be identified within the data. Hence much shared data is anonymised, with the aim that even those who collect and analyse the entire data set can identify no individual.

Organisations will frequently state that data is safe to be shared or released because it has been anonymised and no individual can be identified. This is, unfortunately, an oversimplification. Despite the use of the best algorithms and the best obfuscation techniques, it is not possible to guarantee that no one can be identified within anonymised data. As the UK Anonymisation Network (UKAN) states, “As with any security measure anonymisation is not foolproof” [54].

There are a number of reasons for this: continuing advances in statistical techniques, continuing advances in computing technology, and the continuing availability of additional datasets which can be linked to anonymised data to aid identification. We would recommend that those wishing to understand the detail read Paul Ohm’s 2010 paper “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization” [55] or explore the excellent set of resources collated by UKAN.
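The linkage risk described above can be illustrated with a minimal sketch. All of the records, field names and datasets below are invented for illustration; real linkage attacks work on the same principle but at far greater scale, joining an “anonymised” release to an auxiliary dataset on shared quasi-identifiers such as postcode and birth year.

```python
# Hypothetical illustration of a linkage attack: a dataset with names
# removed ("anonymised") is re-identified by joining it to a second,
# publicly available dataset on shared quasi-identifiers.
# All records here are invented for illustration only.

anonymised = [
    {"postcode": "AB1 2CD", "birth_year": 1975, "diagnosis": "asthma"},
    {"postcode": "EF3 4GH", "birth_year": 1982, "diagnosis": "diabetes"},
]

public_register = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1975},
    {"name": "Bob Example", "postcode": "EF3 4GH", "birth_year": 1982},
]

def link(anon, register):
    """Join the two datasets on the shared quasi-identifiers."""
    reidentified = []
    for record in anon:
        matches = [p for p in register
                   if p["postcode"] == record["postcode"]
                   and p["birth_year"] == record["birth_year"]]
        if len(matches) == 1:  # a unique match re-identifies the record
            reidentified.append({"name": matches[0]["name"], **record})
    return reidentified

for row in link(anonymised, public_register):
    print(row["name"], "->", row["diagnosis"])
```

The point of the sketch is that no field in the anonymised dataset is identifying on its own; it is the *combination* of quasi-identifiers, matched against an external dataset the releasing organisation does not control, that re-identifies individuals.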

Once we accept this fact, it is easy to become extremely alarmed about the information that has already been released and which might – we would stress might – be re-identified in the future. If, for example, people’s blood types were exposed in a hypothetical future leak of NHS data, cases where a child’s legal father is unaware that he is not the biological father could be exposed. This would clearly cause significant upset.

Being scared is different to being informed, to understanding and communicating risk, and to making informed decisions about how data is used or not used.

But in understanding the risk we need to start by making the assumption that it is not possible to guarantee anonymity of personal data.

This is not the current starting assumption for many policy makers. There are some organisations that will hold to high anonymity standards but there are many that have failed and created a higher risk of disclosure of sensitive personal data by over-stating the power of an algorithm and under-estimating the risk of re-identification.

Public sector organisations should start with the assumption that it is not possible to guarantee anonymity of personal data.

“To unlock the potential of the IoT we need a data-handling framework that categorizes different types of data and associated management strategies. Its aim should be to reassure consumers while at the same time liberating data to drive innovation.” – Large Company


