Data anonymisation as a safe method of personal data protection

2019/07/22

The year 2018 was the year of GDPR. The decision of the European Union to tighten up the rules on processing and protection of personal data has affected the activities of companies worldwide. One of the requirements that is part of the new regulation is to store and process personal data in such a way as to ensure the security of data subjects. To meet this requirement, pseudonymisation and anonymisation of personal data are used, among others. From this article, you will learn more about these two techniques.

Key topics related to GDPR

When dealing with personal data, whether processed to sign a contract (e.g. sales contracts) or for marketing purposes, you must:

Appoint a personal data controller who will be responsible for the security of personal data,
Keep a catalog of personal data sets, their processing records, as well as possible violations,
Inform the data subjects about incidents in case of abuse or unlawful processing of personal data,
Delete the data that was collected to perform outdated activities, e.g. to send a newsletter when a subscriber canceled their subscription.

Consumers gained a number of rights they hadn’t had before, for instance:

The right to be forgotten, i.e. to have their personal data permanently deleted from a company’s database,
The right to receive a copy of their data and the record of activities it was used for,
The right to transfer their personal data to another entity,
The right to obtain information about the details of their data processing (such as the purpose and the time until when the data will be processed) without having to ask for it.

Company owners had been obliged to take care of personal data security even before GDPR came into force. What caused the greatest surprise, however, were the heavy penalties for violations and the fact that you have to determine on your own how to protect personal data, e.g. to avoid leaks. And data leaks are more and more frequent. For instance, just a year after GDPR came into force, the popular graphic design service Canva.com was attacked. During the incident, the attackers may have gained access to data from up to 139 million accounts! As for the Polish market, the Morele.net case was a high-profile one: customers’ data leaked from the service and was published online.

What is pseudonymisation, and is it sufficient to protect personal data

To meet entrepreneurs’ expectations, the legislators suggested that reversible anonymisation, called “pseudonymisation” in the regulation, should be used as a way to reduce the risk of a personal data security breach. It involves processing personal data in such a way that it is impossible to identify a specific person without additional information, e.g. a key used for the data encryption. Importantly, the encrypted data and the information that lets you reverse the process should be stored in different locations.

Please note that with pseudonymisation you cannot be 100% sure that the personal data you collect and process is secure. If both the data and the decrypting information fall into the wrong hands, confidential data may leak and then be attributed to specific individuals.
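To make the idea more concrete, here is a minimal Python sketch of pseudonymisation. It is only an illustration under our own assumptions: the function names `pseudonymise` and `reidentify`, the sample records and the use of random tokens with a lookup table (instead of encryption with a key) are all hypothetical and are not prescribed by the regulation.

```python
import secrets


def pseudonymise(records, field):
    """Replace `field` in each record with a random token.

    Returns the pseudonymised records and a separate lookup table
    (token -> original value) that plays the role of the "key".
    """
    lookup = {}  # must be stored apart from the pseudonymised records
    seen = {}    # original value -> token, so repeated values map consistently
    result = []
    for record in records:
        original = record[field]
        if original not in seen:
            token = secrets.token_hex(8)
            seen[original] = token
            lookup[token] = original
        result.append({**record, field: seen[original]})
    return result, lookup


def reidentify(records, field, lookup):
    """Reverse the process; only possible for whoever holds the lookup table."""
    return [{**record, field: lookup[record[field]]} for record in records]


# Hypothetical sample data, for illustration only.
customers = [
    {"email": "anna@example.com", "order_total": 120},
    {"email": "jan@example.com", "order_total": 45},
    {"email": "anna@example.com", "order_total": 30},
]
pseudonymised, key_table = pseudonymise(customers, "email")
```

In this sketch, `pseudonymised` could be handed to an analyst, while `key_table` stays in a different location. Anyone who obtains both can call `reidentify` and recover the original e-mail addresses, which is exactly the risk described above.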

Data anonymisation as a safer alternative

Companies that process large amounts of personal data, especially sensitive data, should consider full data anonymisation, particularly if external entities, such as programmers or server administrators, have access to it. It may not be enough to sign a contract that obliges them to take care of your customer data security. You have no influence on what will happen on their side. If there is a data security breach that is not your fault, but you are the data controller, you will be responsible for this situation, and you will be the one to bear all the consequences.

The difference between data anonymisation and pseudonymisation is that the former is irreversible. There is no key to decrypt the data and assign it to a specific person. It means that personal data that has been anonymised is no longer personal data according to GDPR. In other words, anonymisation removes the link between the data and the data subjects.

It is also worth noting that thanks to anonymisation, you do not need to get consent to personal data processing, and you may store the data for an unlimited period of time.

How to anonymise your data

In the past, when we stored personal data mainly on paper, the easiest way to anonymise it was simply to black it out with a marker. However, this is a time-consuming method, and fortunately, with electronic data, we can automate the process. There are several methods of anonymisation, for instance:

Randomisation – which means removing the link between the data and its holder by randomly separating them,
Generalisation – that is making data less precise, e.g. a range is given instead of a specific number,
Data perturbation – i.e. replacement of real data with another data set, similar to the original one,
Character masking – changing selected characters into, e.g. “X”,
Data aggregation – which means data grouping.
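The methods listed above can be sketched in a few lines of Python. This is a simplified illustration, not a complete anonymisation tool: all function names, parameters and sample records are our own assumptions, and whether a given combination of methods really makes data anonymous depends on the data set.

```python
import random
from collections import Counter


def generalise_age(age, width=10):
    """Generalisation: return a range such as '30-39' instead of the exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def mask(value, visible=2, char="X"):
    """Character masking: keep the first `visible` characters, mask the rest."""
    return value[:visible] + char * max(len(value) - visible, 0)


def perturb(amount, spread=0.1, rng=random):
    """Data perturbation: replace a number with a nearby random value."""
    return round(amount * (1 + rng.uniform(-spread, spread)), 2)


def shuffle_column(rows, field, rng=random):
    """Randomisation: shuffle one column so values no longer match their rows."""
    values = [row[field] for row in rows]
    rng.shuffle(values)
    return [{**row, field: value} for row, value in zip(rows, values)]


def aggregate(rows, field):
    """Data aggregation: keep only counts per group, not individual rows."""
    return dict(Counter(row[field] for row in rows))


# Hypothetical sample data, for illustration only.
rows = [
    {"name": "Kowalski", "age": 34, "city": "Krakow"},
    {"name": "Nowak", "age": 58, "city": "Warsaw"},
    {"name": "Lis", "age": 41, "city": "Krakow"},
]
print(generalise_age(34))       # 30-39
print(mask("Kowalski"))         # KoXXXXXX
print(aggregate(rows, "city"))  # {'Krakow': 2, 'Warsaw': 1}
```

The `rng` parameter accepts either the `random` module or a `random.Random` instance, which makes the random functions reproducible in tests.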

Shopware5Anonymizer

While working with clients on various types of projects, we noticed a need for efficient and straightforward anonymisation, especially in a non-production environment. That’s why we created Shopware5Anonymizer.

Shopware5Anonymizer is a plugin for the Shopware 5 e-commerce platform, used to anonymise customer databases of online stores. It generates random data and replaces the real data with it. The algorithms used in the plugin let you anonymise data quickly and effectively.

In its basic version, the plugin anonymises data from native Shopware tables, but its features can be easily extended and adapted to any Shopware online store, including one that has implemented additional personal data storing solutions.
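The general idea of replacing real customer data with random data in a non-production database can be illustrated with a short Python sketch. Please note that this is not the plugin’s code and does not use the Shopware schema: the `customers` table, its columns and the helper functions are hypothetical and exist only for the sake of the example, which runs against an in-memory SQLite database.

```python
import random
import sqlite3
import string


def random_word(length=8, rng=random):
    """Generate a random lowercase word of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def anonymise_customers(conn, rng=random):
    """Overwrite name and email in every row with randomly generated values."""
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    for customer_id in ids:
        conn.execute(
            "UPDATE customers SET firstname = ?, lastname = ?, email = ? WHERE id = ?",
            (
                random_word(rng=rng).capitalize(),
                random_word(rng=rng).capitalize(),
                f"{random_word(rng=rng)}@example.com",
                customer_id,
            ),
        )
    conn.commit()
    return len(ids)


# Hypothetical table and sample rows (not the Shopware schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO customers (firstname, lastname, email) VALUES (?, ?, ?)",
    [("Anna", "Kowalska", "anna@shop.test"), ("Jan", "Nowak", "jan@shop.test")],
)
changed = anonymise_customers(conn)
```

Because the original values are overwritten and no mapping is kept, the replacement cannot be reversed, in contrast to pseudonymisation.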

When is it worth implementing data anonymisation

If you decide to erase personal data from your system irreversibly, you have to take into account the fact that you will not be able to use the data anymore, e.g. for analytical purposes. That is why the decision about data anonymisation should be well thought out, mainly because you cannot reverse the process.

Anonymisation is especially worth considering when:

You provide third parties with access to your databases. With anonymised data, you don’t need to sign contracts with them to entrust the processing of personal data,
You want to limit data sets subject to GDPR. Anonymised data is not subject to GDPR. Thanks to that, even if there is a leak, you don’t have to inform the data subjects or state authorities about it,
You are concerned about how to store pseudonymised data and the decryption key,
People whose data you process often ask you to delete their data (the right to be forgotten). Anonymisation allows you to do it and, at the same time, you can keep anonymous information that is not subject to GDPR and cannot be used to identify a specific person.

As you can see, data anonymisation has several undeniable pros. However, there are also cons, and irreversibility is the key one. That is why you should consider both the pros and cons before you decide to carry out this process. Note that irreversibility can also be a huge advantage. Everything depends on the situation, the reasons why you collect data, and how you process it. It is no good to be too zealous, they say, but in the case of personal data and GDPR, it is better to be overzealous than to face the consequences of a lack of proper protection.