Secure, reliable deletion thanks to precise, central rules
Development and integration of an enterprise application with web UI, workflow-controlled Java processes, and message queuing
E-Plus (Telefonica) took on the critical topic of data deletion and is now one of the first telephone service providers to manage its customer data in a structured, centralized way, in compliance with data protection regulations.
Previously, employees in different locations analyzed the data and then deleted it manually. This led to differing interpretations of the rules and made coordination with other departments complicated.
In order to understand how complex this topic is, you have to imagine how many systems participate in the provision, management, and support of telecommunications services. Let's look at a fictitious customer: we’ll call him Frank.
Frank has a mobile phone contract for himself. Frank has a second SIM card on his contract for his wife Andrea. Then a while ago, he ordered a prepaid phone for his daughter Lisa. However, Lisa is now 18 years old and has her own mobile phone contract. There is still some credit on the prepaid card, but it’s not being used at the moment. Frank also selected a prepaid phone for his son Luke. Luke’s phone, including the card, was stolen. The card is locked, but in the meantime it has been used by third parties – and here it is unclear whether Frank’s insurance company or that of the telephone service provider will pay the damages.
Now Frank has decided to change mobile phone providers and he has canceled his contract. For the mobile phone provider, he now has the status “deactivated customer” and the challenge for the provider is to manage Frank’s data properly with regard to data protection, accounting, and legal procedures.
This example should make one thing clear: Frank's personal data is stored in many different systems, and each of those systems needs it.
There are different retention periods for different processes. For example, invoices have to be kept for 10 years. Connection data, by contrast, has to be deleted after a brief time – unless it is invoice-relevant.
Another challenge is that not every system saves the same data. Some systems use and know only the SIM number of the decommissioned prepaid card, others only the mobile number of the second SIM card. However, all of this data is personal since it can be traced back to Frank.
Our customer approached us with this initial situation. Our task was to develop a system that controls and monitors the deletion of obsolete customer data. Just deleting data would be easy. Our system, however, must approve for deletion only the data that has to be removed according to defined specifications, and it must do so everywhere data is stored that can be traced back to Frank in any way. From here on, we call these deletion candidates “deactivated customers.”
From the description above, it should be clear that this task is extraordinarily complex.
A system for the clean, automated deletion of personal customer data
Who is my customer? Combination of loose strings of data
First, we had to find out which systems record personal data, because that is where it has to be deleted.
Here, there are:
We call systems in the second and third category source systems since they help us identify the candidates for deletion. We call systems in which data has to be deleted target systems.
The tricky part is that no single source system holds all the characteristics needed to decide whether a customer may be deleted, nor does any of them contain all the data types to be deleted. Each system manages its data using its own structure.
Here is a simplified example: Source system A knows the customer number and the contract number. Source system B knows the telephone number and the SIM card number, but not the contract or customer number. System C knows both the contract number and the telephone number. Via system C, the data from systems A and B can be linked and a specific customer number assigned.
However, certain data for a deactivated customer may not be eligible for deletion yet. This is the case, for example, if the remaining credit on a prepaid card has not yet been used or if there are unpaid invoices on a term contract.
This information comes from source system D, so a further synchronization is required.
Only systems A, B, C, and D together include the information about whether or not a customer may be deleted.
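The linking described above can be sketched in a few lines of Java. This is only an illustration under our own assumptions, not the DROP code: the record shapes, the class name `CustomerLinker`, and the idea that system D delivers a set of blocked customer numbers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CustomerLinker {
    // Hypothetical record shapes; the real schemas are not described in this case study.
    record SystemA(String customerNo, String contractNo) {}
    record SystemB(String phoneNo, String simNo) {}
    record SystemC(String contractNo, String phoneNo) {}
    record Linked(String customerNo, String contractNo, String phoneNo, String simNo) {}

    /** Joins A and B via C. Rows of C that cannot be resolved on both sides are skipped. */
    static List<Linked> link(List<SystemA> a, List<SystemB> b, List<SystemC> c) {
        Map<String, String> customerByContract = new HashMap<>();
        for (SystemA row : a) {
            customerByContract.put(row.contractNo(), row.customerNo());
        }
        Map<String, String> simByPhone = new HashMap<>();
        for (SystemB row : b) {
            simByPhone.put(row.phoneNo(), row.simNo());
        }
        List<Linked> result = new ArrayList<>();
        for (SystemC row : c) {
            String customerNo = customerByContract.get(row.contractNo());
            String simNo = simByPhone.get(row.phoneNo());
            if (customerNo != null && simNo != null) {
                result.add(new Linked(customerNo, row.contractNo(), row.phoneNo(), simNo));
            }
        }
        return result;
    }

    /**
     * Assumption: system D reports the customer numbers that still have open items
     * (unused prepaid credit, unpaid invoices). Those customers are held back.
     */
    static List<Linked> deletable(List<Linked> linked, Set<String> blockedByD) {
        List<Linked> out = new ArrayList<>();
        for (Linked row : linked) {
            if (!blockedByD.contains(row.customerNo())) {
                out.add(row);
            }
        }
        return out;
    }
}
```

Blocking whole customers is a simplification; in practice the hold may apply only to certain data of a customer rather than to everything.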
As described earlier, some data must be kept longer than other data. And under some circumstances, data whose original retention period has elapsed may not yet be deleted.
We translated these (truly very complex) business relationships into a set of programmed rules. This sounds simple, but it wasn’t: packing many if-then cases cleanly into a logical set of rules was a central milestone of the project.
Summary: After consolidation, the data in DROP is subjected to a set of deletion rules. Here it must be decided for each customer whether and according to which rules he may be deleted.
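To give a feel for what such a rule can look like, here is a minimal Java sketch. It is not the actual rule set: the class `DeletionRules`, the `Item` model, and the connection-data retention value are hypothetical, and we assume for illustration that invoice-relevant connection data follows the invoice period. Only the 10-year invoice retention comes from the description above.

```java
import java.time.LocalDate;
import java.time.Period;

final class DeletionRules {
    enum DataType { INVOICE, CONNECTION_DATA }

    // Hypothetical item model: the kind of data, the date its retention clock started,
    // and two flags of the kind mentioned in the text.
    record Item(DataType type, LocalDate retentionStart, boolean invoiceRelevant, boolean openClaim) {}

    static final Period INVOICE_RETENTION = Period.ofYears(10);  // stated in the case study
    static final Period CONNECTION_RETENTION = Period.ofDays(7); // placeholder, not from the case study

    static boolean mayDelete(Item item, LocalDate today) {
        // Open items (unused prepaid credit, unpaid invoices) block deletion regardless of age.
        if (item.openClaim()) {
            return false;
        }
        Period retention = switch (item.type()) {
            case INVOICE -> INVOICE_RETENTION;
            // Assumption: invoice-relevant connection data is kept as long as invoices.
            case CONNECTION_DATA -> item.invoiceRelevant() ? INVOICE_RETENTION : CONNECTION_RETENTION;
        };
        // Deletable once the retention period has fully elapsed.
        return !today.isBefore(item.retentionStart().plus(retention));
    }
}
```

Keeping each rule as a small, pure function of the item and the current date makes the many if-then cases testable one at a time.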
Only now is the picture complete: DROP knows all customers, knows whether and why they must be deleted, and can assign them uniquely in each target system.
Do you think we can now start with the deletion at last? Nope! Since the previous steps were so complex, it’s obvious that the transfer of the deletion commands is no less complex.
In order to initiate the deletion, DROP creates a list with deletion commands. This list is made available to the target systems (do you remember? those are the systems from which data has to be removed). Each target system can see from this list whether and which data it must delete.
So much for the theory. In practice, this means that an enormous quantity of data has to be moved and processed. This workload falls on many different systems, some of them very old, and not every system can handle such large quantities of data equally well. To avoid disturbing operations, the deletion processes must not slow these systems down.
Our solution: So that we could proceed as quickly as possible but as slowly as necessary, each system receives the delete commands individually, in precisely the packet sizes that the system in question can process. The deletion packets are sent asynchronously.
Some systems only receive data at night – that is, at low-traffic times – others can process a maximum of a handful of data records per hour, and some systems perform better and make faster progress.
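The packet idea can be sketched as follows. This is a simplified illustration, not how DROP is implemented: `TargetProfile`, its fields, and the time-window check are hypothetical names, and the asynchronous sending itself is left out.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

final class DeletionPackets {
    // Hypothetical per-target-system limits: packet size and the daily delivery window.
    record TargetProfile(String name, int packetSize, LocalTime windowStart, LocalTime windowEnd) {
        /** True if the system may receive packets at the given time of day. */
        boolean accepts(LocalTime now) {
            if (windowStart.equals(windowEnd)) {
                return true; // equal bounds are treated as "around the clock"
            }
            if (windowStart.isBefore(windowEnd)) {
                return !now.isBefore(windowStart) && now.isBefore(windowEnd);
            }
            // The window wraps around midnight, e.g. 22:00 to 05:00.
            return !now.isBefore(windowStart) || now.isBefore(windowEnd);
        }
    }

    /** Splits the delete commands into packets of at most packetSize entries. */
    static <T> List<List<T>> packets(List<T> commands, int packetSize) {
        if (packetSize <= 0) {
            throw new IllegalArgumentException("packetSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < commands.size(); i += packetSize) {
            int end = Math.min(i + packetSize, commands.size());
            out.add(List.copyOf(commands.subList(i, end)));
        }
        return out;
    }
}
```

A sender would then check `accepts` for each target system before handing over the next packet, so a slow or night-only system never receives more than it can process.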
Now things get easier. The deletion lists are made available to the target systems at regular intervals. Customers who are on the list are removed from the systems.
The law requires proof of deletion of the data. To fulfill this requirement, it is documented in a separate report which data was deleted successfully. Since this report, in turn, contains personal data, it is also destroyed after a protection period. Once this process is complete, the customer data is finally deleted, in compliance with data protection regulations.
It was clear to us that this whole process is very complicated. In the course of the project, however, there were a few unforeseen challenges that had to be confronted.
One of the challenges was the sometimes poor data quality. Many data stocks were as old as the company. Much of the data and many of the systems and processes are 10 to 20 years old, which makes them true fossils in the digital age.
So that nothing is deleted by mistake, we built in a kind of ripcord: if DROP encounters nonsensical or incomprehensible information, the affected record is not deleted; instead, a warning is displayed. This data then has to be checked and corrected manually, record by record. After correction, the record is picked up again in the next DROP run.
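In code, such a ripcord could look roughly like this. The sketch is ours and purely illustrative: the class `Ripcord`, the `Candidate` fields, and the two plausibility checks are hypothetical, since the actual checks are not listed here.

```java
import java.util.ArrayList;
import java.util.List;

final class Ripcord {
    record Candidate(String customerNo, String phoneNo) {}
    record Screening(List<Candidate> approved, List<String> warnings) {}

    /** Returns a description of the problem, or null if the record looks plausible. */
    static String problemWith(Candidate c) {
        // Hypothetical checks; real data would need system-specific rules.
        if (c.customerNo() == null || c.customerNo().isBlank()) {
            return "missing customer number";
        }
        if (c.phoneNo() == null || !c.phoneNo().matches("\\+?[0-9]{6,15}")) {
            return "implausible phone number";
        }
        return null;
    }

    /** Approves plausible records and holds back the rest with a warning. */
    static Screening screen(List<Candidate> candidates) {
        List<Candidate> approved = new ArrayList<>();
        List<String> warnings = new ArrayList<>();
        for (Candidate c : candidates) {
            String problem = problemWith(c);
            if (problem == null) {
                approved.add(c);
            } else {
                // Nothing is deleted for this record until someone has corrected it.
                warnings.add("Held back " + c + ": " + problem);
            }
        }
        return new Screening(approved, warnings);
    }
}
```

The important design choice is the default: a record that fails any check is never deleted, only reported.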
The deletion of customer data is a company-critical process. Errors here cannot be forgiven. Therefore, there was an intensive test phase and multi-stage introduction of the system.
In the course of the integration test, it quickly became clear that the test data was not “real” enough. The complexity of the processes required a more realistic test field than previous systems.
We analyzed the live data in detail and were able to create a custom-tailored test field for the testing.
The whole DROP application is characterized by asynchronicity and parallelism. So that DROP runs cleanly, we had to formulate a whole lot of complex processes and data relationships, control processes, and use external interfaces. For the flow control, we used Activiti – a BPMN-compliant workflow engine. It has the advantage that the workflows can be displayed graphically so that the specialists in charge (who do not love code as much as we do) can examine them. This made communication between technical specialists and the specialized departments much easier. This way, we were able to assess the technical implementation of the requirements together.
To decouple the processes from each other in time, they communicate with one another via a message bus (ActiveMQ). For the interfaces to external systems, we relied on proven enterprise integration patterns, implemented with Apache Camel.
We have been assisting companies in the telecommunications sector in the areas of fraud management and compliance for many years. For this project, we were requested explicitly since we are known as an experienced, easy-going, and flexible partner.
Overall, we worked very closely together from the design to the testing to the go-live. We were always informed about the status of other people’s tasks.
The system has been live since April 2015, and since then customer data has been deleted in compliance with data protection regulations.
Spring Framework | ORM (Object-Relational Mapping) – Hibernate | Message Queuing – ActiveMQ | Web Framework – Apache Wicket | BPMN Workflow Engine – Activiti | Enterprise Integration – Apache Camel
DROP was certainly one of the most demanding projects in which we have been involved. Precisely because of this, we are proud of the result and we learned a lot in the process. It’s always nice to overcome challenges, and this project offered us many of these. Our customer recognized how important this topic is and set a true milestone with respect to data protection with this system. From the consumers’ point of view, we hope that other companies will follow this example and quickly establish data protection-compliant handling of their customer data.