Leveraging GDPR “Legitimate Interests Processing” for Data Science

by | Nov 18, 2019

The GDPR is not intended to be a compliance overhead for controllers and processors. It is intended to bring higher and consistent standards and processes for the secure treatment of personal data. It’s fundamentally intended to protect the privacy rights of individuals. This cannot be more true than in emerging data science, analytics, AI and ML environments where due to the nature of vast amounts of data sources there is higher risk of identifying the personal and sensitive information of an individual. 

The GDPR requires that personal data be collected for “specified, explicit and legitimate purposes,” and also that a data controller must define a separate legal basis for each and every purpose for which, e.g., customer data is used. If a bank customer took out a bank loan, then the bank can only use the collected account data and transactional data for managing and processing that customer for the purpose of fulfilling its obligations for offering a bank loan. This is colloquially referred to as the “primary purpose” for which the data is collected.  If the bank now wanted to re-use this data for any other purpose incompatible with or beyond the scope of the primary purpose, then this is referred to as a “secondary purpose” and will require a separate legal basis for each and every such secondary purpose. 

For the avoidance of any doubt, if the bank wanted to use that customer’s data for profiling in a data science environment, then under GDPR the bank is required to document a legal basis for each and every separate purpose for which it stores and processes this customer’s data. So, for example, a ‘cross sell and up sell’ is one purpose, while ‘customer segmentation’ is another and separate purpose. If relied upon as the lawful basis, consent must be freely given, specific, informed, and unambiguous, and an additional condition, such as explicit consent, is required when processing special categories of personal data, as described in GDPR Article 9.   Additionally, in this example, the Loan division of the bank cannot share data with its credit card or mortgage divisions without the informed consent of the customer. We should not get confused with a further and separate legal basis the bank has which is processing necessary for compliance with a legal obligation to which the controller is subject (AML, Fraud, Risk, KYC, etc.). 

The challenge arises when selecting a legal basis for secondary purpose processing in a data science environment as this needs to be a separate and specific legal basis for each and every purpose. 

It quickly becomes an impractical exercise for the bank, let alone annoying to its customers, to attempt obtaining consent for each and every single purpose in a data science use case. Evidence shows anyway a very low level of positive consent using this approach. Consent management under GDPR is also tightening up. No more will blackmail clauses or general and ambiguous consent clauses be deemed acceptable. 

GDPR offers controllers a more practical and flexible legal basis for exactly these scenarios and encourages controllers to raise their standards towards protecting the privacy of their customers especially in data science environments. Legitimate interests processing (LIP) is an often misunderstood legal basis under GDPR.  This is in part because reliance on LIP may entail the use of additional technical and organizational controls to mitigate the possible impact or the risk of a given data processing on an individual. Depending on the processing involved, the sensitivity of the data, and the intended purpose, traditional tactical data security solutions such as encryption and hashing methods may not go far enough to mitigate the risk to individuals for the LIP balancing test to come out in favour of the controller’s identified legitimate interest . 

If approached correctly, GDPR LIP can provide a framework with defined technical and organisational controls to support controllers’ use of customer data in data science, analytics, AI and ML applications legally. Without it, controllers may be more exposed to possible non-compliance with GDPR and the risks of legal actions as we are seeing in many high profile privacy-related lawsuits. 

Legitimate Interests Processing is the most flexible lawful basis for secondary purpose processing of customer data, especially in data science use cases. But you cannot assume it will always be the most appropriate. It is likely to be most appropriate where you use an individual’s data in ways they would reasonably expect and which have a minimal privacy impact, or where there is a compelling justification for the processing.

If you choose to rely on GDPR LIP, you are taking on extra responsibility not only for, where needed, implementing technical and organisational controls to support and defend LIP compliance, but also for demonstrating the ethical and proper use of your customer’s data while fully respecting and protecting their privacy rights and interests. This extra responsibility may include implementing enterprise class, fit for purpose systems and processes (not just paper-based processes). Automation based privacy solutions such as CryptoNumerics CN-Protect that offer a systems-based (Privacy by Design) risk assessment and scoring capability that detects the risk of re-identification, integrated privacy protection that still retains the analytical value of the data in data science while protecting the identity and privacy of the data subject are available today as examples of demonstrating technical and organisational controls to support LIP.   

Data controllers need to initially perform the GDPR three-part test to validate using LIP as a valid legal basis. You need to:

  • identify a legitimate interest;
  • show that the processing is necessary to achieve it; and
  • balance it against the individual’s interests, rights and freedoms.

The legitimate interests can be your own interests (controllers) or the interests of third parties (processors). They can include commercial interests (marketing), individual interests (risk assessments) or broader societal benefits. The processing must be necessary. If you can reasonably achieve the same result in another less intrusive way, legitimate interests will not apply. You must balance your interests against the individual’s. If they would not reasonably expect the processing, or if it would cause unjustified harm, their interests are likely to override your legitimate interests.  Conducting such assessments for accountability purposes is happily now also easier than ever, such as with TrustArc’s Legitimate Interests Assessment (LIA) and Balancing Test that identifies the benefits and risks of data processing, which assigns numerical values to both sides of the scale and uses conditional logic and back-end calculations to generate a full report on the use of legitimate interests at the business process level.

What are the benefits of choosing Legitimate Interest Processing?

Because this basis is particularly flexible, it may be applicable in a wide range of different situations such as data science applications. It can also give you more on-going control over your long-term processing than consent, where an individual could withdraw their consent at any time. Although remember that you still have to consider managing marketing opt outs independently of whatever legal basis you’re using to store and process customer data.  

It also promotes a risk-based approach to data compliance as you need to think about the impact of your processing on individuals, which can help you identify risks and take appropriate safeguards. This can also support your obligation to ensure “data protection by design,” performing risk assessments for re-identification and demonstrating privacy controls applied to balance out privacy with the demand for retaining analytical value of the data in data science environments. This in turn would contribute towards demonstrating your PIAs (Privacy Impact Assessments) which forms part of your DPIA (Data Protection Impact Assessment) requirements and obligations.

LIP as a legal basis, if implemented correctly and supported by the correct organisational and technical controls, also provides the platform to support data collaboration and data sharing.  However, you may need to demonstrate that the data has been sufficiently de-identified, including by showing that the risk assessments for re-identification are performed not just on direct identifiers but also on all indirect identifiers as well.  

Using LIP as a legal basis for processing may help you avoid bombarding people with unnecessary and unwelcome consent requests and can help avoid “consent fatigue.” It can also, if done properly, be an effective way of protecting the individual’s interests, especially when combined with clear privacy information and an upfront and continuing right to object to such processing. Lastly, using LIP not only gives you a legal framework to perform data science it also provides a platform that demonstrates the proper and ethical use of customer data, a topic and business objective of most boards of directors. 

 

About the Authors:

Darren Abernethy is Senior Counsel at TrustArc in San Francisco.  Darren provides product and legal advice for the company’s portfolio of consent, advertising, marketing and consumer-facing technology solutions, and concentrates on CCPA, GDPR, cross-border data transfers, digital ad tech and EMEA data protection matters. 

Ravi Pather of CryptoNumerics has been working for the last 15 years helping large enterprises address various data compliance such as GDPR, PIPEDA, HIPAA, PCI/DSS, Data Residency, Data Privacy and more recently CCPA compliance. I have a good working knowledge of assisting large and global companies, implement Privacy Compliance controls as it particularly relates to more complex secondary purpose processing of customer data in a Data Lakes and Warehouse environments. 

Join our newsletter