Banking and fraud detection; what is the solution?

As the year comes to a close, we must reflect on the most significant events in the world of privacy and data science, so that we can learn from the challenges and improve moving forward.

In the past year, the General Data Protection Regulation (GDPR) has had the most significant impact on data-driven businesses. The privacy law has transformed data analytics capabilities and inspired a wave of sweeping legislation worldwide: CCPA in the United States, LGPD in Brazil, and PDPB in India. Not only has this regulation moved the needle on privacy management and prioritization, but it has knocked major companies to the ground with harsh fines.

Since its implementation in 2018, €405,871,210 in fines have been issued against violators, signalling that supervisory authorities (DPAs) have no mercy in their fervent search for the unethical and illegal actions of businesses. This is only the beginning: the longer the law is in force, the stricter regulatory authorities will become. With the next wave of laws taking effect on January 1, 2020, businesses can expect to feel pressure from all jurisdictions, not just the European Union.


The two most breached GDPR requirements are Article 5 and Article 32.

These articles place importance on maintaining data for only as long as is necessary and seek to ensure that businesses implement advanced measures to secure data. They also signal the business value of anonymization and pseudonymization. After all, once data has been anonymized (de-identified), it is no longer considered personal, and GDPR no longer applies.

Article 5 affirms that data shall be “kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed.”

Article 32 references the importance of “the pseudonymization and encryption of personal data.”

The frequency with which these articles are breached signals the need for risk-aware anonymization to ensure compliance. Businesses urgently need a data anonymization solution that optimizes both privacy risk reduction and data value preservation. This will allow them to measure the risk of their datasets, apply advanced anonymization techniques, and minimize the analytical value lost in the process.

If this is implemented, data collection on EU citizens will remain possible in the GDPR era, and businesses can continue to obtain business insights without risking their reputation and revenue. Crucially, these activities can now be carried out in a way that respects privacy.

Sadly, not everyone has gotten the message: nearly 130 fines have been issued so far.

The top five regulatory fines

GDPR carries a weighty fine: 4% of a business’s annual global turnover, or €20M, whichever is greater. A fine of this size could significantly derail a business, and when paired with brand and reputational damage, it is evident that GDPR penalties should encourage businesses to rethink the way they handle data.
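To make the arithmetic concrete, here is a minimal Python sketch of that fine ceiling; the turnover figure used is hypothetical, not from any real case.

```python
def max_gdpr_fine(annual_global_turnover_eur: float) -> float:
    # GDPR ceiling: 4% of annual global turnover or EUR 20M, whichever is greater.
    return max(0.04 * annual_global_turnover_eur, 20_000_000)

# Hypothetical turnover of EUR 850M -> ceiling of EUR 34M.
print(f"EUR {max_gdpr_fine(850_000_000):,.0f}")
```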

1. €204.6M: British Airways

Article 32: Insufficient technical and organizational measures to ensure information security

User traffic was directed to a fraudulent site because of improper security measures, compromising 500,000 customers’ personal data. 

2. €110.3M: Marriott International

Article 32: Insufficient technical and organizational measures to ensure information security

The records of 339 million guests were exposed in a data breach due to insufficient due diligence and a lack of adequate security measures.

3. €50M: Google

Article 13, 14, 6, 5: Insufficient legal basis for data processing

Google was found to have breached Articles 13, 14, 6, and 5 because it created user accounts during the configuration stage of Android phones without obtaining meaningful consent. It then processed this information without a legal basis, while lacking transparency and providing insufficient information.

4. €18M: Austrian Post

Article 5, 6: Insufficient legal basis for data processing

Austrian Post created more than three million profiles on Austrians and resold their personal information to third parties, such as political parties. The data included home addresses, personal preferences, habits, and party affinity.

5. €14.5M: Deutsche Wohnen SE

Article 5, 25: Non-compliance with general data processing principles

Deutsche Wohnen stored tenant data in an archive system that could not delete information that was no longer necessary. This made it possible to access years-old sensitive information, such as tax records and health insurance details, for purposes beyond those described at the original point of collection.

Privacy laws like GDPR seek to restrict data controllers from gaining access to personally identifiable information without consent and to prevent data from being handled in ways that a subject is unaware of. If these fines teach us anything, it is that investing in technical and organizational measures is a must today. Many of these fines could have been avoided had businesses implemented Privacy by Design: privacy must be considered throughout the business cycle, from conception to consumer use.

Businesses cannot afford such violations. With risk-aware privacy software, they can continue to analyze data while protecting privacy, backed by the guarantee of a privacy risk score.

Resolution idea for next year: Avoid ending up on this list in 2020 by adopting risk-aware anonymization.

2 ways in which organizations can open their data in a privacy protected way

Opening data means making it accessible for any person to view, use, and share. While seemingly daunting, opening data helps researchers, scientists, governments, and companies better provide for the greater good of society. 

The McKinsey Global Institute predicted that open data could unlock USD 5 trillion per year in economic value globally. However, opening data can’t happen without privacy protection.

Data marketplaces have begun popping up to make data more accessible and, in some cases, to monetize it. These marketplaces are collections of data, with a degree of privacy protection, organized for buying and selling between companies for analytic purposes.

Once your company’s user data is properly privacy-protected, marketplaces are an excellent way to generate additional revenue, open up partnerships, and expand the value of your data.

Healthcare data marketplace

Healthcare is an industry with much to gain from opening data. Making more data available can help find cures for rare diseases, monetizing data can provide much-needed financial resources to hospitals, and letting external experts analyze data can lead to discoveries.

To take advantage of this opportunity, Mayo Clinic, a not-for-profit academic medical center committed to clinical practice, education, and research, announced the launch of a healthcare data platform. The medical center is looking to digitize 25 million pathology slides in the next two years, creating the largest source of labeled medical data in the world, easily accessible to doctors and researchers seeking the information necessary to diagnose or educate.

Mayo Clinic is emphasizing the importance of eliminating the risk of re-identification of Personal Health Information (PHI) while maintaining the value of the data. 

Marketplaces in other industries

Data marketplaces have appeared in other industries, from finance to marketing and government. The available information is beneficial to everyone involved.

Governments have been leading the open data movement. For example, the United States Census Bureau created the leading source of statistical information on US citizens. Through its website, researchers can find data on employment, population, and other statistics relevant to the American people. This data is collected and de-identified so that no American can be re-identified, while the information remains valuable.

On the marketing side, there are a couple of examples: Salesforce launched Data Studio, and Oracle created the Oracle Data Marketplace. These projects allow companies to buy and sell their data to better understand their customers and marketing activities.

How is CryptoNumerics contributing to open data?

Recently, we partnered with Clearsense, a healthcare data analytics company that is reimagining how healthcare organizations manage and utilize their data.

Clearsense helps healthcare organizations unlock the power of their disparate data sources to lower cost, increase revenue, and improve outcomes. 

Through the use of our products, CN-Protect and CN-Insight, Clearsense helps its healthcare partners in two ways:

  • Anonymize their data so that it can be shared. Following HIPAA standards and using state-of-the-art privacy protection techniques, the original datasets are transformed, resulting in a privacy-protected dataset that preserves its analytical value.
  • Perform privacy-protected analytics. There are cases in which various datasets need to be combined; however, due to regulatory restrictions, these datasets cannot be moved, limiting their usefulness. With the help of CN-Insight, Clearsense was able to overcome this challenge and perform analytics on datasets as if they were combined, without relocating them (a toy illustration of this idea follows the list).
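To illustrate the second point, here is a toy Python sketch of analytics on data that never leaves its site. It shows only the general pattern, not CN-Insight's actual implementation; the sites and readings are made up.

```python
# A toy illustration: each site computes aggregates locally and shares only
# those, so row-level records never leave their site, yet the pooled result
# matches what a combined dataset would give.

site_a_readings = [120, 135, 128]        # hypothetical values held at site A
site_b_readings = [110, 142, 150, 133]   # hypothetical values held at site B

def local_aggregate(readings):
    """Each site computes its own (sum, count) without sharing raw rows."""
    return sum(readings), len(readings)

aggregates = [local_aggregate(site_a_readings), local_aggregate(site_b_readings)]
total = sum(s for s, _ in aggregates)
count = sum(c for _, c in aggregates)
print(f"Pooled mean: {total / count:.1f}")  # identical to the combined-data mean
```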

Clearsense can now offer its customers an opportunity to open up their data in a way that is compliant with regulations and cost-effective.

By protecting data privacy while maintaining its value, organizations that open up collected information are helping move toward a privacy-safe future, one that benefits from the enormous amounts of data generated every second.

To learn more about our partnership with Clearsense, watch our webinar, Facilitating Multi-Site Research: Privacy, HIPAA and Data Residency.

To learn more about the Mayo Clinic project, read our blog, Two Paths of Data Monetization: Exploitation or Protection.

The Consequences of Data Mishandling: Twitter, TransUnion, and WhatsApp

Who should you trust? This week highlights the personal privacy risks and organizational consequences when data is mishandled or used against the best interests of the account holder. Twitter provided advertisers with user phone numbers that were collected for two-factor authentication, 37,000 Canadians’ personal information was leaked in a TransUnion cybersecurity attack, and GDPR-related investigations into Facebook and Twitter threaten billions in fines.
Twitter shared your phone number with advertisers.

Early this week, Twitter admitted to using the phone numbers of users, which had been provided for two-factor authentication, to help profile users and target ads. This allowed the company to create “Tailored Audiences,” an industry-standard product that enables “advertisers to target ads to customers based on the advertiser’s own marketing lists.” In other words, the profiles in the marketing list an advertiser uploaded were matched to Twitter’s user list with the phone numbers users provided for security purposes.

When users provided their phone numbers to enhance account security, they never realized that this would be the tradeoff. This manipulative approach to gaining user information raises questions over Twitter’s data privacy protocols. Moreover, the fact that the company provided this confidential information to advertisers should leave you wondering what other information is made available to business partners, and how (Source).

Curiously, after realizing what happened, rather than coming forward, the company rushed to hire Ads Policy Specialists to look into the problem.

On September 17, the company “addressed an ‘error’ that allowed advertisers to target users based on phone numbers” (Source). That same day, it posted a job advertisement for someone to train internal Twitter employees on ad policies and to join a team re-evaluating its advertising products.

Now, nearly a month later, Twitter has publicly admitted its mistake and said it is unsure how many users were affected. While the company insists no personal data was shared externally, and is clearly taking steps to ensure this doesn’t occur again, is it too late?

Third-Party Attacks: How Valid Login Credentials Led to Banking Information Exposure 

A cybersecurity breach at TransUnion highlights the rapidly increasing threat of third-party attacks and the challenge of preventing them. The personal data of 37,000 Canadians was compromised when a legitimate business customer’s login credentials were used illegally to harvest TransUnion data. The exposed information includes names, dates of birth, current and past home addresses, credit and loan obligations, and repayment histories. While the data may not include bank account numbers, social insurance numbers may also have been at risk. The compromise occurred between June 28 and July 11 but was not detected until August (Source).

While alarming, these attacks are very frequent, accounting for around 25% of cyberattacks in the past year. Daniel Tobok, CEO of Cytelligence Inc., reports that the threat of third-party attacks is increasing: more than ever, criminals are using the accounts of trusted third parties (customers, vendors) to gain access to their targets’ data. This method of entry is hard to detect because the attackers often simulate the typical actions of legitimate users. In this case, the credentials of a leading division of Canadian Western Bank were used to log in and access the credit information of nearly 40,000 Canadians, an action not atypical of the bank’s regular activities (Source).

Cybersecurity attacks like this have caused the rise of two-factor authentication, which looks to enhance security (in every case, perhaps, other than Twitter’s). However, if companies only invest in hardware, they only solve half the issue, for the human side of cybersecurity is a much more serious threat than is often acknowledged or considered. “As an attacker, you always attack the weakest link, and in a lot of cases unfortunately the weakest link is in front of the keyboard.” (Source)


Hefty fines loom over Twitter and Facebook as the Irish DPC closes its investigations.

The Data Protection Commission (DPC) in Ireland has recently finished investigations into Facebook’s WhatsApp and Twitter over breaches of GDPR (Source). These investigations looked into whether WhatsApp provided information about the app’s services to both users and non-users in a transparent manner, and into a Twitter data breach notification from January 2019.

Now, these cases have moved on to the decision-making phase, and the companies are at risk of fines of up to 4% of their global annual revenue. This means Facebook could expect to pay more than $2 billion.

The decision now rests with Helen Dixon, Ireland’s chief data regulator, and we expect to hear the outcome by the end of the year. These are landmark cases: the first Irish legal proceedings involving US companies since GDPR came into effect a little over a year ago, in May 2018 (Source). Big tech companies are on edge about the verdict, as the Irish DPC plays the largest GDPR supervisory role over most of them, since many use Ireland as the base for their EU headquarters. What’s more, the DPC has opened dozens of investigations into other major tech companies, including Apple and Google, and the chief data regulator’s decision may signal more of what’s to come (Source).

In the end, between Twitter’s data mishandling, the TransUnion third-party attack, and the GDPR investigations coming to a close, it is clear that privacy affects everyday operations and lives, and that businesses and the public alike must become more privacy-conscious.


How to Decode a Privacy Policy

91% of Americans skip privacy policies before downloading apps. It is no secret that people and businesses are taking advantage of that, given that there is a new app scandal, data breach, or hack every day. For example, take a look at the FaceApp fiasco from last month.

In its terms of use, FaceApp clearly states the following:

 “You grant FaceApp a perpetual, irrevocable, nonexclusive, royalty-free, worldwide, fully-paid, transferable sub-licensable license to use, reproduce, modify, adapt, publish, translate, create derivative works from, distribute, publicly perform and display your User Content and any name, username or likeness provided in connection with your User Content in all media formats and channels now known or later developed, without compensation to you. When you post or otherwise share User Content on or through our Services, you understand that your User Content and any associated information (such as your [username], location or profile photo) will be visible to the public” (Source).

These documents should be treated as important, since they disclose legal information about your data: what the company will do with it, how they will use it, and with whom they will share it.

So let’s look at the most efficient way to read through these excruciating documents: search for specific terms with a keyword or key-phrase search. The following terms are a great starting point (a small scan script follows the list):

  • Third parties
  • Except
  • Retain
  • Opt-out
  • Delete
  • With the exception of
  • Store/storage
  • Rights 
  • Public 
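If the policy is long, a few lines of Python can do the scanning for you. This is a minimal sketch; "privacy_policy.txt" is a hypothetical file name, a placeholder for whatever policy text you want to check.

```python
# A minimal sketch: scan a privacy policy for the watch-list terms above.
import re

WATCH_TERMS = [
    "third parties", "except", "retain", "opt-out", "delete",
    "with the exception of", "store", "storage", "rights", "public",
]

with open("privacy_policy.txt") as f:  # hypothetical file
    policy_text = f.read().lower()

for term in WATCH_TERMS:
    hits = [m.start() for m in re.finditer(re.escape(term), policy_text)]
    if hits:
        print(f"{term!r}: {len(hits)} mention(s), first at character {hits[0]}")
```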

“All consumers must understand the threats, their rights, and what companies are asking you to agree to in return for downloading any app,” Adam Levin, Founder of CyberScout says. “We’re living in an instant-gratification society, where people are more willing to agree to something because they want it right now. But this usually comes at a price” (Source).

New York Passes Data Breach Law

A New York law has recently been passed, known as the SHIELD Act, or the Stop Hacks and Improve Electronic Data Security Act. The act applies to businesses that collect personal data from New York residents. Among other things, the act:

  • requires notification to affected consumers when there is a security breach,
  • broadens the scope of covered information, 
  • expands the definition of what a data breach means, 
  • and extends the notification requirement to any entity with the private information of a New York resident (Source)

Why Apple Won’t Let You Delete Siri Recordings

Apple claims to protect its users’ privacy by not letting them delete their specific recordings. “Apple’s Siri recordings are given a random identifier each time the voice assistant is activated. That practice means Apple can’t find your specific voice recordings. It also means voice recordings can’t be traced back to a specific account or device” (Source).

After it was reported that contractors were listening to private Siri conversations, including doctor discussions and intimate encounters, Apple needed to change its privacy policies. 

Siri works differently from its rivals because Google Assistant and Alexa data is connected directly to a user’s account for personalization and customer service purposes. Apple takes a different approach: it does not rely on ad revenue and customer personalization as heavily as its rivals, relying instead on its hardware products and services.

LAPD Data Breach Exposes 2,500 Officers’ Data

The PII of about 17,500 LAPD applicants and 2,500 officers has been stolen in a recent data breach, with information such as names, IDs, addresses, dates of birth and employee IDs compromised.

LAPD and the city are working together to understand the severity and impact of the breach. 

“We are also taking steps to ensure the department’s data is protected from any further intrusions,” the LAPD said. “The employees and individuals who may have been affected by this incident have been notified, and we will continue to update them as we progress through this investigation” (Source).

Capital One: An Expensive Lesson to Learn

As part of their business practices, organizations are uploading private customer information to the Cloud. However, focusing only on how secure the data is, and not thinking about privacy, is a mistake.

Capital One’s recent data breach proves that organizations need to be more conscious and proactive about their data protection efforts to prevent potential privacy exposure risks. Organizations have an obligation to ensure their customers’ data is fully privacy-protected before it is uploaded to the Cloud. This doesn’t just mean eliminating or encrypting client names, IDs, etc. It also entails understanding the risks of re-identification and applying as many privacy-protecting techniques as needed.

Capital One’s $150 Million USD Mistake

This month, one of the United States’ largest credit card issuers, Capital One, publicly disclosed a massive data breach affecting over 106 million people. Full names, addresses, postal codes, phone numbers, email addresses, dates of birth, SINs/SSNs, credit scores, bank balances, and income amounts were compromised (Source).

Former AWS systems engineer Paige Thompson was arrested for computer fraud and abuse after obtaining unauthorized access to Capital One customer data and credit card applications (Source). “Thompson accessed the Capital One data through exploiting a ‘misconfiguration’ of a firewall on a web application, allowing her to determine where the information was stored,” F.B.I. officials stated. “These systems are very complex and very granular. People make mistakes” (Source).

To make amends, Capital One is providing affected customers with free credit monitoring and identity theft insurance. It will also notify customers if their data has been compromised (Source).

Unfortunately, the company expects the breach to cost about $150 million USD, driven by customer notifications, credit monitoring, technology costs, and legal support.

How the breach could have been avoided

Simply encrypting data clearly isn’t enough, because Thompson was able to exploit a security system vulnerability and decrypt the data (Source). 

Organizations should apply as many privacy-protecting techniques as possible to their dataset to minimize risks of customer re-identification in case of a data breach.

One way in which data can be privacy-protected to reduce the risk of re-identification is by anonymizing it. The best privacy technique to accomplish anonymization is differential privacy, which uses mathematical guarantees to hide whether an individual is present in a data set or not. 
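As a minimal sketch of the idea, the Python below answers a count query with the Laplace mechanism, one standard way to achieve differential privacy. The salary data is made up, and this is an illustration of the technique, not CryptoNumerics' implementation.

```python
# A minimal sketch of differential privacy via the Laplace mechanism.
# epsilon controls the privacy/accuracy tradeoff (smaller = more private).
import numpy as np

rng = np.random.default_rng()

def dp_count(values, threshold, epsilon):
    """Count values above a threshold, adding Laplace noise calibrated to the
    query's sensitivity (one person changes a count by at most 1)."""
    true_count = sum(v > threshold for v in values)
    sensitivity = 1.0
    return true_count + rng.laplace(0.0, sensitivity / epsilon)

salaries = [52_000, 61_000, 75_000, 98_000, 43_000, 87_000]  # made-up data
print(dp_count(salaries, threshold=60_000, epsilon=0.5))  # noisy answer near 4
```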

A second way to reduce the risk of re-identification is to combine pseudonymization of direct identifiers with generalization and suppression of indirect identifiers. Optimal k-anonymity is a privacy technique that generalizes and suppresses data so that any specific individual is indistinguishable from at least k-1 other individuals.
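Here is a minimal pandas sketch of generalization and suppression on indirect identifiers. It illustrates the general technique, not CryptoNumerics' actual algorithm; the records are made up.

```python
# A minimal sketch: coarsen quasi-identifiers (generalization), then drop
# records whose quasi-identifier group is smaller than k (suppression).
import pandas as pd

df = pd.DataFrame({
    "age":       [23, 27, 31, 36, 38, 64],
    "zip":       ["10012", "10013", "10018", "10019", "10023", "94105"],
    "diagnosis": ["flu", "cold", "flu", "asthma", "flu", "cold"],
})

# Generalize: bin ages into decades, truncate ZIP codes to 3 digits.
df["age"] = (df["age"] // 10 * 10).astype(str) + "s"
df["zip"] = df["zip"].str[:3] + "**"

# Suppress: remove rows that fall in a group smaller than k.
k = 2
df = df[df.groupby(["age", "zip"])["age"].transform("size") >= k]
print(df)  # the lone 60s/941** record is suppressed; the rest satisfy k=2
```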

Organizations should elevate their understanding of privacy protection to the same level at which they understand cybersecurity. There are two essential questions that every organization needs to be able to answer:

  1. What is the re-identification risk of my data?
  2. What privacy-protecting techniques can we implement throughout our data pipeline?

To learn more about how CryptoNumerics can help you privacy-protect your data, click here.

Protect Your Data Throughout the Pipeline

Organizations all over the world have embraced the opportunity that data and analytics present. Millions of dollars are spent every year in designing and implementing data pipelines that allow organizations to extract value from their data. However, data misuse and data breaches have led government bodies to promote regulations such as GDPR, CCPA, and HIPAA, bestowing privacy rights upon consumers and placing responsibilities upon businesses.

Maximizing data value is essential, but privacy regulations must be satisfied when doing so. This is achievable by implementing privacy-protecting techniques throughout the data pipeline to avoid compliance risks.

Before introducing the privacy-protecting techniques, it is important to understand the four stages of the data pipeline:

  1. Data Acquisition: first, the data must be acquired; it can be generated internally or obtained externally from third parties.
  2. Data Organization: the data is stored for future use and needs to be protected along the pipeline to avoid misuse and breaches. This can be achieved using access controls.
  3. Data Analysis: the data must now be opened up and mobilized for analysis, which allows for a better understanding of an organization’s operations and customers, as well as improved forecasting.
  4. Data Publishing: analysis results are published, and/or internal data is shared with another party.

Now that we have covered the four stages of the data pipeline, let’s go over the sixteen privacy-protecting techniques that can be implemented throughout it.

These techniques can be categorized based on their function into four groups: randomizing, sanitizing, output, and distributed computing.

Within the randomizing group, there are two techniques: additive and multiplicative noise. In applying these techniques, random noise is added to, or multiplied with, each individual’s record to transform the data. These techniques can be used in the Data Acquisition stage of the data pipeline.
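A minimal sketch of both randomizing techniques, using made-up income records:

```python
# Perturb each record with additive or multiplicative noise at acquisition
# time. Individual records are distorted; aggregates stay close to the truth.
import numpy as np

rng = np.random.default_rng(seed=42)
incomes = np.array([52_000.0, 61_000.0, 75_000.0, 98_000.0])  # made-up data

additive = incomes + rng.normal(loc=0.0, scale=2_000.0, size=incomes.shape)
multiplicative = incomes * rng.normal(loc=1.0, scale=0.05, size=incomes.shape)

print(incomes.mean(), round(additive.mean()), round(multiplicative.mean()))
```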

The sanitizing group has five privacy techniques:

  • k-anonymity: the identifiable attributes of any record in a database are indistinguishable from those of at least k-1 other records.
  • l-diversity: an extension of k-anonymity that addresses its shortfall by ensuring there is a diversity of sensitive values within each group.
  • t-closeness: ensures that the distribution of sensitive values in each group remains close, within a threshold ‘t’, to their distribution in the whole dataset, preventing attribute disclosure.
  • Personalized privacy: privacy levels are defined and customized by the data owners.
  • ε-differential privacy: ensures that no single record affects the overall outcome of the data’s analysis.

These techniques can be used in the Data Acquisition, Data Organization, and Data Publishing stages of the data pipeline. A small sketch measuring two of these properties follows below.
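Here is a minimal pandas sketch that measures k-anonymity and l-diversity on a made-up table, grouping on quasi-identifiers:

```python
# k = the smallest quasi-identifier group size; l = the fewest distinct
# sensitive values within any group. The records below are illustrative.
import pandas as pd

df = pd.DataFrame({
    "age_band": ["20s", "20s", "30s", "30s", "30s"],
    "zip3":     ["100", "100", "100", "100", "100"],
    "disease":  ["flu", "cold", "flu", "flu", "flu"],
})

groups = df.groupby(["age_band", "zip3"])
k = groups.size().min()                # smallest group size
l = groups["disease"].nunique().min()  # fewest distinct sensitive values

print(f"k = {k}")  # 2: every record hides among at least 2
print(f"l = {l}")  # 1: the 30s/100 group is all 'flu', so it leaks the diagnosis
```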

The output group has three techniques, which are used to reduce the inference of sensitive information from the output of any algorithm:

  • Association rule hiding: sanitizes rules mined from the dataset that could otherwise be exploited to expose private information.
  • Downgrading classifier effectiveness: sanitizes the data to reduce a classifier’s effectiveness so that sensitive information is not leaked.
  • Query auditing and inference control: audits and restricts queries whose outputs could be used to infer sensitive information.

These techniques can be applied in the Data Publishing stage of the data pipeline.

Last but not least is the distributed computing group, made up of seven privacy-protecting techniques:

  • 1-out-of-2 oblivious transfer: a sender holds two messages, and the receiver obtains exactly one of them, without the sender learning which one was chosen.
  • Homomorphic encryption: a method of performing calculations on encrypted information (ciphertext) without first decrypting it (to plaintext).
  • Secure sum: computes the sum of the parties’ inputs without revealing those inputs to anyone.
  • Secure set union: creates the union of sets without revealing which owner contributed each element.
  • Secure size of intersection: determines the size of the datasets’ intersection without revealing the data itself.
  • Scalar product: computes the scalar product of two vectors without either party revealing its input vector to the other.
  • Private set intersection: computes the intersection of two parties’ sets without revealing anything else; this technique can also be used in the Data Acquisition stage.

All of the techniques in the distributed computing group prevent access to the original, raw data while allowing analysis to be performed. They can be applied in the Data Analysis and Data Publishing stages of the data pipeline; homomorphic encryption can also be used in the Data Organization stage. A toy example of one of these techniques, secure sum, follows below.
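As a concrete taste of this group, here is a toy Python sketch of secure sum using additive secret sharing, one common way to realize it. The inputs are made up, and this is an illustration only.

```python
# Secure sum via additive secret sharing: each party splits its private input
# into random shares, so no single share reveals the input, yet the shares
# collectively sum to the true total.
import random

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def make_shares(value, n_parties):
    """Split `value` into n shares that sum to `value` modulo PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

inputs = [12, 30, 7]  # each party's private input
all_shares = [make_shares(v, len(inputs)) for v in inputs]

# Party i collects the i-th share from every party and publishes its partial
# sum; the partial sums reveal only the overall total, never the inputs.
partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]
print(sum(partial_sums) % PRIME)  # 49
```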

These sixteen techniques help protect data privacy throughout the data pipeline. For a visual view of the privacy-exposed pipeline versus the privacy-protected pipeline, download our Data Pipeline infographic.

For more information, or to find out how to privacy-protect your data, contact us today at info@cryptonumerics.com.