Location data and your privacy


As technology grows to surround every part of our lives, it comes as no surprise that our every move is tracked and stored by the very apps we trust with our information. With the current COVID-19 pandemic, the consequences of inviting these big tech companies into our every movement are being revealed. 

At this point, most technology users understand what information they give to companies, such as their birthdays, access to pictures, or other sensitive details. However, some may be unaware of how much location data companies collect and how that affects their data privacy. 

Location data volume expected to grow

We have created over 90% of the world’s data since 2017. As wearable technology grows in popularity, the amount of data a person creates each day is steadily climbing. 

One study reported that by 2025, the number of IoT-enabled devices installed worldwide is expected to hit 75 billion. This astronomical number highlights how intertwined technology is with our lives, and how welcoming we are of technology whose data-collection practices people may not fully understand. 

Marketers, companies and advertisers will increasingly look to location-based information as its volume grows. A recent study found that more than 84% of marketers use location data in their marketing efforts. 

The last few years have seen big tech companies give their users more control over how their data is used. One example came in 2019, when Apple introduced pop-ups that remind users when apps are using their location data.

Companies save and store location data so they can serve personalized ads and products to you. Understanding what your devices collect, and how to limit data sharing on them, is crucial as we move forward in the technological age. 

Click here to read our past article on location data in the form of wearable devices. 

COVID-19 threatens location privacy

Risking the privacy of thousands of people or saving thousands of lives: that has been the question throughout this pandemic, and one that is running out of time for debate. Companies across the big 100, including SAS, Google and Apple, have stepped up to volunteer their anonymized data. 

One of the largest concerns is not how this data is being used in this pandemic, but how it could be abused in the future. 

One Forbes article drew a comparison to the regret many faced after sharing their DNA with sites like 23andMe, which led to health insurance issues or entanglement in criminal investigations. 

As companies like Google, Apple and Facebook step up to the COVID-19 technology race, many are expressing concern, as these companies have not always proven reliable at anonymizing user data. 

In addition to the data-collection concern, governments and big tech companies are looking into contact-tracing applications. Using civilian location data for surveillance, while justified as serving the greater good of health and safety, raises multiple red flags about how our phones can be used to monitor our every movement. To read more about this involvement in contact-tracing apps, read our latest article.

Each company has stated that it anonymizes the data it collects. However, in this pandemic age, even anonymized information can still be exploited, especially at the hands of government intervention. 

With all this said, big tech companies hold power over our information and are playing a vital role in the COVID-19 response. Paying close attention to how user data is managed post-pandemic will be valuable in revealing how these companies handle user information.

 

Google and Apple to lead data privacy in the global pandemic


What happens to privacy in a global pandemic? This question continues to be debated as countries like Canada, the United States and the United Kingdom move into what is assumed to be the peak of COVID-19 spread within their borders. 

The world watched as countries like South Korea and China introduced grave measures to track their citizens, essentially stripping their privacy rights. But as case numbers continue to rise in the western world, governments are looking to implement similar tracking technologies on their own citizens’ devices. 

At the front lines of the tracing-app effort is NHSX, the health technology development unit of the U.K.’s National Health Service. The U.K.’s contact-tracing app would track patients who test positive for COVID-19 and alert the people they had been in contact with. 

However, before the app could launch, big tech companies Google and Apple released their joint contact-tracing system, which limits invasive apps on their devices and has therefore derailed the app’s development. 

Google and Apple have stated that they are not releasing an app themselves, but rather a set of “privacy-focused APIs” to ensure that governments cannot push invasive apps onto their citizens’ devices. 

Countries like Singapore that already run contact-tracing apps on phones face problems that Google and Apple are looking to avoid, including requiring citizens to leave their phones unlocked and severe battery drain. 

Google and Apple have said that their Bluetooth-based system will run in the background and work even when the phone is locked. They have also stated that the system will cease to run once the pandemic is over. 
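The privacy of such a Bluetooth system rests on broadcasting identifiers that rotate frequently and cannot be linked back to one phone. The sketch below is illustrative only, not the actual Apple/Google specification: it derives short-lived broadcast IDs from a random per-day key using a keyed hash, so an eavesdropper cannot connect successive broadcasts.

```python
import hmac, hashlib, os

# Illustrative only -- not the real Apple/Google exposure-notification spec.
# A random per-day key never leaves the phone; only derived IDs are broadcast.
daily_key = os.urandom(16)

def rolling_id(key: bytes, interval: int) -> bytes:
    """Derive the identifier broadcast during one ~10-minute interval."""
    msg = b"rolling-id" + interval.to_bytes(4, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]

# Successive broadcasts look unrelated to an eavesdropper:
rolling_id(daily_key, 0) != rolling_id(daily_key, 1)  # True
```

If a user later tests positive, publishing only their daily keys lets other phones re-derive the IDs they heard and check for matches locally, without any central location database.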

The two big tech companies have set a high standard for privacy in the pandemic age. They will have to grant permission not only for government applications to go live, but also for health authorities to access the technology (source). They have also stated that they are developing policies on whether tracing apps will be allowed to gather location data. 

One Oxford University researcher said that around two-thirds of a country’s population would need to participate for contact tracing to be effective. However, the top U.S. infectious diseases expert says that many Americans would be inclined to reject any contact-tracing app that knowingly collects their location data.

The idea behind the Google/Apple partnership is to ensure governments do not force highly invasive technologies onto their citizens, and that, while the world is engulfed in chaos, personal privacy remains as intact as possible.

The NHSX has continued with its app development. However, it is alleged that they are in close contact with the Apple/Google partnership. The European Commission told one reporter that “mobile apps should be based on anonymized data and work with other apps in E.U. countries.” 

As the world struggles to contain the virus’ spread, apps and systems such as the Google/Apple partnership could have a great effect on how COVID-19 is managed. Going forward, it is important to pay attention not only to how our data is being managed, but also to how our anonymized data can help save others.

 

Data sharing in a global pandemic


As the world continues to brace for the impact of COVID-19, data privacy remains a central concern both in the shift to working from home and in efforts to contain the spread. Last week, Google released an anonymized dataset of location data outlining hotspots of group gatherings. In addition, U.K. officials are looking to introduce contact-tracing applications. For companies containing the spread by keeping staff out of their offices, Zoom has risen as a trusted video-conferencing app. However, in the last few weeks, serious privacy concerns have surfaced.  

Google releases location data.

On Friday, Google released its COVID-19 Community Mobility Reports. These reports are a collection of data from users who have opted in to sharing their location history with the search giant. This location history comes from Google Maps, where the data is aggregated and anonymized. 

Google says that by releasing this data, public health officials can determine which types of places are most crowded. This helps inform large-scale decisions about curfews, stay-at-home orders, or which businesses need to remain open. 

The reports are open for public viewing and cover 131 countries, with some countries broken down into regional data such as provinces or states. After selecting a country, Google generates a PDF file with the data available for download. 
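The figures in these reports are expressed as a percent change from a pre-pandemic baseline for each place category. A minimal sketch of that calculation, using invented visit counts (the real reports are computed from aggregated, anonymized Location History data):

```python
# Hypothetical daily visit counts; category names mirror the report's.
baseline = {"retail_and_recreation": 1000, "parks": 400, "transit_stations": 800}
current  = {"retail_and_recreation": 430,  "parks": 520, "transit_stations": 160}

def percent_change(current, baseline):
    """Percent change of today's visits relative to the baseline period."""
    return {place: round(100 * (current[place] - baseline[place]) / baseline[place])
            for place in baseline}

percent_change(current, baseline)
# {'retail_and_recreation': -57, 'parks': 30, 'transit_stations': -80}
```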

Each PDF report contains six categories of location data. These include: 

  • Retail and recreation (restaurants, shopping centers, libraries, etc.)
  • Grocery and pharmacy (supermarkets, drug stores)
  • Parks (beaches, dog parks)
  • Transit stations (subways, bus and train stations)
  • Offices 
  • Residences 


Creating these reports comes after weeks of requests from public health officials for applications that can test a person’s contact with an infected patient. While Google’s data cannot determine that, these datasets may help cities and countries settle on preventive measures. 

Other countries have used similar, though more aggressive, location-data technology. At the beginning of March, we released an article about South Korea’s efforts to stop the spread by using people’s location data to track whether they leave their houses. 

Another news article revealed that Taiwan has also used location data to track its citizens, even going as far as calling phones twice a day to ensure citizens are not simply leaving the house without their phones. 

Google has stated that its data will cover the previous 48-72 hours, but has yet to determine how often the data will be updated.

Contact Tracing Apps

Similar to the data released by Google, there is more pressure on governments to introduce contact tracing apps like the ones seen in Korea or Taiwan.

In the UK, researchers have begun compiling papers to discuss how privacy can be handled and mishandled in these tracing apps.

One researcher, Dr. Yves-Alexandre de Montjoye, created a whitepaper outlining eight questions for understanding how privacy is protected in these types of apps. 

These eight questions include: 

  • How do you limit personal data gathered by the app developers?
  • How do you protect the anonymity of every user? 
  • Does the app reveal to its developers the identity of users who are at risk? 
  • Could the app be used by users to learn who is infected or at risk, even in their social circle? 
  • Does the app allow users to learn any personal information about other users? 
  • Could external parties exploit the app to track users or find out who’s infected? 
  • Do you put in place additional measures to protect the personal data of infected and at-risk users?
  • How can users verify that the system does what it says? 


As governments move quickly to contain the spread of COVID-19, measures like contact-tracing apps are being seriously considered. However, patient privacy should not disappear. As the world braces for an even more significant influx of COVID-19 cases, it is in the hands of government officials and big tech to work together to contain the spread while maintaining data privacy.

Zoom faced with data sharing lawsuit

A few weeks ago, we released an article outlining small privacy concerns about introducing Zoom into the work-from-home environment. Since that article, Zoom has not only increased in popularity but attracted significant privacy concerns as well. 

On March 26, Vice news released an article detailing Zoom’s relationship with Facebook. 

Vice reported that Zoom sends analytics data to Facebook, even when a Zoom user doesn’t have a Facebook account. While this type of data transfer is not uncommon among companies that use the Facebook SDK, Zoom’s own privacy policy leaves this data-sharing out.

The article reports that Zoom notifies Facebook when a user opens the app, along with specific identifying details that companies can use to target users with advertisements. 

Zoom responded to the article, stating that it would remove the Facebook SDK. However, it wasn’t long before the video-conferencing company was hit with a lawsuit. 

But the privacy concerns don’t stop at data-sharing. The past few weeks have seen numerous reports of account hacking, allegations that the service does not actually offer end-to-end encryption, password stealing, leaks, and microphone/camera hijacking.

And these claims are only just starting to roll in. As Zoom rockets to the top of the most-used video-conferencing technologies during this work-from-home burst, the next few weeks could see thousands of data privacy violations.

4 techniques for data science


With growing tension between privacy and analytics, the job of data scientists and data architects has become more complicated. The responsibility of data professionals is not just to maximize the value of the data, but to find ways in which data can be privacy protected while preserving its analytical value.

The reality today is that regulations like GDPR and CCPA have disrupted the way in which data flows through organizations. Now data is being siloed and protected using techniques that are not suited for the data-driven enterprise. Data professionals are left with long processes to access the information they need and, in many cases, the data they receive has no analytical value after it has been protected. 

This emphasizes the importance of using adequate privacy protection tactics to ensure that personally identifiable information (PII) is accessible in a privacy-protected manner and that it can be used for analytics.

To satisfy GDPR and CCPA, organizations can choose among three options: pseudonymization, anonymization, and consent. 

Pseudonymization replaces direct identifiers, like names or emails, with pseudonyms to protect the privacy of the individual. However, this process is still within the scope of the privacy regulations, and the risk of re-identification remains very high.
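As a minimal sketch, pseudonymization can be as simple as replacing each direct identifier with a keyed hash. The field names and key below are invented for illustration; note that anyone holding the key (or a table of known inputs) can reverse the mapping, which is why this stays in scope for regulations like GDPR:

```python
import hmac, hashlib

# Hypothetical secret key; GDPR requires it be kept separately from the data.
SECRET_KEY = b"stored-away-from-the-dataset"

def pseudonymize(value: str) -> str:
    """Replace a direct identifier with a repeatable, keyed pseudonym."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "Alice Smith", "email": "alice@example.com", "purchase": 42.50}
pseudonymized = {field: pseudonymize(v) if field in ("name", "email") else v
                 for field, v in record.items()}
```

The same input always maps to the same pseudonym, so records can still be joined across tables -- useful for analytics, but also part of the reason re-identification risk remains high.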

Anonymization, on the other hand, looks at both direct identifiers and quasi-identifiers and transforms the data so that it falls out of scope for privacy regulations while remaining usable for analytics. 

Consent requires organizations to ask customers for permission to use their data, which opens up the opportunity for opt-outs. If the usage of the data changes, as it often does in an analytics environment, then consent may very well be required each time.

There are four main techniques that can help data professionals with privacy protection. All of them have different impacts on both privacy protection and data quality. These are: 

Masking: A de-identification technique that focuses on the redaction or transformation of information within a dataset to prevent exposure. 
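A minimal sketch of masking, with invented helper names: parts of a value are redacted while the shape that is useful for analysis (the domain, the last digits) is preserved:

```python
def mask_email(email: str) -> str:
    """Redact the local part of an email but keep the domain for analysis."""
    local, _, domain = email.partition("@")
    return local[0] + "*" * (len(local) - 1) + "@" + domain

def mask_digits(value: str, keep_last: int = 4) -> str:
    """Hide all but the last few digits, e.g. of a card or phone number."""
    return "*" * (len(value) - keep_last) + value[-keep_last:]

mask_email("alice@example.com")   # 'a****@example.com'
mask_digits("4111111111111111")   # '************1111'
```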

K-anonymity: This privacy model ensures that each individual is indistinguishable from at least k-1 other individuals based on their attributes in a dataset.
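Whether a dataset is k-anonymous can be checked directly: group the records by their quasi-identifier values and take the smallest group size. A minimal sketch with toy records:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """The k of a dataset: the smallest group size when records are
    grouped by their quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

data = [
    {"zip": "10001", "age": "40-50", "diagnosis": "flu"},
    {"zip": "10001", "age": "40-50", "diagnosis": "cancer"},
    {"zip": "10002", "age": "20-30", "diagnosis": "flu"},
]
k_anonymity(data, ["zip", "age"])  # 1 -- the third record is unique
```

To raise k, values are generalized (coarser zip codes, wider age ranges) until every record shares its quasi-identifiers with at least k-1 others.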

Differential Privacy: A technique applied to an algorithm that mathematically guarantees the algorithm’s output is nearly unchanged whether or not any one individual is in the dataset. It is achieved through the addition of noise to the algorithm. 
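For a counting query, this guarantee is commonly achieved by adding Laplace noise with scale 1/ε, since adding or removing one person changes a count by at most 1. A minimal sketch on toy data (real deployments also need careful privacy-budget accounting):

```python
import random

def dp_count(records, predicate, epsilon=1.0):
    """Differentially private count: Laplace noise with scale 1/epsilon,
    matching the count query's sensitivity of 1."""
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two exponential draws is Laplace(0, 1/epsilon).
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

people = [{"age": a} for a in range(100)]
dp_count(people, lambda r: r["age"] >= 60)  # roughly 40, plus noise
```

Any single query is noisy, but averages over a population remain accurate -- which is exactly the trade-off that preserves analytical value.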

Secure Multi-Party Computation: This is a cryptographic technique where a group of parties can compute a function over their inputs while keeping their inputs private.
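The simplest building block of secure multi-party computation is additive secret sharing: each input is split into random shares that sum to the secret, and sums can then be computed on the shares alone. A sketch with invented salary figures:

```python
import random

PRIME = 2**61 - 1  # all shares live in a finite field

def share(secret, n=3):
    """Split a value into n additive shares; any n-1 of them look random."""
    shares = [random.randrange(PRIME) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Two parties learn the sum of their salaries without revealing either one:
a_shares, b_shares = share(70_000), share(55_000)
sum_shares = [(x + y) % PRIME for x, y in zip(a_shares, b_shares)]
reconstruct(sum_shares)  # 125000
```

Real protocols distribute each party's shares across separate machines and add multiplication protocols on top; the sketch shows only the core idea that computation can proceed without any party seeing another's input.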

Keep your eyes peeled in the next few weeks for our whitepaper, which will explore these four techniques in further detail.

Key terms to know to navigate data privacy


As the data privacy discourse continues to grow, it’s crucial that the terms used to explain data science, data privacy and data protection are accessible to everyone. That’s why we at CryptoNumerics have compiled a continuously growing Privacy Glossary, to help people learn and better understand what’s happening to their data. 

Below are 25 terms surrounding privacy legislations, personal data, and other privacy or data science terminology to help you better understand what our company does, what other privacy companies do, and what is being done for your data.

Privacy regulations

    • General Data Protection Regulation (GDPR) is a privacy regulation implemented in May 2018 that has inspired more regulations worldwide. The law determined data controllers must establish a specific legal basis for each and every purpose where personal data is used. If a business intends to use customer data for an additional purpose, then it must first obtain explicit consent from the individual. As a result, all data in data lakes can only be made available for use after processes have been implemented to notify and request permission from every subject for every use case.
    • California Consumer Privacy Act (CCPA) is a sweeping piece of legislation that is aimed at protecting the personal information of California residents. It will give consumers the right to learn about the personal information that businesses collect, sell, or disclose about them, and prevent the sale or disclosure of their personal information. It includes the Right to Know, Right of Access, Right to Portability, Right to Deletion, Right to be Informed, Right to Opt-Out, and Non-Discrimination Based on Exercise of Rights. This means that if consumers do not like the way businesses are using their data, they can request that it be deleted – a risk for business insights. 
    • Health Insurance Portability and Accountability Act (HIPAA) is a health protection regulation signed into law in 1996 by President Clinton. This act gives patients the right to privacy and covers 18 personal identifiers that are required to be de-identified. The Act applies not only in hospitals but also in workplaces, schools, etc.

Legislative Definitions of Personal Information

  • Personal Data (GDPR): “Any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” (source)
  • Personal Information (PI) (CCPA): “information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.” (source)
  • Personal Health Information (PHI) (HIPAA): considered to be any identifiable health information that is used, maintained, stored, or transmitted by a HIPAA-covered entity – A healthcare provider, health plan or health insurer, or a healthcare clearinghouse – or a business associate of a HIPAA-covered entity, in relation to the provision of healthcare or payment for healthcare services. PHI is made up of 18 identifiers, including names, social security number, and medical record numbers (source)

Privacy terms

 

  • Anonymization is a process where personally identifiable information (whether direct or indirect) from data sets is removed or manipulated to prevent re-identification. This process must be made irreversible. 
  • Data controller is a person, an authority or a body that determines the purposes for which and the means by which personal data is collected.
  • Data lake is a collection point for the data a business collects. 
  • Data processor is a person, an authority or a body that processes personal data on behalf of the controller. 
  • De-identified data is the result of removing or manipulating direct and indirect identifiers to break any links so that re-identification is impossible. 
  • Differential privacy is a privacy framework that characterizes a data analysis or transformation algorithm rather than a dataset. It specifies a property that the algorithm must satisfy to protect the privacy of its inputs, whereby the outputs of the algorithm are statistically indistinguishable when any one particular record is removed in the input dataset.
  • Direct identifiers are pieces of data that identify an individual without the need for more data, ex. name, SSN, etc.
  • Homomorphic encryption is a method of performing a calculation on encrypted information (ciphertext) without decrypting it (to plaintext) first.
  • Identifier: Unique information that identifies a specific individual in a dataset. Examples of identifiers are names, social security numbers, and bank account numbers. Also, any field that is unique for each row. 
  • Indirect identifiers are pieces of data that can be used to identify an individual indirectly, or with the combination of other pieces of information, ex. date of birth, gender, etc.
  • Insensitive: Information that is not identifying or quasi-identifying and that you do not want to be transformed.
  • k-anonymity is where the identifiable attributes of any record in a particular database are indistinguishable from those of at least k-1 other records.
  • Perturbation: Data can be perturbed by using additive noise, multiplicative noise, data swapping (changing the order of the data to prevent linkage) or generating synthetic data.
  • Pseudonymization is the processing of personal data in a way that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures that ensure the data is not attributed to an identified or identifiable person.
  • Quasi-identifiers (also known as indirect identifiers) are pieces of information that on their own are not sufficient to identify a specific individual but, when combined with other quasi-identifiers, make it possible to re-identify an individual. Examples of quasi-identifiers are zip code, age, nationality, and gender.
  • Re-identification, or de-anonymization, is when anonymized (de-identified) data is matched with publicly available information, or auxiliary data, in order to discover the individual to whom the data belongs.
  • Secure multi-party computation (SMC), or Multi-Party Computation (MPC), is an approach to jointly compute a function over inputs held by multiple parties while keeping those inputs private. MPC is used across a network of computers while ensuring that no data leaks during computation. Each computer in the network only sees bits of secret shares — but never anything meaningful.
  • Sensitive: Information that is more general among the population, making it difficult to identify an individual with it. However, when combined with quasi-identifiers, sensitive information can be used for attribute disclosure. Examples of sensitive information are salary and medical data. Let’s say we have a set of quasi-identifiers that form a group of women aged 40-50, a sensitive attribute could be “diagnosed with breast cancer.” Without the quasi-identifiers, the probability of identifying who has breast cancer is low, but once combined with the quasi-identifiers, the probability is high.
  • Siloed data is data stored away in silos with limited access, to protect it against the risk of exposing private information. While these silos protect the data to a certain extent, they also lock the value of the data.

How can working from home affect your data privacy?

On March 11, the World Health Organization declared the Coronavirus (COVID-19) a global pandemic, sending the world into a mass frenzy. Since that declaration, countries around the world have shut borders, closed schools, requested citizens to stay indoors, and sent workers home. 

While the world may appear to be at a standstill, some jobs still need to get done. Like us at CryptoNumerics, companies have sent their workers home with the tools they need to complete their regularly scheduled tasks from the comfort of their own homes. 

However, with a new influx of people working from home, insecure networks, websites or AI tools can leave company information vulnerable. In this article, we’ll go over where your privacy may be at risk during this work-from-home season.

Zoom’s influx of new users raises privacy concerns.

Zoom is a video-conferencing company whose platform is used to host meetings, online chats and online collaboration. Since people across the world are required to work or attend school online, Zoom has seen a substantial increase in users. In February, Zoom’s shares rose 40%, and in three months it has doubled the number of monthly active users it gained in the entire year of 2019 (source). 

While this influx and global exposure are significant for any company, this unprecedented level of usage can expose holes in its privacy protection efforts, a concern that many are starting to raise. 

Zoom’s growing demand makes it a big target for third parties, such as hackers, looking to gain access to sensitive or personal data. Zoom is being used by companies large and small, as well as by students across university campuses. This means a vast amount of important, sensitive data could very well be vulnerable. 

Some university professors have decided against Zoom telecommuting, saying the Zoom privacy policy, which states that the company may collect information about recorded meetings that take place in video conferences, raises too many personal privacy concerns. 

On a personal privacy level, Zoom gives the administrator of a conference call the ability to see when a caller has been on another webpage for over 30 seconds. Many call this feature a violation of employee privacy. 

Internet-rights advocates have begun urging Zoom to publish transparency reports detailing how it manages data privacy and data security.  

Is your Alexa listening to your work conversations?

Both Google Home and Amazon’s Alexa have previously made headlines for listening to homes without being called upon and saving conversation logs.  

Last April, Bloomberg released a report on Amazon workers listening to and transcribing conversations heard through Alexa devices in people’s homes. Bloomberg reported that most voice-assistant technologies rely on human review to help improve the product. Not only were Amazon employees listening to Alexa devices without users invoking them, but they were also sharing the things they heard with their co-workers. 

Amazon claims the recordings sent to the “Alexa reviewers” are accompanied only by an account number, not an address or full name that could identify a user. However, the entire notion of strangers hearing full, personal conversations is uncomfortable.

As the world is sent to work from home, and with over 100 million Alexa devices in American homes, there should be some concern over the degree to which these speaker systems are listening in on your work conversations.   

Our advice during this work-from-home long haul? Review the privacy settings of your online applications, and be cautious of which devices may be listening when you have important meetings or calls.