Announcing CN-Protect for Data Science

We are pleased to announce the launch of CN-Protect for Data Science

CryptoNumerics announces CN-Protect for Data Science, a Python library that applies insight-preserving data privacy protection, enabling data scientists to build better-quality models on sensitive data.

Toronto – April 24, 2019 – CryptoNumerics, a Toronto-based enterprise software company, announced the launch of CN-Protect for Data Science, which enables data scientists to implement state-of-the-art privacy protection, such as differential privacy, directly in their data science stack while maintaining analytical value.

According to a 2017 Kaggle study, two of the top 10 challenges that data scientists face at work are data inaccessibility and privacy issues, which include compliance with regulations such as GDPR, HIPAA, and CCPA. Additionally, common privacy protection techniques, such as data masking, often decimate the analytical value of the data. CN-Protect for Data Science addresses these issues by allowing data scientists to privacy-protect datasets so that they retain their analytical value and can subsequently be used for statistical analysis and machine learning.

“Private information that is contained in data is preventing data scientists from obtaining insights that can help meet business goals. They either cannot access the data at all or receive a low-quality version which has had the private information removed,” said Monica Holboke, Co-founder & CEO of CryptoNumerics. “With CN-Protect for Data Science, data scientists can incorporate privacy protection in their workflow with ease and deliver more powerful models to their organization.”

CN-Protect for Data Science is a privacy-protection Python library that works with Anaconda, Scikit and Jupyter Notebooks, integrating smoothly into the data scientist's workflow. Data scientists will be able to:

  • Create and apply customized privacy protection schemes, streamlining the compliance process.
  • Preserve analytical value for model building while ensuring privacy protection.
  • Implement differential privacy and other state-of-the-art privacy protection techniques using only a few lines of code.
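
To give a sense of what differential privacy means in practice, here is a minimal, generic sketch of the Laplace mechanism applied to a mean. It is not the CN-Protect API, which this announcement does not show. The function names, the sample ages, and the clipping bounds are hypothetical, and the sketch assumes the number of records is public and that two datasets are "neighbors" when they differ in one record's value.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Return an epsilon-differentially-private mean using the Laplace mechanism.

    Values are clipped to [lower, upper], so replacing one record changes
    the mean by at most (upper - lower) / n. That bound is the sensitivity.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    if not values:
        raise ValueError("values must be non-empty")
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    # Noise scale grows as epsilon shrinks: stronger privacy, less accuracy.
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: ages from a sensitive dataset.
ages = [23, 35, 41, 29, 52, 38, 47, 31]
noisy_mean = private_mean(ages, lower=18, upper=90, epsilon=1.0, rng=random.Random(0))
print(f"Private mean age: {noisy_mean:.1f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy. A larger one keeps the result closer to the true mean, which is the trade-off between protection and analytical value described above.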

CN-Protect for Data Science follows the successful launch of the CN-Protect Desktop App in March. It is part of CryptoNumerics’ efforts to bring insight-preserving data privacy protection to data science platforms and data engineering pipelines while complying with GDPR, HIPAA, and CCPA. CN-Protect editions for SAS, RStudio, Amazon AWS, Microsoft Azure, and Google GCP are coming soon.

Top 10 Challenges Data Scientists Face at Work

We have all heard that “data is the new oil”. As with oil, data has to be transformed to be of real value to society. The people in charge of this transformation are data professionals.

Data professionals are constantly trying to make sense of data by building models that can provide the insights necessary for organizations to grow and generate more value. However, these professionals face many challenges that prevent them from building powerful models.

In 2017, Kaggle conducted a study titled “State of Data Science and Machine Learning”. One of the survey questions was, “At work, which barriers or challenges have you faced this past year? (Select all that apply)”. The top 10 challenges are listed below, along with how often respondents encountered each one:

 

| Challenge | Most of the time | Often | Sometimes | Rarely |
|---|---|---|---|---|
| Dirty data | 43% | 40% | 16% | 1% |
| Lack of data science talent in the organization | 31% | 40% | 27% | 2% |
| Company politics / Lack of management/financial support for a data science team | 26% | 40% | 30% | 4% |
| Unavailability of/difficult access to data | 28% | 42% | 27% | 2% |
| The lack of a clear question to be answering or a clear direction to go in with the available data | 29% | 43% | 27% | 2% |
| Data science results not used by business decision makers | 16% | 44% | 37% | 3% |
| Explaining data science to others | 19% | 41% | 36% | 3% |
| Privacy issues | 25% | 36% | 34% | 5% |
| Lack of significant domain expert input | 22% | 46% | 29% | 3% |
| Organization is small and cannot afford a data science team | 37% | 36% | 24% | 3% |

Data cleanliness is clearly a big issue, as data scientists spend 80% of their time cleaning data. However, challenges such as a lack of talent or expertise, company politics that leave results unused, and data inaccessibility are more difficult to solve, because they require systemic changes within the organization.

To find out how data professionals answered the other questions, see the Kaggle 2017 study.
