Build statistical or machine learning models using privacy-protected datasets without relocating the data. Available as a plug-in to your
Regulations or concerns about IP are blocking your ability to conduct data science.
CN-Insight allows you to build models without relocating the data by using a combination of cryptographic protocols and numerical methods. It integrates into your data science platform or can be used as a standalone product. No other server is involved.
Secure Multiparty Computation
Jointly computes a function over inputs held by multiple parties while keeping those inputs private.
- Training a statistical or machine learning model amounts to evaluating a sequence of mathematical operations or functions. Cryptographic protocols from Secure Multiparty Computation keep the underlying data private while those operations are evaluated.
- During model training the raw data is never relocated; only encrypted data that cannot be reverse engineered is communicated between the parties. Regulatory constraints such as Data Residency are therefore satisfied.
- The trained model resides as encrypted shares at each party's location. No single party can decrypt the model. To decrypt the model, all encrypted shares must be brought together. Depending on the agreement, one of the following scenarios could happen:
  - all parties could receive all the shares of the model and thus decrypt the model;
  - a subset of parties could receive all shares and only that subset can decrypt the model; or
  - the shares are never brought together and any future use of the model requires the same parties to participate.
- Each party has control over the frequency and types of models trained on their data.
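The idea of a result that exists only as shares can be illustrated with additive secret sharing, one common building block of Secure Multiparty Computation. The following is a minimal sketch, not CN-Insight's actual protocol: the modulus `PRIME` and the helper names `share`, `reconstruct`, and `secure_sum` are assumptions made for illustration, and all parties are simulated in a single process.

```python
import secrets

# Illustrative modulus (a Mersenne prime); an assumption, not a product parameter.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recover the secret; this only works when ALL shares are brought together."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Return one share of the total per party, without pooling the raw inputs."""
    n = len(private_inputs)
    # Each party i splits its own input and sends the j-th share to party j.
    distributed = [share(v, n) for v in private_inputs]
    # Each party j adds up the shares it received; this is its share of the total.
    return [sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)]


# Three parties hold 12, 30 and 8; only the total is revealed on reconstruction.
result_shares = secure_sum([12, 30, 8])
total = reconstruct(result_shares)
```

Each individual share is a uniformly random-looking number, so a single party learns nothing from the share it holds. Who receives the shares for reconstruction corresponds to the scenarios listed above.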
Private Set Intersection
Identifies the elements common to datasets that cannot be relocated, without revealing anything else and without the exorbitant cost, time, and risk of using a “trusted” third party.
- Training models on datasets that cannot be relocated requires proper alignment of the data, as well as a search for redundancies.
- Matched row identifiers are either output at each party's location or piped directly into the training model.
- This function can also be used independently of training to identify overlap with potential data partners.
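One classic way to realize Private Set Intersection is with commutative masking in the Diffie-Hellman style: each party raises hashed identifiers to its own secret exponent, and identifiers masked by both parties match exactly when the underlying identifiers are equal. The sketch below is a toy illustration of that technique and is not claimed to be the protocol CN-Insight uses. The modulus `P`, the function names, and the sample identifiers are assumptions, both parties are simulated in one process, and the parameters are not hardened for production use.

```python
import hashlib
import math
import secrets

# Illustrative prime modulus (2**255 - 19); an assumption chosen for the sketch.
P = 2**255 - 19


def hash_to_group(item):
    """Map an identifier to an integer in [1, P - 1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def keygen():
    """Pick a secret exponent coprime to P - 1, so masking is a permutation."""
    while True:
        k = secrets.randbelow(P - 2) + 1
        if math.gcd(k, P - 1) == 1:
            return k


def mask(values, key):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def private_set_intersection(ids_a, ids_b):
    """Return the identifiers of party A that party B also holds."""
    key_a, key_b = keygen(), keygen()
    # Each party masks its own hashed identifiers and sends them to the other.
    masked_a = mask([hash_to_group(x) for x in ids_a], key_a)
    masked_b = mask([hash_to_group(y) for y in ids_b], key_b)
    # B masks A's values again (keeping their order) and returns them to A.
    double_a = mask(masked_a, key_b)
    # A masks B's values again; equal identifiers now give equal values.
    double_b = set(mask(masked_b, key_a))
    return [x for x, d in zip(ids_a, double_a) if d in double_b]


matched = private_set_intersection(
    ["alice", "bob", "carol"], ["carol", "dave", "alice"]
)
```

Only masked values cross between the parties, so identifiers outside the intersection are never exchanged in the clear. The matched identifiers could then be used to align rows before training.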