Privacy Compliance Technology

Data Anonymization

To meet the specifications of anonymization, the data must be stripped of sufficient elements such that the data subject can no longer be identified. More precisely, that data must be processed in such a way that it can no longer be used to identify a natural person by using ‘all the means likely reasonably to be used’ by either the controller or a third party. An important factor is that the processing must be irreversible.

CryptoNumerics’ CN-Protect solution anonymizes direct identifiers irreversibly, and additionally applies advanced privacy-preserving protections to ensure that indirect identifiers cannot be used to re-identify a natural person through inference attacks or the mosaic effect.
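
As a minimal illustration of the irreversibility requirement (not CN-Protect’s actual implementation), the sketch below drops direct identifiers outright rather than hashing or encrypting them, so no key or lookup table exists that could restore the identities. The field names are hypothetical examples.

    # Minimal sketch of irreversible direct-identifier removal.
    # Field names are hypothetical examples, not a CN-Protect API.

    DIRECT_IDENTIFIERS = {"name", "email", "ssn", "phone"}

    def strip_direct_identifiers(records):
        """Return copies of the records with direct identifiers removed.

        Deleting the values outright (rather than hashing or encrypting
        them) is what makes the step irreversible: no key or lookup
        table exists that could restore the original identities.
        """
        return [
            {field: value for field, value in record.items()
             if field not in DIRECT_IDENTIFIERS}
            for record in records
        ]

    patients = [
        {"name": "Ana Diaz", "email": "ana@example.com", "age": 34, "zip": "90210"},
        {"name": "Bo Chen", "email": "bo@example.com", "age": 41, "zip": "90212"},
    ]
    print(strip_direct_identifiers(patients))
    # [{'age': 34, 'zip': '90210'}, {'age': 41, 'zip': '90212'}]

Note that the indirect identifiers (age, ZIP code) survive this step; protecting those against inference attacks is what the k-anonymity and t-closeness models described below address.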

Data Pseudonymization

Pseudonymization means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.

Personal data which have undergone pseudonymization, and which could be attributed to a natural person by the use of additional information, should be considered information on an identifiable natural person.

CryptoNumerics’ CN-Protect solution is architecturally configured so that the indexes and additional information linking a data subject to the pseudonymized data are kept separate and secure, demonstrating enterprise-class technical and organizational controls.

CN-Protect’s re-identification risk assessment and privacy-preserving actions on indirect identifiers are central to demonstrating that pseudonymized data cannot be used to identify a natural person.
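
A minimal sketch of the pseudonymization pattern just described, assuming a simple in-memory dataset: identities are replaced with random tokens, and the token-to-identity mapping is returned separately so it can be kept under independent technical and organizational controls. This illustrates the concept, not CN-Protect’s implementation.

    import secrets

    # Sketch of pseudonymization: identities are replaced by random
    # tokens; the lookup table that could re-link them is kept apart
    # from the working dataset (in practice, in a separately secured
    # store with its own access controls).

    def pseudonymize(records, identifier_field):
        lookup = {}   # token -> original identity; store this separately
        output = []
        for record in records:
            token = secrets.token_hex(8)          # unguessable random token
            lookup[token] = record[identifier_field]
            pseudonymized = dict(record)
            pseudonymized[identifier_field] = token
            output.append(pseudonymized)
        return output, lookup

    records = [{"patient": "Ana Diaz", "diagnosis": "asthma"},
               {"patient": "Bo Chen", "diagnosis": "diabetes"}]
    data, key_table = pseudonymize(records, "patient")
    # `data` can be shared for analysis; `key_table` stays behind
    # separate technical and organizational safeguards.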

Differential Privacy

Differential Privacy (DP) is a privacy framework that characterizes a data analysis or transformation algorithm rather than a dataset. It specifies a property that the algorithm must satisfy to protect the privacy of its inputs: the outputs of the algorithm must be statistically indistinguishable when any one record is removed from the input dataset. It involves a randomization element and tunable parameters, such as epsilon and delta, that guarantee the statistical indistinguishability of any individual record up to a specified limit. It is most effective when applied to very large datasets.
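
As a concrete illustration of the randomization element, the sketch below applies the standard Laplace mechanism to a count query: noise with scale sensitivity/epsilon is added to the true answer, so the output distribution changes only slightly when any single record is added or removed. This is the generic textbook mechanism, not CN-Protect’s specific algorithm, and the epsilon value is an arbitrary example.

    import math
    import random

    def laplace_noise(scale):
        """Draw one sample from Laplace(0, scale) via the inverse CDF."""
        u = random.random() - 0.5
        return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    def dp_count(records, predicate, epsilon):
        """Differentially private count of records matching `predicate`.

        A count query has sensitivity 1 (adding or removing one record
        changes the true answer by at most 1), so Laplace noise with
        scale 1/epsilon yields epsilon-differential privacy.
        """
        true_count = sum(1 for r in records if predicate(r))
        return true_count + laplace_noise(1.0 / epsilon)

    ages = [{"age": a} for a in (23, 34, 41, 56, 62, 71)]
    print(dp_count(ages, lambda r: r["age"] >= 40, epsilon=0.5))

Smaller epsilon means more noise and stronger privacy; the noisy answer remains useful when the true count is large, which is why DP works best on very large datasets.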

Optimal k-anonymity

k-anonymity is a privacy model that protects against re-identification of individuals in a dataset based on indirect identifiers (or quasi-identifiers). Quasi-identifiers, such as age, gender, or ZIP code, are not identifying on their own, but in combination can be used to distinguish a much smaller group, or possibly a single individual, in the dataset. k-anonymity treats each unique combination of quasi-identifier values as an equivalence class (or bin), and then ensures that each bin has at least k members, which are indistinguishable from each other. This is achieved by generalizing and/or suppressing the quasi-identifiers. For example, if k is set equal to 5, then every bin must contain at least 5 individuals. For bins smaller than 5, the ZIP code digits are redacted from right to left, increasing the area that the remaining digits represent so that more individuals fall into that bin, as shown in the sketch below.

As k increases, the data becomes more general and the risk of re-identification is reduced.
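
The sketch below mirrors the ZIP-code example on a toy dataset (using k = 2 so the effect is visible with a handful of records): digits are redacted from right to left until every equivalence class over the generalized ZIP code contains at least k records. This is a simplified single-attribute illustration, not CN-Protect’s optimization algorithm, which trades off several quasi-identifiers at once to minimize information loss.

    from collections import Counter

    def generalize_zip(records, k):
        """Redact ZIP digits right-to-left until every bin has >= k members.

        A toy single-attribute version of k-anonymity generalization:
        each pass keeps one fewer leading digit, widening the area each
        bin covers until all bins reach size k.
        """
        zips = [r["zip"] for r in records]
        for kept_digits in range(5, -1, -1):       # 5 digits down to none
            generalized = [z[:kept_digits] + "*" * (5 - kept_digits) for z in zips]
            bins = Counter(generalized)
            if all(count >= k for count in bins.values()):
                return [dict(r, zip=g) for r, g in zip(records, generalized)]
        return records  # fallback: k exceeds the dataset size

    people = [{"zip": z} for z in ("90210", "90211", "90213", "10001", "10002")]
    for row in generalize_zip(people, k=2):
        print(row)
    # e.g. {'zip': '9021*'} x3 and {'zip': '1000*'} x2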

t-closeness

t-closeness is a privacy model that extends k-anonymity to protect against disclosure of sensitive attributes (non-identifying attributes, such as a medical diagnosis) that may be used to infer additional information about individuals. It ensures that the distribution of sensitive attribute values in each bin is within a distance t of the distribution of sensitive values for the entire dataset. Transformation again occurs by generalizing and/or suppressing the quasi-identifiers. For example, if the sensitive attribute is salary, then each bin’s frequency distribution of salary will be within a distance t of the salary frequency distribution for the entire dataset. The distance is measured as the cumulative absolute difference of the distributions.

As t decreases, the risk of sensitive attribute disclosure decreases and information loss correspondingly increases.
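
The check below illustrates the distance just described, assuming a sensitive attribute with ordered categories: it computes the cumulative absolute difference between a bin’s frequency distribution and the whole dataset’s, normalized to [0, 1]. Enforcing t-closeness (by further generalizing quasi-identifiers until every bin passes) is omitted; this only shows how the distance is measured, and the salary bands are an invented example.

    from collections import Counter

    def distribution(values, domain):
        """Relative frequency of each domain value, in domain order."""
        counts = Counter(values)
        n = len(values)
        return [counts[v] / n for v in domain]

    def t_distance(bin_values, all_values, domain):
        """Cumulative absolute difference between the two distributions.

        For an ordered domain this accumulates running sums of the
        frequency differences across the categories, then normalizes
        by the maximum possible value so the result lies in [0, 1].
        """
        p = distribution(bin_values, domain)
        q = distribution(all_values, domain)
        running, total = 0.0, 0.0
        for pi, qi in zip(p, q):
            running += pi - qi
            total += abs(running)
        return total / (len(domain) - 1)

    salary_bands = ["low", "mid", "high"]            # ordered domain
    everyone = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
    one_bin  = ["low", "low", "low", "mid"]          # skewed toward "low"

    d = t_distance(one_bin, everyone, salary_bands)
    print(f"distance = {d:.3f}")                      # flag the bin if d > t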
