The initial Safe-DEED demonstrator for private data exchange within and between companies has been implemented and evaluated by the partners.
Having collected initial feedback, we started defining the functionality of the next version, which aims at:
- Implementing additional application functionality, as described below
- Introducing the business perspective and implementing GDPR compliance rules, e.g. the appointment of roles per process.
In particular, with respect to the demonstrator applications, the following implementations are planned:
1.1. PSI
The current version of the implementation aims at demonstrating the feasibility of a privacy-preserving data intersection between different organizations. Following the initial business and technological analysis, this proof of concept has been achieved.
In the next steps we aim to examine the following in depth, in order to produce an application able to support and augment such business collaboration:
- the business needs for corporate collaboration through data intersection analysis
- the respective software architectures and security constraints
- corporate policies and privacy preservation
We consider this service a significant asset of a potential data marketplace.
The application will be included in the next version of the demonstrator.
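To illustrate what a private intersection between two organizations involves, the following is a minimal sketch of a textbook Diffie-Hellman-style private set intersection. It is not the Safe-DEED implementation, whose protocol is not described in this section. The modulus, the function names and the sample identifiers are assumptions chosen for illustration, and the parameters are far too weak for real use. Each party blinds its hashed items with a secret exponent; because exponentiation commutes, items held by both parties end up with the same doubly blinded value, while other items stay hidden.

```python
import hashlib
import secrets

# Toy modulus (a Mersenne prime). Assumption for illustration only: a real
# deployment would use a vetted cryptographic group and an audited library.
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an identifier to an integer modulo P via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def blind(values, secret):
    """Raise every value to the party's secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def psi_intersection(a_items, b_items):
    """Return the items of party A that party B also holds.

    Both roles are simulated in one process; in practice only the blinded
    lists would cross the organizational boundary.
    """
    a_secret = secrets.randbelow(P - 3) + 2  # known only to A
    b_secret = secrets.randbelow(P - 3) + 2  # known only to B

    # A sends its singly blinded items to B.
    a_once = blind([hash_to_group(x) for x in a_items], a_secret)
    # B blinds A's items again (keeping their order) and sends them back,
    # together with its own singly blinded items.
    a_twice = blind(a_once, b_secret)
    b_once = blind([hash_to_group(y) for y in b_items], b_secret)
    # A applies its secret to B's items; equal inputs now give equal values.
    b_twice = set(blind(b_once, a_secret))

    return [x for x, t in zip(a_items, a_twice) if t in b_twice]


company_a = ["alice@example.com", "bob@example.com", "carol@example.com"]
company_b = ["bob@example.com", "dave@example.com", "carol@example.com"]
common = psi_intersection(company_a, company_b)
```

With the sample lists above, `common` contains the two shared addresses, and neither party sees the other's non-matching entries in the clear.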
1.2. De-anonymisability
RSA’s trials so far represent the procedure a company should follow before releasing, exchanging or selling its datasets, namely practical de-anonymisation, theoretical de-anonymisation, and measures to reduce the risk of de-anonymisation.
Practical de-anonymisation would most probably consist of manual work, since most publicly available information is unstructured, such as free text on social media. A considerable amount of time might be needed to find sources of auxiliary information and gather it.
Theoretical de-anonymisation complements practical de-anonymisation and consists of identifying the red flags of a dataset through analysis of the QIs. The analysis indicates which QIs are crucial for de-anonymisation in two cases: when information that is not currently publicly available becomes so in the future, and when individuals privately possess information and want to perform de-anonymisation attacks on specific individuals. Such a tool might help a data publisher decide whether or not to release certain QIs of their dataset, and also help define their generalisation hierarchies.
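One simple way to flag risky quasi-identifiers (QIs) is to measure, for each QI combination, the share of records that are unique on it. The sketch below shows this under stated assumptions: the column names, the sample records and the uniqueness score are hypothetical and do not reproduce the analysis used in the trials.

```python
from collections import Counter
from itertools import combinations


def uniqueness(records, qis):
    """Fraction of records that are unique on the given QI combination."""
    keys = [tuple(r[q] for q in qis) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


def rank_qi_subsets(records, qis, max_size=2):
    """Score every QI subset up to max_size, most identifying first."""
    scores = {
        subset: uniqueness(records, subset)
        for size in range(1, max_size + 1)
        for subset in combinations(qis, size)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample data, for illustration only.
records = [
    {"zip": "1010", "age": 34, "gender": "F"},
    {"zip": "1010", "age": 34, "gender": "M"},
    {"zip": "1020", "age": 51, "gender": "F"},
    {"zip": "1020", "age": 29, "gender": "F"},
]
ranking = rank_qi_subsets(records, ["zip", "age", "gender"])
```

In this sample, `zip` alone isolates no record, whereas the pair (`age`, `gender`) makes every record unique and therefore heads the ranking. A publisher could use such a ranking as a first indication of which QIs to withhold or generalise.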
Anonymisation measures are necessary to protect the privacy of the individuals in a dataset before a company releases or sells it to one or more third parties. The procedure might require considerable effort, depending on the complexity and nature of the dataset’s QIs, since anonymity principles more advanced than k-anonymity might be required, and granular generalisation hierarchies might be needed to lose as little information as possible.
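The trade-off between k-anonymity and information loss can be sketched as a search for the lowest generalisation level that still satisfies a given k. The hierarchy for `age` used here (exact value, 5-year band, 10-year band, suppressed), the column names and the records are assumptions for illustration; a real dataset would need one hierarchy per QI and possibly stronger principles than k-anonymity.

```python
from collections import Counter

# Hypothetical generalisation hierarchy for "age":
# level 0 = exact value, 1 = 5-year band, 2 = 10-year band, 3 = suppressed.
AGE_BAND_WIDTHS = {1: 5, 2: 10}
MAX_LEVEL = 3


def generalise_age(age, level):
    """Return the age as a string at the requested generalisation level."""
    if level == 0:
        return str(age)
    if level >= MAX_LEVEL:
        return "*"
    width = AGE_BAND_WIDTHS[level]
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def k_of(records, level):
    """Size of the smallest equivalence class over (zip, generalised age)."""
    classes = Counter(
        (r["zip"], generalise_age(r["age"], level)) for r in records
    )
    return min(classes.values())


def minimal_level(records, k):
    """Lowest level that achieves k-anonymity, or None if no level does."""
    for level in range(MAX_LEVEL + 1):
        if k_of(records, level) >= k:
            return level
    return None


# Hypothetical sample data, for illustration only.
records = [
    {"zip": "1010", "age": 31}, {"zip": "1010", "age": 33},
    {"zip": "1010", "age": 38}, {"zip": "1010", "age": 36},
    {"zip": "1020", "age": 52}, {"zip": "1020", "age": 58},
]
```

For these records, 5-year bands still leave single-record classes, 10-year bands are the finest level that reaches k = 2, and k = 3 cannot be reached by generalising `age` alone because only two records share the second `zip` value.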
1.3. Valuation
Currently, an initial implementation of the Safe-DEED Data Valuation Component (DVC) has been achieved. It consists of basic or mock-up implementations of each of the DVC components and ensures the interface and data flow between them.
The input files supported in the current version are CSV and XLS(X) files; for the latter, separate sheets can be read.
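As a rough illustration of such an input step, the sketch below dispatches on the file extension and returns one table per CSV file or per workbook sheet. The function name `load_tables` and its return shape are hypothetical and do not describe the actual DVC interface. The sketch assumes the third-party `openpyxl` package for `.xlsx` files; legacy `.xls` files would need a different reader and are not handled here.

```python
import csv
from pathlib import Path


def load_tables(path, sheets=None):
    """Load a CSV or XLSX file as {table_name: [row_dict, ...]}.

    For workbooks, `sheets` optionally restricts loading to the named
    sheets; by default every sheet is read.
    """
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            return {path.stem: list(csv.DictReader(handle))}

    if suffix == ".xlsx":
        import openpyxl  # third-party dependency, imported only when needed

        workbook = openpyxl.load_workbook(
            str(path), read_only=True, data_only=True
        )
        tables = {}
        for sheet in workbook.worksheets:
            if sheets is not None and sheet.title not in sheets:
                continue
            rows = sheet.iter_rows(values_only=True)
            header = next(rows, None)  # first row is taken as column names
            if header is None:
                tables[sheet.title] = []
            else:
                tables[sheet.title] = [dict(zip(header, row)) for row in rows]
        workbook.close()
        return tables

    raise ValueError(f"unsupported input format: {suffix}")
```

Returning a dictionary keyed by sheet name keeps the CSV and workbook cases uniform, so downstream components can iterate over tables without knowing the source format.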
In the next versions of the demonstrator, the components will exhibit more complex functionality:
- a wider range of input files and expanded file format interpretation functionality (DIL);
- basic ML algorithms (regression, classification, clustering) to allow for a proper valuation of the usability of the input data set (ADAS);
- a rule-based economic model for the valuation of the economic aspects of the input data set (S2VM).