Blockchain & Compliance

Part I: An exploration of challenges

V-ID has recently started discussions with several parties to map the advantages and disadvantages of methods for handling personal data. The goal of these discussions is to establish a standard for GDPR-compliant digital file security using blockchain technology.


Hashing files

Blockchain solutions such as Factom, V-ID and Po.et store the SHA-256 hash of a file in one or more blockchains. This makes it possible to check whether the content of a copy has changed compared to the original file by verifying its hash.

A hash is often described as a digital fingerprint of a file. SHA-256 hashing is a one-way transformation of data of any size into a fixed-length value from which the input cannot be read back. The hash value is 256 bits long and is usually written as a string of 64 hexadecimal characters.
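As a minimal sketch of this check, the Python snippet below uses the standard hashlib module. The sample byte strings and the helper name sha256_hex are made up for illustration and are not taken from any of the products mentioned above.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Illustrative stand-ins for the bytes of an original file and two copies.
original = b"Diploma: Master of Science, awarded to person A"
copy = b"Diploma: Master of Science, awarded to person A"
tampered = b"Diploma: Master of Science, awarded to person B"

# The value that would be recorded in a blockchain.
recorded_hash = sha256_hex(original)

print(len(recorded_hash))                     # 64 hex characters = 256 bits
print(sha256_hex(copy) == recorded_hash)      # True: content unchanged
print(sha256_hex(tampered) == recorded_hash)  # False: content differs
```

Any change to the content, however small, gives a different hash, which is what makes the comparison usable as an integrity check.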

Challenge 1: The right to be forgotten

At first glance, some provisions of the GDPR seem to directly contradict the fundamentals of blockchain technology. The most controversial GDPR mandate for blockchain is the “Right to be forgotten” (the right to erasure, Article 17). This gives individuals the right to have their personal data removed from a database at their request.

However, because of the decentralized character of the largest and safest blockchains, data cannot be deleted. Blockchains are designed to last forever and, in principle, to be unchangeable, which brings blockchain into direct conflict with one of the fundamental rights of the GDPR.

Possible solution: Keystore

When a hash is recorded in the blockchain, it is assumed that it will remain there forever. As a result, the direct connection between a recorded hash and the contents of a file can never be removed. A keystore, in this case an extra layer of data that holds a key, can be added to the validation process. The associated key can function as a “kill switch”: a way to remove the relation between a file and its associated contextual data.

When validating, the hash of the file is supplemented with the unique key. A new hash is taken from this combination of hash and key, which is then recorded in the blockchain.

When verifying the file, the intermediate layer supplies the key. It is not the “bare” hash of the file but the hash of the hash-and-key combination that is matched against the blockchain.

The effect: the “bare” hash of the file is not directly recorded in the blockchain and therefore cannot be found and linked to the contents of the file.
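The keystore idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not V-ID's implementation: the ledger set stands in for a blockchain, the keystore dictionary stands in for the intermediate layer, and the names validate, verify and forget are hypothetical. Looking up the key by the bare file hash is also an assumption; that lookup entry lives outside the blockchain and is deleted together with the key.

```python
import hashlib
import secrets

ledger = set()   # stand-in for the blockchain: entries are only ever added
keystore = {}    # hypothetical off-chain layer: bare file hash -> secret key


def bare_hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keyed_hash(data: bytes, key: bytes) -> str:
    # Hash of (file hash + key): only this value is recorded on the ledger.
    return hashlib.sha256(bare_hash(data) + key).hexdigest()


def validate(data: bytes) -> None:
    key = secrets.token_bytes(32)      # unique random key for this file
    keystore[bare_hash(data)] = key
    ledger.add(keyed_hash(data, key))


def verify(data: bytes) -> bool:
    key = keystore.get(bare_hash(data))
    if key is None:                    # key deleted: nothing can be matched
        return False
    return keyed_hash(data, key) in ledger


def forget(data: bytes) -> None:
    # The "kill switch": delete the key, leave the ledger untouched.
    keystore.pop(bare_hash(data), None)


diploma = b"Diploma: Master of Science, awarded to person A"
validate(diploma)
verified_before = verify(diploma)   # True
forget(diploma)
verified_after = verify(diploma)    # False, although the ledger entry remains
```

Once the key is gone, the remaining ledger entry can no longer be recomputed from the file, so it no longer serves as a link to the file's contents.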

source: Digital Technology, a Weapon against Document Fraud, Hélène Mouiche, Senior Analyst at Markess

Challenge 2: Possibility of retrieving personal information with the hash and knowledge of the context

The original file cannot be retrieved or generated from the calculated hash. Nonetheless, scenarios are conceivable in which the hash can be linked directly to a person by someone who knows the context and the file structure.

Possible solution: Seed

A seed is a piece of unique random data that is added to a file. This data is different for each file. The seed does not have to be visible in the content of the file.

This ensures that files with comparable content, such as diplomas, differ by more than just the personal data they contain. Since the content of the files is no longer identical apart from the personal data, and the seed of someone else's file is unknown, changing the name in one file no longer reproduces the hash of another, so trying to retrieve other names this way is no longer possible.
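A minimal sketch of seeding in Python is shown below. The helper add_seed and the way the seed is appended as a trailing line are assumptions made for illustration; in a real PDF the seed would more likely be placed in metadata, where it is not visible in the content.

```python
import hashlib
import secrets


def add_seed(content: bytes) -> bytes:
    """Append 16 random bytes (hex-encoded) to the file content."""
    seed = secrets.token_hex(16).encode("ascii")
    return content + b"\nseed: " + seed + b"\n"


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Two files with exactly the same visible content.
content = b"Diploma: Master of Science, awarded to person A"
file_1 = add_seed(content)
file_2 = add_seed(content)

# Each file received its own seed, so the recorded hashes differ.
hash_1 = sha256_hex(file_1)
hash_2 = sha256_hex(file_2)
print(hash_1 == hash_2)  # False, barring a collision between two random seeds
```

With 16 random bytes per file, guessing the seed of someone else's file is not practical, so the hash of that file cannot be reproduced from its visible content alone.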

What are the advantages and disadvantages?

The diagram below shows the advantages and disadvantages of the mentioned mechanisms. Options C to F use blockchain technology.

Example: a diploma in PDF format

Files with personal data such as diplomas are an example of documents requiring additional security measures.


Keystore

A diploma carries a name. If the owner of the document insists on removing the proof of authenticity of this document, the link between the document hash and the contextual data can be removed.

Seed

A second characteristic of every diploma is the fixed structure of the content. This makes the following scenario possible:

Someone has only the diploma of person A, but wants to know whether person B has also obtained this diploma. To find out, they can change the name in the PDF to the name of person B and then verify this modified file.

If this new hash matches a hash that was previously recorded, it implies that an original file was previously validated with the name of person B on it.

In that case, personal data of person B has been traced with the help of a hash.
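The scenario can be reproduced with a short Python sketch. The diploma template, the names and the set of recorded hashes are all hypothetical; a real PDF has a more complex structure, but the principle is the same as long as the file is fully determined by a known layout plus the name.

```python
import hashlib

# Assumed fixed diploma layout, known to anyone who holds one diploma.
TEMPLATE = "Diploma: Master of Science, awarded to {name}"


def diploma_hash(name: str) -> str:
    data = TEMPLATE.format(name=name).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Hashes recorded earlier for the unseeded diplomas of person A and person B.
recorded = {diploma_hash("Person A"), diploma_hash("Person B")}

# The holder of A's diploma substitutes other names and checks each result.
candidates = ["Person B", "Person C"]
found = [name for name in candidates if diploma_hash(name) in recorded]

print(found)  # ['Person B']: B's diploma was validated, C's was not
```

The search succeeds only because nothing in the file is unknown to the person guessing; a per-file seed removes exactly that property.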

What’s next?

Documents are exchanged digitally at an ever-increasing pace. Besides progress, this unfortunately also brings new forms of fraud. Blockchain technology can become an important pillar of our dealings with information, so that innovations are not hindered by adverse side effects.

In Part II we will focus on self-signing, anonymization and pseudonymization.

In developing a good standard method for file security, the ideal balance between security and user-friendliness must always be sought. In that respect, there is still a great need for the expertise of people from almost all sectors of business.

Let’s get in touch!

We would like to thank the following persons for their insights so far: Katja van Kranenburg and Simon Sanders (CMS), Kevin Leeuwis (Oceanco), Frank Verhaest (Isabel), Louis de Bruin (IBM), Olivier Rikken (Axveco), Joshua Jenster and Pim Voets (V-ID), Bastiaan Oosterman (Alterdax), Perry Smit and Herman Hartgers (KVK).

To join the discussion around the new standard for file security, leave a comment or contact us at https://promo.v-id.org.