Blockchain on Azure by Marley Gray
The blockchain is best known as the architecture behind Bitcoin, and many see it as the next big bang in technology since the invention of the internet. But what is the blockchain, you might be asking? I’m hoping that by the end of this blog post you will have a good understanding of what the blockchain is, why it might be useful to your organisation, and how you might create your first successful blockchain.
What is the Blockchain?
The blockchain is described as a shared, distributed ledger. A blockchain confined to one organisation is useless. Only when multiple organisations use it to create a decentralised ledger can the true potential of the technology start to be seen.
Shared – The more organisations or companies that participate in the blockchain, the more valuable it becomes. Data becomes widely visible, accessible and immutable.
Distributed – There are many replicas of the blockchain database. The more replicas there are, the harder it is for any one party to alter the record, and so the more authentic it becomes.
Ledger – The database is append-only: records can be read and new records written, but existing records cannot be updated or deleted. It is therefore an immutable audit trail of every transaction that has happened.
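To make the "immutable audit trail" idea concrete, here is a minimal sketch of an append-only ledger in which each block stores the hash of the block before it. The function names (block_hash, append_block, is_valid) and the sample transactions are made up for illustration; this is not how any particular blockchain implements it, only the general hash-linking idea.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append-only write: a new block always links to the current tip."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and link; any edit to history is detected."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "Alice pays Bob 5")
append_block(ledger, "Bob pays Carol 2")
print(is_valid(ledger))  # True

ledger[0]["data"] = "Alice pays Bob 500"  # tamper with history
print(is_valid(ledger))  # False
```

Because every replica can run the same check, a participant who rewrites an old record ends up with a chain that no longer validates.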
Blockchain truly is a mechanism to bring everyone to the highest degree of accountability. No more missed transactions, human or machine errors. Even an exchange that was not done with the consent of the parties involved. – Ian Khan
The above quote by Ian Khan gives a good description of blockchain technology. I see a technology that allows us to create transactions that are impervious to fraud and to establish a shared truth.
No one owns the blockchain, so responsibility for it is shared between all the organisations using the network. To see why this is beneficial, take Facebook as an example: I upload a picture to Facebook, tell Facebook which friends should be able to see it, and expect Facebook to do roughly what I think it will do. Ultimately though, and ignoring some legal and ethical aspects, Facebook controls my data and can unilaterally evolve it (add to it, delete it, modify it), and I am unable to do anything about it. I may feel that I have control, but it is still Facebook that owns and controls my data. On a blockchain, the rules that govern the network can prevent any one individual from manipulating this data.
Distributed ledgers are governed and controlled by rules
A distributed ledger has a set of rules that specify what new information is considered valid or not, and how participants should react to it. This set of rules could be thought of as a constitution.
Sometimes called the blockchain rules, network rules, or the rules of the ledger, these are pre-agreed technical rules about how new data is handled. Participants are subject to this constitution of rules when they create or join a distributed ledger network.
For example, in Bitcoin, one rule is a limit on the amount of data in one block of transactions. Another is that a valid block must come with a hash that matches a specific pattern. A third is that you can’t spend a bitcoin that you don’t have.
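The three rules can be sketched as a single validity check. This is a toy model rather than Bitcoin's actual implementation: the size limit, the "hash starts with two zeros" pattern, the transaction format and all function names are assumptions chosen to keep the example small.

```python
import hashlib

MAX_BLOCK_BYTES = 1_000   # hypothetical size limit for this sketch
DIFFICULTY_PREFIX = "00"  # hypothetical pattern the block hash must match


def serialise(transactions):
    return "|".join(f"{sender}>{receiver}:{amount}" for sender, receiver, amount in transactions)


def block_digest(transactions, nonce):
    return hashlib.sha256(f"{serialise(transactions)}#{nonce}".encode("utf-8")).hexdigest()


def mine(transactions):
    """Try nonces until the hash matches the required pattern."""
    nonce = 0
    while not block_digest(transactions, nonce).startswith(DIFFICULTY_PREFIX):
        nonce += 1
    return nonce


def is_valid_block(transactions, nonce, balances):
    # Rule 1: the block may not exceed the size limit.
    if len(serialise(transactions).encode("utf-8")) > MAX_BLOCK_BYTES:
        return False
    # Rule 2: the submitted hash must match the agreed pattern.
    if not block_digest(transactions, nonce).startswith(DIFFICULTY_PREFIX):
        return False
    # Rule 3: nobody may spend more than they hold.
    running = dict(balances)
    for sender, receiver, amount in transactions:
        if amount <= 0 or running.get(sender, 0) < amount:
            return False
        running[sender] -= amount
        running[receiver] = running.get(receiver, 0) + amount
    return True


balances = {"alice": 10}
honest = [("alice", "bob", 4)]
overspend = [("alice", "bob", 40)]
print(is_valid_block(honest, mine(honest), balances))        # True
print(is_valid_block(overspend, mine(overspend), balances))  # False
```

Every node applies the same checks, so a block that breaks any rule is rejected everywhere regardless of who produced it.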
In a private distributed ledger network, one rule could be that no transaction is valid unless a minimum of three participants approves it with their digital signatures; or that every transaction involving certain assets must be signed. A participant in a private network may also have legal contractual commitments to the other participants, such as service-level agreements.
The key point about the evolution of data in a distributed ledger network is that there are rules at a technical level about how data is handled. Before distributed ledgers, single entities had total control over their data, and commitments were made only at the legal, contractual or terms-of-service level.
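As a rough illustration of the "minimum of three approvals" rule, the sketch below counts valid approvals on a transaction. It assumes HMAC with per-participant secrets as a stand-in for real digital signatures (a real network would use public-key signatures), and the participant names and keys are hypothetical.

```python
import hashlib
import hmac

MIN_APPROVALS = 3

# Hypothetical participants; HMAC secrets stand in for real signing keys.
PARTICIPANT_KEYS = {
    "bank_a": b"key-a",
    "bank_b": b"key-b",
    "bank_c": b"key-c",
    "bank_d": b"key-d",
}


def sign(participant, transaction):
    key = PARTICIPANT_KEYS[participant]
    return hmac.new(key, transaction.encode("utf-8"), hashlib.sha256).hexdigest()


def is_approved(transaction, signatures):
    """signatures maps participant name -> signature over the transaction."""
    valid = 0
    for participant, signature in signatures.items():
        key = PARTICIPANT_KEYS.get(participant)
        if key is None:
            continue  # unknown participants do not count
        expected = hmac.new(key, transaction.encode("utf-8"), hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected, signature):
            valid += 1
    return valid >= MIN_APPROVALS


tx = "transfer asset 42 from bank_a to bank_d"
two = {p: sign(p, tx) for p in ("bank_a", "bank_b")}
three = {p: sign(p, tx) for p in ("bank_a", "bank_b", "bank_c")}
print(is_approved(tx, two))    # False
print(is_approved(tx, three))  # True
```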
Are there any disadvantages of the Blockchain?
In Marley Gray’s presentation, found in the recommended content of this blog, he outlined that Bitcoin only processes about 7 transactions a second. That is extremely slow in comparison to conventional databases, because every transaction has to be verified and agreed on across the network, and it comes with a heavy computing cost. As an example, a single bitcoin transaction has been calculated to devour as much energy as powers 1.57 US households for a day, roughly 5,000 times more than a typical credit card payment. Would you prefer to pay by cash, credit or planet-wide blackout?
Data within the blockchain has to be deterministic. A block on the chain is propagated between nodes to verify its authenticity. If it contains values that aren’t deterministic, a value could change while the block propagates between thousands of nodes, causing the chain to see the new block as fraudulent and reject it.
Blockchain technology is expensive. I created a private blockchain on Azure, and the infrastructure it required came to five virtual machines totalling £100 per month. This was only for a trial; doing this on an industrial scale would incur a much bigger cost. On the public blockchain, Bitcoin miners are picking up transactions into a block within about 10 minutes for a fee of $0.44. The cost can be reduced or increased through the fee you attach: if you don’t mind a slower transaction, you can attach a lower fee, or alternatively pay a higher fee to be picked up sooner.
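The determinism requirement can be shown with two simulated nodes that each rebuild the same block and compare hashes. The block layout and function names are assumptions for illustration; a random ID stands in for any value that differs from node to node (a local clock reading would behave the same way).

```python
import hashlib
import json
import uuid


def block_hash(block):
    payload = json.dumps(block, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_deterministic(tx):
    # Every node derives the block purely from the transaction itself.
    return {"tx": tx}


def build_non_deterministic(tx):
    # Each node adds a value it generates locally, so no two nodes agree.
    return {"tx": tx, "local_id": uuid.uuid4().hex}


tx = "Alice pays Bob 5"

node_a = block_hash(build_deterministic(tx))
node_b = block_hash(build_deterministic(tx))
print(node_a == node_b)  # True: both nodes accept the block

node_a = block_hash(build_non_deterministic(tx))
node_b = block_hash(build_non_deterministic(tx))
print(node_a == node_b)  # False: the hashes disagree, so the block is rejected
```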
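The trade-off between fee and speed can be sketched as a miner filling a block with the best-paying pending transactions first. This is a simplification (real miners also weigh the fee against the size of the transaction), and the pool contents, the block capacity and the function name are hypothetical; only the $0.44 figure comes from the text above.

```python
def select_for_block(pending, max_transactions):
    """Pick the highest-fee pending transactions first."""
    ranked = sorted(pending, key=lambda tx: tx["fee"], reverse=True)
    return ranked[:max_transactions]


# Hypothetical pool of waiting transactions, fees in dollars.
pending = [
    {"id": "a", "fee": 0.10},
    {"id": "b", "fee": 0.44},
    {"id": "c", "fee": 1.20},
]

chosen = select_for_block(pending, 2)
print([tx["id"] for tx in chosen])  # ['c', 'b']
```

Transaction "a" is not lost; it simply waits for a later block, which is the slower confirmation you accept in return for the lower fee.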
What are the advantages?
We can have a peer-to-peer exchange to dramatically reduce settlement time. For example, imagine we are developers for cancer research and we are sending medical advice to a testing company. On the blockchain, we would have a live update on the progress of how the medical advice affects the patient. We don’t have to call around or attempt to obtain information because we are both sharing the same ledger. Data becomes instantly available to both parties, which allows us to build faster data analytics and lets us all become more responsive.
A blockchain also allows separate organisations to share a standard, organised ledger. This allows for decentralisation and promotes shared control, which encourages data sharing:
• (1) Leads to more data, and therefore better models.
• (2) Allows for shared control of AI training data & models.
Competitors (say, banks or music labels) traditionally would never share their data. But it would be straightforward to show how, with combined data from several banks, one could make better models for credit card fraud prevention.
Who uses Blockchain?
Many companies use the blockchain; supermarkets are one example worth a closer look. Little known to the public, Walmart has implemented blockchain technology to track and manage the process of food distribution. The corporate giant has improved the efficiency of, and trust in, the process of tracking the meat and poultry that it sells. The technology tracks information (through QR code scanning) at every step of the process, from the farmer to the broker, distributor and retailer, and everything in between.
The initiative was prompted by a Salmonella outbreak within the supply chain, which took months to correct because of the complicated documentation and IT systems in use. With the blockchain, all of this data could be retrieved in seconds with AI technology.
Centralised vs. decentralised?
Even if some organisations decide to share, they could share without needing blockchain technology. For example, they could simply pool their data in an S3 bucket and expose the API among themselves. But in some cases, decentralisation gives new benefits. First is the literal sharing of infrastructure, so that one organisation in the sharing consortium doesn’t control all the “shared data” by itself. (This was a key stumbling block a few years back when the music labels tried to work together on a common registry.) Another benefit is that it’s easier to turn the data and models into assets, which can then be licensed externally for profit. I elaborate on this below. (Thanks to Adam Drake for drawing extra attention to the hoard-vs-share tension.)
CRUD vs CRAB
CRUD stands for Create-Read-Update-Delete. These are the basic operations of persistent storage. They don’t all apply to the blockchain, as you can’t update or delete transactions. Instead, operations on a blockchain can be described as CRAB: Create-Retrieve-Append-Burn. Create and Retrieve correspond to the Create and Read of CRUD and are pretty self-explanatory. Append adds new data to an existing record rather than changing what is already there. Burn refers to assigning a completely random public key to the block, which renders all further operations on it unauthorised for everyone (essentially like forgetting your key).
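A small in-memory sketch may help show how the four CRAB operations differ from CRUD. The class name, the asset and key strings, and the use of a plain string comparison in place of real signature checks are all assumptions for illustration.

```python
import secrets


class CrabStore:
    """Toy store offering Create, Retrieve, Append and Burn only."""

    def __init__(self):
        self._assets = {}

    def create(self, asset_id, owner_key, data):
        if asset_id in self._assets:
            raise ValueError("asset already exists")
        self._assets[asset_id] = {"owner": owner_key, "history": [data]}

    def retrieve(self, asset_id):
        # Return a copy so callers cannot edit the stored history.
        return list(self._assets[asset_id]["history"])

    def append(self, asset_id, owner_key, data):
        asset = self._assets[asset_id]
        if asset["owner"] != owner_key:
            raise PermissionError("only the current owner may append")
        asset["history"].append(data)  # earlier entries are never changed

    def burn(self, asset_id, owner_key):
        asset = self._assets[asset_id]
        if asset["owner"] != owner_key:
            raise PermissionError("only the current owner may burn")
        # Hand the asset to a random key that nobody holds.
        asset["owner"] = secrets.token_hex(32)


store = CrabStore()
store.create("asset-1", "alice-key", "minted")
store.append("asset-1", "alice-key", "moved to warehouse")
store.burn("asset-1", "alice-key")

print(store.retrieve("asset-1"))  # ['minted', 'moved to warehouse']
try:
    store.append("asset-1", "alice-key", "late change")
except PermissionError as err:
    print(err)  # only the current owner may append
```

Note that burning does not remove anything: the history is still retrievable, it just can never be extended again.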
This brings me onto GDPR
An important aspect of GDPR for blockchain is that personal data is not to leave the EU unless specific safeguards are in place. This is a major problem with public blockchains, since there is no control over who hosts a node. It is less of an issue with private or permissioned blockchains.
There is also a separate section, Art. 17, on the ‘Right to be Forgotten’. This concept is clearly an important one regarding ‘erasure of data’. However, nowhere in the document, not even in the definitions in Art. 4, is there any explanation of what the term ‘erasure of data’ actually means.
The GDPR initiative probably had only CRUD in mind (“you are always able to Delete information”) when dealing with basic operations on persistent storage. The fact that this doesn’t match blockchain technology creates some friction. Because there is currently no definition of “erasure of data” in GDPR, you probably need to interpret the term strictly, which means that throwing away the encryption keys that protect personal data on a blockchain is not acceptable as ‘erasure of data’ under GDPR.
Of course, this has consequences on what we can store on a blockchain. Storing personal data on a blockchain is not an option anymore according to GDPR. A popular option to get around this problem is a very simple one: You store the personal data off-chain and store the reference to this data, along with a hash of this data and other metadata (like claims and permissions about this data), on the blockchain.
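The off-chain pattern described above can be sketched as follows. A dictionary stands in for the conventional database and a list for the ledger; the record fields, permission labels and function names are hypothetical.

```python
import hashlib
import json

off_chain_db = {}  # stand-in for a conventional, erasable database
on_chain = []      # stand-in for the append-only ledger


def record_hash(record):
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def store(record_id, record, permissions):
    """Keep the personal data off-chain; put only reference, hash and metadata on-chain."""
    off_chain_db[record_id] = record
    on_chain.append({
        "ref": record_id,
        "hash": record_hash(record),
        "permissions": permissions,
    })


def verify(record_id):
    """Check that the off-chain record still matches its on-chain hash."""
    record = off_chain_db.get(record_id)
    if record is None:
        return False
    entry = next(e for e in on_chain if e["ref"] == record_id)
    return entry["hash"] == record_hash(record)


def erase(record_id):
    """Erasure happens off-chain; the ledger itself is never rewritten."""
    off_chain_db.pop(record_id, None)


store("patient-7", {"name": "Jane Doe", "dob": "1980-01-01"}, ["clinic-read"])
print(verify("patient-7"))  # True
erase("patient-7")
print(verify("patient-7"))  # False
print(len(on_chain))        # 1
```

After erasure, the ledger still holds the reference, hash and permissions, but the personal data they pointed to is gone.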
How does Blockchain ledger compare to relational databases?
The Blockchain is not controlled by any single entity.
The Blockchain has no single point of failure.
Dive deep into Cryptlets – and Cryptlet fabric
thanks to the following authors,