Access Private Data from an Organization outside the Private Data Collection Policy

Overview

In Hyperledger Fabric, a private data collection provides a way to keep some data within a subgroup of organizations inside a channel. This is useful when some business data may be kept and seen only by designated organizations. In implementation, the organizations in the subgroup keep the private data in their local ledgers, so they see the data when they query those ledgers. Organizations outside the subgroup do not have the private data in their local ledgers, and therefore see nothing when querying locally. However, there is a way to get the private data even though it is not stored in the local ledger, which is a security problem. In this article I simulate such a setup and provide a way to address the problem.

Concept Review

Some basic concepts are involved in this article. Here is a quick review, just good enough for this work. You can always refer to Fabric documentation for more detail.

Each peer within a channel keeps a copy of the ledger. The ledger has two parts: the blockchain keeps the blocks (of transactions) received from the orderer, and the world state keeps the latest state, updated according to the transactions. In this article we focus only on the world state.

In most situations, states (key/value pairs) are kept on all peers. When some states should be stored only by a subgroup of organizations, we can use private data. A quick example: in a three-org network, one can define a private data collection such that only Org1 and Org2 have the data, while Org3 cannot keep the actual data. What Org3 (and every other org, including Org1 and Org2) has is a hash of the data, which serves as proof of existence of the private data when needed.
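
To make the "hash as proof of existence" idea concrete, here is a minimal sketch. It assumes SHA-256 as the hash function and hashes only the raw value; the helper name value_hash is hypothetical, and exactly which bytes a peer hashes and how it stores the digest are outside the scope of this sketch.

```shell
# Hypothetical helper: hex SHA-256 digest of a value, standing in for
# the hash that every peer in the channel keeps for a private value.
value_hash() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Org1 and Org2 hold the actual value; Org3 holds only this digest.
stored_hash=$(value_hash "bob")

# Later, anyone who is shown the claimed value can check it against
# the digest without having stored the value beforehand.
if [ "$(value_hash "bob")" = "$stored_hash" ]; then
  echo "value matches the stored hash"
fi
```

The digest alone does not reveal the value, but it lets a party confirm a value once it is disclosed.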

In Fabric v2.0, an implicit private data collection is defined for each organization. In this demo I use it with a simple chaincode that sets and gets private data in this implicit collection.

After a chaincode is installed on a peer and committed to a channel, it is deployed and ready for use. Using a chaincode always means invoking it: a client application invokes the chaincode. In our case the client is the CLI, issuing the peer chaincode invoke command, which requires the following information:

  1. endorser(s)
  2. chaincode function and argument list to be endorsed
  3. orderer for processing the transaction upon successful endorsement
  4. channel name
  5. chaincode name

Let’s take a look at a sample CLI peer command.

A sample “chaincode invoke” using CLI

We see

  1. endorsers: peer0.org1.example.com and peer0.org2.example.com (and their TLS CA certificates)
  2. chaincode function and argument list: ["setPrivateOrg1","name","bob"]
  3. orderer for processing the transaction: orderer.example.com (and its TLS CA certificate)
  4. channel name: mychannel
  5. chaincode name: mycc
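
The five items map onto flags of peer chaincode invoke roughly as sketched below. The sketch only prints the command rather than running it, so it works without a network. The variable and function names are mine; the addresses and paths are the ones used in this setup.

```shell
# Illustrative variable names; values are the ones used in this setup.
CRYPTO=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto
ORDERER_CA=$CRYPTO/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem
ORG1_CA=$CRYPTO/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt
ORG2_CA=$CRYPTO/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt

# Print (not run) the invoke command. Lines after the first follow the
# item order above: 1. endorsers (two lines), 2. function and arguments,
# 3. orderer, 4. channel name, 5. chaincode name.
build_invoke() {
  cat <<EOF
peer chaincode invoke \\
  --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles $ORG1_CA \\
  --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles $ORG2_CA \\
  -c '{"Args":["setPrivateOrg1","name","bob"]}' \\
  -o orderer.example.com:7050 --tls --cafile $ORDERER_CA \\
  -C mychannel \\
  -n mycc
EOF
}

build_invoke
```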

The endorser selection should satisfy the endorsement policy. In our case we use the default, majority, which means that in a two-org setup like ours, endorsements from both organizations are required. As a result we specify peers from both Org1 and Org2 in item 1.
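
As a rough illustration of what "majority" means numerically, the sketch below computes the smallest number of organizations that is more than half of the total. The function name is hypothetical, and this is plain arithmetic rather than a Fabric API.

```shell
# Hypothetical helper: smallest count that is more than half of N orgs.
majority_of() {
  echo $(( $1 / 2 + 1 ))
}

majority_of 2   # two-org setup like ours: prints 2, so both orgs must endorse
majority_of 3   # three-org network: prints 2, so any two orgs suffice
```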

Besides invoke, query is the other chaincode operation. Query provides a way to read the world state directly from a peer’s local ledger. Since the ledger is not updated, no endorsement is needed when a client performs a chaincode query. For private data, if the actual data is not stored on the local peer (because its organization is outside the subgroup), the query returns an error.

Scenario

Here is how I simulate a scenario such that an organization outside a private data collection definition can get the private data.

First Network is used: two organizations (Org1 and Org2), each with two peers, join the channel mychannel. We use only peer0.org1.example.com and peer0.org2.example.com.

We also use a chaincode from the private data demonstration in a previous article (link). The chaincode uses the implicit data collection for Org1 (_implicit_org_Org1MSP) and defines two functions with self-explanatory names, setPrivateOrg1 and getPrivateOrg1. Both functions operate on this implicit collection.
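
Every organization gets its own implicit collection, named after its MSP ID. Here is a tiny sketch of the naming pattern; the helper name is mine, and Org2MSP is the MSP ID used for Org2 in this network.

```shell
# Hypothetical helper: implicit collection name for a given MSP ID,
# following the _implicit_org_<MSPID> pattern.
implicit_collection() {
  printf '_implicit_org_%s\n' "$1"
}

implicit_collection Org1MSP   # prints _implicit_org_Org1MSP
implicit_collection Org2MSP   # prints _implicit_org_Org2MSP
```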

This is the chaincode for the demonstration.

Step 1: Deploy chaincode

This is a standard lifecycle chaincode operation to deploy the chaincode on First Network. You can find more introduction on this process in this article (link).

cd fabric-samples/first-network
./byfn.sh up -n -s couchdb

docker exec cli peer lifecycle chaincode package sacc.tar.gz --path github.com/hyperledger/fabric-samples/chaincode/sacc_private/ --label sacc_1

docker exec cli peer lifecycle chaincode install sacc.tar.gz

docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer lifecycle chaincode install sacc.tar.gz

docker exec cli peer lifecycle chaincode approveformyorg --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --channelID mychannel --name mycc --version 1 --sequence 1 --waitForEvent --package-id sacc_1:37e80bb608bbbfea406d0ff5c9beb14981944cb05e6191c6770b2cda36177c25

docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer lifecycle chaincode approveformyorg --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --channelID mychannel --name mycc --version 1 --sequence 1 --waitForEvent --package-id sacc_1:37e80bb608bbbfea406d0ff5c9beb14981944cb05e6191c6770b2cda36177c25

docker exec cli peer lifecycle chaincode commit -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt --channelID mychannel --name mycc --version 1 --sequence 1

Step 2: Open up two terminals for CLI, one for Org1 and one for Org2. This facilitates our demonstration.

For Org1

docker exec -it cli bash
Org1-CLI: (background in black)

For Org2

docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt -it cli bash
Org2-CLI (background in blue)

Note that I add some background colour on Org2 for differentiation.

Step 3: Invoke set private data from Org1

As in common use, we invoke the chaincode function from Org1-CLI. Per the endorsement policy, we specify both peer0.org1.example.com and peer0.org2.example.com as endorsers.

Note that data passed in the argument list is recorded in the blockchain; we ignore this here. For better privacy the data should be passed as transient data. However, this is not the concern of our demonstration, so let’s simply use the argument list. For those who are interested in transient data, here is my article on this topic (link).

peer chaincode invoke -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt -C mychannel -n mycc -c '{"Args":["setPrivateOrg1","name","bob"]}'
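
As an aside on the transient option mentioned above, here is a minimal sketch of how a value would be prepared for it: the peer CLI takes a JSON map through --transient, and the values in that map are base64-encoded. The function name setPrivateOrg1Transient is hypothetical; the chaincode in this demo reads its input from the argument list, so a function that reads the transient map would have to be added. The flags are printed, not executed.

```shell
# Base64-encode the private value; tr strips the trailing newline.
VALUE_B64=$(printf '%s' "bob" | base64 | tr -d '\n')

# Print (not run) the relevant part of the invoke command.
# setPrivateOrg1Transient is a hypothetical chaincode function that
# would read "name" from the transient map instead of the argument list.
printf '%s\n' "-c '{\"Args\":[\"setPrivateOrg1Transient\"]}' --transient '{\"name\":\"$VALUE_B64\"}'"
```

With this approach the value travels with the proposal to the endorsers but is not part of the arguments recorded in the transaction.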

After both endorsers execute the chaincode function, they send their endorsements back to Org1-CLI.

Endorsement stage of chaincode invoke

Assuming no problems in the endorsement, Org1-CLI creates a transaction with the proposal and the endorsements and sends it to the ordering service. The orderer creates a new block that includes this transaction and sends the block to the peers. The block is committed on each peer, and the private data is kept only on the peers of Org1, not on the peers of Org2.

Block is committed in peer. Private data is only in peers of Org1.

Step 4: Get private data from Org1-CLI and Org2-CLI using query

peer chaincode query -C mychannel -n mycc -c '{"Args":["getPrivateOrg1","name"]}'

Org1-CLI

Org2-CLI

The result is as expected. A chaincode query reads state from the local ledger. The private data is found only on peer0.org1.example.com, not on peer0.org2.example.com. The error is clear: the hash is found but the private data is not. This is because the data is written to the implicit private data collection of Org1.

If we double-check the world state database (CouchDB), we can see

Both peers keep the data hash (first item in the box), but the actual data is only stored in peer0.org1.example.com (second item).

Step 5: Get private data from Org2 using invoke

While the private data cannot be accessed from Org2-CLI using query, there is a trick to get it using invoke: set peer0.org1.example.com as the endorser. This is not intended to be a real chaincode invoke, just a way to trick the system.

peer chaincode invoke -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt -C mychannel -n mycc -c '{"Args":["getPrivateOrg1","name"]}'

Now we have successfully obtained the private data from Org2, which is not allowed per the policy.

Access Control: a Possible Way to Solve this Problem

I expect there are many ways to avoid this scenario. Here I have tested one: add access control at the chaincode level so that certain chaincode functions are limited to specific users. In our case, we enforce that only users from Org1MSP can access the functions related to private data. I use the Client Identity Chaincode Library (CID) so that both setPrivateOrg1 and getPrivateOrg1 can be called only by users from Org1MSP.

Here is an updated chaincode with the CID check included in both functions (lines 82–88 and 118–124), which makes sure that only an Org1 user accesses the implicit private data.
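
The gist of the check can be modelled in a few lines. This is only a shell model of the decision logic, not chaincode: the function below is a hypothetical stand-in, and the caller's MSP ID is passed as an argument, whereas in the chaincode it would be obtained through the CID library from the identity that submitted the proposal.

```shell
# Model only: in the real chaincode the caller's MSP ID comes from the
# client identity (CID) library, not from a shell argument.
get_private_org1() {
  caller_mspid=$1
  if [ "$caller_mspid" != "Org1MSP" ]; then
    echo "access denied for $caller_mspid" >&2
    return 1
  fi
  # Stand-in for reading the value from Org1's implicit collection.
  echo "bob"
}

get_private_org1 Org1MSP                     # allowed: prints the value
get_private_org1 Org2MSP || echo "blocked"   # denied: prints "blocked"
```

The point is that the decision depends on who submits the proposal, not on which peer executes the chaincode, so choosing an Org1 endorser from Org2-CLI no longer helps.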

Repeat the same demonstration shown above. Without going through the whole process, here we only show the result of step 5.

Now the chaincode blocks access if the transaction is initiated from Org2.

Summary

It seems there is a trick to access private data from an organization outside a private data collection policy. While “chaincode query” only accesses the peer’s local ledger, “chaincode invoke” endorsed by an organization within the policy will return the data. Currently we can use access control at the chaincode level, using the client identity (CID) chaincode library, to avoid this problem.

Written by

Happy to share what I learn on blockchain. Visit http://www.ledgertech.biz/kcarticles.html for my works, or reach me on https://www.linkedin.com/in/ktam1/.
