Private Data: Implicit Data Collections in Hyperledger Fabric v2.0
Overview
Private data is an important feature in Hyperledger Fabric, allowing certain data to be stored only inside selected organizations. This provides data privacy within a channel and reduces the number of channels required when privacy is needed at the data level. Fabric v2.0 introduces an implicit collection for each organization, which application chaincode can use without defining a collection explicitly. This article explores the implicit collection and shows how it is used by both the lifecycle chaincode and an application chaincode.
Private Data at a Glance
An introduction to private data is available in the Fabric documentation. Here is a quick overview.
In a typical setup, after organizations join a channel, each of their peers maintains a local copy of the ledger, which includes a world state database and a blockchain of transactions. The world state is the same on every peer, which gives a single source of truth across the blockchain platform. This implies that any data stored in the ledger is stored in all peers.
There are scenarios where some data privacy is needed: certain data should only be accessible to a subgroup of organizations, and organizations outside this subgroup should not have a copy of that data. This is addressed by private data, introduced in release 1.2. You can refer to my previous article (link) for a comparison between channels and private data collections when dealing with data privacy.
Private data is implemented as data collections. Inside a collection there are two types of information: the actual private data and its data hash. A collection is defined with a policy that specifies the subgroup of organizations allowed to hold the actual data. While the private data is stored only in this subgroup, all peers (including those outside the subgroup) keep the data hash as evidence for transaction validation.
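The split between the actual data and its hash can be sketched in Go. This is an illustration only, not Fabric's internal code: it assumes a SHA-256 digest of the value, and the helper names hashValue and matchesHash are hypothetical. It shows how a peer that holds only the hash can still check a value that is disclosed to it later.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hashValue returns the SHA-256 digest of a private value.
// Assumption: SHA-256 is used here purely for illustration.
func hashValue(value []byte) []byte {
	sum := sha256.Sum256(value)
	return sum[:]
}

// matchesHash reports whether value hashes to the stored digest.
func matchesHash(value, storedHash []byte) bool {
	return bytes.Equal(hashValue(value), storedHash)
}

func main() {
	// Peers inside the subgroup keep the value; every peer keeps the hash.
	private := []byte("peter")
	keptByAllPeers := hashValue(private)

	fmt.Println(matchesHash([]byte("peter"), keptByAllPeers)) // true
	fmt.Println(matchesHash([]byte("kc"), keptByAllPeers))    // false
}
```

The hash alone does not reveal the value, which is why it can safely be kept by peers outside the subgroup.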
Here is a quick illustration of the various data stored inside the world state database on each peer.
Collections are defined in a collection definition, which is a JSON file listing the properties of all the collections. This file is specified when the chaincode definition is approved and committed in Fabric v2.0 (or when the chaincode is instantiated in previous releases). Here is an example of a collection definition, taken from the marbles02_private example in fabric-samples.
[
  {
    "name": "collectionMarbles",
    "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 3,
    "blockToLive": 1000000,
    "memberOnlyRead": true
  },
  {
    "name": "collectionMarblePrivateDetails",
    "policy": "OR('Org1MSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 3,
    "blockToLive": 3,
    "memberOnlyRead": true
  }
]
You can refer to the documentation (link) for a detailed description of the properties. Here we focus only on the policy. In this example, collectionMarbles is a collection which both Org1 and Org2 keep, while collectionMarblePrivateDetails is a collection only for Org1.
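To make the reading of the policy field concrete, here is a small Go sketch that parses the collection definition above and lists the organizations named in each policy. The struct, the function names and the regular expression are assumptions made for illustration; the regular expression is a naive match on 'MSP.role' principals and is not Fabric's policy parser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// collectionConfig mirrors the properties shown in the JSON example.
type collectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
	MemberOnlyRead    bool   `json:"memberOnlyRead"`
}

// principalRe captures the MSP ID in principals such as 'Org1MSP.member'.
var principalRe = regexp.MustCompile(`'([^'.]+)\.[^']+'`)

// parseCollections decodes a collection definition file.
func parseCollections(data []byte) ([]collectionConfig, error) {
	var cols []collectionConfig
	if err := json.Unmarshal(data, &cols); err != nil {
		return nil, err
	}
	return cols, nil
}

// policyOrgs returns the MSP IDs mentioned in a policy string.
func policyOrgs(policy string) []string {
	var orgs []string
	for _, m := range principalRe.FindAllStringSubmatch(policy, -1) {
		orgs = append(orgs, m[1])
	}
	return orgs
}

const sample = `[
  {"name": "collectionMarbles", "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 1000000, "memberOnlyRead": true},
  {"name": "collectionMarblePrivateDetails", "policy": "OR('Org1MSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 3, "memberOnlyRead": true}
]`

func main() {
	cols, err := parseCollections([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, c := range cols {
		fmt.Println(c.Name, policyOrgs(c.Policy))
	}
}
```

Running it lists Org1MSP and Org2MSP for collectionMarbles and only Org1MSP for collectionMarblePrivateDetails.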
Chaincode interacts with a collection through two APIs: PutPrivateData and GetPrivateData. They work just like PutState and GetState, except that we also specify the collection to which the private data is stored or from which it is retrieved.
Prior to v2.0, we had to define a collection before we could use it. As it is quite common to have a data collection for each organization, v2.0 introduces the implicit collection. The name tells us what it means: it is a collection predefined in each peer, corresponding to a private data collection for one organization. We do not need to define it explicitly in the collection definition before using it.
The collection name is _implicit_org_<MSP>. We specify this collection in PutPrivateData or GetPrivateData when we use it. When it is used in chaincode, it is prefixed with the channel name and the chaincode name.
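The naming rule can be written as a small Go helper. The function names below are hypothetical and only encode the _implicit_org_<MSP> pattern described above.

```go
package main

import (
	"fmt"
	"strings"
)

// implicitPrefix is the fixed part of every implicit collection name.
const implicitPrefix = "_implicit_org_"

// implicitCollectionName builds the implicit collection name for an MSP ID.
func implicitCollectionName(mspID string) string {
	return implicitPrefix + mspID
}

// mspIDFromImplicit extracts the MSP ID from an implicit collection name.
// The boolean is false when the name does not follow the pattern.
func mspIDFromImplicit(collection string) (string, bool) {
	if !strings.HasPrefix(collection, implicitPrefix) || len(collection) == len(implicitPrefix) {
		return "", false
	}
	return strings.TrimPrefix(collection, implicitPrefix), true
}

func main() {
	name := implicitCollectionName("Org1MSP")
	fmt.Println(name) // _implicit_org_Org1MSP

	msp, ok := mspIDFromImplicit(name)
	fmt.Println(msp, ok) // Org1MSP true
}
```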
Interestingly, this implicit collection is also used by the lifecycle chaincode, a system chaincode for deploying an application chaincode in a Fabric network. During its operation certain information is written to and kept in the implicit collection.
In this article, we mainly focus on the implicit collection, and observe how both the lifecycle chaincode and application chaincode are using it. We are not going through the lifecycle chaincode in detail. Those who are interested can refer to my previous articles (link, link).
Demonstration Setup
Our demonstration is built on the First Network of fabric-samples. This brings up a Fabric network with a Raft-based orderer cluster and two peer organizations, each of which has two peers (four peers in total). A channel mychannel is created and all peers join it. In order to observe the state, we use CouchDB as the state database for each peer and use Fauxton to inspect the databases inside CouchDB.
We are using a modified SACC chaincode. SACC is a sample chaincode that stores a key/value pair in the ledger. We add two functions: setPrivateOrg1 and getPrivateOrg1. The names are self-explanatory, and we use them to work on the implicit collection for Org1 for demonstration purposes.
Here is the modified chaincode.
As you can see, lines 51–54, 80–90 and 109–122 are added for the two new functions. And you can see the PutPrivateData being used in line 85, and GetPrivateData in line 114. Both refer to the implicit collection for Org1.
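For readers without the listing at hand, here is a minimal, self-contained Go sketch of what the two added functions might look like. It is not the exact code used in this demonstration. Only PutPrivateData and GetPrivateData follow the method signatures of the chaincode stub; the privateStub interface and the in-memory memStub are hypothetical stand-ins so that the sketch runs without a Fabric network.

```go
package main

import (
	"errors"
	"fmt"
)

// org1Collection is the implicit collection for Org1.
const org1Collection = "_implicit_org_Org1MSP"

// privateStub lists the two stub methods this sketch needs (hypothetical interface).
type privateStub interface {
	PutPrivateData(collection, key string, value []byte) error
	GetPrivateData(collection, key string) ([]byte, error)
}

// setPrivateOrg1 stores a key/value pair in the implicit collection for Org1.
func setPrivateOrg1(stub privateStub, args []string) (string, error) {
	if len(args) != 2 {
		return "", errors.New("incorrect arguments: expecting a key and a value")
	}
	if err := stub.PutPrivateData(org1Collection, args[0], []byte(args[1])); err != nil {
		return "", fmt.Errorf("failed to set asset: %s", args[0])
	}
	return args[1], nil
}

// getPrivateOrg1 reads a value from the implicit collection for Org1.
func getPrivateOrg1(stub privateStub, args []string) (string, error) {
	if len(args) != 1 {
		return "", errors.New("incorrect arguments: expecting a key")
	}
	value, err := stub.GetPrivateData(org1Collection, args[0])
	if err != nil {
		return "", fmt.Errorf("failed to get asset: %s with error: %s", args[0], err)
	}
	if value == nil {
		return "", fmt.Errorf("asset not found: %s", args[0])
	}
	return string(value), nil
}

// memStub is an in-memory stand-in for the stub, keyed by collection then key.
type memStub map[string]map[string][]byte

func (m memStub) PutPrivateData(collection, key string, value []byte) error {
	if m[collection] == nil {
		m[collection] = map[string][]byte{}
	}
	m[collection][key] = value
	return nil
}

func (m memStub) GetPrivateData(collection, key string) ([]byte, error) {
	return m[collection][key], nil
}

func main() {
	stub := memStub{}
	if _, err := setPrivateOrg1(stub, []string{"name", "peter"}); err != nil {
		panic(err)
	}
	value, err := getPrivateOrg1(stub, []string{"name"})
	fmt.Println(value, err) // peter <nil>
}
```

In the real chaincode the stub is provided by the shim, and a peer outside Org1 would not hold the private value, so the read fails there.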
Demonstration
The overall demonstration is grouped into three parts. First we bring up the environment, which is the First Network, and prepare the application chaincode sacc_private. Then we deploy the chaincode to this Fabric network using the lifecycle chaincode, focusing mainly on the use of the implicit collection during the operation. Finally we interact with our application chaincode and see how it refers to the implicit collection.
Bring Up Environment
Step 1: Bring up all components and setup mychannel
cd fabric-samples/first-network
./byfn.sh up -n -s couchdb
We see this when the script is completely executed.
Step 2: Prepare browsers to show CouchDB for each peer
We need four browser tabs, one for each peer:
- peer0.org1.example.com: http://<host_ip>:5984/_utils/
- peer1.org1.example.com: http://<host_ip>:6984/_utils/
- peer0.org2.example.com: http://<host_ip>:7984/_utils/
- peer1.org2.example.com: http://<host_ip>:8984/_utils/
For convenience they are arranged in sequential tabs. From the port in the URL we know which peer we are inspecting.
Step 3: Create a new chaincode directory and place the new chaincode
For simplicity, just copy the sacc directory to a new one, sacc_private.
cd fabric-samples/chaincode
cp -r sacc sacc_private
cd sacc_private
Then replace sacc.go with the code provided (or just insert the added parts).
Step 4: Load the module for the first time
GO111MODULE=on go mod vendor
Deploy Application Chaincode in First Network
We go through the complete lifecycle process to deploy our application chaincode.
Step 5: Lifecycle stage 1: Packaging chaincode
Note that the chaincode directory is already mapped into the CLI container. Therefore we can package this newly created chaincode from there.
docker exec cli peer lifecycle chaincode package sacc.tar.gz --path github.com/hyperledger/fabric-samples/chaincode/sacc_private/ --label sacc_1
docker exec cli ls
Step 6: Lifecycle stage 2: Install chaincode package to peer0.org1 and peer0.org2
# peer0.org1
docker exec cli peer lifecycle chaincode install sacc.tar.gz
# peer0.org2
docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer lifecycle chaincode install sacc.tar.gz
Note the package ID, as we will use it when approving chaincode definition.
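The package ID used in the following steps has the form label:hash, where the label is the one given when packaging (sacc_1). As a small aid for scripting, here is a Go sketch that splits such an ID. The function name is hypothetical, and the format is inferred only from the ID seen in this demonstration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPackageID splits a package ID of the form "label:hash".
// It returns an error when either part is missing.
func splitPackageID(id string) (label, hash string, err error) {
	i := strings.LastIndex(id, ":")
	if i <= 0 || i == len(id)-1 {
		return "", "", fmt.Errorf("unexpected package ID format: %q", id)
	}
	return id[:i], id[i+1:], nil
}

func main() {
	id := "sacc_1:669c48a3aa18f629e86cbe99fd4fccae2575b400c0f7da8741f68517ae4a6579"
	label, hash, err := splitPackageID(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(label)     // sacc_1
	fmt.Println(len(hash)) // 64
}
```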
Step 7: Observe State in Peers
There is no change to the state in any peer yet. With this as a baseline, we can compare what is added in the coming steps.
Step 8: Approve Chaincode Definition for Org1
We first have Org1 approve chaincode definition.
docker exec cli peer lifecycle chaincode approveformyorg --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --channelID mychannel --name mycc --version 1 --sequence 1 --waitForEvent --package-id sacc_1:669c48a3aa18f629e86cbe99fd4fccae2575b400c0f7da8741f68517ae4a6579
Step 9: Observe State in Peers
First we observe in peer0.org1.
The same is seen in peer1.org1.
Then we observe in peer0.org2.
And the same is also seen in peer1.org2.
In summary, new databases are created in all peers when we approve the chaincode definition for Org1, but the result differs between organizations.

This is a private data collection setup, and the collection is an implicit collection specific to Org1 (as the name tells), used in channel mychannel by the lifecycle chaincode (_lifecycle). The database with $$p holds the actual data, while the one with $$h holds the hash of the actual data. For the implicit collection for Org1, the actual data is kept only in the peers of Org1, while the data hash is kept in all peers within the channel (in both Org1 and Org2).
The content inside this data collection is out of scope for this article. You are encouraged to take a look at what is inside and relate it to the process of the lifecycle chaincode.
Step 10: Approve chaincode definition for Org2
docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer lifecycle chaincode approveformyorg --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --channelID mychannel --name mycc --version 1 --sequence 1 --waitForEvent --package-id sacc_1:669c48a3aa18f629e86cbe99fd4fccae2575b400c0f7da8741f68517ae4a6579
Step 11: Observe State in Peers
First we observe peer0.org1 (and same is found in peer1.org1)
Then we observe peer0.org2 (and same is found in peer1.org2)
We see that new databases are created in all peers in a similar manner when we approve the chaincode definition for Org2.

The result is that another implicit collection, specific to Org2 this time, is being used by the lifecycle chaincode.
Step 12: Commit Chaincode
With approvals from both organizations in place, we now commit the chaincode definition.
docker exec cli peer lifecycle chaincode commit -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt --channelID mychannel --name mycc --version 1 --sequence 1
If you take a look at the state databases, there is no update on the implicit collections. The part updated by the chaincode commit is mychannel__lifecycle, which is again out of scope for this article.
After the commit is done, the application chaincode is ready to use.
Use Application Chaincode
We use chaincode invoke and query to interact with the deployed chaincode. We first use the original set and get as a reference, and then compare them with our new functions that work on the data collection.
Step 13: Invoke set and query from peers in both orgs with get
We first invoke set and make a query from both peer0.org1 and peer0.org2.
docker exec cli peer chaincode invoke -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt -C mychannel -n mycc -c '{"Args":["set","name","kc"]}'
# query from peer0.org1
docker exec cli peer chaincode query -C mychannel -n mycc -c '{"Args":["get","name"]}'
# query from peer0.org2
docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer chaincode query -C mychannel -n mycc -c '{"Args":["get","name"]}'
The data is available in both peer0.org1 and peer0.org2.
Step 14: Observe State in Peers
As we are using PutState to write data into the ledger, the data is kept in mychannel_mycc, and this happens in all peers that have joined the channel.
Step 15: Invoke setPrivateOrg1 and query from peers in both orgs using getPrivateOrg1
docker exec cli peer chaincode invoke -o orderer.example.com:7050 --tls --cafile /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/ordererOrganizations/example.com/orderers/orderer.example.com/msp/tlscacerts/tlsca.example.com-cert.pem --peerAddresses peer0.org1.example.com:7051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt --peerAddresses peer0.org2.example.com:9051 --tlsRootCertFiles /opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt -C mychannel -n mycc -c '{"Args":["setPrivateOrg1","name","peter"]}'
# query from peer0.org1
docker exec cli peer chaincode query -C mychannel -n mycc -c '{"Args":["getPrivateOrg1","name"]}'
# query from peer0.org2
docker exec -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/users/Admin@org2.example.com/msp -e CORE_PEER_ADDRESS=peer0.org2.example.com:9051 -e CORE_PEER_LOCALMSPID="Org2MSP" -e CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org2.example.com/peers/peer0.org2.example.com/tls/ca.crt cli peer chaincode query -C mychannel -n mycc -c '{"Args":["getPrivateOrg1","name"]}'
The data is available only in peer0.org1, not in peer0.org2. The query on peer0.org2 returns an error message because the peer is unable to get the asset: this data was written to the implicit collection for Org1.
It is clearer when we observe the state database.
Step 16: Observe State in Peers
As we are using PutPrivateData and specifying the implicit collection _implicit_org_Org1MSP to write data into the ledger, the data is kept in mychannel_mycc$$p_implicit_org_$org1$m$s$p, and a corresponding data hash database is also created. This applies to both peers of Org1 (peer0.org1 and peer1.org1).
If we observe the peers in Org2 (peer0.org2 and peer1.org2), we see nothing in the data portion but we do see the data hash. This is expected, as the data collection is the implicit one for Org1.
Here we see that, while an implicit collection is for a specific organization, its database is designated by a channel name and a chaincode name. In the state there are two sets of databases: one for the lifecycle chaincode (mychannel__lifecycle) and one for our application chaincode (mychannel_mycc). They are independent databases, serving different channel-chaincode combinations.
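The database names observed in Fauxton follow a regular pattern, which can be reproduced with a short Go sketch. The rule encoded here (channel, an underscore, the chaincode name, then $$p or $$h and the collection name, with each uppercase letter written as $ plus its lowercase form) is inferred from the names seen in this demonstration. The function names are hypothetical, and any other rules the peer may apply, such as limits on name length, are not handled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// escapeUpper rewrites each uppercase letter as '$' followed by its lowercase form.
func escapeUpper(s string) string {
	var b strings.Builder
	for _, r := range s {
		if unicode.IsUpper(r) {
			b.WriteByte('$')
			b.WriteRune(unicode.ToLower(r))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// stateDBName is the database holding public state written with PutState.
func stateDBName(channel, chaincode string) string {
	return channel + "_" + escapeUpper(chaincode)
}

// privateDBName is the database holding the actual private data of a collection.
func privateDBName(channel, chaincode, collection string) string {
	return stateDBName(channel, chaincode) + "$$p" + escapeUpper(collection)
}

// hashDBName is the database holding the hashes of a collection's private data.
func hashDBName(channel, chaincode, collection string) string {
	return stateDBName(channel, chaincode) + "$$h" + escapeUpper(collection)
}

func main() {
	coll := "_implicit_org_Org1MSP"
	fmt.Println(stateDBName("mychannel", "_lifecycle"))     // mychannel__lifecycle
	fmt.Println(privateDBName("mychannel", "mycc", coll))   // mychannel_mycc$$p_implicit_org_$org1$m$s$p
	fmt.Println(hashDBName("mychannel", "mycc", coll))      // mychannel_mycc$$h_implicit_org_$org1$m$s$p
	fmt.Println(privateDBName("mychannel", "_lifecycle", coll))
}
```

The last line prints the name expected for the lifecycle chaincode's use of the same implicit collection, which shows that the two chaincodes end up with separate databases.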
This ends our overall demonstration.
Summary
In this article we have demonstrated what the implicit collection looks like and how it is used.
The implicit collection gives us a predefined data collection specific to one organization. A separate collection is created per channel-chaincode combination. Besides being used by our application chaincode, the implicit collection is also used by the lifecycle chaincode when an application chaincode is deployed.