IPFS and the Merkle Forest
Web3: JavaScript API for interacting with the Ethereum Virtual Machine
Solidity: Smart Contract Programming Language
IPFS: Distributed File System
This is a quick introduction to calling the Ethereum Virtual Machine using the web3 API, compiling Solidity Smart Contracts, and traversing content addressed data structures on the Interplanetary File System.
These are some of the core technologies that will be used to build Ðapps.
npm install web3 --save
npm install solc --save
npm install ipfs-api --save
The Ethereum JS Util Library - "ethereumjs-util": "4.5.0"
The Ethereum JS Transaction Library - "ethereumjs-tx": "1.1.2"
The first thing we need to do is get testrpc up and running. Depending on the type of machine you have, this could be straightforward or it may take a while. The link below should point you in the right direction.
Configure testrpc
Sending Transactions
Once your test Ethereum Node is running, instantiate a new web3 object. This can be done by doing the following:
Start up node in the console:
var Web3 = require("web3")
var web3 = new Web3(new Web3.providers.HttpProvider("http://localhost:8545"))
We can now call different web3 APIs. Test out the connection to TESTRPC by calling:
web3.eth.accounts
You will see all 10 accounts generated by TESTRPC.
web3.eth.accounts[0]
You will see the address of the first account generated by TESTRPC.
web3.eth.getBalance(web3.eth.accounts[0])
This returns the balance, in wei, of your first TESTRPC account.
Converting from Wei to Ether
var acct1 = web3.eth.accounts[0]
var acct2 = web3.eth.accounts[1]
var balance = (acct) => { return web3.fromWei(web3.eth.getBalance(acct), 'ether').toNumber() }
balance(acct1)
balance(acct2)
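To make the unit conversion concrete, here is a minimal sketch of what converting wei to ether involves, written without web3. The helper names `weiToEther` and `etherToWei` are hypothetical and not part of the web3 API; the sketch assumes non-negative amounts and a Node version with BigInt support.

```javascript
// 1 ether = 10^18 wei. BigInt keeps the arithmetic exact for large balances.
const WEI_PER_ETHER = 10n ** 18n

// Hypothetical helper (not part of web3): format a non-negative wei amount as an ether string.
function weiToEther(wei) {
  const value = BigInt(wei)
  const whole = value / WEI_PER_ETHER
  // Left-pad the remainder to 18 digits, then drop trailing zeros.
  const fraction = (value % WEI_PER_ETHER).toString().padStart(18, "0").replace(/0+$/, "")
  return fraction ? `${whole}.${fraction}` : `${whole}`
}

// Hypothetical helper: the inverse direction, for whole numbers of ether.
function etherToWei(ether) {
  return BigInt(ether) * WEI_PER_ETHER
}

console.log(weiToEther("100000000000000000000")) // 100
console.log(weiToEther("1500000000000000000"))   // 1.5
console.log(weiToEther(etherToWei(1)))           // 1
```

Working with strings and BigInt avoids the precision loss that can occur when a large wei amount is converted to a JavaScript number.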
Send an Ethereum Transaction
web3.eth.sendTransaction({from: acct1, to: acct2, value: web3.toWei(1, 'ether'), gas: 21000, gasPrice: 2000000000, nonce: web3.eth.getTransactionCount(acct1)})
Send a Raw Ethereum Transaction
var Ethtx = require("ethereumjs-tx")
// pKey1 must hold the private key for acct1 as a hex string without the 0x prefix (testrpc prints the keys at startup)
var pKey1x = Buffer.from(pKey1, "hex")
pKey1x
var rawTx = {
  nonce: web3.toHex(web3.eth.getTransactionCount(acct1)),
  to: acct2,
  gasPrice: web3.toHex(2000000000),
  gasLimit: web3.toHex(21000),
  value: web3.toHex(web3.toWei(25, "ether")),
  data: ""
}
var tx = new Ethtx(rawTx)
tx.sign(pKey1x)
tx.serialize().toString("hex")
web3.eth.sendRawTransaction(`0x${tx.serialize().toString("hex")}`, (error, data) => {
  if (!error) { console.log(data) }
})
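The raw transaction fields are hex-encoded quantities. As a rough illustration of what that encoding produces, the sketch below builds the same kind of object without web3. The helpers `toHex` and `buildRawTx` and the placeholder recipient address are assumptions made for this example, not library functions.

```javascript
// Hypothetical helper: hex-encode a non-negative integer with a 0x prefix.
function toHex(n) {
  return "0x" + BigInt(n).toString(16)
}

// Hypothetical helper: assemble the unsigned transaction fields as hex strings.
function buildRawTx({ nonce, to, gasPrice, gasLimit, valueWei }) {
  return {
    nonce: toHex(nonce),
    to,
    gasPrice: toHex(gasPrice),
    gasLimit: toHex(gasLimit),
    value: toHex(valueWei),
    data: "0x", // no call data for a plain value transfer
  }
}

const rawTxSketch = buildRawTx({
  nonce: 0,
  to: "0x0000000000000000000000000000000000000001", // placeholder address
  gasPrice: 2000000000,
  gasLimit: 21000,
  valueWei: 25n * 10n ** 18n, // 25 ether expressed in wei
})
console.log(rawTxSketch.gasLimit) // 0x5208
console.log(rawTxSketch.gasPrice) // 0x77359400
```

Signing and serializing still require a transaction library such as ethereumjs-tx; this only shows the shape of the unsigned fields.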
### Creating Smart Contracts
var Web3 = require("web3")
var solc = require("solc")
var web3 = new Web3(new Web3.providers.HttpProvider("http://localhost:8545"))
Create a source variable with your Solidity code. (This can be any smart contract code, as long as it does not import another contract at the top.)
var source = `contract Messenger {
function displayMessage() constant returns (string) {
return "This is the message in the smart contract";
}
}`
var compiled = solc.compile(source)
source
You can then start to unpack the different parts of the compiled contract:
compiled.contracts.Messenger
compiled.contracts.Messenger.bytecode
compiled.contracts.Messenger.opcodes
compiled.contracts.Messenger.interface
var abi = JSON.parse(compiled.contracts.Messenger.interface)
In order to deploy the contract to the network you will need the JSON Interface of the Contract, the abi (application binary interface).
var messengerContract = web3.eth.contract(abi)
messengerContract
var deployed = messengerContract.new({
  from: web3.eth.accounts[0],
  data: compiled.contracts.Messenger.bytecode,
  gas: 470000,
  gasPrice: 5,
}, (error, contract) => { })
You should now see the transaction broadcast to the network in the testrpc output.
web3.eth.getTransaction("0x0")
### Call a Function in the Contract
deployed.displayMessage.call()
IPFS – Hashlink Technology
IPFS defines a new paradigm in the way that we can organize, traverse, and retrieve data using a p2p network that serves hash linked data. This is done by using merkle-links between files that are distributed on the interplanetary file system.
Content-addressing enables unique identifiers and unique links between data that lives on the distributed network. It creates distributed, authenticated, hash-linked data structures. This is achieved by Merkle-Links and Merkle-Paths.
A merkle-link is a link between two objects which is content-addressed with the cryptographic hash of the target object, and embedded in the source object. This is the way the Bitcoin and Ethereum blockchains work: they are both essentially giant merkle trees, one with blocks of ordered transactions and one with computational operations driving state changes.
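A merkle-link can be sketched in a few lines of Node. This example assumes SHA-256 over the JSON serialization as the content address, which is a simplification: IPFS itself uses multihash-encoded identifiers and canonical serialization formats. The names `hashOf` and `verifyLink` are hypothetical.

```javascript
const crypto = require("crypto")

// Content address of an object: SHA-256 of its JSON serialization (an assumption for this sketch).
function hashOf(obj) {
  return crypto.createHash("sha256").update(JSON.stringify(obj)).digest("hex")
}

// The target object, and a source object that embeds a link to it.
const target = { message: "hello merkle forest" }
const source = { name: "greeting", link: { "/": hashOf(target) } }

// Verify a link by re-hashing the candidate target and comparing with the embedded hash.
function verifyLink(link, candidate) {
  return link["/"] === hashOf(candidate)
}

console.log(verifyLink(source.link, target))                  // true
console.log(verifyLink(source.link, { message: "tampered" })) // false
```

Because the link is derived from the target's content, any change to the target produces a different hash and the link no longer verifies.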
IPLD – InterPlanetary Linked Data
IPLD is a common hash-chain format for distributed data structures. This creates a database agnostic path notation.
any hash –> any data format
This enables cryptographic integrity checking and immutable data structures. Some of the properties that follow from this are long-term archival, versioning, and distributed mutable state.
It shifts the way you think about fetching content. Content addressing is What vs Where.
A "link", represented as a JSON object, consists of a link key and a link value:
{ "/" : "ipfs/0x0" }
Here "/" is the link key and "ipfs/0x0" is the link value.
Other properties of the link can be defined in the json object as well.
What if we want to create a more dynamic link object? This can be achieved by using a merkle-path.
A merkle-path is a unix-style path which initially dereferences through a merkle-link and allows access to elements of the referenced node and other nodes transitively.
This means that you can design an object model on top of IPLD that would be specialized for file manipulation and have specific path algorithms to query this model.
This would look like:
/ipfs/0x0/a/b/c/d
Here /ipfs is the protocol, 0x0 is the hash of the linked object, and /a/b/c/d is the traversal.
Here are some examples of traversals with the link JSON object.
[Screenshots: traversal examples from the ipld/specs repository]
This is all essentially a JSON structure for traversing files, navigating through the IPLD object, walking along the ipfs hash to pull arbitrary data that is nested in these data structures.
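A traversal of this kind can be sketched with an in-memory block store. Everything here is an assumption for illustration: the `store`, `put`, `deref`, and `resolve` names are hypothetical, link values are bare SHA-256 hex hashes, and real IPLD resolvers work with CIDs and pluggable formats.

```javascript
const crypto = require("crypto")

// Hypothetical in-memory block store: hash -> object.
const store = new Map()

// Store an object under the SHA-256 hash of its JSON serialization.
function put(obj) {
  const hash = crypto.createHash("sha256").update(JSON.stringify(obj)).digest("hex")
  store.set(hash, obj)
  return hash
}

// If the value is a link object { "/": "<hash>" }, follow it to the linked node.
function deref(value) {
  while (value !== null && typeof value === "object" && typeof value["/"] === "string") {
    value = store.get(value["/"])
  }
  return value
}

// Resolve a merkle-path of the form /ipfs/<hash>/a/b/c.
function resolve(path) {
  const [protocol, rootHash, ...segments] = path.split("/").filter(Boolean)
  if (protocol !== "ipfs") throw new Error("unsupported protocol: " + protocol)
  let node = store.get(rootHash)
  for (const segment of segments) {
    node = deref(node)
    if (node === undefined || node === null) return undefined
    node = node[segment]
  }
  return deref(node)
}

const leaf = put({ d: "deep value" })
const root = put({ a: { b: { c: { "/": leaf } } } })
console.log(resolve(`/ipfs/${root}/a/b/c/d`)) // deep value
```

The path walks plain nested properties until it reaches a link at `c`, dereferences it through the store, and continues inside the linked object.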
CID: Content Identifier – format for hash-links
Multihash – multiple cryptographic hashes
Multicodec – multiple serialization formats
Multibase – multiple base encodings
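The way these pieces compose can be sketched as follows. This is a simplified illustration, assuming a CIDv1 with the raw binary codec (0x55), a sha2-256 multihash (code 0x12, length 0x20), and the base16 multibase prefix "f". The function names are hypothetical, and production code should use a dedicated CID library.

```javascript
const crypto = require("crypto")

// Multihash: <hash function code><digest length><digest>.
// 0x12 identifies sha2-256 and 0x20 is the digest length (32 bytes).
function multihashSha256(data) {
  const digest = crypto.createHash("sha256").update(data).digest()
  return Buffer.concat([Buffer.from([0x12, 0x20]), digest])
}

// CIDv1: <version><multicodec><multihash>.
// 0x01 is the CID version and 0x55 is the multicodec code for raw binary.
function cidV1Raw(data) {
  return Buffer.concat([Buffer.from([0x01, 0x55]), multihashSha256(data)])
}

// Multibase: a one-character prefix names the base encoding; "f" is lowercase base16.
function toMultibaseHex(bytes) {
  return "f" + bytes.toString("hex")
}

const cid = toMultibaseHex(cidV1Raw(Buffer.from("hello")))
console.log(cid.slice(0, 9)) // f01551220
console.log(cid)
```

Reading the prefix from left to right tells a consumer the base encoding, the CID version, the serialization format, the hash function, and the digest length before it reaches the digest itself.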
[Screenshot from Juan Benet's talk "Enter the Merkle Forest"]
This is very powerful because we can use Content Identifiers to traverse different cryptographic protocols: Bitcoin, Ethereum, ZCash, Git.
We could also link from crypto to Salesforce or a relational DB by using content addressing.
The paths between these now disparate systems can be resolved by using this uniform, immutable, distributed file system.
Content addressing is a true computational advancement in the way that we think about adding and retrieving content on the web. We can take existing databases and use the various parts of the IPFS protocol to build clusters of nodes that are serving content in the form of IPLD structures.
IPLD enables futureproof, immutable, secure graphs of data that are essentially giant Merkle trees. Converting a relational database or key-value DB into a merkle tree brings integrity checking, versioning, and long-term archival.
The data and named links give the collection of IPFS objects the structure of a Merkle DAG: DAG meaning Directed Acyclic Graph, and Merkle signifying that this is a cryptographically authenticated data structure that uses cryptographic hashes to address content.
This protocol enables an entirely new way to search and retrieve data. It is no longer about where a file is located on a website. It is now about what exact piece of data you are looking for. If I send you an email with a link in it and 30 days later you click the link, how can you be certain that the data you are looking at is the same as what I originally sent you? You can't.
With IPLD you can use content addressing to know for certain that a piece of content has not changed. You can traverse the IPLD object and seamlessly pick out pieces of the data. Once that data is locally cached, you can use the application offline.
You can also work on an application with other peers around you using a shared state. But why is this important for businesses?
Content addressing gives companies a set of open standards to build on. We can now build fully encrypted private networks that are content addressed.
This is a future-proof system. Hashes that are currently used in databases can be broken, but with multihash the hash function can be upgraded without changing the link format.
We can build blockchains that use IPLD and libp2p.
The IPLD resolver uses two main pieces: bitswap and the blocks-service.
Bitswap transfers blocks, and the blocks-service determines what needs to be fetched based on what is already in the local cache. This prevents duplication and increases efficiency in the system.
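The division of labor between the two pieces can be sketched as a cache in front of an exchange function. This is a hypothetical, synchronous illustration and not the js-ipfs API: `BlockService`, `missing`, `getMany`, and the `fakeNetwork` object are all names invented for the example, with a plain function standing in for bitswap.

```javascript
// Hypothetical sketch: a block service that only asks the exchange for blocks it lacks.
class BlockService {
  constructor(exchange) {
    this.cache = new Map()   // local block store: hash -> block
    this.exchange = exchange // stand-in for bitswap: fetches one block by hash
  }

  // Hashes that are not cached locally and therefore must be requested from peers.
  missing(hashes) {
    return [...new Set(hashes)].filter((hash) => !this.cache.has(hash))
  }

  // Fetch only the missing blocks, then answer every request from the cache.
  getMany(hashes) {
    for (const hash of this.missing(hashes)) {
      this.cache.set(hash, this.exchange(hash))
    }
    return hashes.map((hash) => this.cache.get(hash))
  }
}

let transfers = 0
const fakeNetwork = { h1: "block one", h2: "block two" }
const service = new BlockService((hash) => {
  transfers += 1 // count how many blocks actually cross the network
  return fakeNetwork[hash]
})

service.getMany(["h1", "h2", "h1"]) // h1 is requested twice but transferred once
service.getMany(["h1"])             // served entirely from the local cache
console.log(transfers) // 2
```

Three requested hashes across two calls result in only two transfers, which is the duplication the blocks-service is meant to prevent.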
We will be creating a resolver for the enterprise that enables them to take their existing NoSQL key-value stores and convert them into giant merkle trees that are interoperable. IPLD is like a global adapter for cryptographic protocols.
[Screenshot from Juan Benet's talk "Enter the Merkle Forest"]
Creating the Enterprise Forest
We can create a plug and play system of resolvers for IPLD.
We will resolve the database and run a blockchain in parallel. This blockchain will be built using two protocols from the IPFS project: IPLD and libp2p.
The IPLD Resolver for the enterprise will consist of:
.put
.get
.remove
.support.add
.support.rm
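One way such an interface could look is sketched below. This is an in-memory illustration under stated assumptions, not the actual IPLD resolver: the `Resolver` class, the `json` format name, and the codec shape (`serialize` / `deserialize`) are hypothetical, and SHA-256 hex strings stand in for CIDs.

```javascript
const crypto = require("crypto")

// Hypothetical in-memory resolver exposing put, get, remove, support.add, and support.rm.
class Resolver {
  constructor() {
    const formats = new Map() // format name -> { serialize, deserialize }
    this.formats = formats
    this.blocks = new Map()   // hash -> { format, bytes }
    this.support = {
      add: (name, codec) => { formats.set(name, codec) }, // register a data format
      rm: (name) => { formats.delete(name) },             // unregister a data format
    }
  }

  // Serialize the node with the named format and store it under its content hash.
  put(node, format) {
    const codec = this.formats.get(format)
    if (!codec) throw new Error("unsupported format: " + format)
    const bytes = codec.serialize(node)
    const hash = crypto.createHash("sha256").update(bytes).digest("hex")
    this.blocks.set(hash, { format, bytes })
    return hash
  }

  // Look up a block by hash and deserialize it with the format it was stored in.
  get(hash) {
    const block = this.blocks.get(hash)
    if (!block) return undefined
    const codec = this.formats.get(block.format)
    if (!codec) throw new Error("unsupported format: " + block.format)
    return codec.deserialize(block.bytes)
  }

  // Delete a block; returns true if it existed.
  remove(hash) {
    return this.blocks.delete(hash)
  }
}

const resolver = new Resolver()
resolver.support.add("json", {
  serialize: (node) => Buffer.from(JSON.stringify(node)),
  deserialize: (bytes) => JSON.parse(bytes.toString()),
})
const hash = resolver.put({ customer: "acme", balance: 42 }, "json")
console.log(resolver.get(hash))
```

Keeping format support pluggable is what lets one resolver sit in front of different databases: each database only needs a codec that maps its records to and from bytes.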
We will take any enterprise noSQL database (as long as there is a key value store) and build out the content addressed merkle tree on it.
Content Addressing enterprise content on IPFS-Clusters
An enterprise cluster can consist of 10,000, 50,000, or 100,000 nodes, and each IPLD object has to be under 1 MB.
All of these nodes will host the merkle tree mirror of the relational database.
This can also enable offline operation for the network. Essentially they have their own protocol that is mirroring their on-premise system.
This makes data migration a lot easier or not even necessary in the future. Once the data is content addressed and creates the merkle tree, we can then start traversing the data.
We will also build interfaces that can interact with the IPLD data.
IPLD is a format that can enable version control. The resolver will essentially take any database, any enterprise implementation, and convert it into a merkle tree.
We are essentially planting the seeds (product) and watering them (services). Once these trees are all in place, they can communicate because they are all using the same data format.
Future Proofing your Business
We are creating distributed, authenticated, hash-linked data structures. IPLD is a common hash-chain format for distributed data structures.
Each node in these merkle trees will have a unique Content Identifier (CID), a format for these hash-links.
This is a database-agnostic path notation: any hash, any data format.
A CID combines multihash (multiple cryptographic hashes), multicodec (multiple serialization formats), and multibase (multiple base encodings).
Again, why is this important for businesses? The most important reasons are transparency and security. This is a tamper-proof, tamper-evident database that can be shared, traversed, replicated, distributed, cached, and encrypted, and you now know exactly WHAT you are linking to, not where. You know which hash function to verify with. You know what base it is in.
This new protocol enables cryptographic systems to interoperate over a p2p network that serves hash linked data.
IPFS 0.4.5 includes the dag command, which can be used to traverse IPLD objects.
Now to write the dag-sql resolver.
Take any existing relational database and you can now traverse that database using content addressing.
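As a rough sketch of the idea, each row of a table can become a content-addressed node, with a table node linking to its rows by primary key. The rows below are hypothetical stand-ins for the result of a SQL query, and `put` and `addressTable` are names invented for this example rather than part of any dag-sql resolver.

```javascript
const crypto = require("crypto")

// Hypothetical in-memory block store: hash -> node.
const blocks = new Map()

// Store a node under the SHA-256 hash of its JSON serialization.
function put(node) {
  const hash = crypto.createHash("sha256").update(JSON.stringify(node)).digest("hex")
  blocks.set(hash, node)
  return hash
}

// Hypothetical table rows, standing in for the result of a SQL query.
const rows = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
]

// Each row becomes its own node; the table node links to the rows by primary key.
function addressTable(tableName, tableRows) {
  const links = {}
  for (const row of tableRows) {
    links[row.id] = { "/": put(row) }
  }
  return put({ table: tableName, rows: links })
}

const before = addressTable("users", rows)
const after = addressTable("users", [rows[0], { id: 2, name: "Robert" }])

console.log(before !== after) // true: changing one row changes the table hash
console.log(blocks.get(before).rows[1]["/"] === blocks.get(after).rows[1]["/"]) // true: the unchanged row keeps its hash
```

Updating one row yields a new table hash while the untouched rows keep their addresses, which is what makes versioning and deduplication cheap in this model.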
Your database is content addressed to a cluster of IPFS-Cluster nodes on a private encrypted network.
The deterministic head of the cluster then writes new entries.
We use the Ethereum network to assign a key pair for your users to leverage the mobile interface. You can sign in via fingerprint or by facial recognition using the Microsoft Cognitive Toolkit. Your database will run in parallel: you will keep your on-premise system and have a content-addressed mirror of it.
Content addressing a filesystem or any type of database creates a Merkle DAG. With this Merkle DAG we can format your data in a way that is secure, immutable, tamper proof, futureproof, and able to communicate with other underlying network protocols and application stacks. We can effectively create a blockchain network out of your existing database that runs securely on a cluster of p2p nodes. This is an IPFS-Cluster. Here are the commands that can add / remove CIDs from peers in the cluster:
I am planting business merkle DAG seeds in the merkle forest. Patches of this forest will be able to communicate with other protocols via any hash and in any format.
This is the way that the internet will work going into the future. A purely decentralized web of business trees.