What is Blockchain Hashing?

By
Bradley Duschense
December 27, 2024
8 min read

This article examines the relationship between blockchain and hashing, and how the two work together in cryptocurrencies like Bitcoin. To start, I look at what blockchain is and how hashing relates to it. Following this, I unpack some of the intricacies of hashing algorithms.

Blockchain technology is not necessarily monetary by design; rather, to paraphrase Vitalik Buterin, the essence of blockchain is “informational and processual.” Blockchain technology could be used in a myriad of ways, and Bitcoin is simply one effective application of it. A blockchain is a linked list of blocks, each of which contains transaction data and a hash pointer to the previous block in the chain. A given blockchain functions based on the verification of hashes and digital signatures. Hashing is the process that the blockchain uses to confirm its state. Each transaction requires one or more digital signatures. Signatures ensure that a transaction can only be made by the owner of the sending address and that it is directed to the intended recipient.
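To make the "linked list with hash pointers" idea concrete, here is a minimal sketch in Python. The `Block` class, its field names, the JSON serialization, and the all-zero starting hash are my own illustrative assumptions, not Bitcoin's actual block format.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int
    transactions: tuple  # illustrative payload, e.g. ("alice->bob:5",)
    prev_hash: str       # hash pointer to the previous block

    def hash(self) -> str:
        # Serialize the block deterministically, then hash the bytes.
        payload = json.dumps(
            {
                "index": self.index,
                "transactions": list(self.transactions),
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def build_chain(batches):
    """Link each batch of transactions to the hash of the block before it."""
    chain = []
    prev_hash = "0" * 64  # assumed placeholder for "no previous block"
    for index, batch in enumerate(batches):
        block = Block(index, tuple(batch), prev_hash)
        chain.append(block)
        prev_hash = block.hash()
    return chain


def is_valid(chain) -> bool:
    """Every block must point at the actual hash of its predecessor."""
    return all(chain[i].prev_hash == chain[i - 1].hash() for i in range(1, len(chain)))


chain = build_chain([["a->b:1"], ["b->c:2"], ["c->a:3"]])
print(is_valid(chain))
```

Swapping in an altered first block, while leaving the later blocks untouched, makes `is_valid` return `False`, because the second block still points at the hash of the original.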

Hashing in Action

Cryptocurrencies like Bitcoin and Ethereum rely primarily on two things: hashing and the blockchain itself, more commonly understood as the public ledger. It is the application of these technologies that makes decentralized currency and peer-to-peer transactions secure and increasingly appealing.

In a blockchain, the hash of the previous block makes the sequence tamper-evident because, by design, a hash is very sensitive to its input. Changing any detail of any transaction in a given block changes that block’s hash, which causes a domino effect: the hash pointers in all of the subsequent blocks no longer match. Blockchain hashes are also deterministic, which means that the same input data will produce the same result each time.
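Both properties, determinism and sensitivity, are easy to observe with Python's standard `hashlib` module. The messages below are made up for illustration; the sketch hashes the same text twice, then hashes a text that differs by one character and counts how many output bits changed.

```python
import hashlib


def sha256_bytes(text: str) -> bytes:
    return hashlib.sha256(text.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = sha256_bytes("pay Alice 10")
again = sha256_bytes("pay Alice 10")
changed = sha256_bytes("pay Alice 11")  # one character is different

print("same input, same hash:", first == again)
print("bits changed by a one-character edit:", differing_bits(first, changed), "of 256")
```

With a well-behaved hash function, about half of the output bits are expected to flip for any change to the input, however small.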

Blockchain technology is not unique to cryptocurrencies. Blockchain can be used for any number of electronic or digital transactions. Bitcoin, however, applies hashing and blockchain by relying on the participation of autonomous networks, all of which take part in producing and confirming blocks of hashed transactions. Blocks are produced using proof-of-work. Next, the hashed transactions are checked against the consensus rules of the participating networks. Approved transactions are added to the public ledger alongside the other existing, approved blocks of transactions.

As mentioned earlier, decentralized and peer-to-peer transactions are a central benefit of blockchain technology. To be sure, blockchain does not need to be decentralized. Third parties like banks and credit card companies also use the technology for their own digital needs.

Blockchain technology operates as a public decentralized ledger. The benefit of such a system is that it has the ability to be monitored by multiple beneficent networks, rather than rely exclusively on a trusted third party and centralized currencies. At its best, blockchain technology applied to cryptocurrency makes a reduction of corruption within both decentralized and centralized currencies possible. This is because it relies on a fellowship of participation, rather than resting solely on traditional financial institutions. Bitcoin is simply an example of a cryptocurrency that trades on the technology of hashing and blockchain, with the central goal of establishing a modern decentralized cryptocurrency.

Critical to the legitimacy of a cryptocurrency is the public ledger that blockchain makes possible. Here is a fun example of a long-lasting but obscure currency: the Stone Rai. The Stone Rai is a longstanding currency of the Micronesian island of Yap. Rai are large doughnut-like stones that represent wealth as well as the exchange of wealth. It was not only valuable to have a 3.5 meter stone in your possession; the record of the transactions themselves was equally valuable. Historical records indicate that these large stones were sometimes lost at sea during transportation. However, this did not diminish the value of the stone, nor did it necessarily void the value of exchange, because the record of the transaction was just as valuable. And given the Rai’s mutually accepted value among the people of Yap, the physical object did not need to be present in order to maintain its fiat value.

Similarly, with Bitcoin and other cryptocurrencies, without the ability for networks to express consensus in the public ledger, a cryptocurrency has no value. Bitcoin, like gold, is a limited resource. The system will yield a maximum of 21 million Bitcoins, a cap expected to be reached around the year 2140. Once all of the new coins have been mined, no new Bitcoin will be issued, and the currency will only be used and traded for its exchange value. Presently Bitcoin is mined regularly. So, as time goes on, new coins become more and more scarce and thus, the argument goes, more valuable.
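Where does the 21 million figure come from? The article does not spell it out, so the constants below are assumptions taken from Bitcoin's consensus rules as I understand them: a 50 BTC reward for the earliest blocks, halved every 210,000 blocks, and counted in whole satoshis (100 million per coin). Under those assumptions, the total supply can be sketched as a simple sum.

```python
SATOSHIS_PER_BTC = 100_000_000
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC  # assumed reward for the earliest blocks
HALVING_INTERVAL = 210_000               # assumed number of blocks between halvings


def total_supply_satoshis() -> int:
    """Add up the block rewards across every halving period."""
    total = 0
    subsidy = INITIAL_SUBSIDY
    while subsidy > 0:
        total += subsidy * HALVING_INTERVAL
        subsidy //= 2  # integer halving, so the reward eventually rounds down to zero
    return total


supply = total_supply_satoshis()
print(supply / SATOSHIS_PER_BTC, "BTC")
```

Because the reward is halved with integer division, the sum lands slightly below 21 million coins rather than exactly on it.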

So, in order for a decentralized currency like Bitcoin to work, it not only depends on the reliability of the blockchain, it also relies on users possessing an equilibrium of rationality, self-interest, and altruism. That is to say, given the investment and computational power necessary for mining Bitcoin, there needs to be a future value. However, the system’s reliability also depends on the needs of the individual aligning with the health of the system. Therefore, networks need to cooperate and collaborate in order for the system to thrive.

Adding to the Blockchain

Miners must solve for the target hash in order to add to a blockchain. Meeting the target uses an algorithm that relies on the data from the block header. In Bitcoin, each block contains a block header with a version number, the hash of the previous block, the Merkle root of the block’s transactions, a timestamp, the current difficulty target, and the nonce.

A nonce is a “number only used once.” For a Bitcoin block, the nonce is a 32-bit (4-byte) field. Miners adjust the value of the nonce so that the hash of the block is less than or equal to the current target of the network. The presentation of the block with the correct nonce value constitutes a proof-of-work, as this iterative calculation requires time and resources.
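The search for a nonce can be sketched as a simple loop. This is a toy under stated assumptions, not Bitcoin's real mining code: the header bytes and the very easy target are invented, and the digest is read as a big-endian integer for simplicity.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of the header plus a 4-byte nonce, read as an integer."""
    data = header + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, target: int):
    """Try nonces 0, 1, 2, ... until the block hash is at or below the target."""
    for nonce in range(2**32):  # a 32-bit nonce has 2**32 possible values
        if block_hash(header, nonce) <= target:
            return nonce
    return None  # nonce space exhausted without success


def verify(header: bytes, nonce: int, target: int) -> bool:
    """Checking a claimed nonce takes a single hash computation."""
    return block_hash(header, nonce) <= target


# Assumed toy values: on average about one hash in 4096 meets this target.
EASY_TARGET = 2**244
HEADER = b"previous-hash|merkle-root|timestamp"

nonce = mine(HEADER, EASY_TARGET)
print("winning nonce:", nonce)
print("valid proof-of-work:", verify(HEADER, nonce, EASY_TARGET))
```

The asymmetry is the point: `mine` may need thousands of attempts, while `verify` confirms the result with one. Lowering the target makes the search proportionally longer without making verification any slower.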

The Genesis Block is the first block in the chain; it has no predecessor and starts a new ledger (or coin, in the case of Bitcoin). A crucial function of the blockchain is that it relies on hash pointers, which contain the address of the previous block as well as the hash of that block’s data. The blockchain is like a sequence of chain links. Just like in a chain, each link is connected to the others via its previous and next link. Digital blocks, however, are connected to the previous block using a hash pointer. An ordinary pointer does not store a hash value; it only “points” to an address. A hash pointer adds the hash of the data stored there, so any change to the earlier block can be detected. Data from the previous blocks are hashed (not encrypted), which produces a new, practically unique series of letters and numbers of fixed length.

The nonce is drawn from a large range of possible values, and the hash that any given nonce produces is effectively unpredictable. This is what high min-entropy refers to here: there is a low likelihood that any single guess yields a hash that meets the target. So, a miner is only successful if they meet the target hash; only then is the block, together with its nonce, added to the chain.

Proof of Transaction

In order for new blocks to be accepted to an existing blockchain, it is necessary to generate a proof-of-work. The proof-of-work is a hash, a fixed-length string of letters and numbers determined entirely by the input, that falls at or below the target. In Bitcoin, this hash is computed with the double SHA-256 hashing algorithm, that is, SHA-256 applied twice.

That means that once a hash that meets the target has been obtained, the block is accepted into the public ledger by the consensus of the other participating networks.

Fraudulent transactions are not added to the chain, because only validated transactions are approved and added to the blockchain. To use the same coins and try to spend them twice is called “double-spending.” The problem of double-spending is avoided because every transaction is checked against the shared ledger, and a transaction that tries to spend coins that have already been spent is rejected.
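One way to picture how validation blocks a double-spend is a ledger that remembers which coins are still unspent. The `Ledger` class and the coin identifiers below are hypothetical and heavily simplified; they only illustrate the rule that a coin can be consumed once.

```python
class Ledger:
    """Tracks which coins (identified by hypothetical string ids) are unspent."""

    def __init__(self, unspent):
        self.unspent = set(unspent)

    def spend(self, coin_id: str, new_coin_id: str) -> bool:
        """Consume coin_id and create new_coin_id; refuse if coin_id is already spent."""
        if coin_id not in self.unspent:
            return False  # unknown or already spent: reject the transaction
        self.unspent.remove(coin_id)
        self.unspent.add(new_coin_id)
        return True


ledger = Ledger({"coin-1"})
first_attempt = ledger.spend("coin-1", "coin-2")   # accepted
second_attempt = ledger.spend("coin-1", "coin-3")  # rejected: coin-1 is gone
print(first_attempt, second_attempt)
```

In a real network every participant applies the same check against its own copy of the ledger, which is why agreement on the order of transactions matters so much.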

This is why cryptocurrencies like Bitcoin depend on the features of blockchain technology: using blocks of hashes in an interdependent sequence replaces the need for a trusted third party. With blockchain, the publicly published list of transactions functions as the guarantor, by developing a public system of participation that itself vouches for its own authenticity. Or, more correctly, the established blockchain continually guarantees authenticity.

So in this sense, a blockchain like Bitcoin is self-guaranteeing, because there are multiple networks continuously approving the transactions. The Bitcoin blockchain is, therefore, a public ledger that is composed of successfully hashed blocks. Only the successful blocks are added to the list of transactions that have been mutually approved by independent networks.

Hash and Hashing

A hashing algorithm is a computational function that condenses input data of any size into an output of fixed size. The result of the computation is called a hash or a hash value. Hashes are used to identify, compare, or run calculations against files and strings of data. Typically, a program computes the hash of a file and then compares it with a previously recorded hash value, rather than comparing the files themselves.

If you didn’t love doing math in school, that is okay, because while hashing relies on some pretty crazy Alan Turing-esque computations, a computer program does all the math for you. So all you need to remember from math class are the basics of exponents and probability functions.

Digitally signing a piece of software so that it can be safely downloaded is a basic example of hashing. To do this, the publisher computes a hash of the program being offered for download. The publisher then creates a digital signature over that hash. Note that hashing does not encrypt the software; it only produces a fingerprint of it. So, when someone downloads the software, the browser checks the signature to recover the hash that the publisher computed. The browser then runs the same hash function, using the same algorithm, over the downloaded file. If the browser produces the same hash value, it can confirm that both the signature and the file are authentic and that they have not been altered.
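The hash-comparison half of that check can be sketched with the standard library alone. The function name and the sample bytes are my own, and the signature step is left out on purpose, since verifying a real signature needs a public-key library that Python does not ship with.

```python
import hashlib
import hmac


def verify_download(data: bytes, published_sha256_hex: str) -> bool:
    """Recompute the SHA-256 of the downloaded bytes and compare with the published value."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest compares the two strings in constant time
    return hmac.compare_digest(actual, published_sha256_hex.lower())


# Hypothetical example: the publisher lists this hash next to the download link.
installer = b"pretend these bytes are an installer"
published = hashlib.sha256(installer).hexdigest()

print(verify_download(installer, published))            # untouched file
print(verify_download(installer + b"\x00", published))  # file altered in transit
```

A single extra byte is enough to make the second check fail, which is exactly the sensitivity that makes hashes useful as fingerprints.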

Hash values are deterministic and depend entirely on the input given to the algorithm. In practice, the same hash cannot be reproduced from a different data set as the input, which is why hashing is so useful for cryptocurrencies. The resultant hash of the input data is both practically unique and irreversible. For example, an input of “123” will always have the same output. If this were not the case, and 123 came up with a different output every time it was hashed, there would be no consistency or validity to the process. This would mean that your programs are never speaking the same language. The hash used for Bitcoin is a 64-digit hexadecimal number, which I will explain shortly.
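The “123” example can be tried directly. This short sketch uses Python's `hashlib`; the helper name is my own. It hashes the same string twice and then measures the digest in hexadecimal digits and in bits.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = sha256_hex("123")
repeat = sha256_hex("123")

hex_digits = len(digest)
bits = hex_digits * 4  # each hexadecimal digit encodes 4 bits

print("same output both times:", digest == repeat)
print(hex_digits, "hex digits =", bits, "bits")
```

The length stays the same whether the input is three characters or three gigabytes; only the content of the digest changes.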

Digital Signatures

Blockchains pair hashing with digital signatures. Digital signatures are used well beyond blockchain: for example, SSL certificates (SSL/TLS protocol) play a role in securing data transmission from one device to another. A digital signature binds a key to a dataset. An SSL certificate, therefore, needs to match a specific public key to the intended party, kind of like a lock and key.

SSL/TLS uses asymmetric encryption, which makes secure key exchanges possible. The security of this transfer relies on two keys: a public key used for encryption, and a private key for the recipient’s decryption. Digital signatures are very sensitive: because a signature is computed over a hash of the data, small changes to the data result in a very different hash and an invalid signature.

SHA-256

Presently SHA-256 is one of the most widely trusted hashing functions. It maps any input to one of 2^256 possible output values. SHA stands for Secure Hash Algorithm, and 256 is the fixed length of the output in bits. This means that every hash, and the target it is compared against, is a 256-bit number, and, as mentioned, Bitcoin writes it as a 64-digit hexadecimal value (each hexadecimal digit encodes 4 bits).

Using the SHA-256 function makes it (nearly) impossible to duplicate a hash because there are just too many combinations to try and process. Therefore, this requires a significant amount of computational work; really significant. So much so that personal computers no longer mine Bitcoin. Presently miners require Application-Specific Integrated Circuits, or ASICs. Hitting one specific hash value has a probability of 1 in 2^256; if you remember your exponents, you will deduce that this is an incredibly difficult target to hit.

Furthermore, such a hash function is intentionally computationally impractical to reverse, and the intended result is that solving for the input requires a random or brute-force method.

Consider the following: if I have one six-sided die, I have a 1 in 6 chance of rolling a 6. However, the more sides my die has (say 256 sides), the lower my chances of rolling a 6 get (that’s 1 in 256, which is still far better than your odds of using brute force on an existing hash).
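To put numbers on the dice analogy, here is a small sketch using exact fractions. The function names are mine, and the second function assumes that every one of the 2^256 hash values is equally likely, which is the usual idealization for SHA-256.

```python
from fractions import Fraction

HASH_SPACE = 2**256  # number of distinct 256-bit hash values


def chance_of_exact_match() -> Fraction:
    """Probability that one random guess produces one specific hash."""
    return Fraction(1, HASH_SPACE)


def expected_attempts(target: int) -> Fraction:
    """Average number of hashes needed before one is at or below the target."""
    qualifying = target + 1  # the values 0..target all qualify
    return Fraction(HASH_SPACE, qualifying)


print("one die roll:      ", Fraction(1, 6))
print("256-sided die:     ", Fraction(1, 256))
print("digits in 2**256:  ", len(str(HASH_SPACE)))
print("attempts, toy target:", expected_attempts(2**240 - 1))
```

Mining does not require hitting one exact value, only landing at or below the target, which is why the second function is the more relevant one for proof-of-work.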

A hash rate is then the speed at which hashing operations take place during the mining process. If the hash rate gets too high and miners hit the target hash too quickly, this indicates that the difficulty needs to be adjusted accordingly. For example, at present, a new Bitcoin block is mined roughly every 10 minutes, and the difficulty is adjusted to keep it that way.
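The adjustment itself is simple proportional arithmetic. The sketch below follows my understanding of Bitcoin's rule, which the article does not detail, so treat the constants as assumptions: the target is recalculated every 2016 blocks, scaled by how long those blocks actually took compared with the expected two weeks, and the correction is limited to a factor of 4 in either direction.

```python
TARGET_BLOCK_SECONDS = 10 * 60   # one block every 10 minutes
RETARGET_INTERVAL = 2016         # assumed number of blocks between adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # two weeks in seconds
MAX_TARGET = 0xFFFF * 2**208     # assumed easiest target the rules allow


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how fast or slow the last interval was mined."""
    # Limit the correction to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    new_target = old_target * clamped // EXPECTED_TIMESPAN
    return min(new_target, MAX_TARGET)


old = 2**200
print(retarget(old, EXPECTED_TIMESPAN) == old)            # on schedule: unchanged
print(retarget(old, EXPECTED_TIMESPAN // 2) == old // 2)  # twice as fast: target halves
```

A smaller target means fewer hashes qualify, so mining gets harder when blocks have been arriving too quickly and easier when they have been arriving too slowly.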

Collision Resistance

SHA-256 is complex and sensitive; this makes reversing a hash, in an effort to find the original input data, basically impossible. SHA-256 is also considered extremely secure because it is “collision-resistant.” Collision resistance means that the likelihood of finding two different inputs that produce the same hash is minuscule.

Therefore, given the number of possible outputs of SHA-256, the probability of a collision is negligible. Compare two different hash outcomes for the same input: one from SHA-1, and one from the double hash function that Bitcoin uses (SHA-256 applied twice). The SHA-1 hash is 160 bits long, while the double SHA-256 hash is 256 bits long, and as a consequence of the longer output and the stronger design it is far more collision-resistant.
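The comparison is easy to reproduce with Python's `hashlib`. The input message here is arbitrary and chosen only for illustration; the sketch prints the SHA-1 digest next to the double SHA-256 digest of the same bytes, along with the length of each in bits.

```python
import hashlib

message = b"hello blockchain"  # arbitrary sample input

sha1_hex = hashlib.sha1(message).hexdigest()

# "Double" SHA-256: hash the input, then hash the resulting digest again.
inner = hashlib.sha256(message).digest()
double_sha256_hex = hashlib.sha256(inner).hexdigest()

print("SHA-1         ", len(sha1_hex) * 4, "bits:", sha1_hex)
print("double SHA-256", len(double_sha256_hex) * 4, "bits:", double_sha256_hex)
```

Note that hashing twice does not lengthen the output; the extra length comes from SHA-256 itself, which produces 256 bits against SHA-1's 160.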

Here are a few examples of cryptographic hash functions and the point at which their collision resistance broke; it will become evident why SHA-256 is currently the favored hash:

  • MD5: Produces a 128-bit hash. Collision resistance broke after ~2^21 hashes.
  • SHA-1: Produces a 160-bit hash. Collision resistance broke after ~2^61 hashes.
  • SHA-256: Produces a 256-bit hash. Bitcoin currently uses the double hash SHA-256.
  • Keccak-256: Produces a 256-bit hash. Currently used by Ethereum.
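The output sizes in this list can be read straight from Python's `hashlib`. The sketch also prints the generic "birthday" estimate, about 2^(n/2) hashes for an n-bit output, which is the effort a collision should cost if the function has no weaknesses. Keccak-256 is left out because, as far as I know, `hashlib` only ships the standardized SHA3-256, which pads its input differently and so gives different digests.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Output length of a named hashlib algorithm, in bits."""
    return hashlib.new(name).digest_size * 8


def generic_collision_exponent(name: str) -> int:
    """Exponent e such that a brute-force collision search needs about 2**e hashes."""
    return digest_bits(name) // 2


for name in ("md5", "sha1", "sha256"):
    print(f"{name}: {digest_bits(name)}-bit hash, "
          f"generic collision search ~2^{generic_collision_exponent(name)} hashes")
```

Set against these estimates, the figures quoted above for MD5 and SHA-1 show how far below the generic bound their collision resistance fell once attacks were found.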

Merkle Tree and Merkle Roots

As blocks are continually added to a growing blockchain, there is a need to reclaim storage space; this is the role of the Merkle Tree. Only the root hash (the Merkle Root) is stored in the block header, and not the entire set of transactions. Thus with the root, it is still possible to verify the blockchain without sorting through all of the data. Verification processes are therefore simplified because you follow a branch that links a transaction to the block it was originally time-stamped in. The complete transactions themselves do not have to be stored, as they take up a lot of space. Instead, the check is to ensure that the previous network node accepted the transaction. This process constantly affirms reliability. It also demands that subsequent blocks are added to the chain in the same way.

How is this done? The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw is the case for hashes, the Merkle Root is altered if a single detail in any of the transactions is altered. Using a Merkle tree’s root makes testing whether a specific transaction is included in the set much more efficient than going through all of the transactions in the block.

By repeatedly hashing pairs of nodes you create a Merkle tree. This is repeated until there is only one hash left (this hash is called the Root Hash, or the Merkle Root). Each leaf node is a hash of transactional data, and each non-leaf node is a hash of its two child hashes. Merkle trees are binary and therefore require an even number of leaf nodes. If the number of transactions is odd, the last hash is duplicated once to create an even number of leaf nodes.
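The pairing procedure can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's exact encoding: it uses double SHA-256 for every node (an assumption mirroring Bitcoin), treats transactions as raw byte strings, and duplicates the last hash whenever any level has an odd number of entries.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Double SHA-256, used here for both leaves and inner nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [h(tx) for tx in transactions]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        # Hash each consecutive pair to build the next level up.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx-1", b"tx-2", b"tx-3"])
print(root.hex())
```

Changing a single byte in any transaction produces a different root, which is what lets the block header vouch for every transaction beneath it.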

I will borrow an example from Shaan Ray. Here are four transactions in a block: A, B, C, and D. Each of these is hashed, and the hash is stored in a leaf node. The result is Hash A, Hash B, Hash C, and Hash D. Consecutive pairs of leaf nodes are then summarized in a parent node by hashing Hash A and Hash B, resulting in Hash AB, and separately hashing Hash C and Hash D, resulting in Hash CD. The two hashes (Hash AB and Hash CD) are then hashed again to produce the Root Hash (the Merkle Root).
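Here is the four-transaction example written out step by step, followed by the check that makes Merkle trees worthwhile: proving that transaction C is in the block using only Hash D and Hash AB. The byte strings standing in for the transactions, the use of double SHA-256, and the `verify_inclusion` helper are illustrative assumptions of mine.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Double SHA-256 (an assumption mirroring Bitcoin's choice)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# Leaf nodes: one hash per transaction.
hash_a, hash_b, hash_c, hash_d = (h(tx) for tx in (b"A", b"B", b"C", b"D"))

# Parent nodes, then the root.
hash_ab = h(hash_a + hash_b)
hash_cd = h(hash_c + hash_d)
root = h(hash_ab + hash_cd)


def verify_inclusion(tx: bytes, proof, expected_root: bytes) -> bool:
    """Rebuild the path from a transaction up to the root using sibling hashes."""
    current = h(tx)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == expected_root


# To prove C is included, only two sibling hashes are needed.
proof_for_c = [(hash_d, False), (hash_ab, True)]
print(verify_inclusion(b"C", proof_for_c, root))
```

With four transactions the saving is small, but the proof grows with the height of the tree rather than the number of transactions, so a block with thousands of transactions needs only a dozen or so hashes per check.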

Conclusion and Review

For some, cryptocurrencies may seem too ephemeral to trust, but the basic idea of currency like Bitcoin relies on typical monetary practices of a fiat system. In fact, similar digital monetary practices already exist. Many transactions and bank balances rely on data and not the physical presence of hard currency (like gold). A crucial difference in the application of blockchain in terms of cryptocurrencies is that typically an exchange of currency requires a third party as guarantor; like a bank or credit card company. However, the application of blockchain technology in cryptocurrencies is disrupting the need for a third party, as well as making non-cash peer-to-peer transactions more secure and desirable.

Here is what you should take away from this article, What is Blockchain Hashing?:

Blockchain technology

  • Blockchain as a public ledger: Transactions that use the decentralized network are hashed and added to the public record. The participating networks maintain and approve the record. The blockchain ledger is essential in order to maintain the currency’s validity as a fiat currency, and it makes the reliability of decentralized cryptocurrencies possible.
  • Adding blocks to the blockchain: To add to the blockchain, miners mine for the target hash. This relies on data from the block header, which contains a version number, a timestamp, the hash of the previous block, and the nonce.
  • Proof-of-work: The proof-of-work is produced when the target hash is solved. This is related to SHA-256 and the current level of difficulty for solving it.
  • Nonce: The nonce is a “number only used once”; in a Bitcoin block it is a 32-bit (4-byte) field. Miners adjust the value of the nonce so that the hash of the block will be less than or equal to the current target of the network.

Hashing

  • Hash algorithms are computational functions. The input data is condensed to a fixed size. The result is the output called a hash, or a hash value. Hashes identify, compare or run calculations against files and strings of data. To add to an existing blockchain, a miner must first solve for the target hash before the network accepts the new block of data.
  • Hashes are deterministic and pre-image resistant. Deterministic: a particular set of input data will always produce the same result, which makes it possible to keep track of transactions. Pre-image resistant: it is nearly impossible to recreate the input from the output data.

SHA-256:

  • This produces a 256-bit hash. The output space is so large that there are too many possible outcomes to compare hashes against in an attempt to solve backward. So if one wanted to solve for a target hash, they would need to begin with a random input, hash it, and then test the result against the target hash; this would take a nearly incalculable number of attempts.
  • Collision Resistance: SHA-256 is collision-resistant because of the enormous number of possible outputs, so finding two different inputs that arrive at the same hash is nearly impossible. This is also a result of the output having high min-entropy.

Merkle Trees and Merkle Roots:

  • The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw with hashes, if a single detail in any of the transactions is altered, so is the Merkle Root. A Merkle Tree is efficient because checking a single transaction against the root is much cheaper than going through every transaction.