Guide to Ethereum Virtual Machines

‘Ethereum Virtual Machine’ (EVM) is a term that you are likely to come across when working with the Ethereum blockchain. It comes up especially often when developing smart contracts on it.

Ethereum is an open blockchain platform; one of the more notable ones aside from Bitcoin. It lets users build and use decentralized applications (DApps) that run on blockchain technology. Like Bitcoin, no one controls or owns it. It is an open-source project that many people around the world build. However, unlike the Bitcoin protocol, Ethereum is adaptable and flexible. Its overall design makes it very easy to create new applications on the platform.

At its core, Ethereum is a programmable blockchain that allows users to create their own operations of any complexity. It’s a platform for an array of different types of decentralized blockchain applications, including cryptocurrencies. In a more narrow sense, Ethereum refers to a series of protocols that define a platform for decentralized applications. The Ethereum Virtual Machine is at the heart of this.

This guide will provide a more in-depth explanation of what this machine is.

What is it?

A ‘virtual machine’ creates a layer of abstraction between two elements: the executing code and the machine executing it. This layer improves the software’s portability. It also ensures that applications are isolated from each other and from their host.

The EVM is described as a ‘quasi-Turing complete machine.’ Turing completeness is a property of a system of data manipulation rules, and it gets its name from Alan Turing, the creator of the Turing machine. This machine is a mathematical model of computation that describes an abstract machine manipulating symbols on a strip of tape in accordance with a table of rules. The model is simple, yet the machine is capable of simulating the logic of any computer algorithm.

Programming languages and central processing units (CPUs) are notable examples of systems of rules that access and alter data. If such rules are capable of simulating Turing’s hypothetical machine, then they are ‘Turing complete.’ A Turing complete system can be mathematically proven capable of performing any possible computation or computer program.

In other words, a Turing complete machine can, given enough time and memory, solve any computable problem you present to it. The EVM is, as mentioned before, only quasi-Turing complete. This is because the computations that the machine performs are bound by gas, which limits the total amount of computation that can be done.
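As a rough illustration of how gas bounds computation, the sketch below meters a loop that would otherwise never halt. The function name and the flat cost of 3 gas per step are assumptions made for this example, not actual EVM parameters.

```python
def run_forever_metered(gas_limit, cost_per_step=3):
    """Run a would-be infinite loop, charging gas for each step.

    Returns the number of steps completed before the gas ran out.
    The flat per-step cost is a simplification for illustration.
    """
    gas = gas_limit
    steps = 0
    while True:  # without metering, this loop would never halt
        if gas < cost_per_step:
            return steps  # out of gas: a real EVM would abort execution here
        gas -= cost_per_step
        steps += 1


print(run_forever_metered(10))
```

However the loop is written, the amount of work is capped by the gas supplied, which is what makes the machine only quasi-Turing complete.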

Smart contract creation

Smart contracts are typically written in “Solidity,” a language whose syntax resembles JavaScript and C++. Other smart contract languages include “Vyper” and “Bamboo.” The EVM cannot execute Solidity or similar languages directly. Instead, they are compiled into low-level machine instructions called ‘opcodes.’

Opcodes

The Ethereum Virtual Machine uses opcodes to carry out specific tasks. Together, these opcodes are what make the EVM Turing complete, apart from the gas limit. This means that the EVM can compute (almost) anything when it has enough resources. Each opcode is 1 byte, so there can be at most 256 opcodes. For the sake of simplicity, all opcodes can be divided into these types:

Stack-manipulating opcodes
Arithmetic/Comparison/Bitwise opcodes
Environmental opcodes
Memory-manipulating opcodes
Storage-manipulating opcodes
Program counter-related opcodes
Halting opcodes
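To make the categories concrete, here is a small lookup table with a few well-known opcodes per category. The table is only a sample, and the names OPCODE_EXAMPLES and category_of are made up for this sketch.

```python
# A few example opcodes per category (name -> byte value); not a full list.
OPCODE_EXAMPLES = {
    "Stack-manipulating": {"POP": 0x50, "PUSH1": 0x60, "DUP1": 0x80, "SWAP1": 0x90},
    "Arithmetic/Comparison/Bitwise": {"ADD": 0x01, "LT": 0x10, "AND": 0x16},
    "Environmental": {"CALLER": 0x33, "CALLVALUE": 0x34},
    "Memory-manipulating": {"MLOAD": 0x51, "MSTORE": 0x52},
    "Storage-manipulating": {"SLOAD": 0x54, "SSTORE": 0x55},
    "Program counter-related": {"JUMP": 0x56, "JUMPI": 0x57, "PC": 0x58},
    "Halting": {"STOP": 0x00, "RETURN": 0xF3, "REVERT": 0xFD, "SELFDESTRUCT": 0xFF},
}


def category_of(opcode):
    """Return the category of a sample opcode, or None if it is not in the table."""
    for category, opcodes in OPCODE_EXAMPLES.items():
        if opcode in opcodes.values():
            return category
    return None


print(category_of(0x55))
```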

Bytecodes

‘Bytecode’ is a compact way of storing opcodes: it is what opcodes are encoded to. Each opcode is assigned one byte. For instance, the byte for STOP is 0x00. As an illustration, we will use the following bytecode: 0x6001600101.

During execution, bytecode is divided into its bytes (1 byte = 2 hexadecimal characters). Bytes in the range 0x60-0x7f (PUSH1-PUSH32) are treated differently because they are followed by push data. This data belongs to the PUSH opcode, as opposed to being read as independent opcodes.
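The splitting rule can be sketched as a tiny disassembler. This is a minimal illustration that only names STOP and ADD; the function name disassemble and the NAMES table are assumptions for the example, and any other byte is printed as UNKNOWN.

```python
# Only two opcode names are included; a real table has many more.
NAMES = {0x00: "STOP", 0x01: "ADD"}


def disassemble(bytecode_hex):
    """Split bytecode into instructions, attaching push data to PUSH opcodes."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    instructions = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            size = op - 0x5F  # PUSH1 carries 1 byte, PUSH32 carries 32
            data = code[pc + 1 : pc + 1 + size]
            instructions.append("PUSH%d 0x%s" % (size, data.hex()))
            pc += 1 + size  # skip the push data so it is not read as opcodes
        else:
            instructions.append(NAMES.get(op, "UNKNOWN(0x%02x)" % op))
            pc += 1
    return instructions


print(disassemble("0x6001600101"))
```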

The first instruction is 0x60, which translates to PUSH1. We therefore know that the push data is 1 byte long, so we push the next byte (0x01) onto the stack. The stack now holds 1 item and we can move on to the following instruction. Since that 0x01 was part of the PUSH instruction, the next instruction to execute is another 0x60 (PUSH1) with the same data. At this point, the stack contains 2 identical items.

The last instruction is 0x01, which translates to ADD. This instruction takes 2 items from the stack and pushes their sum back onto the stack. The stack now contains one item: 0x02.
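The walkthrough can be reproduced with a toy interpreter. This sketch supports only STOP, ADD and PUSH1-PUSH32 and ignores gas entirely; the function name execute is an assumption for the example.

```python
def execute(bytecode_hex):
    """Execute a tiny subset of EVM bytecode and return the final stack."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == 0x00:  # STOP: halt execution
            break
        elif op == 0x01:  # ADD: pop two items, push their sum (mod 2**256)
            a = stack.pop()
            b = stack.pop()
            stack.append((a + b) % 2**256)
            pc += 1
        elif 0x60 <= op <= 0x7F:  # PUSH1-PUSH32: push the following bytes
            size = op - 0x5F
            stack.append(int.from_bytes(code[pc + 1 : pc + 1 + size], "big"))
            pc += 1 + size
        else:
            raise ValueError("opcode 0x%02x is not supported in this sketch" % op)
    return stack


print(execute("0x6001600101"))
```

Running it on 0x6001600101 leaves a single item with the value 2 on the stack, matching the steps described above.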

Stack & Storage

Many high-level programming languages let their users pass arguments directly to functions (function(argument1,argument2)). Lower-level languages, on the other hand, use a ‘stack’ to deliver values to functions. The EVM is a stack machine whose stack holds 256-bit words; only the 16 most recent items can be accessed or manipulated at once. Overall, the stack can hold at most 1,024 items.
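The two limits can be modelled with a small class. This is a sketch under simplifying assumptions: the class and method names are invented for the example, and only push, pop and a DUP-style copy are modelled.

```python
WORD_MODULUS = 2**256  # every stack item is a 256-bit word
MAX_DEPTH = 1024       # maximum number of items on the stack


class Stack:
    def __init__(self):
        self.items = []

    def push(self, value):
        if len(self.items) >= MAX_DEPTH:
            raise OverflowError("stack limit of 1,024 items reached")
        self.items.append(value % WORD_MODULUS)

    def pop(self):
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()

    def dup(self, n):
        """Copy the n-th item from the top (1 to 16) onto the top, like DUPn."""
        if not 1 <= n <= 16:
            raise ValueError("only the 16 most recent items are reachable")
        if n > len(self.items):
            raise IndexError("stack underflow")
        self.push(self.items[-n])


stack = Stack()
stack.push(5)
stack.push(7)
stack.dup(2)  # copies the 5 onto the top
print(stack.items)
```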

Because of these limitations, complex opcodes use ‘contract memory’ to retrieve or pass data. Memory, however, is not persistent: when the execution of the contract concludes, the memory contents are not saved. The stack is comparable to function arguments, whereas memory is comparable to declared variables.

The only way to store data permanently and make it accessible to future contract executions is to use ‘storage.’ Contract storage is, in essence, a public database whose values can be read externally without sending a transaction to the contract, and such reads incur no fees. Writing to storage, however, is far more expensive than writing to memory.

Smart contract interaction costs

Contract executions are run by everyone operating an Ethereum node. Because of this, an attacker may attempt to create contracts containing many computationally expensive operations in order to slow down the network. To prevent such attacks, every opcode has its own base gas cost. Moreover, various complex opcodes also charge a dynamic gas cost.

Let’s use the opcode KECCAK256 (formerly SHA3) as an example. The base cost of this opcode is 30 gas and its dynamic cost is 6 gas per 32-byte word of input. Computationally expensive instructions charge a much higher gas fee than simpler instructions. What’s more, every transaction has a base cost of 21,000 gas.
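Using the figures above, the cost of a KECCAK256 call can be worked out as follows. The function name is an assumption for this sketch, and the calculation leaves out any memory expansion cost the EVM may charge on top.

```python
KECCAK256_BASE_GAS = 30     # base cost, from the text above
KECCAK256_GAS_PER_WORD = 6  # dynamic cost per 32-byte word of input


def keccak256_gas(input_length_bytes):
    """Base plus dynamic gas for hashing input of the given length."""
    words = (input_length_bytes + 31) // 32  # round up to whole words
    return KECCAK256_BASE_GAS + KECCAK256_GAS_PER_WORD * words


for length in (0, 32, 33, 64):
    print(length, keccak256_gas(length))
```

Hashing 33 bytes costs as much as hashing 64 bytes, because the input is charged per started word.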

Gas can be refunded when carrying out instructions that reduce the state size. Under the original rules, setting a storage value from non-zero to zero refunds 15,000 gas, and removing a contract entirely (using the SELFDESTRUCT opcode) refunds 24,000 gas. Refunds are only applied once contract execution has completed, so contracts are unable to pay for themselves. In addition, a refund cannot exceed half the gas that the transaction uses. The London upgrade (EIP-3529) later reduced the storage refund to 4,800 gas, removed the SELFDESTRUCT refund, and lowered the cap to one fifth of the gas used.
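The refund cap can be expressed in a few lines. This sketch uses a divisor of 2 to match the "half the gas used" rule described above; the function and parameter names are assumptions, and the divisor is a parameter because later protocol upgrades (EIP-3529) changed it to 5.

```python
def capped_refund(gas_used, refund_counter, max_refund_quotient=2):
    """Return the refund actually granted at the end of execution.

    The accumulated refund is capped at gas_used // max_refund_quotient.
    """
    return min(refund_counter, gas_used // max_refund_quotient)


# Clearing one storage slot (15,000 gas refund under the original rules):
print(capped_refund(40_000, 15_000))  # cap is 20,000, so the full refund applies
print(capped_refund(20_000, 15_000))  # cap is 10,000, so the refund is reduced
```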

Smart contract deployment

A smart contract is deployed by creating an ordinary transaction without a ‘to’ address. Bytecode is included as the input data. This bytecode serves as a ‘constructor’, which writes initial variables to storage before copying the ‘runtime bytecode’ to the contract’s code. The creation bytecode runs only once, during deployment, whereas the runtime bytecode runs on every contract call.

The deployment bytecode of a compiled contract can be split into three fragments, as in this example:

Constructor: 60806040526001600055348015601457600080fd5b5060358060226000396000f3fe
Runtime: 6080604052600080fdfe
Metadata: A165627a7a723058204e048d6cab20eb0d9f95671510277b55a61a582250e04db7f6587a1bebc134d20029

At the end of the bytecode, Solidity appends a ‘Swarm hash’ of a metadata file. Swarm is a distributed storage platform and content distribution service. Put simply, it is decentralized file storage. Even though the Swarm hash is part of the runtime bytecode, the Ethereum Virtual Machine will never interpret it as opcodes, because its location can never be reached during execution. Solidity commonly uses this layout:

0xa1 0x65 ‘b’ ‘z’ ‘z’ ‘r’ ‘0’ 0x58 0x20 [32 bytes swarm hash] 0x00 0x29

In this particular case, we are able to derive the Swarm hash below:

4e048d6cab20eb0d9f95671510277b55a61a582250e04db7f6587a1bebc134d2
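The layout above can be parsed mechanically. The sketch below reads the final two bytes as the length of the metadata, checks for the 0xa1 0x65 ‘bzzr0’ 0x58 0x20 prefix, and returns the 32 bytes that follow. The function name is an assumption for this example, and it only handles this specific layout.

```python
# 0xa1 0x65 'b' 'z' 'z' 'r' '0' 0x58 0x20, as in the layout above
BZZR0_PREFIX = bytes.fromhex("a165627a7a72305820")


def extract_swarm_hash(bytecode_hex):
    """Return the Swarm hash (hex) from bytecode ending in the metadata layout."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    metadata_length = int.from_bytes(code[-2:], "big")  # trailing 0x00 0x29 = 41
    if len(code) < metadata_length + 2:
        raise ValueError("bytecode is shorter than its declared metadata")
    metadata = code[-2 - metadata_length : -2]
    if not metadata.startswith(BZZR0_PREFIX):
        raise ValueError("metadata does not use the bzzr0 layout")
    start = len(BZZR0_PREFIX)
    return metadata[start : start + 32].hex()


# Runtime and metadata fragments from the example above, concatenated.
runtime = "6080604052600080fdfe"
metadata = ("A165627a7a723058204e048d6cab20eb0d9f95671510277b55"
            "a61a582250e04db7f6587a1bebc134d20029")
print(extract_swarm_hash(runtime + metadata))
```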

The metadata file contains various information about the contract, such as the compiler version and the contract’s functions. Unfortunately, this is an experimental feature, and not many contracts have publicly uploaded their metadata to the Swarm network.

Decompiling bytecodes

A good number of projects have created tools that try to make bytecode easier to read, including services that help decompile contracts deployed on mainnet.

Sadly, several pieces of the original contract source, such as the names of functions and events, are typically lost because of the optimization the compiler performs. Despite this, many function names can be recovered by brute force: function signatures are matched against extensive datasets containing the names of popular functions and events.

Calling a contract usually requires an ABI (Application Binary Interface), a piece of data that documents all of the contract’s functions and events, including the inputs and outputs they require. When calling a contract’s function, the function signature is determined by hashing the function’s name together with its input types (using KECCAK256) and truncating the result to its first four bytes.

For example, the function HelloWorld() resolves to the signature hash 0x7fffb7bd. If we want to call this function, our transaction data has to begin with 0x7fffb7bd. Any arguments that need to be passed to the function (there are none in this case) are added in 32-byte pieces called ‘words’, which follow the signature hash in the input data of the transaction.
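Assembling such input data can be sketched as follows. The signature hash is taken as given from the text rather than computed, only unsigned integer arguments are handled, and the function name encode_call is an assumption for this example.

```python
def encode_call(signature_hash_hex, args):
    """Build transaction input data: 4-byte signature hash plus 32-byte words."""
    if signature_hash_hex.startswith("0x"):
        signature_hash_hex = signature_hash_hex[2:]
    signature_hash = bytes.fromhex(signature_hash_hex)
    if len(signature_hash) != 4:
        raise ValueError("the signature hash must be exactly 4 bytes")
    # Each unsigned integer argument is left-padded to one 32-byte word.
    words = b"".join(arg.to_bytes(32, "big") for arg in args)
    return "0x" + (signature_hash + words).hex()


print(encode_call("0x7fffb7bd", []))  # HelloWorld() takes no arguments
```

With no arguments the input data is just the four-byte signature hash; each additional argument appends 64 hexadecimal characters.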

If an argument holds more than 32 bytes (256 bits) of data, such as an array, it is split into multiple words. These words are appended to the input data after all the other arguments. In addition, the size of the array is added as another word preceding the array words. At the position where the argument would normally appear, the start position of the array words (beginning with that size word) is written instead.
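The placement rules can be illustrated with a simplified encoder. This sketch assumes that every argument is either a single unsigned integer or a list of unsigned integers; the function names are invented for the example, and other argument types are not handled.

```python
def word(value):
    """Encode an unsigned integer as one 32-byte word."""
    return value.to_bytes(32, "big")


def encode_args(args):
    """Encode arguments as words, moving list arguments behind the others."""
    head_size = 32 * len(args)  # one word per argument position
    head = b""
    tail = b""
    for arg in args:
        if isinstance(arg, list):
            # Write the start position (in bytes) of the array words instead.
            head += word(head_size + len(tail))
            # The size word comes first, then one word per element.
            tail += word(len(arg)) + b"".join(word(item) for item in arg)
        else:
            head += word(arg)
    return head + tail


encoded = encode_args([7, [1, 2]])
for i in range(0, len(encoded), 32):
    print(encoded[i : i + 32].hex())
```

For the arguments 7 and [1, 2], the output is five words: the value 7, the start position 0x40, the size 2, and the two elements.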

Conclusion

The Ethereum Virtual Machine is an important component in the development of smart contracts. It is therefore important to understand it if you wish to take part in this field, if not to build and deploy contracts, then to properly comprehend the more technical side of them.

If you're interested in learning more about Ethereum, there's a book called 'Ethereum For Dummies' that I'd recommend you check out, and it's available on Amazon here - https://amzn.to/4gMQtHK
