Inside the Ethereum Virtual Machine: How Solidity Data Structures Are Stored in the EVM

In this article, I examine how arrays, structs and mappings are stored in the EVM. Before we dive in, let's talk about what storage looks like in the EVM. Understanding how EVM storage works is crucial for efficient smart contract development.

Storage in the EVM

Storage in the Ethereum Virtual Machine (EVM) is divided into slots, and every contract has its own storage. Theoretically, a contract can address up to 2^256 storage slots. This number (roughly 1.16 × 10^77) is so large that it's practically impossible to exhaust. Each slot is 32 bytes, i.e., 256 bits in size, and all things being equal (i.e., no packing), each variable occupies one slot. It is worth noting that writing a non-zero value to an empty slot costs roughly 20,000 gas, while subsequent updates to the same slot cost roughly 5,000 gas. The exact figures depend on the network upgrade in force and on whether the slot has already been accessed earlier in the same transaction (see EIP-2929).
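To make the idea of numbered slots concrete, here is a minimal sketch. The contract and function names are my own for illustration, and it assumes a Solidity 0.8.x compiler, where inline assembly exposes a state variable's slot as `.slot`.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration only.
contract SlotDemo {
    uint256 internal first = 1;   // declared first: slot 0
    uint256 internal second = 2;  // declared second: slot 1

    // Returns the slot number assigned to `second` and the raw
    // 32-byte word stored there, read directly with sload.
    function inspect() external view returns (uint256 slotOfSecond, bytes32 rawValue) {
        assembly {
            slotOfSecond := second.slot
            rawValue := sload(slotOfSecond)
        }
    }
}
```

Calling inspect() should report slot 1 and a raw word whose numeric value is 2, since each uint256 fills a whole slot on its own.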

Packing

Since writing to storage slots in the EVM consumes gas, there are techniques that can be used to reduce the amount of gas spent on storage. Packing is one such technique. Packing refers to the ability of several small variables (whose combined size doesn't exceed 32 bytes) to be "packed" together and occupy the same slot rather than occupying one slot each. This minimizes the number of slots required and therefore reduces the amount of gas consumed.

Packing in Solidity depends on declaration order, i.e., for two small variables to pack, they must be declared one after the other in the smart contract. For example, if a smart contract has two uint8 variables and one uint256 variable, the second uint8 must be declared immediately after the first for packing to occur. If the uint256 is declared in between the two uint8 variables, packing can no longer take place: the uint256 needs a full slot of its own, so each uint8 ends up in a separate slot and the contract uses three slots instead of two.
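The two orderings described above can be sketched as follows. The contract names and initial values are hypothetical, chosen only so that the packed bytes are easy to spot when the raw slot is read.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contracts for illustration only.
contract Packed {
    uint8 internal a = 1;     // slot 0, lowest-order byte
    uint8 internal b = 2;     // slot 0, next byte up
    uint256 internal c = 3;   // slot 1 (needs a full slot)

    // Raw contents of slot 0: both uint8 values share it.
    function rawSlot0() external view returns (bytes32 raw) {
        assembly {
            raw := sload(0)
        }
    }
}

contract NotPacked {
    uint8 internal a = 1;     // slot 0
    uint256 internal c = 3;   // slot 1, sits between the two uint8 values
    uint8 internal b = 2;     // slot 2, cannot share slot 0 any more

    // Raw contents of slot 2: `b` sits there alone.
    function rawSlot2() external view returns (bytes32 raw) {
        assembly {
            raw := sload(2)
        }
    }
}
```

In Packed, rawSlot0() should end in ...0201 (b in the second byte, a in the first), while NotPacked needs three slots for the same three variables.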

How Arrays are stored on the EVM

Arrays are ordered collections of elements of the same type. They can be either fixed-size or dynamic. How they are stored on the EVM depends on whether they are fixed-sized or dynamic.

  • Fixed-size arrays have a fixed number of elements that can’t change throughout the lifetime of the contract. All elements of a fixed-size array are stored in consecutive slots, right next to each other. How many slots they take up depends on the size of the elements: an array of 5 uint256 elements takes up 5 slots, one after the other. And just like stand-alone variables can be packed based on arrangement, elements in a fixed-size array are also packed if they are smaller than 32 bytes. For example, in the array below, all 32 one-byte elements would be stored in the same slot.

      uint8[32] public myFixedArray;

  • Dynamic arrays, however, can grow or shrink in size. You don't know how many elements they will have until the contract is running. To handle them, Solidity stores the length (how many elements are in the array) in a specific slot (the slot corresponding to the position of the array variable in the contract), and the actual data (i.e., the elements of the array) is stored in another location, starting at the slot given by the keccak256 hash of the array's own slot number. For example, let's consider the dynamic array below:

      uint8[] public myDynamicArray = [10, 20, 30];
    

    In this array, the array's own slot (say slot 0) would hold the number 3, which is the length of the array. The data starts at keccak256(0), where 0 is the slot number padded to 32 bytes. Because uint8 elements are only one byte each, they are packed just like in a fixed-size array: 10, 20 and 30 all sit together in that first data slot, and a new slot is only used after every 32 elements. For a uint256[] array, each element would instead take its own slot: keccak256(0), keccak256(0) + 1, keccak256(0) + 2, and so on.
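The dynamic-array layout can be checked with a small sketch like the one below. All names here are hypothetical. It assumes the uint8[] array is the first state variable (slot 0) and adds a uint256[] array in slot 1 purely for contrast.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration only.
contract DynamicArrayLayout {
    uint8[] internal small = [10, 20, 30];    // slot 0 holds the length (3)
    uint256[] internal wide = [10, 20, 30];   // slot 1 holds the length (3)

    // Slot where an array's data begins: keccak256 of its own slot
    // number, encoded as a 32-byte word.
    function dataStart(uint256 slot) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(slot)));
    }

    // Reads one raw 32-byte storage slot.
    function readSlot(uint256 slot) public view returns (bytes32 raw) {
        assembly {
            raw := sload(slot)
        }
    }

    // Length of `small`, read straight from slot 0.
    function lengthOfSmall() external view returns (uint256) {
        return uint256(readSlot(0));
    }

    // All three uint8 elements share the first data slot.
    function packedSmall() external view returns (bytes32) {
        return readSlot(dataStart(0));
    }

    // Each uint256 element gets its own slot after the start.
    function wideElement(uint256 i) external view returns (uint256) {
        return uint256(readSlot(dataStart(1) + i));
    }
}
```

Here packedSmall() should end in ...1e140a (30, 20, 10 in hexadecimal, with the first element in the lowest byte), while wideElement(0), wideElement(1) and wideElement(2) each read a separate slot.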

Struct Storage on the EVM

A struct in Solidity is a custom data type that allows you to group together different types of variables (usually related to a particular subject) under a single name. It’s loosely similar to a data-only class in Python or JavaScript, or to a struct in C, since it holds fields but has no methods of its own.

The variables inside a struct are stored in consecutive storage slots in the order in which they are declared in the struct, and a struct always starts in a new slot. Smaller data types (like uint8, bool, etc.) can be packed together into a single storage slot if they are declared next to each other and their total size is 32 bytes or less. Dynamic data types like string or bytes are handled differently. If the data is 31 bytes or shorter, it is stored directly in the member's own slot together with its length. If it is 32 bytes or longer, the slot only stores the length (encoded as length * 2 + 1), and the actual data is stored starting at the location returned by passing the slot number of that member to the keccak256 function. For example, suppose you have a struct like this:

struct MyStruct {
    uint32 id;
    uint32 number;
    string name;
}

Assuming a state variable of this type starts at slot 0, the id and number variables (4 bytes each) will pack and be stored in slot 0. The name, being a dynamic type (string), gets slot 1. If name is 31 bytes or shorter, slot 1 holds the string itself, with length * 2 in the lowest-order byte. If it is 32 bytes or longer, slot 1 holds length * 2 + 1 instead, and the string data is stored starting at the location determined by keccak256(1).
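Here is a minimal sketch for inspecting this layout. The contract, the item variable and the helper functions are hypothetical names of mine, and it assumes item is the only state variable, so the struct starts at slot 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration only.
contract StructLayout {
    struct MyStruct {
        uint32 id;
        uint32 number;
        string name;
    }

    MyStruct internal item; // id and number in slot 0, name in slot 1

    function set(uint32 id, uint32 number, string memory name) external {
        item = MyStruct(id, number, name);
    }

    // Reads one raw 32-byte storage slot.
    function rawSlot(uint256 slot) external view returns (bytes32 raw) {
        assembly {
            raw := sload(slot)
        }
    }

    // Where the characters of `name` begin once it is 32 bytes or longer.
    function longNameStart() external pure returns (uint256) {
        return uint256(keccak256(abi.encode(uint256(1))));
    }
}
```

After set(1, 2, "abc"), rawSlot(0) should end in ...0000000200000001 and rawSlot(1) should start with 616263 ("abc") and end in 06 (length 3 times 2). With a name of 32 bytes or more, rawSlot(1) holds only length * 2 + 1, and the text itself starts at the slot returned by longNameStart().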

Mappings and Their EVM Storage Pattern

Mappings are key-value pairs where keys are mapped to values. Unlike arrays, mappings are unordered and do not store their data in a contiguous block of storage. The slot assigned to the mapping variable itself stays empty. Each value is stored in a separate slot whose location is the keccak256 hash of the key (padded to 32 bytes for value types such as address or uint256) concatenated with the slot number of the mapping variable in the contract, in that order. Because of the way they are stored, mappings do not allow for iteration over keys: Solidity doesn’t keep track of which keys have been set, and it recomputes the location of a value from its key every time the value is accessed.
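The slot calculation for a mapping can be sketched like this. The names are hypothetical, and the placeholder variable is there only so that the mapping lands in slot 1 rather than slot 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration only.
contract MappingLayout {
    uint256 internal placeholder;                    // slot 0
    mapping(address => uint256) internal balances;   // slot 1 (stays empty)

    function set(address who, uint256 amount) external {
        balances[who] = amount;
    }

    // Slot of balances[who]: keccak256(key . mappingSlot), with both
    // parts encoded as 32-byte words, key first.
    function valueSlot(address who) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(who, uint256(1))));
    }

    // Reads the value back using only the computed slot.
    function readViaSlot(address who) external view returns (uint256 value) {
        uint256 slot = valueSlot(who);
        assembly {
            value := sload(slot)
        }
    }
}
```

After set(someAddress, 42), readViaSlot(someAddress) should return the same 42 that balances[someAddress] holds, which shows that nothing but the key and the mapping's slot number is needed to find a value.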

Conclusion

Since gas costs are closely tied to storage operations, reducing the gas consumed when interacting with blockchain state depends heavily on efficient storage management. Therefore, understanding how storage works on the Ethereum Virtual Machine is crucial in devising efficient gas optimization techniques, especially in an ecosystem like Ethereum where gas optimization is of utmost importance.

Written by

Abolare Roheemah