What is The Graph and why should you care?

The Graph is a protocol for indexing and querying blockchain data. It lets you use your smart contract's events to create and update entities that represent the internal state of that contract.

Why is this useful? Imagine a smart contract that requires intensive lookups. An NFT marketplace is a good example: you potentially have thousands of active listings at the same time, and a buyer needs a listing ID to buy an item from another user. How do you identify the specific listing you are interested in? How do you find it? From a front-end point of view, how do you get all listings and filter them by seller, or by any other criteria?

If you are familiar with Solidity, you know you can do some of these things, but most of them are impractical or outright impossible once you reach a certain scale.

Let's see: if you use mappings to index a Listing struct, you need to know the key that identifies a given listing beforehand so you can track it in your front end. Even if you build a composite key (mapping-ception) from, say, the user address and the token ID, how do you get all the listings for a given user? You can't iterate over a mapping, and trying every possible key is not practical.

"I will use an array then." Fine, but you will eventually run out of gas just reading your arrays if your marketplace is even marginally successful.

With The Graph, you can create a Listing entity that gets updated any time your users interact with a given listing. You can create, update, and delete any particular Listing instance just by telling The Graph what effect your smart contract's events should have on that entity. Using GraphQL, you can fetch any of those Listing entities, filter them to display the appropriate data, or even fetch info that you can use as input for a new smart contract function call.
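To make the front-end side more concrete, here is a minimal sketch of how such a filtered query could be built in TypeScript. It assumes a Listing entity with seller, nft, tokenId, and price fields, like the one defined later in this article, and relies on the plural query field (listings) with where, orderBy, and first arguments that The Graph generates for each entity. The helper name buildListingsBySellerRequest is made up for illustration.

```typescript
// Shape of the JSON body that a GraphQL endpoint expects.
interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

// Hypothetical helper: builds a request for all listings of one seller,
// cheapest first. Field names assume the Listing entity used in this article.
function buildListingsBySellerRequest(seller: string, first: number = 100): GraphQLRequest {
  const query = `
    query ListingsBySeller($seller: Bytes!, $first: Int!) {
      listings(where: { seller: $seller }, orderBy: price, orderDirection: asc, first: $first) {
        id
        nft
        tokenId
        price
      }
    }`;
  // Normalize the address to lowercase hex so client-side comparisons
  // against returned values stay consistent.
  return { query, variables: { seller: seller.toLowerCase(), first } };
}

const exampleRequest = buildListingsBySellerRequest("0xAbC0000000000000000000000000000000000001", 10);
console.log(JSON.stringify(exampleRequest.variables));
```

The resulting object is what you would send as the JSON body of a POST request to your subgraph's query URL.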

"I could do that with a regular database, just storing anything that goes through my front end" I hear you saying. You are right, but there are some advantages to using this approach:

You don't depend on your front end. You already know that not all transactions need to (or will) go through your front end, as anyone can interact directly with your smart contracts.
Less surface area for mistakes. All the subgraph information comes from the events emitted by the smart contract. We could say that the data comes from a primary source.
Decentralization. This is still a work in progress. If you deploy your subgraph to The Graph Network (or even to the Hosted Service), your users don't need to trust that you won't manipulate the data in your favor. They still need to trust that you will keep paying the network fees so the data remains available, but that's a different problem. If you don't want to go to the trouble of deploying to The Graph Network, other services offer more centralized solutions.

Ok, I'm convinced. How do I start?

We will deploy our first Subgraph to the Subgraph Studio for testing.

You will need a wallet to connect to Subgraph Studio and an already deployed contract to bootstrap your subgraph. It's probably best to design your subgraph while on testnet, as you will likely want to change your smart contract's events while creating the subgraph.

Set up your Subgraph Studio/Hosted Service account

You will need to connect your wallet (Subgraph Studio) or your GitHub account (Hosted Service) and then create a subgraph through the UI. This is important because the subgraph name must be the same in your local setup so you can deploy it later.

Graph CLI

Then, you will need to install the Graph CLI.

# NPM
$ npm install -g @graphprotocol/graph-cli

# Yarn
$ yarn global add @graphprotocol/graph-cli

Subgraph bootstrapping

Run graph init to start bootstrapping your subgraph. The CLI will walk you through the following prompts:

Choose your protocol. No matter which specific chain your contracts are deployed on, if you are on an EVM chain, choose Ethereum.
Product to initialize. subgraph-studio in our case.
Subgraph slug. You should have set this when creating a new subgraph in the Studio two steps ago.
Directory to create subgraph in (self-explanatory).
Ethereum network. Here is where you choose the network where your contract is deployed.
Contract Address. If you have more than one contract address, just choose one. You can add the rest later.
Fetch ABI and start block. If your contract is verified on Etherscan, the CLI should be able to fetch the ABI file and start block from there. Otherwise, you will need to provide a path from which the CLI can load the ABI.
Contract Name. Same as your contract name. Please don't skip this, or else all your relevant files and handlers will be named Contract, and that's annoying.
Index contract events as entities? This is an example of a Good/Bad Default. We will talk more about this later. Choose YES for now.
After some automatic code generation, you will be asked if you want to add another contract. Do so if you need to, but if it fails, don't fret: you can use graph add <CONTRACT_ADDRESS> later to add new contracts.

Code away! Design your subgraph

For our example marketplace subgraph, we will assume that we only deal with 2 entities: ItemSold and Listing. ItemSold will help us keep track of all successful sales, while we will use Listing to keep track of all marketplace listings throughout their lifecycle.

Remember when we bootstrapped the project and said yes to indexing all contract events as entities? This is useful to get a feeling for how these entities look according to your events, data types, etc. Now is the moment to actually design a data structure (the entities) that makes sense for your application. In our case, if we take Listing, it makes much more sense to create a single entity and modify it whenever a specific instance changes, rather than just keeping a record of all the events that happen, disconnected from each other.
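The difference between a log of disconnected events and a single entity that tracks current state can be sketched in plain TypeScript. This is not subgraph code: the event shapes and the applyEvents function below are simplified stand-ins invented for illustration, with prices as plain numbers instead of uint256 values.

```typescript
// Simplified stand-ins for the marketplace events (assumed shapes, illustration only).
type MarketEvent =
  | { kind: "created"; listingId: string; seller: string; price: number }
  | { kind: "updated"; listingId: string; price: number }
  | { kind: "cancelled"; listingId: string };

interface ListingState {
  seller: string;
  price: number;
}

// Replays events in order and keeps only the latest state of each listing,
// keyed by listingId. This is the role the event handlers play in a subgraph.
function applyEvents(events: MarketEvent[]): Record<string, ListingState> {
  const listings: Record<string, ListingState> = {};
  for (const ev of events) {
    if (ev.kind === "created") {
      listings[ev.listingId] = { seller: ev.seller, price: ev.price };
    } else if (ev.kind === "updated") {
      const current = listings[ev.listingId];
      if (current !== undefined) {
        current.price = ev.price; // updates to unknown listings are ignored
      }
    } else {
      delete listings[ev.listingId];
    }
  }
  return listings;
}

const currentListings = applyEvents([
  { kind: "created", listingId: "1", seller: "0xaaa", price: 100 },
  { kind: "created", listingId: "2", seller: "0xbbb", price: 50 },
  { kind: "updated", listingId: "1", price: 80 },
  { kind: "cancelled", listingId: "2" },
]);
console.log(JSON.stringify(currentListings)); // {"1":{"seller":"0xaaa","price":80}}
```

Four events go in, but only one listing with its latest price comes out, which is exactly what a front end wants to query.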

schema.graphql

This is the file where you declare your entities, the ones that you want to track and query in your application. Any change to these entities will come from emitted events in your smart contracts.

Eliminate all the events-turned-entities and leave just ItemSold plus a new Listing entity that gathers all the useful information about a given listing (all the info contained in the Listing struct in ISimpleMarketplace.sol). I also keep the blockNumber, timestamp, and tx hash.

Every entity needs a unique ID. By default, the generated event handlers build one from the tx hash and the event's log index. If you only want to record all events, you can leave it like that. But if you want to modify an entity's state based on the events that affect it (like a Listing getting a new price), you need an ID that you can reference in every event that affects a given entity and use as an entry point to modify it.
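The two ID strategies can be compared with a small TypeScript sketch. Both function names are hypothetical, and the string formats are only one plausible way of building such IDs, not necessarily what your version of the CLI generates.

```typescript
// One ID per event: suits append-only records such as ItemSold.
// Combines the transaction hash with the log index so that two events
// emitted in the same transaction still get different IDs.
function eventEntityId(txHash: string, logIndex: number): string {
  return `${txHash}-${logIndex}`;
}

// One ID per listing: every event that carries the same listingId
// resolves to the same entity, so a handler can load and update it.
// (On-chain the ID is a uint256; a plain number is used here for brevity.)
function listingEntityId(listingId: number): string {
  return listingId.toString();
}

// A create and a later update of listing 7 happen in different transactions...
console.log(eventEntityId("0xaaa", 0), eventEntityId("0xbbb", 2));
// ...but both point at the same Listing entity.
console.log(listingEntityId(7), listingEntityId(7));
```

With the first strategy each event becomes its own row; with the second, all events about a listing converge on a single row that can be updated in place.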

This is when subgraph design and smart contract development should (ideally) work together. For example, each listing has a unique ID, a counter in the smart contract, and this ID is emitted in all the events that affect a listing.

//@dev from IBasicMarketplace.sol
// Emitted when a valid listing is created
event ListingCreated(uint256 listingId, address seller, address nft, uint256 tokenId, uint256 price, bool exist);
// Emitted when a valid listing is updated by the Seller
event ListingUpdated(uint256 listingId, address seller, address nft, uint256 tokenId, uint256 price);
// Emitted when a listing is cancelled
event ListingCancelled(uint256 listingId);

I will be using this listingId, which is emitted in every event, as the ID for the Listing entity. This is how I know that I am modifying the correct entity instance for each change in a listing's state. This is not clearly visible in the GraphQL schema, but it will become clear in the mapping file when we deal with the event handlers.

type ItemSold @entity(immutable: true) {
  id: ID!
  listingId: BigInt! # uint256
  nft: Bytes! # address
  seller: Bytes! # address
  buyer: Bytes! # address
  price: BigInt! # uint256
  blockNumber: BigInt!
  blockTimestamp: BigInt!
  transactionHash: Bytes!
}

type Listing @entity {
  id: ID!
  listingId: BigInt! # uint256
  seller: Bytes! # address
  nft: Bytes! # address
  tokenId: BigInt! # uint256
  price: BigInt! # uint256
  exist: Boolean! # bool
  blockNumber: BigInt!
  blockTimestamp: BigInt!
  transactionHash: Bytes!
}

your-contracts.ts (Mapping file)

In our case, this is simple-marketplace.ts. The docs often refer to it as the mapping file, a term I found confusing at first, but it actually makes sense.

This is the file that maps the events emitted by your smart contract to the entities that you declared in your schema. It is the backend business logic that handles the changes in your database.

The subgraph is constantly listening for events from our contract. Each time an event hits the subgraph, the appropriate event handler gets triggered. Here you can see clearly why it is important to design our smart contracts with subgraph design in mind, as we use the listingId as a handle to get the correct listing every time we receive an event. At any given point, each listing instance reflects the current state of the listing within our smart contract.

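import { store } from "@graphprotocol/graph-ts"
import {
  ItemSold as ItemSoldEvent,
  ListingCancelled as ListingCancelledEvent,
  ListingCreated as ListingCreatedEvent,
  ListingUpdated as ListingUpdatedEvent,
} from "../generated/SimpleMarketplace/SimpleMarketplace"
import { ItemSold, Listing } from "../generated/schema"

// Referenced by subgraph.yaml: records one immutable ItemSold entity per event.
// Assumption: the event parameter names match the schema field names.
export function handleItemSold(event: ItemSoldEvent): void {
  let entity = new ItemSold(
    event.transaction.hash.toHex() + "-" + event.logIndex.toString()
  )

  entity.listingId = event.params.listingId
  entity.nft = event.params.nft
  entity.seller = event.params.seller
  entity.buyer = event.params.buyer
  entity.price = event.params.price

  entity.blockNumber = event.block.number
  entity.blockTimestamp = event.block.timestamp
  entity.transactionHash = event.transaction.hash

  entity.save()
}

// Removes the Listing entity once the listing is cancelled on-chain.
export function handleListingCancelled(event: ListingCancelledEvent): void {
  // The listingId emitted by the contract doubles as the entity ID.
  let id = event.params.listingId.toString()
  let entity = Listing.load(id)

  if (entity !== null) {
    store.remove("Listing", id)
  }
}

// Creates a new Listing entity keyed by the contract's listingId.
export function handleListingCreated(event: ListingCreatedEvent): void {
  let entity = new Listing(event.params.listingId.toString())

  entity.listingId = event.params.listingId
  entity.seller = event.params.seller
  entity.nft = event.params.nft
  entity.tokenId = event.params.tokenId
  entity.price = event.params.price
  entity.exist = event.params.exist

  entity.blockNumber = event.block.number
  entity.blockTimestamp = event.block.timestamp
  entity.transactionHash = event.transaction.hash

  entity.save()
}

// Loads the existing Listing and overwrites it with the updated values.
// Events for unknown listings are ignored.
export function handleListingUpdated(event: ListingUpdatedEvent): void {
  let entity = Listing.load(event.params.listingId.toString())

  if (entity !== null) {
    entity.listingId = event.params.listingId
    entity.seller = event.params.seller
    entity.nft = event.params.nft
    entity.tokenId = event.params.tokenId
    entity.price = event.params.price

    entity.blockNumber = event.block.number
    entity.blockTimestamp = event.block.timestamp
    entity.transactionHash = event.transaction.hash

    entity.save()
  }
}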

subgraph.yaml

This file controls the deployment of the subgraph. Here we declare the contract's deployment address, the starting block for indexing, the network, the tracked events, their handlers... everything!

We will keep just 2 entities, ItemSold and Listing, and delete all the others. Regarding event handlers, we keep all of those already declared (we modified their original behavior in the mapping file) except for OwnershipTransferred, which is inherited from Ownable and is of no interest to us.

specVersion: 0.0.5
schema:
  file: ./schema.graphql
dataSources:
  - kind: ethereum
    name: SimpleMarketplace
    network: sepolia
    source:
      address: "0xf91C1bfb2dbAcbfBD39a171eF0Cd3E3b47893099"
      abi: SimpleMarketplace
      startBlock: 4233781
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.7
      language: wasm/assemblyscript
      entities:
        - ItemSold
        - Listing
      abis:
        - name: SimpleMarketplace
          file: ./abis/SimpleMarketplace.json
      eventHandlers:
        - event: ItemSold(uint256,address,address,address,uint256)
          handler: handleItemSold
        - event: ListingCancelled(uint256)
          handler: handleListingCancelled
        - event: ListingCreated(uint256,address,address,uint256,uint256,bool)
          handler: handleListingCreated
        - event: ListingUpdated(uint256,address,address,uint256,uint256)
          handler: handleListingUpdated
      file: ./src/simple-marketplace.ts

Build and deploy your subgraph

Once you finish with the previous step, how do you deploy your subgraph to Studio?

From your root directory, first compile the project by running:

graph codegen && graph build

You will almost certainly hit some compilation errors and need to adjust the code. (AssemblyScript is quite strict, and not all JS features are implemented, such as object spreading.)

Once these difficulties are overcome, run graph auth with the deploy key that you will find in the Subgraph Studio UI, on your newly created subgraph.

graph auth --studio <DEPLOY_KEY>

Then finally, deploy to your subgraph name.

graph deploy --studio <SUBGRAPH_NAME>

You will be asked to provide a version label. You can change your code or metadata and redeploy as many times as you want before publishing to the decentralized network.

That's it! You successfully deployed your first subgraph.
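Once the deployment has synced, you can query it over HTTP by POSTing a GraphQL query to the query URL that Subgraph Studio provides for your subgraph. GraphQL endpoints can answer with HTTP 200 and still report problems in an errors array, so it is worth unwrapping the response explicitly. The sketch below shows one way to do that in TypeScript; unwrapResponse and the ListingsData shape are illustrative names, not part of any Graph library.

```typescript
// Standard GraphQL response envelope: data, errors, or both.
interface GraphQLResponse<T> {
  data?: T;
  errors?: { message: string }[];
}

// Assumed result shape for a query that selects id and price from listings.
interface ListingsData {
  listings: { id: string; price: string }[];
}

// Returns the data, or throws with the messages reported by the endpoint.
function unwrapResponse<T>(response: GraphQLResponse<T>): T {
  if (response.errors !== undefined && response.errors.length > 0) {
    const messages = response.errors.map((e) => e.message).join("; ");
    throw new Error("Subgraph query failed: " + messages);
  }
  if (response.data === undefined) {
    throw new Error("Response contained neither data nor errors");
  }
  return response.data;
}

// Example with a hand-written response body (not real subgraph output).
const sample: GraphQLResponse<ListingsData> = {
  data: { listings: [{ id: "1", price: "80" }] },
};
console.log(unwrapResponse(sample).listings.length);
```

Note that BigInt fields such as price arrive as strings in the JSON response, so convert them before doing arithmetic on the client.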

Next steps: publish to the Graph Decentralized Network

Publishing to the decentralized network is as easy as clicking "Publish" in the Subgraph Studio UI.

You can publish to Ethereum Mainnet, Goerli, Arbitrum One, or Arbitrum Goerli, but that (supposedly) does not affect your indexing as long as the subgraph points to one of the supported networks.

It is recommended that you curate your own subgraph as soon as you publish it, so Indexers and other Curators can pick it up and start indexing it as well. The suggested starting amount is at least 10K GRT, which at the time of writing is close to 740 USD. For testing purposes, querying Subgraph Studio is more than enough.

Indexing and Curation are topics in themselves, and it's probably worth writing another article on them. So for now I'm leaving you the docs on Curators and Indexers.

Create your first Subgraph for your smart contract with The Graph
