Subsquid Testnet Indexers: Resolving Common Issues

When completing 'Quests' in the Subsquid Network testnet, developers dive into the intricate realm of indexing blockchain data, a task that demands precision and ingenuity. Yet, this journey is not without its challenges. From decoding complex blockchain structures to optimizing data retrieval, testnet developers often encounter stumbling blocks that test their skills.

In this article, we explore the most common issues faced by developers participating in Subsquid tech quests, and offer some straightforward solutions.

Issue #1: Not completing all the prerequisites

One of the most common issues is not having all of the requirements installed before starting the quests, which can result in all sorts of errors. So, before starting a quest, please install the following tools:

    1. Install Node v16.x or newer https://nodejs.org/en/download
    2. Install Docker https://docs.docker.com/engine/install/
    3. Install git https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
    4. Install Squid CLI: npm i -g @subsquid/cli

Issue #2: Data mismatch

Another common issue arises when the data returned by a query to the GraphQL endpoint doesn't match the corresponding subgraph. There can be multiple reasons for this, the most common being incorrectly tracked logs or transactions. We recommend consulting subgraph.yaml and verifying that the data sources tracked in processor.ts match it.
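
For example, if the subgraph manifest declares a callHandler for approve(address,uint256) on a contract, the processor should track the matching transactions. Below is a minimal sketch, reusing the contract address from the next section and assuming a generated ABI module at src/abi/erc20.ts:

src/processor.ts

import {EvmBatchProcessor} from '@subsquid/evm-processor'
import * as erc20abi from './abi/erc20' // generated ABI module; the path may differ

export const processor = new EvmBatchProcessor()
  .addTransaction({
    // must match dataSources[].source.address in subgraph.yaml
    to: ['0xdac17f958d2ee523a2206206994597c13d831ec7'],
    // must match the function listed under callHandlers in subgraph.yaml,
    // e.g. approve(address,uint256)
    sighash: [erc20abi.functions.approve.sighash]
  })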

Issue #3: Event decoding errors

Depending on the version of the SDK used, extra data may be provided during RPC ingestion, which can result in decoding errors. To resolve this, a processor configuration like this

src/processor.ts

import {EvmBatchProcessor} from '@subsquid/evm-processor'
import * as erc20abi from './abi/erc20' // generated ABI module; the path may differ

export const processor = new EvmBatchProcessor()
  .addLog({
    address: ['0xdac17f958d2ee523a2206206994597c13d831ec7'],
    topic0: [erc20abi.events.Transfer.topic]
  })

must always be matched with a filter like this

src/main.ts

// ... imports of processor, erc20abi and TypeormDatabase omitted
processor.run(new TypeormDatabase(), async (ctx) => {
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      // ⌄⌄⌄ this filter ⌄⌄⌄
      if (log.address === '0xdac17f958d2ee523a2206206994597c13d831ec7' &&
          log.topics[0] === erc20abi.events.Transfer.topic) {
        // ...
      }
    }
  }
})

Additionally, depending on the RPC endpoint, the squid could get rate limited:

will pause new requests for 20000ms {"rpcUrl":"https://rpc.ankr.com/eth",
"reason" : "HttpError: got 429 from https://rpc.ankr.com/eth"}

To resolve this, switch the RPC or, in case of Subsquid Cloud deployment, enable the RPC proxy. https://docs.subsquid.io/deploy-squid/rpc-proxy/#processor-configuration
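
On a recent SDK version, the proxied endpoint can be passed to the processor through the environment variable that Subsquid Cloud injects for the add-on (shown here as RPC_ETH_HTTP; the exact name depends on the squid.yaml configuration). A rough sketch:

src/processor.ts

import {EvmBatchProcessor} from '@subsquid/evm-processor'

export const processor = new EvmBatchProcessor()
  // RPC_ETH_HTTP is injected by Subsquid Cloud when the RPC proxy add-on is enabled;
  // the fallback URL is only used for local runs
  .setRpcEndpoint(process.env.RPC_ETH_HTTP ?? 'https://rpc.ankr.com/eth')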

Issue #4: Not using batch upserts

To minimize database operations, we recommend batch upserts.

Avoid loading or persisting single entities unless strictly necessary. For example, here is a possible antipattern based on the Gravatar squid example:

processor.run(new TypeormDatabase(), async (ctx) => {
  for (const c of ctx.blocks) {
    for (const log of c.logs) {
      // making sure that we process only the relevant logs
      if (log.address !== GRAVATAR_CONTRACT ||
          (log.topics[0] !== events.NewGravatar.topic &&
           log.topics[0] !== events.UpdatedGravatar.topic)) continue
      const { id, owner, displayName, imageUrl } = extractData(log)
      // ANTIPATTERN!!!
      // Doing an upsert per event drastically decreases the indexing speed
      await ctx.store.save(Gravatar, new Gravatar({
        id: id.toHexString(),
        owner: decodeHex(owner),
        displayName,
        imageUrl
      }))
    }
  }
})

Instead, use an in-memory cache, and batch upserts:

processor.run(new TypeormDatabase(), async (ctx) => {
  const gravatars: Map<string, Gravatar> = new Map();
  for (const c of ctx.blocks) {
    for (const log of c.logs) {
      if (log.address !== GRAVATAR_CONTRACT ||
          (log.topics[0] !== events.NewGravatar.topic &&
           log.topics[0] !== events.UpdatedGravatar.topic)) continue
      const { id, owner, displayName, imageUrl } = extractData(log)
      gravatars.set(id.toHexString(), new Gravatar({
        id: id.toHexString(),
        owner: decodeHex(owner),
        displayName,
        imageUrl
      }))
    }
  }
  await ctx.store.save([...gravatars.values()])
})

Batch-based processing can be used as a drop-in replacement for the handler-based mappings employed by, e.g., subgraphs. While handler-based processing is significantly slower due to excessive database lookups and writes, it may be a good intermediary step while migrating an existing subgraph to the Squid SDK.

One can simply re-use the existing handlers while looping over the ctx items:

processor.run(new TypeormDatabase(), async (ctx) => {
  for (const c of ctx.blocks) {
    for (const log of c.logs) {
      switch (log.topics[0]) {
        case abi.events.FooEvent.topic:
          await handleFooEvent(ctx, log)
          continue
        case abi.events.BarEvent.topic:
          await handleBarEvent(ctx, log)
          continue
        default:
          continue
      }
    }
    for (const txn of c.transactions) {
      // 0x + 4 bytes
      const sighash = txn.input.slice(0, 10)
      switch (sighash) {
        case '0xa9059cbb': // transfer(address,uint256) sighash
          await handleTransferTx(ctx, txn)
          continue
        case abi.functions.approve.sighash:
          await handleApproveTx(ctx, txn)
          continue
        default:
          continue
      }
    }
  }
})

When using batch upserts, the array of entities can sometimes end up containing entities with duplicate ids, and upserting it will result in a database error. One way to resolve this is to use a Map instead of an array, which guarantees that only entities with unique ids are saved.

let transfers: Map<string, Transfer> = new Map();
// ...
await ctx.store.upsert([...transfers.values()]);
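
Putting it together, here is a sketch of the whole pattern (decodeTransfer is a hypothetical helper that builds a Transfer entity from a decoded log):

processor.run(new TypeormDatabase(), async (ctx) => {
  const transfers: Map<string, Transfer> = new Map()
  for (const block of ctx.blocks) {
    for (const log of block.logs) {
      const t = decodeTransfer(log)
      // if the same id shows up twice within the batch, the later entity
      // simply replaces the earlier one, so the upsert below never sees duplicates
      transfers.set(t.id, t)
    }
  }
  await ctx.store.upsert([...transfers.values()])
})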