
searching for transactions by address with a very large number of transactions is slow

Open vae520283995 opened this issue 4 years ago • 6 comments

This is the GraphQL query:

query getAddressTransactions($address: String!) {
      transactions(
        limit: 100
        where: {
          _or:[
            { inputs: { address: { _eq: $address } } }
            { outputs: { address: { _eq: $address } } }
          ]
        }
        order_by:{
          includedAt:desc
        }
      ) {
        block {
          number
        }
        hash
        fee
        includedAt
        inputs {
          address
          value
        }
        outputs {
          address
          value
        }
      }
}
{ 
 "address":"DdzFFzCqrhstmqBkaU98vdHu6PdqjqotmgudToWYEeRmQKDrn4cAgGv9EZKtu1DevLrMA1pdVazufUCK4zhFkUcQZ5Gm88mVHnrwmXvT"
}

It takes more than 10 seconds to return results. Is there any other way to get the transactions of an address?

vae520283995 avatar Sep 23 '20 09:09 vae520283995

It's proportional to the number of transactions, and that address has 75483. We're aware of the issue, and will be looking to improve the performance in this area.
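For large histories, keyset-style pagination can help: instead of fetching everything in one go, each page is bounded below the last block number seen on the previous page. A sketch only; whether the generated schema exposes `_lt` on the nested `block.number` comparison is an assumption, and `$beforeBlock` is a hypothetical variable:

```graphql
query getAddressTransactionsPage($address: String!, $beforeBlock: Int!) {
  transactions(
    limit: 100
    where: {
      # bound the scan: only blocks older than the last page's lowest block
      block: { number: { _lt: $beforeBlock } }
      _or: [
        { inputs: { address: { _eq: $address } } }
        { outputs: { address: { _eq: $address } } }
      ]
    }
    order_by: { includedAt: desc }
  ) {
    block { number }
    hash
    includedAt
  }
}
```

The caller then feeds the smallest `block.number` from each page back in as the next `$beforeBlock`.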

rhyslbw avatar Oct 06 '20 11:10 rhyslbw

We're aware of the issue, and will be looking to improve the performance in this area.

Are there any updates on this? By the way, fetching the latest 10 transactions of an account using order_by: { block: { number: desc } } along with the address summary also increases the query time from 5 seconds up to 30 seconds, even when the number of transactions is low (approx. 10-30).

Query example
query getAddressSummary($address: String!) {
   caTxNum: transactions_aggregate(
    where: {
      _or: [
        { inputs: { address: { _eq: $address } } }
        { outputs: { address: { _eq: $address } } }
      ]
    }
  ) {
    aggregate {
      sum { fee }
      count # caTxNum
    }
  }
  caTotalInput: transactions_aggregate(
    where: { inputs: { address: { _eq: $address } } }
  ) {
    aggregate {
      sum {
        totalOutput # caTotalInput
      }
    }
  }
  caTotalOutput: transactions_aggregate(
    where: { outputs: { address: { _eq: $address } } }
  ) {
    aggregate {
      sum {
        totalOutput # caTotalOutput
      }
    }
  }
  caBalance: utxos_aggregate(where: { address: { _eq: $address } }) {
    aggregate {
      sum {
        value # caBalance
      }
    }
  }
  caTxList: transactions(
    limit: 10
    order_by: { block: { number: desc } }
    where: {
      _or: [
        { inputs: { address: { _eq: $address } } }
        { outputs: { address: { _eq: $address } } }
      ]
    }
  ) {
    hash # ctbId
    includedAt # ctbTimeIssued
    fee # ctbFees
    inputs {
      # ctbInputs
      address # ctaAddress
      value # ctaAmount
      sourceTxHash # ctaTxHash
      sourceTxIndex # ctaTxIndex
    }
    outputs {
      address # ctaAddress
      value # ctaAmount
      txHash # ctaTxHash
      index
    }
    inputs_aggregate {
      aggregate {
        sum {
          value # ctbInputSum
        }
      }
    }
    outputs_aggregate {
      aggregate {
        sum {
          value # ctbOutputSum
        }
      }
    }
  }
}

alexqrid avatar Feb 18 '21 13:02 alexqrid

That makes sense, as it still has to fetch all the transactions first so it can order them, even if the DB has an index applied.

A tip: make sure the machine you're using has high IOPS. E.g., I tried using AWS and the IOPS was so low that most queries timed out, and the price to reserve enough IOPS for it to be usable wasn't worth it.
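One thing worth trying (a sketch; not guaranteed to change the generated SQL plan) is ordering on includedAt, a column of the transactions table itself, rather than on the nested block relationship, which may spare Hasura a join just for ordering:

```graphql
query getLatestAddressTransactions($address: String!) {
  transactions(
    limit: 10
    order_by: { includedAt: desc } # column on transactions itself, not the nested block
    where: {
      _or: [
        { inputs: { address: { _eq: $address } } }
        { outputs: { address: { _eq: $address } } }
      ]
    }
  ) {
    hash
    includedAt
  }
}
```

Since includedAt is the block time, the result order should match ordering by block number.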

CyberCyclone avatar Feb 18 '21 19:02 CyberCyclone

Also, that query is very complex. I'd split it up into separate statements and call them separately. It's easy enough for it to get choked if you ask it to do too much at once.
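For instance, the balance aggregate can be issued as its own request, so a slow aggregate doesn't hold up the transaction list (same fields and alias as the combined query above):

```graphql
query getAddressBalance($address: String!) {
  caBalance: utxos_aggregate(where: { address: { _eq: $address } }) {
    aggregate {
      sum {
        value # caBalance
      }
    }
  }
}
```

The other aliased blocks (caTxNum, caTotalInput, caTotalOutput, caTxList) can each be pulled out into a query of their own the same way and run in parallel.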

CyberCyclone avatar Feb 18 '21 19:02 CyberCyclone

In my case there aren't a lot of transactions on my addresses (around 1000), but querying for utxos_aggregate and transactions still takes a lot of time.

In the pg exporter I can see that the queries return millions of rows. It seems suboptimal to return that much data from Postgres and then process it in Hasura.
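To confirm where the time goes, one can run the address lookup directly against the underlying database with EXPLAIN ANALYZE. This is a sketch: the table and column names (tx_out.address) are assumptions based on the cardano-db-sync schema and may differ between versions, and '<address>' is a placeholder:

```sql
-- Hypothetical check: does the address filter use an index, or a sequential scan?
-- tx_out / address are assumed cardano-db-sync names; adjust to your schema version.
EXPLAIN ANALYZE
SELECT tx_id
FROM tx_out
WHERE address = '<address>';

-- If the plan shows a sequential scan, an index on the address column may help:
CREATE INDEX IF NOT EXISTS idx_tx_out_address ON tx_out (address);
```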

APshenkin avatar Jun 07 '21 09:06 APshenkin

Hi, nowadays we get 10-second queries even for addresses with 0 transactions:

'{"query": "{ transactions ( limit: 25 where: { _or: [ { inputs: { address: { _eq: \"addr1q88teuwqrs04n6h68l5t0s54tgn2r0yr7hu5cslpzg75ejxmctnagnrgtk7kwwhn0x6xkrnk9sh2w6lqkf38etlqrzzqaf4qdf\" } } } { outputs: { address: { _eq: \"addr1q88teuwqrs04n6h68l5t0s54tgn2r0yr7hu5cslpzg75ejxmctnagnrgtk7kwwhn0x6xkrnk9sh2w6lqkf38etlqrzzqaf4qdf\" } } } ] }) { block { hash number } } }" }'

dansentivany avatar Sep 26 '22 13:09 dansentivany