data-api-client icon indicating copy to clipboard operation
data-api-client copied to clipboard

Dealing with the 1 MB limit

Open lastmjs opened this issue 3 years ago • 3 comments

Amazing library!

As you know, there is a 1 MB limit for Data API results. I'm just working through how to deal with this generically for my queries (I'm implementing a GraphQL API, so I'm constructing SQL queries generically from GraphQL queries). I need to somehow split up any query given to the Data API into chunks <= 1 MB in size.

My question is, how are people dealing with this? And would it be appropriate to somehow build this chunking into the data-api-client? Not having to deal with this limitation directly would be an amazing feat of abstraction for this library.

Thanks!

lastmjs avatar Sep 21 '20 19:09 lastmjs

I would love to see this abstracted as well! +1

ghost avatar Sep 23 '20 23:09 ghost

I just dealt with this by doing the following:

const queryDatabase = async (query, parameters = {}) => {
  try {
    const { records } = await db.query(query, parameters)
    return records
  } catch (error) {
    console.error(error.message)
    if (error.message.includes('Database returned more than the allowed response size limit')) {
      const getCount = async (_query) => {
        const query = _query.replace(/SELECT\s(.+?)\s(FROM|WHERE)/is, 'SELECT COUNT(*) $2')
        const {
          records: [{ count }],
        } = await db.query(query, parameters)
        return count
      }
      let result = []
      // Get result count
      const count = await getCount(query)
      // Define limit based on your average record size
      const limit = 50
      // Use limits and offsets to page through the query
      for (let page = 1; page <= Math.ceil(count / limit); page++) {
        const limitQuery = query.replace(/;?$/, `\nLIMIT ${limit} OFFSET ${(page - 1) * limit};`)
        const { records } = await db.query(limitQuery, parameters)
        result = [...result, ...records]
      }
      return result
    }
  }
}

It you're doing a batch query that passes parameters as nested arrays, then you will need to handle that for each page query.

gnestor avatar Dec 22 '20 21:12 gnestor

thanks @gnestor! one small note in case someone else ends up here:

my query contained a GROUP BY clause, which doesn't return the correct number of rows with the snippet above, so i changed the following:

const getCount = async (_query) => {
  // old: const query = _query.replace(/SELECT\s(.+?)\s(FROM|WHERE)/is, 'SELECT COUNT(*) $2')
  const query = `SELECT COUNT(*) FROM (${_query}) sub`; // wrap the original query and count the number of results
  const {
    records: [{ count }],
  } = await db.query(query, parameters)
  return count
}

other than that, it works great!

Lane avatar Aug 11 '21 16:08 Lane