
support for RowCacheSize? [rt.cpan.org #102919]

Open mbeijen opened this issue 8 years ago • 1 comment

Migrated from rt.cpan.org#102919 (status was 'open')

Requestors:

From [email protected] on 2015-03-20 21:06:37:

$dbh->{RowCacheSize} always returns undef, even after I explicitly set it. This indicates that it is not supported.  Is it possible to add support?

When I call ->execute on a prepared SELECT, the entire result is slurped into memory immediately, even if I desire to iterate over the results one row at a time (e.g. via fetchrow_arrayref).  For large query results, this results in all available memory being consumed, which kills the process.  The RowCacheSize attribute can make this better: https://metacpan.org/pod/DBI#RowCacheSize
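To make the report concrete, the pattern being described looks roughly like this (a minimal sketch; the DSN, credentials, table, and column names are placeholders):

    use strict;
    use warnings;
    use DBI;

    # placeholder DSN and credentials
    my $dbh = DBI->connect('dbi:mysql:database=test;host=localhost',
                           'user', 'password', { RaiseError => 1 });

    $dbh->{RowCacheSize} = 1000;                  # ask for a bounded row cache
    print $dbh->{RowCacheSize} // 'undef', "\n";  # prints "undef": not supported

    # By default DBD::mysql calls mysql_store_result(), so execute()
    # buffers the entire result set client-side before the first fetch:
    my $sth = $dbh->prepare('SELECT id, payload FROM big_table');
    $sth->execute;    # all rows are already in client memory here
    while (my $row = $sth->fetchrow_arrayref) {
        # iterating one row at a time no longer saves any memory
    }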

From [email protected] on 2015-03-22 20:20:41:

Hi,

Thanks for your request. Did you use this feature in another DBD driver by any chance?

Would you require all options? These are the ones defined in DBI:

 0 - Automatically determine a reasonable cache size for each C<SELECT>
 1 - Disable the local row cache
>1 - Cache this many rows
<0 - Cache as many rows that will fit into this much memory for each C<SELECT>.

--
Mike
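In code, the four settings that DBI defines would look like this (hypothetical: per the report above, DBD::mysql currently ignores the attribute entirely):

    $dbh->{RowCacheSize} = 0;       # let the driver pick a reasonable cache size
    $dbh->{RowCacheSize} = 1;       # disable the local row cache
    $dbh->{RowCacheSize} = 500;     # >1: cache this many rows per SELECT
    $dbh->{RowCacheSize} = -128e6;  # <0: cache as many rows as fit in ~128 MB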

From [email protected] on 2015-03-23 17:47:48:

On 2015-03-22 13:20:41, MICHIELB wrote:
> Hi,
> 
> Thanks for your request. Did you use this feature in another DBD
> driver by any chance?
> 
> Would you require all options? These are the ones defined in DBI:
> 
> 0 - Automatically determine a reasonable cache size for each C<SELECT>
> 1 - Disable the local row cache
> >1 - Cache this many rows
> <0 - Cache as many rows that will fit into this much memory for each C<SELECT>.

The variant that I would use is <0:  e.g.  $dbh->{RowCacheSize} = -128e6;  # buffer size = 128 MB

BTW, this went by on irc #dbi over the weekend, as a possible workaround to running out of memory during large queries:

06:59 <@timbunce> ether: re DBD::mysql memory usage on big results, see the mysql_use_result attribute.
06:59 < Sno> as said - timbunce knows almost everything about databases :)
07:00 <@timbunce> Being in this game a looong time helps. Well, until the rate I forget things exceeds the rate I learn them :)
07:02 < Sno> :)
07:04 <@timbunce> Oldest reference I can see is from 2003 :) http://perl.markmail.org/search/?q=mysql_use_result#query:mysql_use_result%20from%3A%22Tim%20Bunce%22%20order%3Adate-forward+page:1+mid:3xf7ix6bjrmqjnvs+state:results
07:08 <@timbunce> ether: just setting mysql_use_result will mean any locks are held longer while the results are streamed. If that's a problem you can try SQL_BUFFER_RESULT in the SQL 
                  http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_sql_buffer_result
07:10 <@timbunce> (Claim to fame... I asked Monty, the mysql author, to add the SQL_BUFFER_RESULT feature for just this use-case.)
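For readers finding this thread later, the two IRC suggestions combine like this in DBD::mysql (a sketch; the table and column names are invented):

    # Stream rows with mysql_use_result() instead of buffering them all
    # client-side; the optional SQL_BUFFER_RESULT modifier asks the server
    # to buffer the result so table locks are released sooner:
    my $sth = $dbh->prepare(
        'SELECT SQL_BUFFER_RESULT id, payload FROM big_table',
        { mysql_use_result => 1 },
    );
    $sth->execute;
    while (my $row = $sth->fetchrow_arrayref) {
        # each fetch pulls one row over the wire; client memory stays flat
    }

The trade-off, as noted above, is that streaming holds locks for the duration of the fetch loop unless SQL_BUFFER_RESULT is used.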

From [email protected] on 2015-03-23 20:56:27:

On 2015-03-23 13:47:48, ETHER wrote:
> [...]
> 06:59 <@timbunce> ether: re DBD::mysql memory usage on big results,
> see the mysql_use_result attribute.

mysql_use_result is a possible workaround, but it just moves the memory problem from the client to the server, so you have to make sure that your MySQL server has enough memory.

Another possible workaround is the MySQL HANDLER syntax <http://dev.mysql.com/doc/refman/5.0/en/handler.html>. This is a cursor-like solution that avoids buffering the result set, but it only works if you just want to scan over a table (possibly with a simple filter) and you don't need a consistent snapshot of the table (changes are immediately reflected while scanning).
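A hedged sketch of that HANDLER approach through DBI (the table name, index, and batch size are illustrative):

    # Scan big_table in batches via its primary key; only one batch
    # is ever held in client memory.
    $dbh->do('HANDLER big_table OPEN');
    my $batch = $dbh->selectall_arrayref(
        'HANDLER big_table READ `PRIMARY` FIRST LIMIT 1000');
    while ($batch && @$batch) {
        for my $row (@$batch) {
            # ... process one row ...
        }
        $batch = $dbh->selectall_arrayref(
            'HANDLER big_table READ `PRIMARY` NEXT LIMIT 1000');
    }
    $dbh->do('HANDLER big_table CLOSE');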


mbeijen · Nov 15 '17 07:11

I guess no movement has been made on this as yet?

I have a script that pulls ~1.2M rows from a DB at first run, then essentially 'tails' new entries from the DB using a prepared statement. This might run for a day or so, but it seems all results stay in RAM within the Perl script.

As such, I've noticed the script using ~1.8 GB of RAM after a while, even when it only spits out ~2-3 rows an hour.

I've reworked the script to create a new prepared statement on each loop to hopefully work around this problem, but yeah, it has been a little annoying.

EDIT: Looks like creating a new statement handle with $dbh->prepare(blah) each loop doesn't help. The script still takes up ~1.8 GB of RAM...

CRCinAU · Oct 10 '18 01:10
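For what it's worth, applying the mysql_use_result suggestion from earlier in this thread to that kind of tailing loop might look like the following (untested sketch; the table, columns, and polling interval are invented):

    my $last_id = 0;
    while (1) {
        my $sth = $dbh->prepare(
            'SELECT id, line FROM log WHERE id > ? ORDER BY id',
            { mysql_use_result => 1 },   # stream; never buffer the result client-side
        );
        $sth->execute($last_id);
        while (my $row = $sth->fetchrow_arrayref) {
            $last_id = $row->[0];
            print "$row->[1]\n";
        }
        $sth->finish;
        sleep 5;    # poll for new rows
    }

Note that Perl rarely returns freed heap memory to the OS, so a process that once buffered 1.2M rows tends to keep its high-water resident size; streaming avoids allocating that memory in the first place.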