DBD-mysql
support for RowCacheSize? [rt.cpan.org #102919]
Migrated from rt.cpan.org#102919 (status was 'open')
From [email protected] on 2015-03-20 21:06:37:
$dbh->{RowCacheSize} always returns undef, even after I explicitly set it. This indicates that it is not supported. Is it possible to add support?
When I call ->execute on a prepared SELECT, the entire result set is slurped into memory immediately, even if I only want to iterate over the results one row at a time (e.g. via fetchrow_arrayref). For large query results, this consumes all available memory, which kills the process. The RowCacheSize attribute can make this better: https://metacpan.org/pod/DBI#RowCacheSize
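To make the request concrete, here is a minimal sketch of what I would like to work; the DSN, credentials, and the big_table name are placeholders. Today the attribute reads back as undef and execute still buffers everything:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Placeholder DSN, credentials, and table name.
    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
        { RaiseError => 1 });

    # Per the DBI docs: cap the local row cache so a huge SELECT is
    # fetched in chunks instead of being slurped into memory at once.
    $dbh->{RowCacheSize} = 1000;
    print defined $dbh->{RowCacheSize}
        ? "RowCacheSize = $dbh->{RowCacheSize}\n"
        : "RowCacheSize is undef (not supported by this driver)\n";

    my $sth = $dbh->prepare('SELECT id, payload FROM big_table');
    $sth->execute;   # with DBD::mysql this currently buffers the whole result
    while (my $row = $sth->fetchrow_arrayref) {
        # process one row at a time
    }
    $dbh->disconnect;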
From [email protected] on 2015-03-22 20:20:41:
Hi,
Thanks for your request. Did you use this feature in another DBD driver by any chance?
Would you require all options? These are the ones defined in DBI:
0 - Automatically determine a reasonable cache size for each C<SELECT>
1 - Disable the local row cache
>1 - Cache this many rows
<0 - Cache as many rows that will fit into this much memory for each C<SELECT>.
--
Mike
From [email protected] on 2015-03-23 17:47:48:
On 2015-03-22 13:20:41, MICHIELB wrote:
> Hi,
>
> Thanks for your request. Did you use this feature in another DBD
> driver by any chance?
>
> Would you require all options? These are the ones defined in DBI:
>
> 0 - Automatically determine a reasonable cache size for each C<SELECT>
> 1 - Disable the local row cache
> >1 - Cache this many rows
> <0 - Cache as many rows that will fit into this much memory for each
> C<SELECT>.
The variant that I would use is <0: e.g. $dbh->{RowCacheSize} = -128e6; # buffer size = 128 MB
BTW, this went by on irc #dbi over the weekend, as a possible workaround to running out of memory during large queries:
06:59 <@timbunce> ether: re DBD::mysql memory usage on big results, see the mysql_use_result attribute.
06:59 < Sno> as said - timbunce knows almost everything about databases :)
07:00 <@timbunce> Being in this game a looong time helps. Well, until the rate I forget things exceeds the rate I learn them :)
07:02 < Sno> :)
07:04 <@timbunce> Oldest reference I can see is from 2003 :) http://perl.markmail.org/search/?q=mysql_use_result#query:mysql_use_result%20from%3A%22Tim%20Bunce%22%20order%3Adate-forward+page:1+mid:3xf7ix6bjrmqjnvs+state:results
07:08 <@timbunce> ether: just setting mysql_use_result will mean any locks are held longer while the results are streamed. If that's a problem you can try SQL_BUFFER_RESULT in the SQL
http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_sql_buffer_result
07:10 <@timbunce> (Claim to fame... I asked Monty, the mysql author, to add the SQL_BUFFER_RESULT feature for just this use-case.)
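For anyone else who hits this, a rough sketch of the mysql_use_result workaround Tim describes above (the connection details and table name are made up). mysql_use_result is a documented DBD::mysql attribute and can be set per statement:

    use strict;
    use warnings;
    use DBI;

    # Placeholder DSN, credentials, and table name.
    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
        { RaiseError => 1 });

    # mysql_use_result streams rows from the server (mysql_use_result())
    # instead of buffering the whole result set client-side
    # (mysql_store_result()). SQL_BUFFER_RESULT asks the server to copy
    # the result to a temporary table so locks are released sooner.
    my $sth = $dbh->prepare(
        'SELECT SQL_BUFFER_RESULT id, payload FROM big_table',
        { mysql_use_result => 1 },
    );
    $sth->execute;
    while (my $row = $sth->fetchrow_arrayref) {
        # rows arrive one at a time instead of all at once
    }
    $sth->finish;
    $dbh->disconnect;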
From [email protected] on 2015-03-23 20:56:27:
On 2015-03-23 13:47:48, ETHER wrote:
> On 2015-03-22 13:20:41, MICHIELB wrote:
> > Hi,
> >
> > Thanks for your request. Did you use this feature in another DBD
> > driver by any chance?
> >
> > Would you require all options? These are the ones defined in DBI:
> >
> > 0 - Automatically determine a reasonable cache size for each
> > C<SELECT>
> > 1 - Disable the local row cache
> > >1 - Cache this many rows
> > <0 - Cache as many rows that will fit into this much memory for each
> > C<SELECT>.
>
> The variant that I would use is <0: e.g. $dbh->{RowCacheSize} =
> -128e6; # buffer size = 128 MB
>
> BTW, this went by on irc #dbi over the weekend, as a possible
> workaround to running out of memory during large queries:
>
> 06:59 <@timbunce> ether: re DBD::mysql memory usage on big results,
> see the mysql_use_result attribute.
> 06:59 < Sno> as said - timbunce knows almost everything about
> databases :)
> 07:00 <@timbunce> Being in this game a looong time helps. Well, until
> the rate I forget things exceeds the rate I learn them :)
> 07:02 < Sno> :)
> 07:04 <@timbunce> Oldest reference I can see is from 2003 :)
> http://perl.markmail.org/search/?q=mysql_use_result#query:mysql_use_result%20from%3A%22Tim%20Bunce%22%20order%3Adate-
> forward+page:1+mid:3xf7ix6bjrmqjnvs+state:results
> 07:08 <@timbunce> ether: just setting mysql_use_result will mean any
> locks are held longer while the results are streamed. If that's a
> problem you can try SQL_BUFFER_RESULT in the SQL
> http://dev.mysql.com/doc/refman/5.7/en/server-
> system-variables.html#sysvar_sql_buffer_result
> 07:10 <@timbunce> (Claim to fame... I asked Monty, the mysql author,
> to add the SQL_BUFFER_RESULT feature for just this use-case.)
mysql_use_result is a possible workaround, but you're just moving the memory problem from the client to the server, so you have to make sure that your MySQL server has enough memory.
Another possible workaround is to use the MySQL HANDLER syntax <http://dev.mysql.com/doc/refman/5.0/en/handler.html>. This would be a cursor-like solution without any memory consumption, but it only works if you just want to scan over a table (possibly using a simple filter) and you don't need a consistent snapshot of the table (changes would immediately be reflected while scanning through the table).
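A hedged sketch of what that HANDLER approach could look like from DBI (the table name, batch size, and connection details are invented; note again that there is no consistent snapshot):

    use strict;
    use warnings;
    use DBI;

    # Placeholder DSN, credentials, and table name.
    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
        { RaiseError => 1 });

    # Open a low-level handler on the table and scan it in small batches.
    $dbh->do('HANDLER big_table OPEN');

    my $batch = $dbh->selectall_arrayref('HANDLER big_table READ FIRST LIMIT 1000');
    while ($batch && @$batch) {
        for my $row (@$batch) {
            # process one row; concurrent changes may become visible mid-scan
        }
        $batch = $dbh->selectall_arrayref('HANDLER big_table READ NEXT LIMIT 1000');
    }

    $dbh->do('HANDLER big_table CLOSE');
    $dbh->disconnect;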
I guess no movement has been made on this as yet?
I have a script that pulls ~1.2M rows from a DB at first run, then essentially 'tails' new entries from the DB using a prepared statement. This might run for a day or so - but it seems all results stay in RAM within the Perl script.
As such, I've noticed the script using ~1.8 GB of RAM after a while - even when it only spits out ~2-3 rows an hour.
I've reworked the script to create a new prepared statement on each loop iteration to hopefully work around this problem, but yeah - it has been a little annoying.
EDIT: Looks like creating a new statement handle with $dbh->prepare(blah) each time doesn't help. The script still takes up ~1.8 GB of RAM...
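In case it helps, a rough sketch of how the tailing loop could be written with the mysql_use_result workaround from earlier in this ticket (the table, column names, poll interval, and connection details are all invented):

    use strict;
    use warnings;
    use DBI;

    # Placeholder DSN, credentials, table and column names.
    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
        { RaiseError => 1 });

    # Stream rows instead of buffering them in the statement handle.
    my $sth = $dbh->prepare(
        'SELECT id, message FROM log WHERE id > ? ORDER BY id',
        { mysql_use_result => 1 },
    );

    my $last_id = 0;
    while (1) {
        $sth->execute($last_id);
        while (my ($id, $message) = $sth->fetchrow_array) {
            print "$id: $message\n";
            $last_id = $id;
        }
        $sth->finish;   # release the streamed result before sleeping
        sleep 60;       # poll interval, adjust to taste
    }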