dbi icon indicating copy to clipboard operation
dbi copied to clipboard

DBD::Sponge corrected and lazy default PRECISION

Open pilcrow opened this issue 11 years ago • 11 comments

DBD::Sponge defaults statement handle PRECISION to the length of column NAMEs, which means that a result set's PRECISION can be nonsensically smaller than would actually be required for the data:

$sponge->prepare("sponges",
                 { NAME =>   ['phylum',   'class'],                # PRECISION defaults to [6,5]
                   rows => [ ['Porifera', 'Hexactinellida'], ] }); # but data would need at least [8,14]

This request changes that default to the columnar maximum lengths of the data, computed only if needed. Test cases and small POD update. (Noticed this playing with yet another DBI::Shell alternative.)

pilcrow avatar Jun 15 '14 14:06 pilcrow

sub _max_columnar_lengths {
  my ($numFields, $rows) = @_;
  my @precision = map { max map { length ($_||"") } @$_ } @$rows;
  return wantarray ? @precision : \@precision;
}

Nope. That gives us the max data length in each row, one per row. We want the column-wise max data length. (Also see second commit — I renamed sub, and just want to make sure you are looking at final form of this PR.)

pilcrow avatar Jun 15 '14 15:06 pilcrow

Hi @pilcrow would you please be so kind and re-submit a rebased PR with tests and commented changes? For me it's difficult to guess where the patch stands as I'm not that familiar with meta-data in DBD's (I usually transfer those jobs to @Tux or @mjegh)

rehsack avatar Dec 09 '14 09:12 rehsack

Wilco.

On Tuesday, December 9, 2014, Jens Rehsack [email protected] wrote:

Hi @pilcrow https://github.com/pilcrow would you please be so kind and re-submit a rebased PR with tests and commented changes? For me it's difficult to guess where the patch stands as I'm not that familiar with meta-data in DBD's (I usually transfer those jobs to @Tux https://github.com/Tux or @mjegh https://github.com/mjegh)

— Reply to this email directly or view it on GitHub https://github.com/perl5-dbi/dbi/pull/12#issuecomment-66253282.

pilcrow avatar Dec 09 '14 12:12 pilcrow

@pilcrow - I still miss the comments in changes and tests. @Tux or @mjegh - is the PR clear or do you need more information? Can you please place feedback requests here?

rehsack avatar May 05 '15 12:05 rehsack

@rehsack, I updated the branch against current DBI and added some explanatory comments to the tests and changes. (I also documented DBD-Sponge's default TYPE.)

@Tux, @mjegh, it's been a while on this PR, but the basic issue is that a Spongey data set incorrectly defaults PRECISION to the length of the name of fields rather than the length of the data. To correct this, if the user does not specify PRECISION but later requests that attribute, we loop through the rows and remember the longest length for each column. The .t file covers this, user-supplied PRECISION, and a few other DBD-Sponge basics.

pilcrow avatar May 10 '15 14:05 pilcrow

I agree the current behaviour is broken. I'm unsure about this fix though.

The DBI docs for PRECISION say:

For numeric columns, the value is the maximum number of digits (without considering a sign character or decimal point). Note that the "display size" for floating point types (REAL, FLOAT, DOUBLE) can be up to 7 characters greater than the precision (for the sign + decimal point + the letter E + a sign + 2 or 3 digits).

For any character type column the value is the OCTET_LENGTH, in other words the number of bytes, not characters.

(More recent standards refer to this as COLUMN_SIZE but we stick with PRECISION for backwards compatibility.)

So I see two problems:

  • This fix uses the length in characters not bytes. That's easy to fix.
  • This fix applies the character type semantics to all columns and that would be wrong for any numeric columns. This could be fixed by checking the TYPE of the column and using different logic if it's numeric.

timbunce avatar Jul 18 '15 12:07 timbunce

OK I'll revisit.

-Mike

On Sat, Jul 18, 2015 at 7:04 AM, Tim Bunce [email protected] wrote:

I agree the current behaviour is broken. I'm unsure about this fix though.

The DBI docs for PRECISION https://metacpan.org/pod/DBI#PRECISION say:

For numeric columns, the value is the maximum number of digits (without considering a sign character or decimal point). Note that the "display size" for floating point types (REAL, FLOAT, DOUBLE) can be up to 7 characters greater than the precision (for the sign + decimal point + the letter E + a sign + 2 or 3 digits).

For any character type column the value is the OCTET_LENGTH, in other words the number of bytes, not characters.

(More recent standards refer to this as COLUMN_SIZE but we stick with PRECISION for backwards compatibility.)

So I see two problems:

  • This fix uses the length in characters not bytes. That's easy to fix.
  • This fix applies the character type semantics to all columns and that would be wrong for any numeric columns. This could be fixed by checking the TYPE of the column and using different logic if it's numeric.

— Reply to this email directly or view it on GitHub https://github.com/perl5-dbi/dbi/pull/12#issuecomment-122535032.

pilcrow avatar Jul 21 '15 12:07 pilcrow

Thanks @pilcrow!

timbunce avatar Jul 22 '15 15:07 timbunce

Any news on this @pilcrow ?

timbunce avatar Apr 21 '16 14:04 timbunce