For 32bit support in Nextcloud we are typing sizes as int|float as PHP uses float for bigint.

Would you consider merging this change?

Jan 24 '23 17:01 come-nc

Codecov Report

Merging #1444 (301ac98) into master (5684c75) will not change coverage. The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #1444   +/-   ##
=========================================
  Coverage     97.33%   97.33%           
  Complexity     2830     2830           
=========================================
  Files           176      176           
  Lines          9419     9419           
=========================================
  Hits           9168     9168           
  Misses          251      251

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

Jan 24 '23 17:01 codecov[bot]

When a file size gets bigger than PHP_INT_MAX (e.g. about 2GB on 32-bit systems) then the float representation is going to have some maximum precision. On 32-bit systems, does PHP actually use some 64-bit floating-point binary representation? (that would end up across 2 32-bit registers...) Or does it squeeze down into a 32-bit floating point representation like https://en.wikipedia.org/wiki/Single-precision_floating-point_format ?

And so as the file size gets bigger, it gradually becomes more granular. Something roughly like:

from 2GB to 4GB the size will effectively be to an even number of bytes (an odd number of bytes will not have an exact representation)
from 4GB to 8GB the representable sizes will be multiples of 4
from 8GB to 16GB the representable sizes will be multiples of 8
and so on

And so getSize() will return a number that, when "expanded" as a big int, may be a few bytes more or less than the real size.

How is that likely to impact on clients that want to:

read the whole file (maybe they do not care much, and just keep reading until EOF is reached)
read parts of the file (video players that "jump around" getting parts of the file (they might do something weird at the very end of playing the video)

Jan 30 '23 07:01 phil-davis

@come-nc do you also want/need to expand the return type of implementations such as https://github.com/sabre-io/dav/blob/master/lib/DAV/SimpleFile.php getSize() https://github.com/sabre-io/dav/blob/master/lib/DAV/FS/File.php getSize()

or do you just need IFile to declare the possibility of returning a float from getSize() ?

Jan 30 '23 07:01 phil-davis

may I ask about the use-case? I guess you need this more precise return type for static analysis tools or similar?

Jan 30 '23 08:01 staabm

When a file size gets bigger than PHP_INT_MAX (e.g. about 2GB on 32-bit systems) then the float representation is going to have some maximum precision. On 32-bit systems, does PHP actually use some 64-bit floating-point binary representation? (that would end up across 2 32-bit registers...) Or does it squeeze down into a 32-bit floating point representation like https://en.wikipedia.org/wiki/Single-precision_floating-point_format ?

I do not know the details on that, I only know what happens on PHP side of typing, sorry.

And so getSize() will return a number that, when "expanded" as a big int, may be a few bytes more or less than the real size.

How is that likely to impact on clients that want to:

read the whole file (maybe they do not care much, and just keep reading until EOF is reached)

read parts of the file (video players that "jump around" getting parts of the file (they might do something weird at the very end of playing the video)

This is already the case, getSize is returning a float for these files on 32bits, it’s just that static analysis believe it’s an int because of the docblock.

@come-nc do you also want/need to expand the return type of implementations such as https://github.com/sabre-io/dav/blob/master/lib/DAV/SimpleFile.php getSize() https://github.com/sabre-io/dav/blob/master/lib/DAV/FS/File.php getSize()

or do you just need IFile to declare the possibility of returning a float from getSize() ?

For Nextcloud we only use IFile from what I can see. But I think it’s better to align File and SimpleFile as well as long as sabre is not officially dropping support for 32bits, it should advertise that sizes can be floats.

may I ask about the use-case? I guess you need this more precise return type for static analysis tools or similar?

Yes. This change allow static analysis tools such as psalm to warn us about places where we are making the assumption that size is an int and that will break on 32bit installations. Then we can either fix this code to support float or add a check and error out when it’s a float and document a known limitation on 32bit.

For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.

Jan 30 '23 09:01 come-nc

For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.

one thing I can think of where this might get problematic would be e.g. a Content-Length header when sent as e.g. float and does therefore no longer match the actual bytes on the filesystem and therefore would result in a corrupt download.

but I guess these cases would get obvious when we have this more precise return type, right?

Jan 30 '23 10:01 staabm

For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.

one thing I can think of where this might get problematic would be e.g. a Content-Length header when sent as e.g. float and does therefore no longer match the actual bytes on the filesystem and therefore would result in a corrupt download.

but I guess these cases would get obvious when we have this more precise return type, right?

Not sure, the typing will tell you it’s a float but not that it’s imprecise. It all depends on how you are building the header I suppose.

Jan 30 '23 10:01 come-nc

dav
dav copied to clipboard

Allow float sizes for 32bit support

Codecov Report

dav dav copied to clipboard

Allow float sizes for 32bit support

Codecov Report

dav
dav copied to clipboard