dav
dav copied to clipboard
Allow float sizes for 32bit support
For 32bit support in Nextcloud we are typing sizes as int|float as PHP uses float for bigint.
Would you consider merging this change?
Codecov Report
Merging #1444 (301ac98) into master (5684c75) will not change coverage. The diff coverage is
n/a.
@@ Coverage Diff @@
## master #1444 +/- ##
=========================================
Coverage 97.33% 97.33%
Complexity 2830 2830
=========================================
Files 176 176
Lines 9419 9419
=========================================
Hits 9168 9168
Misses 251 251
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
When a file size gets bigger than PHP_INT_MAX (e.g. about 2GB on 32-bit systems) then the float representation is going to have some maximum precision. On 32-bit systems, does PHP actually use some 64-bit floating-point binary representation? (that would end up across 2 32-bit registers...) Or does it squeeze down into a 32-bit floating point representation like https://en.wikipedia.org/wiki/Single-precision_floating-point_format ?
And so as the file size gets bigger, it gradually becomes more granular. Something roughly like:
- from 2GB to 4GB the size will effectively be to an even number of bytes (an odd number of bytes will not have an exact representation)
- from 4GB to 8GB the representable sizes will be multiples of 4
- from 8GB to 16GB the representable sizes will be multiples of 8
- and so on
And so getSize() will return a number that, when "expanded" as a big int, may be a few bytes more or less than the real size.
How is that likely to impact on clients that want to:
- read the whole file (maybe they do not care much, and just keep reading until EOF is reached)
- read parts of the file (video players that "jump around" getting parts of the file (they might do something weird at the very end of playing the video)
@come-nc do you also want/need to expand the return type of implementations such as
https://github.com/sabre-io/dav/blob/master/lib/DAV/SimpleFile.php getSize()
https://github.com/sabre-io/dav/blob/master/lib/DAV/FS/File.php getSize()
or do you just need IFile to declare the possibility of returning a float from getSize() ?
may I ask about the use-case? I guess you need this more precise return type for static analysis tools or similar?
When a file size gets bigger than PHP_INT_MAX (e.g. about 2GB on 32-bit systems) then the
floatrepresentation is going to have some maximum precision. On 32-bit systems, does PHP actually use some 64-bit floating-point binary representation? (that would end up across 2 32-bit registers...) Or does it squeeze down into a 32-bit floating point representation like https://en.wikipedia.org/wiki/Single-precision_floating-point_format ?
I do not know the details on that, I only know what happens on PHP side of typing, sorry.
And so
getSize()will return a number that, when "expanded" as a big int, may be a few bytes more or less than the real size.How is that likely to impact on clients that want to:
- read the whole file (maybe they do not care much, and just keep reading until EOF is reached)
- read parts of the file (video players that "jump around" getting parts of the file (they might do something weird at the very end of playing the video)
This is already the case, getSize is returning a float for these files on 32bits, it’s just that static analysis believe it’s an int because of the docblock.
@come-nc do you also want/need to expand the return type of implementations such as https://github.com/sabre-io/dav/blob/master/lib/DAV/SimpleFile.php
getSize()https://github.com/sabre-io/dav/blob/master/lib/DAV/FS/File.phpgetSize()or do you just need
IFileto declare the possibility of returning a float fromgetSize()?
For Nextcloud we only use IFile from what I can see.
But I think it’s better to align File and SimpleFile as well as long as sabre is not officially dropping support for 32bits, it should advertise that sizes can be floats.
may I ask about the use-case? I guess you need this more precise return type for static analysis tools or similar?
Yes. This change allow static analysis tools such as psalm to warn us about places where we are making the assumption that size is an int and that will break on 32bit installations. Then we can either fix this code to support float or add a check and error out when it’s a float and document a known limitation on 32bit.
For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.
For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.
one thing I can think of where this might get problematic would be e.g. a Content-Length header when sent as e.g. float and does therefore no longer match the actual bytes on the filesystem and therefore would result in a corrupt download.
but I guess these cases would get obvious when we have this more precise return type, right?
For Nextcloud sizes are mostly used for quota checks and storage space use estimation, so sizes being a few bit off on huge files should not be a problem. And is better than a crash anyway.
one thing I can think of where this might get problematic would be e.g. a
Content-Lengthheader when sent as e.g. float and does therefore no longer match the actual bytes on the filesystem and therefore would result in a corrupt download.but I guess these cases would get obvious when we have this more precise return type, right?
Not sure, the typing will tell you it’s a float but not that it’s imprecise. It all depends on how you are building the header I suppose.