cordova-plugin-file
readAsText function causes encoding error reading large files
cordova-file-plugin encoding error bug report
Abstract
The readAsText function of the cordova-file-plugin causes an encoding error when reading a UTF-8 file larger than 2 megabytes on the iOS, Android and Windows platforms.
Description
The error is caused by the native implementation of the readAsText function of the cordova-file-plugin. Files are cut into chunks of a predefined size, and each chunk is immediately converted to text as UTF-8. A UTF-8 encoded character can take up to 4 bytes, so if a cut falls in the middle of a multibyte character, an encoding error is thrown.
You can see the UTF-8 leading byte e2 at the end of the data-chunk.dump file. The missing bytes are in the next chunk, and if you convert the dump file back to text, a filler character is shown (e.g. a ?).
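The failure mode can be reproduced outside the plugin. The following is a minimal Python sketch, not the plugin's native code: the 8-byte chunk size and the sample text are assumptions, chosen so that the cut lands right after the leading byte e2 of a three-byte character.

```python
# Sample text: 7 ASCII bytes, then U+2018 (UTF-8: e2 80 98), then more ASCII.
data = ('a' * 7 + '\u2018abc').encode('utf-8')

CHUNK_SIZE = 8  # assumed chunk size, for illustration only
chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

# The first chunk ends with the lone leading byte e2.
first_chunk_last_byte = chunks[0][-1:].hex()

# Decoding the first chunk on its own fails on the incomplete sequence.
try:
    chunks[0].decode('utf-8')
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

# With errors='replace', the broken bytes become filler characters (U+FFFD).
lossy = ''.join(c.decode('utf-8', errors='replace') for c in chunks)

print(first_chunk_last_byte, naive_failed, '\ufffd' in lossy)
```

Strict decoding raises an error, while lenient decoding silently damages the text, which matches the filler character seen when the dump file is converted back to text.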
Workaround
We implemented a workaround that uses the readAsArrayBuffer function of the cordova-file-plugin and decodes the resulting bytes as UTF-8 in JavaScript with TextDecoder.decode().
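The idea behind the workaround is to keep the data as raw bytes until all of it is available and only then decode it. The sketch below illustrates that idea in Python rather than with the plugin's JavaScript API. The helper names decode_whole and decode_streaming are hypothetical, and the streaming variant is only one possible shape for a chunk-safe decoder, not the plugin's actual fix.

```python
import codecs

def decode_whole(chunks):
    """Join all raw chunks first, then decode once (the workaround's approach)."""
    return b''.join(chunks).decode('utf-8')

def decode_streaming(chunks):
    """Decode chunk by chunk, carrying incomplete sequences over to the next chunk."""
    decoder = codecs.getincrementaldecoder('utf-8')()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b'', final=True))  # flush; raises if bytes are left over
    return ''.join(parts)

text = 'a' * 7 + '\u2018abc'
raw = text.encode('utf-8')
chunks = [raw[:8], raw[8:]]  # the cut falls right after the leading byte e2

print(decode_whole(chunks) == text, decode_streaming(chunks) == text)
```

Decoding once over the full buffer is the simpler option but holds the whole file in memory. A stateful decoder avoids that at the cost of tracking partial sequences between chunks.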
How to reproduce the encoding error
- Check out the sample app
- Install dependencies
npm i
- Initialize Cordova
npm run init
- Deploy on platform
npm run ios/android/windows
- Press Fire Parser in Demo App
- Watch the logs
- Set a breakpoint where the encoding error is thrown on iOS
- Look at the data variable
Related to #238
@janpio Any updates on this issue? The bug also occurs on the UWP platform.
Just wanted to add that I ran into this problem in a project at my job.
It's not that hard to encounter it - loading a file with more than 262,144 bytes that's primarily composed of multibyte characters (that is, non-ASCII ones, in UTF-8 encoding) should do it often.
Here is a bit of Python 3 that generates a UTF-8 file that should consistently break the readAsText() function:
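# 262,143 ASCII bytes, then a 3-byte character (U+2018) that straddles the 262,144-byte mark
longstring = 'a' * 262143 + '‘abc'
# Write explicitly as UTF-8 so the result does not depend on the platform default encoding
with open('testfile.txt', 'w', encoding='utf-8') as f:
    f.write(longstring)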
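To see why this particular string triggers the problem, the byte layout can be checked directly. This is a small sketch that rebuilds the same string in memory instead of reading testfile.txt. It assumes the first chunk boundary sits at 262,144 bytes, as the comment above suggests; the thread does not confirm the actual chunk size.

```python
CHUNK_SIZE = 262144  # assumed boundary, taken from the comment above

payload = ('a' * 262143 + '\u2018abc').encode('utf-8')

first_chunk = payload[:CHUNK_SIZE]
second_chunk = payload[CHUNK_SIZE:]

print(len(payload))            # 262149
print(first_chunk[-1:].hex())  # e2   (leading byte, cut off from the rest)
print(second_chunk[:2].hex())  # 8098 (continuation bytes in the next chunk)
```

The first chunk ends with the lone leading byte e2, the same byte reported at the end of the dump file in the original report.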
This happened to me today as well, but only on iOS. Looking at the implementation I don't see how the same issue could occur on Android however – in the source code at FileUtils.java#L1086 it seems like the whole result is written to the buffer before it gets converted into a string.
I haven't tested it in depth on Android though, so I might be missing something.
Same issue here. It happened randomly, on iOS only. Please help fix it. Thanks.
+1
+1
For the +1s, some details would be nice.
In my case, the workaround suggested in the report worked just fine.
Have you tried it? If so, did it solve your problem?
@NateEag: Yes, the workaround with readAsArrayBuffer did work for me, but it's only a workaround, not a fix. :)
Gotcha. Yes, a real fix would obviously be better. I just wanted to be sure there wasn't some subtlety being missed that meant the workaround wasn't fully functional.
Thanks for the details, @YvesAmmann !
Anyone found a workaround for the Windows platform?
@MauroIT Does the workaround suggested above fail on Windows? If so, how does it fail?
I've added a Polyfill for window.TextDecoder. It works on Windows now :)
+1 Thank you for the workaround, but this issue should really be addressed.
Have you tried it with a 30 MB+ file? I got an error like this when I tried a file larger than 30 MB:
{"type":"error","bubbles":false,"cancelBubble":false,"cancelable":false,"lengthComputable":false,"loaded":0,"total":0,"target":{"_readyState":2,"_error":{"code":1},"_result":null,"_progress":0,"_localURL":"http://localhost/__cdvfile_sdcard__/Download/yuanshen_4.0.0.apk","_realReader":{}}}