ion-js
ion-js copied to clipboard
ion-js crashes when parsing very large string.
ion-js crashes when parsing a very large string
$ npm ls ion-js
[email protected] /home/ANT.AMAZON.COM/calebev/test
└── [email protected]
$ node -v
v18.12.0
ion.load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))
$ node -p "require('ion-js').load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))"
<--- Last few GCs --->
[641427:0x560ab00] 21984 ms: Scavenge 2028.7 (2064.9) -> 2028.8 (2069.9) MB, 5.6 / 0.0 ms (average mu = 0.198, current mu = 0.143) allocation failure;
[641427:0x560ab00] 21994 ms: Scavenge 2031.6 (2069.9) -> 2031.8 (2070.9) MB, 7.8 / 0.0 ms (average mu = 0.198, current mu = 0.143) allocation failure;
[641427:0x560ab00] 22003 ms: Scavenge 2032.6 (2070.9) -> 2032.4 (2081.6) MB, 8.7 / 0.0 ms (average mu = 0.198, current mu = 0.143) allocation failure;
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb6e500 node::Abort() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
2: 0xa7e632 [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
3: 0xd47ea0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
4: 0xd48247 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
5: 0xf25605 [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
6: 0xf26508 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
7: 0xf36a13 [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
8: 0xf37888 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
9: 0xf3aa55 v8::internal::Heap::HandleGCRequest() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
10: 0xeb8baf v8::internal::StackGuard::HandleInterrupts() [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
11: 0x12b7645 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
12: 0x16e9979 [/home/ANT.AMAZON.COM/calebev/.asdf/installs/nodejs/lts-hydrogen/bin/node]
JSON.parse parses the string fine.
I'm implementing a npm registry server and the npm publish api POST's a JSON object with the package tarball base64 encoded as a string. Parsing this body with regular JSON.parse causes multiple copies of data
const body = await buffer(request) // Buffer
const str = body.toString('utf8') // String
const json = JSON.parse(str) // JSON
const tarball = Buffer.from(json._attachments["1.0.0"].data, 'base64') // tarball
I tested ion-js to see if the reader would save memory.
Just checking my understanding of your example:
require('ion-js')
.load(JSON.stringify(Buffer.alloc(50_000_000).fill('a').toString('base64')))
We're creating a 50MB buffer filled with the ASCII character a
, then converting that binary buffer to a base64-encoded string which will be about 67MB long. The resulting string will not be valid Ion syntax.
For example:
{{ImhlbGxvIHdvcmxkIg==}} // Text Ion blob
"ImhlbGxvIHdvcmxkIg==" // Text Ion string
ImhlbGxvIHdvcmxkIg== // Free-standing base64 data, not a valid Ion value
There is indeed an OOM occurring, but I wanted to see whether this correctness issue affects your use case before diving too deeply on the memory management underlying especially large scalar values.
The base64 string is passed to JSON.stringify so it will be a valid ion string. Ignore Buffer / base64, it can be reproduced with
ion.load(JSON.stringify("a".repeat(50_000_000)))
I can also reproduce by feeding the output of dumpText back to load
const value = "a".repeat(3125000)
const ionText = ion.dumpText(value)
ion.load(ionText) // crashes here
Tho with larger strings dumpText will also crash
const value = "a".repeat(25_000_000)
const ionText = ion.dumpText(value) // crashes here