Better error handling when `humanify local` ran on empty file (Error: Failed to stringify code)
⇒ touch foo.js
⇒ npx humanifyjs local --disableGpu foo.js
(node:98609) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
[node-llama-cpp] Using this model ("~/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf") to tokenize text with special tokens and then detokenize it resulted in a different text. There might be an issue with the model or the tokenizer implementation. Using this model may not work as intended
Processing: 100%
file:///Users/devalias/dev/foohumanify/node_modules/humanifyjs/dist/index.mjs:56611
throw new Error("Failed to stringify code");
^
Error: Failed to stringify code
at visitAllIdentifiers (file:///Users/devalias/dev/foohumanify/node_modules/humanifyjs/dist/index.mjs:56611:11)
at async file:///Users/devalias/dev/foohumanify/node_modules/humanifyjs/dist/index.mjs:56687:12
at async unminify (file:///Users/devalias/dev/foohumanify/node_modules/humanifyjs/dist/index.mjs:196:27)
at async Command.<anonymous> (file:///Users/devalias/dev/foohumanify/node_modules/humanifyjs/dist/index.mjs:56707:3)
Node.js v22.6.0
Turns out that this issue is the same root cause:
- https://github.com/jehna/humanify/issues/111
Which I debugged and identified the root cause + suggested potential places to fix it here:
Looking at your screenshot from https://github.com/jehna/humanify/issues/111#issuecomment-2365764329, it looks like the error is occuring on file 18:
Looking at the code where this happens, it seems to be within
unminify, just afterwebcrackhas run to extract the files, and in the main loop where those files are being processed:https://github.com/jehna/humanify/blob/c2e0be679dcab43ec684668202d709f41452c0e9/src/unminify.ts#L6-L30
Since even with
--verboseenabled,humanifydoesn't output the filenames it's processing, it makes it a bit harder to debug; but running it through a debugger we can see what 'file 18' ofextractedFilescorresponds with anyway (assumingwebcrackconsistently unpacks the code the same way); which since the output log showsi + 1, that makes it actually entry 17 in the array (because it's 0-indexed); which seems to be the extracted file1.js:
Looking at the contents of
output/1.js, we can see that it's an empty file:⇒ cat output/1.js | wc -m 0As part of the main
unminifyloop, the code then callsplugins.reduceon thepluginspassed in tounminify:https://github.com/jehna/humanify/blob/c2e0be679dcab43ec684668202d709f41452c0e9/src/commands/openai.ts#L26-L30
When run as
humanify openai, these plugins are:The
Failed to stringify codeerror you're seeing comes from thevisitAllIdentifiersfunction, just after callingbabel'stransformFromAstAsync:https://github.com/jehna/humanify/blob/c2e0be679dcab43ec684668202d709f41452c0e9/src/plugins/local-llm-rename/visit-all-identifiers.ts#L57-L61
- https://babeljs.io/docs/babel-core#transformfromastasync
Which is called by the
openaiRenameplugin that is passed tounminify:https://github.com/jehna/humanify/blob/c2e0be679dcab43ec684668202d709f41452c0e9/src/plugins/openai/openai-rename.ts#L6-L38
transformFromAstAsyncaccepts a parsedast, and returns an object withcode,map, andast.Presumably, with an empty file as input, that would result in an empty ast, which when passed to
transformFromAstAsync, would result in an empty string returned in thecodefield.Given the check that ends up throwing the error you're seeing is looking for a falsy value in
stringified.code; the empty string would definitely trigger that:https://github.com/jehna/humanify/blob/c2e0be679dcab43ec684668202d709f41452c0e9/src/plugins/local-llm-rename/visit-all-identifiers.ts#L57-L62
So the above seems to be the debugging as to the 'why' of what is going on; though I'm not familiar enough with the codebase to suggest where the best point of implementing the fix is.
If I was to suggest the most minimal/localised fix, it would be to the within
if (!stringified?.code)check withinvisitAllIdentifiers(Ref)Though an argument could also be made that filtering out empty/similar files at the
unminifylevel could have it's merits (Ref); though that would then make it less generic/flexible, as theoretically a plugin in the chain could make it a non-empty file before theopenaiRenameactually needs to process it.So I still think the best fix is probably within
visitAllIdentifiers.In fact, in looking through related issues, this just seems to be a variation of the same root cause that I raised in this issue:
- https://github.com/jehna/humanify/issues/54
I also think the general CLI output should be improved to explain what is happening better. eg. a message before running
webcrack, outputting the actual filenames (not just the file number), etc.These issues also seem to tangentially relate to some aspect of this latter suggestion:
- https://github.com/jehna/humanify/issues/92
- https://github.com/jehna/humanify/issues/58
Originally posted by @0xdevalias in https://github.com/jehna/humanify/issues/111#issuecomment-2375785482
Hacky workaround:
I haven't tried it, but a quick/dirty hack fix could just be to add an
ifcheck to skip empty files withinunminify:Something like:
export async function unminify( filename: string, outputDir: string, plugins: ((code: string) => Promise<string>)[] = [] ) { ensureFileExists(filename); const bundledCode = await fs.readFile(filename, "utf-8"); const extractedFiles = await webcrack(bundledCode, outputDir); for (let i = 0; i < extractedFiles.length; i++) { console.log(`Processing file ${i + 1}/${extractedFiles.length}`); const file = extractedFiles[i]; const code = await fs.readFile(file.path, "utf-8"); + if (code) { const formattedCode = await plugins.reduce( (p, next) => p.then(next), Promise.resolve(code) ); verbose.log("Input: ", code); verbose.log("Output: ", formattedCode); await fs.writeFile(file.path, formattedCode); + } + else { + verbose.log(`Skipping empty file ${i + 1} (${file.path})`); + } } }While this wouldn't really be suited for a proper fix, it should hopefully be enough to do what you need to work around this bug in the meantime.
Originally posted by @0xdevalias in https://github.com/jehna/humanify/issues/111#issuecomment-2378108249
I believe this PR by @j4k0xb will also technically resolve this issue; though I wonder if it would be better UX to detect the empty file earlier on at the command level, and provide a more 'human focussed' error/warning at that stage as well:
- https://github.com/jehna/humanify/pull/134
The nuanced difference between this issue and the original bug in #111 is that this issue is about starting with an empty file, whereas #111 is about webcrack unbundling creating empty files that then break humanify later on when it tries to run over them; which is what #134 specifically fixes/works around.
An empty file is valid JavaScript so there shouldn't be any warning/error imo. That's what most CLIs (formatters, bundlers, etc) do.
An empty file is valid JavaScript so there shouldn't be any warning/error imo.
@j4k0xb I'm specifically talking about the human focussed error/warning when running this against an empty starting file, and providing better feedback to the user around that. If needs be you could add an escape hatch of --allow-empty-files or similar if there was really a need to; but I don't think showing an extra UX focussed note in the log output is a big issue.
And then similarly for empty files when they have been unbundled, providing that feedback in the logs (or perhaps only verbose logs) can be useful.
Like, these tiny UX focussed tweaks I made to what was displayed in the output logs (ignoring the 'skip to file' part) made a MASSIVE difference to helping track down and identify where/why a bug occurred, or why the program seemed to be hanging so long/similar:
I made a few patches to my locally checked out version of
humanifyto help provide better output info + skip through to the relevant files that seem to be causing errors:
src/unminify.ts:diff --git a/src/unminify.ts b/src/unminify.ts index 948b4df..ff95ae0 100644 --- a/src/unminify.ts +++ b/src/unminify.ts @@ -9,22 +9,42 @@ export async function unminify( plugins: ((code: string) => Promise<string>)[] = [] ) { ensureFileExists(filename); + + console.log(`Reading input file: ${filename}`); const bundledCode = await fs.readFile(filename, "utf-8"); + + console.log('Running webcrack to extract/decrypt bundled code...'); const extractedFiles = await webcrack(bundledCode, outputDir); + console.log(`Unbundled ${extractedFiles.length} files from ${filename} into ${outputDir}`); + for (let i = 0; i < extractedFiles.length; i++) { - console.log(`Processing file ${i + 1}/${extractedFiles.length}`); - const file = extractedFiles[i]; + + console.log(`Processing file ${i + 1}/${extractedFiles.length} (${file.path})`); + + // Skip extracted files till we find the one we actually want to try and process + if (file.path !== `${outputDir}/6blF.js`) { + console.log(' [hack] Skipping files that are not the one we want to process'); + continue; + } + const code = await fs.readFile(file.path, "utf-8"); - const formattedCode = await plugins.reduce( - (p, next) => p.then(next), - Promise.resolve(code) - ); + if (code) { + const formattedCode = await plugins.reduce( + (p, next) => p.then(next), + Promise.resolve(code) + ); - verbose.log("Input: ", code); - verbose.log("Output: ", formattedCode); + verbose.log(" Input: ", code); + verbose.log(" Output: ", formattedCode); - await fs.writeFile(file.path, formattedCode); + await fs.writeFile(file.path, formattedCode); + } + else { + verbose.log(` Skipping empty file ${i + 1} (${file.path})`); + } } + + console.log(`Unminified (or skipped) ${extractedFiles.length} files from ${filename} into ${outputDir}`); }Originally posted by @0xdevalias in https://github.com/jehna/humanify/issues/117#issuecomment-2378447998
True, more verbose code and progress logging seems useful But still not a fan of skipping files...
- What if the code uses
require("./empty.js")which get's removed because of this? - After that error got fixed, what information do you gain from seeing which ones are empty (input/output also shows it)?
But still not a fan of skipping files...
@j4k0xb I guess that really depends where 'skipping' is implemented. The hacky workaround in unminify (Ref) was never suggested as a proper fix to this, it was only a hacky workaround, as I said at the bottom of it:
While this wouldn't really be suited for a proper fix, it should hopefully be enough to do what you need to work around this bug in the meantime.
But for example, because unminify reduces over the passed in plugins, and feeds the code into them; the openaiRename plugin:
https://github.com/jehna/humanify/blob/64a1b9511d2e42b705df0858674eab1f5e83dd0d/src/plugins/openai/openai-rename.ts#L6-L20
Or even visitAllIdentifiers itself:
https://github.com/jehna/humanify/blob/64a1b9511d2e42b705df0858674eab1f5e83dd0d/src/plugins/local-llm-rename/visit-all-identifiers.ts#L15-L25
Could have an 'early guard statement' when code is empty, that provides some feedback and then returns before even needing to attempt to parse the AST.
My suggestion is about identifying the appropriate place to check that the inputs are relevant, and react appropriately if not. There's no reason to parse the AST and have all the other code in that file run just to turn it back into an empty string if we can just handle that case specifically at the beginning. While it's potentially a negligible performance gain, at the very least it makes the handling explicit, and makes the code easier to reason about/less chances for bugs to occur because of it.
Like without thinking too deeply about it, I would probably implement something like this:
export async function visitAllIdentifiers(
code: string,
visitor: Visitor,
onProgress?: (percentageDone: number) => void
) {
+ if (code === "") return ""
+
const ast = await parseAsync(code);
const visited = new Set<string>();
const renames = new Set<string>();
if (!ast) {
throw new Error("Failed to parse code");
}
And because visitAllIdentifiers doesn't seem to currently have any patterns around logging, to keep with that, I would also make a change in openaiRename (and other similar plugins) like this:
export function openaiRename({
apiKey,
model
}: {
apiKey: string;
model: string;
}) {
const client = new OpenAI({ apiKey });
return async (code: string): Promise<string> => {
+ if (code === "") {
+ verbose.log("Skipping empty file");
+ return code
+ }
+
return await visitAllIdentifiers(
code,
async (name, surroundingCode) => {
verbose.log(`Renaming ${name}`);
verbose.log("Context: ", surroundingCode);
I'm specifically talking about the human focussed error/warning when running this against an empty starting file, and providing better feedback to the user around that.
In addition to the suggestions in my last comment, I was also going to suggest adding a warning/error at the command level, but in looking at the code again now, I don't think that makes sense:
https://github.com/jehna/humanify/blob/64a1b9511d2e42b705df0858674eab1f5e83dd0d/src/commands/openai.ts#L20-L30
Basically, the openai command is just calling unminify, and it's within unminify that the file is first actually read, before being passed to webcrack:
https://github.com/jehna/humanify/blob/64a1b9511d2e42b705df0858674eab1f5e83dd0d/src/unminify.ts#L6-L13
So the only way I think it would make sense to implement anything new at the command level is if unminify was refactored to instead accept code rather than filename; but at this stage, I'm not really sure if there is any added benefit to doing that; unless it becomes useful to do so as part of the implementation for issues like:
- https://github.com/jehna/humanify/issues/129
- https://github.com/jehna/humanify/issues/3
So I think just the 2 minimal suggestions I made in https://github.com/jehna/humanify/issues/54#issuecomment-2381073944 + the fix in https://github.com/jehna/humanify/pull/134 would probably satisfy my requests on this issue.
The hacky workaround in
unminify(Ref) was never suggested as a proper fix to this, it was only a hacky workaround, as I said at the bottom of it:While this wouldn't really be suited for a proper fix, it should hopefully be enough to do what you need to work around this bug in the meantime.
Looks like that hacky workaround (rather than the better full solution mentioned above) was what @jehna ended up implementing in #134 in the end though.
- https://github.com/jehna/humanify/pull/134/files#r1800294136
https://github.com/j4k0xb/humanify/blob/96c093bc9965ca02fa1c269db1a0d445a7d6bacf/src/unminify.ts#L21-L24
This should be fixed now
This should be fixed now
@jehna Fixed technically, but see notes above in https://github.com/jehna/humanify/issues/54#issuecomment-2412626831 + https://github.com/jehna/humanify/issues/54#issuecomment-2381073944 for why the way it was implemented may not have been the ideal place to do so; and what a more flexible option would have been.