faker icon indicating copy to clipboard operation
faker copied to clipboard

helpers.fake - patterns after root expression fail when referencing instance members

Open Abion47 opened this issue 8 months ago • 11 comments

Pre-Checks

Describe the bug

What I am doing

I am trying to call instance methods of returned generated values from within a helpers.fake expression, for example '{{number.int.toPrecision}}'. The call failed, which led me down a rabbit hole of trying to figure out why. It turns out that, in many cases, referencing instance/prototype functions of the returned expression value in a pattern fails because the function is not setting this when being called.

What's more, the way it fails depends on how the expression is formatted. If the expression uses function shorthand syntax ('number.int.toPrecision'), the fake will fail due to the aforementioned bug of this not getting set. However, if the expression uses parameterized syntax ('number.int(100).toPrecision'), the fake will fail due to the expression not being resolvable. It's also worth noting that in this latter case, properties will also fail. (i.e. 'string.alpha.length' will succeed, but 'string.alpha(10).length' will fail.)

What I expected

Referencing an instance methods/prototype functions in an expression pattern to further modify the generated value.

What actually happened

Referencing an instance methods/prototype functions in an expression pattern results in an error. (The specific error depends on the type of the instance involved and format of the expression. See minimum reproduction.)

Minimal reproduction code

These expressions will all result in various errors that essentially boil down to "this is undefined":

const expNotation = faker.helpers.fake('{{number.int.toPrecision(2)}}'); // Number.prototype.toPrecision requires that 'this' be a Number
const trimmed = faker.helpers.fake('{{string.alpha.trim}}'); // String.prototype.trim called on null or undefined
const isoTimestamp = faker.helpers.fake('{{date.anytime.toISOString}}'); // Method Date.prototype.toISOString called on incompatible receiver undefined

These expressions will all result in "Cannot resolve expression ''":

const expNotation = faker.helpers.fake('{{number.int(100).toPrecision(2)}}');
const trimmed = faker.helpers.fake('{{string.alpha(5).trim}}');
const isoTimestamp = faker.helpers.fake('{{date.anytime({"refDate": 1735714800000}).toISOString}}');
const stringLength = faker.helpers.fake('{{string.alpha(10).length}}');

Additional Context

The impetus for this ticket was that I was trying to convert a Date object returned from a date expression (e.g. date.anytime) to an ISO timestamp within the resulting string using '{{date.anytime.toISOString}}'. The reason I cannot use alternative approaches (such as using helpers.mustache and passing the timestamp in as a parameter) is because I am using a utility I created that uses a JSON template object with properties defined using expression strings to generate entire complex objects.

Environment Info

System:
    OS: macOS 15.4
    CPU: (14) arm64 Apple M3 Max
    Memory: 1.52 GB / 96.00 GB
    Shell: 5.9 - /bin/zsh
  Binaries:
    Node: 22.13.1 - ~/.asdf/installs/nodejs/22.13.1/bin/node
    npm: 10.9.2 - ~/.asdf/installs/nodejs/22.13.1/bin/npm
    pnpm: 10.5.2 - ~/.asdf/installs/nodejs/22.13.1/bin/pnpm
  Browsers:
    Chrome: 135.0.7049.41
    Safari: 18.4

Which module system do you use?

  • [ ] CJS
  • [x] ESM

Used Package Manager

pnpm

Abion47 avatar Apr 04 '25 17:04 Abion47

I've come up with a working solution for the first case of accessing instance members after a function reference (i.e. date.anytime.toISOString). I'm still trying to figure out the case of after a function call (i.e. string.alpha(10).length).

Abion47 avatar Apr 04 '25 18:04 Abion47

For starters, thank you for the detailed bug description as well as listing the use-cases to immediately cross out potential work-around's. You use case sounds quite intriguing and is probable the best use-case I came across to do what faker.helpers.fake was designed to do. These kind of feedback give us the assurance that these helpers are actually useful as well as give insight how they might get used in the real world.

I was able to confirm the bug and labeled you issue as "accepted". The reproduction calls all seem valid to me. Our test cases should probably include these to prevent further regression and guide a developer that might tackle this issue.

Additionally I applied the "help wanted" label. This means that help from the community is highly appreciated.

xDivisionByZerox avatar Apr 04 '25 18:04 xDivisionByZerox

Ok I did some digging and found 2 issues:

1. Primitive Values

Our internal eval function does handle property resolution (obviously), but stops for primitives. It only resolves the properties in the chains if the previous part is either a function or an object:

https://github.com/faker-js/faker/blob/048c32581bd5d15f333ef63e4ca968e02793373d/src/modules/helpers/eval.ts#L229-L249

This would happen when one would try to access prototype methods via the locale definitions faker.helpers.fake('person.first_name.generic.toUpperCase')

This could be adjusted by adding a handler for primitives that calls the function on the prototype of the primitive:

case 'boolean':
case 'number':
case 'bigint':
case 'string': {
  const primitiveFn: unknown = entrypoint[key as keyof typeof entrypoint];
  if (typeof primitiveFn === 'function') {
    return primitiveFn.call(entrypoint);
  }

  return undefined;
}

Some questions that did arise from my side:

  • what would be the expected outcome be when someone tries to us an invalid primitive function faker.helpers.fake('person.first_name.generic.abcd')
  • what would be the expected outcome if these "invalid" function have been added to the primitives prototype before?\
String.prototype.abcd = () => 'abcd';
faker.helpers.fake('person.first_name.generic.abcd')
  • should index accessing be allowed? faker.helpers.fake('person.first_name.generic.2') // returns the 3rd letter from a random generic first name

2. Functions without this scope

When part of an expression is a function, it is resolved by invoking the function and accessing the desired property ('airline.airline().name' => invokes 'airline()' to access the 'name' property). The function executions are always done standalone and not on the parent object. This is due to the algorithm implementation to resolve the string expression. To do that, the eval function iterates over each part of the expression, extracts the next expression member and resolves it. Then repeats this time starting from the previously resolved expression member.

Lets do an example for: fakeEval('person.firstName()', faker);

Iteration Current Result Expression Resolve Intermediate Result Full Result
1 faker 'person.firstName()' PersonModule (faker.person) PersonModule
2 PersonModule 'firstName()' firstName function reference (faker.person.firstName) result of calling firstName()

Note that the result of the second iteration is the result of calling firstName and not faker.person.firstName. This would be equivalent to the following:

const fn = faker.person.firstName;
const result = fn();

This only works because we bind all methods of a module to the module itself:

https://github.com/faker-js/faker/blob/048c32581bd5d15f333ef63e4ca968e02793373d/src/internal/module-base.ts#L8-L12

To allow for other the function property resolve handler would need to have access to the previous iteration result and call the method on it.

pseudocode

function resolveProperty(entrypoint: unknown, key: string, previousEntrypoint: unknown): unknown {
  case 'function': {
      try {
        entrypoint = entrypoint.call(previousEntrypoint);
      } catch {
        return undefined;
      }

      return entrypoint?.[key as keyof typeof entrypoint];
    }
}

While I did do all this analysis, I want to strongly mention a specific line in the docs of the fake functions:

https://github.com/faker-js/faker/blob/048c32581bd5d15f333ef63e4ca968e02793373d/src/modules/helpers/index.ts#L1250

Looking at that I would more lean towards categorizing you issue as a feature request, then a bug. Since you did such a good job describing your use case, I want to be fair and at least allow the though of adding this behavioral into the faker library by leaving this issue ope for discussion. To be honest, the fake method has been troubling us (the maintainer team) quite some time, since the template literal syntax was introduced. Handling all these edge cases (such as the one you found) could probably be an entire library in itself.

xDivisionByZerox avatar Apr 04 '25 21:04 xDivisionByZerox

I've pushed a branch on a fork with my tentative solution: https://github.com/Abion47/faker/tree/%40fix/helper-fake-instance-access

Abion47 avatar Apr 04 '25 22:04 Abion47

Alternative solution:

  • #3176

Instead of expanding the fake method to match eval, it might be better to use function calls to generate the result using plain code.

{{number.int(100).toPrecision(2)}} -> (faker) => faker.number.int(100).toPrecision(2)

Or do you have a usecase that requires you to use string sources?

ST-DDT avatar Apr 04 '25 23:04 ST-DDT

Alternative solution:

Instead of expanding the fake method to match eval, it might be better to use function calls to generate the result using plain code.

{{number.int(100).toPrecision(2)}} -> (faker) => faker.number.int(100).toPrecision(2)

Or do you have a usecase that requires you to use string sources?

Yes, my use case is laid out under "Additional Context".

EDIT: For instance, my utility takes a JSON object like this:

{
  "id": "{{string.uuid}}",
  "name": "{{person.fullName}}",
  "created": "{{date.past}}",
  "specialties": "{{word.noun}}, {{word.noun}}, and {{word.noun}}"
}

and uses the object as a template to mass generate a bunch of complex objects for mock database population.

Abion47 avatar Apr 05 '25 00:04 Abion47

Thanks for the code example that makes it a lot easier to understand.

Fake only produces strings. Does your DB coerce the values from string to the propper value? And is that the reason you need the date iso string? Would your DB/tooling also work with the actual date object? Have you considered using eval?

const template = "date.anytime()";
const result = new Function(...Object.keys(faker), `return ${template};`)(...Object.values(faker));

I've looked at your commit and it looks good to me. I haven't tested it for edge cases yet though.

ST-DDT avatar Apr 05 '25 08:04 ST-DDT

@ST-DDT fake only producing strings* makes it all the more important for it to be able to mutate the results from fakeEval for when the raw output is something that doesn't cleanly stringify, or if it can, cannot be easily converted back to it's previous type.

For example, consider the output of fakeEval("{{airline.airline}}") which returns an object (and, incidentally, is one of the unit tests for fakeEval testing that it can indeed return objects). Calling faker.helpers.fake("{{airline.airline}}") returns "[object Object]", which is certainly not what anyone making that call intended to receive. It's fine for this specific use case since Airline is an internal type that fake can already handle getting properties from (e.g. "{{airline.airline.name}}"), but not in the case of other faker methods that return a non-primitive type that isn't an internal type, such as Date. (Another issue would be 'number.bigInt', which when stringified just appears like a normal integer with there being no trivial way to tell if a stringifed integer should be parsed as a BigInt or Number. But on that specific case, I digress, because there isn't an obvious solution to that one and any such solution would be off-topic on this issue anyway.)

My point is just that any non-primitive value that any faker method returns is going to have issues working in fake, and as the library of methods grows, more such methods will appear. Without a reliable way to mutate them into a clean value within the expression, those methods are going to have limited use if not be explicitly unsupported, which is just going to eventually lead to the natural conclusion of dropping fake altogether. (And I can see with #3176 you've already started down that road.)

But yes, the database I'm using (and I suspect most databases in general) support converting an ISO-8601 timestamp to whatever their internal representation is for a datetime, whereas I can't depend on it doing the same for a string-ified JS Date object.

And I'm not going to use eval unless I have absolutely no other choice. There is just too much that can go wrong there.


I'm sure edge cases exist, as my commit's approach opens the door to perform all sorts of unholy operations, including calling functions that don't return anything (I can't think of a "working" example, but something like {{string.alpha.split.forEach}}).


(*: As an aside, why does fake only return strings anyway? It seems like its only reason for doing so is to be able to reliably call itself recursively with the current iterative templating approach, but that seems largely arbitrary when there are plenty of faker methods that don't return strings and faker.helpers.fake('{{number.int}}') actually returning a number seems like it would be far more intuitive. [Though in the case of compound patterns like faker.helpers.fake('{{number.int}} & {{number.int}}'), returning a string does make the most sense.])

Abion47 avatar Apr 05 '25 18:04 Abion47

As an aside, why does fake only return strings anyway?

Mostly historic purposes with some API spec tossed on top. It is basically an enhanced alternative to https://fakerjs.dev/api/helpers.html#mustache

but that seems largely arbitrary when there are plenty of faker methods that don't return strings and faker.helpers.fake('{{number.int}}') actually returning a number seems like it would be far more intuitive.

https://github.com/faker-js/faker/blob/7b120567139b2ac84d052f42f1f352c6697f2aa2/src/modules/helpers/eval.ts#L66

Is kind of intended as a step in resolving that. But there hasnt been any demand for it to be publicly exposed. We could add support for recursive patterns in fakeEval, but IMO at some point you just end up with it being identical to eval.

ST-DDT avatar Apr 06 '25 09:04 ST-DDT

@ST-DDT That was another thing that seems confusing to me. The only real difference between fake and mustache seems like fake can parse more advanced fields within the tags whereas mustache can take input variables. These two methods seem trivial to merge into one by just letting fake take input variables, and looking at the code, that functionality seems like it would be easy to add. For instance: https://github.com/Abion47/faker/tree/%40feat/fake-input-vars

And as far as allowing fake/mustache to return non-strings, that shouldn't be hard either. In fact, doing so might even make them more performant by replacing the current recursion-into-string-concatenation method with a purely iterative approach using Array.join. I made a branch as a proof-of-concept showing this in action: https://github.com/Abion47/faker/tree/%40feat/fake-array-join

Abion47 avatar Apr 07 '25 16:04 Abion47

These two methods seem trivial to merge into one by just letting fake take input variables

While you can add input variable support to fake (and it might actually be a nice feature), the behavior is entirely different. If you want to pre fill certain values, you might want to run a template through mustache first and then through fake. For a simple replace syntax, it might be easier to define the replacement tokens than building an appropriate tree structure. There are other considerations such as, order of precendence and (not) having an input prefix. IMO we should track that in a different feature request issue. Would you like to create it?

In fact, doing so might even make them more performant by replacing the current recursion-into-string-concatenation method with a purely iterative approach using Array.join.

Currently the code also allows nested expressions. But I agree the implementation can potentially be optimized in an iterative approach independent of new features. Feel free to open a PR for that.

As for returning non strings, it would be API breaking.

ST-DDT avatar Apr 07 '25 20:04 ST-DDT