proposal-pipeline-operator icon indicating copy to clipboard operation
proposal-pipeline-operator copied to clipboard

Real-world examples of better DX for constructors and factory APIs like `Object.fromEntries`

Open justingrant opened this issue 6 months ago • 6 comments

more empiric evidence for pipe’s impact would be good. (from https://github.com/tc39/proposal-pipeline-operator/issues/232#issuecomment-2784879225)

Thanks so much for persevering with this proposal. Here's 8 snippets (I could probably provide hundreds more) from our small company's codebase showing how Pipeline could make it easier to use one common API: Object.fromEntries.

A few surprising (to me) observations from this exercise:

  • Ironically, the biggest usability win was NOT actually moving Object.entries to use Pipeline. Instead, a much larger win was removing calls to reduce (see the last 2 snippets) that were only in our code because reduce is chainable and Object.entries is not. Apparently, some developers at our company love chaining so much that they were willing to write much more code (and much more complicated code) just to avoid having to use a nested function call that breaks method chaining. 🤔 It'd be fun to see a plenary slide that includes the line "Pipeline will reduce reduce". 😄
  • Pipeline and Prettier will likely reinforce each other to improve readability. With nested function calls, each call is indented and moved to a new line by Prettier, and all the extra indents in turn cause function calls to exceed Prettier's line length causing their arguments to also be split across lines. Method chains however (including pipes) are moved to new lines at the same nesting level. The result with Pipeline is not only clearer code that reads right-to-left and top-to-bottom, but it's also less indented so each chained call is much less likely to be split across multiple lines, further improving readability.
  • We don't really use FP. In our codebase, Pipeline would simply make our existing, non-FP code easier to read and easier to write. You don't need to love FP to love Pipeline.

Note that Object.entries is not an outlier. I could provide a similar laundry list of snippets for all popular constructors and factory methods like [...new Set(array)] (for deduping arrays), Temporal.*.from, Array.from, etc. And that's just builtins; any class constructor or factory method in our own code or in libraries we use would get easier to use chainably.

// current
return Object.fromEntries(
  Object.entries(value).map(([k, v]) => [typeof v === 'string' ? `${k}.untranslated` : k, transformValue(v)])
);

// with Pipeline
return Object.entries(value)
  .map(([k, v]) => [typeof v === 'string' ? `${k}.untranslated` : k, transformValue(v)])
  |> Object.fromEntries(%);


// current
const result = Object.fromEntries(
  Object.entries(grouped).map(([machineId, attempts]) => [
    machineId,
    attempts[0],
  ])
);

// with Pipeline
const result = Object.entries(grouped)
  .map(([machineId, attempts]) => [machineId, attempts[0]])
  |> Object.fromEntries(%);


// current
const trimmed = Object.fromEntries(
  Object.entries(payload.message).map(([key, value]) => [
    key,
    value ? value.trim() : unknownLabel,
  ]),
);

// with Pipeline
const trimmed = Object.entries(payload.message)
  .map(([key, value]) => [key, value ? value.trim() : unknownLabel])
  |> Object.fromEntries(%);


// current
const MdIconsTransformedKeys = Object.fromEntries(
  Object.entries(MdIcons).map(([iconName, icon]) => [
    transformIconName(iconName),
    icon,
  ]);

// with Pipeline
const MdIconsTransformedKeys = Object.entries(MdIcons)
  .map(([iconName, icon]) => [transformIconName(iconName), icon])
  |> Object.fromEntries(%);


// current
return Object.fromEntries(
  this.$storeTS.state.printer.printer.material_manager.materials_list.map(m => [m.name, m.color])
);

// with Pipeline
return this.$storeTS.state.printer.printer.material_manager.materials_list
  .map(m => [m.name, m.color])
  |> Object.fromEntries(%);


// current
return Object.entries(localStorage)
  .map(([k, v]) => [String(k), v])
  .filter(([k, _v]) => k.startsWith(FLAG_PREFIX))
  .reduce(
    (acc, [k, v]) => ({ ...acc, [k.slice(FLAG_PREFIX.length)]: !!v }),
    {}
  );

// with Pipeline
return Object.entries(localStorage)
  .map(([k, v]) => [String(k), v])
  .filter(([k, _v]) => k.startsWith(FLAG_PREFIX))
  .map([k, v]) => [k.slice(FLAG_PREFIX.length), !!v])
  |> Object.fromEntries(%);


// current
return Object.entries(colors)
  .toSorted(([_nameA, colorA], [_nameB, colorB]) => {
    const lightnessA = parseToHsl(colorA).lightness;
    const lightnessB = parseToHsl(colorB).lightness;
    return lightnessA - lightnessB;
  })
  .reduce((acc, [name, color]) => {
    acc[name] = color;
    return acc;
  }, {} as Record<string, string>);

// with Pipeline
return Object.entries(colors)
  .toSorted(([_nameA, colorA], [_nameB, colorB]) => {
    const lightnessA = parseToHsl(colorA).lightness;
    const lightnessB = parseToHsl(colorB).lightness;
    return lightnessA - lightnessB;
  })
  |> Object.fromEntries(%) as Record<string, string>);


// current
return Object.keys(state.availableCommands)
  .filter((key) => key.endsWith('CALIBRATE'))
  .reduce((o, key) => {
    return {
      ...o,
      [key]: state.availableCommands[key],
    };
  }, {} as GcodeCommands);

// with Pipeline
return Object.keys(state.availableCommands)
  .filter((key) => key.endsWith('CALIBRATE'))
  .map((key) => [key, state.availableCommands[key]]
  |> Object.fromEntries(%) as GcodeCommands);

justingrant avatar May 29 '25 09:05 justingrant

@js-choi You asked for real-world code examples at https://github.com/tc39/proposal-pipeline-operator/issues/311, so this will probably interest you.

gustavopch avatar May 29 '25 11:05 gustavopch

1000% yes, Object.fromEntries is one of the most useful functions we have, and one of the most annoying to be unable to chain.

ljharb avatar May 29 '25 17:05 ljharb

@justingrant: These are indeed excellent real-world examples. Thank you very much for providing them. They all would be great to add to the documentation eventually.

To sharpen these examples, would you be able to answer #311, here or there?

  • Do you have insight into why the authors of the snippets above did not instead use temporary variables that they could have plugged into Object.entries? (Some developers “love chaining so much that they were willing to write much more code (and much more complicated code) just to avoid having to use a nested function call that breaks method chaining,” but could they have used more temporary variables instead?)
  • In your opinion (and other developers in your company), would rewriting the snippets above to use temporary variables make their readability better or worse than the current, deeply nested subexpressions within Object.entries?
// current
return Object.fromEntries(
  Object.entries(value).map(([k, v]) => [typeof v === 'string' ? `${k}.untranslated` : k, transformValue(v)])
);

// with temporary variables
const entries = Object.entries(value)
  .map(([k, v]) => [typeof v === 'string' ? `${k}.untranslated` : k, transformValue(v)]);
return Object.fromEntries(entries);

// with Pipeline
return Object.entries(value)
  .map(([k, v]) => [typeof v === 'string' ? `${k}.untranslated` : k, transformValue(v)])
  |> Object.fromEntries(%);

js-choi avatar May 29 '25 18:05 js-choi

I usually do use temporary variables - and that's annoying. in particular because it means i can't do the entire action in a single expression position, so instead of v => Object.entries(value).map(([k, v]) => [typeof v === 'string' ? ${k}.untranslated : k, transformValue(v)]) |> Object.fromEntries(%) (but nicely formatted) i have to have braces and a return

ljharb avatar May 29 '25 19:05 ljharb

  • Do you have insight into why the authors of the snippets above did not instead use temporary variables that they could have plugged into Object.entries? (Some developers “love chaining so much that they were willing to write much more code (and much more complicated code) just to avoid having to use a nested function call that breaks method chaining,” but could they have used more temporary variables instead?)

There was a surprisingly long list!

  • Temp variables are more cognitive work and feel more like a break in "flow". Instead of only worrying about the next step in a pipeline of data, you need to think about a name, consider whether you need a comment, whether the last line should be return with the last subexpression or a final temp variable that's then returned, etc.
  • Finding unique names is especially hard in long pipeline (using "pipeline" generically here and below, not to mean the Pipeline proposal), so you end up with nonsense names like filtered, evenMoreFiltered, transformed, transformedAgain, etc.
  • When you have a function with multiple pipelines, each temp name in every pipeline must be unique. Given that names are often generic (per bullet above), this often breaks the ability to easily refactor by moving pipelines between functions.
  • Similarly, it's harder to re-use pipeline steps with temp variables, because instead of just copying/pasting the line with the step, you need to manually update references to previous steps' temp variables.
  • (this is @ljharb's point above) Many contexts don't allow statements: object literal property values, braceless arrow function bodies, ternaries/if/while, etc. To use temp variables in these contexts you need to refactor your code quite a lot, which wastes time and makes code harder to read. Long object literals (which are common in our code) are a particularly bad case because the temp variable declaration can often be 10+ lines away from its use.
  • Inserting steps (like adding filter call) in the middle of a chain is trivial: just add code. To insert steps with temp variables, you must update the following step, which is manual and highly error-prone. ESLint can warn about unused variables in the simple case, but if the temp var is used 2+ times in the following line, then it's easy to forget to update one of the uses, resulting in devilish bugs to find later.
  • Appending steps at the end can be even worse, because the final result is often reused multiple times later in the same scope, sometimes tens of lines away. To add a new temp variable to the end, you need to painstakingly find all usage of the current end variable and update them to the new one. Forgetting to update one is easy to do and uncatchable by ESLint. I've been bitten by this bug many times. The best solution I've found is to use an IDE refactoring to rename the last step's variable to the new step's name, but inevitably I don't remember to do this until I've already written the new step, so there's a lot of re-editing involved. With chains, it's less bug-prone and soooo much easier: just press Enter before the semicolon and start typing.
  • Inserting or appending temp vars makes code reviews harder, because the diff extends beyond the code you're adding.
  • Temp variables have no ergonomic alternative to optional chaining. You either need to use let and if statements (which is brittle, esp. with multiple chains in the same scope) or add previousVar !== undefined && to every temp variable declaration which is ugly AF (and easy to forget at peril of hard-to-find bugs) or you need to add huge numbers of optional chaining operators downstream.
  • Temp variables aren't as IDE-friendly. With a chain, once you press the . your IDE tells you exactly what's possible at that position. Tabbing >>> typing.
  • Anything you have to name invites nitpicky bikeshedding complaints that waste everyone's time.
  • Names often need to be refactored later if the underlying code changes, e.g. removeBlue => onlyGreen.

To be fair, there is one case where temp variables shine: debugging. It's vastly easier to debug separate statements than to deal with intra-statement results. I'll sometimes decompose chains for this reason alone. This IMO is something to keep in mind for the design & implementation of Pipeline. For example, if Pipeline is implemented under the covers as creating non-observable temp variables, then ideally IDE and browser debuggers would be able to view those temp vars even if userland code cannot. (Kinda like debuggers can view the return value of a function when stepping through.)

BTW, while writing the previous paragraph, I realized that Pipeline would enable a nice pattern for logging inside a chain:

return Object.keys(state.availableCommands)
  .filter((key) => key.endsWith('CALIBRATE'))
  |> console.log(`filtered: ${JSON.stringify(%)}`) ?? %
  .map((key) => [key, state.availableCommands[key]]
  |> Object.fromEntries(%) as GcodeCommands);

This pattern would be ESLint-friendly, so you'd get an ESLint error if you put a console.* inside a pipeline but don't return % from that step.

  • In your opinion (and other developers in your company), would rewriting the snippets above to use temporary variables make their readability better or worse than the current, deeply nested subexpressions within Object.entries?

Better. Decomposition is always more readable, especially with names that serve as documentation for each step. But readability is just one consideration, and with temp variables the added readability comes at a high cost in programmer time, bugs, and ongoing maintenance. Pipeline also makes code more readable, but without the laundry list of problems noted above.

I think the high-order bit is to ask a different question: despite the availability of temp variables, why do developers persist in NOT using them for pipelines, instead reaching for known-awful solutions like deep nesting or JS's worst API reduce? Having TC39 proclaim "developers should use temp variables" seems unlikely to change anyone's behavior, so it doesn't really solve the problem that Pipeline is trying to solve.

justingrant avatar May 29 '25 22:05 justingrant

Another constructor/factory use case I was remind of today is conversion from BigInt to Number. Here's one example where shaving a few characters prevents Prettier from wrapping to multiple lines, making the already-less-readable non-pipeline code even worse.

const microseconds = lastSeen
  ? Number(
      Temporal.Instant.from(lastSeen).add({ minutes: 1 }).epochNanoseconds /
        1000n
    )
  : undefined;

With pipeline:

const microseconds = lastSeen
  ? Temporal.Instant.from(lastSeen).add({ minutes: 1 }).epochNanoseconds / 1000n
      |> Number(%)
  : undefined;

(I'm assuming that operator precedence puts pipeline after division.)

justingrant avatar Jun 06 '25 04:06 justingrant