tailwind-merge Add performance test

Discussed in https://github.com/dcastil/tailwind-merge/discussions/252

^{Originally posted by tordans June 15, 2023} It would be awesome to have the performance test that clsx uses run for twMerge and twJoin and have a page in the repo that shows the results.

Benchmark results by clsx https://github.com/lukeed/clsx/tree/master/bench
Benchmark linked by clsx and classnames https://github.com/JedWatson/classnames/tree/main/benchmarks

I found existing references to performance, but in the end they do not really tell me how the libraries compare.

https://github.com/dcastil/tailwind-merge/discussions/243 says "349 calls to twMerge with a total of 5.1 ms execution time." but how does this compare the the clsx/classnames benchmar?
https://github.com/dcastil/tailwind-merge/discussions/164 explains very well why it is fast, but not how fast
https://github.com/dcastil/tailwind-merge/blob/main/docs/features.md#performance does the same (and probably should reference twJoin as well)

Jun 15 '23 20:06 dcastil

Nice to haves would be:

Compare at least twMerge, twJoin and clsx
Test scenario should be somewhat realistic with some repetition and many unique calls like on app startup
Post performance benchmark on every new PR, ideally with diff to main branch as well
Link to benchmark from docs

Jun 15 '23 20:06 dcastil

twMerge is incredibly useful and it perfectly solved tailwind utility classes conflict issues. however my big concern is performance. I do a simple test with below code. it took over 2ms which is huge. as a comparison, using classNames is almost 0.

  const cn1 =
    "flex w-full justify-between rounded-lg bg-purple-100 px-4 py-2 text-left text-sm font-medium text-purple-900 hover:bg-purple-200 focus:outline-none focus-visible:ring focus-visible:ring-purple-500 focus-visible:ring-opacity-75";
  const cn2 =
    "flex w-full justify-between rounded-lg bg-purple-100 px-4 py-2 text-left text-sm font-medium text-purple-900 hover:bg-purple-200 focus:outline-none focus-visible:ring focus-visible:ring-purple-500 focus-visible:ring-opacity-75";
  const cn3 = "flex items-center w-full justify-between tracking-0_2 hover:bg-gray-500 cursor-pointer px-16 px-24";  
let start = performance.now();
  twMerge(
    cn1,
    cn2,
    cn3,
  );
  let end = performance.now();
  console.log(`Execution time: ${end - start} ms`);
//classNames test
  classNames(
    cn1,
    cn2,
    cn3,
  );

for a big page that can easily have thousands of DOM nodes, that will be slow. due to the performance concern, we can not recommend twMerge in our project.

Jun 22 '23 23:06 larry-cedar

Hey @larry-cedar! 👋

tailwind-merge computes a large data structure on the first call to twMerge which it caches for future calls which makes the first call probably 1000x slower than subsequent calls (you can read about it here). Therefore comparing a single twMerge call with a single classNames call doesn't really make sense. If you do a simple comparison like that, I'd suggest to do more calls, e.g. 1000.

  const cn1 =
    "flex w-full justify-between rounded-lg bg-purple-100 px-4 py-2 text-left text-sm font-medium text-purple-900 hover:bg-purple-200 focus:outline-none focus-visible:ring focus-visible:ring-purple-500 focus-visible:ring-opacity-75";
  const cn2 =
    "flex w-full justify-between rounded-lg bg-purple-100 px-4 py-2 text-left text-sm font-medium text-purple-900 hover:bg-purple-200 focus:outline-none focus-visible:ring focus-visible:ring-purple-500 focus-visible:ring-opacity-75";
  const cn3 = "flex items-center w-full justify-between tracking-0_2 hover:bg-gray-500 cursor-pointer px-16 px-24";

  let twMergeStart = performance.now();
  for (let i = 0; i < 1000; i++) {
    twMerge(
      cn1,
      cn2,
      cn3,
    );
  }
  let twMergeEnd = performance.now();
  console.log(`twMerge execution time: ${twMergeEnd - twMergeStart} ms`);

  let classNamesStart = performance.now();
  for (let i = 0; i < 1000; i++) {
    classNames(
      cn1,
      cn2,
      cn3,
    );
  }
  let classNamesEnd = performance.now();
  console.log(`classNames execution time: ${classNamesEnd - classNamesStart} ms`);

But that comparison is still not very realistic because twMerge calls 2-1000 will all be cache hits. You get a better comparison when you call twMerge often with different inputs.

I wrote about a realistic test case here: https://github.com/dcastil/tailwind-merge/discussions/243#discussioncomment-6120314

Results of reloading the app running this code in development mode with Vite on a MacBook Pro 16" with a M1 Pro chip (from 2021) in the Chrome browser with devtools open:

349 calls to twMerge with a total of 5.1 ms execution time.

Jun 23 '23 11:06 dcastil

Hey @dcastil thanks for your reply. I see twMerge cache the computation. but still the initial rendering will be very slow. I am trying to solve this problem, but TBH, i don't have better idea on how to make the first call faster.

Jun 23 '23 15:06 larry-cedar

There's no way of making the first call faster at the moment unfortunately. You can check out some alternatives to tailwind-merge here: https://github.com/dcastil/tailwind-merge/blob/v1.13.2/docs/when-and-how-to-use-it.md#alternatives

Jun 24 '23 13:06 dcastil

Research:

What I want

In every PR, post a comment with benchmark
Benchmark comment should include perf results of PR branch, base branch and their diff
Comment should update when further commits are pushed to PR
Ideally ability to rerun benchmark on PR because of fluctuation in results over time

Github Action

Possible benchmark action: https://github.com/benchmark-action/github-action-benchmark
- Can post alerts, but doesn't seem to be meant to always post comments in PR with diff to base
I could build my own according to https://github.com/andresz1/size-limit-action/blob/dd31dce7dcc72a041fd3e49abf0502b13fc4ce05/src/main.ts
Blog article where someone built one themselves: https://werat.dev/blog/running-benchmarks-for-pull-requests-via-github-actions/
Inspiration: https://github.com/boa-dev/criterion-compare-action
- Warns that benchmarks on GitHub Actions fluctuate as load on GitHub Actions does
Benchmark results might be misleading on CI because of fluctuating load

Data

Code to extract twMerge calls from app:

export const twMergeInternal = extendTailwindMerge({ /* … */ })

const twMergeCalls: any[] = []

export function twMerge() {
    twMergeCalls.push(Array.from(arguments))

    return twMergeInternal.apply(null, arguments as any)
}

window.createPerfDataJson = () => {
    console.log(
        JSON.stringify(twMergeCalls, (key, value) => {
            if (value === undefined) {
                return null
            }

            return value
        }),
    )
}

Maybe I should also extract twJoin calls to perf test those.

Jul 09 '23 13:07 dcastil

Interesting article checking whether GitHub Action runners are good enough for benchmarking: https://labs.quansight.org/blog/2021/08/github-actions-benchmarks

Jun 22 '24 12:06 dcastil

@rortan134 created a benchmark in https://github.com/rortan134/tailwind-merge/blob/benchmark/tests/benchmark.bench.ts as part of https://github.com/dcastil/tailwind-merge/pull/445 which I could use.

Full code:

import { benchmarkSuite } from 'jest-bench' // v^29.7.1

import { createTailwindMerge, createTailwindMerge2 } from '../src'

const tailwindMerge = createTailwindMerge(() => ({
    cacheSize: 20,
    separator: ':',
    theme: {},
    classGroups: {
        fooKey: [{ fooKey: ['bar', 'baz'] }],
        fooKey2: [{ fooKey: ['qux', 'quux'] }, 'other-2'],
        otherKey: ['nother', 'group'],
    },
    conflictingClassGroups: {
        fooKey: ['otherKey'],
        otherKey: ['fooKey', 'fooKey2'],
    },
    conflictingClassGroupModifiers: {},
}))
const tailwindMerge2 = createTailwindMerge2(() => ({
    cacheSize: 20,
    separator: ':',
    theme: {},
    classGroups: {
        fooKey: [{ fooKey: ['bar', 'baz'] }],
        fooKey2: [{ fooKey: ['qux', 'quux'] }, 'other-2'],
        otherKey: ['nother', 'group'],
    },
    conflictingClassGroups: {
        fooKey: ['otherKey'],
        otherKey: ['fooKey', 'fooKey2'],
    },
    conflictingClassGroupModifiers: {},
}))

benchmarkSuite('createTailwindMerge', {
    setup() {},
    'v1': () => {
        tailwindMerge('my-modifier:fooKey-bar my-modifier:fooKey-baz')
    },
    'v2': () => {
        tailwindMerge2('my-modifier:fooKey-bar my-modifier:fooKey-baz')
    },
})

Jul 24 '24 19:07 dcastil

I think it will be useful to setup codspeed to measure performance changes over PRs, like in oxc I can create a PR

Aug 04 '24 16:08 XantreDev

@XantreDev that would be nice indeed! I didn't know of Codspeed and really like how they solve the problem with fluctuating performance on shared vCPUs. Feel free to submit a PR. I can set up an account on CodSpeed.

Here is a data set from real twMerge call arguments in an app you can test against: perf-test-data.json

Aug 11 '24 10:08 dcastil

I just created an account on Codspeed and added the Codspeed token to the repository secrets. It should be possible to access it as ${{ secrets.CODSPEED_TOKEN }} in a GitHub action workflow.

Aug 11 '24 10:08 dcastil

I just created an account on Codspeed and added the Codspeed token to the repository secrets. It should be possible to access it as ${{ secrets.CODSPEED_TOKEN }} in a GitHub action workflow.

Thanks. I will implement it soon

Aug 11 '24 12:08 XantreDev