[bug] circleci tests split can't handle many files

Open sibelius opened this issue 3 years ago • 1 comments

  • [x] I have read Contribution Guidelines.
  • [x] I have checked for similar issues and haven't found anything relevant.
  • [x] This is not a security issue (which should be reported here: https://circleci.com/security/)

Do you want to request a feature or report a bug? bug

What is the current behavior? circleci tests split breaks like this

Error: failed to read input: bufio.Scanner: token too long

Can you provide an example? We get a list of changed files (files_changed) based on the base branch and the compare URL. Then we get a list of test files to run, and we split those test files to run them in parallel.

we have an example here https://github.com/sibelius/monorepo-101

FILES_CHANGED=$(yarn --silent entria-deploy changes ${CIRCLE_COMPARE_URL} --baseRef ${BASE_REF})
echo $FILES_CHANGED
TESTFILES=$(yarn --silent jest --findRelatedTests --listTests $FILES_CHANGED)
TESTFILES_SPLITTED=$(echo $TESTFILES | circleci tests split | xargs -n 1 echo)

What is the expected behavior? It should not break, it should split well

Which version of the CLI and OS are you using? Did this work in previous versions? 0.1.8599+d6e83e7

Please provide the output of circleci version and circleci diagnostic.

If you have any questions, feel free to ping us at @CircleCI-Public/x-team.

This Stack Overflow question may help: https://stackoverflow.com/questions/21124327/how-to-read-a-text-file-line-by-line-in-go-when-some-lines-are-long-enough-to-ca

sibelius avatar Jul 08 '20 14:07 sibelius

any fix for this?

sibelius avatar Apr 11 '22 13:04 sibelius

Hello,

First, really sorry for getting back to this so late! This issue does not originate in this repo: the circleci CLI that you can use in CircleCI jobs is actually a different CLI whose implementation is internal. We could increase the buffer size there, but I still have a question: can you provide the content of the TESTFILES variable from your example? The reason I ask is that we use bufio.Scanner to read stdin line by line, and the scanner can read lines with a maximum length of 64 * 1024 bytes. Do you have a file path that exceeds this limit?
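One way to answer that question without pasting the whole variable is to measure the longest line that actually reaches stdin. The sketch below is only an illustration: the helper name longest_line and the two sample paths are made up, and treating exactly 64 * 1024 bytes as the cutoff is an assumption based on the limit quoted above.

```shell
# Print the byte length of the longest line read from stdin.
longest_line() {
  LC_ALL=C awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# 64 * 1024 bytes, the scanner limit described above.
LIMIT=$((64 * 1024))

# Hypothetical stand-in for the real list; in a job, TESTFILES comes from jest.
TESTFILES='/project/a.spec.ts
/project/b.spec.ts'

# Unquoted on purpose, to mirror the pipeline in the report.
longest=$(echo $TESTFILES | longest_line)

if [ "$longest" -ge "$LIMIT" ]; then
  echo "longest line is $longest bytes: at or over the limit"
else
  echo "longest line is $longest bytes: under the limit"
fi
```

Running the same measurement on the real TESTFILES in the failing job would show whether a single line is anywhere near the limit.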

JulesFaucherre avatar Nov 22 '23 16:11 JulesFaucherre

our monorepo is huge

sibelius avatar Nov 22 '23 16:11 sibelius

Insisting a bit here, but are you sure you are using newlines? The maximum Unix file path length is supposed to be 4096 bytes and Windows' is 32 KB, so I would really like to see a path that exceeds 64 KB.

JulesFaucherre avatar Nov 23 '23 10:11 JulesFaucherre

Hi @sibelius :wave:

Looking into your example repo, I can see that the list of TESTFILES is created with jest --findRelatedTests. The output of this command is indeed separated by newlines.

$ npx jest --findRelatedTests --listTests $(find ./packages -name index.ts)
/project/packages/packageD/src/__tests__/packageD.spec.ts
/project/packages/packageC/src/__tests__/packageC.spec.ts
/project/packages/packageB/src/__tests__/packageB.spec.ts
/project/packages/packageA/src/__tests__/packageA.spec.ts

However, the output of this command is stored in a shell variable, which is then expanded unquoted as arguments to echo, the output of which is piped into circleci tests split's standard input. On the circleci/node:lts image that you seem to be using, echo points to bash's builtin echo, which (according to help echo):

    Display the ARGs, separated by a single space character and followed by a
    newline, on the standard output.

In effect, using echo as you do removes all newlines and replaces them with a single space character before passing to circleci tests split.

$ TESTFILES=$(npx jest --findRelatedTests --listTests $(find ./packages -name index.ts))

$ echo $TESTFILES
/project/packages/packageD/src/__tests__/packageD.spec.ts /project/packages/packageC/src/__tests__/packageC.spec.ts /project/packages/packageB/src/__tests__/packageB.spec.ts /project/packages/packageA/src/__tests__/packageA.spec.ts

The circleci command then expects this very long line to be a single path including spaces. In other words, since the test files are not separated by newlines, you would be getting no benefit from splitting the tests anyway.
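A quick way to confirm this is to count what the unquoted echo emits. The sketch below reuses the four paths from the example output above and assumes bash (or another POSIX shell) with the default IFS; the variable names line_count and word_count are only for illustration.

```shell
# The four test paths from the example above, one per line.
TESTFILES='/project/packages/packageD/src/__tests__/packageD.spec.ts
/project/packages/packageC/src/__tests__/packageC.spec.ts
/project/packages/packageB/src/__tests__/packageB.spec.ts
/project/packages/packageA/src/__tests__/packageA.spec.ts'

# Unquoted on purpose: the shell splits the value into four arguments,
# and echo joins them with spaces on a single line.
line_count=$(echo $TESTFILES | wc -l | tr -d '[:space:]')
word_count=$(echo $TESTFILES | wc -w | tr -d '[:space:]')

echo "lines: $line_count, paths: $word_count"
# prints "lines: 1, paths: 4"
```

One line carrying four paths is what circleci tests split receives, and with a large enough monorepo that single line grows past the scanner limit.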

The solution here would be to quote the variable in your echo command. That way, echo receives the whole contents of the variable as a single argument, newlines included, instead of a list of separate arguments that the shell has split at each newline.

$ echo "$TESTFILES"
/project/packages/packageD/src/__tests__/packageD.spec.ts
/project/packages/packageC/src/__tests__/packageC.spec.ts
/project/packages/packageB/src/__tests__/packageB.spec.ts
/project/packages/packageA/src/__tests__/packageA.spec.ts
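Applied to the pipeline from the report, the fix could look like the sketch below. The wrapper name split_tests is hypothetical, and the function only works where the circleci CLI from CircleCI jobs is on the PATH; printf is used here instead of echo purely as a matter of taste, since both preserve the newlines once the variable is quoted.

```shell
# Sketch only: pipes a newline-separated list to the real `circleci tests split`,
# so it needs the CLI that is available inside CircleCI jobs.
split_tests() {
  # "$1" is quoted, so the newlines between paths survive;
  # printf writes the list followed by one trailing newline.
  printf '%s\n' "$1" | circleci tests split
}

# Usage inside a job, with TESTFILES built as in the report:
#   TESTFILES_SPLITTED=$(split_tests "$TESTFILES")
```

With one path per line, each line stays far below the scanner limit and the split can actually distribute the files across parallel containers.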

Let me know if this helps!

loderunner avatar Dec 04 '23 16:12 loderunner

good enough for me

sibelius avatar Dec 04 '23 16:12 sibelius