cucumber-jvm
High Overhead in FileFeatureOrigin Caused by Repeated Canonical File Resolution
🤔 What's the problem you're trying to solve?
In configurations with a significant number of tests, the test discovery time is noticeably long. In our project setup:
- 200+ feature files
- 6000+ containers
- 20000+ tests
- Runtime environment: Windows (might behave differently on Linux or other Windows file systems)
- Java version: 21

On development machines running Windows, it takes approximately 60–70 seconds to discover features and start test execution. The same delay occurs when the tests are filtered by tags.
Upon analysis, the slowdown seems to be caused by the FileFeatureOrigin class calling FileSource, which in turn spends most of the time in the file.getCanonicalFile() method:
- [FeatureOrigin.java#L90](https://github.com/cucumber/cucumber-jvm/blob/731f3999958a5935df1309869c6507e40a686f9a/cucumber-junit-platform-engine/src/main/java/io/cucumber/junit/platform/engine/FeatureOrigin.java#L90)
- [FileSource.java#L73](https://github.com/junit-team/junit5/blob/7d02a6a2521617ab8502f8614794b7bb0a568603/junit-platform-engine/src/main/java/org/junit/platform/engine/support/descriptor/FileSource.java#L73)
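
To check in isolation whether canonical resolution is the dominant cost on a given machine, a rough timing sketch like the one below may help. The class name `CanonicalFileTiming`, the method `timeCanonicalCalls`, and the default file name are illustrative assumptions and are not part of cucumber-jvm or JUnit. It is not a rigorous benchmark, and the numbers will depend on the OS and file system.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical micro-measurement; all names here are illustrative only.
final class CanonicalFileTiming {

    private CanonicalFileTiming() {}

    // Returns the elapsed nanoseconds for `calls` canonical resolutions of one file.
    static long timeCanonicalCalls(File file, int calls) {
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            try {
                file.getCanonicalFile();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Pass a real feature file path as the first argument for a meaningful number.
        File feature = new File(args.length > 0 ? args[0] : "example.feature");
        int calls = 20_000; // roughly the number of tests reported in this issue
        long nanos = timeCanonicalCalls(feature, calls);
        System.out.printf("%d calls took %d ms%n", calls, nanos / 1_000_000);
    }
}
```

Running it against one of the project's feature files on both Windows and Linux would show whether the per-call cost differs as suspected.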
✨ What's your proposed solution?
We tried two approaches to reduce the discovery time:
- Parallel creation of `FeatureDescriptions`
  By first collecting all features and then creating `FeatureDescriptions` in parallel, we reduced processing time to ~20 seconds. Related code: [FeatureResolver.java#L250](https://github.com/cucumber/cucumber-jvm/blob/731f3999958a5935df1309869c6507e40a686f9a/cucumber-junit-platform-engine/src/main/java/io/cucumber/junit/platform/engine/FeatureResolver.java#L250)
- Custom `FileSource` implementation with caching
  We implemented a custom `FileSource` that maintains a static cache (using Guava `memoize`, but even a `HashMap` would suffice) to avoid repeated calls to `getCanonicalFile()`. This reduced feature discovery time to ~5 seconds, a significant improvement for frequent test runs.
This caching works as long as there's a one-to-many relationship between files and test cases.
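
A minimal sketch of the caching idea, combined with parallel resolution, could look like the following. It is not our actual implementation: `CanonicalFileCache`, `canonical`, `canonicalAll`, and the `RESOLUTIONS` counter are hypothetical names, and a `ConcurrentHashMap` stands in for Guava so that the cache stays safe when descriptors are created in parallel. How the cached value would be wired into a custom `FileSource` is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical helper; not part of cucumber-jvm or JUnit.
final class CanonicalFileCache {

    // Static cache keyed by the absolute (not canonical) file.
    private static final ConcurrentMap<File, File> CACHE = new ConcurrentHashMap<>();

    // Counts real file-system resolutions, only to make the effect observable.
    static final AtomicInteger RESOLUTIONS = new AtomicInteger();

    private CanonicalFileCache() {}

    // Returns the canonical file, resolving it at most once per distinct key.
    static File canonical(File file) {
        return CACHE.computeIfAbsent(file.getAbsoluteFile(), CanonicalFileCache::resolve);
    }

    private static File resolve(File file) {
        RESOLUTIONS.incrementAndGet();
        try {
            return file.getCanonicalFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Resolves many (possibly repeated) files in parallel, sharing the cache.
    static List<File> canonicalAll(List<File> files) {
        return files.parallelStream()
                .map(CanonicalFileCache::canonical)
                .collect(Collectors.toList());
    }
}
```

Because thousands of tests point at only a few hundred feature files, almost every lookup becomes a map hit. The trade-off is that entries are never invalidated, so a file that is moved or re-linked while the JVM is running would keep its stale canonical path.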
🔄 Have you considered any alternatives or workarounds?
Selecting only specific feature files when running tests. This does not help when the full set of scenarios has to run.
📚 Any additional context?
This may be relevant to other users with large test suites, especially on Windows, and could potentially be generalized into a configurable optimization in the cucumber-jvm engine.