Aura.Router icon indicating copy to clipboard operation
Aura.Router copied to clipboard

Convert routes to a node tree to improve route matching perfomance

Open marcelthole opened this issue 7 months ago • 13 comments

Hey, we added this router in one of our projects and noticed a perfomance problem with this change.

With this PR i introduced a TreeNode array for the Map that will put all possible routes per route segment. This will reduce the amount of Routes that have to be checked to the routes that match at least in the segment names. The specific matching with HTTP Verb, and validating the dynamic properties in the URL will be applied as before for the matched routes.

I also created this phpbench File to compare the Results:

MatchBench.php

<?php

namespace Aura\Router\Benchmark;

use Aura\Router\RouterContainer;
use GuzzleHttp\Psr7\ServerRequest;
use Psr\Http\Message\ServerRequestInterface;

/**
 * @BeforeMethods("setUp")
 */
class MatchBench
{
    /** @var RouterContainer $container */
    private $container;

    public function setUp()
    {
        $this->container = new RouterContainer();
        $map = $this->container->getMap();

        foreach ($this->routesProvider() as $key => $route) {
            $map->get($key, $route, static function () use ($key) { return $key; });
        }
    }

    /**
     * @Iterations(3)
     */
    public function benchMatch()
    {
        $matcher = $this->container->getMatcher();
        foreach ($this->routesProvider() as $route) {
            $result = $matcher->match($this->stringToRequest($route));
            if ($result === false) {
                throw new \RuntimeException(sprintf('Expected route "%s" to be an match', $route));
            }
        }
    }

    private function routesProvider()
    {
        $segments = ['one', 'two', 'three', 'four', 'five', 'six'];
        $routesPerSegment = 100;

        $routeSegment = '';
        foreach ($segments as $index => $segment) {
            $routeSegment .= '/' . $segment;
            for ($i = 1; $i <= $routesPerSegment; $i++) {
                yield $index . '-' . $i => $routeSegment . $i;
            }
        }
    }

    /**
     * @param string $url
     * @return ServerRequestInterface
     */
    private function stringToRequest($url)
    {
        return new ServerRequest('GET', $url, [], null, '1.1', []);
    }
}

Result:

/var/www # vendor/bin/phpbench run tests/Benchmark/ --ref=current
PHPBench (1.4.1) running benchmarks... #standwithukraine
with configuration file: /var/www/phpbench.json
with PHP version 8.4.1, xdebug ✔, opcache ❌
comparing [actual vs. current]

\Aura\Router\Benchmark\MatchBench

    benchMatch..............................I2 - [Mo828.080ms vs. Mo1.734s] -52.25% (±0.95%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0

i also played with some caching for the generated tree route and got this result. But this is not part of this PR. i would create a followup PR for this if this PR is fine for you.

/var/www # vendor/bin/phpbench run tests/Benchmark/ --ref=current
PHPBench (1.4.1) running benchmarks... #standwithukraine
with configuration file: /var/www/phpbench.json
with PHP version 8.4.1, xdebug ✔, opcache ❌
comparing [actual vs. current]

\Aura\Router\Benchmark\MatchBench

    benchMatch..............................I2 - [Mo17.109ms vs. Mo1.734s] -99.01% (±1.49%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0

I added this Benchmark here in the PR description because PHPBench does not support PHP 5.5 anymore.

Summary by CodeRabbit

  • New Features
    • Improved route matching performance by organizing routes into a hierarchical tree structure based on path segments.
    • Added a comprehensive benchmarking script to measure route matching speed and memory usage across various route types and counts.
  • Bug Fixes
    • Enhanced handling of non-routable routes to ensure they are correctly excluded during matching.
  • Tests
    • Added benchmark tests to measure performance of route generation and matching.
    • Introduced new tests to verify the correctness of the tree-based route organization and matching logic.

marcelthole avatar Apr 10 '25 09:04 marcelthole

Hey @marcelthole

I have noticed this PR, but not sure how the phpbench was checking the current code ( in 3.x branch ) vs the new code in the PR.

@koriym @pmjones what are your thoughts on this ?

harikt avatar Apr 18 '25 05:04 harikt

Another option you could do is create a different matcher class and use it for now.

Optionally : We can add support of passing different Matcher to the constructor of the RouteContainer.

harikt avatar Apr 18 '25 06:04 harikt

To the phpbench topic: In this case i created the bench Testfile and stored the result and compared it later with this patch. https://phpbench.readthedocs.io/en/latest/guides/regression-testing.html#comparison For later steps it could be also done via github workflow to create the branch in the target branch and after that run the benchmark again with the changed files.

And yes for now i could use my own Matcher for that in our codebase. But if this feature could improve the perfomance of the router for larger route configurations it would be a benefit for other people too :)

marcelthole avatar Apr 18 '25 11:04 marcelthole

I believe this https://github.com/auraphp/Aura.Router/pull/207 will improve the performance. Have you checked this ?

I will wait for others input for this PR for the current time.

harikt avatar Apr 18 '25 12:04 harikt

#207 will only help in the case where i would generate a route if i saw it correctly. That has no impact on the bootstrapping time to match a route.

But did also a Bench for that change:


/**
 * @BeforeMethods("setUp")
 */
class GeneratorBench
{
    /**
     * @var Generator
     */
    private $generator;

    public function setUp()
    {
        $map = new Map(new Route());
        foreach ($this->routesProvider() as $key => $route) {
            $map->get($key, $route, static function () use ($key) { return $key; });
        }

        $map->get('dummy', '/api/user/{id}/{action}/{controller:[a-zA-Z][a-zA-Z0-9_-]{1,}}{/param1,param2}');
        $this->generator = new Generator($map);
    }


    private function routesProvider()
    {
        $segments = ['one', 'two', 'three', 'four', 'five', 'six'];
        $routesPerSegment = 100;

        $routeSegment = '';
        foreach ($segments as $index => $segment) {
            $routeSegment .= '/' . $segment;
            for ($i = 1; $i <= $routesPerSegment; $i++) {
                yield $index . '-' . $i => $routeSegment . $i;
            }
        }
    }


    /**
     * @Revs(1000)
     * @Iterations (10)
     */
    public function benchMatch()
    {
        $this->generator->generate('dummy', [
            'id' => 1,
            'action' => 'doSomethingAction',
            'controller' => 'My_User-Controller1',
            'param1' => 'value1',
            'param2' => 'value2',
        ]);
    }
}
\Aura\Router\Benchmark\GeneratorBench

    benchMatch..............................I9 - [Mo8.025μs vs. Mo8.353μs] -3.92% (±3.11%)

Yes it will be a little bit faster, but will not help me at all. But for now i added a own Implementation for that in our codebase and added some unit tests for the changes aswell

marcelthole avatar Apr 23 '25 12:04 marcelthole

@coderabbitai review

koriym avatar Apr 25 '25 13:04 koriym

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai[bot] avatar Apr 25 '25 13:04 coderabbitai[bot]

Walkthrough

This update introduces a hierarchical tree-based route organization and matching mechanism to the routing system. The Map class now provides a method to represent routes as a tree keyed by path segments, optimizing route lookup. The Matcher class leverages this tree to filter candidate routes before applying matching rules, improving efficiency. Benchmark tests are added to measure the performance of both route generation and matching. Additional unit tests verify the correctness of the new tree structure and matching logic, including handling of parameterized and optional segments, as well as routability constraints. A new comprehensive benchmark script was also added to evaluate matching speed and memory usage across various route types and counts.

Changes

File(s) Change Summary
src/Map.php Added getAsTreeRouteNode() method to convert routes into a hierarchical tree keyed by path segments, normalizing parameter patterns and handling optional segments.
src/Matcher.php Modified match method to use a tree-based filtering of routes via new getMatchedTree() method; updated requestRoute signature and logic to improve clarity and correctness.
tests/Benchmark/GeneratorBench.php Introduced benchmark class to measure URL generation performance with complex and numerous routes.
tests/Benchmark/MatchBench.php Introduced benchmark class to measure performance of route matching over a large, hierarchical route set.
tests/MapTest.php Added test for getAsTreeRouteNode() to verify correct tree structure and exclusion of non-routable routes.
tests/MatcherTest.php Added test for matching using the new tree-based approach; updated imports and logger expectations.
bench.php Added a comprehensive benchmarking script for route matching performance and memory usage, supporting multiple route types and baseline comparisons.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Matcher
    participant Map

    Client->>Matcher: match(request)
    Matcher->>Matcher: getMatchedTree(path)
    Matcher->>Map: getAsTreeRouteNode() (cached)
    Matcher->>Matcher: Filter candidate routes by path segments
    Matcher->>Matcher: Iterate and apply rules to candidates
    Matcher-->>Client: Return matched route or false

Poem

🐇
In tunnels of code, routes branch like a tree,
Paths and parameters, sorted cleverly.
Matching is swift, with segments in line,
Benchmarks now sparkle, performance divine!
Tests hop along, ensuring it’s right—
Routing’s more nimble, from morning to night.
—A rabbit in the router’s delight!

✨ Finishing Touches
  • [ ] 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

coderabbitai[bot] avatar Apr 25 '25 13:04 coderabbitai[bot]

[!NOTE] Generated docstrings for this pull request at https://github.com/auraphp/Aura.Router/pull/211

coderabbitai[bot] avatar May 02 '25 04:05 coderabbitai[bot]

Sorry, I pushed the benchmark script here by mistake, but I revert it.

koriym avatar May 02 '25 16:05 koriym

I have done an AI benchmarking program and analysis. I think each is reasonable. Aura.Router Performance Benchmark Analysis for PR#208

koriym avatar May 02 '25 16:05 koriym

ah i guess i also found a bug here. I need to add a test for routings like /api/export/country{/countryId}.csv that this will match /api/export/country.csv and /api/export/country/1.csv (same behavior for /api/export.{format} for /api/export.csv and /api/export.jsonl

  • [ ] Add tests for the two examples

marcelthole avatar May 09 '25 06:05 marcelthole

@marcelthole did you noticed @pmjones comment ? If not https://github.com/auraphp/Aura.Router/pull/208#discussion_r2072398254

harikt avatar May 09 '25 07:05 harikt