deno_std icon indicating copy to clipboard operation
deno_std copied to clipboard

refactor(yaml): make yaml string based

Open timreichen opened this issue 6 months ago • 4 comments

This PR makes the yaml parser string based where possible. This is easier to read and allows for simplifications in future PRs. It seems to have only a tiny impact on performance: Current:

benchmark                                                                         time/iter (avg)        iter/s      (min … max)           p75      p99     p995
--------------------------------------------------------------------------------- ----------------------------- --------------------- --------------------------
parse() handles single document yaml string                                                4.8 µs       207,800 (  4.5 µs … 174.0 µs)   4.7 µs   6.0 µs   8.0 µs
parseAll() handles yaml string with multiple documents                                     8.1 µs       122,700 (  7.6 µs … 170.5 µs)   8.0 µs  10.3 µs  19.6 µs
parse() throws with `!!js/*` yaml types with default schemas                              13.0 µs        77,080 ( 10.4 µs …   9.9 ms)  11.5 µs  43.7 µs  72.3 µs
parse() handles `!!js/*` yaml types with extended schema while parsing                     6.4 µs       155,100 (  6.3 µs …   7.0 µs)   6.4 µs   7.0 µs   7.0 µs
parse() throws with `!!js/function` yaml type with extended schema                        12.3 µs        81,570 ( 11.0 µs … 331.2 µs)  12.5 µs  19.1 µs  29.2 µs
parseAll() accepts parse options                                                           5.7 µs       174,100 (  5.7 µs …   6.1 µs)   5.8 µs   6.1 µs   6.1 µs
parse() handles __proto__                                                                 30.3 ms          33.0 ( 29.5 ms …  32.8 ms)  30.8 ms  32.8 ms  32.8 ms
parse() returns `null` when yaml is empty or only comments                               833.3 ns     1,200,000 (814.3 ns … 885.3 ns) 834.8 ns 885.3 ns 885.3 ns
parse() handles binary type                                                               38.0 µs        26,330 ( 34.1 µs … 369.6 µs)  36.5 µs  85.0 µs  96.5 µs
parse() handles boolean types                                                             16.7 µs        60,050 ( 15.8 µs … 171.3 µs)  16.4 µs  24.8 µs  44.4 µs
parse() handles float types                                                                8.4 µs       118,900 (  8.0 µs … 160.3 µs)   8.3 µs   9.0 µs  10.3 µs
parse() handles integer types                                                             74.0 µs        13,520 ( 71.2 µs … 310.5 µs)  73.6 µs 103.8 µs 109.6 µs
parse() handles timestamp types                                                           31.1 µs        32,180 ( 29.7 µs … 266.0 µs)  31.0 µs  36.6 µs  61.7 µs
parse() handles omap type                                                                 54.0 µs        18,500 ( 51.7 µs … 190.8 µs)  53.6 µs  76.8 µs  88.3 µs
parse() handles !!pairs type                                                              30.9 µs        32,410 ( 29.4 µs … 198.7 µs)  30.6 µs  40.0 µs  62.1 µs
parse() handles anchors and aliases                                                       43.8 µs        22,850 ( 41.5 µs … 186.3 µs)  43.2 µs  79.9 µs  94.1 µs
parse() handles escaped strings in double quotes                                          22.7 µs        44,130 ( 21.3 µs … 291.5 µs)  22.5 µs  27.1 µs  40.0 µs
parse() handles single quoted scalar                                                      20.2 µs        49,430 ( 19.4 µs … 169.8 µs)  20.1 µs  22.2 µs  29.2 µs
parse() handles %YAML directive                                                           61.6 µs        16,230 ( 55.8 µs …   1.4 ms)  61.1 µs  81.2 µs  90.0 µs
parse() handles %TAG directive                                                            37.9 µs        26,360 ( 36.2 µs …   1.2 ms)  37.6 µs  46.1 µs  55.2 µs
parse() throws with invalid strings                                                       32.2 µs        31,080 ( 30.7 µs … 202.4 µs)  32.2 µs  36.0 µs  44.4 µs
parse() handles merge (<<) types                                                          19.6 µs        50,990 ( 18.5 µs … 493.4 µs)  19.4 µs  22.8 µs  29.5 µs
parse() handles block scalars                                                             48.1 µs        20,810 ( 46.3 µs … 464.9 µs)  47.8 µs  54.9 µs  68.2 µs
parse() handles BOM at the beginning of the yaml                                           1.4 µs       715,100 (  1.4 µs …   1.5 µs)   1.4 µs   1.5 µs   1.5 µs
parse() throws if there are more than one document in the yaml                             7.6 µs       131,000 (  7.6 µs …   7.8 µs)   7.6 µs   7.8 µs   7.8 µs
parse() throws when the directive name is empty                                            8.3 µs       120,600 (  8.2 µs …   8.5 µs)   8.3 µs   8.5 µs   8.5 µs
parse() ignores unknown directive                                                          9.0 µs       110,700 (  8.9 µs …   9.3 µs)   9.0 µs   9.3 µs   9.3 µs
parse() handles document separator                                                         1.5 µs       656,300 (  1.5 µs …   1.6 µs)   1.5 µs   1.6 µs   1.6 µs
parse() show warning with non-ascii line breaks if YAML version is < 1.2                  13.4 µs        74,480 ( 11.8 µs …   6.8 ms)  12.5 µs  16.6 µs  18.5 µs
parse() throws with directive only document                                                9.3 µs       107,000 (  9.2 µs …   9.6 µs)   9.4 µs   9.6 µs   9.6 µs
parse() throws with  in the middle of the document                                       11.7 µs        85,230 ( 11.2 µs … 267.5 µs)  11.7 µs  12.6 µs  13.7 µs
parse() handles complex mapping key                                                       25.9 µs        38,670 ( 25.0 µs … 349.8 µs)  25.6 µs  29.3 µs  35.0 µs
parse() handles unordered set                                                              3.9 µs       259,600 (  3.8 µs …   4.1 µs)   3.9 µs   4.1 µs   4.1 µs
parse() throws with empty mapping key                                                     11.3 µs        88,120 ( 10.8 µs … 264.0 µs)  11.2 µs  12.5 µs  19.5 µs
parse() throws on duplicate keys                                                          13.4 µs        74,860 ( 12.5 µs … 325.0 µs)  13.3 µs  17.3 µs  23.7 µs
parse() allows duplicate keys when `allowDuplicateKeys` option is set to `true`            3.3 µs       301,900 (  3.3 µs …   3.4 µs)   3.3 µs   3.4 µs   3.4 µs
parse() throws at reseverd characters '`' and '@'                                         18.8 µs        53,200 ( 18.1 µs … 301.2 µs)  18.7 µs  22.2 µs  30.2 µs
parse() handles sequence                                                                  11.7 µs        85,530 ( 11.0 µs … 340.3 µs)  11.6 µs  12.5 µs  14.1 µs
parse() handles mapping                                                                   13.5 µs        74,290 ( 12.8 µs … 356.1 µs)  13.2 µs  17.2 µs  27.8 µs
parse() handles string                                                                     1.2 µs       823,800 (  1.2 µs …   1.2 µs)   1.2 µs   1.2 µs   1.2 µs
parse() handles regexp value with extended schema option                                  38.4 µs        26,040 ( 36.8 µs … 310.8 µs)  38.1 µs  46.8 µs  55.8 µs

PR:

benchmark                                                                         time/iter (avg)        iter/s      (min … max)           p75      p99     p995
--------------------------------------------------------------------------------- ----------------------------- --------------------- --------------------------
parse() handles single document yaml string                                                5.0 µs       198,400 (  4.7 µs … 985.0 µs)   5.0 µs   6.4 µs   7.6 µs
parseAll() handles yaml string with multiple documents                                     8.3 µs       119,900 (  7.9 µs …  99.8 µs)   8.3 µs   9.3 µs  12.5 µs
parse() throws with `!!js/*` yaml types with default schemas                              11.6 µs        86,020 ( 10.4 µs …   2.4 ms)  11.4 µs  16.1 µs  22.9 µs
parse() handles `!!js/*` yaml types with extended schema while parsing                     6.8 µs       147,200 (  6.4 µs … 217.6 µs)   6.8 µs   7.5 µs   8.0 µs
parse() throws with `!!js/function` yaml type with extended schema                        12.1 µs        82,380 ( 11.3 µs … 219.2 µs)  12.3 µs  13.8 µs  16.2 µs
parseAll() accepts parse options                                                           5.9 µs       170,800 (  5.8 µs …   6.2 µs)   5.8 µs   6.2 µs   6.2 µs
parse() handles __proto__                                                                 30.2 ms          33.1 ( 29.7 ms …  31.3 ms)  30.5 ms  31.3 ms  31.3 ms
parse() returns `null` when yaml is empty or only comments                               850.1 ns     1,176,000 (823.5 ns … 888.3 ns) 853.8 ns 888.3 ns 888.3 ns
parse() handles binary type                                                               38.3 µs        26,100 ( 34.9 µs … 197.0 µs)  37.2 µs  84.7 µs  89.5 µs
parse() handles boolean types                                                             17.6 µs        56,720 ( 16.6 µs … 221.8 µs)  17.3 µs  26.6 µs  38.3 µs
parse() handles float types                                                                8.9 µs       111,900 (  8.5 µs … 138.7 µs)   8.9 µs   9.6 µs  11.5 µs
parse() handles integer types                                                             71.7 µs        13,960 ( 68.5 µs … 299.6 µs)  71.2 µs  93.1 µs 116.2 µs
parse() handles timestamp types                                                           31.4 µs        31,820 ( 30.0 µs … 225.2 µs)  31.4 µs  39.4 µs  49.2 µs
parse() handles omap type                                                                 52.0 µs        19,240 ( 49.8 µs … 186.2 µs)  51.8 µs  66.9 µs  80.4 µs
parse() handles !!pairs type                                                              31.0 µs        32,240 ( 29.6 µs … 159.6 µs)  31.0 µs  35.1 µs  48.6 µs
parse() handles anchors and aliases                                                       44.6 µs        22,440 ( 42.2 µs … 209.1 µs)  44.1 µs  73.5 µs  88.5 µs
parse() handles escaped strings in double quotes                                          22.8 µs        43,880 ( 21.4 µs … 234.4 µs)  22.7 µs  26.1 µs  38.0 µs
parse() handles single quoted scalar                                                      20.4 µs        49,020 ( 19.5 µs … 134.5 µs)  20.3 µs  22.5 µs  31.2 µs
parse() handles %YAML directive                                                           62.7 µs        15,950 ( 56.6 µs …   1.7 ms)  62.3 µs  75.3 µs  87.3 µs
parse() handles %TAG directive                                                            37.9 µs        26,370 ( 35.0 µs … 287.7 µs)  38.0 µs  44.5 µs  54.8 µs
parse() throws with invalid strings                                                       32.2 µs        31,020 ( 31.0 µs … 292.3 µs)  32.3 µs  38.1 µs  45.5 µs
parse() handles merge (<<) types                                                          20.0 µs        49,990 ( 19.0 µs … 227.6 µs)  19.9 µs  26.1 µs  34.2 µs
parse() handles block scalars                                                             49.0 µs        20,390 ( 46.6 µs … 308.2 µs)  48.7 µs  63.4 µs  75.1 µs
parse() handles BOM at the beginning of the yaml                                           1.4 µs       706,100 (  1.4 µs …   1.5 µs)   1.4 µs   1.5 µs   1.5 µs
parse() throws if there are more than one document in the yaml                             7.8 µs       128,400 (  7.7 µs …   8.0 µs)   7.8 µs   8.0 µs   8.0 µs
parse() throws when the directive name is empty                                            8.3 µs       120,900 (  8.2 µs …   8.4 µs)   8.3 µs   8.4 µs   8.4 µs
parse() ignores unknown directive                                                          9.1 µs       110,000 (  9.0 µs …   9.2 µs)   9.1 µs   9.2 µs   9.2 µs
parse() handles document separator                                                         1.6 µs       628,000 (  1.6 µs …   1.7 µs)   1.6 µs   1.7 µs   1.7 µs
parse() show warning with non-ascii line breaks if YAML version is < 1.2                  13.8 µs        72,310 ( 12.2 µs …   8.3 ms)  13.0 µs  16.1 µs  21.5 µs
parse() throws with directive only document                                                9.4 µs       106,600 (  9.2 µs …   9.7 µs)   9.4 µs   9.7 µs   9.7 µs
parse() throws with  in the middle of the document                                       12.0 µs        83,040 ( 11.5 µs … 254.2 µs)  12.0 µs  13.7 µs  20.0 µs
parse() handles complex mapping key                                                       26.7 µs        37,420 ( 25.2 µs … 362.0 µs)  26.4 µs  34.4 µs  47.5 µs
parse() handles unordered set                                                              3.9 µs       254,100 (  3.9 µs …   4.1 µs)   3.9 µs   4.1 µs   4.1 µs
parse() throws with empty mapping key                                                     11.4 µs        87,720 ( 11.0 µs … 264.1 µs)  11.4 µs  12.1 µs  12.5 µs
parse() throws on duplicate keys                                                          13.7 µs        73,140 ( 12.8 µs … 272.1 µs)  13.7 µs  14.6 µs  17.4 µs
parse() allows duplicate keys when `allowDuplicateKeys` option is set to `true`            3.4 µs       291,900 (  3.4 µs …   3.5 µs)   3.4 µs   3.5 µs   3.5 µs
parse() throws at reseverd characters '`' and '@'                                         19.0 µs        52,570 ( 18.2 µs … 274.6 µs)  19.0 µs  20.4 µs  23.4 µs
parse() handles sequence                                                                  12.0 µs        83,330 ( 11.3 µs … 332.6 µs)  11.9 µs  12.8 µs  15.2 µs
parse() handles mapping                                                                   13.8 µs        72,590 ( 13.1 µs … 364.5 µs)  13.6 µs  14.6 µs  16.9 µs
parse() handles string                                                                     1.3 µs       796,700 (  1.2 µs …   1.3 µs)   1.3 µs   1.3 µs   1.3 µs
parse() handles regexp value with extended schema option                                  38.5 µs        25,940 ( 36.8 µs … 320.4 µs)  38.2 µs  56.0 µs  60.5 µs

timreichen avatar Apr 15 '25 21:04 timreichen