openvsx icon indicating copy to clipboard operation
openvsx copied to clipboard

/api/-/query response size

Open amvanbaren opened this issue 2 years ago • 30 comments

Fixes #379

Testing steps

  • Open this PR in Gitpod.
  • Let Gitpod initialize. This can take 5 - 10 minutes, because it will download 485 vscode.bat extensions and then publish them to the local instance.
  • Get https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v4/-/query?extensionId=vscode.bat&includeAllVersions=false. The response should be ~80 KB. The first object in the extensions array is the latest version. All other objects are version links.
  • Get https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v4/-/query?extensionId=vscode.bat&includeAllVersions=true. The response should be ~800 KB. The extensions array contains all versions. Only the last object is a version link for the latest version.
  • Get https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v3/-/query?extensionId=vscode.bat&includeAllVersions=true. The response should be ~800 KB.
  • Get https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v2/-/query?extensionId=vscode.bat&includeAllVersions=true. The response should be ~650 KB.
  • For comparison, get https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/-/query?extensionId=vscode.bat&includeAllVersions=true. The response should be ~30 MB.

amvanbaren avatar May 18 '22 18:05 amvanbaren

I can confirm that this works in Gitpod as described.

Response sizes:

32487536 verquery.json
  665435 verquery.v2.json

That is a reduction to about 2% of the original response size. Nice work!

To compare the output files in the first detailed version entry, I ran the following command (may need recent bash or zsh):

diff -u <(jq -S '.extensions[0]' verquery.json) \
        <(jq -S '.extensions[0].versions[0]' verquery.v2.json)

Accordingly, v2 removes the following entries from the detailed .versions[] entries:

allVersions
downloadCount
name
namespace
namespaceAccess
namespaceUrl
preview
reviewCount
reviewsUrl
unrelatedPublisher

Congratulations for getting rid of those n**2 allVersions entries.

According to

jq -S '.extensions[0]|del(.versions,.versionLinks)' verquery.v2.json

the following entries have been added to the level above .versions[]:

downloadCount
name
namespace
namespaceUrl
preview
reviewCount
reviewsUrl

Remarks and questions

  • The hierarchy-level-changing move of the mentioned tags makes sense if and only if the mentioned entries are not version-specific. (Are they?)

  • For example, this model implies that an extension cannot be renamed or moved to another namespace in a new version. (Can it?)

  • What about the following removed entries?

    namespaceAccess
    unrelatedPublisher
    
  • Why keep the toplevel .extensions[]? By now, it seems to be bound to have no less and no more than one entry.

  • It would be nice to have a version of this PR that only deletes the allVersions entries and preserves the existing response structure in all other respects.

    • This way, clients can be adapted easily by just tweaking the query URL.
    • There would be no need to check if perhaps the change breaks the existing data model.
    • The little redundancy left would be eliminated by compression.
    • The main goal, making the queries scale well with version count, would still be achieved.

ccorn avatar May 19 '22 20:05 ccorn

The hierarchy-level-changing move of the mentioned tags makes sense if and only if the mentioned entries are not version-specific. (Are they?)

Yes, the changes reflect the database model. The data hierarchy is: Namespace > Extension > ExtensionVersion.

  • downloadCount: total extension downloads.
  • name: extension name.
  • namespace: namespace name.
  • namespaceUrl: url pointing to the namespace.
  • preview: whether the extension is in preview mode.
  • reviewCount: total extension reviews.
  • reviewsUrl: url pointing to extension reviews.

For example, this model implies that an extension cannot be renamed or moved to another namespace in a new version. (Can it?)

Currently it can't. However, issue #427 wants to add this ability.

What about the following removed entries? namespaceAccess unrelatedPublisher

These entries have been deprecated for quite a while now. I made the bold move to exclude them completely.

Why keep the toplevel .extensions[]? By now, it seems to be bound to have no less and no more than one entry.

It seems that way, because this query is by extensionId (format: {namespaceName}.{extensionName}), which limits the extensions[] to 1. It's also possible to query by namespaceName or namespaceUuid to get all extensions in a namespace, or to query by extensionName to get all extensions matching extensionName across all namespaces.

It would be nice to have a version of this PR that only deletes the allVersions entries and preserves the existing response structure in all other respects.

I'll add a /api/v3/-/query endpoint that only deletes the allVersions entries.

amvanbaren avatar May 20 '22 10:05 amvanbaren

I've added the /api/v3/-/query endpoint. The response size is ~800 KB, so the allVersions entry contributes the most to the response size.

amvanbaren avatar May 20 '22 12:05 amvanbaren

  • preview: whether the extension is in preview mode.

But that one is version-specific. E.g.

Query URL preview
https://open-vsx.org/api/vscode/bat/1.62.3 false
https://open-vsx.org/api/vscode/bat/1.64.0-next.d9fa2b12136 true

ccorn avatar May 20 '22 17:05 ccorn

For example, this model implies that an extension cannot be renamed or moved to another namespace in a new version. (Can it?)

Currently it can't. However, issue #427 wants to add this ability.

Maybe it is better to err on the side of caution then.

ccorn avatar May 20 '22 18:05 ccorn

I've added the /api/v3/-/query endpoint. The response size is ~800 KB, so the allVersions entry contributes the most to the response size.

Thanks! I will try it. Awesome progress!

ccorn avatar May 20 '22 18:05 ccorn

Maybe it is better to err on the side of caution then.

In fact, the code mentions namespaceUuid and extensionUuid, so I suppose that the actual immutable identifiers are UUIDs, not names. Therefore, at least in principle, names could be variable.

ccorn avatar May 20 '22 18:05 ccorn

I have tried this PR at https://github.com/eclipse/openvsx/pull/454/commits/5189e62e9c2d7135fad81bf4ae960f83e836c392 now.

As before, I downloaded the responses and, out of curiosity, piped them through gzip -c | wc -c and zstd -c | wc -c to see how compression affects the transfer size.

Response                size        gzip        zstd
verquery.json       32487533     3183772       34584
verquery.v2.json      665432       26802       20876
verquery.v3.json      825209       28031       21052

With v2 and v3, the responses are short enough that gzip is much more effective. zstd is the winner as expected, but that's a sideshow.

v3 still has the structure of v2 with a sub-array versions and a map versionLinks replacing the common allVersions. Apart from that, no tags have been moved up the hierarchy or removed, so the following comparison command shows absolutely no difference:

diff -u <(jq -S '.extensions|del(.[].allVersions)' verquery.json) \
        <(jq -S '.extensions[0].versions' verquery.v3.json)

which should reduce the effort needed for adaptation to the new API.

Conclusion

I can confirm that v3 works as advertised.

Remark

Personally, I think that the versionLinks map is unnecessary:

  • All the information available at the links in its values is already present in the versions array.
  • Users of the query API certainly would know the /api/${namespace}/${name}/${version} endpoint anyway.

So I'd remove the versionLinks (or provide a single string entry in each versions[] entry named versionLink). That leaves only the versions sub-array. Which can be flattened away as is the case now. Then the only difference to the current state of affairs would be that the uncalled-for .allVersions entries would no longer be present.

That means that current clients not interested in the version links just have to change their /api/ to /api/v4/ (or whatever version number emerges) and nothing else. As a result, server load would soon drop by orders of magnitude. Wouldn't that be great?

ccorn avatar May 21 '22 02:05 ccorn

  • preview: whether the extension is in preview mode.

But that one is version-specific. E.g. Query URL preview https://open-vsx.org/api/vscode/bat/1.62.3 false https://open-vsx.org/api/vscode/bat/1.64.0-next.d9fa2b12136 true

It currently is, because there was a mix-up between preview and pre-release #386. The current production version still runs the old code where the preview property actually indicates whether a version is a pre-release or not.

In fact, the code mentions namespaceUuid and extensionUuid, so I suppose that the actual immutable identifiers are UUIDs, not names. Therefore, at least in principle, names could be variable.

Yes, the actual identifiers are UUIDs. Names could be variable, but an extension name must be unique within a namespace. Also a lot of the API endpoints take namespaceName and extensionName as parameters. How this change would impact the server needs to be investigated further.

amvanbaren avatar May 23 '22 08:05 amvanbaren

I have tried this PR at ab6a02aa05977daa5405c60a18c9afbaf8714cad in Gitpod.

Indeed, v4 is precisely the original response without all the .allVersions erntries, as I have wished for.

Response sizes with ìncludeAllversions:

Response             size    gzip   zstd
verquery.json    32487536 3183746  34586
verquery.v2.json   665435   26815  20696
verquery.v3.json   825212   27741  20974
verquery.v4.json   766596   21605  16309

The v4 response is a bit longer than v2 when uncompressed, but shorter when compressed.

Besides, I notice that the original response has allVersions URLs ending with

/api/${namespace}/${name}/universal/${version}

whereas v2 and v3 use only

/${namespace}/${name}/${version}

which results in broken links, presumably because the /api prefix is missing.

Response sizes without ìncludeAllversions:

Response           size  gzip  zstd
ver1query.json    67056  6843  6836
ver1query.v2.json 60211  6751  6768
ver1query.v3.json 60268  6754  6797
ver1query.v4.json  1652   723   736

In fact, v4 has no allVersions (nor versionLinks) map even in unversioned queries without includeAllVersions. Conceptually that is OK with me, but this raises the question how to get the version links map if one is not interested in all the details associated with includeAllVersions.

It would probably be easiest if queries or API lookups

  • without an extension's version specified (example: /api/${namespace}/${name}, counterexample: /api/${namespace}/${name}/${version}) and
  • meant to yield only one version per extension (that is, without includeAllVersions)

(and only those queries) had one allVersions map included as before. That is, mimic the current behavior, except that allVersions is NOT included if includeAllVersions is true or if the query already specifies a version.

That way,

  • The scenarios where allVersions might actually be useful still have it available,
  • results of queries with specified extension version would not have allVersions and thus be long-term cacheable (unless the version is an alias like latest),
  • the complexity explosion from the combination of includeAllVersions and allVersions would be prevented.

ccorn avatar May 26 '22 05:05 ccorn

What about making the query parameter includeAllVersions ternary?

includeAllVersions Meaning
false no allVersions, metadata for one version
true no allVersions, metadata for all versions
links or unspecified allVersions and metadata for one version

That would be more flexible because clients can get cross-reference links in addition to metadata for one specific version, as is the case now. Yet it would still prevent complexity explosion.

Edit: The low-budget binary version of this proposal would be to define the semantics of false as those for links, i.e. allVersions is included if and only if includeAllVersions is false.

The only drawback would be that there is no long-term caching available: The ternary approach means that a concrete version (unlike latest and such) with an explicit includeAllVersions=false could be delivered with longer-term caching headers.

ccorn avatar May 26 '22 07:05 ccorn

In fact, v4 has no allVersions (nor versionLinks) map even in unversioned queries without includeAllVersions. Conceptually that is OK with me, but this raises the question how to get the version links map if one is not interested in all the details associated with includeAllVersions.

I've added a url field to the extension object. Clients can search the extensions array to find the url (versionLink) by a combination of namespaceName, extensionName and version. Version links are minimal extension objects. Only namespace, name, version and url are specified.

Calling https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v4/-/query?extensionId=vscode.bat&includeAllVersions=false shows the difference between a 'full' extension object (first object) and a 'minimal' extension object (all other objects).

amvanbaren avatar May 26 '22 08:05 amvanbaren

Version links are minimal extension objects. Only namespace, name, version and url are specified.

Interesting approach. Going to take a look. However, we would need the engines sub-objects for the task of finding a compatible version as well. Current Theia download:plugins logic also uses the files.download object of the selected version, but your idea is probably that that information can be retrieved separately.

We see here that this approach makes assumptions about what meta-information the client must scan before it can select one version. My guess is that these assumptions are not going to be satisfactory for long.

ccorn avatar May 26 '22 08:05 ccorn

The download:plugins logic calls: 140    const extension = await client.getLatestCompatibleExtensionVersion(id); and getLatestCompatibleExtensionVersion calls: 119    const extensions = await this.getAllVersions(id); this.getAllVersions(id); calls the /api/-/query?extensionId=<EXT_ID>&includeAllVersions=true The API call returns 'full' extension objects for all versions. The download:plugins logic doesn't use the allVersions map.

includeAllVersions does not refer to the allVersions map. Instead it is to specify what to include in the response when extensionVersion is undefined. To give you an overview:

  • if extensionVersion is defined, then ignore the includeAllVersions parameter.
  • if includeAllVersions=true and extensionVersion is undefined, then for each extension include all versions in the response.
  • if includeAllVersions=false or undefined and extensionVersion is undefined, then for each extension only include the latest version in the response.

amvanbaren avatar May 26 '22 10:05 amvanbaren

Oh, I see: ìncludeAllVersions=false in v4 now returns several entries in the extensions array, with all but the first entry minimalized.

ìncludeAllVersions=true returns all entries as before, augmented with url entries.

Neat! Some bugs I noticed:

  • The new v4 url field seems to be missing an /api (or api/v4).

  • v4 adds a spurious entry stub:

    $ diff -u <(jq -S 'del(.extensions[].allVersions)' verquery.json) \
              <(jq -S 'del(.extensions[].url)' verquery.v4.json)
    --- /dev/fd/63  2022-05-26 12:10:14.145286993 +0200
    +++ /dev/fd/62  2022-05-26 12:10:14.145286993 +0200
    @@ -24764,6 +24764,11 @@
           "verified": false,
           "version": "1.44.2",
           "versionAlias": []
    +    },
    +    {
    +      "name": "bat",
    +      "namespace": "vscode",
    +      "version": "latest"
         }
       ]
     }
    

Response sizes with and without includeAllVersions:

Response             size    gzip   zstd
verquery.json    32487536 3183855  34527
verquery.v2.json   667379   26817  20970
verquery.v3.json   827156   27971  21145
verquery.v4.json   816598   23999  18589
ver1query.json      67056    6843   6837
ver1query.v2.json   62155    6716   6748
ver1query.v3.json   62212    6716   6779
ver1query.v4.json   85436    6366   5750

Somehow I like this concept, although I probably should not. It is better structured in that it does not pretend that allVersions were a property of any specific version, yet it is compatible with existing usage (apart from allVersions of course), as long as only one extension ID is queried.

However, It seems to be more cumbersome for queries whose response can cover more than one extension ID. For those, a ternary includeAllVersions, as described earlier, might be useful. Thus, an explicit false would reduce the extensions array to one entry per extension ID again.

ccorn avatar May 26 '22 10:05 ccorn

v4 adds a spurious entry stub

No, that is intended I guess, (It also has a url, but my diff command removes that for purposes of comparing). So it's just another minimized entry.

Edit: No, wait. This happens in the response for includeAllVersions=true. That one should not have minimized entries at all. So yes, that's a bug.

ccorn avatar May 26 '22 11:05 ccorn

Maybe the 'minimized' entries add to the confusion. I mean when includeAllVersions=false the response still contains multiple extension entries: one 'full' object and 'minimized' objects for all other versions.

Instead only setting the allVersions map for the first version for each extension (${namespace}.${name}) solves the issue and is closer to the original implementation. It would require minimal changes for clients to adopt. If extensionVersion is defined the allVersions map is set on the matching version. In other cases the allVersions map is only set on the latest version.

amvanbaren avatar May 26 '22 11:05 amvanbaren

I suppose that allVersionsshould never be needed when includeAllVersions=true. (The url entries are better suited for that case.) So we do not need to make incongruent element layouts for those responses. The only question where allVersions might be wanted is for queries without includeAllVersions. The ternary approach would allow clients to be explicit about their expectations.

ccorn avatar May 26 '22 11:05 ccorn

I suppose that allVersionsshould never be needed when includeAllVersions=true. (The url entries are better suited for that case.) So we do not need to make incongruent element layouts for those responses. The only question where allVersions might be wanted is for queries without includeAllVersions. The ternary approach would allow clients to be explicit about their expectations.

I've added extensionVersion to the table to see how it interacts with includeAllVersions

includeAllVersions extensionVersion Meaning
false specified no allVersions, metadata for specified version
false unspecified no allVersions, metadata for latest version
true specified ignore includeAllVersions, i.e. includeAllVersions is unspecified
true unspecified no allVersions, metadata for all versions
links or unspecified specified allVersions and metadata for specified version
links or unspecified unspecified allVersions and metadata for latest version

Edit: I've added the /api/v5/-/query endpoint: https://8080-<GITPOD_WORKSPACE_ID>.gitpod.io/api/v5/-/query?extensionId=vscode.bat&includeAllVersions=false&extensionVersion=1.46.1 By changing or omitting the includeAllVersions and extensionVersion parameters you can test the responses.

amvanbaren avatar May 26 '22 13:05 amvanbaren

Thanks! I will try v5 within the next two hours.

ccorn avatar May 26 '22 17:05 ccorn

This took some time because I tried lots of combinations and wrote a bash script to automate most of those tests.

Summary

I have tried 3388130615bdbfe04dc103d20ce4df0181b8010d. It works as advertised, reduces response size (in some cases drastically), is both reasonably backwards-compatible and (with the ternary includeAllVersions) more flexible than the original API.

I look forward to see the v5 API in production.

More detailed v5 notes

allVersions is not deleted, but emptied. I suppose that this is a good idea, as it might prevent unnecessary failures in structure-checking clients.

To give you an idea of what I have checked, the v5-related output of my test script is given below. Response filenames contain the values of extensionId, extensionVersion, and includeAllVersions, if specified, separated by underscores.

Response                                      size    gzip   zstd
verquery.vscode.bat_true.json             32487533 3183822  34929
verquery.v5.vscode.bat_true.json            826637   23332  18566
verquery.vscode.bat.json                     67056    6830   6837
verquery.vscode.bat_false.json               67056    6830   6837
verquery.v5.vscode.bat.json                  62305    6310   5575
verquery.v5.vscode.bat_links.json            62305    6310   5575
verquery.v5.vscode.bat_false.json             1778     723    750
verquery.vscode.bat_1.53.2.json              66918    6797   6806
verquery.vscode.bat_1.53.2_true.json         66918    6797   6806
verquery.vscode.bat_1.53.2_false.json        66918    6797   6806
verquery.v5.vscode.bat_1.53.2.json           62150    6274   5548
verquery.v5.vscode.bat_1.53.2_true.json      62150    6274   5548
verquery.v5.vscode.bat_1.53.2_links.json     62150    6274   5548
verquery.v5.vscode.bat_1.53.2_false.json      1623     681    711

Check verquery.v5.vscode.bat_1.53.2.json = verquery.v5.vscode.bat_1.53.2_links.json
Check verquery.v5.vscode.bat_1.53.2_true.json = verquery.v5.vscode.bat_1.53.2_links.json
Check verquery.v5.vscode.bat.json = verquery.v5.vscode.bat_links.json

latest = 1.64.0-next.d9fa2b12136
Check version in verquery.v5.vscode.bat_1.53.2_false.json = 1.53.2: true
Check version in verquery.v5.vscode.bat_false.json = 1.64.0-next.d9fa2b12136: true
Check version in verquery.v5.vscode.bat_1.53.2_links.json = 1.53.2: true
Check version in verquery.v5.vscode.bat_links.json = 1.64.0-next.d9fa2b12136: true
Check version in verquery.v5.vscode.bat_1.53.2.json = 1.53.2: true
Check version in verquery.v5.vscode.bat.json = 1.64.0-next.d9fa2b12136: true

vercount = 485
Check version count in verquery.v5.vscode.bat_1.53.2_false.json == 1: true
Check version count in verquery.v5.vscode.bat_false.json == 1: true
Check version count in verquery.v5.vscode.bat_1.53.2_true.json == 1: true
Check version count in verquery.v5.vscode.bat_true.json == 485: true
Check version count in verquery.v5.vscode.bat_1.53.2_links.json == 1: true
Check version count in verquery.v5.vscode.bat_links.json == 1: true
Check version count in verquery.v5.vscode.bat_1.53.2.json == 1: true
Check version count in verquery.v5.vscode.bat.json == 1: true

Check url presence in verquery.v5.vscode.bat_1.53.2_false.json: true
Check url presence in verquery.v5.vscode.bat_false.json: true
Check url presence in verquery.v5.vscode.bat_1.53.2_true.json: true
Check url presence in verquery.v5.vscode.bat_true.json: true
Check url presence in verquery.v5.vscode.bat_1.53.2_links.json: true
Check url presence in verquery.v5.vscode.bat_links.json: true
Check url presence in verquery.v5.vscode.bat_1.53.2.json: true
Check url presence in verquery.v5.vscode.bat.json: true

Check allVersions empty in verquery.v5.vscode.bat_1.53.2_false.json: true
Check allVersions empty in verquery.v5.vscode.bat_false.json: true
Check allVersions nonempty in verquery.v5.vscode.bat_1.53.2_true.json: true
Check allVersions empty in verquery.v5.vscode.bat_true.json: true
Check allVersions nonempty in verquery.v5.vscode.bat_1.53.2_links.json: true
Check allVersions nonempty in verquery.v5.vscode.bat_links.json: true
Check allVersions nonempty in verquery.v5.vscode.bat_1.53.2.json: true
Check allVersions nonempty in verquery.v5.vscode.bat.json: true

Compare other metadata in verquery.v5.vscode.bat_1.53.2_false.json with original
Compare other metadata in verquery.v5.vscode.bat_false.json with original
Compare other metadata in verquery.v5.vscode.bat_1.53.2_true.json with original
Compare other metadata in verquery.v5.vscode.bat_true.json with original
Compare other metadata in verquery.v5.vscode.bat_1.53.2_links.json with original
Compare other metadata in verquery.v5.vscode.bat_links.json with original
Compare other metadata in verquery.v5.vscode.bat_1.53.2.json with original
Compare other metadata in verquery.v5.vscode.bat.json with original

The script found no v5-related errors (but one v4-related error mentioned earlier: one spurious added minimized entry in verquery.v4.vscode.bat_true.json).

In particular, the semantics related to .allVersions is precisely as detailed in the six-case table combining extensionVersion state with the ternary includeAllVersions.

Unrelated notes

For includeAllVersions=trueand unspecified extensionVersion, all API versions so far, including the original, return the latest version not at position [0]. In my tests, those occur in position [2].

While not relevant for this PR, this means that the logic in theia download:plugins for determining the latest compatible version might get fooled: There the idea is to pick the first listed version that has a compatible engines entry.

Therefore I would like to know what the actual sorting order is.

ccorn avatar May 27 '22 10:05 ccorn

For includeAllVersions=trueand unspecified extensionVersion, all API versions so far, including the original, return the latest version not at position [0]. In my tests, those occur in position [2]. Therefore I would like to know what the actual sorting order is.

The SemanticVersion class is used to parse and compare versions. The parsing logic is in the constructor: https://github.com/eclipse/openvsx/blob/72706d126e642d6d714c5ae40bc970dd35340986/server/src/main/java/org/eclipse/openvsx/util/SemanticVersion.java#L24-L42

And here's the comparison logic: https://github.com/eclipse/openvsx/blob/72706d126e642d6d714c5ae40bc970dd35340986/server/src/main/java/org/eclipse/openvsx/util/SemanticVersion.java#L86-L106

The unit test describes how SemanticVersion is supposed to work: https://github.com/eclipse/openvsx/blob/72706d126e642d6d714c5ae40bc970dd35340986/server/src/test/java/org/eclipse/openvsx/util/SemanticVersionTest.java#L18-L30

You can see on line 24 that versions 1.2.3-next.bc11e2c5 and 1.2.3-next.6aa3b0d6 must be equal according to the SemanticVersion logic. This is incorrect, build hashes should also be compared to each other. I've removed the SemanticVersion class and added the semver4j library. It's used in the server code to parse and compare semvers based on the NPM convention.

allVersions is not deleted, but emptied. I suppose that this is a good idea, as it might prevent unnecessary failures in structure-checking clients.

That was a bug. allVersions is no longer added to the response if it's empty.

@ccorn Thanks for taking the time to test and review the changes. It really helps a lot.

amvanbaren avatar May 30 '22 09:05 amvanbaren

Thanks! I have updated the test script according to your bug fixes, so now the version of the first result of includeAllVersions=true responses is checked as well, and the size check of allVersions has been updated to include a presence/absence check.

Except for that one v4 version count error, all my tests pass. I have attached the test log output test-eclipse-openvsx-pull-454-c015d32-out.txt.

ccorn avatar May 31 '22 22:05 ccorn

While the test results agree with your specifications, current open-vsx.org behaves differently:

Browsing https://open-vsx.org/api/-/query?extensionId=vscode.bat, I get in .extension[0]:

version: "1.62.3"
versionAlias: ["latest"]

But if I replace open-vsx.org with the gitpod instance for this PR, I get a -next version (all API versions, including the one without /v). Should not "latest" point to a non-next version?

With includeAllVersions=true, current open-vsx.org returns the greatest -next version first, then, at an appropriately farther position, the above stable version tagged with "latest" makes its appearance. With this PR however, "latest" always points to the first entry, usually a -next version.

Should that be so?

ccorn avatar Jun 01 '22 00:06 ccorn

For the version sorting, I suppose that the least-surprising approach is (still) a lexicographic one, but with the version suffix (an unordered hash value) replaced with the publishing timestamp (which actually indicates an order). That is, begin with what SemanticVersion does and in case of equality (or EquivalentTo in semver4j) compare the timestamps.

It would be easiest if most of the original server code remained in effect, with just the includeAllVersions handling refined according to the case table and .allVersions included only if includeAllVersions is (effectively) links.

ccorn avatar Jun 01 '22 00:06 ccorn

Should not "latest" point to a non-next version?

latest is now the absolute latest version, pre-release or not. The change was made here: https://github.com/eclipse/openvsx/pull/419 as a response to https://github.com/eclipse/openvsx/pull/410#issuecomment-1034831850. These changes are not yet running in production.

For the version sorting, I suppose that the least-surprising approach is (still) a lexicographic one, but with the version suffix (an unordered hash value) replaced with the publishing timestamp (which actually indicates an order). That is, begin with what SemanticVersion does and in case of equality (or EquivalentTo in semver4j) compare the timestamps.

Ok, that's how the previous sort worked. I'll revert the changes. https://github.com/eclipse/openvsx/blob/72706d126e642d6d714c5ae40bc970dd35340986/server/src/main/java/org/eclipse/openvsx/LocalRegistryService.java#L389-L397

amvanbaren avatar Jun 01 '22 14:06 amvanbaren

Ok, that's how the previous sort worked. I'll revert the changes.

I see. But how can the "latest" entry show up at [2] then (in 3388130615bdbfe04dc103d20ce4df0181b8010d)? If that is just a side effect of, say, parallelism in filling the gitpod instance, that could be ignored (though I'd still like to see the original timestamps in effect). But double-checking would be good because correct sorting and correct labelling of versions seems crucially important to me.

ccorn avatar Jun 01 '22 15:06 ccorn

At 49429150eaef7220863902fa352de7674c4dd0f9, I get a 500 Internal Server Error.

java.lang.IllegalArgumentException: Argument is not an array
	at java.base/java.lang.reflect.Array.getLength(Native Method)
	at org.hibernate.internal.util.collections.ArrayHelper.toList(ArrayHelper.java:122)
	at org.eclipse.openvsx.repositories.ExtensionVersionJooqRepository.toExtensionVersion(ExtensionVersionJooqRepository.java:179)
	at org.jooq.impl.ResultImpl.map(ResultImpl.java:1318)
	at org.eclipse.openvsx.repositories.ExtensionVersionJooqRepository.fetch(ExtensionVersionJooqRepository.java:162)
	at org.eclipse.openvsx.repositories.ExtensionVersionJooqRepository.findAllActiveByExtensionNameAndNamespaceName(ExtensionVersionJooqRepository.java:91)
	at org.eclipse.openvsx.repositories.RepositoryService.findActiveExtensionVersionsByExtensionName(RepositoryService.java:276)
	at org.eclipse.openvsx.LocalRegistryService.queryV5(LocalRegistryService.java:682)
	at org.eclipse.openvsx.LocalRegistryService$$FastClassBySpringCGLIB$$c162f6fb.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
	at org.eclipse.openvsx.LocalRegistryService$$EnhancerBySpringCGLIB$$8d073861.queryV5(<generated>)
	at org.eclipse.openvsx.RegistryAPI.getQueryV5(RegistryAPI.java:514)
	at jdk.internal.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:893)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:807)
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1061)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:961)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.ShallowEtagHeaderFilter.doFilterInternal(ShallowEtagHeaderFilter.java:106)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:327)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:115)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:81)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:119)
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:126)
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:81)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:105)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:149)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:218)
	at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.oauth2.client.web.OAuth2AuthorizationRequestRedirectFilter.doFilterInternal(OAuth2AuthorizationRequestRedirectFilter.java:178)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:103)
	at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:89)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.csrf.CsrfFilter.doFilterInternal(CsrfFilter.java:115)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.web.filter.CorsFilter.doFilterInternal(CorsFilter.java:91)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:90)
	at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:75)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:110)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:80)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:55)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:211)
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:183)
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:358)
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:271)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.session.web.http.SessionRepositoryFilter.doFilterInternal(SessionRepositoryFilter.java:141)
	at org.springframework.session.web.http.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:82)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:93)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.base/java.lang.Thread.run(Thread.java:834)

ccorn avatar Jun 07 '22 13:06 ccorn

I see. But how can the "latest" entry show up at [2] then (in https://github.com/eclipse/openvsx/commit/3388130615bdbfe04dc103d20ce4df0181b8010d)? If that is just a side effect of, say, parallelism in filling the gitpod instance, that could be ignored (though I'd still like to see the original timestamps in effect). But double-checking would be good because correct sorting and correct labelling of versions seems crucially important to me.

Different sort comparators were used in VersionUtil, ExtensionVersion and LocalRegistryService. I've refactored the code, so only ExtensionVersion.SORT_COMPARATOR is used to sort extension versions. Now [0] is the latest.

amvanbaren avatar Jun 07 '22 19:06 amvanbaren

Different sort comparators were used in VersionUtil, ExtensionVersion and LocalRegistryService. I've refactored the code, so only ExtensionVersion.SORT_COMPARATOR is used to sort extension versions. Now [0] is the latest.

Indeed it works. According to the test script output applied to 6c698ec63a8396fc3cefc9f516724df4aaf8ded9, all v5 tests pass, and that includes checking [0] == latest. (In fact, all my tests pass, even those unrelated to v5, except the one v4 list output size test, as noted before.) Cursory manual inspection confirms that the version sorting goes indeed by version (without hash part), then by timestamp.

This v5 looks like a suitable and desirable change.

ccorn avatar Jun 07 '22 20:06 ccorn