
NullPointerException on Map update

emmanuel-perlow opened this issue 4 years ago • 4 comments

On a fresh install (or when a map update check is triggered) a NullPointerException is thrown at game-app/game-core/src/main/java/games/strategy/engine/auto/update/UpdateMapsCheck.java line 82

if (installedVersion < availableMap.getVersion()) {

(getVersion() returns null)

I investigated and the reason is that the JSON object returned by the server (https://prerelease.triplea-game.org/maps/listing) contains the entry "version": null for all listed maps.

The code notes that getVersion() is deprecated and that getLastCommitDateEpochMilli() combined with the file time stamp should be used instead.

I briefly considered updating the code myself, but I think relying on the user's time stamp of the file/folder is a flawed approach for testing whether a file is older than a given reference. (Time stamps should only be used to check whether a file is newer than a reference.) Simple everyday operations like copying or moving files change the various time stamps (at least under Linux, and I expect Windows shows similar behaviour).
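To illustrate the point, here is a small self-contained demo (not TripleA code; all names are made up for the example). `Files.copy` without `COPY_ATTRIBUTES` gives the copy a fresh last-modified time, so a copy of an up-to-date map can look "newer" than the file it was copied from:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/**
 * Demo of why copy operations break an "older than reference" test:
 * a plain copy gets a fresh timestamp.
 */
final class TimestampDemo {
  /** Returns true if a plain copy of an "old" file ends up with a newer timestamp. */
  static boolean copyLooksNewer() throws IOException {
    Path dir = Files.createTempDirectory("ts-demo");
    Path original = dir.resolve("map.dat");
    Files.writeString(original, "map data");
    // Pretend the original is ancient.
    Files.setLastModifiedTime(original, FileTime.fromMillis(0));
    // Files.copy without StandardCopyOption.COPY_ATTRIBUTES does not
    // preserve the last-modified time; the copy gets the current time.
    Path copy = Files.copy(original, dir.resolve("copy.dat"));
    return Files.getLastModifiedTime(copy)
        .compareTo(Files.getLastModifiedTime(original)) > 0;
  }
}
```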

Instead of relying on the time stamp, I think using a hash over the map files to check whether a map is out of date would be the better approach. The server computes its hash and sends it to the client; the client checks that against its own computed hash, and if they do not match it concludes that the server's map is newer. Since this need not be cryptographically secure, any hash (or hash-like function) will do.

Reproduction Steps

  1. Trigger a map update check with the 2.6 prerelease server on commit d74000cdfe1

emmanuel-perlow avatar Aug 17 '21 07:08 emmanuel-perlow

I agree with your assessment that the approach is flawed. The question is whether it is good enough.

There are some drawbacks to the hash approach:

  • A hash has its own problem: a hand-modified map would show as out of date when it is not.
  • Hashing requires us to hash every file of every map; we can use a very speedy hash algorithm, but this could nonetheless be slow if a user has all maps downloaded (ballpark 5 GB).
  • The hash algorithm would have to be consistent between the server and what the client computes (across all OSes).

FWIW, the server already downloads the map zip to compute the exact download size. So the server could very well hash all of the map files and store that hash.

#9576 proceeds with the update using file time stamps. If we can reason through a good way to do hashes (or an alternative), an iterative update doing that would be welcome.

DanVanAtta avatar Sep 07 '21 05:09 DanVanAtta

There are some drawbacks to the hash approach:

  • A hash has its own problem: a hand-modified map would show as out of date when it is not.

I guess people modifying maps by hand are more of the power-user kind. Thus, documenting what the update mechanism does and how it works should give them enough background to react accordingly.

  • Hashing requires us to hash every file of every map; we can use a very speedy hash algorithm, but this could nonetheless be slow if a user has all maps downloaded (ballpark 5 GB).

We don't need to hash every map on every startup. The client can cache the hash and compare the stored value with the hash provided by the server, so we only need to compute the hash once after downloading the map; there I don't see a bottleneck. To make sure the cached hash is not out of date, we can use the file timestamp (here we compare against an older reference, which is fine), so we only recompute the hash for maps that could have changed.
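A minimal sketch of that caching scheme (all class and method names here are hypothetical, not TripleA's actual API). The timestamp is used only in the safe "newer than the cached reference" direction:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch of the proposed hash cache keyed by map directory. */
final class MapHashCache {
  private record Entry(String hash, FileTime seenAt) {}

  private final Map<Path, Entry> cache = new HashMap<>();
  private final Function<Path, String> expensiveHasher; // the full map hash

  MapHashCache(Function<Path, String> expensiveHasher) {
    this.expensiveHasher = expensiveHasher;
  }

  /** Returns the map hash, rehashing only if the directory changed since it was cached. */
  String hashFor(Path mapDir) throws IOException {
    FileTime mtime = Files.getLastModifiedTime(mapDir);
    Entry cached = cache.get(mapDir);
    // An untouched directory keeps its cached hash; we only recompute
    // when the timestamp is newer than the cached reference.
    if (cached != null && mtime.compareTo(cached.seenAt) <= 0) {
      return cached.hash;
    }
    String fresh = expensiveHasher.apply(mapDir);
    cache.put(mapDir, new Entry(fresh, mtime));
    return fresh;
  }
}
```

In a real implementation the cache would of course be persisted to disk between startups; the in-memory map here just keeps the sketch short.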

  • The hash algorithm would have to be consistent between the server and what the client computes (across all OSes).

I guess you are not referring to a particular hash algorithm (SHA-1, BLAKE, ...), as those are consistent. I think you are more concerned about line endings and path delimiters? We should normalise paths to use '/' as the delimiter, as this is safe on Windows, macOS, Linux, and *BSD. The individual files we take as-is, with whatever line endings they have (or does TripleA modify the map files after extraction?).

A simple way to calculate a hash for a map (i.e. a directory and all its files) would be:

  1. calculate the hash for each file
  2. construct the string
    <hash> <path>
    abc... path/to/file_1
    def... another/path/to/file_2
    ...
    
  3. hash the above string and use it as "the hash of the map"
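The steps above can be sketched as follows (a proposal, not TripleA code; SHA-256 is just one reasonable choice since cryptographic strength isn't required). Paths are normalised to '/' and the manifest lines are sorted so the result is identical across OSes:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch of a directory hash in the style described above. */
final class MapHasher {
  /** Step 1: hash one file's raw bytes (taken as-is, no line-ending rewriting). */
  static String fileHash(Path file) throws IOException {
    return digest(Files.readAllBytes(file));
  }

  /**
   * Steps 2 and 3: build one "&lt;hash&gt; &lt;path&gt;" line per file, with paths
   * normalised to '/' and lines sorted for determinism, then hash the result.
   */
  static String directoryHash(Path mapDir) throws IOException {
    try (Stream<Path> files = Files.walk(mapDir)) {
      String manifest =
          files
              .filter(Files::isRegularFile)
              .map(
                  f -> {
                    String relative =
                        mapDir.relativize(f).toString().replace('\\', '/');
                    try {
                      return fileHash(f) + " " + relative;
                    } catch (IOException e) {
                      throw new UncheckedIOException(e);
                    }
                  })
              .sorted()
              .collect(Collectors.joining("\n"));
      return digest(manifest.getBytes(StandardCharsets.UTF_8));
    }
  }

  private static String digest(byte[] bytes) {
    try {
      return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(bytes));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is always available
    }
  }
}
```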

If that looks familiar, it is probably because it's similar to how Git and other content-tracking systems work. If we go with that approach, we should seriously consider using Git itself or a library like libgit2 or JGit to calculate the hash of the directory/map.

#9576 proceeds with the update using file time stamps.

No objections here.

emmanuel-perlow avatar Sep 09 '21 09:09 emmanuel-perlow

@emmanuel-perlow, @DanVanAtta Can we close this issue or is there something that should still be considered?

frigoref avatar May 30 '25 13:05 frigoref

In my opinion this is still suboptimal.

At the very least the NullPointerException should be handled/caught properly.
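Something along these lines would do (a sketch only; the method signature and the "treat null as no update" fallback are my assumptions, not the actual TripleA API):

```java
/** Null-tolerant version of the comparison from UpdateMapsCheck. */
final class UpdateCheck {
  static boolean updateAvailable(int installedVersion, Integer availableVersion) {
    if (availableVersion == null) {
      // The maps listing currently returns "version": null, so treat a
      // missing version as "no update known" instead of throwing an NPE.
      return false;
    }
    return installedVersion < availableVersion;
  }
}
```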

emmanuel-perlow avatar Sep 01 '25 19:09 emmanuel-perlow