flatpak-external-data-checker
htmlchecker: allow specifying error handling on encoding error
I need to fetch a binary-encoded file that contains the update name. The file is binary and contains only a few readable strings. The problem is that the check already fails with Error querying for new versions: 'utf-8' codec can't decode bytes in position, so I am not able to apply any regex to it. With this change I can set an attribute called encoding-error to ignore. This is kind of hacky.
I am using the following checker configuration, which now works fine (I tested it):
x-checker-data:
  type: html
  url: http://versions.teamspeak.com/ts3-client-2
  version-pattern: "\u0006stable\u0010.*3\\.(\\d+\\.\\d+)\u0012"
  encoding-error: ignore
  url-template: https://files.teamspeak-services.com/releases/client/3.$version/TeamSpeak3-Client-linux_amd64-3.$version.run
So, anything on this?
I'm not the maintainer here, but I don't think you should try to force a binary download through htmlchecker. HTML is by definition a text format. I took a look at http://versions.teamspeak.com/ts3-client-2, and it's definitely not HTML. What an odd choice to encode that in a custom binary format instead of JSON or something.
As it turns out, I think this is protobuf format.
$ hd ts3-client-2
00000000 08 05 12 16 0a 06 73 65 72 76 65 72 10 e5 b0 f0 |......server....|
00000010 f0 05 1a 06 33 2e 31 31 2e 30 12 1e 0a 0f 61 6c |....3.11.0....al|
00000020 70 68 61 5f 6c 69 6e 75 78 5f 78 38 36 10 e6 c3 |pha_linux_x86...|
00000030 f9 fd 05 1a 05 33 2e 35 2e 36 12 1d 0a 0e 62 65 |.....3.5.6....be|
00000040 74 61 5f 6c 69 6e 75 78 5f 78 38 36 10 e6 c3 f9 |ta_linux_x86....|
00000050 fd 05 1a 05 33 2e 35 2e 36 12 1f 0a 10 73 74 61 |....3.5.6....sta|
00000060 62 6c 65 5f 6c 69 6e 75 78 5f 78 38 36 10 e6 c3 |ble_linux_x86...|
00000070 f9 fd 05 1a 05 33 2e 35 2e 36 12 13 0a 04 62 65 |.....3.5.6....be|
00000080 74 61 10 dd ff aa a8 06 1a 05 33 2e 36 2e 32 12 |ta........3.6.2.|
00000090 15 0a 06 73 74 61 62 6c 65 10 dd ff aa a8 06 1a |...stable.......|
000000a0 05 33 2e 36 2e 32 12 14 0a 05 61 6c 70 68 61 10 |.3.6.2....alpha.|
000000b0 e9 f7 96 ab 06 1a 05 33 2e 36 2e 33 18 04 |.......3.6.3..|
000000be
$ ~/go/bin/protoscope ts3-client-2
1: 5
2: {
  1: {
    14:SGROUP
    12: 4.5449766e30i32 # 0x72657672i32
  }
  2: 1578899557
  3: {"3.11.0"}
}
2: {
  1: {"alpha_linux_x86"}
  2: 1606312422
  3: {"3.5.6"}
}
2: {
  1: {"beta_linux_x86"}
  2: 1606312422
  3: {"3.5.6"}
}
2: {
  1: {"stable_linux_x86"}
  2: 1606312422
  3: {"3.5.6"}
}
2: {
  1: {"beta"}
  2: 1695203293
  3: {"3.6.2"}
}
2: {
  1: {"stable"}
  2: 1695203293
  3: {"3.6.2"}
}
2: {
  1: {"alpha"}
  2: 1701166057
  3: {"3.6.3"}
}
3: 4
It looks like each item is a tuple of name, time of update and version number. While you could probably get away with parsing it with a regex, it's certainly not robust. This seems like it needs to be a custom checker to be done correctly.
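To illustrate what "done correctly" could look like, here is a minimal Python sketch that walks the protobuf wire format by hand and extracts the (name, timestamp, version) tuples from the dump above. It is not part of flatpak-external-data-checker; the function names are made up for this example, the field meanings are inferred from the protoscope output, and it only handles the two wire types that appear in this file (varint and length-delimited).

```python
from typing import Iterator, List, Tuple, Union

# Bytes copied from the hexdump above (190 bytes).
DUMP = bytes.fromhex(
    "080512160a0673657276657210e5b0f0"
    "f0051a06332e31312e30121e0a0f616c"
    "7068615f6c696e75785f78383610e6c3"
    "f9fd051a05332e352e36121d0a0e6265"
    "74615f6c696e75785f78383610e6c3f9"
    "fd051a05332e352e36121f0a10737461"
    "626c655f6c696e75785f78383610e6c3"
    "f9fd051a05332e352e3612130a046265"
    "746110ddffaaa8061a05332e362e3212"
    "150a06737461626c6510ddffaaa8061a"
    "05332e362e3212140a05616c70686110"
    "e9f796ab061a05332e362e331804"
)


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def iter_fields(buf: bytes) -> Iterator[Tuple[int, Union[int, bytes]]]:
    """Yield (field number, value) for varint and length-delimited fields."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
            yield field_no, value
        elif wire_type == 2:  # length-delimited (string or sub-message)
            length, pos = read_varint(buf, pos)
            yield field_no, buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")


def parse_versions(buf: bytes) -> List[Tuple[str, int, str]]:
    """Assumption: top-level field 2 holds entries of (1=name, 2=time, 3=version)."""
    entries = []
    for field_no, value in iter_fields(buf):
        if field_no != 2:
            continue
        fields = dict(iter_fields(value))
        entries.append(
            (fields[1].decode("utf-8"), fields[2], fields[3].decode("ascii"))
        )
    return entries


entries = parse_versions(DUMP)
versions = {name: version for name, _, version in entries}
print(versions["stable"])
```

Note that the first entry decodes cleanly as the name "server" here; protoscope only guessed that those six bytes were a nested message.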
Alternatively, there could maybe be a type: raw checker that reads in binary data and then uses a binary regex before decoding the match back to a string.
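As a rough sketch of that idea in Python: match a bytes pattern against the raw response and decode only the captured group. No type: raw checker exists in the project as far as this thread shows; find_version is a hypothetical helper, and the sample bytes are the "stable" and "alpha" entries taken from the hexdump above. The pattern mirrors the version-pattern from the original post, with a non-greedy .*? and DOTALL so that arbitrary bytes (including 0x0a) are skipped safely.

```python
import re
from typing import Optional

# The "stable" and "alpha" entries plus the trailing field, from the hexdump above.
DATA = bytes.fromhex(
    "12150a06737461626c6510ddffaaa8061a05332e362e32"
    "12140a05616c70686110e9f796ab061a05332e362e33"
    "1804"
)

# Bytes pattern: length byte 0x06, "stable", tag 0x10, then the version
# "3.<minor>.<patch>" followed by tag 0x12 that starts the next entry.
VERSION_PATTERN = re.compile(rb"\x06stable\x10.*?3\.(\d+\.\d+)\x12", re.DOTALL)


def find_version(raw: bytes) -> Optional[str]:
    """Search the undecoded bytes and decode only the matched group."""
    match = VERSION_PATTERN.search(raw)
    if match is None:
        return None
    return match.group(1).decode("ascii")


version = find_version(DATA)
print(version)
```

Because the payload is never decoded as a whole, there is no UnicodeDecodeError to suppress, which is the failure the encoding-error: ignore option works around.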