ROW AssetBundle URL changes & CN's new TGRP format

Open YumYummity opened this issue 9 months ago • 20 comments

The correct CN ABInfo URL is: return f"https://lf3-mkcncdn-tos.dailygn.com/obj/sf-game-lf/gdl_app_5236/AssetBundle/{self.config.ab_version or self.config.app_version}/Release/cn_online1/android91/AssetBundleInfoNew.json"

cn_online1 instead of cn_online, and android91.
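
A minimal sketch of how that URL could be assembled outside the class, for quick testing. The helper name and the version strings in the usage note are made up for illustration; only the host, path segments, and the "ab_version or app_version" fallback come from the line above.

```python
CN_ABINFO_HOST = "https://lf3-mkcncdn-tos.dailygn.com"

def cn_abinfo_url(app_version, ab_version=None):
    """Build the CN AssetBundleInfoNew.json URL (hypothetical helper)."""
    # Mirrors `self.config.ab_version or self.config.app_version`
    version = ab_version or app_version
    return (
        f"{CN_ABINFO_HOST}/obj/sf-game-lf/gdl_app_5236/AssetBundle/"
        f"{version}/Release/cn_online1/android91/AssetBundleInfoNew.json"
    )
```

Calling cn_abinfo_url("1.0.0") uses the app version, while cn_abinfo_url("1.0.0", "2.0.0") prefers the AB version when one is set.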

YumYummity avatar Mar 30 '25 02:03 YumYummity

The AB URL appears to be incorrect as well, I'll check.

YumYummity avatar Mar 30 '25 02:03 YumYummity

The AB endpoint for CN appears to be: return f"https://lf3-j1gamecdn-cn.dailygn.com/obj/sf-game-lf/gdl_app_5236/AssetBundle/{self.config.ab_version or self.config.app_version}/Release/cn_online1/android65/"

I think the downloadPath provided in the ABInfo should be used? Is there a reason it's not?

YumYummity avatar Mar 30 '25 03:03 YumYummity

Thanks again! These should be ready in the next version bump.

I think the downloadPath provided in the ABInfo should be used? Is there a reason it's not?

The JP/EN server ABInfo doesn't have it. That, plus the fact that storing bundles under their keys in bundles (which are already guaranteed to be unique) won't cause accidental overwrites either, is why it's not currently used.
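
For illustration only, a fallback along these lines could look like the sketch below. It assumes each ABInfo entry is a dict with a bundleName key and an optional downloadPath key, and that downloadPath is relative to the AB endpoint. Neither the helper name nor that layout is confirmed in this thread.

```python
def bundle_url(ab_endpoint, entry):
    """Pick the download URL for one ABInfo entry (hypothetical helper)."""
    # Assumption: ROW/CN entries may carry "downloadPath";
    # JP/EN entries do not, so fall back to the unique bundle name.
    path = entry.get("downloadPath") or entry["bundleName"]
    return ab_endpoint.rstrip("/") + "/" + path.lstrip("/")
```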

mos9527 avatar Mar 30 '25 03:03 mos9527

Thank you for the updates!

YumYummity avatar Mar 30 '25 03:03 YumYummity

@mos9527 I believe the ROW urls are starting to change for the AB endpoints; it'll download some things correctly, but others will return a 404. (this is only happening now, and not in the past.)

Image

I think the downloadPath should be used for ROW only

YumYummity avatar Mar 31 '25 01:03 YumYummity

@mos9527 I believe the ROW urls are starting to change for the AB endpoints; it'll download some things correctly, but others will return a 404. (this is only happening now, and not in the past.)

Image

Can you do some packet capture of the asset downloads with ROW builds of the game? Would be helpful if we know what kind of URL schema they're using now.

I think the downloadPath should be used for ROW only

Should be doable. We already had JP/EN specialization here: https://github.com/mos9527/sssekai/blob/main/sssekai/abcache/__init__.py#L666

mos9527 avatar Mar 31 '25 02:03 mos9527

Can you do some packet capture of the asset downloads with ROW builds of the game? Would be helpful if we know what kind of URL schema they're using now.

I might be able to later, but the changes I've seen are only the android number. Changing that seemed to work for a while, but it's specified in the downloadPath for ROW.

YumYummity avatar Mar 31 '25 02:03 YumYummity

Hi, I'm not sure whether my situation is related to this issue, but when I try to fetch the CN abcache, the error indicates that the URL might still be wrong ("failed to resolve"). The full error log reported by Python is attached below:

0405.log

Checking through the responses, I found that Nuverse provides two entries for the CN server in my case:

"entry":"https://mkcn-prod-public-60001-1.dailygn.com/api/;https://mkcn-prod-public-60001-2.dailygn.com/api/"

Also, it seems Nuverse uses multiple servers for downloading. In my case, the client connected to all of the following servers (in connection order; see also the last screenshot):

https://lf3-j1gamecdn-cn.dailygn.com
https://lf6-j1gamecdn-cn.dailygn.com
https://lf9-j1gamecdn-cn.dailygn.com
https://lf26-j1gamecdn-cn.dailygn.com

I tried modifying my local copy of __init__.py, setting SEKAI_API_ENDPOINT to https://mkcn-prod-public-60001-1.dailygn.com and SEKAI_AB_INFO_ENDPOINT & SEKAI_AB_INFO_ENDPOINT to https://lf26-j1gamecdn-cn.dailygn.com (which tested fastest in my region).

However, the terminal shows a bunch of "404 Client Error: Not Found for url" errors.

Image

As I am not familiar with how this all works, I am unsure whether this is because the client requests both /Release/cn_online1/android65/android65.tgrp and /Release/cn_online1/android96/android96.tgrp, as the following screenshot shows.

Image

Please let me know if you need extra information, I would be happy to help if there is anything I can do to assist with testing and solving this problem.

gingerbreap avatar Apr 05 '25 07:04 gingerbreap

Hi, I'm not sure whether my situation is related to this issue, but when I try to fetch the CN abcache, the error indicates that the URL might still be wrong ("failed to resolve"). The full error log reported by Python is attached below:

0405.log

The CN url at works for me with no 404s, but I'm only downloading events and music jackets/charts.

However, the terminal shows a bunch of "404 Client Error: Not Found for url" errors.

Image

A 404 means the AssetBundle URL is wrong. The old CN URL linked to a KR-like CN server where the assets were in Chinese, but it returned 404 for the CN exclusives and had the KR exclusives instead.

That many 404s probably means the URL is completely wrong.

As I am not familiar with how this all works, I am unsure whether this is because the client requests both /Release/cn_online1/android65/android65.tgrp and /Release/cn_online1/android96/android96.tgrp, as the following screenshot shows.

Image

The .tgrp files just seem to be large zip-like bundles of many assets at once, since /android65/asset/url still works for downloading individual assets. As for the android number, that seems to be defined by the entry's downloadPath.

YumYummity avatar Apr 05 '25 20:04 YumYummity

@YumYummity I assume you're in China for all the testing, which would explain why I can't access the current CN URL in the source code. Even if I use a proxy, sssekai still returns errors like CERTIFICATE_VERIFY_FAILED. I think I will have to stick with manually modifying these links.

gingerbreap avatar Apr 06 '25 09:04 gingerbreap

@YumYummity I assume you're in China for all the testing, which would explain why I can't access the current CN URL in the source code. Even if I use a proxy, sssekai still returns errors like CERTIFICATE_VERIFY_FAILED. I think I will have to stick with manually modifying these links.

I'm actually in Canada :)

VPN also works fine for USA etc

YumYummity avatar Apr 06 '25 15:04 YumYummity

@YumYummity I still doubt whether https://mkcn-prod-public-30001-1.dailygn.com should be used as the API endpoint for the CN server. Can you please check again whether you currently have access to https://mkcn-prod-public-30001-1.dailygn.com/api/system? It seems to me that it no longer exists: it returns 403 Forbidden even after I bypassed the certificate issue (the domain's certificate info tells me it is only returning results from a DNS provider). Everything looks good (except for still having a bunch of 404s) after switching the endpoint to https://mkcn-prod-public-60001-1.dailygn.com or https://mkcn-prod-public-60001-2.dailygn.com. For reference, I've attached the packet logs for the first-login download and the in-game download (CN uses a strange tgrp format to pack everything and download it together at the initial download, but downloads everything normally in game).

(Initial Download) CSV Format HAR Format
(In-game Download) CSV Format HAR Format

Also, I did some asset downloads (about 10% of the initial download) from the TW server, as I noticed there is also discussion about how things go differently when downloading assets. @mos9527

CSV Format HAR Format

gingerbreap avatar Apr 07 '25 05:04 gingerbreap

I still doubt whether https://mkcn-prod-public-30001-1.dailygn.com should be used as the API endpoint for the CN server.

Seems like this endpoint is no longer in use, which is to be expected considering it was added during the closed-beta test back in 2024.

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 18295
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;mkcn-prod-public-30001-1.dailygn.com. IN A

;; AUTHORITY SECTION:
dailygn.com.            572     IN      SOA     vip3.alidns.com. hostmaster.hichina.com. 2025031815 3600 1200 86400 600

;; Query time: 0 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Apr 07 16:08:57 CST 2025
;; MSG SIZE  rcvd: 132

Everything looks good (except for still having a bunch of 404s) after switching the endpoint to https://mkcn-prod-public-60001-1.dailygn.com or https://mkcn-prod-public-60001-2.dailygn.com.

60001 is indeed currently in use. I will be updating the endpoints to this one too.

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19975
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;mkcn-prod-public-60001-1.dailygn.com. IN A

;; ANSWER SECTION:
mkcn-prod-public-60001-1.dailygn.com. 463 IN CNAME alb-2db3d0jqa9se83rvq32dumazb.cn-shanghai.volcalb.com.
alb-2db3d0jqa9se83rvq32dumazb.cn-shanghai.volcalb.com. 58 IN A 14.103.30.202
alb-2db3d0jqa9se83rvq32dumazb.cn-shanghai.volcalb.com. 58 IN A 14.103.23.201

;; Query time: 0 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Apr 07 16:11:09 CST 2025
;; MSG SIZE  rcvd: 161
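
A rough Python equivalent of the resolves / does-not-resolve check shown in the two dig outputs, as a sketch. The helper name is made up here, and it relies on the system resolver rather than querying 8.8.8.8 directly, so results may differ from dig on some networks.

```python
import socket

def resolves(hostname, port=443):
    """Return True if the system resolver can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # Covers NXDOMAIN as well as other resolution failures
        return False
    return True

# Example (performs real DNS lookups when uncommented):
# for host in ("mkcn-prod-public-30001-1.dailygn.com",
#              "mkcn-prod-public-60001-1.dailygn.com"):
#     print(host, resolves(host))
```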

mos9527 avatar Apr 07 '25 08:04 mos9527

CSV Format HAR Format

@gingerbreap Thanks a lot. These have been quite helpful!

The schema pattern shown in these captures does seem to allow CN bundles to download. You can thoroughly test this with https://github.com/mos9527/sssekai/blob/main/misc/abcache_availability_check.py. It ~~practically DDOSes~~ hits the asset server with bundle requests w/o actually downloading them and collects the response codes. I think it should be fast enough.

code=<pending>, count=48663     code=200, count=88
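
The general idea of such a check, probing each bundle URL without fetching the body and tallying status codes, can be sketched as follows. This is not the linked script: it assumes HEAD requests are acceptable to the server, runs sequentially, and uses helper names invented for this example.

```python
from collections import Counter
import urllib.error
import urllib.request

def head_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; keep their code
        return error.code

def tally(urls, probe=head_status):
    """Count how many URLs returned each status code."""
    return Counter(probe(url) for url in urls)
```

Connection-level failures (urllib.error.URLError) are deliberately not caught here, so an unreachable host stops the run instead of being counted as a status code.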

I'll run the CN server test on my mac box - which seems to work already - then move on to KR/TW server tests. Will post results ASAP.

UPDATE 1: CN assets do seem to (largely) work. Image

UPDATE 2: TW assets too. Image

UPDATE 3: KR assets Image

mos9527 avatar Apr 07 '25 08:04 mos9527

I'm wondering if there's another way to do this for better efficiency and to avoid possible DDOS. The bundle location seems to already be present in the AssetBundleInfo, as (a part of) the downloadPath parameter:

Image

Would it be possible to do apidecrypt first, then extract the downloadPath and match them with the file for easier indexing and downloading?

Edit: I guess CN only has the bundle name, since at the initial download it performs a bulk download rather than a file-by-file download. I wonder how the CN client extracts specific files from the big boy. The tgrp file I got was not a standard archive file, and when the client sends the GET, it includes a Range header. Nuverse mentioned in their patch notes that they "optimized the download speed", probably by this measure. I'm wondering whether we will be able to extract assets from this bulk archive, just in case someday they stop updating / putting their assets on their server as individual files.
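
For reference, a ranged GET of the kind described can be built like this. It is a sketch only: the helper name is made up, and the offsets in any real request would have to come from wherever the client gets them, which is not established in this thread.

```python
import urllib.request

def ranged_request(url, offset, length):
    """Build a GET request for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive on both ends
    end = offset + length - 1
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{end}"})
```

A server that honours the header answers with 206 Partial Content and only the requested slice. Passing the request to urllib.request.urlopen would actually send it.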

gingerbreap avatar Apr 07 '25 09:04 gingerbreap

I'm wondering if there's another way to do this for better efficiency and to avoid possible DDOS. The bundle location seems to already be present in the AssetBundleInfo, as (a part of) the downloadPath parameter:

Image

Would it be possible to do apidecrypt first, then extract the downloadPath and match them with the file for easier indexing and downloading?

Edit: I guess CN only has the bundle name, since at the initial download it performs a bulk download rather than a file-by-file download. I wonder how the CN client extracts specific files from the big boy. The tgrp file I got was not a standard archive file, and when the client sends the GET, it includes a Range header. Nuverse mentioned in their patch notes that they "optimized the download speed", probably by this measure. I'm wondering whether we will be able to extract assets from this bulk archive, just in case someday they stop updating / putting their assets on their server as individual files.

This is indeed interesting - I'm guessing they went and implemented some kind of VFS with their new .tgrp format - I'm doubtful about the 'performance uplift' claim though.

BTW seems like the contents themselves aren't encrypted (besides Sekai's own implementation). Image

The header seems to contain a ~3MB GZip payload with lots of plaintext. It could be asset hashes and their respective offsets in the archive. Image
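
One way to pull such an embedded GZip payload out of a binary blob is to search for the gzip magic bytes and decompress from there. The sketch below does this on synthetic data. The function names are invented here and nothing about the real .tgrp header layout is assumed beyond "a gzip member sits somewhere inside".

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes + "deflate" method

def extract_gzip_payload(blob):
    """Return the decompressed first gzip member found in blob, or None."""
    start = blob.find(GZIP_MAGIC)
    if start < 0:
        return None
    # wbits = 16 + MAX_WBITS selects gzip framing; bytes after the member
    # end up in decompressor.unused_data instead of raising an error
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return decompressor.decompress(blob[start:])

def make_demo_blob(payload):
    """Synthetic stand-in: made-up prefix + gzip member + trailing bytes."""
    return b"\x00DEMO" + gzip.compress(payload) + b"TRAILING-BUNDLE-DATA"
```

The magic bytes can also occur by chance in unrelated binary data, in which case decompress raises zlib.error. A real parser would need the actual header structure instead of a byte search.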

Though considering how the header is missing a lot of info compared to AssetbundleIndexNew (bundleName for one) - plus the fact that the new URL schema has been validated on CN servers, we probably don't need to do much with this format atm.

mos9527 avatar Apr 07 '25 11:04 mos9527

I'm wondering if there's another way to do this for better efficiency and to avoid possible DDOS. The bundle location seems to already be present in the AssetBundleInfo, as (a part of) the downloadPath parameter

This should also probably be a non-issue, considering only one of us has to do the validation to see if the URL schema works, and to check whether they've actually updated anything if something goes wrong again.

mos9527 avatar Apr 07 '25 11:04 mos9527

Both TW and KR seem to be working on my side (I don't know why they all stopped working at some point near the end, see screenshot). I was checking CN but got many 403s; might that be because I'm using the 0.7.23 release rather than the latest code base?

Image

Also, I did a cross-check: using parts of the bytes header from the HTTP request, I copied the data at those addresses, saved it as a new file, and opened it directly in AssetStudio, and it worked. The same goes for the following part (I tried the range starting with 10 00 00 00 AA 91 96 8B 86 46 53 and ending before the next one, and it just worked). I can't believe they're just combining all the files together and letting the client split them.

Image
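
The manual carving described above (start at the 10 00 00 00 AA 91 96 8B 86 46 53 byte sequence, stop before the next one) can be sketched in Python as below. The function name is made up, and treating every occurrence of that sequence as a bundle boundary is an assumption taken from the description, not from any format specification.

```python
MARKER = bytes.fromhex("10000000AA91968B864653")

def carve(blob, marker=MARKER):
    """Split blob into slices that each start at an occurrence of marker."""
    starts = []
    index = blob.find(marker)
    while index != -1:
        starts.append(index)
        index = blob.find(marker, index + 1)
    # Each slice ends where the next marker begins; the last runs to EOF
    ends = starts[1:] + [len(blob)]
    return [blob[start:end] for start, end in zip(starts, ends)]
```

Bytes before the first marker (such as the archive header) are skipped, and the final slice may include trailing data that is not part of a bundle.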

gingerbreap avatar Apr 07 '25 12:04 gingerbreap

Both TW and KR seem to be working on my side (I don't know why they all stopped working at some point near the end, see screenshot). I was checking CN but got many 403s; might that be because I'm using the 0.7.23 release rather than the latest code base?

Not much actually changed API-wise between 0.7.23 and 0.7.24. This might be a temporary IP ban - though I didn't get any (despite having sent >40k requests in under 30 minutes).

0.7.24 was released a while ago, and it integrated all the fixes mentioned in this issue. With any luck, that should resolve it.

Also, I did a cross-check: using parts of the bytes header from the HTTP request, I copied the data at those addresses, saved it as a new file, and opened it directly in AssetStudio, and it worked. The same goes for the following part (I tried the range starting with 10 00 00 00 AA 91 96 8B 86 46 53 and ending before the next one, and it just worked). I can't believe they're just combining all the files together and letting the client split them.

This is quite confusing indeed. Overheads coming from hitting different bundle URL paths should be negligible - with HTTP/2 there's already multiplexing so that's even less of an issue. If anything - serving such a humongous blob of data sounds like a disaster for CDN caching.

IMO this is just them flipping dataminers off - and they're not doing a good job at it.

mos9527 avatar Apr 07 '25 13:04 mos9527

Currently trying with 0.7.24. Thanks for the update.

Update: The download is still ongoing, but it is incredibly slow and hits a bunch of IncompleteRead errors.

Image

(Moved the following to a new issue)

gingerbreap avatar Apr 07 '25 13:04 gingerbreap