Import stops on any failure; FR: add a retry option
Hi there,
I'm in the middle of uploading my 177k (!) pictures, and unfortunately, at the first sign of failure, the whole program stops and doesn't continue. Between the "immich content" step and the "google photo puzzle" step, it already takes more than 1h to get back to the same point.
Sounds like there should be an option to either:
- retry right now
- resume from where it failed
- skip the file.
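For what it's worth, the three options above could be combined into a per-file policy along these lines. This is only a minimal Python sketch, not immich-go's actual code (which is written in Go): `upload_all`, its parameters, and the `upload` callable are hypothetical names made up for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def upload_all(
    files: Iterable[str],
    upload: Callable[[str], None],  # hypothetical: raises an exception on failure
    max_retries: int = 3,
    backoff_seconds: float = 1.0,
    on_error: str = "skip",  # "skip" to move on to the next file, "stop" to abort the run
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try each file up to 1 + max_retries times, then skip it or stop."""
    done: List[str] = []
    failed: List[Tuple[str, str]] = []
    for path in files:
        for attempt in range(max_retries + 1):
            try:
                upload(path)
                done.append(path)
                break
            except Exception as exc:
                if attempt < max_retries:
                    # "Retry right now", with exponential backoff between attempts.
                    time.sleep(backoff_seconds * (2 ** attempt))
                    continue
                if on_error == "stop":
                    raise
                # "Skip the file": remember it so it can be dealt with later.
                failed.append((path, str(exc)))
    return done, failed
```

Writing the `failed` list (or the `done` list) to disk at the end would be one way to get "resume from where it failed" without redoing the whole scan.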
In particular, I'm currently facing an Internal Server Error on one file and am not sure I can ever get past it. (Filing a separate issue right now.)
Apparently there are easy "Internal Errors" where the server replies immediately, and immich-go may be able to do something smart with the error. And there are errors that hurt the server: all connections are frozen until a timeout runs out after 20 minutes...
A long timeout is needed to give the server enough time to process large files. The server can remain quiet for many minutes.
At the moment, I'm trying to find the conditions (or logic errors in the immich-go code) that lead to this kind of freezing error. In many cases, it's sufficient to run immich-go again to clear the error, so it's difficult to get accurate logs.
Absolutely, they appear to be quite random, and that would make sense from your point of view if they are server-side.
The thing is, in my case I run it on 180k media files, so it always fails eventually, and I run into edge cases like #700, #701, #703 and actually #706 (sorry!).
#700 / #699 (mine) and #706 seem to be the same issue though.
The UI doesn't make it clear whether immich-go is hanging while waiting for the server or has simply stopped, so I killed immich-go and started over after a while. Regardless, on two nights I went to bed with immich-go running and woke up to the error, meaning it probably hung for hours instead of timing out and retrying.
FYI in #699 & #706:
2025-02-11 20:09:01 ERR upload error file=takeout-20250206T162211Z-044:Takeout/Google Photos/Photos from 2018/4016.png error=updateAsset, PUT, http://localhost:2283/api/assets/6151d447-cccf-4a8b-8f78-431d988a0251, 500 Internal Server Error
Failed to update asset
2025-02-12 06:18:13 ERR upload error file=takeout-20250209T204248Z-005:Takeout/Google Photos/XXXXXXXXX/IMG_5616.HEIC error=updateAsset, PUT, http://192.168.1.164:2283/api/assets/b77cb63e-e26e-4cae-ae11-9fac6ad0083f
I'm restarting now with --log-level=DEBUG --api-trace
Grab the server log as well
Having the same issue as well! After this error, I think the tool just hung and did not try to upload the remaining files.
2025-04-04 22:57:44 WRN Info file=PhotosLibrary.photoslibrary:originals/3/3B72B5EB-82FA-40A2-944F-84240DB9843F.png warning=can't read metadata for this format '.png'
2025-04-04 23:12:11 ERR upload error file=PhotosLibrary.photoslibrary:originals/3/3B37B5E5-1E04-4F7D-9A7D-29E61606B8E5.mov error=io: read/write on closed pipe
AssetUpload, POST, http://10.0.0.46:2283/api/assets
Post "http://10.0.0.46:2283/api/assets": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Update: Currently retrying the tool, will send the server log here once I finish this run
Post "http://10.0.0.46:2283/api/assets": context deadline exceeded
This is a typical case where retrying would not help...
Just give the server more time to ingest your file: add --client-timeout=20m to the command line.
In my case it seems like immich-go is processing files for the server far faster than the server can ingest them.
So as immich-go runs, it completes the scan and then begins uploading. The server jobs report fluctuates around ~15 "Active". Meanwhile, the number of "waiting" jobs keeps increasing into the thousands. I presume that's because the queue is growing faster than the server can process them. At about 10,000 "waiting", I eventually get the following error.
2025-04-19 10:41:07 ERR upload error file=takeout-20250412T215025Z-001:Takeout/Google Photos/xxxxxxx/GOPR0398.MP4 error=io: read/write on closed pipe
AssetUpload, POST, http://192.168.86.112:2283/api/assets
Post "http://192.168.86.112:2283/api/assets": write tcp 192.168.86.112:54006->192.168.86.112:2283: write: broken pipe
2025-04-19 10:41:07 ERR io: read/write on closed pipe
AssetUpload, POST, http://192.168.86.112:2283/api/assets
Post "http://192.168.86.112:2283/api/assets": write tcp 192.168.86.112:54006->192.168.86.112:2283: write: broken pipe
After the error, the server continues processing the 10,000 that were waiting.
I'm not certain I've read the situation exactly right, this is just my interpretation. Happy to provide logs if helpful.
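If that interpretation is right (and it may not be), one way to picture a fix is client-side backpressure: stop sending new files once the "waiting" count passes a high-water mark, and resume once it has drained. The Python sketch below is purely illustrative. `upload_with_backpressure`, `upload`, and `get_waiting_jobs` are hypothetical names, and nothing here reflects how immich-go or the Immich job API actually work.

```python
import time
from typing import Callable, Iterable


def upload_with_backpressure(
    files: Iterable[str],
    upload: Callable[[str], None],        # hypothetical: sends one file to the server
    get_waiting_jobs: Callable[[], int],  # hypothetical: current number of "waiting" jobs
    high_water: int = 1000,
    low_water: int = 200,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Upload files, pausing whenever the server's waiting queue gets too long."""
    uploaded = 0
    for path in files:
        if get_waiting_jobs() >= high_water:
            # Queue is too long: wait until the server has drained it
            # down to the low-water mark before sending anything else.
            while get_waiting_jobs() > low_water:
                sleep(poll_seconds)
        upload(path)
        uploaded += 1
    return uploaded
```

Using two thresholds rather than one avoids pausing and resuming on every single file when the queue hovers around the limit.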
Which version of immich-go are you using?
0.25.3
@simulot I got this to work quite well and wanted to share my experience in case it's useful to you. I'm running both Immich and immich-go on a Synology 1019+, which is on the lower end of hardware but probably also a very common use case.
- My environment: immich-go is running directly on the NAS, and Immich itself is running in Docker on the same machine
- When first getting started, I'd regularly run into various errors that stopped the service. This was very frustrating when I intended to run it overnight on my collection of ~240,000 assets and the service ended up erroring out soon after going to bed with no progress made for the rest of the night.
- (So in direct response to this thread, +1 from me for having a retry option. I think it'd help a lot of people.)
- On my Synology NAS, the default temp directory wasn't working for larger files. There's some sort of space cap. Setting the environment variable as you recommended in that other thread (that was me) made a world of difference.
- The next problem I encountered appears to be related to the lower powered CPU on my NAS. Once immich-go began importing files, the server struggled to keep up with all the jobs (transcoding, facial recognition, etc.) and would eventually close the pipe and error out immich-go (again, a retry/resume option would be helpful here). In my case, I've worked around this problem by pausing most of the immich "jobs" that happen at import. (others can do this at https://(immich-address)/admin/jobs-status) In particular, I've turned off Generate Thumbnails, Face Recognition, and Transcode videos.
- For good measure, I set the client timeout to 20m and instructed it to ignore errors.
For reference, for anyone else looking to avoid some of the problems I ran into (presumably due to a relatively low-powered machine), the solution that worked for me was:
sudo env IMMICHGO_TEMPDIR=/volume1/temp/ ./immich-go upload from-google-photos --server=http://<immichIP>:<port> --time-zone=<insertyourshere> --client-timeout=20m --on-server-errors=continue --api-key=<insertyourshere> takeout-*.zip
With this exact setup (the above command + turning off all the extra immich jobs) I went from being able to only haltingly upload ~5,000 assets at a time, to being able to upload the entire directory of 240,000 in one single run. I did have 12 errors along the way but I'll deal with those later. The important thing is almost everything is in immich now.
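For dealing with the leftover errors afterwards, it can help to pull the failed file names out of the log. Here is a minimal Python sketch that assumes the log lines look like the `ERR upload error file=... error=...` entries pasted earlier in this thread. The function and type names are made up for illustration, and the regular expression is a guess from those samples, not a documented immich-go log format.

```python
import re
from typing import List, NamedTuple


class UploadError(NamedTuple):
    timestamp: str
    file: str
    error: str


# Pattern inferred from the log samples in this thread (an assumption,
# not a documented format). File names may contain spaces, so the file
# field runs up to the first " error=".
_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ERR upload error file=(.+?) error=(.*)"
)


def failed_uploads(log_text: str) -> List[UploadError]:
    """Return one entry per 'ERR upload error' line found in the log text."""
    results: List[UploadError] = []
    for raw in log_text.splitlines():
        # Drop the box-drawing borders that the terminal UI adds when copy-pasting.
        line = raw.strip(" │")
        match = _LINE.match(line)
        if match:
            results.append(
                UploadError(match.group(1), match.group(2), match.group(3).strip())
            )
    return results
```

Calling `failed_uploads(open("immich-go.log").read())` (the file name here is only a placeholder) would give a list of files to retry or inspect by hand.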