Change in Google Takeout format?

Open lewurm opened this issue 2 years ago • 5 comments

Quoting from https://github.com/mholt/timeliner/wiki/Data-Source:-Google-Photos :

  • Google can change the Takeout archive format at any time, breaking this implementation. Please help maintain this feature if you use it!

Did this happen now? Exports larger than 50 GB are now split into multiple archives (screenshots: gtakeout1, gtakeout2).

While the first archive of a split seems to be accepted fine by timeliner import, the remaining archives do not print anything (even with -v) and exit after a few seconds.

I also tried to unpack all the files and repackage them into a single large one, but timeliner import fails right away:

2022/01/08 20:57:38 [ERROR][google_photos/[email protected]] Importing: importing: walking metadata.json: walking ._IMG_1337.HEIC.json: decoding item metadata file Takeout/Google Photos/Album2021/._IMG_1337.HEIC.json: invalid character '\x00' looking for beginning of value

Maybe that's related to the way I repackage it? The file headers look like this:

$ file takeout-20220106T172751Z-001.tgz takeout-20220106T172751Z-all.tgz
takeout-20220106T172751Z-001.tgz: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 2273460736
takeout-20220106T172751Z-all.tgz: gzip compressed data, last modified: Fri Jan  7 20:52:46 2022, from Unix, original size modulo 2^32 395496448

where takeout-20220106T172751Z-all.tgz is my repackaged archive (on macOS).

Anyway, that would merely be a workaround; it would be great if timeliner import supported the split archives generated by Google.

lewurm avatar Jan 08 '22 20:01 lewurm

Good question. I haven't tried with split takeout files yet. Am mobile right now but want to get this working. Contributions / proposals welcome here 🙂

mholt avatar Jan 08 '22 20:01 mholt

From a quick look, it seems like all .json files are in the first split archive only. I might dig into the source code a bit tomorrow 🙂

lewurm avatar Jan 08 '22 21:01 lewurm

Ohh that's interesting... hmm, and somewhat problematic. Will think on this. Let me know if you think of something!

mholt avatar Jan 09 '22 04:01 mholt

So I tried my repackaging idea again, but this time using GNU tar on macOS (brew install gnu-tar), and then timeliner at least doesn't trip:

$ cat takeout-20220106T172751Z-0*.tgz | gtar xzivf -
$ gtar -cvzf takeout-20220106T172751Z-all.tgz Takeout/

However, I still do not see GPS info in most pictures when doing timeliner import ... with the combined archive. Not sure what's going on, but it's definitely quite slow and does a lot of disk reading.

I was looking a bit at takeoutarchive.go regarding supporting multiple archives, but I think it would be easier and more performant if it operated on the unpacked Takeout folder instead. It even looks like with archiver v4 that should be rather easy to do, while also keeping support for a single archive file?

lewurm avatar Jan 12 '22 10:01 lewurm

Nice find with the gnu-tar fix. I also wonder if filenames like ._* are macOS-only or something weird.
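
For context, `._*` entries are most likely AppleDouble sidecar files, which the tar bundled with macOS can write to carry extended attributes. They are binary and, as far as I know, start with the magic bytes `00 05 16 07`, which would fit the `invalid character '\x00'` error in the log above. A small sketch of how an importer could recognize and skip them; `isAppleDouble` is a hypothetical helper, not something that exists in timeliner.

```go
package main

import (
	"bytes"
	"path"
	"strings"
)

// appleDoubleMagic is the 4-byte magic number at the start of an
// AppleDouble file.
var appleDoubleMagic = []byte{0x00, 0x05, 0x16, 0x07}

// isAppleDouble reports whether an archive entry looks like a macOS
// AppleDouble sidecar: either its base name starts with "._" or its
// first bytes match the AppleDouble magic number.
// name is the slash-separated entry path; head holds the first bytes
// of the entry and may be nil if nothing has been read yet.
func isAppleDouble(name string, head []byte) bool {
	if strings.HasPrefix(path.Base(name), "._") {
		return true
	}
	return bytes.HasPrefix(head, appleDoubleMagic)
}
```

Checking the name alone is cheap; the magic-byte check is a fallback for entries whose names were changed along the way.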

> However, I still do not see GPS info in most pictures when doing timeliner import ... with the combined archive. Not sure what's going on, but it's definitely quite slow and does a lot of disk reading.

One thought... if they already existed in your timeline, it's possible that timeliner is skipping those ones entirely. Or maybe our EXIF reader just isn't finding the data in some files for some reason.

> It even looks like with archiver v4 that should be rather easy to do, while also keeping support for a single archive file?

Yep, exactly, and I've already got that working locally in Timeliner's successor, Timelinize:

  • https://twitter.com/timelinize
  • https://github.com/mholt/timeliner/issues/76#issuecomment-997246237

And that was the primary motivation for writing archiver v4.

It's my nights-and-weekends project so I still have a lot to do before it's polished enough to share, but I'm making progress 💪

mholt avatar Jan 12 '22 18:01 mholt

I now have more info about Timelinize, as well as a Discord community if you want to help try it out and offer feedback. https://timelinize.com (also updated this project's README).

mholt avatar Jan 19 '24 18:01 mholt