Make date extraction more modular

Open tkuenzle opened this issue 2 years ago • 1 comments

First of all, thanks a lot @TheLastGimbus for coming up with this nice script, I have found it to be very useful!

Similar to #111, I noticed a few files for which this app cannot infer the correct date. On closer inspection, it looks like it would not be hard to add support for them to this script.

When I looked at the code, however, I identified two problems:

  • Adding more extraction methods makes the code more and more complex
  • Depending on the use case we might want different priorities of the ways to extract the date (e.g. analyzing the file name, using the Google Folder to get the year, etc.)

In order to address these two issues and make the script much easier to extend in the future, I would like to propose the following change:

Move all the logic for date extraction into a separate date_extractors module which could look like this:

from __future__ import annotations

from datetime import datetime
from pathlib import Path

class DateExtractor:
    def extract_date(self, file_path: Path) -> datetime | None:
        # Subclasses return the extracted date, or None if they cannot determine one
        raise NotImplementedError

class ExifDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> datetime | None:
        # implementation for extracting the date from the EXIF data
        ...

class FileNameDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> datetime | None:
        # implementation for extracting the date from the file name
        ...

# ... further extractors

We could then define a list of extractors (and even make it configurable through a command line argument) that specifies all the extractors we would like to apply, in order of priority. Getting the date would then be as simple as:

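extractors = [ExifDateExtractor(), FileNameDateExtractor()]

# The first extractor that returns a date wins; None if no extractor succeeds
date = next(
    (
        extracted_date
        for extractor in extractors
        if (extracted_date := extractor.extract_date(file_path)) is not None
    ),
    None,
)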

Such an implementation would nicely separate the extraction logic from priority and make it very easy for other people to add their own extractors.
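To make the idea more concrete, here is a minimal, self-contained sketch of how the priority chain could work. Everything beyond the `DateExtractor` interface above is an assumption for illustration only: the file name pattern (`IMG_20200131_235959.jpg`), the `FolderYearDateExtractor` class with its "Photos from YYYY" folder naming, the `EXTRACTORS` registry, and the helper functions `build_extractors` and `extract_date` are all hypothetical names, not existing code in the script.

```python
from __future__ import annotations

import re
from datetime import datetime
from pathlib import Path


class DateExtractor:
    """Base class: return a datetime, or None if no date can be determined."""

    def extract_date(self, file_path: Path) -> datetime | None:
        raise NotImplementedError


class FileNameDateExtractor(DateExtractor):
    # Assumed pattern: names such as IMG_20200131_235959.jpg
    PATTERN = re.compile(r"(\d{8})_(\d{6})")

    def extract_date(self, file_path: Path) -> datetime | None:
        match = self.PATTERN.search(file_path.name)
        if match is None:
            return None
        try:
            return datetime.strptime("".join(match.groups()), "%Y%m%d%H%M%S")
        except ValueError:
            # Digits matched but do not form a valid date (e.g. month 13)
            return None


class FolderYearDateExtractor(DateExtractor):
    # Assumed folder naming: "Photos from 2019"
    PATTERN = re.compile(r"Photos from (\d{4})$")

    def extract_date(self, file_path: Path) -> datetime | None:
        match = self.PATTERN.search(file_path.parent.name)
        if match is None:
            return None
        try:
            # Only the year is known, so fall back to January 1st
            return datetime(int(match.group(1)), 1, 1)
        except ValueError:
            return None


# Hypothetical registry; the keys could back a command line option
EXTRACTORS = {
    "filename": FileNameDateExtractor,
    "folder": FolderYearDateExtractor,
}


def build_extractors(names: list[str]) -> list[DateExtractor]:
    """Instantiate extractors in the given order of priority."""
    return [EXTRACTORS[name]() for name in names]


def extract_date(file_path: Path, extractors: list[DateExtractor]) -> datetime | None:
    """Return the first non-None result, or None if every extractor fails."""
    return next(
        (
            date
            for extractor in extractors
            if (date := extractor.extract_date(file_path)) is not None
        ),
        None,
    )
```

With this layout, changing the priority is only a matter of reordering the names, e.g. `build_extractors(["folder", "filename"])` would prefer the folder year over the file name.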

If you think this could be a good idea and would be open to such a change, I would be happy to discuss the details with you and come up with a PR.

tkuenzle avatar May 02 '22 16:05 tkuenzle

hmmm, this seems nice

although, generally, i'm tired of this script because it's messy+spaghetti+hack-ish. Currently I'm thinking of re-writing it in Dart or something. Though I will take this into consideration if i stick with Python

TheLastGimbus avatar May 06 '22 11:05 TheLastGimbus

Another issue resolved by v3 :tada:

TheLastGimbus avatar Dec 17 '22 23:12 TheLastGimbus