Jellyfin.Plugin.Stash

Feature: Add config option for full-path matching for deterministic matching of files on the same disk

Open C84186 opened this issue 11 months ago • 0 comments

Related: #19

The following is a very common deployment:

  1. a user hosts jellyfin on the same host as stash
  2. both stash and jellyfin have access to the same filesystem

In this scenario, the stash API has the unique advantage that there's absolutely no need to guess in order to find an exact match for a given scene in jellyfin.

tl;dr

Either by configuration or by default, when searching for a scene, use the entire path (including the extension) in an exact search.

If needed, the plugin could also allow mapping of jellyfin -> stash paths, to address cases where the file system is mounted at a different mountpoint in each service.

Current behaviour:

How the search query is currently constructed

  • Take the item's path
  • Take only the base-name (the file's name, discarding the folders leading to it)
  • Strip out the item's extension
  • Use the INCLUDES modifier in the search query, so any item in stash whose path contains that base name will be matched

https://github.com/DirtyRacer1337/Jellyfin.Plugin.Stash/blob/a9fc5bda9fa4e3eced69737570bed3576034f852/Jellyfin.Plugin.Stash/Providers/StashAPI.cs#L72-L92
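The steps above can be paraphrased in a few lines of Python. This is only a sketch of the behaviour described in this issue, not the plugin's actual C# code; the function name `build_includes_filter` and the sample path are made up for illustration.

```python
import posixpath


def build_includes_filter(item_path: str) -> dict:
    """Mimic the current lookup: base name without extension, matched with INCLUDES."""
    base_name = posixpath.basename(item_path)    # discard the folders leading to the file
    stem, _ext = posixpath.splitext(base_name)   # strip the extension
    return {"path": {"value": stem, "modifier": "INCLUDES"}}


# Hypothetical path, for illustration only.
print(build_includes_filter("/media/library/Some.Scene.mp4"))
```

Everything that makes the path unique (the folders and the extension) is thrown away before the query is sent, which is why collisions are possible.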

This has the following issues:

  • If the filename is something common, there will be multiple results
    • In my case, this occurs because I have duplicates of the same scene (see below)
    • In other cases, a file might have a path like the following:
      • /media/Gape Queens (2017) WEB-DL 540p SPLIT SCENES MP4-KLEENEX/1 Adriana Chechik.mp4
      • The base name will be 1 Adriana Chechik
        • This may also match against multiple entries in stash.
        • In this case, the collision is more likely to cause errors than if it were a duplicate of the same scene, as that title could apply to many scenes

Both cases cause the problem described in #19.

In action

In the current implementation, the following query is constructed:

https://my-stash-instance/graphql?query=query{findScenes(scene_filter%3A{path%3A{value%3A"\"HussiePass.20.09.04.Mz.Dani.\""%2Cmodifier%3AINCLUDES}}){scenes{id%2Ctitle%2Cdate%2Cpaths{screenshot}movies{movie{front_image_path}}}}}

An unescaped / prettified version of this:

take note of the INCLUDES modifier

findScenes(
  scene_filter: {path: {
    value: "HussiePass.20.09.04.Mz.Dani.", 
    modifier: INCLUDES}}
) {
  scenes {
    id
    title
    date
    paths {
      screenshot
    }
    movies {
      movie {
        front_image_path
      }
    }
  }
}

This gives the following result (with some fields redacted for brevity & anonymity):

{
  "data": {
    "findScenes": {
      "scenes": [
        {
          "id": "7899",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        },
        {
          "id": "7900",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        },
        {
          "id": "7901",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        }
      ]
    }
  }
}

Desired behaviour

For example, I can copy the following from jellyfin's Media Info panel:

[image: the file path as shown in jellyfin's Media Info panel]

and run the following graphql query from https://my-stash-instance/playground:

take note of the EQUALS modifier

{
  findScenes(
    scene_filter: {
      path: {
        value: "/data/cool_storage/qBittorrent/Complete/xxx_video/HussiePass.20.09.04.Mz.Dani.54.Inch.PAWG.vs.13.Inch.BBC..480p.MP4-XXX/HussiePass.20.09.04.Mz.Dani..mp4",
        modifier: EQUALS}}
  ) {
    scenes {
      id
      title
    }
  }
}

And the following result is returned:

{
  "data": {
    "findScenes": {
      "scenes": [
        {
          "id": "7901",
          "title": "54 Inch PAWG vs 13 Inch Cock"
        }
      ]
    }
  }
}
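For illustration, here is one way a client could build that exact-match request. It is a minimal sketch: it assumes the `value` field of the path filter accepts a `String!` variable and that the endpoint takes a standard JSON POST body, and the names `EXACT_PATH_QUERY` and `build_exact_request` are hypothetical. Passing the path as a GraphQL variable avoids hand-escaping quotes or odd characters in file names.

```python
import json

# Same query as above, with the path supplied as a variable.
EXACT_PATH_QUERY = """
query FindSceneByPath($path: String!) {
  findScenes(scene_filter: {path: {value: $path, modifier: EQUALS}}) {
    scenes { id title }
  }
}
"""


def build_exact_request(item_path: str) -> str:
    """Return a JSON body suitable for a POST to the /graphql endpoint."""
    return json.dumps({"query": EXACT_PATH_QUERY, "variables": {"path": item_path}})


# Hypothetical path, for illustration only.
print(build_exact_request("/media/library/Some.Scene.mp4"))
```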

Comparison between exact and partial matching

To demonstrate these side-by-side:

{
  findScenesExact: findScenes(
    scene_filter: {path: {
      value: "/data/cool_storage/qBittorrent/Complete/xxx_video/HussiePass.20.09.04.Mz.Dani.54.Inch.PAWG.vs.13.Inch.BBC..480p.MP4-XXX/HussiePass.20.09.04.Mz.Dani..mp4",
      modifier: EQUALS}}
  ) {
    scenes {
      id
      title
    }
  }
  findScenesInclude: findScenes(
    scene_filter: {path: {
      value: "HussiePass.20.09.04.Mz.Dani.", 
      modifier: INCLUDES}}
  ) {
    scenes {
      id
      title
      date
    }
  }
}

Result:

{
  "data": {
    "findScenesExact": {
      "scenes": [
        {
          "id": "7901",
          "title": "54 Inch PAWG vs 13 Inch Cock"
        }
      ]
    },
    "findScenesInclude": {
      "scenes": [
        {
          "id": "7899",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        },
        {
          "id": "7900",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        },
        {
          "id": "7901",
          "title": "54 Inch PAWG vs 13 Inch Cock",
          "date": "2020-09-04"
        }
      ]
    }
  }
}

The "exact" query returns one result, while the "include" query returns three.
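Whichever modifier is used, the caller could refuse to guess when the response is ambiguous. The sketch below accepts a match only when exactly one scene comes back; the helper name `pick_unique_scene_id` is hypothetical, and the sample data is trimmed from the response shown above.

```python
from typing import Optional


def pick_unique_scene_id(response: dict, alias: str = "findScenes") -> Optional[str]:
    """Return the scene id only if the query matched exactly one scene."""
    scenes = response.get("data", {}).get(alias, {}).get("scenes", [])
    if len(scenes) == 1:
        return scenes[0]["id"]
    return None  # zero or several candidates: do not guess


response = {
    "data": {
        "findScenesExact": {"scenes": [{"id": "7901"}]},
        "findScenesInclude": {"scenes": [{"id": "7899"}, {"id": "7900"}, {"id": "7901"}]},
    }
}
print(pick_unique_scene_id(response, "findScenesExact"))
print(pick_unique_scene_id(response, "findScenesInclude"))
```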

Limitations of the "exact" approach

For some containerized deployments, the same files may be mounted at different paths in stash and jellyfin.

This scenario is not unique to stash / jellyfin, and is well-documented in similar media apps, for example, sonarr: https://trash-guides.info/Sonarr/Sonarr-remote-path-mapping/

To address this, we could add a configuration option for mapping these paths. Some apps support this, and I have implemented something similar in my own script for transferring torrents across systems: https://github.com/C84186/qBittorrent_tag_by_pattern/blob/main/paths.py#L60

That being said, I think it's reasonable to make exact matching a configuration option, and then put the onus on the user to ensure their filesystem paths match up (it's not that hard)
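If path mapping were added later, it could be as small as a prefix rewrite. The following is a sketch only: the function name `map_path`, the shape of `prefix_map`, and the example mountpoints are assumptions, not an existing plugin setting.

```python
from typing import Dict


def map_path(jellyfin_path: str, prefix_map: Dict[str, str]) -> str:
    """Rewrite a jellyfin path into stash's view, using the longest matching prefix."""
    for src in sorted(prefix_map, key=len, reverse=True):
        root = src.rstrip("/")
        # Match whole path components only, so "/media" does not match "/mediafoo".
        if jellyfin_path == root or jellyfin_path.startswith(root + "/"):
            return prefix_map[src].rstrip("/") + jellyfin_path[len(root):]
    return jellyfin_path  # no mapping applies: paths are assumed to be identical


# Hypothetical mountpoints: jellyfin sees /media, stash sees /data.
print(map_path("/media/xxx_video/Some.Scene.mp4", {"/media": "/data"}))
```

With an empty map the path is passed through unchanged, which covers the simple same-host deployment described at the top of this issue.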

C84186, Mar 18 '24 04:03