immich icon indicating copy to clipboard operation
immich copied to clipboard

feat(web): View all duplicates page

Open arnolicious opened this issue 9 months ago • 25 comments
trafficstars

Feature Description

Add a new intermediate page, that displays all the found duplicates alongside how many duplicate assets each duplicate has. From there, you can click on any duplicate to arrive at the "old" duplicate selection mechanism for that specific stack. After resolving, it should move on to the next duplicate in the list image

With this change, we can now also link from "Viewing an asset" straight to the duplication page for that duplicate stack.

image

Implementation

  • Changed the page params of the "old" duplicate page to require the duplicateId
    • Here I'm a little unsure why there's the optional [[photos=photos]] param, I assume it's not relevant anymore with these changes, but I wanted confirmation before removing it
  • Added the new page under utilities/duplicates
  • Added a link to the deduplication page from the asset viewer, if the asset has a duplicateId

arnolicious avatar Feb 02 '25 16:02 arnolicious

Can you add some screenshots to the PR description?

bo0tzz avatar Feb 02 '25 16:02 bo0tzz

There you go :) @bo0tzz

arnolicious avatar Feb 02 '25 16:02 arnolicious

What happens if you have 10_000 or 20_000 duplicates? Will it crash the page?

alextran1502 avatar Feb 03 '25 04:02 alextran1502

Seems to run just fine, at least for me. I adjusted the load function of utilities/duplicates/+page.ts like this to stress test:

export const load = (async ({ params }) => {
  await authenticate();
  const asset = await getAssetInfoFromParam(params);
  const duplicates = await getAssetDuplicates();
  const $t = await getFormatter();

  const mockedDuplicates = [...duplicates];

  for (let i = 0; i < 10_000; i++) {
    mockedDuplicates.push(...duplicates);
  }

  return {
    asset,
    duplicates: mockedDuplicates,
    meta: {
      title: $t('duplicates'),
    },
  };
}) satisfies PageLoad;

arnolicious avatar Feb 03 '25 17:02 arnolicious

How about 20,000? usually, with these types of rendering, the load can be stressed on both the server load and the web browser crashing. I am very positive we will run into issues since the query isn't lazy paginated as well as the view doesn't have lazy loading mechanism

I chose to go with single duplicate because of the aforementioned limitation

alextran1502 avatar Feb 03 '25 17:02 alextran1502

Alright, so we'd need to implement a client side pagination to merge this?

I assume only client side, since the current single duplicate implementation also already calls the entire duplicate list from the server, so this shouldn't be a problem

arnolicious avatar Feb 03 '25 17:02 arnolicious

Let's take a few steps and understand the problem we are trying to solve. Below are some of my feed back, and thoughts

  • The intermediate page doesn't have a clear indication that the user can click on the group of duplicates to handle them.
  • There should be bulk action on this page, and the group of duplicate should be selectable to perform deduplication.
  • I agree with you that we should have pagination on this page, breadcrumb navigation is a good solution.
  • It would be nice to show all the duplicated assets while hovered on each group

Heck, perhaps we can replace the current view if the above points are implemented. What do you think?

alextran1502 avatar Feb 03 '25 19:02 alextran1502

Make sure to consider #13851 before you get too far.

NicholasFlamy avatar Feb 03 '25 22:02 NicholasFlamy

Alright, thanks for all the feedback! I'll give it a try in the coming days/weeks

My idea for combining the 2 views would be something like this: Having the duplicates group be selectable, and when selected the old duplicate resolution component gets shown at the top of the page.

Is that something you had in mind?

arnolicious avatar Feb 05 '25 14:02 arnolicious

I think it is worth to draft a wireframe design so we don't misunderstand how the page will look like

alextran1502 avatar Feb 05 '25 15:02 alextran1502

How about something like this? The currently selected group would be the one displayed in the top section, and visible in the bottom list with a border. image

Tho for my personal use case, I would wish to be able to see a bigger part of all the duplicates somewhere, somehow

arnolicious avatar Feb 07 '25 14:02 arnolicious

I'm not a huge fan of the explicit numbered pages. Maybe there's a more subtle way to paginate this, or to use virtual scroll like the timeline?

mertalev avatar Feb 07 '25 16:02 mertalev

I did also consider this, would just be a bit more clicking if you specifically wanted to get to the end of the list for example image

arnolicious avatar Feb 07 '25 16:02 arnolicious

I did also consider this, would just be a bit more clicking if you specifically wanted to get to the end of the list for example image

I like this a lot better, how are you planning to handle loading, paginating in this scroll view?

alextran1502 avatar Feb 07 '25 23:02 alextran1502

I did also consider this, would just be a bit more clicking if you specifically wanted to get to the end of the list for example image

I love this. This is exactly what I would like.

NicholasFlamy avatar Feb 08 '25 00:02 NicholasFlamy

I would create a renderedItems list that only contains the current page + the ones before and after. And each time you change page, that list is updated like a sliding window over the entire list.

I don't think that should be too hard to implement, without the need for any additional libraries.

arnolicious avatar Feb 08 '25 09:02 arnolicious

I happened to see this while searching for search improvements so just leaving a note to consider (or disregard), that all the pages (timeline, people, albums, search etc) in the app have a consistent theme/pattern. A vertically scrolling/paginated on scroll approach but now with the above screenshot we seem to be introducing a brand new way/pattern of horizontal scrolling on click that not only goes against the existing pattern but makes scrolling through 1000s of duplicates a tedious click through. (as opposed to the usual vertically scrolling/paginated on scroll approach in the app).

Apologies, if not relevant please disregard.

atlas-shrugged08 avatar Feb 10 '25 21:02 atlas-shrugged08

I agree a vertical scroll oriented design would be more intuitive, easier to use and more consistent with other pages.

That being said, I'm not a web dev and not sure how difficult it is to make it a virtual scroll. It's at least easy to do it the way the search result page works (load more results on scroll without removing earlier entries from the DOM). It would start having issues once you scroll enough to have >5000 groups loaded, but that might be fine tbh.

mertalev avatar Feb 10 '25 21:02 mertalev

That being said, I'm not a web dev and not sure how difficult it is to make it a virtual scroll.

Just a bit ago I ran into the work done for the timeline and I almost fainted. (The timeline has inconsistent sizing that has to be estimated and fixed on the scrollbar and stuff, so that's why it's so complicated.)

NicholasFlamy avatar Feb 10 '25 21:02 NicholasFlamy

Well tbh, this was more of a ui problem than a technical one. I'm sure we can make virtual scroll work horizontally or vertically.

I just didn't have a good idea on how to make a vertically scrolling list/grid while also having the dedupe mechanism on the same page.

If someone's got ideas or suggestions, I'm open :D

arnolicious avatar Feb 10 '25 21:02 arnolicious

on the same page

Is that truly necessary? What about clicking on an image pops up the current page (to the specific image you click)? And if you wanna be fancy, it could be a big (almost full-screen) popup.

NicholasFlamy avatar Feb 10 '25 23:02 NicholasFlamy

Well, that is the current implementation, basically. List of duplicate groups, you click on it and land on the known duplicate page.

But Alex suggested to merge those into one: https://github.com/immich-app/immich/pull/15856#issuecomment-2631906971

arnolicious avatar Feb 10 '25 23:02 arnolicious

Well, that is the current implementation, basically. List of duplicate groups, you click on it and land on the known duplicate page.

Oh yeah, I just tried it. It's pretty good but yeah, some stuff could be addressed. Also, you should try rebasing when you merge main and then force push.

NicholasFlamy avatar Feb 11 '25 03:02 NicholasFlamy

So, a little update on what I've done now:

  • Added a virtual list to virtualize the duplicates grid which should avoid performance issues on huge lists
  • Added the Bulk actions to the overview page
  • Removed the bulk actions from the detail page, since that seemed confusing / redundant now
  • Hopefully improved UX:
    • Added helper text to the grid view, to explain to click on a duplicate group to handle it
    • Added a border when hovering over the items in the grid view to show "more clickability"
    • Added a description and icon to the link from the asset detail panel to the duplicate

In wanting to implement Mert's comments, and improve effiency in general, I also added 2 new endpoints for duplicates:

  • GET duplicates/:id
  • GET duplicates/info - Gives out the list of all duplicates, but only with one asset per duplicate

However I didn't end up actually using those:

  • Since the bulk actions are now in the grid view, I do need the entire asset list there to perform the bulk actions.
  • The duplicates/:id Endpoint was meant for the detail page, however if I don't have the list of all the duplicates, idk how to load the next duplicate group after handling the current one.

arnolicious avatar Feb 15 '25 11:02 arnolicious

This feature looks great, for now I am going to convert the PR to a draft, once you're ready for a review again, please feel free to change it back once you've implemented the requested changes and then make new requests for reviews from people. Thanks!

zackpollard avatar Mar 03 '25 18:03 zackpollard

I have lost many battles against the CI, but I won the war!

I'd like to gather some feedback / review on the current state of things, as described in my last comment. In my opinion, things seem to work and look pretty good ✨

arnolicious avatar May 17 '25 12:05 arnolicious

Fetching all duplicates is the status quo right now, so that in itself isn't a regression. Agree it should be tested on a large number of duplicates though.

mertalev avatar May 17 '25 18:05 mertalev

Alrighty, I have some improvements:

  • The duplicate image elements now only fetch the image from the server, once it's in the viewport
  • Built a DIY virtual grid that only renders items in the viewport, replacing the additional dependency
  • It also tracks the scroll speed, to only render a placeholder when scrolling fast
    • This helps when scrolling far down faster: Since only the placeholders are rendered before, no img fetches are started, that would hang up the fetches of the currently visible images

I tested all this using this imagenet dataset, included twice as two different external libraries, which contains 34k images

arnolicious avatar May 20 '25 11:05 arnolicious

@midzelis could you have a look at the grid?

danieldietzler avatar May 26 '25 10:05 danieldietzler

I did also consider this, would just be a bit more clicking if you specifically wanted to get to the end of the list for example

I like this a lot better, how are you planning to handle loading, paginating in this scroll view?

I also like this version a lot.

midzelis avatar May 27 '25 00:05 midzelis