
Add LightGlue model

sbucaille opened this pull request 1 year ago

What does this PR do?

This PR implements LightGlue, a keypoint matching model that builds on top of SuperPoint and optimizes SuperGlue's inference time while setting SotA results in image matching tasks. SuperGlue is currently being implemented in a separate PR, so this one should be merged after it. This branch is based on SuperGlue's branch, and the PR was created before SuperGlue's is finished in order to get CI test results. Once SuperGlue's PR is merged, this PR will be rebased onto main.

For now, LightGlue is implemented like SuperGlue, as a standalone model used through AutoModel, but this PR will also be where AutoModelForKeypointMatching is created to group SuperGlue and LightGlue together.

This model supports FlashAttention2.
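
A minimal usage sketch (the checkpoint name is the temporary hub repo mentioned later in this thread, so treat it as a placeholder):

import torch
from transformers import AutoModel

# Standard transformers switch for FlashAttention2; requires flash-attn
# to be installed and a supported GPU.
model = AutoModel.from_pretrained(
    "stevenbucaille/lightglue",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
)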

Who can review?

@amyeroberts (cc @NielsRogge)

sbucaille avatar Jun 30 '24 15:06 sbucaille

@amyeroberts

At first I had a weird bug with the test_sdpa_can_compile_dynamic test, with the following message:

C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0] Error in codegen for ComputedBuffer(name='buf0', layout=AliasedLayout('cpu', torch.float16, size=[1], stride=[1]), data=Pointwise(
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   'cpu',
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   torch.float16,
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   def inner_fn(index):
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]       _ = index
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]       tmp0 = ops.index_expr(s2, torch.float16)
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]       return tmp0
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   ,
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   ranges=[1],
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   origin_node=None,
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0]   origins={cat}
C0710 21:51:39.831000 140316135185280 torch/_inductor/scheduler.py:781] [24/0] ))

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

var = <torch._inductor.codegen.cpp.CppCSEVariable object at 0x7f63391083a0>
cache = {'c10::convert<half>(ks0)': <torch._inductor.codegen.cpp.CppCSEVariable object at 0x7f63391083a0>}

    def find_fp32_var(var, cache):
        fp32_cse_var = None
        fp32_cse_var_name = None
        lowp_dtype = None
        for expr, cse_var in cache.items():
            if cse_var == var:
                lowp_dtype = is_to_lowp_dtype(expr)
                if lowp_dtype:
                    m = re.search(r"tmp\d+", expr)
>                   assert m
E                   torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
E                   AssertionError: 
E                   
E                   
E                   You can suppress this exception and fall back to eager by setting:
E                       import torch._dynamo
E                       torch._dynamo.config.suppress_errors = True

This message pointed at an integer variable height that I use to normalize keypoints. By adding the following lines before where it happens, I managed to pass the test, but is this a good way of doing it?

height = torch.tensor(height, device=device)
width = torch.tensor(width, device=device)
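
For context, here is a minimal sketch of the normalization step in question (names are illustrative, not the exact PR code):

import torch

def normalize_keypoints(keypoints: torch.Tensor, height: int, width: int) -> torch.Tensor:
    # Wrapping the Python ints in tensors first is what lets inductor
    # compile the fp16 pointwise ops with dynamic shapes.
    device = keypoints.device
    height = torch.tensor(height, device=device)
    width = torch.tensor(width, device=device)
    size = torch.stack([width, height])        # (2,)
    center = size / 2                          # image center
    scale = size.max() / 2                     # isotropic scale
    return (keypoints - center[None]) / scale  # roughly in [-1, 1]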

For some reason, I can't get past the following error with test_sdpa_can_dispatch_on_flash: it can't find the flash attention backend, although I am running the latest torch version, 2.3.1, on WSL with a 4080 GPU.

    def forward(self, q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> Tuple[torch.Tensor, None]:
>       attn_output = torch.nn.functional.scaled_dot_product_attention(q, k, v)
E       RuntimeError: No available kernel. Aborting execution.
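
For reference, a minimal diagnostic sketch (not part of the PR) that forces the flash backend in isolation; flash SDPA on CUDA needs fp16/bf16 inputs and a supported GPU architecture:

import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Restrict dispatch to the flash kernel only; this raises
# "No available kernel" whenever flash cannot run on these inputs/hardware.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)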

Thanks to the valuable feedback on the SuperGlue PR, this LightGlue PR is now very close to a final state. Some docs are still needed here and there, and I need to complete the model doc on https://huggingface.co/stevenbucaille/lightglue

sbucaille avatar Jul 10 '24 21:07 sbucaille

@sbucaille Great!

For the sdpa test, converting ints to torch.tensors is definitely OK. There are a bunch of different functionalities which require this, e.g. when using jit.trace.

For the FA2 test, I'm not sure :/ It looks like an install issue, but I haven't seen it before, so I'm uncertain.

amyeroberts avatar Jul 10 '24 23:07 amyeroberts

@amyeroberts In the end I removed the sdpa support for now, as I am running into a bug similar to this issue. I guess I'll look into it later.

sbucaille avatar Aug 31 '24 20:08 sbucaille

@sbucaille feel free to let us know whenever this is ready for another review!

Rocketknight1 avatar Dec 03 '24 12:12 Rocketknight1

Hi @Rocketknight1, thanks for the interest in my PR! The code concerning LightGlue is close to its final form; I've tried to apply all the feedback Amy and @qubvel provided me on the SuperGlue PR, including the implementation of batching and the use of copies of existing transformers classes. However, I currently included the changes from SuperGlue's PR in this branch, for two reasons:

  • For personal convenience, having SuperGlue's code while implementing LightGlue was helpful
  • Since LightGlue is a keypoint matching like SuperGlue, I thought it would be worth to create an AutoModelForKeypointMatching including both models.

I still have minor things to fix, such as this: matched_image

I've also included a run-slow commit to check the integration tests; if one of you can trigger the action we can see the results, but I suspect we will encounter the same problem as discussed here.

I'd say I'll make another pass over the whole code, fix the several problems I mentioned, and ping you when it's ready?

sbucaille avatar Dec 03 '24 16:12 sbucaille

@Rocketknight1 @qubvel The PR is ready for a review; I still need to update lightglue.md and the hub repo docs. Should I contact the authors about whether they would like to have the model pushed to their own repo? Some of LightGlue's authors are the same as SuperGlue's, but LightGlue was made not by Magic Leap but by a research team from Zurich. Also, their license is, I believe, fully open source as described here; I'll update the repo license accordingly.

sbucaille avatar Dec 03 '24 23:12 sbucaille

Hi @qubvel, thanks for this fast first review! I tried to answer most of your comments; I think pushing new commits made some of them "outdated", so they disappeared. To recap the recent changes: I've addressed the comments, from small changes to bigger ones. I've refactored most of the match_image_pair and forward methods, used more list comprehensions for repetitive instructions, and used nn.utils.rnn.pad_sequence, which I had forgotten existed (and which I think could be useful for SuperGlue's PR as well); it lets me batch together tensors with inconsistent shapes rather than instantiating a big tensor upfront and then filling it. The one thing still missing, which I will do tomorrow morning, is lightglue.md. Other than that, let me know what you think!
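
For illustration, a minimal sketch of the pad_sequence pattern described above (shapes and names are made up):

import torch
from torch.nn.utils.rnn import pad_sequence

# Per-image keypoint tensors with inconsistent first dimensions
keypoints = [torch.randn(512, 2), torch.randn(387, 2), torch.randn(945, 2)]

# (batch, max_num_keypoints, 2); shorter entries are zero-padded
batched = pad_sequence(keypoints, batch_first=True, padding_value=0.0)

# Matching boolean mask marking the real (non-padded) keypoints
mask = pad_sequence(
    [torch.ones(k.shape[0], dtype=torch.bool) for k in keypoints],
    batch_first=True,
    padding_value=False,
)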

sbucaille avatar Dec 05 '24 17:12 sbucaille

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qubvel thanks for triggering the slow tests. Are there any guidelines for installing transformers with flash-attn such that I don't get the following error on test_sdpa_can_dispatch_on_flash:

>       attn_output = torch.nn.functional.scaled_dot_product_attention(
            query_layer,
            key_layer,
            value_layer,
            attn_mask=attention_mask,
            dropout_p=self.dropout_prob if self.training else 0.0,
        )
E       RuntimeError: No available kernel. Aborting execution.

On a second setup I use, I have the same error; the first was on WSL, the second is Ubuntu πŸ€”

Anyway, it looks like there is another bug, triggered by the grid_sample instruction in SuperPoint. After a quick search I found this issue, which may apply to us, but I can't replicate the error on my setup even by downgrading the torch version.

sbucaille avatar Dec 09 '24 09:12 sbucaille

Re flash attention, do you have https://github.com/Dao-AILab/flash-attention installed? Idk if that's required; it might also be that a specific GPU architecture is required (Ampere or Hopper).

qubvel avatar Dec 09 '24 10:12 qubvel

@qubvel, addressed the last comments, sorry for the missing typehint πŸ˜… About flash attention, I indeed followed these instructions; on one side I'm using a 3090 Ti, on the other a 4080, which should both be supported GPUs. But what's weird is that flash attention works perfectly when used directly by the library; it only fails when torch tries to use it through torch.nn.functional.scaled_dot_product_attention πŸ€”

sbucaille avatar Dec 09 '24 10:12 sbucaille

Hi @qubvel, hope you are doing well! Let me gently bump this PR, as I have several remarks:

  • The failing test should not be a problem once pytorch release a new version I believe as their fix has been merged.
  • Just pushed a commit that makes verifying model outputs only if we provide the official LightGlue checkpoint url. With the release of the MINIMA version of LightGlue, we wouldn't be able to convert this version of LightGlue because the verification would naturally fails as the weights are different. So I've set a default variable with the official weights

About this MINIMA version, I guess I will reach out to them when the PR is merged so they can upload their own version to the Hub and it can be used with transformers. lmkwyt!
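
For clarity, a sketch of the guard from the second point above (the URL and helper names are hypothetical placeholders, not the convert script's exact contents):

import torch

OFFICIAL_CHECKPOINT_URL = "https://example.com/superpoint_lightglue.pth"  # placeholder

def convert_checkpoint(checkpoint_url: str = OFFICIAL_CHECKPOINT_URL):
    state_dict = torch.hub.load_state_dict_from_url(checkpoint_url, map_location="cpu")
    model = build_transformers_model(state_dict)  # hypothetical helper
    # Verify outputs only for the official weights: other checkpoints
    # (e.g. MINIMA) share the architecture but not the expected values.
    if checkpoint_url == OFFICIAL_CHECKPOINT_URL:
        verify_reference_outputs(model)  # hypothetical helper
    return model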

sbucaille avatar Jan 08 '25 14:01 sbucaille

Hi @qubvel, I rebased the branch on main as SuperGlue got merged. I've applied the latest suggestions as well. I'm not sure what triggered the review request to this many people though, I just force pushed πŸ˜…

sbucaille avatar Jan 20 '25 12:01 sbucaille

No worries, @sbucaille! This is a CODEOWNERS experiment in transformers. We were trying to automate review pings with GitHub tools, but there seemed to be too many false positives, so it's going to be canceled.

Will review the PR this week; as far as I remember it's in good shape. Thanks for updating it with the recent comments πŸ€—

qubvel avatar Jan 20 '25 14:01 qubvel

Thanks @ArthurZucker for your review. I realize this contribution falls short of your standards regarding modular and the different copies. Let me work on that again; I propose to come up with a modular version based on more recent models like Llama. I'll let you know when it is ready for a new review!

sbucaille avatar Jan 23 '25 21:01 sbucaille

@ArthurZucker @qubvel I freshened up the PR with the use of modular (super easy to use btw, nice work on that!). I mainly used Llama as suggested, although the forward is a bit modified to perform cross-attention. Let me know what you think of this first try!

sbucaille avatar Jan 25 '25 13:01 sbucaille

Added plotting as a new method in the image processor, with matplotlib as a soft dependency in the utils files.
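
A rough sketch of what such a plotting method can look like (the name and signature are illustrative, not the PR's actual API):

import matplotlib.pyplot as plt
import numpy as np

def plot_keypoint_matching(image0, image1, keypoints0, keypoints1):
    # Place the two images side by side (assuming equal heights for
    # simplicity) and draw one line per matched keypoint pair.
    offset = image0.shape[1]
    plt.imshow(np.concatenate([image0, image1], axis=1))
    for (x0, y0), (x1, y1) in zip(keypoints0, keypoints1):
        plt.plot([x0, x1 + offset], [y0, y1], linewidth=0.5)
    plt.scatter(keypoints0[:, 0], keypoints0[:, 1], s=2)
    plt.scatter(keypoints1[:, 0] + offset, keypoints1[:, 1], s=2)
    plt.axis("off")
    plt.show()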

sbucaille avatar Jan 28 '25 21:01 sbucaille

@ArthurZucker just a gentle bump here πŸ€—

sbucaille avatar Feb 06 '25 14:02 sbucaille

Hahah, sorry @sbucaille, catching up on my late notifications!

ArthurZucker avatar Feb 13 '25 08:02 ArthurZucker

Hi @ArthurZucker, no worries! Since the authors created their organization on the Hub (https://huggingface.co/ETH-CVG), I updated the convert script to upload the model to a more official repo. I also fixed the small merge conflicts and rebased on main.

sbucaille avatar Feb 20 '25 16:02 sbucaille

Hi @ArthurZucker, just rebased the branch on main after merge conflicts appeared. I also added a small require_backends line to the post-processing method (sketched below), as I saw other post-processing methods (such as the RT-DETR image processor's) include it.
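
For reference, a minimal sketch of that guard (the transformers helper is requires_backends; the class skeleton and method name here are illustrative):

from transformers.utils import requires_backends

class LightGlueImageProcessor:  # sketch; the real class derives from BaseImageProcessor
    def post_process_keypoint_matching(self, outputs, target_sizes):
        # Raise a clear error if torch isn't installed, mirroring other
        # post-processing methods such as RT-DETR's image processor.
        requires_backends(self, ["torch"])
        ...  # actual post-processing elided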

sbucaille avatar Mar 10 '25 10:03 sbucaille

Hey @ArthurZucker small gentle bump here πŸ€—

sbucaille avatar Mar 31 '25 22:03 sbucaille

Can you just rebase on main? @Cyrilvallez will then merge.

ArthurZucker avatar Apr 11 '25 15:04 ArthurZucker

Hi @ArthurZucker, thanks for the review! @Cyrilvallez, the minor changes are made and the branch is rebased on main.

sbucaille avatar Apr 11 '25 20:04 sbucaille

@Cyrilvallez rebased again due to another merge conflict.

sbucaille avatar Apr 15 '25 15:04 sbucaille

Hi @Cyrilvallez, I addressed your comments, let me know what you think!

EDIT: any idea what causes this error in the image processor? Running the tests locally works fine: 'LightGlueImageProcessingTester' object has no attribute '_testMethodName'

sbucaille avatar Apr 29 '25 18:04 sbucaille

Hi @Cyrilvallez, I addressed most of your comments; there is still one concerning the attn_implementation where I'd like your input. However, I have another problem regarding using modular with SuperGlueImageProcessor. After tweaking my code for several hours, I can't figure out how to make it work. Let me explain. If I follow strictly what I did in modular_lightglue.py, I naively write this, similarly to what I do with LlamaAttention:

from ..llama.modeling_llama import LlamaAttention, eager_attention_forward # To show that this import path works 
from ..superglue.image_processing_superglue import SuperGlueImageProcessor
class LightGlueImageProcessor(SuperGlueImageProcessor): ...

But running python utils/modular_model_converter.py --files_to_parse lightglue leads to this error:

Converting src/transformers/models/lightglue/modular_lightglue.py to a single model single file format
Traceback (most recent call last):
  File "/home/steven/transformers/utils/modular_model_converter.py", line 1810, in <module>
    converted_files = convert_modular_file(file_name)
  File "/home/steven/transformers/utils/modular_model_converter.py", line 1736, in convert_modular_file
    wrapper.visit(cst_transformers)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/metadata/wrapper.py", line 204, in visit
    return self.module.visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/module.py", line 89, in visit
    result = super(Module, self).visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/base.py", line 233, in visit
    visitor.on_leave(self)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_visitors.py", line 137, in on_leave
    leave_func(original_node)
  File "/home/steven/transformers/utils/modular_model_converter.py", line 1357, in leave_Module
    self.visited_modules[file] = ModelFileMapper.visit_and_merge_dependencies(
  File "/home/steven/transformers/utils/modular_model_converter.py", line 945, in visit_and_merge_dependencies
    wrapper.visit(mapper)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/metadata/wrapper.py", line 204, in visit
    return self.module.visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/module.py", line 89, in visit
    result = super(Module, self).visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/base.py", line 227, in visit
    _CSTNodeSelfT, self._visit_and_replace_children(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/module.py", line 74, in _visit_and_replace_children
    body=visit_body_sequence(self, "body", self.body, visitor),
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/internal.py", line 227, in visit_body_sequence
    return tuple(visit_body_iterable(parent, fieldname, children, visitor))
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/internal.py", line 193, in visit_body_iterable
    new_child = child.visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/base.py", line 227, in visit
    _CSTNodeSelfT, self._visit_and_replace_children(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/statement.py", line 442, in _visit_and_replace_children
    body=visit_sequence(self, "body", self.body, visitor),
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/internal.py", line 177, in visit_sequence
    return tuple(visit_iterable(parent, fieldname, children, visitor))
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/internal.py", line 159, in visit_iterable
    new_child = child.visit(visitor)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/base.py", line 218, in visit
    should_visit_children = visitor.on_visit(self)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_visitors.py", line 123, in on_visit
    retval = visit_func(node)
  File "/home/steven/transformers/utils/modular_model_converter.py", line 657, in visit_ImportFrom
    import_module = self.python_module.code_for_node(node.module)
  File "/home/steven/transformers/.venv/lib/python3.10/site-packages/libcst/_nodes/module.py", line 136, in code_for_node
    node._codegen(state)
AttributeError: 'NoneType' object has no attribute '_codegen'

So I changed the way I import the SuperGlueImageProcessor class a bit, doing the following:

from ..superglue import SuperGlueImageProcessor

class LightGlueImageProcessor(SuperGlueImageProcessor):
    pass

Doing so doesn't raise an error like the above, but this is the resulting image_processing_lightglue.py file:

#                🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
#           This file was automatically generated from src/transformers/models/lightglue/modular_lightglue.py.
#               Do NOT edit this file manually as any edits will be overwritten by the generation of
#             the file from the modular. If any change should be done, please apply the change to the
#                          modular_lightglue.py file directly. One of our CI enforces this.
#                🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# Copyright 2025 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from ..superglue import SuperGlueImageProcessor


class LightGlueImageProcessor(SuperGlueImageProcessor):
    pass

Is there something wrong in what I'm doing? Any ideas? πŸ€”

Anyway, thanks for coming back to my PR; I've also had very little time recently to work on my PRs πŸ˜…

sbucaille avatar Jun 04 '25 14:06 sbucaille

Hey! Your first way of doing it is correct, but indeed, this is a small bug due to an import in superglue. You can solve it in this PR by switching this line: instead of from ... import is_torch_available, is_vision_available, change it to from ...image_utils import is_torch_available, is_vision_available, and then modular should work! πŸ€—
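
In other words, in image_processing_superglue.py, the change is only to the import line (a sketch of just that line):

# src/transformers/models/superglue/image_processing_superglue.py
# before (trips the modular converter):
#     from ... import is_torch_available, is_vision_available
# after (explicit module path, so the converter can resolve it):
from ...image_utils import is_torch_available, is_vision_available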

Cyrilvallez avatar Jun 16 '25 13:06 Cyrilvallez

@Cyrilvallez comments addressed! I pinged you again on a previous comment you made, just to be sure!

sbucaille avatar Jun 16 '25 19:06 sbucaille

@Cyrilvallez thanks for the quick iteration. I've made the changes, added the comment about SuperPoint eager attention, and moved the functions down to where they belong. Feel free to merge! πŸ€—

sbucaille avatar Jun 17 '25 14:06 sbucaille