
[Ready For Review] Implement Fuyu

Open LouisRouss opened this issue 1 year ago • 8 comments

This is an early draft PR to indicate that I'm working on the Fuyu 8B implementation. A lot of work still needs to be done, including:

  • [x] Finalization of the Fuyu architecture
  • [x] Adaptation of the weights + validate the architecture
  • [x] Change Tokenizer
  • [x] Inference function
  • [x] Unit Testing

I will update this PR and ping you as soon as more significant progress has been made or if I encounter any blockers that require discussion. Thank you!

LouisRouss avatar Feb 22 '24 21:02 LouisRouss

This bounty is stale because it has been opened for 7 days with no activity.

github-actions[bot] avatar Mar 13 '24 08:03 github-actions[bot]

Hi, just a quick update: the weight adaptation now works nicely and the model behaves as intended. It took me a while to debug things because of my limited compute, but everything should go faster now! Next steps are cleaning and documenting the code, writing a proper inference function, and finally some unit testing, as listed in the original message. Thanks!

LouisRouss avatar Mar 26 '24 21:03 LouisRouss

This bounty is stale because it has been opened for 7 days with no activity.

github-actions[bot] avatar Apr 03 '24 08:04 github-actions[bot]

This bounty was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Apr 11 '24 08:04 github-actions[bot]

Hi, this PR is finally ready for review! A few clarifications about the implementation:

  • The Hugging Face implementation doesn't use flash attention in the Persimmon/Fuyu model. To get exactly the same logits, the is_optimized argument of our model therefore needs to be set to False. Note that the model produces consistent answers regardless of the attention optimization setting.

  • The output logits match those of the Hugging Face model when both are set to float32. Although the test script fails with float16, the final answers remain consistent across both data types for any given (image, prompt).

  • Finally, Hugging Face caches the key and value states of each attention layer by default during generation to speed up decoding. I didn't implement this functionality, which can lead to minor variations in the final answer after several iterations. With use_cache set to False, the Hugging Face model outputs the exact same final answer as this implementation (the sketch right after this list reproduces that setup). Caching isn't mentioned in the Adept.ai blog post, whereas flash attention is.
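
For reference, here is roughly how I produce the Hugging Face answer that this implementation is compared against, with the settings discussed above (float32 weights, no key/value caching). This is only a sketch; on the refiners side, is_optimized is set to False as explained in the first point.

import requests
import torch
from PIL import Image
from transformers import FuyuForCausalLM, FuyuProcessor

# Hugging Face reference, matching the points above:
# full float32 precision and no key/value caching during generation.
processor = FuyuProcessor.from_pretrained("adept/fuyu-8b")
model = FuyuForCausalLM.from_pretrained("adept/fuyu-8b", torch_dtype=torch.float32)

url = "https://huggingface.co/adept/fuyu-8b/resolve/main/bus.png"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "Generate a coco-style caption.\n"

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=100, use_cache=False)
# keep only the newly generated tokens before decoding
answer = processor.batch_decode(generated[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)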

To use the model:

import requests
from PIL import Image

from refiners.fluxion.utils import load_from_safetensors
from refiners.foundationals.fuyu.fuyu import create_fuyu, Fuyu8b

config = Fuyu8b()  # default 8B configuration
network = create_fuyu(config)
tensors = load_from_safetensors("/path/to/fuyu.safetensors")  # converted weights
network.load_state_dict(tensors)

url = "https://huggingface.co/adept/fuyu-8b/resolve/main/bus.png"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "Generate a coco-style caption.\\n"

answer = network.generate([image], [prompt], max_len_generation=100)

I'm open to any feedback, especially on the caching functionality and whether it needs to be implemented.
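
To make that discussion concrete, here is a rough, illustrative sketch of what a per-layer key/value cache looks like (the names are hypothetical; nothing like this is in the PR):

import torch

class KVCache:
    # Illustrative only: keys/values from previous decoding steps are stored so
    # each new step only computes projections for the newly generated token.
    def __init__(self) -> None:
        self.keys: torch.Tensor | None = None
        self.values: torch.Tensor | None = None

    def update(self, k: torch.Tensor, v: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # k, v have shape (batch, num_heads, new_tokens, head_dim)
        if self.keys is None or self.values is None:
            self.keys, self.values = k, v
        else:
            self.keys = torch.cat([self.keys, k], dim=2)
            self.values = torch.cat([self.values, v], dim=2)
        return self.keys, self.values

Not having the cache only costs speed; as noted above, the answers stay the same when the Hugging Face model also runs with use_cache set to False.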

Thank you!

LouisRouss avatar Apr 20 '24 21:04 LouisRouss

Hi, I just realized that the CI/CD didn't pass because of Pyright. I'm aware of the issue and will fix all the Pyright errors this evening. Apologies for the setback.

Edit: Done

LouisRouss avatar Apr 22 '24 10:04 LouisRouss

Thank you for the reviews! Concerning the test script, I now generate the references in a conftest.py before the different test_ functions (see the sketch below). I hope this solution will be satisfactory :)
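
Concretely, the conftest.py does something along these lines (a simplified sketch; the fixture name and inputs are illustrative, not the exact code from the PR):

# conftest.py (sketch): build the Hugging Face reference once per test session
# so the different test_ functions can compare against it.
import pytest
import requests
import torch
from PIL import Image
from transformers import FuyuForCausalLM, FuyuProcessor

@pytest.fixture(scope="session")
def reference_logits() -> torch.Tensor:
    processor = FuyuProcessor.from_pretrained("adept/fuyu-8b")
    model = FuyuForCausalLM.from_pretrained("adept/fuyu-8b", torch_dtype=torch.float32)

    url = "https://huggingface.co/adept/fuyu-8b/resolve/main/bus.png"
    image = Image.open(requests.get(url, stream=True).raw)
    inputs = processor(text="Generate a coco-style caption.\n", images=image, return_tensors="pt")

    with torch.no_grad():
        return model(**inputs).logits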

LouisRouss avatar May 12 '24 16:05 LouisRouss

This bounty is stale because it has been opened for 7 days with no activity.

github-actions[bot] avatar May 20 '24 08:05 github-actions[bot]

This bounty is stale because it has been opened for 7 days with no activity.

github-actions[bot] avatar May 29 '24 08:05 github-actions[bot]

This bounty was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jun 06 '24 08:06 github-actions[bot]

Hi @LouisRouss, sorry for the delay! We're very busy with other projects at the moment, I will keep the PR open

Laurent2916 avatar Jun 06 '24 08:06 Laurent2916

This bounty is stale because it has been opened for 7 days with no activity.

github-actions[bot] avatar Jun 15 '24 08:06 github-actions[bot]

This bounty was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jun 23 '24 08:06 github-actions[bot]