[Ready For Review] Implement Fuyu
This is an early draft PR to indicate that I'm working on the Fuyu 8B implementation. A lot of work still needs to be done, including:
- [x] Finalization of the Fuyu architecture
- [x] Adaptation of the weights + validate the architecture
- [x] Change Tokenizer
- [x] Inference function
- [x] Unit Testing
I will update this PR and ping you as soon as more significant progress has been made, or if I encounter any blockers that require discussion. Thank you!
This bounty is stale because it has been opened for 7 days with no activity.
Hi, just a quick update: the weights now adapt nicely and the model works as intended. It took me a while to debug things due to my limited computational power, but everything should be quicker now! Next steps are cleaning and documenting the code, writing a proper inference function, and finally some unit testing, as listed in the original message. Thanks!
This bounty is stale because it has been opened for 7 days with no activity.
This bounty was closed because it has been inactive for 7 days since being marked as stale.
Hi, this PR is finally ready for review! A few notes about the implementation:

- The Hugging Face implementation of Persimmon/Fuyu doesn't use flash attention. To get exactly the same logits, the `is_optimized` argument in our model therefore needs to be set to `False` (see the configuration sketch after this list). Note that the model produces consistent answers regardless of the attention optimization state.
- The output logits match those of the Hugging Face model when both are set to float32. Although the test script fails with float16, the final answers remain consistent across both data types for any given (image, prompt) pair.
- Finally, Hugging Face caches the key and value states in each attention layer by default during generation to speed up decoding. I didn't implement this functionality, which could lead to minor variations in the final answer after several iterations. With `use_cache` set to `False`, the Hugging Face model outputs exactly the same final answer as this implementation (a parity-check sketch follows the usage example below). KV caching isn't mentioned in the Adept.ai blog post, although flash attention is.
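For exact logit parity, here is a minimal configuration sketch. Assumption: it treats `is_optimized` as a `Fuyu8b` config field and uses `.to(dtype=...)` for the dtype, which may not match the PR's actual API:

```python
import torch

from refiners.foundationals.fuyu.fuyu import Fuyu8b, create_fuyu

# Hypothetical placement of the flag: this sketch assumes is_optimized is a
# Fuyu8b config field, which may not be where it actually lives in the PR.
config = Fuyu8b(is_optimized=False)  # plain attention, as in the Hugging Face model
network = create_fuyu(config)

# Logits only match the Hugging Face reference in float32 (second note above).
network = network.to(dtype=torch.float32)
```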
To use the model:

```python
import requests
from PIL import Image

from refiners.fluxion.utils import load_from_safetensors
from refiners.foundationals.fuyu.fuyu import create_fuyu, Fuyu8b

# Build the model and load the converted weights.
config = Fuyu8b()
network = create_fuyu(config)
tensors = load_from_safetensors("/path/to/fuyu.safetensors")
network.load_state_dict(tensors)

# Fetch the example image from the Hugging Face model card.
url = "https://huggingface.co/adept/fuyu-8b/resolve/main/bus.png"
image = Image.open(requests.get(url, stream=True).raw)

prompt = "Generate a coco-style caption.\n"
answer = network.generate([image], [prompt], max_len_generation=100)
```
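For completeness, here is a sketch of the parity check described in the notes above, reusing `image`, `prompt`, and `answer` from the snippet. The Hugging Face calls follow the adept/fuyu-8b model card; the decoding details are illustrative and not taken from this PR's test script.

```python
import torch
from transformers import FuyuForCausalLM, FuyuProcessor

# Load the reference in float32, the only dtype in which logits match exactly.
processor = FuyuProcessor.from_pretrained("adept/fuyu-8b")
reference = FuyuForCausalLM.from_pretrained("adept/fuyu-8b", torch_dtype=torch.float32)

inputs = processor(text=prompt, images=image, return_tensors="pt")

# use_cache=False disables KV caching so the reference decodes the same way
# as this implementation.
output_ids = reference.generate(**inputs, max_new_tokens=100, use_cache=False)
reference_answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]

print("refiners :", answer)  # from the snippet above
print("reference:", reference_answer)
```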
I'm open to any feedback, especially on the caching functionality and whether it should eventually be implemented. Thank you!
Hi, just realized that I didn't pass the CI/CD because of Pyright. I'm aware of the issue and will correct every Pyright error this evening. Apologies for the setback.
Edit: Done
Thank you for the reviews! Concerning the test script, I now generate the references in a conftest.py before the different test_ functions run. I hope this solution is satisfactory :)
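For readers curious about the pattern, here is a minimal sketch of what generating references in a conftest.py can look like. The fixture name, prompt, and image are illustrative, not the PR's actual test code:

```python
# conftest.py: an illustrative sketch, not the PR's actual file
import pytest
import requests
import torch
from PIL import Image
from transformers import FuyuForCausalLM, FuyuProcessor


@pytest.fixture(scope="session")
def fuyu_reference_logits() -> torch.Tensor:
    """Run the Hugging Face model once per test session so every test_
    function can compare against the same reference logits."""
    processor = FuyuProcessor.from_pretrained("adept/fuyu-8b")
    model = FuyuForCausalLM.from_pretrained("adept/fuyu-8b", torch_dtype=torch.float32)

    url = "https://huggingface.co/adept/fuyu-8b/resolve/main/bus.png"
    image = Image.open(requests.get(url, stream=True).raw)
    inputs = processor(text="Generate a coco-style caption.\n", images=image, return_tensors="pt")

    with torch.no_grad():
        return model(**inputs).logits
```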
This bounty is stale because it has been opened for 7 days with no activity.
This bounty was closed because it has been inactive for 7 days since being marked as stale.
Hi @LouisRouss, sorry for the delay! We're very busy with other projects at the moment; I will keep the PR open.
This bounty is stale because it has been opened for 7 days with no activity.
This bounty was closed because it has been inactive for 7 days since being marked as stale.