Adding EDSR model
What does this PR do?
Fixes #19631
I will add the other components based on this page.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
cc @alaradirik and @NielsRogge
Sorry for the delay. Can someone help me with putting the model file into the organisation space? Thanks
Hi @venkat-natchi, thanks for working on this!
I can help you with that but I saw that there is no conversion script yet. The conversion script (e.g. convert_original_XXX.py) loads the pre-trained original model and the randomly initialized HF model with the corresponding configuration, and replaces each parameter of the HF model with the corresponding learned parameter of the original model. We also have a convenient push_to_hub() method that can be added to the conversion script to create a repo on the hub and push the converted / pre-trained HF model and files. See an example conversion script over here.
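For illustration, here is a minimal sketch of what such a conversion script typically looks like. The `EdsrConfig` / `EdsrModel` classes are the ones this PR would introduce, and `rename_key` plus the exact config arguments are hypothetical placeholders, not a finalized API:

```python
# Hypothetical sketch of convert_original_edsr.py; EdsrConfig, EdsrModel and the
# key mapping below are placeholders for the classes this PR introduces.
import torch
from transformers import EdsrConfig, EdsrModel


def rename_key(name: str) -> str:
    # Map an original parameter name to the HF naming scheme (illustrative only).
    return name.replace("body.", "layers.")


def convert_edsr_checkpoint(checkpoint_url: str, repo_id: str):
    # 1. Download the original pre-trained state dict.
    original_state_dict = torch.hub.load_state_dict_from_url(checkpoint_url, map_location="cpu")

    # 2. Randomly initialize the HF model with the matching configuration.
    config = EdsrConfig()  # fill in upscale / num_res_blocks / num_features per checkpoint
    model = EdsrModel(config)

    # 3. Copy each learned parameter of the original model into the HF model.
    model.load_state_dict({rename_key(k): v for k, v in original_state_dict.items()})
    model.eval()

    # 4. Verify that both models produce matching outputs on an arbitrary input (omitted here).

    # 5. Create a repo on the Hub and push the converted model and files.
    model.push_to_hub(repo_id)
```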
cc @sgugger @NielsRogge
@venkat-natchi I guess you also need to rebase your branch on main, as the new TensorFlow release broke a lot of things, so tests won't pass unless you do this.
Thanks, guys!
Started on a conversion script and rebased onto the main branch.
There is multiprocessing here for data loading; I need some help disentangling it and implementing a simpler processing step.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@alaradirik and @NielsRogge Friendly ping here.
Hi @venkat-natchi, would it be possible to rebase your branch on the main branch of transformers?
This way the CI becomes greener, which allows us to review the PR in depth.
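For reference, a typical rebase flow, assuming the main transformers repo is configured as the `upstream` remote and `<your-branch>` stands in for this PR's branch:

```bash
git fetch upstream
git rebase upstream/main
git push --force-with-lease origin <your-branch>
```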
Sure, will do. Thanks
Hello, can I work on this issue? Although I'm new to open-source contributions, I've worked on super-resolution models in the past and was wondering why Hugging Face did not have them. I am familiar with PyTorch.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @asrimanth, perhaps you could collaborate with @venkat-natchi on this PR if they are okay with it? Super resolution is definitely a task we would like to add to transformers and this would be a great first addition :)
Hello @alaradirik, Sure! I am interested. How do I get started?
@venkat-natchi can add you as a contributor to their forked transformers repo and you two could collaborate on this branch if they are okay with it. @venkat-natchi would you prefer to work on the PR on your own or hand it over to @asrimanth instead?
In any case, you can refer to the guidelines to get started with adding a model. I'd recommend first checking you can run the original repo without any issues though. Here are some summarized points that might help:
- Each model, including different checkpoints of the same model, has its own repo on the Hub (see the DETR-ResNet-50 repo as an example). This is basically a git repo that stores the checkpoint-specific configuration, the preprocessing configuration, and the model weights.
- The code added in this PR acts as boilerplate to load different checkpoints: EDSR trained on different datasets, at different resolutions, or with a larger / smaller architecture.
- configuration_edsr.py should contain all the hyperparameters, the input image size, and the architectural details (e.g. number of hidden layers) needed to initialize the model (see the configuration sketch right after this list).
- image_processing_edsr.py should contain the ImageProcessor class that takes in the raw input image and preprocesses it into the format the model expects (resizing to a fixed input size, normalization, cropping, etc.); a preprocessing sketch follows at the end of this comment.
- modeling_edsr.py should contain the model definition.
- The conversion script:
  - Loads the pre-trained original model and randomly initializes the HF implementation with the corresponding configuration
  - Copies the pre-trained parameters (weights and biases) of the original model to the corresponding parameters of the randomly initialized HF model (the conversion step)
  - Forward propagates an arbitrary input through both the original model and the converted HF model and checks that the outputs match
  - Uploads the converted HF model to the Hub
- Each model and image processor class is tested with scripts under tests/models/<MODEL_NAME>/; you can refer to other test files to see what tests to add.
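To make the configuration bullet above concrete, here is a minimal sketch of what configuration_edsr.py could look like. The class name, argument names, and defaults are illustrative guesses based on the original EDSR hyperparameters, not a finalized API:

```python
from transformers import PretrainedConfig


class EdsrConfig(PretrainedConfig):
    model_type = "edsr"

    def __init__(
        self,
        upscale=2,            # x2 / x3 / x4 super-resolution factor
        num_res_blocks=16,    # 16 for the baseline checkpoints, 32 for the large ones
        num_features=64,      # 64 for the baseline checkpoints, 256 for the large ones
        res_scale=1.0,        # residual scaling used by the large checkpoints
        num_channels=3,       # RGB input
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.upscale = upscale
        self.num_res_blocks = num_res_blocks
        self.num_features = num_features
        self.res_scale = res_scale
        self.num_channels = num_channels
```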
Once you are done, you would need to run the following commands to check the PR passes all CI tests:
```bash
make style
make quality
make repo-consistency
RUN_SLOW=TRUE pytest tests/models/edsr/test_modeling_edsr.py
RUN_SLOW=TRUE pytest tests/models/edsr/test_image_processor_edsr.py
```
We can do an in-depth review once the PR passes most tests or once the configuration, preprocessing, and modeling are mostly complete.
Hope this helps!
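As promised in the preprocessing bullet above, a minimal sketch of what image_processing_edsr.py could look like. The class body is an illustrative guess (the EDSR-specific normalization details would need to follow the original repo), not the final implementation:

```python
import numpy as np

# BaseImageProcessor / BatchFeature are standard building blocks in transformers;
# everything EDSR-specific below is a placeholder sketch.
from transformers.image_processing_utils import BaseImageProcessor, BatchFeature


class EdsrImageProcessor(BaseImageProcessor):
    model_input_names = ["pixel_values"]

    def __init__(self, rescale_factor=1 / 255.0, **kwargs):
        super().__init__(**kwargs)
        self.rescale_factor = rescale_factor

    def preprocess(self, images, return_tensors="pt", **kwargs):
        # Super-resolution runs on full-resolution inputs, so unlike most vision
        # models there is no resizing to a fixed size: rescale pixel values to
        # [0, 1] and convert HWC -> CHW.
        arrays = [np.asarray(image, dtype=np.float32) * self.rescale_factor for image in images]
        pixel_values = np.stack([arr.transpose(2, 0, 1) for arr in arrays])
        return BatchFeature(data={"pixel_values": pixel_values}, tensor_type=return_tensors)
```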
Sure, I will add you as a collaborator.
Sorry for the delay. @asrimanth I added you as a collaborator.
You can find a working version of the original repository here: https://github.com/venkat-natchi/EDSR-PyTorch/blob/master/src/trial.py
Hello @alaradirik and the HuggingFace Team, I seem to run into an error where the EDSR_PRETRAINED_MODEL_ARCHIVE_LIST is empty and this appears to be causing some problems. From my understanding, I have to upload the model weights to a URL and mention it in that list. The existing pre-trained models are as follows:
```python
url = {
    "r16f64x2": "https://cv.snu.ac.kr/research/EDSR/models/edsr_baseline_x2-1bc95232.pt",
    "r16f64x3": "https://cv.snu.ac.kr/research/EDSR/models/edsr_baseline_x3-abf2a44e.pt",
    "r16f64x4": "https://cv.snu.ac.kr/research/EDSR/models/edsr_baseline_x4-6b446fab.pt",
    "r32f256x2": "https://cv.snu.ac.kr/research/EDSR/models/edsr_x2-0edfb8a3.pt",
    "r32f256x3": "https://cv.snu.ac.kr/research/EDSR/models/edsr_x3-ea3ef2c6.pt",
    "r32f256x4": "https://cv.snu.ac.kr/research/EDSR/models/edsr_x4-4f62e9ef.pt",
}
```
Should I upload these weights into the hub? If so, should I upload these to my profile? Is there a way to load these weights from the URL like torch.hub.load? Please let me know.
Hi @asrimanth, that's correct: the EDSR_PRETRAINED_MODEL_ARCHIVE_LIST contains the links to the uploaded checkpoints' configuration files on the Hugging Face Hub; see an example repo over here. Note that each link / repo contains the converted model, not the original model weights. So you should first complete the configuration, preprocessing, modeling and conversion scripts, and then convert and upload each checkpoint released by the authors.
Repos on the Hub are placed under the organization that wrote the paper (Seoul National University in this case). We can ask them to create an organization on the Hub, but we will place the repos under the huggingface organization until they do so.
Since model conversion is the last step, you can fill in the list with the repo paths you intend to create. For example:
```python
EDSR_PRETRAINED_MODEL_ARCHIVE_LIST = {
    "huggingface/edsr-base-x2": "https://huggingface.co/huggingface/edsr-base-x2/resolve/main/config.json",
    "huggingface/edsr-base-x3": "https://huggingface.co/huggingface/edsr-base-x3/resolve/main/config.json",
    "huggingface/edsr-base-x4": "https://huggingface.co/huggingface/edsr-base-x4/resolve/main/config.json",
    "huggingface/edsr-x2": "https://huggingface.co/huggingface/edsr-x2/resolve/main/config.json",
    ...
}
```
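Once the conversion script has pushed those repos, each checkpoint should load through the standard API. The class name is the one this PR would add and the repo path is one of the placeholders from the list above:

```python
from transformers import EdsrModel  # hypothetical class added by this PR

model = EdsrModel.from_pretrained("huggingface/edsr-base-x2")
```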
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.