
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents

Results 25 Magma issues

Bumps [torch](https://github.com/pytorch/pytorch) from 2.3.1 to 2.7.0. Release notes Sourced from torch's releases. PyTorch 2.7.0 Release Notes Highlights Tracked Regressions Backwards Incompatible Changes Deprecations New Features Improvements Bug fixes Performance Documentation...

dependencies
python

Bumps [gradio](https://github.com/gradio-app/gradio) from 4.44.1 to 5.31.0. Release notes Sourced from gradio's releases. gradio@5.31.0 Features #11229 231ccfe - Chatbot autoscroll fix. Thanks @dawoodkhan82! #11224 834e92c - Fix re-rendering with key when...

dependencies
python

Hi, thanks a lot for sharing this great work and the open-source code! I saw that the video finetuning script is on the to-do list. I want to use the video...

Hi, thanks for the excellent work! I deployed the FastAPI server on my machine via Docker. I used `https://github.com/microsoft/Magma/blob/main/server/test_api.py` to test model inference, simply replacing the image and...

Hello @jwyang. Thank you for the amazing work. I am trying to finetune the Magma model on the following datasets for now: 1. [Mind2Web](https://github.com/OSU-NLP-Group/Mind2Web) 2. [Omniact](https://huggingface.co/datasets/Writer/omniact) I also want to...

@jwyang, thank you again for this wonderful work. Could you please tell us when you plan to release the finetuning code for UI?

TypeError: sum() received an invalid combination of arguments - got (bool, dim=int), but expected one of: * (Tensor input, *, torch.dtype dtype = None) * (Tensor input, tuple of ints...

I am trying to write a sample app that runs inference with Magma using the models deployed in Azure Foundry, but I am running into issues. I can only seem to get simple text...

There are two sets of dependencies that fail to resolve: 1. `gradio==4.46.0` results in a “package not found” error — possibly due to the version being yanked. Removing the version...

Looking at run_detect_segment, we can guess that it requires an annotation file for the video, and that file consists of a start time, an end time, and a text prompt....