Xing Wang

Results 8 comments of Xing Wang

I'm also waiting for Qwen-VL-Max to be deployable locally. I used it for visually rich document understanding and found it great.

@NEOOOOOOOOOO, do you know how to deploy UI-TARS-1.5-7B locally using vLLM? There is no documentation about local deployment. Thanks.

Your problem is caused by this error: `unknown shorthand flag: 'f' in -f`. On your system, you cannot run `docker compose -f ../compose.yml up -d`.
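As far as I know, this error usually appears when the Compose plugin is not installed, so `docker` tries to parse `-f` as one of its own flags. A minimal sketch for checking which Compose command is usable on a machine follows. The function names `choose_compose_command` and `detect_compose_command` are made up for illustration, and the check assumes that `docker compose version` exits non-zero when the plugin is missing.

```python
import shutil
import subprocess


def choose_compose_command(has_plugin, has_standalone):
    """Pick a Compose invocation from two availability flags (hypothetical helper)."""
    if has_plugin:
        return ["docker", "compose"]
    if has_standalone:
        return ["docker-compose"]
    return None


def detect_compose_command():
    """Probe the local system and return a usable Compose command, or None."""
    docker = shutil.which("docker")
    has_plugin = False
    if docker is not None:
        try:
            # Assumption: this exits non-zero when the Compose plugin is absent.
            result = subprocess.run(
                [docker, "compose", "version"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            has_plugin = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            has_plugin = False
    has_standalone = shutil.which("docker-compose") is not None
    return choose_compose_command(has_plugin, has_standalone)


if __name__ == "__main__":
    command = detect_compose_command()
    if command is None:
        print("No Compose command found; install the Compose plugin or docker-compose.")
    else:
        # Print the full command line that should work on this machine.
        print(" ".join(command + ["-f", "../compose.yml", "up", "-d"]))
```

If only the standalone binary is found, `docker-compose -f ../compose.yml up -d` is the equivalent invocation.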

The author has now modified the code so that you can choose whether to use flash attention by setting `use_flash_attn`:
```python
path = 'OpenGVLab/InternVL2-8B'
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    use_flash_attn=True,...
```
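If you want the flag to follow whether flash attention is actually installed, something like the sketch below might help. It only builds the keyword arguments and does not load the model. `build_load_kwargs` and `flash_attn_available` are hypothetical helper names, and the sketch assumes the package is importable as `flash_attn`.

```python
import importlib.util


def flash_attn_available():
    """Return True if a package named flash_attn can be located (assumed import name)."""
    return importlib.util.find_spec("flash_attn") is not None


def build_load_kwargs(prefer_flash_attn=True):
    """Build keyword arguments for from_pretrained (hypothetical helper).

    use_flash_attn is enabled only when requested AND the package is found.
    """
    return {
        "low_cpu_mem_usage": True,
        "use_flash_attn": prefer_flash_attn and flash_attn_available(),
    }


if __name__ == "__main__":
    # These could then be passed as AutoModel.from_pretrained(path, **kwargs).
    kwargs = build_load_kwargs()
    print(kwargs)
```

Note that `find_spec` only checks that the package can be located, not that it works with your GPU or CUDA version.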

+1. After setting KVM=N, I got the same issue as @glienard.

BdsDxe: failed to load Boot0002 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0xA,0x0)/Scsi(0x0,0x0): Not Found
BdsDxe: loading Boot0001 "UEFI...

Do you know how many screenshots WorldModel uses to plan the next instruction? It seems only the latest screenshot is used?

In WORLD_MODEL_GENERAL_EXAMPLES, for **vision**, some values are **SCREENSHOTS** and some are **SCREENSHOT**. What is the difference?
```
Current state:
external_observations:
  vision: '[SCREENSHOTS]'
internal_state:
  agent_outputs: []
  user_inputs: []
```
```
Current...
```