QIN2DIM

Results 172 comments of QIN2DIM

> Automated deployment @ utc 2023-11-12 02:46:21.150401
>
> | Attributes | Details |
> | ---------- | ---------------------------- |
> | prompt | Please click on all images containing the largest animal in...

> Automated deployment @ utc 2023-11-12 04:27:46.536436
>
> | Attributes | Details |
> | ---------- | ---------------------------- |
> | prompt | Please click on all images containing the largest animal in...

> Automated deployment @ utc 2023-11-12 04:43:46.241308
>
> | Attributes | Details |
> | ---------- | ---------------------------- |
> | prompt | Please click on all images containing the largest animal in...

> Automated deployment @ utc 2023-11-12 06:28:34.766607
>
> | Attributes | Details |
> | ---------- | ---------------------------- |
> | prompt | Please click on all images containing the largest animal in...

It's similar to this issue:

- https://github.com/QIN2DIM/hcaptcha-challenger/issues/104

@Vinyzu @bigboy-baby https://loguru.readthedocs.io/en/stable/overview.html#suitable-for-scripts-and-libraries

It looks like you can disable loguru logging like this:

```python
from loguru import logger

logger.disable("my_library")  # silence logs emitted by "my_library"
logger.disable("__main__")
logger.enable("__main__")  # turn logging back on for "__main__"
```

Debugging a headless browser is very difficult, so there must be necessary...

In addition, Playwright can save browser recordings and XHR traffic, which is very helpful for studying CAPTCHAs.
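As a rough illustration of the recording features mentioned above, here is a minimal sketch using Playwright's Python sync API. It assumes Playwright and a Chromium build are installed. The helper names `build_context_options` and `record_session`, the `recordings` directory, and the file name `network.har` are hypothetical choices for this example. Only `record_video_dir` and `record_har_path` are actual `browser.new_context()` options.

```python
from pathlib import Path


def build_context_options(output_dir: str) -> dict:
    """Return keyword arguments for browser.new_context() that enable recording."""
    out = Path(output_dir)
    return {
        # Playwright writes one video file per page into this directory.
        "record_video_dir": str(out / "videos"),
        # Network traffic for the context (including XHR) is saved as a HAR file.
        "record_har_path": str(out / "network.har"),
    }


def record_session(url: str, output_dir: str = "recordings") -> None:
    """Open `url` headlessly while recording video and network traffic."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(**build_context_options(output_dir))
        page = context.new_page()
        page.goto(url)
        # Closing the context is what flushes the video and HAR files to disk.
        context.close()
        browser.close()
```

Calling `record_session("https://example.com")` should leave a video under `recordings/videos` and a HAR file at `recordings/network.har`, both of which can be inspected after a headless run.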

It might be better to add `kibana` to the same docker-compose config file 🤔

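## Update

Date: 2024-08-30

We eagerly look forward to further improvements in the vision capabilities of large language models. These models have already made remarkable progress in image understanding and processing, but we believe their potential has not yet been fully tapped.

## The Enigmatic Multimodal Large Models

Date: 2024-04-13

Even the most advanced models available today, such as Gemini 1.5 Pro, gpt-4-turbo, and claude-3-opus-20240229, cannot yet solve hCAPTCHA's multimodal challenges with single-step prompting alone.

For experimental research purposes, we built a simple LangGraph directed acyclic state machine. This model uses a small annotated dataset and relies on a question-and-answer format to help recognize and organize the output.

Introducing a human-in-the-loop approach is like handing the model a hint to the answer. For example, "the odd one out" is translated directly into "wolf", and then every target is marked with a bounding box and a sequence number. The goal is to help the model better understand and handle the task.

![object_detection_point_1](https://github.com/QIN2DIM/hcaptcha-challenger/assets/62018067/4b1ff1bd-ce72-4826-90f1-ab9de574e2af)

![PixPin_2024-04-11_22-45-26](https://github.com/QIN2DIM/hcaptcha-challenger/assets/62018067/8661d277-f256-4d72-8525-cf91f8517773)

## Rough Logs and Interim Conclusions

So a concise prompt template took shape: ``

However, even with guidance this direct, LVMs (Large Vocabulary Models) still perform unsatisfactorily on stylized tasks. Or rather, although they can handle such tasks, it is still unrealistic to expect them to match the highly customized, lightweight, easy-to-deploy qualities of traditional supervised learning models while also maintaining high accuracy.

The datasets currently available are still relatively small, and I have not yet run more rigorous benchmarks. Intuitively, though, the prompt-guided performance of LVMs does not seem to reach even a basic level, at least compared with today's state-of-the-art image classification and object detection models....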

OK, I've manually enabled the anti-addiction limit.

![image](https://user-images.githubusercontent.com/62018067/131862150-2388d51b-f45d-42c8-b402-1ff5655ae1e6.png)