[WIP] feature: support more general image extraction

Open alcholiclg opened this issue 5 months ago • 0 comments

Change Summary

  1. Add examples to the unit tests.
  2. Support capturing the current base URL in the docling HTML backend.
  3. Support fetching images using the base URL.
  4. Support normalizing the base URL.
  5. Rename the validate_url function to resolve_url.
  6. Slightly repackage the image extraction logic.
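
As a rough illustration of items 3 to 5, resolving an image `src` against a captured base URL could look something like the sketch below. This is not the code from this PR: `normalize_base_url` and `resolve_image_url` are hypothetical names, the normalization rules (lowercasing the host, dropping the query and fragment) are assumptions, and only the standard-library `urllib.parse` is used.

```python
from typing import Optional
from urllib.parse import urljoin, urlparse, urlunparse


def normalize_base_url(base_url: str) -> str:
    """Hypothetical helper: lowercase scheme and host, drop query and fragment.

    Assumption: the netloc carries no case-sensitive userinfo.
    """
    parts = urlparse(base_url.strip())
    path = parts.path or "/"  # a bare host gets a root path
    return urlunparse(
        (parts.scheme.lower(), parts.netloc.lower(), path, "", "", "")
    )


def resolve_image_url(src: str, base_url: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return an absolute image URL, or None if unresolvable."""
    src = src.strip()
    if not src:
        return None
    if src.startswith("data:"):
        # Inline images carry their own payload; nothing to resolve.
        return src
    if base_url:
        # urljoin handles relative paths, "../" segments and "//host" forms.
        src = urljoin(normalize_base_url(base_url), src)
    parts = urlparse(src)
    if parts.scheme in ("http", "https") and parts.netloc:
        return src
    return None  # relative src without a usable base URL
```

Under these assumptions, a relative `src` such as `../img/a.png` found on `https://example.com/docs/guide/page.html` would resolve to an absolute URL on the same host, while the same `src` without a base URL would be rejected rather than guessed at.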

Related issue number

Checklist

  • [ ] The pull request title is a good summary of the changes - it will be used in the changelog
  • [ ] Unit tests for the changes exist
  • [ ] I ran pre-commit install and pre-commit run --all-files before git commit, and the lint check passed.
  • [ ] Some cases need DASHSCOPE_TOKEN_API to pass the unit tests; I have at least passed the unit tests locally
  • [ ] Documentation reflects the changes where applicable
  • [ ] My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

alcholiclg, Jul 22 '25 05:07