markitdown
feat(converter): add video converter.
This PR is a step towards resolving #154.
It introduces a VideoConverter class that converts videos to markdown by:
- Extracting metadata (if exiftool is installed)
- Performing speech transcription (if speech_recognition and pydub are installed)
- Generating a summary via a multimodal LLM from the transcription [This is optional and defaults to True if llm_client is configured]
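To illustrate how the three steps depend on optional tooling, here is a minimal sketch of the availability checks. The helper name `available_video_features` and the returned keys are hypothetical and not part of this PR; I'm also assuming the summary step needs both a transcript and a configured `llm_client`.

```python
import importlib.util
import shutil


def available_video_features(llm_client=None):
    """Report which optional conversion steps can run in this environment.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    # exiftool is an external binary, so look for it on PATH.
    has_exiftool = shutil.which("exiftool") is not None
    # speech_recognition and pydub are optional Python packages.
    has_speech = all(
        importlib.util.find_spec(name) is not None
        for name in ("speech_recognition", "pydub")
    )
    return {
        "metadata": has_exiftool,
        "transcription": has_speech,
        # Assumption: a summary needs a transcript and a configured client.
        "summary": has_speech and llm_client is not None,
    }
```

A converter could consult a check like this once and skip any step whose dependency is missing, rather than failing the whole conversion.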
Notes:
- I believe checking the file type based on the extension is not ideal. There are many video extensions, and I think checking the mime_type would be a better approach, as it can cover a wider range of video files.
- I’m unsure about the testing strategy. Should we focus only on testing exiftool? Please share your thoughts on this.
- Additionally, I suggest refactoring Mp3Converter into a more general AudioConverter, as there are many audio extensions to consider. If you agree with this, I can submit a separate PR for it.
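As a rough sketch of the MIME-type idea from the notes above, the standard library's `mimetypes` module can classify a file as video or audio without a hand-maintained extension list. The function name `media_kind` is hypothetical. Note that `mimetypes.guess_type` still infers the type from the filename, so this only widens the coverage; inspecting file contents would need a different tool.

```python
import mimetypes


def media_kind(path):
    """Return "video", "audio", or None based on the guessed MIME type.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        # Unknown extension: let other converters handle the file.
        return None
    top_level = mime_type.split("/", 1)[0]
    return top_level if top_level in ("video", "audio") else None
```

With a check like this, a VideoConverter could accept anything reported as "video" and a general AudioConverter anything reported as "audio", so the same approach would serve both converters.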
Could you add tests?
@l-lumin, could you provide a sample video file that is allowed to be uploaded to the repo?
I think you can use the file you tested locally. If it's wrong, we can change it later.
@l-lumin, okay, I created a sample video file using ffmpeg. I've added a test for exiftool for now. Maybe we can add tests for transcription as well, but #194 should be merged first.