Tharsis Souza
Best config: 1. OpenAI o1 for transcript generation; 2. Google TTS Multispeaker for audio generation. Upload to README. This is going to be fun!
mypy podcastfy/*.py
```
podcastfy/content_generator.py:56: error: Unexpected keyword argument "max_output_tokens" for "ChatGoogleGenerativeAI" [call-arg]
podcastfy/content_generator.py:56: error: Incompatible types in assignment (expression has type "ChatGoogleGenerativeAI", variable has type "Llamafile") [assignment]
podcastfy/content_generator.py:62: error: Unexpected...
```
"evals are surprisingly often all you need" But here we are evaluating a pretty novel dimension: How can we systematically quantify whether generated text/audio is engaging and follows a target configuration while...
Integrate with docling instead of building our own content parsers from scratch: https://github.com/DS4SD/docling Improved support and maintainability, as well as coverage of sources, since docling supports (PDF, DOCX, PPTX, Images, HTML,...
https://github.com/soimort/you-get Enable download of videos, images, etc. from the input website.
Currently, the tool supports URLs and PDF files as input. Users can also provide a file containing a set of links for processing. We aim to add an option allowing...
Add support for additional tags dynamically, depending on the TTS model. Right now the supported tags are a fixed set: the intersection of the tags supported by OpenAI/MS Edge and ElevenLabs. Modify text_to_speech.py::clean_tss_markup
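One way this could look is sketched below: instead of one fixed intersection, the cleaner receives the set of tags that the selected TTS model accepts. This is only an illustration, not the current `clean_tss_markup` implementation. The function name `strip_unsupported_tags`, the `TAGS_BY_MODEL` registry, and every tag set in it are hypothetical placeholders and do not reflect what any provider really supports.

```python
import re

# HYPOTHETICAL registry: placeholder tag sets for illustration only.
# These are NOT the real tag lists of any TTS provider.
TAGS_BY_MODEL: dict[str, set[str]] = {
    "model_a": {"break"},
    "model_b": {"break", "emphasis", "lang"},
}

# Matches opening, closing, and self-closing tags; group 1 is the tag name.
_TAG_PATTERN = re.compile(r"</?([A-Za-z][\w-]*)\b[^>]*>")


def strip_unsupported_tags(text: str, supported_tags: set[str]) -> str:
    """Remove every markup tag whose name is not in supported_tags.

    The text between tags is always kept; only the tags are dropped.
    """
    allowed = {tag.lower() for tag in supported_tags}

    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() in allowed else ""

    return _TAG_PATTERN.sub(keep_or_drop, text)


def clean_for_model(text: str, model: str) -> str:
    """Look up the (hypothetical) tag set for a model and clean the text."""
    return strip_unsupported_tags(text, TAGS_BY_MODEL.get(model, set()))
```

With this shape, supporting a new model only means adding an entry to the registry; unknown models fall back to stripping all tags.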
Add an option that allows users to process a provided website URL plus its subpages.
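A minimal sketch of the link-discovery step follows, assuming "subpages" means same-host links whose path sits under the provided URL's path. It uses only the standard library and works on already-downloaded HTML; fetching, crawl depth, and rate limiting are left out. The name `find_subpages` is hypothetical and is not an existing function in the project.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_subpages(base_url: str, html: str) -> list[str]:
    """Return unique same-host URLs located under base_url's path."""
    collector = _LinkCollector()
    collector.feed(html)

    base = urlparse(base_url)
    base_path = base.path if base.path.endswith("/") else base.path + "/"

    seen: set[str] = set()
    subpages: list[str] = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parsed.netloc != base.netloc:
            continue  # skip other hosts
        if not parsed.path.startswith(base_path):
            continue  # skip pages outside the base path
        if absolute.rstrip("/") == base_url.rstrip("/"):
            continue  # skip the page itself
        if absolute not in seen:
            seen.add(absolute)
            subpages.append(absolute)
    return subpages
```

The returned list preserves the order in which links appear on the page, so it can be fed to the existing URL processing one entry at a time.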
The string should have at most 4096 characters before being passed to TTS. We should split the transcript into chunks of at most 4096 characters -> send them to TTS, then stitch them together to generate the...
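The splitting step could be sketched as follows. This is a rough illustration, not the project's implementation: the function name `split_for_tts` is hypothetical, and preferring sentence boundaries over a hard cut every 4096 characters is an assumption made so that chunks do not end mid-sentence. Sending each chunk to TTS and stitching the audio is not shown.

```python
import re


def split_for_tts(text: str, max_chars: int = 4096) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Sentences are kept whole where possible; a single sentence longer
    than max_chars is cut at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split an oversized sentence, flushing the pending chunk first.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can be sent to the TTS backend in order, and the resulting audio segments concatenated in the same order.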
Add a way for users to know how much a podcast generation will cost before running it
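A rough sketch of such an estimate is shown below, assuming cost is driven by LLM input/output tokens plus TTS characters. Everything here is hypothetical: the `Pricing` class, the `estimate_cost` function, and the 4-characters-per-token heuristic are illustrative assumptions, and real prices would have to come from each provider's current price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-unit prices in USD. Values must be supplied by the caller."""

    llm_usd_per_1k_input_tokens: float
    llm_usd_per_1k_output_tokens: float
    tts_usd_per_1k_chars: float


def estimate_cost(
    input_chars: int,
    expected_transcript_chars: int,
    pricing: Pricing,
    chars_per_token: float = 4.0,  # ASSUMPTION: crude heuristic, not a tokenizer
) -> float:
    """Estimate the USD cost of one podcast generation before running it."""
    input_tokens = input_chars / chars_per_token
    output_tokens = expected_transcript_chars / chars_per_token

    llm_cost = (
        input_tokens / 1000 * pricing.llm_usd_per_1k_input_tokens
        + output_tokens / 1000 * pricing.llm_usd_per_1k_output_tokens
    )
    tts_cost = expected_transcript_chars / 1000 * pricing.tts_usd_per_1k_chars
    return llm_cost + tts_cost
```

The transcript length is not known before generation, so the caller has to pass an expected value (for example, derived from the configured target length); the result is an estimate, not a bill.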