ImportError: "RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work"

Open Collab-with-Rajibul opened this issue 1 year ago • 15 comments

C:\Users\HP\AppData\Local\Programs\Python\Python310\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)

The above message appears even if I just import the module "markitdown".

Collab-with-Rajibul avatar Dec 14 '24 10:12 Collab-with-Rajibul

Do you have ffmpeg installed ?

SH4DOW4RE avatar Dec 14 '24 20:12 SH4DOW4RE

After looking into the installed packages: pydub emits that warning when neither ffmpeg nor avconv can be found in your system's PATH variable.

Here's the code from pydub if you're curious:

import os
from warnings import warn


def which(program):
    """
    Mimics behavior of UNIX which command.
    """
    # Add .exe program extension for windows support
    if os.name == "nt" and not program.endswith(".exe"):
        program += ".exe"

    envdir_list = [os.curdir] + os.environ["PATH"].split(os.pathsep)

    for envdir in envdir_list:
        program_path = os.path.join(envdir, program)
        if os.path.isfile(program_path) and os.access(program_path, os.X_OK):
            return program_path


def get_encoder_name():
    """
    Return encoder default application for system, either avconv or ffmpeg
    """
    if which("avconv"):
        return "avconv"
    elif which("ffmpeg"):
        return "ffmpeg"
    else:
        # should raise exception
        warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
        return "ffmpeg"
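
To run a similar check yourself without going through pydub, here is a minimal sketch based on the standard library's shutil.which, which does a comparable PATH lookup. The helper name find_encoder is hypothetical; it is not part of pydub or markitdown and is only meant for diagnosis.

```python
import shutil
from typing import Optional, Sequence


def find_encoder(candidates: Sequence[str] = ("avconv", "ffmpeg")) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None.

    Hypothetical helper for diagnosis only; it is not pydub's implementation.
    """
    for name in candidates:
        # shutil.which also resolves Windows executable extensions (PATHEXT).
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_encoder()
    if location:
        print(f"Encoder found at: {location}")
    else:
        print("Neither avconv nor ffmpeg is on PATH; pydub would emit the RuntimeWarning.")
```

If this prints a path but the warning still shows up, the Python process is probably running with a different PATH than your shell (for example an IDE or a service that was started before ffmpeg was installed).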

SH4DOW4RE avatar Dec 14 '24 20:12 SH4DOW4RE

Do you have ffmpeg installed ?

I am using this for conversion of pdf documents only and transcriptions are not required for my application. Do I still have to install ffmpeg?

Collab-with-Rajibul avatar Dec 15 '24 05:12 Collab-with-Rajibul

After looking at the code of both pydub and markitdown, I saw that pydub checks for ffmpeg or avconv when it is first imported, while the code inside markitdown only checks whether the module isn't found. That check can never trigger, because pydub is installed as a dependency. I think markitdown should instead check both for the module not being found (just in case) and for a RuntimeWarning coming from pydub, since that would indicate that pydub cannot convert mp3. In that case it would set the variable IS_AUDIO_TRANSCRIPTION_CAPABLE to false and not run any more pydub-related code.
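
A rough sketch of that idea, assuming the check is done by recording warnings during the import. The function name probe_optional_module is hypothetical, and this is not claimed to be what markitdown actually ships; only the variable name IS_AUDIO_TRANSCRIPTION_CAPABLE comes from the discussion above.

```python
import importlib
import warnings


def probe_optional_module(module_name: str = "pydub") -> bool:
    """Return True only if the module imports without emitting a RuntimeWarning.

    Hypothetical helper: name and structure are illustrative, not markitdown's code.
    """
    try:
        with warnings.catch_warnings(record=True) as caught:
            # Record every warning raised during the import instead of printing it.
            warnings.simplefilter("always")
            importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The module (or something it imports) is not installed.
        return False
    # A RuntimeWarning during import is treated as "backend missing".
    return not any(issubclass(w.category, RuntimeWarning) for w in caught)


IS_AUDIO_TRANSCRIPTION_CAPABLE = probe_optional_module("pydub")
```

One caveat with this approach: if the module was already imported earlier in the same process, its import-time code does not run again, so no warning is recorded and the probe returns True.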

TLDR: Either uninstall pydub if you don't need it, or install ffmpeg or avconv to get rid of that warning from pydub.

~~If markitdown devs want, I can make a pull request to try and find a fix for this issue~~ Made one already just in case

SH4DOW4RE avatar Dec 15 '24 16:12 SH4DOW4RE

Audio Transcript:

Error. Could not transcribe this audio.

caesarllj avatar Dec 16 '24 09:12 caesarllj

What did you try before getting this error ?

SH4DOW4RE avatar Dec 16 '24 09:12 SH4DOW4RE

The pull request got accepted. Can you please try again with the new version?

SH4DOW4RE avatar Dec 17 '24 06:12 SH4DOW4RE

I still encounter the error "RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work" when trying to convert .pdf files.

zerowu49 avatar Apr 22 '25 01:04 zerowu49

I still encounter the error "RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work" when trying to convert .pdf files.

me too

lmf avatar Apr 22 '25 03:04 lmf

Obvious question, but do you have either ffmpeg or avconv installed?

SH4DOW4RE avatar Apr 22 '25 08:04 SH4DOW4RE

This warning indicates that the pydub library did not find ffmpeg or avconv, so it defaults to using ffmpeg, but it may not function properly. Pydub relies on ffmpeg or avconv to process audio files, so ffmpeg needs to be installed and configured.

You can refer to this article to install ffmpeg and restart the Python environment: https://www.coderjia.cn/archives/a921ea34-f7ad-4b70-b596-7c9010e0b0a8
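
Since this is a warning and not an exception, a possible stopgap for setups that never touch audio is to filter out this one message before the import. This is an assumption on how one might work around it, not something markitdown documents; the helper name silence_ffmpeg_warning is hypothetical, and audio conversion will still not work without ffmpeg.

```python
import warnings

# Start of the message pydub emits when neither encoder is on PATH.
FFMPEG_WARNING_PREFIX = "Couldn't find ffmpeg or avconv"


def silence_ffmpeg_warning() -> None:
    """Ignore only pydub's missing-encoder RuntimeWarning (hypothetical helper)."""
    warnings.filterwarnings(
        "ignore",
        message=FFMPEG_WARNING_PREFIX,  # regex matched against the start of the message
        category=RuntimeWarning,
    )


silence_ffmpeg_warning()
# Import markitdown (or pydub) only after the filter is installed, for example:
# from markitdown import MarkItDown
```

The filter is deliberately narrow, so other RuntimeWarnings are still shown.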

Sbwillbealier avatar Apr 28 '25 08:04 Sbwillbealier

It would be useful to have a flag telling markitdown to focus only on text and not look into multimedia.

brauliobarahona avatar May 14 '25 06:05 brauliobarahona

It would be useful to have a flag telling markitdown to focus only on text and not look into multimedia.

Only install the conversion types you want: markitdown[pdf, docx, pptx, xlsx, xls, outlook]

bdlgq-mike avatar May 16 '25 01:05 bdlgq-mike

It would be useful to have a flag telling markitdown to focus only on text and not look into multimedia.

Only install the conversion types you want: markitdown[pdf, docx, pptx, xlsx, xls, outlook]

I mean when calling it to convert a PDF: in specific cases one might want to convert only the text from the PDF.

brauliobarahona avatar May 16 '25 07:05 brauliobarahona

Ran into this exact same issue after following the README local source build / install. Ended up finding the answer (need to install ffmpeg...) https://stackoverflow.com/a/74658329 on StackOverflow.

I'd dare say the README.md could use at least one mention of the prerequisite to have ffmpeg installed for the tooling to function. That, or a specific plugin that is installed by default with it as a dependency.

Coderrob avatar Oct 20 '25 13:10 Coderrob