
SoX playback unexpectedly stops after a few seconds on macOS

Open Tendaliu opened this issue 2 years ago • 27 comments

Hi, I am encountering an issue with text-to-speech synthesis using Microsoft Edge as the engine on my Mac mini (M2 chip). Here is the output:

Echogarden v0.10.5

Get voice list for microsoft-edge.. 2154.3ms
Selected voice: 'Microsoft Server Speech Text to Speech Voice (zh-CN, XiaoxiaoNeural)' (zh-CN, Chinese (China))

Synthesizing segment 1/1: "我的测试已经开始了"

Synthesizing sentence 1/1: "我的测试已经开始了"
Prepare for synthesis.. 0.2ms
Get voice list for microsoft-edge.. 0.6ms
Initialize microsoft-edge module.. 0.6ms
Request synthesis from Microsoft Edge cloud API.. 1055.9ms
Transcode with command-line ffmpeg.. 689.0ms
Convert boundary events to timeline.. 0.1ms
Postprocess synthesized audio.. 2.1ms
Total synthesis time: 1752.9ms
Merge and postprocess sentences.. 0.1ms

我的

Merge and postprocess segments.. 9.4ms

I suspect the problem may be related to SoX

Tendaliu avatar Aug 07 '23 07:08 Tendaliu

Here is my command: '/Library/Application Support/node-v18.17.0-darwin-arm64/bin/echogarden' speak "我的测试已经开始了" --engine=microsoft-edge --microsoftEdge.trustedClientToken="6A5AA1D4EAFF4E9FB37E23D68491D6F4" --voice=XiaoxiaoNeural --speed=1 2>&1

Tendaliu avatar Aug 07 '23 07:08 Tendaliu

Thanks for reporting. The project is untested on macOS, especially ARM64, since I have no access to a macOS machine.

I'm glad (surprised) it works at all!

Anyway, on ARM64 there is no auto-install of SoX via a package. It will use whatever version you have on the system path. On all other platforms I chose versions that I believe will work well (on Linux I even made a custom static build, which I added in 0.10.0):

  • On Windows I had a problem with 14.4.2, where it would unexpectedly exit in the middle of playback (usually after about a minute). For this reason I made sure to bundle 14.4.1, to ensure that users never use the newer, buggy version.

  • On Intel Mac, I also decided to go with 14.4.1, though I haven't tested it.

  • On Linux, I decided to go with 14.4.2, since it worked okay in my testing.

Can you describe the problem in a more detailed way?

Can you type sox --version and tell me what version you have installed? If you have 14.4.2.. maybe you can try to somehow find 14.4.1 instead? (that may not be easy)..

If that's the problem, we can work on making a package for SoX especially for macOS ARM64, that is known to work.

Another possibility is that something about the command line can be changed to fix this.

rotemdan avatar Aug 07 '23 08:08 rotemdan

The SoX version is 14.4.2. The audio is generated normally, but in the terminal, playback stops after only the first two characters (我的).

And if I use SoX 14.4.1, the error is:

Unhandled promise rejection: Error: spawn Unknown system error -86

I think it's because of the CPU type
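For context: on macOS, errno 86 is EBADARCH ("Bad CPU type in executable"), and Node reports it as an unknown system error. Below is a minimal sketch of how a caller could turn that raw error into a clearer message. The helper name `describeSpawnError` and the `SpawnErrorLike` shape are hypothetical and not part of Echogarden.

```typescript
// Hypothetical helper, not part of Echogarden: makes a raw spawn error easier to read.
interface SpawnErrorLike {
	errno?: number
	message: string
}

// On macOS, errno 86 is EBADARCH ("Bad CPU type in executable").
const MACOS_EBADARCH = 86

function describeSpawnError(error: SpawnErrorLike, platform: string = process.platform): string {
	// Node reports system errors with a negated errno value, so compare the absolute value.
	const isBadArch = platform === 'darwin' && Math.abs(error.errno ?? 0) === MACOS_EBADARCH

	if (isBadArch) {
		return 'The executable was built for a CPU architecture this system cannot run.'
	}

	// Anything else is passed through unchanged.
	return error.message
}
```

Passing the platform as a parameter keeps the check testable on machines other than a Mac.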

Tendaliu avatar Aug 07 '23 08:08 Tendaliu

Another problem I want you to know about: when the program starts ffmpeg or SoX for the first time, a warning is displayed stating that the app cannot be opened because it is from an unidentified developer. I then have to change the security settings for these two apps on the Mac. I believe this issue can be resolved at the code level, with something like chmod a+x. Sorry if I'm wrong. I am only familiar with Lua and Python.

Tendaliu avatar Aug 07 '23 08:08 Tendaliu

In the issue title you wrote "Incomplete synthesis". Can you check if the synthesis itself is complete by outputting it to a file?

Maybe what you meant is that "playback stops unexpectedly"?

I've seen the problem before, as I mentioned.

Can you replicate the problem in the command line by running, say,

sox audiofile.mp3 -d

On Windows, with 14.4.2, I replicated the mid-playback exit problem on the command line as well.

The warning you are describing seems to suggest that you might be using the ffmpeg and sox packages that are auto-installed by the program. However, I published packages only for Intel macOS. Can you at least tell if you're using ARM64 or Intel macOS? That's an extremely important detail!

Are you using some emulation mode? Your node directory says node-v18.17.0-darwin-arm64, which confuses me, but now that I checked I realized it has this name for all macOS versions. I'll need more information anyways.

rotemdan avatar Aug 07 '23 08:08 rotemdan

Yes it's "playback stops unexpectedly" and the output file is fine.

My Mac is ARM64. Initially, I ran into problems installing Homebrew, so I manually installed ffmpeg and SoX and modified two files: ffmpegtranscoder.js and Soxpath.js.

Tendaliu avatar Aug 07 '23 09:08 Tendaliu

Here's where you can download SoX for macOS, to try on the command line. Note these builds are old, because development has stopped since 2015:

14.4.2 (2015): https://sourceforge.net/projects/sox/files/sox/14.4.2/sox-14.4.2-macosx.zip/download

14.4.1 (2013) - what I currently use: https://sourceforge.net/projects/sox/files/sox/14.4.1/sox-14.4.1-macosx.zip/download

Anyway, due to the security warnings you reported, in version 0.10.10 I disabled the use of internal packages on macOS for both ffmpeg and sox (including on Intel macOS, at least temporarily). Try the newer version.

(Note I did these changes before I read your latest message)

I'm confused. I'm not sure why you needed to edit the source files. What exact changes did you make to ffmpegtranscoder.js and Soxpath.js?

Try to see if you can find any version of SoX that works properly for command line playback.

rotemdan avatar Aug 07 '23 09:08 rotemdan

The code that auto-installs the packages (which has now been disabled in 0.10.10) should only run when x64 is detected (not on ARM64). You shouldn't have needed to change anything:

if (process.platform == "darwin" && process.arch == "x64") {
	const ffmpegPackagePath = await loadPackage("ffmpeg-6.0-macos64")
}

Maybe you're running on some sort of emulation mode which reports x64?

You can check what process.arch reports by starting the Node.js interpreter (just run node), then typing:

process.arch
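The gate described above can be sketched as a small pure function. This is only an illustration of the darwin + x64 condition, not Echogarden's code: `bundledMacPackages` is a hypothetical name, and the two package names are the ones mentioned in this thread.

```typescript
// Sketch only: mirrors the platform/architecture gate discussed in this thread.
// `bundledMacPackages` is a hypothetical helper name.
function bundledMacPackages(platform: string, arch: string): string[] {
	// The published macOS packages are Intel builds, so they only apply to darwin + x64.
	if (platform === 'darwin' && arch === 'x64') {
		return ['ffmpeg-6.0-macos64', 'sox-14.4.1-macosx']
	}

	// For any other combination (including darwin + arm64) no macOS package applies,
	// and the caller would fall back to whatever is on the system path.
	return []
}
```

Calling `bundledMacPackages(process.platform, process.arch)` on an ARM64 Mac running a native Node build should return an empty list.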

rotemdan avatar Aug 07 '23 09:08 rotemdan

I did not add the programs to the PATH environment variable, so I needed to remove the process.arch == "x64" condition:

else if (process.platform == "darwin") {
	const soxPackagePath = await loadPackage("sox-14.4.1-macosx");
	soxPath = path.join(soxPackagePath, "sox");
}

Tendaliu avatar Aug 07 '23 09:08 Tendaliu

Actually, after that, Echogarden can find ffmpeg and SoX on my ARM64 Mac properly (: As a video creator, I don't actually need the playback function much, but I will try other versions of SoX.

Tendaliu avatar Aug 07 '23 09:08 Tendaliu

The most useful information I can get right now would come from downloading both:

14.4.2 (2015): https://sourceforge.net/projects/sox/files/sox/14.4.2/sox-14.4.2-macosx.zip/download

14.4.1 (2013) - what is included in the package: https://sourceforge.net/projects/sox/files/sox/14.4.1/sox-14.4.1-macosx.zip/download

And run,

sox some-audio-file.mp3 -d

on the command line, on macOS, and seeing if there are any problems with playback.

If I had access to a Mac I would have tested it a long time ago, but unfortunately I don't.

(Also running node on the command line, then process.arch, would help me know exactly what platform architecture is detected by node)

rotemdan avatar Aug 07 '23 09:08 rotemdan

tendaliu@tendadeMac-mini ~ % "/Library/Application Support/node-v18.17.0-darwin-arm64/app/packages/sox-14.4.1-macosx-20230718/sox"
zsh: bad CPU type in executable: /Library/Application Support/node-v18.17.0-darwin-arm64/app/packages/sox-14.4.1-macosx-20230718/sox

tendaliu@tendadeMac-mini ~ % node
Welcome to Node.js v18.17.0.
Type ".help" for more information.
process.arch
'arm64'

tendaliu@tendadeMac-mini ~ % /Users/tendaliu/Downloads/sox-14.4.2/sox /Users/tendaliu/Desktop/1691229123.mp3 -d
/Users/tendaliu/Downloads/sox-14.4.2/sox FAIL formats: no handler for file extension `mp3'

Tendaliu avatar Aug 07 '23 09:08 Tendaliu

Thanks for the information! This is really helpful.

It seems that sox-14.4.1-macosx-20230718 doesn't work on ARM64, which is expected. That's why I've set it to be used only when x64 is detected. However, earlier you said it did run, but the playback stopped after a few seconds. This is confusing.

The build you downloaded for sox-14.4.2 may not have included support for MP3. Can you try a wav file, like sox some-audio.wav -d? That would really be helpful!

rotemdan avatar Aug 07 '23 09:08 rotemdan

tendaliu@tendadeMac-mini ~ % /Users/tendaliu/Downloads/sox-14.4.2/sox /Users/tendaliu/Desktop/name.wav -d

/Users/tendaliu/Desktop/name.wav:

File Size: 8.62M Bit Rate: 1.54M Encoding: Signed PCM
Channels: 2 @ 16-bit
Samplerate: 48000Hz
Replaygain: off
Duration: 00:00:44.88

In:100% 00:00:44.88 [00:00:00.00] Out:2.15M [ | ] Clip:0
Done.

Tendaliu avatar Aug 07 '23 09:08 Tendaliu

It works very well in the terminal.

Tendaliu avatar Aug 07 '23 10:08 Tendaliu

I think this information is very important:

Merge and postprocess sentences.. 0.1ms 我的

Merge and postprocess segments.. 9.4ms

The playback happened before the merge job was really done

Tendaliu avatar Aug 07 '23 10:08 Tendaliu

It's good to hear it works with 14.4.2. You can also try a wav file with longer duration (more than a minute, to ensure that it never stops in the middle).

This still doesn't explain which version of SoX you are using with echogarden, since the one from the package gave you a zsh: bad CPU type in executable error. Can you explain?

You can test this version of SoX (14.4.2) to see if it works correctly with echogarden

  • Upgrade echogarden to latest (0.10.10)
  • Have the 14.4.2 sox executable at the current working directory.
  • Run echogarden speak-file some-long-text-file.txt and see what happens. Try testing a text with long paragraphs.

I'm not sure what you mean by "the playback happened before the merge job was really done". Which version of SoX is used? What is the duration of the audio?

rotemdan avatar Aug 07 '23 10:08 rotemdan

image

Tendaliu avatar Aug 07 '23 10:08 Tendaliu

This should be normal behavior:

  • Each individual segment is synthesized and then played back (before any merging occurs).
  • Merge and postprocess segments happens after all segments have been synthesized (and played).

In terms of playback, does using SoX 14.4.2 with Echogarden work correctly? Does it stop abruptly? If so, how many seconds does it play for?

rotemdan avatar Aug 07 '23 10:08 rotemdan

image

Yes, it stops abruptly, even before it finishes the 'guess'. I reinstalled Echogarden. Since I had previously installed SoX 14.4.2 using Homebrew, it was able to find SoX.

Tendaliu avatar Aug 07 '23 10:08 Tendaliu

Thank you.

At least now I know that Echogarden runs under macOS, on a basic level. From what I see, ffmpeg, at least, works, and sox is able to start and produce some sound. That's a good start.

The SoX you have from Homebrew is most likely a different build from the one downloaded from the web.

It would be useful to test playback with the sox you got from Homebrew, using a wav file, like:

sox audio-file.wav -d

(I mean, most likely it would work correctly, but it's still worthwhile to ensure that).

If it does work correctly, I can't do much right now, without having access to a macOS machine, or somebody else with access helping to debug it.

The code is basically calling the command line executable and passing raw audio to its stdin. In src/Audio/AudioPlayer.ts, the command line it's using is:

const player = spawn(
	soxPath,
	['-t', 'raw', '-r', `${rawAudio.sampleRate}`, '-e', 'signed', '-b', '16', '-c', channelCount.toString(), '-', '-d'],
	{}
)

This is tested to work correctly in both Windows and Linux (on the versions of SoX that are bundled in packages).
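For readers unfamiliar with SoX, here are the same arguments annotated one by one. This is a sketch for explanation only; `buildSoxStdinArgs` is a hypothetical name and does not exist in AudioPlayer.ts.

```typescript
// Sketch: the SoX arguments from the snippet above, annotated.
// `buildSoxStdinArgs` is a hypothetical name used only for illustration.
function buildSoxStdinArgs(sampleRate: number, channelCount: number): string[] {
	return [
		'-t', 'raw',              // the input has no header: treat it as raw samples
		'-r', `${sampleRate}`,    // input sample rate, in Hz
		'-e', 'signed',           // samples are signed integers
		'-b', '16',               // 16 bits per sample
		'-c', `${channelCount}`,  // number of channels
		'-',                      // read the input from stdin
		'-d',                     // send the output to the default audio device
	]
}
```

Because the stream is headerless, every format option has to be stated explicitly before the `-` input argument.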

The code that could possibly stop the player too early is in the handlers responding to the sox process' error and close events:

player.once("error", (e) => {
	reject(e)
})

player.once('close', () => {
	playerProcessClosed = true
	resolve()
})

If there's no error message shown, it's possible that the close event is fired. I can't know what is causing this event to trigger early.

It's possible to debug this, by adding console.log() messages:

player.once("error", (e) => {
	console.log(`SoX produced an error: ${e}`)
	reject(e)
})

player.once('close', () => {
	console.log(`SoX closed`)
	playerProcessClosed = true
	resolve()
})

Another possibility is that closing the stdin pipe somehow causes it to exit early. This has not been a problem on Windows or Linux:

player.once("spawn", () => {
	player.stdin!.write(audioBuffer)
	player.stdin!.end()
	player.stdin!.on("error", () => { })

	playerSpawnedOpenPromise.resolve(null)
})

So right now I can't really do much. I'll try to think of some other solution for debugging. Or someone else can try to help find the cause, or a possible solution.
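For anyone with a macOS machine who wants to dig further: Node's close event also passes the exit code and the terminating signal, and SoX prints its diagnostics to stderr, so capturing those may show why the player ends early. The following is a minimal debugging sketch, not Echogarden's code; `runWithStdin` and `ProcessResult` are hypothetical names.

```typescript
import { spawn } from 'node:child_process'

// Hypothetical result shape used only in this sketch.
interface ProcessResult {
	code: number | null
	signal: NodeJS.Signals | null
	stderr: string
}

// Spawns a command, writes `input` to its stdin, and reports how the process ended.
function runWithStdin(command: string, args: string[], input: Buffer): Promise<ProcessResult> {
	return new Promise((resolve, reject) => {
		const child = spawn(command, args, { stdio: ['pipe', 'ignore', 'pipe'] })

		// Collect everything the process writes to stderr.
		let stderr = ''
		child.stderr!.on('data', (chunk: Buffer) => {
			stderr += chunk.toString('utf8')
		})

		// Fired when the process could not be spawned at all.
		child.once('error', reject)

		// Fired after the process has exited and its stdio streams have closed.
		child.once('close', (code, signal) => {
			resolve({ code, signal, stderr })
		})

		// Ignore write errors (for example, EPIPE if the process exits before reading everything).
		child.stdin!.on('error', () => { })
		child.stdin!.end(input)
	})
}
```

Calling it with the sox path and the argument list shown earlier, then logging the returned object, would reveal a non-zero exit code, a signal, or a message on stderr if SoX is stopping on its own.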

rotemdan avatar Aug 07 '23 11:08 rotemdan

image

The sox from Homebrew is totally fine

Tendaliu avatar Aug 07 '23 11:08 Tendaliu

Echogarden simply calls the sox command line executable, like you do, only it streams the audio input to the executable's stdin. There is no difference otherwise.

So:

It's possible that stdin playback is buggy on sox on macOS, and not much can be done about it. I can't really know, but since I already know that 14.4.2 is broken on Windows, it's possible it has problems on Mac as well. I don't know where to find a build of version 14.4.1 that works on macOS ARM64. It's very unlikely that it even exists since there are virtually no 14.4.1 or 14.4.2 builds for even Linux! I had to compile it myself, which took me like 3 hours to get right on Linux, even with assistance of a chatbot.

In general, I've had to go through several workarounds on both Windows and Linux until I got proper playback to work. SoX seems to be pretty buggy overall.

There's no real command-line alternative to SoX that I know of, which does playback and takes stdin input (and it wouldn't be possible for me to test it on macOS anyway).

rotemdan avatar Aug 07 '23 11:08 rotemdan

On 0.11.5, when running on macOS, SoX playback now first writes the audio to a temporary wav file, and plays the file, instead of using stdin.

Here is the relevant commit.

The command line I'm using is exactly the one you tested (sox audio.wav -d) so it should work correctly.

Just note that if you abort the program by pressing, say, Esc or Ctrl-C, the temporary file will not be erased from the temporary path, which is ~/Library/Caches/echogarden on macOS. This may lead to the directory accumulating wav files with random names (which is part of the reason I tried to avoid this approach). Also, if the audio is very long, it may take a little while until the file is fully written to disk.

Please let me know if you encounter any problem, since I still don't have access to a macOS machine.
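To illustrate the temporary-file approach and one possible way to deal with leftover files, here is a rough sketch. It is not the code from the commit: `encodeWav16` and `playViaTempFile` are hypothetical names, it uses the OS temporary directory rather than ~/Library/Caches/echogarden, and removing the file on Ctrl-C is an assumption about what a caller would want.

```typescript
import { spawn } from 'node:child_process'
import { mkdtempSync, rmSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Wraps 16-bit signed little-endian PCM samples in a minimal 44-byte WAV header.
function encodeWav16(pcm: Buffer, sampleRate: number, channelCount: number): Buffer {
	const header = Buffer.alloc(44)
	const blockAlign = channelCount * 2 // bytes per sample frame

	header.write('RIFF', 0, 'ascii')
	header.writeUInt32LE(36 + pcm.length, 4) // size of everything after this field
	header.write('WAVE', 8, 'ascii')
	header.write('fmt ', 12, 'ascii')
	header.writeUInt32LE(16, 16) // fmt chunk size
	header.writeUInt16LE(1, 20) // format 1 = PCM
	header.writeUInt16LE(channelCount, 22)
	header.writeUInt32LE(sampleRate, 24)
	header.writeUInt32LE(sampleRate * blockAlign, 28) // byte rate
	header.writeUInt16LE(blockAlign, 32)
	header.writeUInt16LE(16, 34) // bits per sample
	header.write('data', 36, 'ascii')
	header.writeUInt32LE(pcm.length, 40)

	return Buffer.concat([header, pcm])
}

// Plays a WAV buffer with `sox <file> -d`, then removes the temporary directory.
async function playViaTempFile(soxPath: string, wav: Buffer): Promise<void> {
	const dir = mkdtempSync(join(tmpdir(), 'playback-'))
	const file = join(dir, 'audio.wav')
	const cleanup = () => rmSync(dir, { recursive: true, force: true })

	// Assumption: on Ctrl-C it is fine to delete the file and exit immediately.
	const onSigint = () => {
		cleanup()
		process.exit(130)
	}
	process.once('SIGINT', onSigint)

	try {
		writeFileSync(file, wav)

		await new Promise<void>((resolve, reject) => {
			const player = spawn(soxPath, [file, '-d'], { stdio: 'ignore' })
			player.once('error', reject)
			player.once('close', () => resolve())
		})
	} finally {
		process.removeListener('SIGINT', onSigint)
		cleanup()
	}
}
```

The `finally` block covers normal completion and spawn errors, while the SIGINT handler covers Ctrl-C. Neither helps if the process is killed outright, so some stale files could still accumulate.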

rotemdan avatar Aug 20 '23 12:08 rotemdan

I want you to take a look at this project, which allows users to define the location of ffmpeg within the command line. I believe this approach is more conducive for packaging the program and for creating a user interface for it.

https://github.com/WyattBlue/auto-editor

Tendaliu avatar Aug 22 '23 12:08 Tendaliu

auto-editor has a CLI option called ffmpeg-location:

--ffmpeg-location
Set a custom path to the ffmpeg location

In general, I would rather not require the user to specify this. I already auto-download an ffmpeg executable on Windows and Linux. Initially, I assumed that macOS would be easy, but now I see that there are security errors.

If I give the user the option to specify the ffmpeg location, it would only apply to macOS in practice. It wouldn't be very convenient for the user, since every time they run the CLI they have to pass --ffmpeg-location=some/path/to/ffmpeg (it would be easier in a configuration file but the software isn't ready to support this option in a configuration file, since it doesn't yet support global options).

I don't see this as a great solution - more of a last resort.

Maybe there's a version of ffmpeg for macOS that is pre-signed and doesn't give security warnings?

What I used in the package (which you said produced a warning) is from the page linked on the official FFmpeg website: static FFmpeg binaries for macOS 64-bit

The real problem is that I don't have access to a macOS machine. I might be able to find better solutions if I did, possibly for this as well.

rotemdan avatar Aug 22 '23 12:08 rotemdan

I can look for ffmpeg and sox executables in the current directory, if they are not found in path. This doesn't require adding new configuration options. It would only take effect in practice on macOS. Do you want me to add this?

Edit: It may be that it already looks for ffmpeg and sox in the current directory on macOS. Did you try? If it doesn't, that isn't the expected behavior.
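The lookup order proposed here (system path first, then the current directory) could look roughly like the sketch below. It is only an illustration under stated assumptions: `findExecutable` and `isExecutableFile` are hypothetical names, it is not how Echogarden resolves executables, and it does not handle Windows specifics such as the .exe extension.

```typescript
import { accessSync, constants, statSync } from 'node:fs'
import { delimiter, join } from 'node:path'

// Returns true when `filePath` is a regular file that the current user may execute.
function isExecutableFile(filePath: string): boolean {
	try {
		accessSync(filePath, constants.X_OK)
		return statSync(filePath).isFile()
	} catch {
		return false
	}
}

// Searches the directories listed in `pathVar` first, then `fallbackDir`.
// Returns the full path of the first match, or undefined when nothing is found.
function findExecutable(
	name: string,
	fallbackDir: string = process.cwd(),
	pathVar: string = process.env.PATH ?? ''
): string | undefined {
	const pathDirs = pathVar.split(delimiter).filter((dir) => dir.length > 0)

	for (const dir of [...pathDirs, fallbackDir]) {
		const candidate = join(dir, name)

		if (isExecutableFile(candidate)) {
			return candidate
		}
	}

	return undefined
}
```

With this order, a sox or ffmpeg binary placed in the working directory is only used when nothing with that name is found on the system path, so no new configuration option is needed.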

rotemdan avatar Aug 22 '23 12:08 rotemdan

On v2.0.0, SoX has been replaced by the newly developed audio-io package, which uses a direct native interface to the Core Audio driver on macOS.

I'm closing this issue.

rotemdan avatar Dec 05 '24 13:12 rotemdan