shairport icon indicating copy to clipboard operation
shairport copied to clipboard

RFC: time synchronised playback

Open Nachtzuster opened this issue 11 years ago • 163 comments

This converts buffered playback to time synchronised playback. -adds a ntpsender and a ntpreceiver thread to measure sender time offset -tries to synchronise audio at the sync packets by adjusting the playback rate

The code is not quite ready for show time: Regulation is twitchy There are audible glitches maybe due to the above? Time related function should probably be moved to timing.[ch] What to do with sinks that do not provide delay information?

Anyway, I'd like some feedback from interested parties.

Nachtzuster avatar Feb 01 '14 13:02 Nachtzuster

I haven't yet had a chance to look through this properly (let alone test it), but still, some quick thoughts.

Primarily: this is a really important feature that shairport has never had. Thanks for addressing it, and for the quality code.

What is the significance of npt_* identifiers? Equiv ntp?

I suggest using the linear filtering bits for all the offset measurement and calculation stuff. There are also 2 things that could be done better than linear filtering:

  • Speed of convergence could be much improved by using adaptive filters or preloading filter state
  • median filtering rather than min/max removal for smoothing jumpy network lagged packets if linear/adaptive filters are not enough

Suggest also using uint32_t, uint64_t rather than traditional C int names.

cport, tport are malloc()'d but never free()'d in the timing fns commit.

Delay can probably be estimated for generic outputs by seeing how many samples they swallow before slowing down to realtime data consumption; that's why the original buffer fill tracking sets its target fill value after it starts to play.

outbuf = silence + 4 * (frame_size - play_samples); could just be outbuf = silence ??

The player_thread_func is getting a bit heavy, I think it's time to break it up.

Finally, the app should probably report its audio latency in RTSP header, once everything is firmed up a bit.

abrasive avatar Feb 03 '14 00:02 abrasive

Thanks for taking time to look into this. For now I fixed the most embarrassing things, ie the npt<>ntp naming, the free() - woops

outbuf = silence + 4 * (frame_size - play_samples); does indeed reduce to outbuf = silence; most of the time, except the very last time, there we have an partial silence frame

when you get around to trying the code, note that I've only tested with the Alsa sink for now.

Nachtzuster avatar Feb 03 '14 11:02 Nachtzuster

Because silence is all full of zeroes, you don't need to start playing it partway through?

abrasive avatar Feb 03 '14 23:02 abrasive

ok, now I get it. Yes it should be outbuf = silence; it doesn't really matter which zeroes get played out :-)

Nachtzuster avatar Feb 04 '14 11:02 Nachtzuster

Hey, this looks promising!

Unfortunately, it segfaults on two machines, (32-bit and 64-bit). I enabled debug logs in common.c, here's a log http://pastebin.com/raw.php?i=FhDyikPg

I can't seem to figure out how to get debugging symbols compiled, so logs from gdb would be useless. Let me know if you need any more info.

Oh, and it failed to get compile on the 32-bit one with undefined reference to `clock_gettime', I worked around that by adding -lrt.

MohammadAG avatar Feb 07 '14 13:02 MohammadAG

@MohammadAG Building with make CFLAGS="-O0 -g" will give you debug symbols.

plietar avatar Feb 07 '14 13:02 plietar

Here's a log with a backtrace then, thanks.

http://pastebin.com/raw.php?i=EPRNNmsp

Edit: okay, I've debug into the code more, it seems like this issue happens when neither alas or pulse audio are there.

MohammadAG avatar Feb 07 '14 13:02 MohammadAG

Hi Mohammad, I'm guessing you're not using the alsa sink. For now that is the only tested/working sink. Pulse might work too, all the rest won't work, they'll segfault on you :-)

Nachtzuster avatar Feb 07 '14 13:02 Nachtzuster

Hi,

Yup, you're right (I guess you didn't see my edit above :p) Anyway, the issue now is that shairport is now ahead of my Mac (which is sending the audio).

Should it be doing that? I'm trying to find the right value for config.delay, just in case it's the reason this is happening.

Another edit: 1750000 seems to be perfect.

MohammadAG avatar Feb 07 '14 13:02 MohammadAG

Fixed some more leaks/naming. Also corrected outputbuffer pointer, fixing the glitches. That makes it a bit more usable. @MohammadAG: I had not bothered with figuring out what the exact delay should be, so thanks for the better number, its in there now.

Anyway, the code is a decent enough state to add more features like support for generic outputs and a decent regulation. I'll be working on that now.

Nachtzuster avatar Feb 09 '14 13:02 Nachtzuster

I would like to express my interest in the functionality. Originally I was going to offer to help but it looks like the learning curve on the audio areas are pretty steep and it will probably take me awhile to get into understanding how all of this works. I'm thinking the only way to get synchronous whole home playback when paired with AirFoil and matched to other apple and licenced airplay devices is with time synched vs. buffered playback. I have all of the airport express hardware to not care, but think that this project would really kick off in embedded applications if the time synchronized feature was available. I want to build an embedded device that has airplay/google cast/and bt 4.0 incoming playback support with automatic handoff between protocols. Then I want to make an iOS/Android app that will let you create custom "zones" with your embedded devices paired in different combos. IE: Living Room + Kitchen = Zone A and then it would create services for AirPlay/GoogleCast. Basically an Audio only version of the AirPort Express with other protocol support and Android NFC pairing for BT. Your project is a great foundation in making something like that exist.

arodd avatar Feb 19 '14 04:02 arodd

Been low on activity, but still getting closer to completion :-) flush/record rtsp calls are now handled properly, which means pause/resume will also work and stay in sync.

still need:

  • a better rate control algorithm
  • delay needs to be user settable @MohammadAG On my machine, 2205000 sounds spot on. It is also the value I found in the unofficial documentation on the web somewhere, So I'll probably use that in the end.
  • need to (better) handle buffer under-runs

@arodd Great ideas! If you want to help; just testing code is already helpful - for any code base :+1:

Nachtzuster avatar Mar 06 '14 22:03 Nachtzuster

@Nachtzuster Great work, thanks However, the code does not build on OS X (and probably other BSDs) due to the lack of clock_gettime()/CLOCK_MONOTONIC. I haven't looked at the code, but the calls can probably be replaced by other on these platforms. I'll try tomorrow on my Raspberry Pis

plietar avatar Mar 06 '14 22:03 plietar

That's not a BSD issue, it's a Mac issue. clock_gettime() is POSIX.1 as of 1993, but on a Mac you have to use Mach calls - host_get_clock_service() for the SYSTEM_CLOCK (which is supposed to be monotonic). Yum.

abrasive avatar Mar 06 '14 23:03 abrasive

This is a nice piece of work and I'm really excited to see it coming through.

I've only had a quick look at it and realise there are couple of parts which are still being tuned so I'd just add two comments at this stage:

  • the delay default required by my airport express is approx. 23000000
  • config.start_buffer_size is used in a couple of places to determine the buffering effect. This could be set with the calculated buffering level or it's use should be re-evaluated.

I'd be happy to soak test the timing/sync algorithm when it's refined. I run multiple Pi based Shairports plus a couple of airport expresses in a domestic environment.

BewickPlace avatar Mar 20 '14 12:03 BewickPlace

Had another play with this tonight and it works quite well for me, plus when combined with my suggested refinements of resync and retransmission ( #307 / #310 ) I can get a good stable operation.

Just a couple of issues came out on the Pi:

  • LL casting of the constants in the time algorithms is required (otherwise time goes negative).
  • ()s are required in the get_delay return statement to ensure precedence (otherwise this goes negative)

Thanks again for the good work.

BewickPlace avatar Mar 20 '14 21:03 BewickPlace

Moving on: all features are in place and working more or less properly now, still some loose end though... and I guess the old rate control stuff should be ripped out? Also the issues on OSX and Pi need addressing. (don't have access to those systems though) Still need to properly measure the delay, will just record output from iTunes (R) and remote Shairport (L) and check the waveform in some editor.

Anyway:

  • rate control is simpler/smoother
  • will properly handle under-runs: drop everything and re-start at the right time

@BewickPlace: Thanks! testing again would be great now :-)

Nachtzuster avatar Mar 21 '14 17:03 Nachtzuster

I've spent some time on this today, looking at general performance together with implementing my #307 #310 changes as required.

Regarding the rate control, I found the algorithm was divergent on my Pi, with the extremes of the error growing over time. I thought I might try copying the existing biquad algorithm, but then realised this wasn't necessary as "sync_time" is proportional to the error and by altering the playback rate by simple linear function of this I achieved a very stable sync.

By taking elements of #307 and #310 updates relevant to the new buffering approach I have now not only been able to get rapid auto re-sync after underrun, but also achieve good synchronisation between the impacted device and other devices that are playing.

One other change I made was to force all synchronisation to start from a timestamp - the "Ouch!" warning - as I found this was a repeatable situation when selecting the shairport device (on iTunes) when itunes was already streaming to another device.

The overall package is a very pleasing step forwards.

I would offer up my changes but am unsure how to push them back to someone else's fork. Do let me know.

BewickPlace avatar Mar 23 '14 23:03 BewickPlace

Strike most of my comments from above. I've just discovered that you updated the files after I'd pulled down a close and started trying it out. I haven't looked closely at your changes but a couple look very similar. I'll take a closer look when I have time.

BewickPlace avatar Mar 25 '14 10:03 BewickPlace

and I guess the old rate control stuff should be ripped out?

Yes. Please.

Amazing piece of work!

abrasive avatar Mar 25 '14 11:03 abrasive

I agree great work!

Have looked at it tonight and my comments above re the Ouch warning and the rate control still stand. So I pulled the changes into a branch off my fork and reimplemented my upgrades albeit on your slightly changed approach to resyncing. There are also a couple of tweaks tightening the resend and buffer handling. You can see this on my sntp branch.

I hope the changes are something you want to take on board.

As I said the results are really pleasing.

BewickPlace avatar Mar 25 '14 22:03 BewickPlace

Looks like iTunes isn't sending sync along with the first packet for late joiners.... Hey! that does not comply with the unofficial spec. :-) @BewickPlace your code looks good, I'll fold that in, thanks!

The ratecontrol not working is probably to another constant type error, maybe #define LOSS 850000 should be #define LOSS 850000.0 to make it work.

Looking to borrow a Pi to debug, that might help too ;-)

Nachtzuster avatar Apr 02 '14 06:04 Nachtzuster

Thanks for your help on this. I've been using my build of this (which is Pi based) over the last week or so and it remains good. I've now scaled this up with multiple "rooms" (devices) playing simultaneously. This is generally fine but I do get the occasional drop out. This is caused by the Pi wifi adapter!! (any help on this gratefully received) but the seamless in-sync recovery now does a excellent job for me. I posted up a couple of tweaks to the advance packet handling yesterday.

BewickPlace avatar Apr 03 '14 09:04 BewickPlace

@plietar I've added mach timing support. Well, at least I tried to, zero experience with OSX... Would you -or anyone else for that matter- mind trying how far I got it right and report back? For now, it does not get configured in automatically yet, it requires a manual hack to do so: add #define MACH_TIME to config.h after you've run ./configure

Thanks a lot.

Nachtzuster avatar Apr 09 '14 16:04 Nachtzuster

Hi, I wanted to report my experience. I'm running the code in an olinuxino board (iMX233 454Mhz, 64Mb) with 3.14.0-1-ARCH linux with the integrated sound card. Transmitting from iTunes in a mac.

After setting asound.conf to dmix and 44100 I was getting great results (with the master branch) for the first audio track, but if it keep playing the following ones, It started to increase the number of (++++++++) messages. If I restarted the track, it came perfect again, so there was some kind of accumulative problem with the playback rate I imagine.

After trying with Nachtzuster branch, the transmission goes wild from the start. Please find the output file here: https://www.dropbox.com/s/6ewaf8a93ey3j81/debug.txt

Do you know what it may be causing it? thanks for your feedback and keep up with the good work.

mcantarutti avatar Apr 10 '14 11:04 mcantarutti

Looks like you're having alsa buffer underruns, hmmm. Are you seeing the same if you lower the verbosity to -vv -v or even w/o logging?

Nachtzuster avatar Apr 10 '14 15:04 Nachtzuster

yes

mcantarutti avatar Apr 10 '14 16:04 mcantarutti

@mcantarutti can't reproduce the issue I'm afraid: the dmix sink work fine here. so it might be specific to you alsa setup.

Nachtzuster avatar Apr 12 '14 14:04 Nachtzuster

@abrasive things are spread out in 16 commits right now, I'm guessing this should be reduced to 5-6 aptly commented commits - or you want everything squashed down into one commit? After that is done, I'm at the bottom of my to-do list apart from: -OSX support: need testers -verify the default delay: working on testing on a couple of different machines.

@BewickPlace should compile/work on a Pi out-of-the-box now.

Nachtzuster avatar Apr 17 '14 06:04 Nachtzuster

A condensed patchset would be ideal, if you have the time. Looking forward to merging this!

abrasive avatar Apr 17 '14 07:04 abrasive