node-ytdl-core
[Feature request] YouTube Live Stream Audio Only
Hello. As we all know, when requesting 'audio only' for a live stream from node-ytdl, it gives us a URL that returns a chunk of about 5 seconds of audio, playable directly in the browser. The problem is that we have to make a new request roughly every 5 seconds to get new audio.

I tried to implement a mechanism using Axios in a loop with a circular buffer: when a response arrives, the data is pushed to the buffer and then written to a WriteStream (a local file). The result was a bit unreliable. I was able to play the file in VLC but not in other players, because the header data of every response was written along with the audio chunks. When playing, I can hear duplicated samples, e.g. the first 15 s are the same as the first 5 s response, just repeated. That made me think YouTube gives you the same audio if the previous chunk was not buffered or played completely, or if the request was made a bit too early; for example, if we make 2 requests within 2 seconds we get the same response.

Anyway, let's see what we can accomplish with the help of anyone who is interested in playing audio from a live stream, because we all want to save some bandwidth and listen to music without watching while working. Here is some of my implementation:
const duration = 120;
const bufferSize = sampleRate * duration;
const buf = new CircularBuffer(bufferSize);
// Fetch one response body; returns undefined if the request fails.
async function callAxios(url) {
  try {
    const response = await axios({ method: 'get', url: url, responseType: 'arraybuffer' });
    return response.data;
  } catch (err) {
    console.log(err);
  }
}
async function loopThrough(url) {
  let streaming = true;
  // Keep requesting until a request fails
  do {
    const data = await callAxios(url);
    if (data) {
      buf.push(data);
      console.log(buf.size());
      // streaming = false;
      writeStream.write(buf.deq(), function () {
      });
    } else {
      streaming = false;
      writeStream.end();
    }
  } while (streaming);
}
mimeType: 'audio/mp4; codecs="mp4a.40.2"',
qualityLabel: null,
bitrate: 144000,
audioBitrate: 128,
itag: 140,
url: 'https://r2---sn-vbxgv-cxtz.googlevideo.com/videoplayback?expire=1621879552&ei=oJarYL2YI_K1xN8P9aCe-A8&ip=MYIP&id=KvRVky0r7YM.3&itag=140&source=yt_live_broadcast&requiressl=yes&mh=mz&mm=44%2C29&mn=sn-vbxgv-cxtz%2Csn-25glene6&ms=lva%2Crdu&mv=m&mvi=2&pl=24&initcwndbps=222500&vprv=1&live=1&hang=1&noclen=1&mime=audio%2Fmp4&ns=Yl-cSuGcwe5MjklshWtfjmsF&gir=yes&mt=1621857627&fvip=4&keepalive=yes&fexp=24001373%2C24007246&beids=9466587&c=WEB&n=3-RD52cbC59tTArA&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Clive%2Chang%2Cnoclen%2Cmime%2Cns%2Cgir&sig=AOq0QJ8wRQIgbPiNUb8oGioz2IucWT_lVzPGUucvx49z-CgsReJUCXYCIQDCS3pueUwS2i8LMVeWhzQ-SWPlQO0u6odkPgSOAz-4pA%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIhAOhfxTRYA7MruZNdjdvfyjgnO-sm2VkQ69aOQkilx7zbAiBZG4LESU2pZScRCExSTXIXNZ8BcZs2bzElAvl835lwIw%3D%3D&ratebypass=yes',
lastModified: '1621825709929201',
quality: 'tiny',
projectionType: 'RECTANGULAR',
targetDurationSec: 5,
maxDvrDurationSec: 43200,
highReplication: true,
audioQuality: 'AUDIO_QUALITY_MEDIUM',
audioSampleRate: '48000',
audioChannels: 2,
hasVideo: false,
hasAudio: true,
container: 'mp4',
codecs: 'mp4a.40.2',
videoCodec: null,
audioCodec: 'mp4a.40.2',
isLive: true,
isHLS: false,
isDashMPD: false
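As a sanity check on the format entry above, the expected size of one segment can be estimated from `bitrate` and `targetDurationSec`. This is only a rough sketch: the helper name `estimateSegmentBytes` is made up for illustration, and the real size varies with the encoder and with container overhead.

```javascript
// Hypothetical helper: estimate the size of one live segment from the
// fields ytdl-core reports for the format.
function estimateSegmentBytes(format) {
  // bitrate is in bits per second, targetDurationSec is in seconds
  return Math.round((format.bitrate * format.targetDurationSec) / 8);
}

// Values copied from the format entry above.
const format = { bitrate: 144000, audioBitrate: 128, targetDurationSec: 5 };

console.log(estimateSegmentBytes(format)); // 90000
console.log((format.audioBitrate * 1000 * format.targetDurationSec) / 8); // 80000
```

The responses of roughly 81 KB reported later in this thread fall between these two estimates, which is consistent with each response carrying about one 5-second segment.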
Added a 5-second delay between requests:
async function delay(ms) {
  // return await for better async stack trace support in case of errors.
  return await new Promise(resolve => setTimeout(resolve, ms));
}
async function loopThrough(url) {
  ...
  await delay(5000)
}
Now I get continuous audio, but some parts are missing: they were skipped because the requests are not synchronized with the stream. So the basic theory almost works, but I think this is not the best way to approach it. Maybe Node streams will do better? The audio coming from each response needs to be buffered before the next request is made, and so on. A codec header also needs to be added; in this case the codec is AAC.
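One source of drift with a fixed `delay(5000)` is that the time spent on the request itself is added to every cycle. A minimal sketch of keeping requests on a fixed cadence follows. The names `msUntilNextRequest`, `pollOnCadence` and `fetchSegment` are placeholders invented for this example, and this only removes client-side drift; it does not align requests with the server's segment boundaries.

```javascript
// Hypothetical helper: how long to wait before request number `index`
// so that requests stay on a fixed cadence measured from `startMs`.
function msUntilNextRequest(startMs, index, intervalMs, nowMs) {
  const dueAt = startMs + index * intervalMs;
  return Math.max(0, dueAt - nowMs);
}

const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// `fetchSegment` is a placeholder for whatever performs one request.
async function pollOnCadence(fetchSegment, intervalMs, count) {
  const startMs = Date.now();
  for (let i = 0; i < count; i++) {
    // Wait only for the time remaining until this request is due.
    await delay(msUntilNextRequest(startMs, i, intervalMs, Date.now()));
    await fetchSegment(i);
  }
}
```

For example, `pollOnCadence(i => callAxios(url), 5000, 10)` would issue ten requests spaced 5 seconds apart regardless of how long each one takes, as long as a request finishes within the interval.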
I'd expect some kind of header to track which range to get 🤔 not just a request every ~5 sec
If you make 1 request, dump the result to a file and open it, you can see some header information at the beginning and then the audio data.
requestOptions.headers = Object.assign({}, requestOptions.headers, {
  Range: `bytes=${0}-${''}`,
});
const ondata = chunk => {
  downloaded += chunk.length;
  stream.emit('progress', chunk.length, downloaded, contentLength);
  console.log(chunk.length);
  console.log(downloaded);
  console.log(contentLength);
};
results:
16384
16384
NaN
16384
32768
NaN
16384
49152
NaN
16384
65536
NaN
15926
81462
NaN
end
Seems it works. Now I'm trying to add a callback on the end event that makes another request.
const stream = createStream();
doStream(stream, url).on('end', () => {
  doStream(stream, url)
})
function doStream(stream, url) {
  const requestOptions = Object.assign({}, [], {
    maxReconnects: 6,
    maxRetries: 3,
    backoff: { inc: 500, max: 10000 },
  });
  requestOptions.headers = Object.assign({}, requestOptions.headers, {
    Range: `bytes=${0}-${''}`,
  });
  let req = miniget(url, requestOptions);
  let contentLength, downloaded = 0;
  const ondata = (chunk) => {
    downloaded += chunk.length;
    stream.emit('progress', chunk.length, downloaded, contentLength);
    console.log(chunk.length);
    console.log(downloaded);
    writeStream.write(chunk)
  };
  // req.on('response', res => {
  //   if (stream.destroyed) { return; }
  //   contentLength = contentLength || parseInt(res.headers['content-length']);
  //   stream.emit('response', res);
  //   console.log(res)
  // });
  req.on('data', ondata);
  req.on('prefinish', (data) => {
  });
  req.on('end', () => {
    if (stream.destroyed) { return; }
    stream.emit('end', '');
    // console.log(req)
  });
  req.on('drain', (drain) => {
    // console.log(drain)
  });
  req.on('unpipe', (unpipe) => {
    // console.log(unpipe)
  });
  return stream
}
same result, repetitive audio.
Start of one response opened as text (binary bytes omitted):
ftyp dash iso6 mp41
moov mvhd mvex trex trak tkhd mdia mdhd hdlr soun SoundHandler minf dinf dref url stbl stsd mp4a esds stts stsc stco stsz smhd
emsg http://youtube.com/streaming/metadata/segment/102015
Sequence-Number: 7457566
Ingestion-Walltime-Us: 1622114400732901
Ingestion-Uncertainty-Us: 112
Capture-Walltime-Us: 1622114400529764
Stream-Duration-Us: 37287320969666
Max-Dvr-Duration-Us: 14400000000
Target-Duration-Us: 5000000
First-Frame-Time-Us: 1622114409631562
First-Frame-Uncertainty-Us: 112
Finalized-Sequence-Number: 7457565
Finalized-Media-End-Timestamp-Us: 37287320985666
Finalized-Media-Lmt-Us: 1622090390317185
Finalized-Media-Lmt-Us: 1622090390317186
Finalized-Media-Lmt-Us: 1622090390317187
Finalized-Media-Lmt-Us: 1622090390317188
Finalized-Media-Lmt-Us: 1622090390317189
Finalized-Media-Lmt-Us: 1622090390317190
Finalized-Media-Lmt-Us: 1622090390317191
Finalized-Media-Lmt-Us: 1622090390317192
Finalized-Media-Lmt-Us: 1622090390317193
Finalized-Xformat: 133
Finalized-Xformat: 134
Finalized-Xformat: 135
Finalized-Xformat: 136
Finalized-Xformat: 139
Finalized-Xformat: 140
Finalized-Xformat: 140:slate=user
Finalized-Xformat: 160
Finalized-Xformat: 247:slate=user
Current-Segment-Xformat: 140
Encoding-Alias: L1_BA
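The text part of the `emsg` box above is a list of `Key: value` lines, so it can be read into an object to get at fields such as `Sequence-Number`. A minimal sketch, assuming the text has already been extracted from the box; `parseSegmentMetadata` is a name invented for this example.

```javascript
// Hypothetical helper: turn "Key: value" lines into an object.
// Keys that occur more than once are collected into arrays.
function parseSegmentMetadata(text) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(': ');
    if (idx === -1) continue; // skip lines that are not "Key: value"
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 2).trim();
    if (Object.prototype.hasOwnProperty.call(result, key)) {
      result[key] = [].concat(result[key], value);
    } else {
      result[key] = value;
    }
  }
  return result;
}

// A few lines copied from the dump above.
const sample = [
  'Sequence-Number: 7457566',
  'Target-Duration-Us: 5000000',
  'Finalized-Sequence-Number: 7457565',
  'Finalized-Xformat: 139',
  'Finalized-Xformat: 140',
].join('\n');

const meta = parseSegmentMetadata(sample);
console.log(Number(meta['Sequence-Number']));          // 7457566
console.log(Number(meta['Target-Duration-Us']) / 1e6); // 5
console.log(meta['Finalized-Xformat']);                // [ '139', '140' ]
```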
The response arrives as 5 chunks, each about 16384 bytes long (the last one is shorter). The first chunk starts with the MP4 header info, such as the moov and stts boxes. We need to parse that header info so that we can work out the 'end' offset. Any suggestions?
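For reference, the top-level MP4 (ISO BMFF) boxes can be listed without a library, since each box starts with a 4-byte big-endian size followed by a 4-byte type. The sketch below assumes the whole response has been collected into one Buffer; `listBoxes` and `makeBox` are names invented for this example, and the sample data is a fake stream rather than real media. Whether the box offsets are enough to select the next segment is not established here.

```javascript
// List the top-level ISO BMFF boxes in a Buffer.
function listBoxes(buf) {
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= buf.length) {
    let size = buf.readUInt32BE(offset);
    const type = buf.toString('latin1', offset + 4, offset + 8);
    let headerSize = 8;
    if (size === 1) {
      // A 64-bit "largesize" follows the type field.
      if (offset + 16 > buf.length) break;
      size = Number(buf.readBigUInt64BE(offset + 8));
      headerSize = 16;
    } else if (size === 0) {
      // The box extends to the end of the data.
      size = buf.length - offset;
    }
    if (size < headerSize) break; // malformed box, stop
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// Build a tiny fake stream to exercise the parser (not real media).
function makeBox(type, payloadLength) {
  const box = Buffer.alloc(8 + payloadLength);
  box.writeUInt32BE(box.length, 0);
  box.write(type, 4, 'latin1');
  return box;
}

const sampleBoxes = Buffer.concat([
  makeBox('ftyp', 16),
  makeBox('moov', 560),
  makeBox('mdat', 100),
]);
console.log(listBoxes(sampleBoxes));
```

The end offset of a box is `offset + size`, so the end of the last complete `mdat` gives the number of bytes that belong to whole boxes in the response.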
[
'Last-Modified',
'Fri, 28 May 2021 12:17:52 GMT',
'Content-Type',
'audio/mp4',
'Date',
'Fri, 28 May 2021 12:18:44 GMT',
'Expires',
'Fri, 01 Jan 1990 00:00:00 GMT',
'Cache-Control',
'no-cache, must-revalidate',
'Accept-Ranges',
'bytes',
'Content-Length',
'81064',
'Connection',
'keep-alive',
'Alt-Svc',
'h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"',
'X-Walltime-Ms',
'1622204324579',
'X-Bandwidth-Est',
'131830985',
'X-Bandwidth-Est-Comp',
'17349397',
'X-Bandwidth-Est2',
'17349397',
'X-Bandwidth-App-Limited',
'false',
'X-Bandwidth-Est-App-Limited',
'false',
'X-Bandwidth-Est3',
'14048912',
'Pragma',
'no-cache',
'X-Sequence-Num',
'7475549',
'X-Segment-Lmt',
'1622204272988277',
'X-Head-Time-Sec',
'37377234',
'X-Head-Time-Millis',
'37377234071',
'X-Head-Seqnum',
'7475549',
'Vary',
'Origin',
'Cross-Origin-Resource-Policy',
'cross-origin',
'X-Content-Type-Options',
'nosniff',
'Server',
'gvs 1.0'
]
I tried increasing 'X-Sequence-Num' by 1 on the next-chunk requests. It gives me the same response, but after about 7 requests I get new audio. Maybe 'X-Head-Time-Millis' also needs to be increased.
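The header dump above has the flat name/value shape of Node's `res.rawHeaders`, and `X-Sequence-Num` is a response header, so one thing it can be used for on the client side is detecting repeated segments before writing them to the file. A minimal sketch under that assumption; `headersFromRaw` and `createSequenceTracker` are names invented for this example, and this does not by itself make the server return the next segment.

```javascript
// Convert a flat [name, value, name, value, ...] list into an object
// with lower-cased header names.
function headersFromRaw(rawHeaders) {
  const headers = {};
  for (let i = 0; i + 1 < rawHeaders.length; i += 2) {
    headers[rawHeaders[i].toLowerCase()] = rawHeaders[i + 1];
  }
  return headers;
}

// Hypothetical tracker: reports whether a response carries a segment
// that has not been seen yet, based on X-Sequence-Num.
function createSequenceTracker() {
  let last = -1;
  return function isNewSegment(rawHeaders) {
    const seq = parseInt(headersFromRaw(rawHeaders)['x-sequence-num'], 10);
    if (Number.isNaN(seq) || seq <= last) return false;
    last = seq;
    return true;
  };
}

const isNewSegment = createSequenceTracker();
console.log(isNewSegment(['X-Sequence-Num', '7475549'])); // true
console.log(isNewSegment(['X-Sequence-Num', '7475549'])); // false (duplicate)
console.log(isNewSegment(['X-Sequence-Num', '7475550'])); // true
```

In the polling loop, a response would only be written when `isNewSegment(res.rawHeaders)` returns true, which should avoid the repeated audio described earlier.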
[ftyp] size=8+16
major_brand = dash
minor_version = 0
compatible_brand = iso6
compatible_brand = mp41
[moov] size=8+560
[mvhd] size=12+96
timescale = 48000
duration = 0
duration(ms) = 0
[mvex] size=8+32
[trex] size=12+20
track id = 1
default sample description index = 1
default sample duration = 0
default sample size = 0
default sample flags = 0
[trak] size=8+404
[tkhd] size=12+80, flags=3
enabled = 1
id = 1
duration = 0
width = 0.000000
height = 0.000000
[mdia] size=8+304
[mdhd] size=12+20
timescale = 48000
duration = 0
duration(ms) = 0
language = und
[hdlr] size=12+33
handler_type = soun
handler_name = SoundHandler
[minf] size=8+219
[dinf] size=8+28
[dref] size=12+16
[url ] size=12+0, flags=1
location = [local to file]
[stbl] size=8+159
[stsd] size=12+79
entry_count = 1
[mp4a] size=8+67
data_reference_index = 1
channel_count = 2
sample_size = 16
sample_rate = 48000
[esds] size=12+27
[ESDescriptor] size=2+25
es_id = 1
stream_priority = 0
[DecoderConfig] size=2+17
stream_type = 5
object_type = 64
up_stream = 0
buffer_size = 0
max_bitrate = 0
avg_bitrate = 0
DecoderSpecificInfo = 11 90
[Descriptor:06] size=2+1
[stts] size=12+4
entry_count = 0
[stsc] size=12+4
entry_count = 0
[stco] size=12+4
entry_count = 0
[stsz] size=12+8
sample_size = 0
sample_count = 0
[smhd] size=12+4
balance = 0
[emsg] size=8+1139
[sidx] size=12+32
reference_ID = 1
timescale = 48000
earliest_presentation_time = 0
first_offset = 0
[moof] size=8+1028
[mfhd] size=12+4
sequence number = 7333
[traf] size=8+1004
[tfhd] size=12+16, flags=2002a
track ID = 1
sample description index = 1
default sample duration = 1024
default sample flags = 0
[tfdt] size=12+8, version=1
base media decode time = 1794422115536
[trun] size=12+944, flags=201
sample count = 234
data offset = 1044
[mdat] size=8+78296
After parsing the MP4 headers, no byte range was found.
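Although the dump contains no byte range, it does give the timing of the segment. A short worked calculation with the values from the dump above (the variable names are only for illustration):

```javascript
// Values copied from the dump above.
const timescale = 48000;                   // mdhd / sidx timescale (ticks per second)
const sampleCount = 234;                   // trun sample count
const defaultSampleDuration = 1024;        // tfhd default sample duration (ticks)
const baseMediaDecodeTime = 1794422115536; // tfdt base media decode time (ticks)

// Duration of this segment and its start position on the stream timeline.
const segmentSeconds = (sampleCount * defaultSampleDuration) / timescale;
const segmentStartSeconds = baseMediaDecodeTime / timescale;

console.log(segmentSeconds.toFixed(3));       // 4.992
console.log(Math.floor(segmentStartSeconds)); // 37383794
```

So one response holds just under 5 seconds of audio, matching `targetDurationSec: 5`. The start time is of the same order of magnitude as the `X-Head-Time-Sec` value seen in the response headers, which suggests (without proving) that both count seconds since the stream began.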
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hello again. After a lot of digging into the problem, I finally made it work: https://github.com/joek85/yt-live