
memory-based AudioSource

Open ekuleshov opened this issue 3 years ago • 25 comments

Is your feature request related to a problem? Please describe.

As discussed in #278, a memory-based AudioSource could be a good option for serving audio data packaged with the app.

Describe the solution you'd like

A memory-based AudioSource would make it possible to feed in audio data loaded from assets, e.g. player.setBytes((await rootBundle.load(assetPath)).buffer.asUint8List()), or even in-memory generated audio data.

Describe alternatives you've considered

The proxy-based options discussed in #278 have various shortcomings.

Additional context

N/A

ekuleshov avatar Jan 09 '21 06:01 ekuleshov

As mentioned in #278, a purely memory-based solution might be a long way off, but looking at the use cases, maybe a file-based solution will suffice in the short term.

I will restore the old copy-asset-to-file approach with a better caching mechanism to address the asset use case, while the player.setBytes method could also be implemented by writing those bytes to a file and passing that to the decoder.

ryanheise avatar Jan 09 '21 11:01 ryanheise

Note to self: on iOS this can be done using AVAssetResourceLoaderDelegate.

On Android, this can be done with ByteArrayDataSource.

ryanheise avatar Jan 11 '21 06:01 ryanheise

My use case is this: I want to have control over the cache.

In the context of a chat:

  1. the user records his voice
  2. the user sends his recording in the chat

The user who sent the recording has it locally, while the person they sent it to does not. The recording metadata has an ID, which I can use as the cache key in both scenarios to store the data.

With what just_audio currently has, I cannot have that control. Also, I saw that just_audio was going in the direction of handling the cache itself, even though there are already packages for that and caching is not an audio concern.

cedvdb avatar Aug 07 '21 09:08 cedvdb

Another use case I can think of:

  • displaying a waveform before the file is played. I believe that to do this you'd need the byte array directly; that byte array could then be used both for displaying the waveform and for playback.

cedvdb avatar Aug 07 '21 09:08 cedvdb

@cedvdb the waveform use case is a different matter because it requires access to the decoded samples, whereas the memory-based audio source being discussed would hold encoded data. I think your use case would be best served by a different plugin. To my knowledge, such a plugin doesn't exist yet.

All of the other use cases presented above involving a memory-based audio source can for now be handled by using files as an intermediary. So if you happen to have memory-based audio data, simply write it to a file, and then pass the URI of that file into UriAudioSource, or in the case of web, use a data URI.
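
For example, a minimal sketch of that workaround might look like the following, assuming the path_provider package for a writable directory; the playFromMemory name and the .mp3 file name are illustrative, not part of just_audio's API:

  import 'dart:io';
  import 'dart:typed_data';

  import 'package:just_audio/just_audio.dart';
  import 'package:path_provider/path_provider.dart';

  Future<void> playFromMemory(AudioPlayer player, Uint8List bytes) async {
    // Write the encoded audio bytes to a temporary file...
    final dir = await getTemporaryDirectory();
    final file = File('${dir.path}/in_memory_audio.mp3');
    await file.writeAsBytes(bytes);
    // ...then hand the file's URI to the player as a UriAudioSource.
    await player.setAudioSource(AudioSource.uri(Uri.file(file.path)));
    await player.play();
  }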

ryanheise avatar Aug 07 '21 12:08 ryanheise

@ryanheise Is there any way for me to play a Uint8List of raw sample data by writing it to a file? When I try to play the file I get: TYPE_SOURCE: None of the available extractors (FlvExtractor, FlacExtractor, WavExtractor, FragmentedMp4Extractor, Mp4Extractor, AmrExtractor, PsExtractor, OggExtractor, TsExtractor, MatroskaExtractor, AdtsExtractor, Ac3Extractor, Ac4Extractor, Mp3Extractor, JpegExtractor) could read the stream.

arnirichard avatar Oct 08 '22 06:10 arnirichard

You will need to encode it into a supported format. For your use case, probably WAV will be the easiest.
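
If it helps, here is a sketch of wrapping raw samples in a minimal RIFF/WAV header; pcmToWav is an illustrative helper (not a just_audio API) and assumes 16-bit little-endian PCM:

  import 'dart:typed_data';

  // Wraps raw 16-bit little-endian PCM frames in a 44-byte WAV header.
  Uint8List pcmToWav(Uint8List pcm, {int sampleRate = 44100, int channels = 1}) {
    const bitsPerSample = 16;
    final byteRate = sampleRate * channels * bitsPerSample ~/ 8;
    final blockAlign = channels * bitsPerSample ~/ 8;
    final header = ByteData(44);
    void ascii(int offset, String s) {
      for (var i = 0; i < s.length; i++) {
        header.setUint8(offset + i, s.codeUnitAt(i));
      }
    }

    ascii(0, 'RIFF');
    header.setUint32(4, 36 + pcm.length, Endian.little); // total size - 8
    ascii(8, 'WAVE');
    ascii(12, 'fmt ');
    header.setUint32(16, 16, Endian.little); // fmt chunk size
    header.setUint16(20, 1, Endian.little); // audio format: 1 = PCM
    header.setUint16(22, channels, Endian.little);
    header.setUint32(24, sampleRate, Endian.little);
    header.setUint32(28, byteRate, Endian.little);
    header.setUint16(32, blockAlign, Endian.little);
    header.setUint16(34, bitsPerSample, Endian.little);
    ascii(36, 'data');
    header.setUint32(40, pcm.length, Endian.little);
    return Uint8List.fromList([...header.buffer.asUint8List(), ...pcm]);
  }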

ryanheise avatar Oct 08 '22 06:10 ryanheise

Since it hasn't been mentioned above yet: another solution, aside from writing the audio to a file, is to define your own subclass of StreamAudioSource that returns the audio data from memory. If you search the issues database for this class name, you will find examples of how people are using it.
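
For reference, a minimal sketch of such a subclass might look like this; BytesAudioSource is an illustrative name, and the audio/mpeg content type assumes the buffer holds a complete encoded MP3 file:

  import 'dart:typed_data';

  import 'package:just_audio/just_audio.dart';

  // Serves a fully encoded audio file held in memory, honouring range
  // requests by slicing the buffer.
  class BytesAudioSource extends StreamAudioSource {
    final Uint8List _bytes;
    BytesAudioSource(this._bytes);

    @override
    Future<StreamAudioResponse> request([int? start, int? end]) async {
      start ??= 0;
      end ??= _bytes.length;
      return StreamAudioResponse(
        sourceLength: _bytes.length,
        contentLength: end - start,
        offset: start,
        stream: Stream.value(_bytes.sublist(start, end)),
        contentType: 'audio/mpeg',
      );
    }
  }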

ryanheise avatar Nov 17 '22 06:11 ryanheise

@ryanheise Is it possible for an AudioPlayer with a StreamAudioSource to play such that the audio source only returns chunks of its data at a time? Currently, when setting the source to a StreamAudioSource and playing, the start and end parameters of StreamAudioSource.request are null, so the entire stream is requested at once. Even if the sourceLength and contentLength of the returned StreamAudioResponse indicate that there is yet more data in the source after the content that is returned, request is only ever called once, and any remaining data is ignored. How can a StreamAudioSource respond to request when there is an unknown number of chunks of data that it does not yet have, such as when the sound is being generated in realtime?

yaakovschectman avatar Jun 05 '23 21:06 yaakovschectman

It is the native-level player that actually makes the requests, and its first request is going to be a normal request, assuming the server doesn't support range requests. If it learns from the first response that the server does support range requests, it will optionally make range requests for the end of the file, where the duration metadata may be found, and then further ones for any seek requests.

But this doesn't mean that your stream audio source needs to actually load the whole file into memory at once. The API uses a stream for this very purpose. The stream means that the whole file will NOT be loaded all at once but instead streamed over time. So when generating the stream output, you only need to load up the next chunk of memory as the stream gets to that point in time.
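
To illustrate the idea (a sketch only; nothing here is just_audio API beyond the Stream type): a Dart async* generator suspends at yield while the consumer's subscription is paused, so chunks are only produced as they are pulled. generateChunk is a hypothetical function standing in for however the next block of audio is loaded or synthesised.

  Stream<List<int>> chunkedStream(int totalLength, int chunkSize) async* {
    for (var offset = 0; offset < totalLength; offset += chunkSize) {
      // The generator suspends here whenever the consumer pauses, so the
      // next chunk is only produced when the player actually needs it.
      yield await generateChunk(offset, chunkSize);
    }
  }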

ryanheise avatar Jun 06 '23 00:06 ryanheise

@ryanheise If I understand correctly, you are basically saying to return a Stream that currently contains whatever is already loaded, but will stream what is loaded in the future as it happens, correct? If so, and especially when the total length of audio is unknown at the start or is potentially infinite, what values should be in the sourceLength and contentLength of the response?

yaakovschectman avatar Jun 06 '23 11:06 yaakovschectman

If I understand correctly, you are basically saying to return a Stream that currently contains

No, a stream doesn't "currently contain" anything; a stream supplies data at the rate at which the consumer consumes it. The native-level player decides to request the entire range of the file and consumes from your stream at a rate of its own choosing. Typically, Android will not actually consume the whole file at the maximum rate possible; it will consume just enough to feed the decoder in time for playback without stuttering, and if you pause playback it will also pause consumption. On iOS, there are options to control the size of the lookahead buffer, but iOS seems to ignore them, so on that platform the native player will potentially consume the entire file at the maximum rate. You can have a play around with the various load configuration options for Android and iOS that you pass into the constructor, which are intended to control Android's and iOS's lookahead buffer size, among other things.

If you want to impose your own rate limiter (e.g. because iOS is not rate limiting itself like Android does), you could in theory do that by controlling the rate at which you add things to your stream.
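
A minimal sketch of such a rate limiter, assuming uncompressed PCM so the byte rate is known (bytesPerSecond = sampleRate × channels × bytesPerSample): it simply delays after each chunk by roughly that chunk's playback duration.

  Stream<List<int>> rateLimited(
      Stream<List<int>> source, int bytesPerSecond) async* {
    await for (final chunk in source) {
      yield chunk;
      // Wait approximately as long as this chunk takes to play back
      // before producing the next one.
      await Future<void>.delayed(
          Duration(microseconds: chunk.length * 1000000 ~/ bytesPerSecond));
    }
  }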

ryanheise avatar Jun 06 '23 11:06 ryanheise

@ryanheise I see, but what values ought to be returned in sourceLength and contentLength when streaming audio of an unknown total length, e.g. from a livestream?
I am testing this on Windows with just_audio_windows, for clarity.

Suppose I want to continuously play a sine tone whose frequency is controlled in realtime by a slider. I would not be able to supply audio data very far ahead of time, as the slider value may have changed by the time the player reaches that new data; I would instead need to generate the data at the same rate it's consumed. Am I understanding you correctly that I would need to manually delay adding data to the stream by however long each block of data should take to play?

yaakovschectman avatar Jun 06 '23 13:06 yaakovschectman

I don't have Windows and can't test it, but regarding the source/content length, I can just say that StreamAudioResponse follows the HTTP standard for range requests. Are you familiar with it? You can use null if unknown, and it will produce the correct range header, but consulting online resources about range requests should help.

ryanheise avatar Jun 06 '23 13:06 ryanheise

From what you stated before, it sounds like a response with null for its lengths causes a header indicating that the source supports range requests to be sent back to the player. Though when I return a response with null for the two length parameters and 0 for the offset, I see the following error when play is called:

[ERROR:flutter/runtime/dart_vm_initializer.cc(41)] Unhandled Exception: Null check operator used on a null value
#0      _proxyHandlerForSource.handler (package:just_audio/just_audio.dart:3127:64)

When the offset is left null, the above error is not reported, but no sound is played and no subsequent range requests are made.

From a high level, if I want to continuously play some raw audio data that's generated as it's played, what would my StreamAudioSource subclass need to do? Respond to the first normal request with null for its source/content length and a Stream to which I add data as it's generated?

yaakovschectman avatar Jun 06 '23 14:06 yaakovschectman

Just to check again, are you familiar with how range requests work? You'll need to share all parts of the range request header and all parts of the response header for me to see whether that is a sensible combination or not (if it's a sensible combination, then it's a just_audio bug. If it's not a sensible combination, you need to change your code.)

ryanheise avatar Jun 06 '23 15:06 ryanheise

I'm afraid I do not understand the relevance to the use case of creating an AudioPlayer and calling play on it. When the player loads the audio source, it sends a request to the StreamAudioSource and calls its request method, right? And then request returns a StreamAudioResponse, correct? Where do request and response headers even factor into this process, and how would one know what they contain?

yaakovschectman avatar Jun 06 '23 15:06 yaakovschectman

StreamAudioSource and StreamAudioResponse both implement the HTTP range request protocol and feed through just_audio's HTTP proxy. Since this feature works through the HTTP proxy, it's all based on the way HTTP works. I'm off to bed now, but in the meantime, can you read about HTTP range requests so that my previous responses may make more sense?

ryanheise avatar Jun 06 '23 15:06 ryanheise

From what I found, a client can check whether a server supports range requests using a HEAD request. Is such a HEAD request being sent and processed behind the scenes? If so, does the user have any control over the request or response headers, and how might I debug the request and response of either the HEAD request or the actual GET requests to share them with you as you have asked? It seems that the contents of the headers when using StreamAudioSource and AudioPlayer.setAudioSource are opaque to the developer.
For example, I subclass StreamAudioSource with its custom request method. I create an instance of it, and an instance of AudioPlayer. I set the player's source to the stream source I created, and then call play on it. I understand from what you have said that whatever requests and responses result from this are fed through the HTTP proxy, but I do not see any way of directly interacting with the proxy in the API in order to intercept, view, or modify the headers of these requests and responses.

yaakovschectman avatar Jun 06 '23 16:06 yaakovschectman

From what I found, a client can check if a server supports range requests using a HEAD request. Is such a HEAD request being sent and processed behind the scenes?

There is another open issue to support HEAD requests, but until then, it only supports GET requests, which means that the first GET request will typically be for the entire file. You can then set rangeRequestsSupported to false in your response to maintain that and prevent future range requests. That should allow you to also use null as a contentLength (which would not normally be allowed if you were using range requests).

ryanheise avatar Jun 07 '23 01:06 ryanheise

Firstly, I still have no information on how to view the header contents of requests and responses to give you that information as you have requested.
Secondly, what is setting rangeRequestsSupported to false supposed to accomplish? As a test, I currently return a response with a stream that just contains one second of PCM data with a WAV header. When the source and content length are the length of the data, and the offset is 0, it plays. When all three are set to null, regardless of the value of rangeRequestsSupported, nothing plays, even though the method is called with the same parameters (i.e. 0, null).

For reference, my request method is as follows:

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    int t0 = start ?? 0;
    int nSecs = 1;
    int t1 = end ?? (_sampleRate * nSecs + _kWavHeaderSize);
    int len = t1 - t0;
    print('Requesting $start -> $end');
    assert(len >= _kWavHeaderSize);
    Uint8List data = Uint8List(len);
    // Write a WAV header to the beginning of data and return the offset at which it ends.
    // Mono, 44100 hz sample rate, 8 bit PCM.
    int endOfHeader = _writeWavHeader(data.buffer.asByteData(), _sampleRate, 1, 1);
    for (int i = 0; i < data.length - endOfHeader; i++) {
      // frequency is 440, _sampleRate 44100.
      double samp = sin(i * 2 * pi * frequency / _sampleRate) * 0.5 + 0.5;
      data[endOfHeader + i] = (samp * 255).toInt();
    }
    return StreamAudioResponse(
      sourceLength: null,
      contentLength: null,
      offset: null,
      stream: Stream.fromIterable([data]),
      contentType: 'audio/wav',
      rangeRequestsSupported: false,
      );
  }

Edit: What actually seems to be happening when the AudioPlayer plays with the above as its source is that it makes the request, stays silent for about 5 minutes, and then plays the returned audio.

yaakovschectman avatar Jun 07 '23 13:06 yaakovschectman

You can print out the parameters to see the values that correspond to the request header, and you can print out the parameters in the response object to see the values that correspond to the response header. You can also look at the just_audio code to see how these values are converted to and from the headers to see what these values correspond to, in case the comments do not make that correspondence clear.

What rangeRequestsSupported does is, again, correspond to an HTTP header (in this case, accept-ranges). That's why I suggested you familiarise yourself with the way range requests work first, as that will help you to understand the API you're using. As for why I mentioned it, it was so that you could pass in null for a particular parameter (see previous comment). just_audio will assume that if you say you DO support range requests, you will also specify the parameters that are necessary for the range (i.e. offset and contentLength). If you say that you support range requests but then don't actually support them by supplying the range values, just_audio will give you an assertion error.

ryanheise avatar Jun 07 '23 13:06 ryanheise

By the parameters, do you mean just start and end as passed to StreamAudioSource.request, and the constructor arguments for the StreamAudioResponse? If not, then any further parameters seem to be completely opaque to the developer, so how/where would I be able to find and log their values? And if so, when play is called, the source receives a request for 0, null as I stated above.

I understand that rangeRequestsSupported corresponds to the HTTP header regarding range requests, and I have already looked into HTTP range requests in general. I was asking what disabling range requests is supposed to accomplish in practical terms, because, as I mentioned above, I can return null for the source length, content length, and offset, whether I support range requests or not, and observe the same behavior, i.e. silence and no failed assertions.

Putting a somewhat more detailed print message in request, with sourceLength: null, contentLength: null, offset: null, rangeRequestsSupported: false results in the following being logged when play is called, and no sound being actually played:

[just_audio_windows] Called setVolume
[just_audio_windows] Called setSpeed
[just_audio_windows] Called setPitch
[just_audio_windows] Called setSkipSilence
[just_audio_windows] Called setLoopMode
[just_audio_windows] Called setShuffleMode
[just_audio_windows] Called play
[just_audio_windows] Called load
flutter: Requesting 0 to null, returning data of 44144 long.

Note that if I set rangeRequestsSupported: true in the response, the exact same output is logged and the same behavior is observed. No failed assertion is reported.

yaakovschectman avatar Jun 07 '23 14:06 yaakovschectman

Note that if I set rangeRequestsSupported: true in the response, the exact same output is logged and the same behavior is observed. No failed assertion is reported.

rangeRequestsSupported is a signal for subsequent requests.

By the parameters, do you mean just start and end as passed to StreamAudioSource.request, and the constructor arguments for the StreamAudioResponse?

Yes

If not, then any further parameters seem to be completely opaque to the developer, so how/where would I be able to find and log their values?

just_audio is open source; from there you can see that these parameters are directly translated into the HTTP headers.

If you have found a bug, you are welcome to contribute to the code, although according to your comment it is actually working:

Edit: What actually seems to be happening when the AudioPlayer plays with the above as its source is that it makes the request, stays silent for about 5 minutes, and then plays the returned audio.

So this is the behaviour of the native player, which at its own discretion will read the entire file until the duration is known, and then decide it is ready to play. There are other formats besides WAV that allow you to specify the duration as metadata in one of the early frames or in one of the last frames. Most transcoding software will give you the option to encode for web streaming, in which case it'll put the metadata up the front of the file so that the player doesn't need to download the entire file before being ready to play. Of course, since then range requests have become widely adopted, which allow the player to jump to the end of the file if that's where the metadata is stored, thus allowing playback to start immediately without having to download the entire file first.

ryanheise avatar Jun 07 '23 14:06 ryanheise

Actually, this discussion is probably a bit off-topic for this issue. The purpose of this issue is to be able to play an audio asset that CAN fit in memory, but you want to stream audio. Yes, there is the StreamAudioSource which can handle your use case, but this issue is not about your use case. If you want help using StreamAudioSource, you can post a question on StackOverflow, or if you want a feature that is not supported by StreamAudioSource, you can submit a feature request in a separate issue so as not to divert the current issue into a different discussion.

ryanheise avatar Jun 08 '23 03:06 ryanheise