DeepSpeech icon indicating copy to clipboard operation
DeepSpeech copied to clipboard

Add BuildFromBuffer init variant for Java binding

Open vikramambrose opened this issue 4 years ago • 10 comments

Currently the libdeepspeech java binding only offers DeepSpeech(String modelPath) for initialization, This doesn't work in a production environment where the Android app may not be able to provide a "path". Current TFLite java bindings allow for init from java.io.File and java.nio.ByteBuffer. Please add one of these options.

vikramambrose avatar Jul 13 '20 12:07 vikramambrose

This doesn't work in a production environment where the Android app may not be able to provide a "path".

Please document this usecase, this is not one we got any report from.

Please add one of these options.

Adding those will impair the ability to perform mmap operation and it does have a very negative impact on the performances.

lissyx avatar Jul 13 '20 15:07 lissyx

There's also a constructor that takes a java.nio.MappedByteBuffer, maybe that is enough?

reuben avatar Jul 13 '20 16:07 reuben

(And doesn't impair performance)

reuben avatar Jul 13 '20 16:07 reuben

@lissyx Apps delivered through the app store can only access assets through the AssetManager. These files are not kept individually on the filesystem. They can only be accessed through limited means. See https://developer.android.com/reference/kotlin/android/content/res/AssetManager The "filename" in the docs refers to a virtual, relative path into the "assets" blob. Its not something one can fopen in C.

@reuben I think MappedByteBuffer should be fine too. It looks like a MappedByteBuffer can be obtained through an AssetFileDescriptor. In fact I am already doing this for other TFL models.

    public MappedByteBuffer mapAssetFD(AssetFileDescriptor afd) throws IOException {
        FileInputStream inputStream = afd.createInputStream();
        long startOffset = afd.getStartOffset();
        long declaredLength = afd.getDeclaredLength();

        FileChannel fileChannel = inputStream.getChannel();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
    }

Note that whatever mechanism is added, it needs to work for everything that currently relies on a system file path, so Scorer and Alphabet would also need to be enhanced.

I would recommend looking into what NDK api is available to make this easier; https://developer.android.com/ndk/reference/group/asset

vikramambrose avatar Jul 14 '20 08:07 vikramambrose

Apps delivered through the app store can only access assets through the AssetManager. These files are not kept individually on the filesystem. They can only be accessed through limited means.

Thanks, i know that, but you can easily download and place those files in your app storage directory, it works very well.

Note that whatever mechanism is added, it needs to work for everything that currently relies on a system file path, so Scorer and Alphabet would also need to be enhanced.

Yes, you are starting to grasp the issue here, its not that trivial.

External scorer we ship is 900MB so we need to be careful there...

lissyx avatar Jul 14 '20 08:07 lissyx

If you are able to assemble a patch set that improves behavior, you are welcome, though.

lissyx avatar Jul 14 '20 10:07 lissyx

External scorer we ship is 900MB so we need to be careful there...

Exactly! Making copies is not an option.

You too are starting to grasp the issue here, its not that trivial :)

vikramambrose avatar Jul 14 '20 17:07 vikramambrose

External scorer we ship is 900MB so we need to be careful there...

Exactly! Making copies is not an option.

You too are starting to grasp the issue here, its not that trivial :)

Don't package it as an asset in the application ? I know it's not the most convenient way to distribute, but for the moment, alternatives, as you said, are much worse.

lissyx avatar Jul 15 '20 09:07 lissyx

@reuben I think MappedByteBuffer should be fine too. It looks like a MappedByteBuffer can be obtained through an AssetFileDescriptor. In fact I am already doing this for other TFL models.

@vikramambrose So, do you have experience on that? We're kind of overloaded so it's complicated for us, as of now, to investigate that kind of change, but if you have feedback / patches to share, that's welcome.

lissyx avatar Jul 21 '20 07:07 lissyx

@vikramambrose Can you share feedback on that matter?

lissyx avatar Jul 27 '20 13:07 lissyx