Reduce mass-storage consumption with Android custom apps with embedded ZIM

Open mhutti1 opened this issue 7 years ago • 29 comments

From @kelson42 on August 28, 2016 13:33

We have a working system to build custom apps with embedded ZIM files (see #6). The only way to have it working correctly is to use the "assets". The problem with that solution is that you can not access it directly because you do not have a filehandle. As a consequence:

  • It uses twice as much storage (once in the assets, once copied somewhere on the filesystem).
  • Decompression is naive and uses a lot of memory, which causes the install process to crash on low-end devices with a big APK.

A solution could be to access directly the content in the asset file: http://www.50ply.com/blog/2013/01/19/loading-compressed-android-assets-with-file-pointer/#comment-1850768990

Copied from original issue: kiwix/kiwix#297

mhutti1 avatar Mar 18 '17 12:03 mhutti1

From @kelson42 on October 15, 2016 10:55

I have talked to a Google developer and it seems to be a "standard" way of doing it, even if it is not well enough documented. That said, this method will add some overhead (zip decompression), although it might not be much worse in terms of app performance.

mhutti1 avatar Mar 18 '17 12:03 mhutti1

@kelson42 @mhutti1 I have been researching on this issue and would appreciate some help on a problem.

I am trying to find the code where we load the embedded ZIM file from the assets folder and open it, in order to explore the issue further, but I can't find it. The closest thing I found is the following code snippet inside KiwixMobileActivity:

if (BuildConfig.HAS_EMBEDDED_ZIM) {
  String appPath = getPackageResourcePath();
  File libDir = new File(appPath.substring(0, appPath.lastIndexOf("/")) + "/lib/");
  if (libDir.exists() && libDir.listFiles().length > 0) {
    filePath = libDir.listFiles()[0].getPath() + "/" + BuildConfig.ZIM_FILE_NAME;
  }
  if (filePath.isEmpty() || !new File(filePath).exists()) {
    filePath = String.format("/data/data/%s/lib/%s", BuildConfig.APPLICATION_ID,
        BuildConfig.ZIM_FILE_NAME);
  }
}

I think that in order to access files from assets we need to use AssetManager, but we aren't using it. Can you please explain, or point me to the file where we load the embedded ZIM files? I have limited knowledge of assets and embedded files.

dr0pdb avatar Mar 03 '18 20:03 dr0pdb

So our current solution is a bit of a hack. We package the ZIM as a .so file and ship it as if it were a native library. It gets extracted automatically on install and we then read it as a ZIM file. If we used assets, the same would happen except that we would do the extraction ourselves. Neither is efficient. We should implement the C++ solution, but it is very challenging.

mhutti1 avatar Mar 03 '18 23:03 mhutti1

This is IMO too complicated a task for GSoC.

kelson42 avatar Mar 04 '18 06:03 kelson42

What is the problem with downloading the zim file once the app is downloaded?

RohanBh avatar Mar 08 '18 17:03 RohanBh

Like when we download big games, game files are downloaded after the app is installed.

RohanBh avatar Mar 08 '18 17:03 RohanBh

Because this is for the specific use case where distribution is offline.

mhutti1 avatar Mar 08 '18 23:03 mhutti1

But they'll need to be online to download the app, right?

RohanBh avatar Mar 09 '18 03:03 RohanBh

@RohanBh Think of just distributing an APK, outside the Google App Store.

kelson42 avatar Mar 09 '18 20:03 kelson42

Gotcha.

RohanBh avatar Mar 09 '18 20:03 RohanBh

@mhutti1 I looked into the C++ approach you mentioned, the problems that I'm seeing are as follows:

Modifications required to zimlib

  • file_compound.cpp reads the ZIM file in parts/sections as required and thus needs the file path to manage indexing. New logic will be required for that.
  • FilePart, FileImpl, File will have to be reimplemented to accommodate the FILE* logic.
  • zim::Reader will have to be modified to accept FILE*

Modifications required to kiwixlib

  • Changes to JNIKiwixReader and reader.cpp to access assets as FILE*

The main area of concern is the modification of zimlib. It would require a lot of work and from then on would have to be maintained by us. I am posting this for anyone looking to go ahead with the C++ approach.

sakchhams avatar Mar 22 '18 19:03 sakchhams

@sakchhams Thanks I thought as much.

mhutti1 avatar Mar 25 '18 01:03 mhutti1

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jun 26 '19 05:06 stale[bot]

It could be possible to use the raw folder to include the ZIM. I think this would be exempt from compression, but it presents the problem that we can only use public InputStream openRawResource (int id) or public AssetFileDescriptor openRawResourceFd (int id) to read from the file. It seems somewhat feasible using the FD and not relying on <android/asset_manager.h>, which would still require significant rework of the C library, as previously stated.

Another, undesirable, solution: @kelson42 mentioned there is a Java version of "kiwixlib", which would simplify matters in one respect. I don't really envision embedded ZIMs being released as part of the 3.1 milestone as of right now.

macgills avatar Oct 22 '19 10:10 macgills

@macgills So the ZIM will be kept in the resource file/blob after installation? In that case the data would not be manipulated/compressed, so if we know the ZIM offset and length it might be directly usable by libkiwix? Would you need a new libkiwix ABI in that case? If "yes", which one?

kelson42 avatar Oct 22 '19 13:10 kelson42

The ZIM file would be an internal resource of the application, so the APK itself would include it. I have thus far been unable to expose the file's actual location on disk, and it might in fact be impossible, but we can definitely expose an InputStream or an AssetFileDescriptor. Right now we construct a reader like so: JNIKiwixReader(zimFile.path). I don't personally know the extent to which we would have to alter kiwixlib, but I doubt it is simple.

There are 3 potential locations for packaging zims in an app

  1. libs - the current solution (broken I think?) where we pretend the zim is a C library so it gets extracted for us
  2. assets - a folder for bundling assets, must be accessed by AssetManager, this seemingly already has a native component we can import? Assets are compressed
  3. res/raw - uncompressed raw asset as described above

The problem with options 2 and 3 is that, to use them with our existing solution, we would have to write the contents out to a new file to expose them to kiwixlib, doubling the space taken on disk.

macgills avatar Oct 22 '19 14:10 macgills

The solution has to allow using the ZIM file directly, so no extraction or copy, which would need additional CPU/disk.

kelson42 avatar Oct 22 '19 17:10 kelson42

Then kiwixlib must work with an InputStream/AssetFileDescriptor/AssetManager is my current assessment

macgills avatar Oct 23 '19 08:10 macgills

@macgills libkiwix deals with a path, and it can probably deal with a path/offset/size (with the ZIM content being the data in the file at that path, starting at the offset and spanning size bytes). Can you provide this?

kelson42 avatar Oct 23 '19 08:10 kelson42
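
To make the path/offset/size idea above concrete, here is a minimal sketch in plain Java (not Android or libkiwix code). It assumes the ZIM bytes sit uncompressed inside a larger container file, and it reads them in place without copying them out. The class name EmbeddedRegionReader and its methods are hypothetical, chosen only for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: exposes a (offset, size) region of a container file as if it were its own file. */
final class EmbeddedRegionReader implements AutoCloseable {
    private final FileChannel channel;
    private final long regionOffset;
    private final long regionSize;

    EmbeddedRegionReader(Path container, long regionOffset, long regionSize) throws IOException {
        this.channel = FileChannel.open(container, StandardOpenOption.READ);
        // Reject regions that do not fit inside the container.
        if (regionOffset < 0 || regionSize < 0 || regionOffset + regionSize > channel.size()) {
            channel.close();
            throw new IllegalArgumentException("region lies outside the container file");
        }
        this.regionOffset = regionOffset;
        this.regionSize = regionSize;
    }

    /** Size of the embedded region in bytes. */
    long size() {
        return regionSize;
    }

    /** Reads length bytes starting at position, where position is relative to the region start. */
    byte[] read(long position, int length) throws IOException {
        if (position < 0 || length < 0 || position + length > regionSize) {
            throw new IllegalArgumentException("read lies outside the embedded region");
        }
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long start = regionOffset + position; // translate to an absolute file position
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, start + buffer.position());
            if (n < 0) {
                throw new EOFException("container ended before the region did");
            }
        }
        return buffer.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

A reader built this way only translates region-relative positions into absolute ones, so nothing is extracted or duplicated on disk. Whether libkiwix can accept such a triple is exactly the open question in this thread.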

No, the "files" do not have a path, they are bundled as a bunch of bytes in the apk. The file descriptor seems feasible as I linked earlier.

AssetFileDescriptor afd = getContext().getAssets().openFd("test.mp3");
setDataSource(afd.getFileDescriptor(), afd.getStartOffset(), afd.getLength());

macgills avatar Oct 23 '19 08:10 macgills

@macgills Of course the file here is not the ZIM file, but some kind of asset file. Here is another doc about how to do something like that: https://groups.google.com/d/msg/android-ndk/ppCEAY6Hpag/OEhxQRx2iY0J. It seems quite clear to me that this is feasible. We need a principle here similar to how we handle video files within ZIM files in Kiwix Android. I just need a clear answer now about how to retrieve this information from the Android SDK/NDK.

kelson42 avatar Oct 23 '19 08:10 kelson42

It is all the bytes of a zim file, the InputStream/AssetFileDescriptor is a logical wrapper around a sequence of bytes contained in the apk not some other type of file. I think the raw folder is best or at least most idiomatic for the intended usage

macgills avatar Oct 23 '19 08:10 macgills

By using the noCompress aapt option in the android closure of build.gradle

  aaptOptions {
    noCompress "zim"
  }

I can confirm we can successfully read a file from either raw or assets as an InputStream or AssetFileDescriptor

      resources.openRawResourceFd(R.raw.svg_test).declaredLength
      resources.openRawResource(R.raw.svg_test).read()
     
      assets.openFd("svg_test.zim").declaredLength
      assets.open("svg_test.zim").read()

macgills avatar Oct 23 '19 11:10 macgills
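
Since an APK is a ZIP archive, one way to double-check that noCompress took effect is to inspect the built APK and see whether the .zim entries are stored rather than deflated. The following is a minimal sketch in plain Java using java.util.zip. The class name StoredEntryCheck and the method storedEntries are hypothetical, and the path to the APK is whatever your build produces.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: lists archive entries that are stored without compression. */
final class StoredEntryCheck {
    private StoredEntryCheck() {
    }

    /** Returns the names of entries ending in suffix whose compression method is STORED. */
    static List<String> storedEntries(Path archive, String suffix) throws IOException {
        List<String> stored = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                boolean matches = entry.getName().endsWith(suffix);
                // STORED means the bytes sit in the archive unmodified.
                if (matches && entry.getMethod() == ZipEntry.STORED) {
                    stored.add(entry.getName());
                }
            }
        }
        return stored;
    }
}
```

Calling StoredEntryCheck.storedEntries(apkPath, ".zim") should list the embedded ZIM if it was packaged uncompressed. An empty result would suggest the entry was deflated, in which case a file descriptor with a usable offset and length would not be available.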

@macgills Thanks. @mgautierfr In the worst case we could have the APK path (instead of the fd, but this would not be the most robust approach).

kelson42 avatar Oct 23 '19 11:10 kelson42

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Dec 22 '19 12:12 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Oct 17 '20 02:10 stale[bot]

Here is the libkiwix-related feature request: https://github.com/kiwix/kiwix-lib/issues/423

kelson42 avatar Nov 11 '20 09:11 kelson42

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jan 10 '21 13:01 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Sep 21 '22 04:09 stale[bot]

@MohitMaliFtechiz @gouri-panda Could you please update this ticket:

  • Do we still use a fake ZIM file as .so?
  • Could we somehow pack the ZIM file in the assets now and avoid the copy/decompression at install time?

kelson42 avatar Aug 28 '23 04:08 kelson42