core-bioimage-io-python

Can I check if my tensorflow model is compatible

Open odinsbane opened this issue 1 year ago • 12 comments

I have a lot of uncertainty in this process. It would be nice if I can debug the various steps.

I have a model, I load images as [t, c, z, y, x] and then I convert them to batches [b, c, z, y, x] using tiling, then I process the batches with the model. The output is then reconstructed into a prediction.

I think I set up the data ok because I no longer get an assertion error about the shape, but when I try to build the model, I get an error saying that my tensorflow directory is a directory.

What do I need to do to specify the weight type?

weight_type="tensorflow_saved_model_bundle",

According to this, bioimageio_weights_spec, I should be able to use a tensorflow folder, but it isn't clear.

odinsbane avatar Nov 21 '22 15:11 odinsbane

Hi @odinsbane, we are happy to help with this, but would need a bit more information:

  • Can you please post the exact error message you are getting?
  • Can you try to put together a minimal example with the code and weights to reproduce this error?

constantinpape avatar Nov 21 '22 17:11 constantinpape

Thank you for the quick response. I can easily put together an example from what I have. Maybe you can help me before I start uploading things.

I have a tensorflow model. There is one input and three outputs. I made a version with one output to get it to work.

I don't mind changing the format, either .h5 or directory.

This is the code I used to build the model.

#!/usr/bin/env python3

from bioimageio.core.build_spec import build_model

build_model(
    # the weight file and the type of the weights
    weight_uri="simple",
    weight_type="tensorflow_saved_model_bundle",
    # the test input and output data as well as the description of the tensors
    # these are passed as list because we support multiple inputs / outputs per model
    test_inputs=["junk.npy"],
    test_outputs=["output.npy"],
    input_axes=["bczyx"],
    output_axes=["bczyx"],
    # where to save the model zip, how to call the model and a short description of it
    output_path="simple-sample.zip",
    name="OrganoidDnaModel",
    description="For segmenting organoid dna in 3D",
    # additional metadata about authors, licenses, citation etc.
    authors=[{"name": "Odinsbane"}],
    license="CC-BY-4.0",
    documentation="organoids.md",
    tags=["nucleus-segmentation"],  # the tags are used to make models more findable on the website
    cite=[{"text": "Smith et al.", "doi": "doi:10.1002/xyzacab123"}],
)

This is the output.

python build_package.py
/home/username/Desktop/bioimage-prep/bioimage-io-env/lib/python3.9/site-packages/bioimageio/spec/shared/_resolve_source.py:433: CacheWarning: found cached /tmp/username/bioimageio_cache/https/raw.githubusercontent.com/bioimage-io/bioimage.io/main/site.config.json. Skipping download of https://raw.githubusercontent.com/bioimage-io/bioimage.io/main/site.config.json.
  warnings.warn(f"found cached {local_path}. Skipping download of {uri}.", category=CacheWarning)
/home/username/Desktop/bioimage-prep/bioimage-io-env/lib/python3.9/site-packages/bioimageio/spec/shared/_resolve_source.py:433: CacheWarning: found cached /tmp/username/bioimageio_cache/https/bioimage-io.github.io/collection-bioimage-io/collection.json. Skipping download of https://bioimage-io.github.io/collection-bioimage-io/collection.json.
  warnings.warn(f"found cached {local_path}. Skipping download of {uri}.", category=CacheWarning)
Traceback (most recent call last):
  File "/home/username/Desktop/bioimage-prep/basic/build_package.py", line 5, in <module>
    build_model(
  File "/home/username/Desktop/bioimage-prep/bioimage-io-env/lib/python3.9/site-packages/bioimageio/core/build_spec/build_model.py", line 813, in build_model
    weights, tmp_archtecture = _get_weights(
  File "/home/username/Desktop/bioimage-prep/bioimage-io-env/lib/python3.9/site-packages/bioimageio/core/build_spec/build_model.py", line 103, in _get_weights
    weight_hash = _get_hash(weight_path)
  File "/home/username/Desktop/bioimage-prep/bioimage-io-env/lib/python3.9/site-packages/bioimageio/core/build_spec/build_model.py", line 32, in _get_hash
    with open(path, "rb") as f:
IsADirectoryError: [Errno 21] Is a directory: '/home/username/Desktop/bioimage-prep/basic/simple'

If I try using a .h5 file (same model, I load it in python, then save it with save_format="h5") I get a different error. What do you think is the best way to proceed?

odinsbane avatar Nov 21 '22 21:11 odinsbane

I tried a version of the build script using an h5 version of the tensorflow model, and everything seems to be working. It created a zip file and I am trying to upload the zip file to bioimage.io (or zenodo via bioimage.io), which suits my purpose well. So maybe the sole purpose of this issue is: why doesn't the directory format work?

odinsbane avatar Nov 21 '22 22:11 odinsbane

> I tried a version of the build script using an h5 version of the tensorflow model, and everything seems to be working.

Ok, note that in this case you need to set the weight format to keras_hdf5.
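To make the pairing of file type and weight format concrete, here is a minimal sketch. The helper name `guess_weight_type` is hypothetical (it is not part of bioimageio.core), and it only covers the two TensorFlow cases discussed in this thread, under the assumption that a `.zip` file holds a SavedModel folder.

```python
from pathlib import Path


def guess_weight_type(weight_uri):
    """Map a weight file name to the weight_type string used in this thread.

    Hypothetical helper: only the two TensorFlow cases from this issue are
    handled, and a .zip is assumed to contain a SavedModel folder.
    """
    suffix = Path(weight_uri).suffix.lower()
    if suffix in (".h5", ".hdf5"):
        # Keras model saved with save_format="h5"
        return "keras_hdf5"
    if suffix == ".zip":
        # Zipped SavedModel directory
        return "tensorflow_saved_model_bundle"
    raise ValueError(f"cannot guess a weight type for {weight_uri!r}")
```

The returned string would then be passed as `weight_type` alongside `weight_uri` in the `build_model` call.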

> So maybe the sole purpose of this issue is, why doesn't the directory format work?

I think that we are expecting a zip file of the weights here. It should work if you zip the folder with the weights and then pass the filepath of the zipped file to the build_model function.
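A minimal sketch of that suggestion, using only the Python standard library. The function names are hypothetical, and placing the folder's contents at the top level of the archive is an assumption about the expected layout, not something the documentation states. The second function illustrates why a plain directory fails in the traceback above: the weights are opened as a file in order to hash them.

```python
import hashlib
import shutil
from pathlib import Path


def zip_saved_model(folder):
    """Zip the contents of `folder` into a sibling `<folder>.zip`; return its path."""
    folder = Path(folder).resolve()
    # make_archive appends ".zip" to base_name; with root_dir set, paths inside
    # the archive are relative to `folder`, so saved_model.pb sits at the root.
    archive = shutil.make_archive(str(folder), "zip", root_dir=folder)
    return Path(archive)


def sha256_of_file(path):
    """Hash a regular file; calling open() on a directory raises an OSError
    (IsADirectoryError on Linux) instead."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

The path returned by `zip_saved_model("simple")` could then be passed as `weight_uri`, with `weight_type="tensorflow_saved_model_bundle"`.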

constantinpape avatar Nov 22 '22 08:11 constantinpape

That worked! I think it worked because it created the bioimage-model zip file.

It could be helpful if the documentation said that the directory needs to be zipped, and that the names in the weights spec page are the values that should be passed to the weight_type parameter. The "weights_spec" appears to be generated documentation (similar to the TensorFlow docs?)

odinsbane avatar Nov 22 '22 08:11 odinsbane

> That worked! I think it worked because it created the bioimage-model zip file.

Ok, thanks for checking!

> It could be helpful if the documentation said that the directory needs to be zipped, and that the names in the weights spec page are the values that should be passed to the weight_type parameter. The "weights_spec" appears to be generated documentation (similar to the TensorFlow docs?)

I agree, it would be good to update this. (And indeed the weight spec documentation is autogenerated.) I am not so familiar with the doc generation procedure, maybe @FynnBe could have a look.

constantinpape avatar Nov 22 '22 09:11 constantinpape

I can close this issue, but I just want to make sure. Is there another way to test the model?

For me I had to work through fixing the data, then fixing the model, then fixing the data. Did you add the h5 support recently?

odinsbane avatar Nov 22 '22 09:11 odinsbane

> I can close this issue, but I just want to make sure.

Let's leave it open until the documentation is updated.

> Is there another way to test the model?

Once you have created the model you can test it via the bioimageio test-model command. See also https://github.com/bioimage-io/core-bioimage-io-python#command-line.
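As a rough illustration, the command can also be run from a script. This sketch assumes the `bioimageio` executable is on the PATH and that the packaged zip is passed as the only positional argument; the wrapper functions are hypothetical, and the linked README remains the reference for the available options.

```python
import shutil
import subprocess


def build_test_model_command(model_zip):
    """Return the argument list for `bioimageio test-model <model_zip>`."""
    return ["bioimageio", "test-model", str(model_zip)]


def run_test_model(model_zip):
    """Run the command and return its exit code (assumes the CLI is installed)."""
    if shutil.which("bioimageio") is None:
        raise RuntimeError("the bioimageio command was not found on PATH")
    return subprocess.run(build_test_model_command(model_zip), check=False).returncode
```

For the package built earlier in this thread, that would be `run_test_model("simple-sample.zip")`.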

> Did you add the h5 support recently?

No, that has been there for a while.

constantinpape avatar Nov 22 '22 09:11 constantinpape

Okay, I finally got around to using the bioimageio test-model! If I use the h5 version of the saved tensorflow model, then the tests pass. When I use the .zip file of the folder saved model I get an error.

error: ('SavedModel file does not exist at: '
'/tmp/username/bioimageio_cache/extracted_packages/f5854334514ff19d36267b690c03df47182881e321ccfdb78b62eebe6f63d3e7/drosophila-crb-d5/{saved_model.pbtxt|saved_model.pb}')

odinsbane avatar Nov 29 '22 22:11 odinsbane

@odinsbane could you share the created zip folder, any additional files that should be included in it, but are not, and your updated build_model call? With that we'll take a closer look at the issue.

FynnBe avatar Nov 30 '22 07:11 FynnBe

It's a large file, so I am looking at uploading it or making an example. Looking at the temp folder, though, I think the problem is the way the file has been zipped.

  • save model as folder "drosophila-crb-d5"
  • zip the folder: zip drosophila-crb-d5.zip drosophila-crb-d5

Maybe I need to zip the contents and remove the top-level folder?
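One way to check this without unpacking anything is to list where the SavedModel file sits inside the archive. A minimal sketch with the standard library follows; the function name is hypothetical.

```python
import zipfile


def saved_model_locations(zip_path):
    """Return the folders inside the zip that hold saved_model.pb or
    saved_model.pbtxt; the empty string means the archive root."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    locations = []
    for name in names:
        # Zip member names always use "/" as the separator
        parent, _, base = name.rpartition("/")
        if base in ("saved_model.pb", "saved_model.pbtxt"):
            locations.append(parent)
    return locations
```

A result of `[""]` means the file is at the top level, while `["drosophila-crb-d5"]` means it is nested one folder down. An empty list means the archive contains no SavedModel file at all, which can happen when a folder is zipped without recursing into it (`zip` needs `-r` for that).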

odinsbane avatar Nov 30 '22 09:11 odinsbane

> Maybe I need to zip the contents and remove the top-level folder?

that's probably it. 👍

I can't improve this right now, but here are some notes from looking into it briefly, if someone else wants to tackle it already:

We should improve the TF model adapter here: https://github.com/bioimage-io/core-bioimage-io-python/blob/bc25746cad127e6a54d0885c2212aa8f50a049f8/bioimageio/core/prediction_pipeline/_model_adapters/_tensorflow_model_adapter.py#L65-L66

here weight_file is a folder in which we should look for the expected saved_model.pb file and possibly add the nested folder to its path.
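A rough sketch of what such a lookup could look like. This is not the adapter's actual code: the function name is hypothetical, and searching only one level of nesting is an assumption.

```python
from pathlib import Path

SAVED_MODEL_FILES = ("saved_model.pb", "saved_model.pbtxt")


def find_saved_model_dir(weight_dir):
    """Return the folder that holds the SavedModel file, looking in
    `weight_dir` itself and then in its direct subfolders."""
    weight_dir = Path(weight_dir)

    def has_saved_model(folder):
        return any((folder / name).is_file() for name in SAVED_MODEL_FILES)

    if has_saved_model(weight_dir):
        return weight_dir
    # Handle archives that were zipped together with their top-level folder
    candidates = [d for d in sorted(weight_dir.iterdir()) if d.is_dir() and has_saved_model(d)]
    if len(candidates) == 1:
        return candidates[0]
    raise FileNotFoundError(f"expected exactly one SavedModel folder in or directly below {weight_dir}")
```

Raising when several subfolders match keeps the choice explicit instead of silently picking one.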

We potentially unzip nested zip files (calling self.require_unzipped twice, which we probably shouldn't)

We also implicitly assume the use of tensorflow 1: https://github.com/bioimage-io/core-bioimage-io-python/blob/bc25746cad127e6a54d0885c2212aa8f50a049f8/bioimageio/core/prediction_pipeline/_model_adapters/_tensorflow_model_adapter.py#L34-L35

Maybe https://www.tensorflow.org/api_docs/python/tf/saved_model/save is a relevant reference for updating this part.

FynnBe avatar Nov 30 '22 10:11 FynnBe