coralnet
Annotation by semantic segmentation
We keep getting asked about this, and for good reason. Segmentation gives much more information about an image than point annotation does, which can be desirable for small-scale, detailed studies. In theory, this extra information should also yield better-quality training for robots.
Segmentation would be a different annotation mode from points. For one, the annotation tool would work very differently. "Done annotating" would mean not labeling all the points in the image, but rather defining labeled segments that cover every pixel of the image. (One quick way to do this is to outline just the coral colonies, then select the remainder of the image as a single segment labeled 'Other'.) As for how drawing segments should even work, I'll have to look at existing segmentation UIs to get a better idea. Another open question is how to offer robot suggestions for segments.
On the database level, we would probably allow each image to save either segment annotations or point annotations, but not both. Either way, we can compute the cover of each type of coral, algae, etc.: for segmentation, we compute it as if we had done point annotation with one point per pixel.
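The one-point-per-pixel idea makes percent cover a straightforward pixel count. A minimal sketch (the label names and the mask layout here are hypothetical):

```python
from collections import Counter

def percent_cover(mask):
    """Compute percent cover per label from a 2D label mask.

    mask: list of rows, where each entry is a label name (one per pixel).
    Returns a dict mapping label -> percentage of pixels with that label.
    """
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}

# A tiny 2x4 mask: 3 'Coral' pixels, 5 'Other' pixels.
mask = [
    ["Coral", "Coral", "Other", "Other"],
    ["Coral", "Other", "Other", "Other"],
]
print(percent_cover(mask))  # {'Coral': 37.5, 'Other': 62.5}
```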
In the create-source form, the user can select whether they want segments or points by default for uploaded images. One question is whether a source should be allowed to have both segments and points, i.e. whether the default could ever be changed after source creation.
- If the default can be changed, the Browse Patches grid could potentially have to show both points and segments on the same page. But the same goes for the "example patches" section on a label-detail page regardless of our decision here, so we'll have to figure out how to do this either way. It doesn't seem too hard anyway: just put an 'if points, else segments' condition in the for-loop that goes over the grid.
- Overall I can't think of a serious technical problem with allowing both in the same source. The question is whether this ability would be more confusing than helpful for users. We could ask around perhaps.
Speaking of patches, I think the way to do this is to show the entire segment in the patch, leaving at least a small border on all sides so we can see what's around the segment. And for a label-detail page's "example patches", we might consider prioritizing segments over points there, because segments are better learning material.
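Concretely, the patch could be the segment's bounding box plus a fixed border, clamped to the image bounds. A sketch (the border size and the (x, y) pixel-list representation are assumptions):

```python
def segment_patch_box(pixels, img_w, img_h, border=10):
    """Bounding box around a segment's pixels, padded by `border` on all
    sides and clamped to the image. `pixels` is a list of (x, y) tuples.
    Returns (left, top, right, bottom), with right/bottom exclusive."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    left = max(min(xs) - border, 0)
    top = max(min(ys) - border, 0)
    right = min(max(xs) + 1 + border, img_w)
    bottom = min(max(ys) + 1 + border, img_h)
    return left, top, right, bottom

# A segment near the top-left corner of a 100x100 image: the border
# gets clipped at the image edges.
print(segment_patch_box([(5, 5), (12, 8)], 100, 100))  # (0, 0, 23, 19)
```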
For Annotation History, the simplest thing to do is report the percent-cover of each label according to the defined segments. If we wanted to get fancy, we could consider showing a small rectangle with the segment shapes colored in.
So we've been discussing supporting segmentation via a new API endpoint, which seems fairly straightforward on my end. We just have to agree on a format in which we present segmentation output.
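As a starting point for that format discussion, the endpoint could return per-segment polygons with labels, along the lines of the sketch below. This schema is purely hypothetical, not an agreed format:

```python
import json

# Hypothetical response for one image: each segment is a labeled polygon,
# with vertices given in pixel coordinates.
response = {
    "image_id": 12345,
    "segments": [
        {"label": "Acropora", "polygon": [[10, 10], [60, 12], [55, 48], [8, 40]]},
        {"label": "Other", "polygon": [[0, 0], [100, 0], [100, 100], [0, 100]]},
    ],
}
print(json.dumps(response, indent=2))
```

Polygons keep the payload small for simple shapes; a per-pixel mask encoding is the obvious alternative if segments get intricate.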
After that, we might expose segmentation in the web UI. From emails:
I guess one more opportunity is whether to expose the semantic segmentation API through the UI: an ability to semantically segment images on CoralNet, view the image/SS on CoralNet, and export the SS images. It would advertise the semantic segmentation capability and allow users to easily assess its utility without any scripting. They may already have complete point annotations, but could then see the segmented images.
It would require us to enable some way to download the segmented images as well, because once they see the magic, they will want a piece of it! But that is OK, since we wanted to enable downloading the original images too, so we can use the same code for both purposes.
In other words, we have an interface where we call the segmentation API, display the segments over the image, and provide a button to download the result.
This sounds like a suitable short-term stance on supporting segmentation, since if we REALLY went all-in with segmentation (see the original post here), it would probably be more work than multiple labelsets per source.
A few comments on this.
- We will need to pre-compute the seg. masks for all images; the user will not want to wait for the job to complete. This could get quite expensive, since we'd need to re-process all images in the source every time a new classifier is accepted for the source.
- When an image is annotated with points, we will need some way to reconcile the seg. mask with the human annotations. Or we could just leave the inconsistencies.
- Even with the "button to download the result" for a single image, we will also need to allow bulk-download.
- We will need to think about the format for download. The most obvious way is to use an integer-image, but that is not readily viewable in a standard image-viewer.
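On the point-annotation inconsistencies mentioned above, one lightweight option is to at least surface them: for each human-annotated point, compare the point's label against the mask's label at that pixel. A sketch (the data layout is hypothetical):

```python
def point_mask_conflicts(points, mask):
    """Find human-annotated points whose label disagrees with the
    segmentation mask. `points` is a list of (x, y, label) tuples;
    `mask` is a 2D list of labels indexed as mask[y][x]."""
    return [
        (x, y, label, mask[y][x])
        for x, y, label in points
        if mask[y][x] != label
    ]

mask = [
    ["Coral", "Coral"],
    ["Sand", "Sand"],
]
points = [(0, 0, "Coral"), (1, 1, "Coral")]
print(point_mask_conflicts(points, mask))  # [(1, 1, 'Coral', 'Sand')]
```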
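One way to make an integer-image mask viewable is to map each label ID to an RGB color before saving (with PIL this could become a paletted PNG; the color table here is hypothetical):

```python
# Hypothetical label-ID -> RGB color table.
PALETTE = {
    0: (0, 0, 0),        # Other: black
    1: (255, 127, 0),    # Coral: orange
    2: (0, 200, 0),      # Algae: green
}

def colorize(mask):
    """Turn a 2D integer label mask into a 2D mask of RGB tuples,
    suitable for saving as a normal, viewable image."""
    return [[PALETTE[label_id] for label_id in row] for row in mask]

mask = [[1, 1, 0], [2, 0, 0]]
print(colorize(mask))
```

The raw integer mask could still be offered alongside the colorized one for users who want to process it programmatically.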
Just adding some thoughts from the most recent meeting. We arrived at some sort of compromise, where segmasks will not automatically be computed, but the user will be able to request such computation, and then go back and view the results when available. We will also allow download.
Here's one simple way the website-based UI/workflow can work, although probably not optimal, so feel free to suggest things:
- When you are looking at an image-specific page (Image Details, Annotation Tool, Annotation History), there's currently a menu of green buttons which let you jump to those 3 pages. We can add another green button for 'Segmentation'. Click that button to go to the segmentation page for that image.
- The segmentation page has a button to generate segments for the image. This starts the segmentation job, and the page then says you'll have to come back later to get the result.
- When the segmentation job is finished, the segmentation result is stored in the database. (As a model field of Image? A model field of Features? Or a separate model with a relationship to one of those?) Also, the source's Newsfeed is updated with an entry saying that segmentation has finished for that image.
- If you visit the segmentation page after segments have been generated for the image, the page displays the segments and provides a download button. There is still a button to re-generate segments, and the page shows the date that the current segments were generated.
The biggest limitation here is that this only lets you generate and download segments one image at a time. Though, we are providing the API endpoint for anyone who wants to work faster, so that might be OK.
Yeah, I'm torn about this. I'm leaning towards a button that says "generate and export all segmentation masks" (probably on the export page). This would generate all the masks and put them somewhere for the user to download. That somewhere could be a Google Drive folder (which we delete after, say, 1 week), or a temporary S3 bucket, or something else. It's a bit tricky, but seems worth a try. We could then use the same mechanism for bulk image download.
That's quite an interesting idea. Keeping it on S3 seems easiest in terms of implementation, so we don't have to learn how to use the Google Drive API. But maybe Google Drive's feature to download an entire folder would be helpful (assuming S3 doesn't have that)...
I wonder what filesizes we're talking about for these segmentation masks. If I'm thinking about this correctly, a mask would look something like the result of a map-coloring problem, right? If so, it could just be a few KB per image, in which case it'd be easy to provide a ZIP file with thousands of images' masks.
Presumably you don't want to use lossy compression for the masks, but they should be quite compressible with some kind of underlying run-length encoding. Nonetheless, it's probably best to use a standard, easy-to-view format, whether JPG, PNG, or GIF.
I was thinking lossless compression, yeah. If you have an image which is just patches of solid color, then PNG can be quite small. But I guess we'll see how it turns out.
Yeah, I'd also say png or run-length encoding.
Let's see if there is an option to use S3 then. Could just be a randomly-named bucket with public access that we then delete after a short while. We just need to check what the UI looks like for non-coders.
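For reference, run-length encoding of a mask is only a few lines: a row of mostly-constant labels collapses into (label, count) pairs. A sketch of the idea:

```python
from itertools import groupby

def rle_encode(row):
    """Run-length encode one row of a label mask into (label, count) pairs."""
    return [(label, len(list(run))) for label, run in groupby(row)]

def rle_decode(pairs):
    """Invert rle_encode back into the original row."""
    return [label for label, count in pairs for _ in range(count)]

row = [1, 1, 1, 1, 0, 0, 2]
encoded = rle_encode(row)
print(encoded)  # [(1, 4), (0, 2), (2, 1)]
assert rle_decode(encoded) == row
```

PNG's filtering plus DEFLATE achieves a similar effect on flat-color regions, which is why masks saved as PNG tend to come out small.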
As I think about it more, we probably want to support a mode for semantic segmentation where we return the scores for each label at each pixel, not just the single label at each pixel. This could be important for annotating orthomosaics/SfM results, where we integrate the scores across images rather than just voting across labels. If the labelset has k labels, this output could be a k-channel image of floats, or else k one-channel images of floats that could be compressed as JPG. Definitely bigger.
David
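The single-label mask is just the argmax of those per-pixel scores, so keeping the full scores is strictly more information for downstream consumers to integrate across overlapping images. A sketch (the labels and scores are hypothetical):

```python
def argmax_mask(score_mask, labels):
    """Collapse per-pixel label scores into a single-label mask.

    score_mask[y][x] is a list of k scores, one per entry in `labels`.
    Returns a 2D mask of the highest-scoring label at each pixel."""
    return [
        [labels[max(range(len(scores)), key=scores.__getitem__)]
         for scores in row]
        for row in score_mask
    ]

labels = ["Coral", "Algae", "Other"]
# One row, two pixels, with k=3 scores each:
score_mask = [[[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]]
print(argmax_mask(score_mask, labels))  # [['Coral', 'Other']]
```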
Love the idea of k jpeg-compressed images. Add some extra spatial smoothing. haha.