nerfstudio
Semantic segmentation vs Semantic thing / class
I'm currently looking at the use of semantic segmentation, and there doesn't seem to be a good use case for splitting the segmentation into thing / stuff classes. Given the assumption that a 3D point can only belong to a single semantic class, this should always be true? I was looking into integrating semantic NeRF into my use case and find that the split into things / stuff seems unnecessary and adds complexity, whereas I would like to just load the semantic labels as-is and have a single head perform semantic segmentation. I could add an additional head on top of thing / stuff, but this seems unnecessary and adds more complexity to the code. What do people think about changing the structure of the Semantics class to the following?
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

import torch


@dataclass
class Semantics:
    """Dataclass for semantic labels."""

    filenames: List[Path]
    """filenames to load "stuff"/background data"""
    classes: List[str]
    """class labels for semantic data"""
    colors: torch.Tensor
    """color mapping for semantic classes"""
    # Mutable defaults must use default_factory; a bare `= []` raises ValueError in a dataclass.
    is_thing: List[bool] = field(default_factory=list)
    """Optional flag to indicate if the class is a thing; can be used for use cases that require this"""
    class_loss_weights: List[float] = field(default_factory=list)
    """Loss weighting for each class; if empty, uniform weighting is assumed"""
This should be flexible enough to support the current use case and would allow simplifying the model code quite a bit. I'm happy to self-assign this as well if there is consensus on this design decision. Thanks!
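To make the two optional fields a bit more concrete, here is a minimal sketch of how they might be consumed. The helper names `resolve_class_weights` and `split_things_and_stuff` are hypothetical and not part of nerfstudio, and treating an empty `is_thing` list as "no thing classes" is an assumption made only for this illustration.

```python
from typing import List, Tuple


def resolve_class_weights(classes: List[str], class_loss_weights: List[float]) -> List[float]:
    """Return one loss weight per class; an empty list means uniform weighting."""
    if not class_loss_weights:
        return [1.0] * len(classes)
    if len(class_loss_weights) != len(classes):
        raise ValueError("class_loss_weights must have one entry per class")
    return list(class_loss_weights)


def split_things_and_stuff(classes: List[str], is_thing: List[bool]) -> Tuple[List[str], List[str]]:
    """Recover a thing/stuff split from the per-class flags.

    Assumption for this sketch: an empty is_thing list means every class is stuff.
    """
    if not is_thing:
        return [], list(classes)
    if len(is_thing) != len(classes):
        raise ValueError("is_thing must have one entry per class")
    things = [name for name, flag in zip(classes, is_thing) if flag]
    stuff = [name for name, flag in zip(classes, is_thing) if not flag]
    return things, stuff
```

The resolved weights could then be handed to a single semantic head's loss, for example as the `weight` argument of `torch.nn.CrossEntropyLoss` after converting the list to a tensor, while models that still need the thing / stuff distinction could derive it from the flags instead of loading two separate label sets.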
I have outlined a proposal in this PR: https://github.com/nerfstudio-project/nerfstudio/pull/886. It omits the is_thing and class_loss_weights fields above, but they can be added pretty trivially if required.
Hey @nikmo33, I just replied to your PR! We can discuss there. 🙂 I don't think we need the is_thing and class_loss_weights, but it would be nice to have an arbitrary number of things returned from the InputDataset. Happy to work together on this.