autoalbument
Fix examples and input shapes
Hi, I noticed two issues with the docs / comments:
- The PascalVOC example (https://albumentations.ai/docs/autoalbument/examples/pascal_voc/) is missing the `_target_: autoalbument.faster_autoaugment.models.SemanticSegmentationModel` line in the `semantic_segmentation_model` section. The search fails without that line.
- When you generate a new `dataset.py` file, the comments say that "mask should be a NumPy array with the shape [height, width, num_classes]" and "image should be a NumPy array with the shape [height, width, num_channels]". However, it looks like channels should come first, i.e. [channels, height, width]; this was the only combination that worked for me. Also, I think the comment "If an image contains three color channels" could be rephrased: it suggests that e.g. single-channel images are accepted, but in fact the input currently seems to always require 3 channels.
Thanks
Hey, @jwitos, thanks for the report!
- Yes, the docs are currently outdated. I am planning to rework them soon and automatically publish the up-to-date configs from the repo.
- Could you please provide an example of your `dataset.py`? AutoAlbument expects images and masks returned by that dataset to have the shape [height, width, num_channels]. AutoAlbument will then create a transformation function using this method. That function contains the ToTensorV2 transform from Albumentations, whose purpose is to change the NumPy array dimensions from [height, width, num_channels] to [num_channels, height, width] and then convert the array to a PyTorch Tensor (so basically convert a regular NumPy array with an image or mask to the format expected by PyTorch). The dataset implementation should use that transform function for all images and masks that it returns (e.g., https://github.com/albumentations-team/autoalbument/blob/master/examples/pascal_voc/dataset.py#L86)
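To illustrate the expected data flow, here is a minimal sketch of a dataset that keeps its arrays channels-last and leaves the reordering to the transform it receives. It is not the actual AutoAlbument code: the `SearchDataset` class and the `to_channels_first` function are hypothetical, and `to_channels_first` is a NumPy-only stand-in that mimics just the dimension change described above, without the conversion to a PyTorch Tensor.

```python
import numpy as np


def to_channels_first(image, mask):
    """Hypothetical stand-in for the transform passed to the dataset.

    It only mimics the dimension change described above:
    [height, width, channels] -> [channels, height, width].
    The real transform would also convert the arrays to PyTorch tensors.
    """
    return {
        "image": np.transpose(image, (2, 0, 1)),
        "mask": np.transpose(mask, (2, 0, 1)),
    }


class SearchDataset:
    """Hypothetical dataset that stores images and masks channels-last."""

    def __init__(self, images, masks, transform=None):
        self.images = images
        self.masks = masks
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]  # [height, width, num_channels]
        mask = self.masks[index]    # [height, width, num_classes]
        # The dataset never transposes by itself; the transform does it.
        if self.transform is not None:
            transformed = self.transform(image=image, mask=mask)
            image, mask = transformed["image"], transformed["mask"]
        return image, mask


# Dummy data: one 64x48 RGB image and a mask with 21 classes.
images = [np.zeros((64, 48, 3), dtype=np.uint8)]
masks = [np.zeros((64, 48, 21), dtype=np.float32)]

dataset = SearchDataset(images, masks, transform=to_channels_first)
image, mask = dataset[0]
print(image.shape, mask.shape)
```

If the dataset already returned channels-first arrays, the transform would reorder them a second time, which may explain shape mismatches when the transform is skipped or applied to pre-transposed data.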
> Also, I think that comment "If an image contains three color channels" could be rephrased -- it suggests that e.g. single-channel images are accepted, but in fact currently input probably always requires 3 channels.
Yes, I will rephrase it, thanks. In fact, it is possible to use single-channel images, but then you need to define a custom model that works with those single-channel images. I am planning to document such an option.
- Fixed. The documentation at https://albumentations.ai/docs/autoalbument/examples/list/ now contains the latest version of configs from the repository.