libact
Is there a way to perform batch mode active learning?
Hi,
Instead of having unlabeled data that comes in as a stream, I would like to know if there is a way with libact to perform batch mode active learning, meaning that the user can select multiple images at once (positives and negatives)?
Thank you in advance.
We haven't officially supported batch mode active learning yet.
Though some algorithms can achieve this with a slight modification. Taking uncertainty sampling as an example, you may change the following line
https://github.com/ntucllab/libact/blob/master/libact/query_strategies/uncertainty_sampling.py#L111
by replacing np.argmin with something like an n-smallest selection to return the indices of the n most uncertain samples.
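For instance, a minimal sketch of that change outside of libact, using np.argsort on a hypothetical score array (following the convention above that a smaller score means a more uncertain sample; the score convention in the actual file may differ):

```python
import numpy as np

# Hypothetical uncertainty scores for six unlabeled samples;
# here, a smaller score means a more uncertain sample.
scores = np.array([0.9, 0.1, 0.5, 0.05, 0.7, 0.2])

n = 3  # desired batch size

# np.argmin returns a single index; np.argsort lets you take the
# indices of the n smallest scores in one shot.
batch_ids = np.argsort(scores)[:n]
print(batch_ids.tolist())  # [3, 1, 5]
```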
Thank you, I will try it. When do you think you will officially support batch mode active learning?
My question may be silly, but I made some changes inside the uncertainty_sampling file, then ran sudo python setup.py build and sudo python setup.py install, and my changes weren't taken into account... Whatever I do, when I run a Python example, nothing changes. Do you know why? Or did I maybe miss something?
You may want to check your environment variables like PYTHONPATH or PATH. Or maybe you are inside a virtualenv where you don't need sudo to install the package.
Yes, the problem was the virtual environment. Thanks for your help!
So as not to open a new thread: is there a way to associate an image with each feature? I mean, if I don't use pixels as features, I cannot use InteractiveLabeler. For example, if I use an abstract feature like SIFT, how can I associate each feature with its image? I tried to use a dictionary but it doesn't work... Thanks in advance.
Do you mean that the feature is separate from the pixel array?
If so, I think you can try passing the corresponding pixel array to the label function of InteractiveLabeler.
For the label_digits example: https://github.com/ntucllab/libact/blob/master/examples/label_digits.py#L90
instead of lb = lbr.label(trn_ds.data[ask_id][0].reshape(8, 8)), you may put lb = lbr.label(feature_to_image(trn_ds.data[ask_id][0])).
If your image is not in the form of a pixel array, you may need to modify the image rendering part of InteractiveLabeler: https://github.com/ntucllab/libact/blob/master/libact/labelers/interactive_labeler.py#L32
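As a sketch of the feature_to_image helper mentioned above (the name and data are hypothetical), one way to make a dictionary mapping work is to key it on the bytes of the feature arrays, since numpy arrays themselves are not hashable, which may be why the dictionary attempt failed:

```python
import numpy as np

# Hypothetical setup: one abstract feature vector per image. The image
# objects here are placeholder strings standing in for pixel arrays.
features = [np.array([0.1, 0.2]), np.array([0.3, 0.4])]
images = ["image1", "image2"]

# numpy arrays are not hashable, so a plain dictionary keyed on the
# arrays themselves raises TypeError; keying on their bytes works,
# as long as the query vector is bit-identical to the stored one.
lookup = {f.tobytes(): img for f, img in zip(features, images)}

def feature_to_image(feature):
    """Return the image associated with a feature vector."""
    return lookup[np.asarray(feature).tobytes()]

print(feature_to_image(np.array([0.3, 0.4])))  # image2
```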
Yes. For example, I have an array of images (not in pixel form) and an array of associated features: images = [image1, ..., imageN] and features = [features1, ..., featuresN].
So yes, I would like to create a Dataset object, but after doing that I lose track of the corresponding images, since the Dataset only holds the pool X and the labels Y. There is no way to know which X corresponds to which image, if you see what I mean. I will try what you advised.
The simplest way of doing it is to take featureX, search through [features1, ..., featuresN] one by one for its index, and then go back to imageX.
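A minimal sketch of that linear search, with hypothetical toy data in place of real images and SIFT-like features:

```python
import numpy as np

# Hypothetical parallel lists: features[i] was extracted from images[i].
features = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
images = ["image1", "image2", "image3"]

def image_for_feature(feature_x):
    """Scan the feature list for feature_x and return the matching image."""
    for i, f in enumerate(features):
        if np.array_equal(f, feature_x):
            return images[i]
    raise ValueError("feature not found")

print(image_for_feature(np.array([0.0, 1.0])))  # image2
```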
Though if I remember correctly, the indices of the features in the dataset won't change during the process. This means that if you create the dataset as Dataset(features, Y), the index returned by i = qs.make_query() should be the same as the original one: trn_ds.data[i][0] should be the same as features[i], so you can show images[i].
If the first method is too slow for your application, you can use the second one instead; you can double-check that the indices really stay fixed by running it on some small samples.
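The second method can be sketched like this, with hypothetical toy data and a stand-in for qs.make_query() (not the real libact API): because the dataset indices stay fixed, the queried index addresses the features and the images alike.

```python
# Hypothetical sketch: dataset indices stay fixed, so the index
# returned by the query strategy works for features and images alike.
images = ["image0", "image1", "image2"]
features = [[0.0], [1.0], [2.0]]
labels = [None, 1, None]  # in libact: trn_ds = Dataset(features, labels)

def make_query():
    """Stand-in for qs.make_query(): index of the first unlabeled sample."""
    return next(i for i, y in enumerate(labels) if y is None)

i = make_query()               # i == 0 here
print(features[i], images[i])  # [0.0] image0
```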