AzureStorageExplorer
Search by more than just prefix (extension, wildcard, etc.)
Describe the solution you'd like
First, it would be nice if Ctrl+F took you to the search box. Second, it would be very nice to search by more than just prefix (maybe add an option to choose the kind of search).
If the current behavior is intended, maybe Ctrl+F could be used to search the way it does in a web browser, just within the current view.
Hi @ohadbitt, are you trying to search a:
- blob container
- file share
- adls gen2 blob container
- something else?
Having the same problem when trying to find a file in a blob container. I don't understand what use prefix search has in any context as the default setting.
Blob files inside a container
Totally agree, this needs to be expanded to support wildcard searches.
Any update on this? We have thousands of files and some of our developers are not confident using command line azcopy. We need a proper search that has wildcard support. Postfix would be infinitely more useful than prefix. Is there anything we could publicly modify to add support for this?
+1 for this. Bummer it doesn't exist.
a huge bummer, would really need this feature
Any update? Being developed? Not being developed? Any reason?
Not being developed at this time, but it is something we (or at least I) try to think about every now and then.
Without API support, implementing this would not be a very good user experience for many of you. We can only list blobs 5000 at a time. So for a container with say a million blobs (not an uncommon number, many people have even more), that would take 200 API calls, that in many situations we cannot easily do in parallel. So if every API call takes 0.5-1 sec, well...you can do the math. And that's just for people with good internet. Imagine those on slower networks.
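To put rough numbers on the estimate above, here is a back-of-the-envelope sketch. The 5000 page size and the 0.5-1 sec per call come from the comment; the function name is made up for illustration, and it assumes the calls run strictly one after another.

```python
import math


def sequential_listing_time(blob_count, page_size=5000, secs_per_call=(0.5, 1.0)):
    """Estimate (calls, min_secs, max_secs) for listing every blob page by page.

    Assumes no parallelism and a fixed latency range per call.
    """
    calls = math.ceil(blob_count / page_size)
    low, high = secs_per_call
    return calls, calls * low, calls * high


calls, low, high = sequential_listing_time(1_000_000)
print(f"{calls} calls, roughly {low:.0f}-{high:.0f} seconds")
```

So a single wildcard search over a million blobs would take somewhere between about a minute and a half and over three minutes before it could show complete results, and that is on a good connection.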
And then you think "well you could just list everything once when you open the container and cache it", we run into problems like:
- Where do we cache it such that it can quickly be accessed and not cause other issues (slow down, too much memory, using too much disk space)
- If we do it on disk, then we need to clean up the cache at some point, because there are going to be customers out there who do not want a list of all the blobs in their containers persisted on their disk. But when do we clean up? If the cache is big enough, it could take some time (especially on Windows, which accounts for the majority of installs) to clean that up. We don't want to make people wait on us to shut down.
- How do we communicate to people if the caching is done, or why it is taking so long?
- At some point the cache will be dirty and need to be rebuilt, but there's no good way to do that for just a small portion, so you just have to do all of it over again.
Also, if someone opens a bunch of containers, that's going to be a bunch of high-data network calls all trying to happen in parallel. That's going to cause perf issues not only with respect to your network, but, depending on how we cache things, disk or memory issues as well.
So ya, we (the Storage Explorer team) understand how much this would be helpful to y'all, but one thing we've learned over the years is that we can't ship an experience that is only good for some users/we have to be careful about what features we build that aren't supported by the Storage APIs.
I think if I was building on-top of Azure Storage today I would:
- name my blobs such that I can take advantage of prefix listing
- look into leveraging blob tags: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-index-how-to?tabs=azure-portal (those docs don't show it, but you can filter by tags in storage explorer)
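As a minimal sketch of the first suggestion, the idea is to put whatever you expect to search by at the front of the blob name, so that a prefix filter can do the work. The naming scheme, the function name, and the sample values below are all hypothetical, and the local `startswith` check only stands in for what a server-side prefix listing would return.

```python
def build_blob_name(category, date_iso, filename):
    """Build a blob name whose leading segments are the attributes to search by.

    date_iso is expected in YYYY-MM-DD form (an assumption of this sketch).
    """
    year, month, day = date_iso.split("-")
    return f"{category}/{year}/{month}/{day}/{filename}"


names = [
    build_blob_name("invoices", "2023-11-02", "a17.pdf"),
    build_blob_name("invoices", "2023-10-30", "a16.pdf"),
    build_blob_name("logs", "2023-11-02", "app.log"),
]

# A prefix search for "all invoices from November 2023".
prefix = "invoices/2023/11/"
print([name for name in names if name.startswith(prefix)])
```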
Of course, obviously it might be too late for you to do that if the thing you work on is already built so, sorry if that's the case. :(
We'll keep passing this feedback along to the service team (we really do!), and as soon as they add support, or as soon as we can figure out a good client side solution to the problem we'll get the work scheduled.
I'm sorry to not have a better answer for you. :(
PS: if you think you have a good idea for how to do this client side, feel free to share. :)
PPS: if you think you have a good idea for how to do this client side, and you want to do it on your own/not wait for us/prove us wrong, consider signing up for this preview: https://github.com/microsoft/AzureStorageExplorer/issues/4142 I could see a "search container" extension being a thing.
Thank you for the detailed explanation. It helps us at least understand that it is not a simple job.
What about leveraging the Azure Purview API for search (if Purview is used and the Storage account is part of the Purview scan), or leveraging Cognitive Search (e.g. provide an option to enable Cognitive Search from within Storage Explorer)?
PS: if you think you have a good idea for how to do this client side, feel free to share. :) PPS: if you think you have a good idea for how to do this client side, and you want to do it on your own/not wait for us/prove us wrong, consider signing up for this preview: #4142 I could see a "search container" extension being a thing.
I have an idea that would be a good workaround. Why not create a way in the API to export a list of the blob names to a text file? That way we can import the text file into an editor that does let us perform a search on the string we're looking for.
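For anyone who wants to try this workaround outside of Storage Explorer, here is a rough sketch of the export step. It is not an existing Storage Explorer or service feature. The function name is made up, and it takes any iterable of names so it runs without Azure access; if you use the azure-storage-blob Python SDK, the names could come from `ContainerClient.list_blobs()`, which is an assumption about your setup.

```python
import tempfile
from pathlib import Path


def export_blob_names(names, out_path):
    """Write one blob name per line to out_path and return the number written."""
    count = 0
    with Path(out_path).open("w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")
            count += 1
    return count


# Stand-in data; in practice this would be the names from a container listing.
sample_names = ["images/cat.png", "images/dog.png", "docs/readme.txt"]
out_file = Path(tempfile.gettempdir()) / "blob_names.txt"
written = export_blob_names(sample_names, out_file)
print(f"wrote {written} names to {out_file}")
```

The resulting file can then be opened in any editor or searched with whatever tool you are comfortable with.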
2nd idea: why not allow wildcard searches on just the portion that you currently have loaded? You don't need to do the search on millions of blobs. Narrow it down first using the filter, THEN do a wildcard search on just that portion. Even that would help us out greatly.
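A minimal sketch of what that second idea could look like, assuming the already-loaded blob names are available as a plain list. The function name and sample names are hypothetical, and this says nothing about how Storage Explorer stores its view internally.

```python
from fnmatch import fnmatchcase


def filter_loaded_names(loaded_names, pattern):
    """Return the already-loaded names that match a shell-style wildcard pattern.

    Uses case-sensitive matching; '*' also matches '/' characters.
    """
    return [name for name in loaded_names if fnmatchcase(name, pattern)]


# Names that a prefix filter such as "reports/2023/" has already loaded.
loaded = [
    "reports/2023/summary.csv",
    "reports/2023/raw.json",
    "reports/2023/q4/summary.csv",
]
print(filter_loaded_names(loaded, "*.csv"))
print(filter_loaded_names(loaded, "*q4*"))
```

This only searches what is already in memory, so it makes no extra API calls, but it also cannot find blobs that have not been loaded yet.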
+1 adding my support - prefix search is practically useless - needs to allow full wildcard functionality - given the additional features of the November 2023 update, this seems like it should be a step nearer
I can see that this is closed; however, can we get an answer as to why it was closed? We are literally writing our own third-party tool to do this, as it's necessary to find files at times. I feel it should be up to the user to decide whether listing blobs 5000 at a time is too slow or not.
I also stumbled across this limitation recently in Blob Storage and was surprised that no wildcard syntax is supported.