Make 'is_collection' return False for non-collection items (rather than simply being absent)

Open Aubreymcfato opened this issue 6 years ago • 2 comments

Not sure if I'm missing something, but is it possible to run a query and access the 'is_collection' parameter? My use case: I want to download metadata from a collection, but only for the collections it contains, not for single items.

I haven't found a way to do it yet. I did discover the 'is_collection' parameter in the JSON, but it doesn't seem to be accessible.

Aubreymcfato, Apr 06 '18 09:04

is_collection should be available as an attribute on the Item object if the item is a collection. If it's absent, it's not a collection. For example:

In [1]: from internetarchive import get_item

In [2]: item = get_item('nasa')

In [3]: item.is_collection
Out[3]: True

If you try to access the attribute on a non-collection item, and the attribute is absent, it will throw an AttributeError exception. Now that I think of it, that's not very helpful. It should be set to False for non-collection items. I'll keep this issue open, so we can address that in a future release.
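Until that changes, one way to avoid the AttributeError is to read the attribute with a default. The following is a minimal sketch, not part of the library: the helper name `is_collection` and the `SimpleNamespace` stand-ins are assumptions used only for illustration, and real objects would come from `get_item()`.

```python
from types import SimpleNamespace


def is_collection(item):
    """Return True only if the item exposes a truthy is_collection attribute."""
    # getattr with a default avoids the AttributeError raised when the
    # attribute is absent on non-collection items.
    return bool(getattr(item, "is_collection", False))


# Stand-ins for Item objects (hypothetical; real ones come from get_item()).
collection_like = SimpleNamespace(identifier="nasa", is_collection=True)
plain_like = SimpleNamespace(identifier="some-item")

print(is_collection(collection_like))  # True
print(is_collection(plain_like))       # False
```

The same `getattr(item, 'is_collection', False)` expression can be used inline on an object returned by `get_item()`.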

However, I think it might be easier for you to do the filtering in your query, so you don't even have to bother checking that:

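from internetarchive import search_items
# Download every item in the 'nasa' collection that is not itself a collection.
for item in search_items('collection:nasa AND NOT mediatype:collection').iter_as_items():
    item.download()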

Or, from the command line:

ia download --search 'collection:nasa AND NOT mediatype:collection'

And, from the command line with GNU Parallel:

ia search 'collection:nasa AND NOT mediatype:collection' -i > itemlist.txt
parallel 'ia download {}' < itemlist.txt
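The queries above exclude collections. If the goal is the opposite (only the sub-collections of a collection), dropping the NOT should select them instead. Here is a small sketch that builds either query string; `build_query` and its `only_collections` parameter are hypothetical names introduced for this example, not part of the internetarchive library.

```python
def build_query(collection, only_collections=False):
    """Build a search query for members of `collection`.

    only_collections=False excludes sub-collections (as in the queries above);
    only_collections=True keeps only sub-collections.
    """
    mediatype_clause = "mediatype:collection"
    if only_collections:
        return f"collection:{collection} AND {mediatype_clause}"
    return f"collection:{collection} AND NOT {mediatype_clause}"


print(build_query("nasa"))
print(build_query("nasa", only_collections=True))
```

The resulting string can be passed to `search_items(...)` in Python or to `ia search` on the command line.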

I hope this answers your question. Let me know if it doesn't.

jjjake, Apr 06 '18 17:04

Thanks, it also works with the Advanced Search query, which is sometimes the easiest way to get the CSV I need. I had missed the collection:nasa AND NOT mediatype:collection syntax. Thanks again.

Aubreymcfato, Apr 09 '18 08:04