get_in() raises TypeError when index is a string but structure is a list
from funcy import get_in
get_in({}, "key") # example 1
get_in([], "key") # example 2
In the code sample above, example 1 returns the default value because the path doesn't exist. However, example 2 fails with:
.../site-packages/funcy/colls.py in get_in(coll, path, default)
    271     for key in path:
    272         try:
--> 273             coll = coll[key]
    274         except (KeyError, IndexError):
    275             return default
TypeError: list indices must be integers or slices, not str
This is understandable, but handling this case requires wrapping the call in special error handling, which defeats the purpose of a simple one-liner. The behavior is also not documented.
I would be happy to do a PR if you agree get_in() should handle this exception automatically by also returning the default value.
The thing is, this is intended behavior. Python is strict about types, so example 2 is indeed an error.
I understand there might be use cases for the second example to return None silently. What is yours?
Btw, you can use silent(get_in)([], ...) though.
And sorry for the slow response.
The use case is quite common. If get_in is used to drill into deeply nested JSON structures, it can easily happen that somewhere along the path, the object is not a dictionary but a list. And in that case, one would expect to get the default value, since the path doesn't exist. But instead a TypeError is raised.
It seems inconsistent, because if a key is missing, neither KeyError nor IndexError is raised. Those are also caused by a wrong key, just one with the wrong value rather than the wrong type.
Using silent would work, but it is not an ideal solution, because it may hide real exceptions. Imagine that a custom mapping or iterable class is used instead of a dictionary and that class contains a bug: silent would hide that error too.
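A minimal sketch of that concern, using only the standard library. `strict_get_in` is a stand-in written from the loop shown in the traceback above, `suppress_all` is a hypothetical blanket wrapper in the spirit of silent(), and `BuggyMapping` is an invented class with a deliberate typo. None of them is funcy code.

```python
def strict_get_in(coll, path, default=None):
    """Stand-in for the lookup loop shown in the traceback (not funcy itself)."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError):
            return default
    return coll


def suppress_all(func):
    """Hypothetical blanket wrapper: swallow every Exception and return None."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            return None
    return wrapper


class BuggyMapping:
    """Invented mapping-like class whose lookup contains a genuine bug."""

    def __init__(self, **items):
        self._items = items

    def __getitem__(self, key):
        # Deliberate typo: `_itmes` does not exist, so this raises AttributeError.
        return self._itmes[key]


data = BuggyMapping(name="funcy")

# The AttributeError is a real programming error, yet the blanket wrapper
# turns it into an ordinary-looking "not found" result.
hidden = suppress_all(strict_get_in)(data, ["name"])
print(hidden)
```

Called without the wrapper, the same lookup surfaces the AttributeError immediately, which is the behavior a blanket wrapper gives up.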
A missing key and an unexpected structure type are different cases. If you look up a string key in a list, you get a TypeError, and any other errors also surface, so the current behavior is perfectly consistent.
If silent() is a problem then you can use ignore(TypeError).
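The distinction drawn here can be sketched as follows. `strict_get_in` is a stand-in reconstructed from the loop shown in the traceback earlier in the thread, and the sample data is invented for illustration; it is not imported from funcy.

```python
def strict_get_in(coll, path, default=None):
    """Stand-in for the lookup loop shown in the traceback (not funcy itself)."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError):
            return default
    return coll


data = {"users": [{"name": "ann"}]}

# Path exists: the value is returned.
found = strict_get_in(data, ["users", 0, "name"])

# Right key type, missing value: KeyError / IndexError become the default.
missing_key = strict_get_in(data, ["users", 0, "age"], default="n/a")
missing_index = strict_get_in(data, ["users", 5, "name"], default="n/a")

# Wrong key type for the structure (str index into a list): TypeError propagates.
try:
    strict_get_in(data, ["users", "name"], default="n/a")
    outcome = "returned"
except TypeError:
    outcome = "TypeError"

print(found, missing_key, missing_index, outcome)
```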
From the documentation:
Returns a value corresponding to path in nested collection.
It doesn't say anything about type checking or type requirements, so I assumed the behaviour is the same as in similar technologies such as XPath, JSONPath or CSS selectors: an unresolvable path results in the default value, not an exception.
If silent() is a problem then you can use ignore(TypeError).
If I'm not mistaken, ignore is a decorator and cannot be used inline the way silent(get_in)([], ...) was suggested previously. Could you please show how to use it inline? If it cannot be used inline, it defeats the purpose of the library, because again I would have to suppress more errors than needed.
The behavior is meant to be close to Python's, not to anything else. I understand that expectations might vary, but ignoring a TypeError like this, while desirable for some uses, will conceal bugs in others. So I am going strict here, as Python does. funcy is not designed to be a "quick and dirty, do my thing" lib, but a "quick, do my thing" lib. I still follow DWIM when it's unambiguous.
If I'm not mistaken, ignore is a decorator and cannot by us ...
Both silent() and ignore() are decorators, and both can be used inline. In fact, silent(func) is exactly the same as ignore(Exception)(func). So you may use it like:
val = ignore(TypeError)(get_in)(data, path, default=some_value)
# Or make an alias for it to be reused
dirty_get_in = ignore(TypeError)(get_in) # choose any name you see fit
val1 = dirty_get_in(data, path1, default=some_value)
val2 = dirty_get_in(data, path2)
Thank you for the example. Unfortunately, it works only partially. It doesn't give back the default value, because it just ignores the error (the code never reaches the return of the default).
ignore(TypeError)(get_in)([], ["x"], default="default")
then produces None rather than "default". Additionally, it incorrectly hides a different TypeError:
ignore(TypeError)(get_in)([], "x", default="default") (path is a str, not a list)
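Why the default goes missing can be seen in a minimal sketch. `ignore_errors` below is a hypothetical stand-in for an error-ignoring decorator factory, and `strict_get_in` is a stand-in for the loop in the traceback; neither is funcy code. The point is that the `default` passed to the wrapped function is never consulted once the exception escapes it, so the wrapper's own fallback is what comes back.

```python
def strict_get_in(coll, path, default=None):
    """Stand-in for the lookup loop shown in the traceback (not funcy itself)."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError):
            return default
    return coll


def ignore_errors(errors, default=None):
    """Hypothetical decorator factory: return `default` if `errors` is raised."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except errors:
                # Only the wrapper's own default is available at this point.
                return default
        return wrapper
    return decorator


# The default is given to the inner function, so the wrapper falls back to None.
without_default = ignore_errors(TypeError)(strict_get_in)([], ["x"], default="default")

# The default is given to the wrapper, so it is returned on TypeError.
with_default = ignore_errors(TypeError, default="default")(strict_get_in)([], ["x"])

print(without_default, with_default)
```

funcy's own ignore() is documented as accepting a `default` argument in the same position; that is worth confirming against the installed version before relying on it. Passing the default there does not address the second objection: any TypeError raised inside the call is still swallowed.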
If ignoring the error is not acceptable, the easiest solution for me is to take the function out of the library and reimplement it myself. It has only 8 lines and no dependency on the rest of the library code.
Just in case anyone else bumps into this problem, this was my solution:
def get_in(coll, path, default=None):
    """Returns a value at path in the given nested collection."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError, TypeError):
            return default
    return coll
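A short usage sketch of this variant, renamed `tolerant_get_in` here (a hypothetical name) to avoid confusion with the library function. The sample data is invented. The last call shows the trade-off: any TypeError raised during lookup, including one caused by an unhashable key, is also reported as a missing path.

```python
def tolerant_get_in(coll, path, default=None):
    """Same loop as above, with TypeError treated as a missing path."""
    for key in path:
        try:
            coll = coll[key]
        except (KeyError, IndexError, TypeError):
            return default
    return coll


data = {"a": [{"b": 1}], "c": None}

# Normal lookup still works.
value = tolerant_get_in(data, ["a", 0, "b"])

# String key applied to a list: default instead of TypeError.
str_into_list = tolerant_get_in(data, ["a", "b"], default="default")

# Null in the middle of the path (common in JSON): default as well.
through_none = tolerant_get_in(data, ["c", "d"], default="default")

# Trade-off: an unhashable key is a caller mistake, but it is hidden too.
unhashable_key = tolerant_get_in(data, [["a"]], default="default")

print(value, str_into_list, through_none, unhashable_key)
```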
Another example to prove my point: boltons.iterutils.get_path() from a different library also interprets "string key to a list" as "missing path" and either raises the same exception as for any other missing path or returns the default value.
Reopening this to fix it by documenting the alternative ways.
NOTE: has_path() now has the same duality.