cachecontrol
Allow using a heuristic with the `doesitcache` tool
CacheControl can often be confusing because something seems like it should be cached, but it is not. There is typically a good explanation for this, and a workaround that comes in the form of heuristics. It would be helpful for the `doesitcache` tool to accept a heuristic.
Specifically:
- doesitcache --list-heuristics - list the available heuristics and print the docstring of the classes
- doesitcache --heuristic $NAME - Apply a heuristic. There might be options that need to be passed in, but supporting that seems to go above and beyond at the moment. Just making sure heuristics have defaults should be enough.
- doesitcache --heuristic $PATH - Use the file to load a Python object as the heuristic. The algorithm would be to import it and see which classes are subclasses of the heuristic base class, choosing the first one to apply.
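The `$PATH` algorithm above could look roughly like the sketch below. It is not the tool's implementation: `BaseHeuristic` here is a stand-in for the real heuristic base class, and `load_heuristic_from_path` plus the `demo_heuristics_base` module name are hypothetical names used only for illustration. "First" is taken to mean the first class defined in the file.

```python
import importlib.util
import sys
import tempfile
import types
from pathlib import Path


class BaseHeuristic:
    """Stand-in for the heuristic base class (an assumption for this sketch)."""


def load_heuristic_from_path(path, base=BaseHeuristic):
    """Import the file at `path` and return the first subclass of `base` it defines."""
    spec = importlib.util.spec_from_file_location("_doesitcache_user_module", str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # vars() keeps definition order, so the first match is the first class in the file.
    for obj in vars(module).values():
        if (
            isinstance(obj, type)
            and issubclass(obj, base)
            and obj is not base
            and obj.__module__ == module.__name__  # skip classes merely imported
        ):
            return obj
    raise LookupError(f"no subclass of {base.__name__} found in {path!r}")


if __name__ == "__main__":
    # Demo only: expose the stand-in base class under a hypothetical module name
    # so the temporary file below can import it like a real heuristic module would.
    demo_base = types.ModuleType("demo_heuristics_base")
    demo_base.BaseHeuristic = BaseHeuristic
    sys.modules["demo_heuristics_base"] = demo_base

    source = (
        "from demo_heuristics_base import BaseHeuristic\n"
        "class OneDay(BaseHeuristic):\n"
        "    pass\n"
        "class OneWeek(BaseHeuristic):\n"
        "    pass\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "my_heuristics.py"
        module_path.write_text(source)
        print(load_heuristic_from_path(module_path).__name__)
```

Picking the first class silently is convenient, but it also means a file with several heuristics only ever exposes one of them.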
Docs!
Thanks to @sigmavirus24 for the suggestion!
You might want to load heuristics through entry points instead to make life simpler (that way someone doesn't shove a bunch of heuristics in one module and get upset when they can't use the path). Alternatively, you might want to do something like:
doesitcache --list-heuristics # to list the default heuristics
doesitcache --heuristic cachecontrol.heuristics.ExpiresAfter <url>
doesitcache --load-heuristics-from path/to/module.py --heuristic HeuristicDefinedInModule <url>
Realistically, though, I think the last one should also take the fully qualified name to avoid naming collisions when using the tool. That would just be path.to.module.HeuristicDefinedInModule, but you might want to special-case it in the event that the path is not importable.
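Resolving a fully qualified name is a small amount of code. A minimal sketch, assuming the name is always `module.path.ClassName`; `resolve_heuristic` is a hypothetical helper, not something the tool provides:

```python
import importlib


def resolve_heuristic(dotted_name):
    """Turn 'package.module.ClassName' into the object it names."""
    module_name, sep, attr = dotted_name.rpartition(".")
    if not sep or not module_name:
        raise ValueError(f"expected a fully qualified name, got {dotted_name!r}")
    # Raises ImportError if the module part is not importable.
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError:
        raise LookupError(f"{module_name!r} has no attribute {attr!r}") from None
```

A caller could catch the `ImportError` and fall back to loading from a file path, which is the special case mentioned above.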
@sigmavirus24 That is a good idea. The entrypoint bit is a pain in the neck, but --list-heuristics might be helpful there.
Yeah, the entry-points are probably a bit of overkill but also make some of these features easier. I'm torn on it personally.
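For comparison, here is roughly what entry-point discovery could look like with the standard library's `importlib.metadata`. The group name `cachecontrol.heuristics` and the `discover_heuristics` helper are assumptions made up for this sketch; no such group is defined by the project.

```python
from importlib.metadata import entry_points

# Hypothetical entry-point group; plugins would have to register under it.
HEURISTIC_GROUP = "cachecontrol.heuristics"


def discover_heuristics(group=HEURISTIC_GROUP):
    """Return {entry point name: loaded object} for every plugin in `group`."""
    all_eps = entry_points()
    if hasattr(all_eps, "select"):
        selected = all_eps.select(group=group)  # Python 3.10 and later
    else:
        selected = all_eps.get(group, [])  # dict-style result on Python 3.8 / 3.9
    return {ep.name: ep.load() for ep in selected}
```

Something like `--list-heuristics` could then iterate over the returned mapping and print each name with its docstring.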
Any updates on this? Is it up for grabbing or should I leave it be? :)