typed-argument-parser
typed-argument-parser copied to clipboard
TAP works much slower than standard ArgumentParser
TAP works very slow: to reproduce
from tap import Tap
from argparse import ArgumentParser
from profilehooks import profile, timecall
class SimpleArgumentParser(Tap):
name: str # Your name
language: str = 'Python' # Programming language
package: str = 'Tap' # Package name
stars: int # Number of stars
max_stars: int = 5 # Maximum stars
@profile(dirs=True, sort='time')
@timecall()
def tap_parser():
return SimpleArgumentParser().parse_args()
@profile(dirs=True, sort='time')
@timecall()
def arg_parser():
parser = ArgumentParser()
parser.add_argument('--name', type=str, required=True,
help='Your name')
parser.add_argument('--language', type=str, default='Python',
help='Programming language')
parser.add_argument('--package', type=str, default='Tap',
help='Package name')
parser.add_argument('--stars', type=int, required=True,
help='Number of stars')
parser.add_argument('--max_stars', type=int, default=5,
help='Maximum stars')
return parser.parse_args()
tap_args = tap_parser()
args = arg_parser()
print(f'My name is {tap_args.name} and I give the {tap_args.language} package '
f'{tap_args.package} {tap_args.stars}/{tap_args.max_stars} stars!')
print(f'My name is {args.name} and I give the {args.language} package '
f'{args.package} {args.stars}/{args.max_stars} stars!')
tap_parser: 0.050 seconds (tap 1.6.1) tap_parser: 0.042 seconds (tap 1.4.3) arg_parser: 0.000 seconds
The most time consuming actions are ncalls tottime percall cumtime percall filename:lineno(function) 7754 0.013 0.000 0.032 0.000 /home/artem/.conda/envs/rdclone/lib/python3.6/tokenize.py:492(_tokenize) 7148 0.006 0.000 0.006 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
I assume these two involve regular expressions.
Environment Ubuntu 20.04.2 LTS Python 3.6.10 conda 4.9.2 profilehooks 1.12.0
Hi @Enolerobotti,
Thank you for bringing this to our attention! Yes, the slow part is the source code inspection in order to extract the help strings from comments. In our machine learning workflows, we find this overhead to be acceptable, but we can imagine that certain applications may need a faster argument parser. In the future, we will add a flag to disable this. Concretely, we would skip this call: https://github.com/swansonk14/typed-argument-parser/blob/0dc69b36dfd54c3a5cb07daa92f7f0f680437064/tap/tap.py#L93
Best, Kyle and Jesse
Hi @martinjm97,
Thank you for your reply!
Indeed, the _get_class_variables
method works slow. The problem is that when I try to use ChemProp in my workflow, I need some default arguments (which are defined inside a TAP instance) to be loaded several times. I have to initiate TAP based classes several times due to an assertion (btw, for me it looks redundant). So, for ensemble of 10 models, I spend additional 0.5 sec per molecule to predict. For list longer than 1000 molecules it would be worth for me to get rid of redundant operations.
Cheers, Artem
@Enolerobotti,
That makes sense! I didn't think a Tap
would be in an inner loop. We'll add the flag soon and then you can save the 500s for your 1000 molecules! Thanks for letting us know and for the clear explanation.
--Jesse
Encountering the same issue, but now it seems like the class variables are being used on line 72 in _add_argument
:
# Description
if variable in self.class_variables:
kwargs['help'] += ' ' + self.class_variables[variable]['comment']
and on line 298 in _add_arguments
:
# Add class variables (in order)
for variable in self.class_variables:
As that variable is now also being used for ordering, is it now not possible to make this change as noted above?