cmssw
cmssw copied to clipboard
Inconsistent behavior of `VarParsing` default for lists
The argument parsing with FWCore.ParameterSet.VarParsing
has some inconsistent behavior of defaults when using VarParsing.multiplicity.list
, which seems to be cause by the following lines, which has a len (default)
requirement, and which appends the default to the list:
https://github.com/cms-sw/cmssw/blob/ac2d51b77bd6789940953d54d5d2c7cfbbf940a3/FWCore/ParameterSet/python/VarParsing.py#L410-L417
This works fine if the default is a single string object (VarParsing.varType.string
), but this causes issues for some other use cases, for example:
- If a user wants a list of objects as default, it will be converted into a list of lists: e.g.
['foo','bar']
becomes[['foo','bar']]
, while arguments passed via the command line of the formmyStrs=foo,bar
will yield['foo','bar']
... (Work around by setting the default after parsing the command line arguments.) - If a users wants to set a single integer/float/boolean as default for
VarParsing.multiplicity.list
, it throws aTypeError("object of type 'int' has no len()")
because of thelen (default)
requirement, forcing the user to setting a list as default, yielding inconsistent results with arguments passed via the command line as explained in (1).
Here is a reproducible code snippet:
import sys
from FWCore.ParameterSet.VarParsing import VarParsing
def parse(argv):
print(f">>> Parsing {argv!r}...")
sys.argv = ['test.py','maxEvents=100']+argv
opts = VarParsing('standard')
print(">>> Register...")
def add(*args,**kwargs):
try:
print(f">>> {args!r}, {kwargs!r}...")
opts.register(*args,**kwargs)
except Exception as err:
print(f">>> => {err!r}")
add('myInts0', '', VarParsing.multiplicity.list, VarParsing.varType.int) # works because len(str)
add('myInts1', 0, VarParsing.multiplicity.list, VarParsing.varType.int) # ERROR because len(int)
add('myInts2', [0], VarParsing.multiplicity.list, VarParsing.varType.int) # works because len(list)
add('myBools0', '', VarParsing.multiplicity.list, VarParsing.varType.bool) # works because len(str)
add('myBools1', True, VarParsing.multiplicity.list, VarParsing.varType.bool) # ERROR because len(bool)
add('myBools2', [True], VarParsing.multiplicity.list, VarParsing.varType.bool) # works because len(list)
add('myStrs0', '', VarParsing.multiplicity.list, VarParsing.varType.string) # works because len(str)
add('myStrs1', 'foo', VarParsing.multiplicity.list, VarParsing.varType.string) # works because len(str)
add('myStrs2', ['foo'], VarParsing.multiplicity.list, VarParsing.varType.string) # works because len(list)
opts.parseArguments()
print(f">>> Parsed:")
for var in opts._lists:
if var.startswith('my'):
print(f">>> {var:6s} = {getattr(opts,var)!r}")
parse([ ])
parse(['myInts1=0,1','myBools1=True,False','myStrs1=foo,bar','myStrs2=foo,bar'])
which yields
>>> Parsing []...
>>> Register...
>>> ('myInts0', '', 'multiplicity_list', '_int'), {}...
>>> ('myInts1', 0, 'multiplicity_list', '_int'), {}...
>>> => TypeError("object of type 'int' has no len()")
>>> ('myInts2', [0], 'multiplicity_list', '_int'), {}...
>>> ('myBools0', '', 'multiplicity_list', '_bool'), {}...
>>> ('myBools1', True, 'multiplicity_list', '_bool'), {}...
>>> => TypeError("object of type 'bool' has no len()")
>>> ('myBools2', [True], 'multiplicity_list', '_bool'), {}...
>>> ('myStrs0', '', 'multiplicity_list', '_string'), {}...
>>> ('myStrs1', 'foo', 'multiplicity_list', '_string'), {}...
>>> ('myStrs2', ['foo'], 'multiplicity_list', '_string'), {}...
>>> Parsed:
>>> myInts0 = []
>>> myInts1 = []
>>> myInts2 = [[0]]
>>> myBools0 = []
>>> myBools1 = []
>>> myBools2 = [[True]]
>>> myStrs0 = []
>>> myStrs1 = ['foo']
>>> myStrs2 = [['foo']]
>>> Parsing ['myInts1=0,1', 'myBools1=True,False', 'myStrs1=foo,bar', 'myStrs2=foo,bar']...
>>> Register...
>>> ('myInts0', '', 'multiplicity_list', '_int'), {}...
>>> ('myInts1', 0, 'multiplicity_list', '_int'), {}...
>>> => TypeError("object of type 'int' has no len()")
>>> ('myInts2', [0], 'multiplicity_list', '_int'), {}...
>>> ('myBools0', '', 'multiplicity_list', '_bool'), {}...
>>> ('myBools1', True, 'multiplicity_list', '_bool'), {}...
>>> => TypeError("object of type 'bool' has no len()")
>>> ('myBools2', [True], 'multiplicity_list', '_bool'), {}...
>>> ('myStrs0', '', 'multiplicity_list', '_string'), {}...
>>> ('myStrs1', 'foo', 'multiplicity_list', '_string'), {}...
>>> ('myStrs2', ['foo'], 'multiplicity_list', '_string'), {}...
>>> Parsed:
>>> myInts0 = []
>>> myInts1 = [0, 1]
>>> myInts2 = [[0]]
>>> myBools0 = []
>>> myBools1 = [True, False]
>>> myBools2 = [[True]]
>>> myStrs0 = []
>>> myStrs1 = ['foo', 'bar']
>>> myStrs2 = ['foo', 'bar']
I see this is already discussed in Issue https://github.com/cms-sw/cmssw/issues/34469 by @kpedro88, and one can use argparse
as alternative (https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideAboutPythonConfigFile#argparse), and I also realize there's a danger to break many existing configuration scripts if we change the behavior retroactively, but I wonder if we could at least patch the special use case (2) for integers/floats/booleans, which currently throws an TypeError
error? For example, something like the following should solve this error without changing the other behavior:
if (mytype == varType.string and len (default)) or (mytype != varType.string and default != ""):
self._lists[name].append (default)
Or if we want to also fix use case (1) to prevent lists of lists (although this could break some scripts that relied on this behavior):
if isinstance(default,list):
self._lists[name] = default
elif (mytype == varType.string and default == "") or (mytype != varType.string and default != ""):
self._lists[name].append (default)