Trouble making categorical variable --- TypeError: 'ClassRegistry' object is not callable
Not sure what's going on here.
In [44]: patsy.dmatrix("C(a)", {'a':['m', 'n', 'o']})
Out[44]:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-44-24c8bf15ba3b> in <module>()
----> 1 dmatrix("C(a)", {'a':['m', 'n', 'o']})
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in dmatrix(formula_like, data, eval_env, return_type)
259 """
260 (lhs, rhs) = _do_highlevel_design(formula_like, data, _get_env(eval_env),
--> 261 return_type)
262 if lhs.shape[1] != 0:
263 raise PatsyError("encountered outcome variables for a model "
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in _do_highlevel_design(formula_like, data, eval_env, return_type)
145 def data_iter_maker():
146 return iter([data])
--> 147 builders = _try_incr_builders(formula_like, data_iter_maker, eval_env)
148 if builders is not None:
149 return build_design_matrices(builders, data,
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in _try_incr_builders(formula_like, data_iter_maker, eval_env)
59 return design_matrix_builders([formula_like.lhs_termlist,
60 formula_like.rhs_termlist],
---> 61 data_iter_maker)
62 else:
63 return None
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/build.pyc in design_matrix_builders(termlists, data_iter_maker)
691 cat_postprocessors) = _examine_factor_types(all_factors,
692 factor_states,
--> 693 data_iter_maker)
694 # Now we need the factor evaluators, which encapsulate the knowledge of
695 # how to turn any given factor into a chunk of data:
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/build.pyc in _examine_factor_types(factors, factor_states, data_iter_maker)
441 break
442 for factor in list(examine_needed):
--> 443 value = factor.eval(factor_states[factor], data)
444 if isinstance(value, Categorical):
445 postprocessor = CategoricalTransform(levels=value.levels)
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in eval(self, memorize_state, data)
429 # http://nedbatchelder.com/blog/200711/rethrowing_exceptions_in_python.html
430 def eval(self, memorize_state, data):
--> 431 return self._eval(memorize_state["eval_code"], memorize_state, data)
432
433 def test_EvalFactor_basics():
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in _eval(self, code, memorize_state, data)
412 def _eval(self, code, memorize_state, data):
413 inner_namespace = VarLookupDict([data, memorize_state["transforms"]])
--> 414 return self._eval_env.eval(code, inner_namespace=inner_namespace)
415
416 def memorize_chunk(self, state, which_pass, data):
/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in eval(self, expr, source_name, inner_namespace)
119 code = compile(expr, source_name, "eval", self.flags, False)
120 return eval(code, {}, VarLookupDict([inner_namespace]
--> 121 + self._namespaces))
122
123 @classmethod
<string> in <module>()
TypeError: 'ClassRegistry' object is not callable
Using virtualenv on a Mac. Python 2.7.3.
Pygments==1.5
cloud==2.6.9
distribute==0.6.31
ipython==0.13.1
matplotlib==1.2.0
nose==1.2.1
numpy==1.6.2
pandas==0.10.0
patsy==0.1.0
python-dateutil==2.1
pytz==2012h
pyzmq==2.2.0.1
readline==6.2.4.1
scipy==0.11.0
six==1.2.0
statsmodels==0.5.0
sympy==0.7.2
tornado==2.4.1
wsgiref==0.1.2
You have an object named C in your scope (a ClassRegistry object, apparently), which is shadowing the built-in C function. Compare to:
In [3]: C = 1
In [4]: patsy.dmatrix("C(a)", {"a": ["m", "n", "o"]})
TypeError: 'int' object is not callable
I'm not sure what the right thing to do here is. We could make builtins override things in the scope, but that gets a bit icky if it means that you can't use variables named "C" without jumping through hoops, and whenever we add a new builtin it would potentially break existing scripts.
Maybe we should keep the current behaviour, but issue a warning whenever someone looks up a builtin name but gets something else, i.e. it gets shadowed? That would preserve the semantics of existing formulas when we change the set of builtins, but it would still encourage people to avoid such names when possible.
Would you be interested in preparing a pull request implementing such a warning? It'd involve adding some code to VarLookupDict to keep track of warn_if_shadowed namespaces, and then on lookup doing something like
    for warn_if_shadowed in self.warn_if_shadowed_namespaces:
        # value is whatever the normal lookup returned for key
        if key in warn_if_shadowed and value is not warn_if_shadowed[key]:
            warnings.warn(...)
and then we'd add an argument to EvalEnvironment.add_outer_namespace to let you flag an added namespace as having this special property, and then in desc.py when we call add_outer_namespace to add the builtins we'd use this argument. Plus tests, of course.
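To make the proposal a bit more concrete, here is a minimal, self-contained sketch of the lookup-with-warning idea. It does not use patsy's real classes: `ShadowWarningLookup` is a hypothetical stand-in for VarLookupDict, the `warn_if_shadowed_namespaces` argument is the assumed new parameter, and the `C` defined here is only a placeholder for the real builtin.

```python
import warnings


class ShadowWarningLookup(object):
    """Hypothetical stand-in for VarLookupDict: searches namespaces in order."""

    def __init__(self, namespaces, warn_if_shadowed_namespaces=()):
        self._namespaces = list(namespaces)
        # Namespaces whose names should trigger a warning when shadowed.
        self._warn_if_shadowed = list(warn_if_shadowed_namespaces)

    def __getitem__(self, key):
        for namespace in self._namespaces:
            try:
                value = namespace[key]
            except KeyError:
                continue
            self._check_shadowing(key, value)
            return value
        raise KeyError(key)

    def _check_shadowing(self, key, value):
        # Warn if a protected namespace defines key but the lookup
        # resolved to a different object.
        for protected in self._warn_if_shadowed:
            if key in protected and value is not protected[key]:
                warnings.warn(
                    "name %r shadows a formula builtin" % (key,),
                    stacklevel=3,
                )


def C(data):
    """Placeholder for the builtin categorical marker (not patsy's C)."""
    return ("categorical", data)


builtins_ns = {"C": C}
user_ns = {"C": 1}  # the user's scope shadows the builtin

# User names are searched first, so the current semantics are preserved.
lookup = ShadowWarningLookup([user_ns, builtins_ns], [builtins_ns])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    shadowed_value = lookup["C"]  # resolves to 1 and records one warning
```

The lookup still returns the user's object, so existing formulas keep their meaning; the only change is the warning, which is what the proposal asks for.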
Thanks for the tip. It must be all the stuff that IPython Notebook imports automatically. I can try running it from the regular console. (Update: works just fine outside of IPython Notebook with pylab enabled. No namespace conflict.)
I haven't written any namespace confusion avoidance before, so that should be a good exercise. Not sure when I'll get to it...
For the record: the problem is that sympy defines a bunch of magic shorthands that clash with ours, and if you use isympy (and maybe other stuff) then they're dropped into the global namespace by default. This affects at least C, Q, and I, which I think is all of our single-letter names for now...
See here: http://docs.sympy.org/0.7.2/gotchas.html#symbols (specifically "Lastly, it is recommended that you not use I, E, S, N, C, O, or Q for variable or symbol names...").
This is a bit of a mess, because of course we want people to be able to use sympy and patsy together.
I suppose we could still rename Q to, like, V or something (for "variable"), but C and I have 20 years of tradition behind them, so I'm really not sure what the best solution is here :-/.
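In the meantime, a quick way to see whether a session has picked up any of the clashing names is to check the namespace before building a formula. This is only a sketch: `find_clashing_names` is a hypothetical helper, not part of patsy, and the default tuple simply lists the three names mentioned above.

```python
def find_clashing_names(namespace, reserved=("C", "Q", "I")):
    """Return the reserved formula names that this namespace defines.

    Hypothetical helper: it only reports that a name exists, so it will
    also list names you deliberately imported yourself.
    """
    return sorted(name for name in reserved if name in namespace)


# Example: a namespace where something (e.g. a star-import) defined C and I.
session = {"C": object(), "I": 1j, "data": [1, 2, 3]}
clashes = find_clashing_names(session)
```

In an interactive session you would pass `globals()` instead of the example dictionary; any name it reports can then be removed with `del` or renamed before calling dmatrix.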
In some ways, Patsy is like the Regex module. Regex patterns solved this problem by introducing an escape character.