patsy Trouble making categorical variable --- TypeError: 'ClassRegistry' object is not callable

Not sure what's going on here.

In [44]: patsy.dmatrix("C(a)", {'a':['m', 'n', 'o']})
Out[44]:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-44-24c8bf15ba3b> in <module>()
----> 1 dmatrix("C(a)", {'a':['m', 'n', 'o']})

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in dmatrix(formula_like, data, eval_env, return_type)
    259     """
    260     (lhs, rhs) = _do_highlevel_design(formula_like, data, _get_env(eval_env),
--> 261                                       return_type)
    262     if lhs.shape[1] != 0:
    263         raise PatsyError("encountered outcome variables for a model "

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in _do_highlevel_design(formula_like, data, eval_env, return_type)
    145     def data_iter_maker():
    146         return iter([data])
--> 147     builders = _try_incr_builders(formula_like, data_iter_maker, eval_env)
    148     if builders is not None:
    149         return build_design_matrices(builders, data,

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/highlevel.pyc in _try_incr_builders(formula_like, data_iter_maker, eval_env)
     59         return design_matrix_builders([formula_like.lhs_termlist,
     60                                        formula_like.rhs_termlist],
---> 61                                       data_iter_maker)
     62     else:
     63         return None

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/build.pyc in design_matrix_builders(termlists, data_iter_maker)
    691      cat_postprocessors) = _examine_factor_types(all_factors,
    692                                                  factor_states,
--> 693                                                  data_iter_maker)
    694     # Now we need the factor evaluators, which encapsulate the knowledge of
    695     # how to turn any given factor into a chunk of data:

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/build.pyc in _examine_factor_types(factors, factor_states, data_iter_maker)
    441             break
    442         for factor in list(examine_needed):
--> 443             value = factor.eval(factor_states[factor], data)
    444             if isinstance(value, Categorical):
    445                 postprocessor = CategoricalTransform(levels=value.levels)

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in eval(self, memorize_state, data)
    429     #    http://nedbatchelder.com/blog/200711/rethrowing_exceptions_in_python.html
    430     def eval(self, memorize_state, data):
--> 431         return self._eval(memorize_state["eval_code"], memorize_state, data)
    432 
    433 def test_EvalFactor_basics():

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in _eval(self, code, memorize_state, data)
    412     def _eval(self, code, memorize_state, data):
    413         inner_namespace = VarLookupDict([data, memorize_state["transforms"]])
--> 414         return self._eval_env.eval(code, inner_namespace=inner_namespace)
    415 
    416     def memorize_chunk(self, state, which_pass, data):

/Users/mike/venv/sci/lib/python2.7/site-packages/patsy/eval.pyc in eval(self, expr, source_name, inner_namespace)
    119         code = compile(expr, source_name, "eval", self.flags, False)
    120         return eval(code, {}, VarLookupDict([inner_namespace]
--> 121                                             + self._namespaces))
    122 
    123     @classmethod

<string> in <module>()

TypeError: 'ClassRegistry' object is not callable

Using virtualenv on a Mac, Python 2.7.3. Installed packages:

Pygments==1.5
cloud==2.6.9
distribute==0.6.31
ipython==0.13.1
matplotlib==1.2.0
nose==1.2.1
numpy==1.6.2
pandas==0.10.0
patsy==0.1.0
python-dateutil==2.1
pytz==2012h
pyzmq==2.2.0.1
readline==6.2.4.1
scipy==0.11.0
six==1.2.0
statsmodels==0.5.0
sympy==0.7.2
tornado==2.4.1
wsgiref==0.1.2

Jan 09 '13 03:01 selik

You have an object named C in your scope (a ClassRegistry object, apparently), which is shadowing the built-in C function. Compare to:

In [3]: C = 1

In [4]: patsy.dmatrix("C(a)", {"a": ["m", "n", "o"]})
TypeError: 'int' object is not callable

I'm not sure what the right thing to do here is. We could make builtins override things in the scope, but that gets a bit icky if it means that you can't use variables named "C" without jumping through hoops, and whenever we add a new builtin it would potentially break existing scripts.

Maybe we should keep the current behaviour, but issue a warning whenever someone looks up a builtin name but gets something else, i.e. it gets shadowed? That would preserve the semantics of existing formulas when we change the set of builtins, but it would still encourage people to avoid such names when possible.

Would you be interested in preparing a pull request implementing such a warning? It'd involve adding some code to VarLookupDict to keep track of warn_if_shadowed namespaces, and then on lookup doing something like

for warn_if_shadowed in self.warn_if_shadowed_namespaces:
    # `value` is whatever the normal lookup of `key` just returned
    if key in warn_if_shadowed and value is not warn_if_shadowed[key]:
        warnings.warn(...)

and then we'd add an argument to EvalEnvironment.add_outer_namespace to let you flag an added namespace as having this special property, and then in desc.py when we call add_outer_namespace to add the builtins we'd use this argument. Plus tests, of course.
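
To make the proposal concrete, here is a minimal, self-contained sketch of the lookup-time check. It is not patsy's actual VarLookupDict: the class name ShadowWarningLookup, its constructor arguments, the warning text, and the placeholder C function are all assumptions made up for illustration.

```python
import warnings


class ShadowWarningLookup(object):
    """Simplified stand-in for a layered lookup dict (not patsy's real class)."""

    def __init__(self, namespaces, warn_if_shadowed_namespaces=()):
        # Namespaces are searched in order; the first one containing the key wins.
        self._namespaces = list(namespaces)
        # Namespaces whose names should trigger a warning when something else wins.
        self._warn_if_shadowed = list(warn_if_shadowed_namespaces)

    def __getitem__(self, key):
        for namespace in self._namespaces:
            if key in namespace:
                value = namespace[key]
                break
        else:
            raise KeyError(key)
        # Warn if a guarded namespace defines this name but a different object was found.
        for guarded in self._warn_if_shadowed:
            if key in guarded and value is not guarded[key]:
                warnings.warn(
                    "name %r shadows a formula builtin" % (key,),
                    stacklevel=2,
                )
        return value


def C(data):
    """Placeholder for the builtin categorical marker (hypothetical)."""
    return ("categorical", data)


builtins_ns = {"C": C}
user_ns = {"C": 1}  # the user's scope rebinds C, as in the example above

lookup = ShadowWarningLookup([user_ns, builtins_ns], [builtins_ns])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    shadowed_value = lookup["C"]  # still returns the user's object, but warns
```

The lookup keeps the current semantics (the user's C still wins) and only adds the warning, which is the trade-off described above.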

Jan 09 '13 14:01 njsmith

Thanks for the tip. It must be all the stuff that IPython Notebook imports automatically. I can try running it from the regular console. (Update: works just fine outside of IPython Notebook with pylab enabled. No namespace conflict.)

I haven't written any namespace confusion avoidance before, so that should be a good exercise. Not sure when I'll get to it...

Jan 09 '13 17:01 selik

For the record: the problem is that sympy defines a bunch of magic shorthands that clash with ours, and if you use isympy (and maybe other stuff) then they're dropped into the global namespace by default. This affects at least C, Q, and I, which I think is all of our single-letter names for now...

See here: http://docs.sympy.org/0.7.2/gotchas.html#symbols (specifically "Lastly, it is recommended that you not use I, E, S, N, C, O, or Q for variable or symbol names...").
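
As a rough diagnostic for this kind of clash, one could check whether any of the short names are already bound in the namespace a formula will be evaluated in. The sketch below is only an illustration: find_rebound_names and FORMULA_SHORTHANDS are hypothetical names, not part of patsy or sympy, and the ClassRegistry class here is a stand-in for whatever object the session actually contains.

```python
# Names mentioned in this thread as clashing between sympy shorthands and ours.
FORMULA_SHORTHANDS = ("C", "Q", "I")


def find_rebound_names(namespace, names=FORMULA_SHORTHANDS):
    """Return {name: type name} for each shorthand already bound in `namespace`."""
    return dict(
        (name, type(namespace[name]).__name__)
        for name in names
        if name in namespace
    )


class ClassRegistry(object):
    """Stand-in for the object reported in the traceback; not sympy's class."""


# A namespace resembling the reporter's session, where C was injected by another tool.
example_ns = {"C": ClassRegistry(), "x": 3}
clashes = find_rebound_names(example_ns)
```

In an interactive session, calling find_rebound_names(globals()) before building a formula would list any of these names that something else has already bound. It only reports that a binding exists; it cannot tell whether the bound object is the intended one.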

This is a bit of a mess, because of course we want people to be able to use sympy and patsy together.

I suppose we could still rename Q to, like, V or something (for "variable"), but C and I have 20 years of tradition behind them, so I'm really not sure what the best solution is here :-/.

Mar 27 '13 15:03 njsmith

In some ways, Patsy is like the Regex module. Regex patterns solved this problem by introducing an escape character.

Apr 22 '17 07:04 selik