Parser icon indicating copy to clipboard operation
Parser copied to clipboard

Generator expression without parenthesis does not raise SyntaxError

Open key262yek opened this issue 3 years ago • 3 comments

Feature

In CPython, generator expression without parenthesis raises following error

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized

However, In RustPython, it does not matter

>>>>> foo(x, z for z in range(10), t, w)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined

Python Documentation

key262yek avatar Jul 20 '22 09:07 key262yek

This is allowed only for single-argument function call.

In python.lalrpop I see FunctionArgument is including Generator stuff. I guess that part is expected to be moved to ArgumentList part.

youknowone avatar Jul 20 '22 14:07 youknowone

I try to change parse_args function in ArgumentList

ArgumentList: ArgumentList = {
    <e: Comma<FunctionArgument>> =>? {
        let arg_list = parse_args(e)?;
        Ok(arg_list)
    }
};

First, I checked whether the generator expression is single-argument or not.

// In parser/src/functions.rs
pub fn parse_args(func_args: Vec<FunctionArgument>) -> Result<ArgumentList, LexicalError> {
    let mut args = vec![];
    let mut keywords = vec![];
// <<<< New
    let len = func_args.len();    // read length of FunctionArgument
// <<<<
    let mut keyword_names = HashSet::with_capacity_and_hasher(func_args.len(), RandomState::new());
    for (name, value) in func_args {
        match name {
            Some((location, name)) => {
                ..
            }
            None => {
                ..
// <<<< New
                if let ast::ExprKind::GeneratorExp { elt: e, .. } = &value.node {
                    if len != 1{
                        return Err(LexicalError {
                            error: LexicalErrorType::OtherError(
                                "Generator expression must be parenthsized".to_string(),
                            ),
                            location: e.location,
                        });
                    }
                }
// <<<<
                args.push(value);
            }
        }
    }
    Ok(ArgumentList { args, keywords })
}

But, in this way, we cannot figure out that the generator at the first time were parenthesized or not. Following statements all raise errors.

>>>>> foo(x, z for z in range(10), t, w)
SyntaxError: Generator expression must be parenthsized at line 1 column 8
foo(x, z for z in range(10), t, w)
       ^
>>>>> foo(x, (z for z in range(10)), t, w)
SyntaxError: Generator expression must be parenthsized at line 1 column 8
foo(x, (z for z in range(10)), t, w)
        ^

It's because, as we can see in the parsing algorithm of function argument, NamedExpressionTest can be a generator expression itself.

FunctionArgument: (Option<(ast::Location, Option<String>)>, ast::Expo) = {
    <e:NamedExpressionTest> <c:CompFor?> => {    // <<<< Here
        let mut parenthesized = true;
        let expr = match c {
            Some(c) => ast::Expr {
                    location: e.location,
                    custom: (),
                    node: ast::ExprKind::GeneratorExp {
                        elt: Box::new(e),
                        generators: c,
                    }
            },
            None => e,
        };
        (None, expr)
    },
    ..
}

// In NamedExpressionTest parsing
  <location:@L> "(" <elt:Test> <generators:CompFor> ")" => {
        ast::Expr {
            location,
            custom: (),
            node: ast::ExprKind::GeneratorExp { elt: Box::new(elt), generators }
        }
    },

Hence, I add boolean attribute to result of FunctionArgument parsing, which contains whether the generator expression were parenthesized or not.

FunctionArgument: (Option<(ast::Location, Option<String>)>, ast::Expr, bool) = { // <<< Here
    <e:NamedExpressionTest> <c:CompFor?> => {
        let mut parenthesized = true;        // <<< New
        let expr = match c {
            Some(c) => {
                parenthesized = false;          // <<< New
                ast::Expr {
                    location: e.location,
                    custom: (),
                    node: ast::ExprKind::GeneratorExp {
                        elt: Box::new(e),
                        generators: c,
                    }
                }
            },
            None => e,
        };
        (None, expr, parenthesized)          // <<<< New
    },
    ..
}

and change the parse_args function as follows.

// <<<< New
type FunctionArgument = (Option<(ast::Location, Option<String>)>, ast::Expr, bool);
// <<<<

pub fn parse_args(func_args: Vec<FunctionArgument>) -> Result<ArgumentList, LexicalError> {
    let mut args = vec![];
    let mut keywords = vec![];
// <<<< New
    let len = func_args.len();
// <<<<
    let mut keyword_names = HashSet::with_capacity_and_hasher(func_args.len(), RandomState::new());
    for (name, value, parenthsized) in func_args {
        match name {
            Some((location, name)) => {
                ..
            }
            None => {
                ..

// <<<< New
                if let ast::ExprKind::GeneratorExp { elt: e, .. } = &value.node {
                    if len != 1 && !parenthsized {
                        return Err(LexicalError {
                            error: LexicalErrorType::OtherError(
                                "Generator expression must be parenthsized".to_string(),
                            ),
                            location: e.location,
                        });
                    }
                }
// <<<<
                args.push(value);
            }
        }
    }
    Ok(ArgumentList { args, keywords })
}

Question

Question is that, does it seem okay to add new attribute in lalrpop? Is there any other way to resolve it without adding new structure? any idea?

key262yek avatar Jul 21 '22 08:07 key262yek

More above than ArgumentList.

cpython:

>>> class A(x for x in range(10)): pass
  File "<stdin>", line 1
    class A(x for x in range(10)): pass
           ^
SyntaxError: expected ':'

RustPython:

>>>>> class A(x for x in range(10)): pass
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

This is SyntaxError but RustPython is not catching it.

In CPython, this is handled by t_primary: https://github.com/python/cpython/blob/3.10/Grammar/python.gram#L823-L824

which not used for class definition. (class_def is using arguments https://github.com/python/cpython/blob/3.10/Grammar/python.gram#L489-L495)

My suggestion is creating one more layer like cpython.

youknowone avatar Jul 23 '22 06:07 youknowone