
Round-trip parsing and output generation causes difference in parsed result

Open frewsxcv opened this issue 8 years ago • 3 comments

Found this while fuzzing.

extern crate cssparser;

use cssparser::ToCss;

fn main() {
    let input = "\\\n3:\'\\\x0c";
    println!("input:\n\t{:?}", input);

    let mut parser_input = cssparser::ParserInput::new(input);
    let mut parser = cssparser::Parser::new(&mut parser_input);
    let tokens = parser
        .next_including_whitespace_and_comments()
        .into_iter()
        .collect::<Vec<_>>();
    println!("tokens:\n\t{:?}", tokens);

    let str2 = tokens.iter().map(|t| t.to_css_string()).collect::<String>();
    println!("tokens to string:\n\t{:?}", str2);

    let mut parser_input = cssparser::ParserInput::new(&str2);
    let mut parser = cssparser::Parser::new(&mut parser_input);
    let tokens2 = parser
        .next_including_whitespace_and_comments()
        .into_iter()
        .collect::<Vec<_>>();
    println!("tokens to string to tokens:\n\t{:?}", tokens2);
}
input:
	"\\\n3:\'\\\u{c}"
tokens:
	[Delim('\\')]
tokens to string:
	"\\"
tokens to string to tokens:
	[Ident("�")]

frewsxcv, Jun 25 '17 21:06

I was surprised to see only one token. That happens because next* is called only once, and the into_iter call is a method of Result, which yields at most one item. You likely want something like:

    let mut tokens = Vec::new();
    while let Ok(token) = parser.next_including_whitespace_and_comments() {
        tokens.push(token)
    }

… and similarly for tokens2. This makes assert_eq!(tokens, tokens2); pass.

SimonSapin, Jun 26 '17 10:06
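The point about Result::into_iter can be reproduced with the standard library alone. The sketch below is not cssparser code: next_token is a hypothetical stand-in for a parser's next* method, returning one char per call and Err(()) at the end of input. It contrasts collecting from a single Result with looping until the first Err.

```rust
// Hypothetical stand-in for a parser's `next*` method (not cssparser's API):
// returns Ok(char) until the input is exhausted, then Err(()).
fn next_token(chars: &mut std::str::Chars<'_>) -> Result<char, ()> {
    chars.next().ok_or(())
}

// Mirrors the original report: `into_iter` here is `Result::into_iter`,
// which yields the Ok value once (or nothing on Err), so at most one token.
fn collect_once(input: &str) -> Vec<char> {
    let mut chars = input.chars();
    next_token(&mut chars).into_iter().collect()
}

// Mirrors the suggested fix: keep calling until the first Err.
fn collect_all(input: &str) -> Vec<char> {
    let mut chars = input.chars();
    let mut tokens = Vec::new();
    while let Ok(token) = next_token(&mut chars) {
        tokens.push(token);
    }
    tokens
}

fn main() {
    println!("{:?}", collect_once("/~*"));
    println!("{:?}", collect_all("/~*"));
}
```

With the input "/~*", collect_once returns a single element while collect_all returns all three.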

Blah, you're right. I fixed that and here's a new issue:

extern crate cssparser;

use cssparser::ToCss;

fn main() {
    let input = "/~*3E833";
    println!("input:\n\t{:?}", input);

    let mut parser_input = cssparser::ParserInput::new(input);
    let mut parser = cssparser::Parser::new(&mut parser_input);
    let mut tokens = vec![];
    while let Ok(token) = parser.next_including_whitespace_and_comments() {
        tokens.push(token)
    }
    println!("tokens:\n\t{:?}", tokens);

    let str2 = tokens.iter().map(|t| t.to_css_string()).collect::<String>();
    println!("tokens to string:\n\t{:?}", str2);

    let mut parser_input = cssparser::ParserInput::new(&str2);
    let mut parser = cssparser::Parser::new(&mut parser_input);
    let mut tokens2 = vec![];
    while let Ok(token) = parser.next_including_whitespace_and_comments() {
        tokens2.push(token)
    }
    println!("tokens to string to tokens:\n\t{:?}", tokens2);
}
input:
	"/~*3E833"
tokens:
	[Delim('/'), Delim('~'), Delim('*'), Number { has_sign: false, value: inf, int_value: None }]
tokens to string:
	"/~*inf"
tokens to string to tokens:
	[Delim('/'), Delim('~'), Delim('*'), Ident("inf")]

Looks like it's struggling with big numbers: 3E833 overflows to inf, which serializes as inf and then re-parses as an identifier.

frewsxcv, Jun 27 '17 04:06
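The overflow in the output above can be reproduced without cssparser. This is a minimal sketch that assumes Rust's standard f32 parser and Display formatting behave like the tokenizer and serializer in this case; cssparser's own number handling is not used here, and parse_and_serialize is a hypothetical helper name.

```rust
// Hypothetical helper: parse a numeric literal as f32 with the standard
// library, then format it back with `Display`.
// Assumption: this stands in for tokenizing and serializing a Number token.
fn parse_and_serialize(literal: &str) -> String {
    let value: f32 = literal.parse().expect("valid float literal");
    format!("{}", value)
}

fn main() {
    // 3E833 is far beyond f32::MAX, so the parsed value is infinite and
    // its textual form no longer looks like a number.
    println!("{}", parse_and_serialize("3E833"));
}
```

The formatted result is the bare word inf, which a CSS tokenizer reads as an identifier rather than a number. That matches the Ident("inf") token in the second pass.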

I’ve filed #167. If you want to keep fuzzing in the meantime, consider skipping cases where the serialization contains inf.

SimonSapin, Jun 27 '17 08:06
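The suggested workaround could be sketched as a small guard in the fuzz target. The function name should_skip is hypothetical, and the check is deliberately coarse: it also skips inputs whose serialization contains inf for unrelated reasons, such as an identifier like infinite.

```rust
// Hypothetical guard for a fuzz target: returns true when the round-trip
// comparison should be skipped because the serialized form contains "inf".
// Deliberately coarse; it also matches identifiers such as "infinite".
fn should_skip(serialized: &str) -> bool {
    serialized.contains("inf")
}

fn main() {
    for case in ["/~*inf", "/~*3"] {
        println!("{:?} -> skip: {}", case, should_skip(case));
    }
}
```

In the fuzz loop, the guard would run on the "tokens to string" output before the second parse and the comparison of the two token lists.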