Attoparsec parsers

Open m4rw3r opened this issue 8 years ago • 6 comments

Attoparsec has a lot of good parsers and combinators; it would be a good idea to implement most, if not all, of them.

Data.Attoparsec.ByteString

Individual bytes

  • [x] word8 -> token

  • [x] anyWord8 -> any

  • [x] notWord8 -> not_token

  • [x] satisfy

  • [x] satisfyWith -> satisfy_with

  • [x] ~~skip~~

    satisfy optimizes into skip.

Lookahead

  • [x] peekWord8 -> peek
  • [x] peekWord8' -> peek_next

Byte classes

  • [ ] inClass
  • [ ] notInClass

Efficient string handling

  • [x] string

  • [x] skipWhile -> skip_while

    takeWhile optimizes into skipWhile for simple Input types like slices.

  • [x] take

  • [x] scan

  • [x] runScanner -> run_scanner

  • [x] takeWhile -> take_while

  • [x] takeWhile1 -> take_while1

  • [x] takeTill -> take_till

Consume all remaining input

  • [x] takeByteString -> take_remainder

Combinators

  • [x] ~~try~~

    Redundant since Chomp backtracks automatically on combinators requiring backtracking.

  • [x] ~~<?>~~

    Redundant since map_err exists.

  • [x] choice

  • [x] count

  • [x] option

  • [x] many

  • [x] many1

  • [x] manyTill -> many_till

  • [x] sepBy -> sep_by

  • [x] sepBy1 -> sep_by1

  • [x] skipMany -> skip_many

  • [x] skipMany1 -> skip_many1

  • [x] eitherP -> either

  • [x] match -> matched_by

State observation

  • [x] endOfInput -> eof

Data.Attoparsec.ByteString.Char8

Special character parsers

Fast predicates

  • [x] isDigit -> ascii::is_digit
  • [ ] isAlpha_iso8859_15
  • [x] isAlpha_ascii -> ascii::is_alpha
  • [x] isSpace -> ascii::is_whitespace
  • [x] isHorizontalSpace -> ascii::is_horizontal_space
  • [x] isEndOfLine -> ascii::is_end_of_line

Efficient string handling

Numeric parsers

Data.Attoparsec.Combinator

Data.Attoparsec.Text

m4rw3r avatar Nov 29 '15 13:11 m4rw3r

I have somewhat of a need for choice. Since variadic functions aren't possible in Rust (yet), I'm wondering what the performance implications are of passing a slice of parsers. OTOH, macros could be used to sugar nested or calls.
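To illustrate the macro idea, here is a minimal sketch of how a list of alternatives could be expanded into nested or calls. It does not use Chomp's real types: `PResult`, `or`, `token`, and the `choice!` macro are all hypothetical stand-ins defined here only for illustration.

```rust
/// Toy result type (not Chomp's): Some((value, remaining input)) on success.
type PResult<'a, T> = Option<(T, &'a [u8])>;

/// Tries `f`; if it fails, retries `g` from the same starting position.
fn or<'a, T, F, G>(i: &'a [u8], f: F, g: G) -> PResult<'a, T>
where
    F: FnOnce(&'a [u8]) -> PResult<'a, T>,
    G: FnOnce(&'a [u8]) -> PResult<'a, T>,
{
    f(i).or_else(|| g(i))
}

/// Matches exactly the byte `t` at the start of the input.
fn token(i: &[u8], t: u8) -> PResult<'_, u8> {
    match i.split_first() {
        Some((&b, rest)) if b == t => Some((b, rest)),
        _ => None,
    }
}

/// Hypothetical `choice!` macro: expands a comma-separated list of parsers
/// into right-nested `or` calls, so no slice or boxing is needed.
macro_rules! choice {
    // Last alternative: just run it.
    ($i:expr; $p:expr) => {{
        let p = $p;
        p($i)
    }};
    // Otherwise: try the first parser, fall back to the remaining ones.
    ($i:expr; $p:expr, $($rest:expr),+) => {
        or($i, $p, |i| choice!(i; $($rest),+))
    };
}

/// Accepts a single `a`, `b`, or `c`.
fn a_b_or_c(i: &[u8]) -> PResult<'_, u8> {
    choice!(i; |i| token(i, b'a'), |i| token(i, b'b'), |i| token(i, b'c'))
}
```

Because the expansion happens at compile time, every branch is statically dispatched. The trade-off is that the set of alternatives must be known when the macro is invoked.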

dashed avatar Apr 15 '16 22:04 dashed

@dashed What kind of need? The reason I have not yet implemented choice is that I am unsure whether it should accept a list of function pointers or a list of closures. A list of function pointers has only one level of indirection from the original slice, compared to two for closures. Function pointers do not need to be boxed, but closures do, since they are dynamically sized.
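The two candidate signatures can be compared in a small sketch. This uses a toy parser representation rather than Chomp's actual types; `PResult`, `ParserFn`, `ParserBox`, `choice_fn`, `choice_boxed`, and the helpers are hypothetical names chosen for illustration.

```rust
/// Toy result type (not Chomp's): Some((value, remaining input)) on success.
type PResult<'a, T> = Option<(T, &'a [u8])>;

/// Plain function pointer: fixed size, stored directly in the slice.
type ParserFn = for<'a> fn(&'a [u8]) -> PResult<'a, u8>;

/// Boxed closure: the slice holds a pointer to a heap-allocated closure,
/// which is then called through a vtable.
type ParserBox = Box<dyn for<'a> Fn(&'a [u8]) -> PResult<'a, u8>>;

/// Matches exactly the byte `t` at the start of the input.
fn token(i: &[u8], t: u8) -> PResult<'_, u8> {
    match i.split_first() {
        Some((&b, rest)) if b == t => Some((b, rest)),
        _ => None,
    }
}

fn letter_a(i: &[u8]) -> PResult<'_, u8> { token(i, b'a') }
fn letter_b(i: &[u8]) -> PResult<'_, u8> { token(i, b'b') }

/// Returns the first success among the function pointers, each tried
/// from the same starting position.
fn choice_fn<'a>(i: &'a [u8], parsers: &[ParserFn]) -> PResult<'a, u8> {
    parsers.iter().find_map(|p| p(i))
}

/// Same strategy, but over boxed closures that may capture state.
fn choice_boxed<'a>(i: &'a [u8], parsers: &[ParserBox]) -> PResult<'a, u8> {
    parsers.iter().find_map(|p| p(i))
}

/// Helper that boxes a closure; the bound gives the closure its
/// higher-ranked signature.
fn boxed<F>(f: F) -> ParserBox
where
    F: for<'a> Fn(&'a [u8]) -> PResult<'a, u8> + 'static,
{
    Box::new(f)
}

/// Builds one boxed parser per byte; each closure captures its own byte,
/// which a plain function pointer cannot do.
fn any_of(bytes: &[u8]) -> Vec<ParserBox> {
    bytes.iter().map(|&t| boxed(move |i| token(i, t))).collect()
}
```

The function-pointer version avoids allocation entirely but cannot capture runtime state. The boxed version can, at the cost of one heap allocation per alternative and a dynamic call.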

As for using or, there is already sugar for this in the form of the <|> operator in the parse! macro. This is most likely the best solution if you have a static list of branches for the parser.

m4rw3r avatar Apr 16 '16 18:04 m4rw3r

I recently discovered the <|> operator, which seems to make things a bit nicer.

dashed avatar Apr 17 '16 10:04 dashed

@m4rw3r Do you know if there's a better way to do skip_many_till? Essentially a many_till that doesn't return anything.

dashed avatar Apr 17 '16 20:04 dashed

@dashed Implementing it properly would require some additional methods on the internal trait for the bounded combinators. But there is an easy way: implement a sink type implementing FromIterator which just discards all the data.
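A minimal sketch of such a sink follows. The type name `Discard` and the helper `count_consumed` are hypothetical. The sketch assumes many_till collects its items through FromIterator, so that requesting `Discard` as the output type would drop every parsed item. It is demonstrated here on a plain iterator rather than on a Chomp parser.

```rust
use std::iter::FromIterator;

/// Hypothetical sink type: accepts items of any type and keeps none of them.
#[derive(Debug, PartialEq, Eq)]
struct Discard;

impl<T> FromIterator<T> for Discard {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        // Drive the iterator to completion, dropping each item immediately.
        iter.into_iter().for_each(drop);
        Discard
    }
}

/// Shows that collecting into `Discard` still runs the whole iterator:
/// the side effect in `inspect` fires once per item.
fn count_consumed() -> usize {
    let mut seen = 0;
    let _: Discard = (0..5).inspect(|_| seen += 1).collect();
    seen
}
```

Since `Discard` is a zero-sized type, collecting into it allocates nothing. The parser producing the items still runs for every element, so only the storage cost is saved.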

m4rw3r avatar May 17 '16 20:05 m4rw3r

@m4rw3r Thanks for the suggestion! I'll try to investigate this approach.

dashed avatar May 24 '16 04:05 dashed