Roadmap to v1.0

Open jakobnissen opened this issue 3 years ago • 6 comments

Automa has been at the core of BioJulia for around four years, so it's probably about time it got the "finally stable" badge and a shiny 1.0 release. In particular, I'm worried that too few people are really "into" Automa, and that, if we lose interest or get too busy to work on BioJulia, the knowledge necessary to use Automa effectively will be lost.

At its core, a 1.0 release means that we settle on an interface. Hence, this issue is about how Automa should feel, not how it should work. For this part, I would appreciate getting feedback from as many users as possible: What did you find hardest about Automa? How could it be made nicer to use? Could the interface be simplified?

A few suggestions:

  • [x] CodeGenContext should be optional, because people shouldn't need to go deep into Automa to be able to use it. It should default to something equivalent to CodeGenContext(:goto) or maybe just CodeGenContext()
  • [x] Perhaps p_eof = sizeof(mem); p_end = sizeof(mem) should be included in gen_init_code() to make it easier?
  • [ ] It's a little hard to generate an Automa reader. If possible, streamline that process and document it. Add good error handling.
  • [x] Add the error handling in #64 when an unexpected input is seen. Allow users to paste this generated code into their functions.
  • [x] Add this error handling to default implementations like generate_validator_function
  • [ ] Remove rewrite_special_macros. This is too magical.

There are a few non-API things I would like to have done before 1.0, but they are not critical for the release.

  • [x] Merge :goto and :simd generator, optionally improving the underlying codegen (#53)
  • [ ] Improve docs with examples (which implies we need to have nailed the interface before then).
  • [ ] Docs: Explain very clearly that Automa is byte-oriented
  • [ ] Docs: Advise using function calls as Expr objects, explain why, and amend the examples
  • [x] Add extra debugging capabilities and general quality-of-life improvements (#64)
  • [x] Enforce that the symbols in the Dict passed to generate_exec_code are the same as those in the Machine.

I'm envisioning tagging v1.0 some time before the end of the northern hemisphere summer of 2022.

jakobnissen avatar Mar 21 '21 19:03 jakobnissen

Great idea! Timeline is not overly ambitious, which seems good. I think that will be enough time and motivation for me to finally make that genbank parser, which will give me a much better sense of what the pain points are.

kescobo avatar Mar 21 '21 21:03 kescobo

It's strange coming back to Automa after all of these years (#17) and seeing it still (#59) doesn't handle recursion. It seems such a massive missing chunk to me for an FSM parser. It would make Automa genuinely useful (to me at least) if it did for 1.0; as it is, I've had to roll my own (buggy) parser instead. I appreciate there doesn't seem to be much enthusiasm for that though, sadly, despite it being necessary for all phylogenetic formats.

richardreeve avatar Apr 07 '21 12:04 richardreeve

FSM parsers can't handle recursion by definition. I think the reason why no-one has done it is that PDAs are harder to build and especially to minimize than FSMs, and that Automa is currently an FSM package only, so the changes would need to go deep.
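To illustrate the point about recursion, here is a minimal sketch in plain Julia. It does not use any Automa API, and the function name `is_balanced` is hypothetical. It checks parenthesis nesting of the kind found in Newick-style tree strings. The check depends on an unbounded counter, which is state that a machine with a fixed, finite set of states cannot hold for arbitrary nesting depth.

```julia
# Hypothetical helper, not part of Automa: checks that parentheses nest correctly.
# `depth` can grow without bound, which is exactly the memory a finite-state
# machine lacks. A PDA would keep this information on its stack.
function is_balanced(data::AbstractString)
    depth = 0
    for byte in codeunits(data)          # byte-wise scan of the input
        if byte == UInt8('(')
            depth += 1
        elseif byte == UInt8(')')
            depth -= 1
            depth < 0 && return false    # closing parenthesis with no matching opener
        end
    end
    return depth == 0                    # every opener must have been closed
end

is_balanced("((A,B),(C,D));")  # true
is_balanced("((A,B),(C,D);")   # false
```

A single counter is enough here only because there is one bracket type. A format with several kinds of nested delimiters would need a real stack, which is what makes the PDA construction the hard part.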

jakobnissen avatar Apr 07 '21 13:04 jakobnissen

My mistake with the name. Still a shame though.

richardreeve avatar Apr 07 '21 17:04 richardreeve

I am also interested in PDAs. That could be a feature target for v2 or some v1.x, though. Regarding the syntax, what changes would be necessary to handle PDAs?

Azzaare avatar Apr 14 '21 03:04 Azzaare

@Azzaare for syntax, we can settle on whatever. I would recommend using a macro to parse a Backus-Naur expression. The hard thing is actually making the PDA.

jakobnissen avatar Apr 30 '21 09:04 jakobnissen