Toitdoc code-section parser needs improvement.

Open floitsch opened this issue 3 years ago • 0 comments

Currently the code-section parser doesn't strip leading whitespace. This means that many code-sections end up with indentation that shouldn't be there.

It also doesn't handle language identifiers correctly. In fact, it currently just takes the whole text between the back-ticks and stores it for the viewer. Interestingly, it does do the indentation checks, though. (So a code-section must not have lines that are indented less.)

Ideally the CodeSection class should have a language field:

class CodeSection : public Statement {
 public:
  explicit CodeSection(Symbol language, Symbol code)
      : _language(language), _code(code) { }
  IMPLEMENTS(CodeSection);

  Symbol language() const { return _language; }
  Symbol code() const { return _code; }

 private:
  Symbol _language;
  Symbol _code;
};

The code should be stored without indentation.

The easiest approach is probably to keep the parsing the way it's currently done, since that already keeps track of programs that aren't indented correctly, and then run over the text a second time, removing the indentation and extracting the language identifier. (It might make sense to extract the language identifier right at the beginning, though.)
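
As a rough illustration of that second pass, here is a minimal standalone sketch. It is not the actual Toit compiler code: `ParsedCodeSection`, `trim` and `split_and_dedent` are hypothetical names, it uses `std::string` instead of `Symbol`, and it assumes that the language identifier is whatever follows the opening back-ticks on the first line, and that spaces and tabs each count as one column of indentation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; the real class would store Symbols.
struct ParsedCodeSection {
  std::string language;  // Text on the first line, trimmed. May be empty.
  std::string code;      // Remaining lines with the common indentation removed.
};

// Removes leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string& s) {
  size_t first = s.find_first_not_of(" \t\r");
  if (first == std::string::npos) return "";
  size_t last = s.find_last_not_of(" \t\r");
  return s.substr(first, last - first + 1);
}

// `raw` is assumed to be the whole text between the opening and closing
// back-ticks, as the parser currently stores it.
ParsedCodeSection split_and_dedent(const std::string& raw) {
  ParsedCodeSection result;

  // Everything up to the first newline is treated as the language identifier.
  size_t newline = raw.find('\n');
  result.language = trim(raw.substr(0, newline));
  if (newline == std::string::npos) return result;  // No code lines at all.

  // Split the remaining text into lines.
  std::vector<std::string> lines;
  size_t start = newline + 1;
  while (true) {
    size_t end = raw.find('\n', start);
    if (end == std::string::npos) {
      lines.push_back(raw.substr(start));
      break;
    }
    lines.push_back(raw.substr(start, end - start));
    start = end + 1;
  }

  // Find the smallest indentation among the non-blank lines.
  size_t min_indent = std::string::npos;
  for (const std::string& line : lines) {
    size_t first = line.find_first_not_of(" \t");
    if (first == std::string::npos) continue;  // Blank lines don't count.
    min_indent = std::min(min_indent, first);
  }
  if (min_indent == std::string::npos) min_indent = 0;

  // Rebuild the code without the common indentation.
  // Whitespace-only lines become empty lines.
  for (size_t i = 0; i < lines.size(); i++) {
    if (i != 0) result.code += '\n';
    const std::string& line = lines[i];
    if (line.find_first_not_of(" \t") != std::string::npos) {
      result.code += line.substr(min_indent);
    }
  }
  return result;
}
```

Since the existing pass already rejects lines that are indented less than allowed, a real implementation could probably strip the known indentation of the section directly instead of recomputing the minimum as done here.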

Mar 15 '22 11:03 floitsch