ceylon-sdk
Add a common Markdown processor module
Add a new ceylon.markdown module which processes Common Markdown (http://commonmark.org/) and renders HTML, plain text, and possibly more outputs. The frontend and backend should be separate, with an SPI to allow other modules to provide custom backends (like man pages).
I'm not sure whether CommonMark covers this area, but for ceylondoc we need support for wiki links, so the module should be extensible in that way.
I'd like to see a DOM-like API so it's possible to parse markdown into a tree, manipulate that tree programmatically and then render it. That's the fundamental reason why the dist has two markdown processors -- the one used by ceylon doctool provides such an API, but the one used by ceylon doc does not.
By default, this should be CommonMark compliant. However, we will need options to:
- enable wiki links
- disable some HTML support (§4.6 – optimally, support individually turning each of the seven conditions on or off, and support white- or blacklists for conditions 6 and 7)
- disable images
- …
Also note that the Appendix describes a parsing strategy that we could use. (We might need a second, internal AST for Phase 1, which is then transformed into the shared AST for Phase 2.)
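The options proposed above could be grouped into a single settings object. The following is only a sketch in Java, not the module's API: `MarkdownOptions` and all of its method names are hypothetical, and the white- and blacklists for conditions 6 and 7 are left out. With nothing changed, the settings correspond to plain CommonMark.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical options holder; none of these names come from the module.
final class MarkdownOptions {
    // Wiki links are an extension, so they start disabled.
    private boolean wikiLinks = false;
    // Images are part of CommonMark, so they start enabled.
    private boolean images = true;
    // The seven HTML block start conditions of spec section 4.6, all on by default.
    private final Set<Integer> htmlBlockConditions =
            new TreeSet<>(Set.of(1, 2, 3, 4, 5, 6, 7));

    MarkdownOptions enableWikiLinks() {
        wikiLinks = true;
        return this;
    }

    MarkdownOptions disableImages() {
        images = false;
        return this;
    }

    MarkdownOptions disableHtmlBlockCondition(int condition) {
        if (condition < 1 || condition > 7) {
            throw new IllegalArgumentException("condition must be 1..7: " + condition);
        }
        htmlBlockConditions.remove(Integer.valueOf(condition));
        return this;
    }

    boolean wikiLinks() {
        return wikiLinks;
    }

    boolean images() {
        return images;
    }

    boolean htmlBlockConditionEnabled(int condition) {
        return htmlBlockConditions.contains(condition);
    }
}
```

A documentation tool might then ask for `new MarkdownOptions().enableWikiLinks().disableHtmlBlockCondition(7)`, while a strict CommonMark caller would pass the defaults unchanged.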
This module can be used by:
- the ceylon doc tool
- Ceylon IDE (both) to show the documentation tooltip when hovering over a type
- Herd, when showing the details of a module (when Herd is rewritten in Ceylon)
I've been looking into this as a GSoC project and I would like to know where to begin. I was wondering if I have to write my own parser to build an AST? or is there any other way?
I believe the commonmark spec outlines a parsing strategy, but you'd have to write a parser manually.
I have started working on a parser based on the parsing strategy outlined in the commonmark spec. I was wondering what @FroMage meant by frontend and backend in this case. Could anyone please elaborate?
Frontend: Markdown to some form of AST. Backend: AST to HTML, or to a man page (nroff), or to PDF… If the components are separate, the possibilities are endless :)
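To make the split concrete, here is a rough sketch in Java (the module itself would be Ceylon). Everything in it is an assumption made for illustration: the `Node` shape, the `Renderer` interface standing in for the SPI, and the node type names. The frontend is not shown; a hand-built tree takes its place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared AST node: a type tag, optional literal text, and children.
final class Node {
    final String type;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String type, String text) {
        this.type = type;
        this.text = text;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }
}

// Hypothetical SPI: each backend turns the shared AST into one output format.
interface Renderer {
    String render(Node root);
}

final class HtmlRenderer implements Renderer {
    @Override
    public String render(Node node) {
        // Render the children first, then wrap them according to the node type.
        StringBuilder inner = new StringBuilder();
        for (Node child : node.children) {
            inner.append(render(child));
        }
        switch (node.type) {
            case "text":      return escape(node.text);
            case "emphasis":  return "<em>" + inner + "</em>";
            case "paragraph": return "<p>" + inner + "</p>\n";
            default:          return inner.toString(); // e.g. the "document" root
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

final class PlainTextRenderer implements Renderer {
    @Override
    public String render(Node node) {
        if (node.type.equals("text")) {
            return node.text;
        }
        // Markup is dropped; paragraphs end with a newline.
        StringBuilder out = new StringBuilder();
        for (Node child : node.children) {
            out.append(render(child));
        }
        if (node.type.equals("paragraph")) {
            out.append("\n");
        }
        return out.toString();
    }
}
```

Because both backends only see `Node`, a man page or PDF backend could be added by another module without touching the parser.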
I have submitted a draft of my GSoC proposal, which can be found here: https://gist.github.com/rohitmohan96/71451003ed758b791fc8. I would love to hear your feedback on it before I submit my final proposal.
I have 4 days to submit my final proposal so I would like to hear your feedback asap.
The proposal looks good to me.
Thanks. If anyone else has any input you can comment on the gist itself. I have also started working on the parser which I'll push sometime this week. So I'll update the proposal then.
So what's left to do? It would be nice to have this in the SDK and on herd. I'm currently bundling a custom build in the VSCode extension.
Sublists (https://github.com/rohitmohan96/ceylon.markdown/issues/1) don't work as intended yet, and I still have to add all the tests from the CommonMark spec. I'll fix them as soon as I get time.
Can this be released for 1.3.1, with sublists being a known bug?
I went back to the drawing board on this and did a direct port from commonmark.js: https://github.com/CPColin/ceylon.markdown
I'm not sure if we can use this as a starting point for the module we need here, but it does pass all the CommonMark compliance tests! It'd need some extensions to its functionality, to support some of the use cases mentioned in this issue and to render ceylon-lang.org (which was why I started working on it in the first place).
What do you all think?
> I'm not sure if we can use this as a starting point for the module we need here
Why would it not be?
It's a rhetorical device to soften the impact of my going back to the drawing board, instead of continuing Rohit's work (and borrowing the strategy of fetching the spec tests directly from the CommonMark repository).
OK.
I was able to extend the parser to parse wiki-style links and made a proof of concept for an AST transformation phase in the last comment on CPColin/ceylon.markdown#10, so we're getting close here! As far as I can tell, it shouldn't be too tough to do a similar transformation in the IDE plugins, based on what I saw in their code.
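For readers who have not seen the linked comment, an AST transformation phase for wiki links might look roughly like the sketch below. It is written in Java for illustration and is not the code from CPColin/ceylon.markdown#10: `AstNode`, `WikiLinkResolver`, the node type names, and the resolver callback are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical mutable AST node, so that a pass can rewrite nodes in place.
final class AstNode {
    String type;          // e.g. "document", "text", "wiki_link", "link"
    String literal;       // text content, or the target name of a wiki link
    String destination;   // URL, set on "link" nodes only
    final List<AstNode> children = new ArrayList<>();

    AstNode(String type, String literal) {
        this.type = type;
        this.literal = literal;
    }
}

final class WikiLinkResolver {
    // Walks the tree and rewrites each "wiki_link" node. urlFor maps a target
    // name to a URL, or returns null when the name cannot be resolved.
    static void resolve(AstNode node, Function<String, String> urlFor) {
        if (node.type.equals("wiki_link")) {
            String url = urlFor.apply(node.literal);
            if (url != null) {
                // Resolved: become an ordinary link whose label is the name.
                node.type = "link";
                node.destination = url;
                node.children.add(new AstNode("text", node.literal));
                node.literal = null;
            } else {
                // Unresolved: fall back to plain text.
                node.type = "text";
            }
            return;
        }
        for (AstNode child : node.children) {
            resolve(child, urlFor);
        }
    }
}
```

The parser only has to emit `wiki_link` nodes; each consumer (doc tool, IDE plugin) supplies its own `urlFor`, after which any backend sees ordinary links.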
In CPColin/ceylon.markdown#5 I have a checklist going for things I need to do in order to make the new module official. What else can I add to the list? Thanks!
@CPColin how is this work going?
The module should be functionally complete and ready to go, but I need a hand in the issue I linked in my last comment to figure out what changes are needed to make it official and get it into the Herd.
OK, thanks
So what strategy should I take here? Should we try to get my code into the official Ceylon repo? Or should we get the module into the Herd and keep the code hosted in my personal repo? (I'm not sure how this sort of thing normally works, heh!)
Well I guess I think it might be nice to publish a release to Herd so folks can try it out and give feedback first before merging it into the SDK...
Okay, thanks. I'll get moving with claiming the module name in the Herd. Is there a particular license you'd prefer me to use? I don't have a preference, so whatever fits in with the rest of Ceylon is fine by me!
Edit: Never mind. I see that the Apache license is preferred. I'll get that sorted out now.
It's up! Thanks, everybody!
I started looking at using this module in the ceylon doc and ceylon doc-tool tools and suddenly realized we'd be setting up a bit of a nasty circular dependency. The source for those tools is written in Java and would now depend on the presence of several Ceylon modules in order to compile. Should we back away from this idea?
I do think the new module can be used in the IDE plugins, though, since those are written in Ceylon.
Hrm. OK, so maybe not such a great idea after all.