commonmark-spec
Tabs and indentation removal
I think it would be nice to clarify the interplay between tabs and the indentation removal in fenced code blocks. What exactly gets removed?
Example:
⋅⋅```
→→foo
⋅→⋅bar
⋅⋅→baz
⋅```
The indentation of the opening fence suggests removing two leading spaces from each content line, but doing that on a line that begins with tabs implies that those tabs first have to be expanded to spaces. The interactive dingus currently does not do that. Should it? Either way, I think the spec should make this case more explicit.
Yes, this is an issue we've discussed on talk.commonmark.org. I wholeheartedly agree that this needs clarification!
I have fixed jgm/cmark and jgm/commonmark.js to remove the two leading spaces. We still need to clarify the spec, and an example like this would be helpful.
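For readers following along, here is a minimal Python sketch of one way to read "remove the two leading spaces" when tabs are involved. It is not the code in jgm/cmark or jgm/commonmark.js; `strip_indent` is a hypothetical helper, and it assumes a tab stop of 4 and that a partly consumed tab is replaced by spaces for the columns left over.

```python
def strip_indent(line: str, columns: int, tab_stop: int = 4) -> str:
    """Remove up to `columns` columns of leading whitespace from `line`.

    Hypothetical helper (an assumption, not a cmark API): a tab advances
    to the next multiple of `tab_stop`; a tab that is only partly consumed
    is replaced by spaces for the columns that remain.
    """
    col = 0              # column reached so far
    remaining = columns  # columns still to remove
    for i, ch in enumerate(line):
        if remaining == 0:
            return line[i:]
        if ch == " ":
            width = 1
        elif ch == "\t":
            width = tab_stop - col % tab_stop
        else:
            return line[i:]  # line has less indentation than requested
        if width > remaining:
            # The tab is split: keep the part that was not consumed as spaces.
            return " " * (width - remaining) + line[i + 1:]
        col += width
        remaining -= width
    return ""  # the whole line was whitespace and was consumed


# The three content lines from the example, with a fence indented by 2.
for src in ["\t\tfoo", " \t bar", "  \tbaz"]:
    print(repr(strip_indent(src, 2)))
# prints:
# '  \tfoo'
# '   bar'
# '\tbaz'
```

Under this reading the first line keeps two spaces from the split tab, the second ends up with three spaces, and the third keeps its tab untouched because the two literal spaces already cover the fence indentation.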
Hi. About tabs, there is something else that is not clear in the spec (sorry if it was already mentioned in the forum). It says:
Tabs in lines are not expanded to [spaces]. However, in contexts where indentation is significant for the document's structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.
but in Example 5 (and the following Examples 6 and 7), the tab is actually replaced by space characters and even split (!)
I started my implementation without expanding tabs, but I'm not able to pass these tests.
Any ideas?
@xoofx in examples 5, 6, and 7, the tabs occur "in contexts where indentation is significant for the document's structure." In such contexts, tabs behave exactly as if they were expanded to spaces.
Thus, for example, in Example 5 the output is exactly the same as if you'd had eight spaces in place of the two tabs on the third line.
What's odd about these cases is that the final tab is "split." In Example 5, we need six spaces of indentation for a code block. That consumes the first tab and half of the second tab, so two spaces are left over and appear in the output. This might be a bit surprising, since the source document contains no spaces on that line. But I think, on balance, that this is probably the best behavior. The issue is discussed at length in this forum thread.
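The arithmetic behind the split tab can be written out as a small Python sketch. It assumes the third line of Example 5 is two tabs followed by `bar`, that the six columns break down as 2 for the list item content plus 4 for the indented code block, and a tab stop of 4; `leading_columns` is a hypothetical helper, not something taken from an implementation.

```python
def leading_columns(line: str, tab_stop: int = 4) -> int:
    """Width of the leading whitespace in columns (hypothetical helper)."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            col += tab_stop - col % tab_stop  # jump to the next tab stop
        else:
            break
    return col


line = "\t\tbar"               # assumed third line of Example 5
needed = 2 + 4                 # list content offset + code block indent
have = leading_columns(line)   # two tabs reach column 8
leftover = have - needed       # columns of the second tab left unconsumed
# lstrip is a shortcut that works here only because every leading
# character is either fully consumed or is the split tab.
content = " " * leftover + line.lstrip(" \t")
print(have, leftover, repr(content))  # prints: 8 2 '  bar'
```

The result matches what eight literal spaces would give: six are consumed as indentation and two remain in the code block content.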
thanks, good to know about this behavior.
What may be a little difficult to grasp from the spec is whether the term "space" should be understood as:
- relaxed: "tab or space with tabs expanded to space"
- strict: just a plain space.
In some cases, the term "space" is strict (as for the space after a heading #). I understand that the general rule is to read "space" as relaxed when indentation is involved, but that is not always obvious. For example, this passage from the spec:
Note that at least one space is needed between the list marker and any following content, so these are not list items:
is not easy to interpret, because the reader has to realize that it is part of an "indent rule".
+++ Alexandre Mutel [Feb 15 16 16:16 ]:
thanks, good to know about this behavior.
What may be a little difficult to grasp from the spec is whether the term "space" should be understood as:
- relaxed: "tab or space with tabs expanded to space"
- strict: just a plain space.
In some cases, the term "space" is strict (as for the space after a heading #). I understand that the general rule is to read "space" as relaxed when indentation is involved, but that is not always obvious. For example, this passage from the spec:
Note that at least one space is needed between the list marker and any following content, so these are not list items:
is not easy to interpret, because the reader has to realize that it is part of an "indent rule".
Yes, I think you're absolutely right about the current unclarity of the spec. It needs an overhaul with attention to this issue; I just haven't had time to do it.
@jgm no problem, I will try to come up with a list of potential issues as I have already started to list them