cobalt
Multiline string literal support
One thing we could certainly do is implicitly concatenate adjacent single-line strings (at the same level of indentation) for readability.
val ss:String = "foo"
"bar"
"baz"
turns into val ss:String = "foobarbaz". These use escape characters and everything else just like ordinary strings. It might be useful to add a special character after each string to signal that another part of the string follows. This is how the example above would look:
val ss:String = "foo"\
"bar"\
"baz"
The next level of this is a true multiline string, which does not require double quotes on each line. We need to decide whether this type of string literal concatenates the lines or preserves the newlines. That is, whether this:
'''
foo
bar
baz
'''
translates into "foobarbaz" or "foo\nbar\nbaz". The logic behind the first option is just to split long lines to improve readability. The logic behind the second is to allow pasting blocks of text "as is" without having to add escaped newlines manually.
Both of these approaches run into the fact that Cobalt is indentation sensitive. If the literal is placed inside an indented block, either its contents have to start at column zero so that they contain no leading whitespace, which breaks the look of the code, or the user indents the contents to fit the overall code style and that whitespace is preserved inside the string.
A possible (but fairly complex) solution is to skip leading whitespace up to the indentation level at which the string begins. This is difficult to implement and doesn't solve the problem of long lines. It is also prone to "nondeterministic behavior" if the user provides less or more leading whitespace than expected.
Python solves the last problem with the standard library function textwrap.dedent.
Haskell does it even more nicely: it makes the user decide where the leading indentation ends and the actual string value starts. https://stackoverflow.com/a/22919011
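For reference, this is roughly how the Python approach mentioned above behaves. textwrap.dedent removes the whitespace that is common to all lines, so the result depends on the least-indented line. The function name indented_literal is only for illustration.

```python
import textwrap


def indented_literal() -> str:
    # The literal is indented to match the surrounding code; the backslash
    # after the opening quotes suppresses the first newline.
    text = """\
        foo
          bar
        baz"""
    return textwrap.dedent(text)


if __name__ == "__main__":
    print(repr(indented_literal()))
    # One under-indented line changes the common margin for every line.
    print(repr(textwrap.dedent("        foo\n      bar")))
```

The second call shows the sensitivity discussed above: a single line with less indentation shifts what is stripped from all the others.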
Bump! Do we have any ideas or preferences about this? I kind of like the very first example, as we can simply add a rule: adjacent string literals are concatenated at compile time. This, however, does not fully solve the main issue of multiline string literals, which is preserving newlines. C++ has the R"V0G0N( )V0G0N" syntax for raw strings, which preserves basically everything: indentation, newlines, etc. This is particularly useful if you write things like SQL queries or want to avoid adding newlines manually. This approach might break our indentation.
I like the first one and it shouldn't be difficult. What I wonder is whether it is worth having that, or the Haskell equivalent with a function. It depends on which one we find more readable, as both are good.
unlines [ "a"
        , "b"
        , "c"
        ]
I really like the third example. We should be able to do it using lineFold to remove the indented prefix.
Actually there's a big issue with the first example.
let x() = do
    "x"
    "y"
    "z"
Is this representing an array or pushing values onto the stack?
I was not aware of the lineFold functionality. If it lets us avoid some indentation ambiguity, that would be great. Regarding your example, I still don't fully understand the semantics, but what I would expect is something like this: if the right-hand side of an assignment is an expression, then its value is assigned (it is evaluated, of course; it is like a declaration). If we use the block assignment, meaning the right-hand side is a statement (possibly multiple statements) introduced by the do keyword and properly indented, it is like an inlined function body and is not a declaration anymore. Some statements run and compute a value, which needs to be passed out of the function body somehow, i.e. returned; otherwise I am not sure how to determine what a block of statements evaluates to.
Yes, I think lineFold could be the solution, although I'm sure it removes the newline, so we would have to add it back in or adjust it slightly.
A block of statements evaluates to the last value pushed onto the stack. In my example, x() would return "z".
Why would it matter in this case whether it is an expression or a statement?
As an example.
let x() = 10
let x() = do
    println("Called x")
    10
In both cases, using hs-java, we will generate a method and then just push the particular values onto the stack.
This would be the same if we generate a value.
let valName = 10
let valName = do
    println("Called x")
    10
This would be the same as generating the following in Java.
private int valNameVal = 0;
private boolean valNameCreated = false;
private int valName() {
    if (!valNameCreated) {
        System.out.println("Called x");
        valNameCreated = true;
        valNameVal = 10;
    }
    return valNameVal;
}
We can determine the value that is returned by knowing what values are pushed onto the stack. Java and C++ force you to return the value explicitly. As long as we ensure there is no risk of a different data type being pushed onto the stack last, it is fine.
E.g. this wouldn't be allowed:
let x = do
    if True
        then
            println("Do something")
            "String"
        else
            println("Do something else")
            10
So the block actually executes each time the value is requested? In the Java example, would the user call the function (they are all private)? If so, it looks a bit like a regular lambda that is called automatically without parentheses, with the help of some syntactic sugar. Whether or not that is the case, maybe the return keyword could be used too?
Yes, the method gets called each time the value is requested. It stores whether the value has already been set. This could probably be improved a bit, but it's a basic example of what we will generate.
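To make the "called on every access, body runs once" behaviour concrete, here is a small Python sketch that mirrors the Java above. The class name Holder and the body_runs counter are hypothetical and exist only for illustration; this is not what hs-java would emit.

```python
class Holder:
    """Mirrors the generated Java: a cached value plus a 'created' flag."""

    def __init__(self) -> None:
        self._val_name_val = 0
        self._val_name_created = False
        self.body_runs = 0  # counts how often the block body executed

    def val_name(self) -> int:
        # The accessor runs on every request, but the block body
        # (the println and the final value) only runs the first time.
        if not self._val_name_created:
            self.body_runs += 1  # stands in for println("Called x")
            self._val_name_created = True
            self._val_name_val = 10
        return self._val_name_val


if __name__ == "__main__":
    h = Holder()
    print(h.val_name(), h.val_name(), h.body_runs)
```

Requesting the value twice returns 10 both times while the body has only executed once.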
They would all be private in this example, as we make them private by default. The valName would always be private, but the method modifier may change depending on what the user sets.
We can use the return keyword too. I'm not sure if we need it though.
I recommend looking at how Scala deals with some of these cases. Here's an example.
https://tpolecat.github.io/2014/05/09/return.html
I think it's more to do with control flow: there is no risk of code returning when you don't expect it, and execution always reaches the end of a method rather than stopping elsewhere. In terms of debugging it could be nicer.
OK, I see that the semantics are different from what I expected. Anyway, how does splitting expressions across multiple lines currently work? If it is possible, then the functionality of the first example can be achieved just by using a string concatenation operator.
Apart from that, is there some similar issue with the third example? I agree that unlines is an interesting option too.
Yes, a string concat operator would work. There's an issue for adding a ++ operator for arrays and strings.
We would have to do some special handling to make it work across multiple lines. I imagine it would involve lineFold again, but inside the expression parser. I'm not 100% sure what this consists of.
let x() = do
    "x" ++
    "y" ++
    "z"
The third example would be fine, as we would be saying that if """ occurs, we parse in a certain way until the next """ occurs. The difference is that this is one expression.
OK, and what if there is a long expression (inside an if, for example): can it be split across multiple lines? For example, I think in Python there is implied line continuation inside parentheses or brackets, which makes things simpler.
That is a good point. That could be a good way of dealing with it. Will need to put some more thought into this.
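For reference, this is the Python behaviour in question. Inside parentheses or brackets a newline does not end the expression, and adjacent string literals are concatenated at compile time, so the two features combine to give the first example from this thread without any extra syntax. The variable names are just for illustration.

```python
# Adjacent literals inside parentheses: no operator or backslash needed.
ss = (
    "foo"
    "bar"
    "baz"
)

# Any long expression can be split the same way.
total = (1 +
         2 +
         3)

if __name__ == "__main__":
    print(ss, total)
```

Something similar in Cobalt would sidestep the indentation question, since layout inside the brackets would simply be ignored.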
I quite like how Scala deals with multiline strings.
val x = """This
          |is multiline
          |text
          """.stripMargin
If I could add anything; JavaScript's
let string = `
a
b
c
`;
is quite great! :D
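Coming back to the Scala margin syntax a few comments up: in the Scala standard library the method is called stripMargin, and its idea is easy to emulate. Below is a minimal Python sketch, assuming the rule "drop leading spaces or tabs up to and including the margin character, and leave lines without a margin untouched". The function strip_margin is hypothetical and does not claim to match Scala's implementation in every detail.

```python
def strip_margin(text: str, margin: str = "|") -> str:
    """Remove leading whitespace up to and including the margin marker."""
    out = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip(" \t")
        if stripped.startswith(margin):
            # Everything after the marker is the real content of the line.
            out.append(stripped[len(margin):])
        else:
            # Lines without a marker are kept exactly as written.
            out.append(line)
    return "".join(out)


if __name__ == "__main__":
    raw = "This\n        |is multiline\n        |text"
    print(repr(strip_margin(raw)))
```

The appeal of this approach is that the user marks explicitly where the string content starts, so the result does not depend on how deeply the literal is indented.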