Multiline string literal support

Open FilipJanitor opened this issue 6 years ago • 14 comments

What could certainly be done is implicit concatenation of adjacent single-line strings (on the same level of indentation) for readability.

val ss:String = "foo"
                "bar"
                "baz"

turns into val ss:String = "foobarbaz". These use escaped characters and everything else just like ordinary strings. It might be useful to add some special character at the end of the line, after the string, to signal that another part of the string is expected. This is how it looks with the example above:

val ss:String = "foo"\
                "bar"\
                "baz"

The next level of this is a true multiline string, which does not require double quotes on each line. A decision needs to be made on whether this type of string literal will concatenate the lines or preserve the newlines. That is, whether this:

'''
foo
bar
baz
'''

translates into "foobarbaz" or "foo\nbar\nbaz". The logic behind the first one is just to split long lines to improve readability. The logic behind the second option is to allow pasting blocks of text "as is" without needing to add escaped newlines manually.
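
To make the two readings concrete, here is a small sketch in Python (used only for illustration, since Cobalt has no such literal yet). Python's triple-quoted strings preserve newlines, so the concatenating reading has to be emulated by joining the lines.

```python
# The block from the example, written as a Python triple-quoted string.
# The backslash after the opening quotes suppresses the first newline.
block = """\
foo
bar
baz
"""

# Reading 1: newlines only split the source for readability and are dropped.
joined = "".join(block.splitlines())

# Reading 2: newlines are part of the value
# (apart from the one just before the closing quotes).
preserved = block.rstrip("\n")

print(repr(joined))
print(repr(preserved))
```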

Both of these approaches have difficulty with the fact that Cobalt is indentation sensitive. If the literal is placed inside an indented block, either it has to be written without indentation so that it contains no leading whitespace, which breaks the prettiness of the code, or the user indents the contents manually to fit the overall code style, in which case the whitespace is preserved inside the string.

A possible (but fairly complex) solution is to skip the leading whitespace up to the indentation of the beginning of the string. This is difficult to implement and doesn't solve the problem of long lines. It is also prone to "nondeterministic behavior" if the user provides less or more leading whitespace.
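
As a rough illustration of that last concern, here is a minimal Python sketch of such a rule. The helper strip_to_column and its "remove at most N leading spaces" behaviour are assumptions made for this example, not something Cobalt implements.

```python
def strip_to_column(lines, column):
    # Hypothetical rule: remove at most `column` leading spaces from each line.
    result = []
    for line in lines:
        leading = len(line) - len(line.lstrip(" "))
        result.append(line[min(leading, column):])
    return "\n".join(result)


# Assume the literal opens at column 4. The second line is under-indented
# and the third is over-indented relative to that column.
body = ["    foo", "  bar", "      baz"]
stripped = strip_to_column(body, 4)
print(repr(stripped))
```

The under-indented line silently loses only the whitespace it has, while the over-indented line keeps its extra spaces, so the resulting value depends on exactly how the user indented each line.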

FilipJanitor avatar Mar 21 '18 15:03 FilipJanitor

Python solves the last problem with the standard library function textwrap.dedent.
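
A short sketch of how that looks in practice (the function name message is made up for the example). The literal is indented along with the surrounding code, and dedent removes the whitespace common to all lines while keeping relative indentation.

```python
import textwrap


def message():
    # The literal is indented to match the enclosing block.
    text = """\
        foo
          bar
        baz
        """
    # dedent strips the 8 spaces shared by every line; the extra 2 spaces
    # in front of "bar" are relative indentation and survive.
    return textwrap.dedent(text)


print(repr(message()))
```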

FilipJanitor avatar Mar 21 '18 15:03 FilipJanitor

Haskell does it even more nicely. It makes the user decide where the leading indentation ends and the actual string value starts: https://stackoverflow.com/a/22919011

FilipJanitor avatar Mar 21 '18 15:03 FilipJanitor

Bump! Do we have any ideas or preferences about this? I kind of like the very first example, as we can simply add a rule: adjacent string literals are concatenated at compilation time. This, however, does not fully solve the issue of multiline string literals, which is preserving newlines. C++ has the R"V0G0N( )V0G0N" syntax for raw strings, which preserves basically everything: indentation, newlines, etc. This is particularly useful if you write things like SQL queries or you want to avoid adding newlines manually. This approach might break our indentation.

FilipJanitor avatar May 09 '18 15:05 FilipJanitor

I like the first one, and it shouldn't be difficult. The thing I wonder is whether it is worth having that or the Haskell equivalent with a function. It depends on which one we find more readable, as both are good.

unlines [ "a"
        , "b"
        , "c"
        ]

I really like the third example. We should be able to do it using lineFold to remove the indented prefix.

Actually there's a big issue with the first example.

let x() = do 
    "x"
    "y"
    "z"

Is this representing an array or pushing values onto the stack?

Michael2109 avatar May 09 '18 16:05 Michael2109

I was not aware of the lineFold functionality. If it is possible to avoid some indentation ambiguity that way, that would be great. Regarding your example, I still don't fully understand the semantics, but what I would expect is something like this: if the right-hand side of the assignment is an expression, then its value is assigned (it is evaluated, of course; it is like a declaration). If we use the block assignment, meaning the right-hand side is a statement (possibly multiple statements) introduced by the do keyword and properly indented, it is like an inlined function body; it is not a declaration anymore. There are some statements computing some value, which needs to be passed out of the function body somehow, that is, returned. Otherwise I am not sure how to determine what a block of statements evaluates to.

FilipJanitor avatar May 09 '18 20:05 FilipJanitor

Yes, I think lineFold could be the solution, although I'm sure it removes the newline, so we would have to add it back in or adjust it slightly. The evaluation of a block of statements is the last value pushed onto the stack. In my example x() would return "z". Whether it is an expression or a statement, why would it matter in this case? As an example:

let x() = 10
let x() = do 
    println("Called x")
    10

In both cases using hs-java we will generate a method and then just push the particular things onto the stack. This would be the same if we generate a value.

let valName = 10
let valName = do
    println("Called x")
    10

This would be the same as in Java generating this.

private int valNameVal = 0;
private boolean valNameCreated = false;
private int valName(){
    if(!valNameCreated){
        System.out.println("Called x");
        valNameCreated = true;
        valNameVal = 10;
    }
    return valNameVal;
}

We can determine the value that is returned by knowing which values are pushed onto the stack. Java and C++ force you to return the value explicitly. As long as we ensure there is no risk of a different data type being pushed onto the stack last, it is fine.

E.g. This wouldn't be allowed.

let x = do
    if True
    then 
        println("Do something")
        "String"
    else 
        println("Do something else")
        10

Michael2109 avatar May 09 '18 21:05 Michael2109

So the block actually executes each time the value is requested? In the Java example, would the user call the function (they are all private)? Because if so, it looks a bit like a regular lambda that is automatically called without parentheses, with some help from syntactic sugar. Whether that is the case or it is the other way around, maybe the return keyword could be used too?

FilipJanitor avatar May 09 '18 21:05 FilipJanitor

Yes, the method gets called each time the value is requested. It stores whether the value has already been set. This could probably be improved a bit, but it's a basic example of what we will generate. They would all be private in this example, as by default we are making them private. The valName would always be private, but the method modifier may change depending on what the user sets. We can use the return keyword too. I'm not sure if we need it though. I recommend looking at how Scala deals with some of these cases. Here's an example: https://tpolecat.github.io/2014/05/09/return.html

I think it's more to do with control flow. No risk of code randomly returning when you don't expect it. Always reaching the end of a method, in comparison to stopping elsewhere. In terms of debugging it could be nicer.

Michael2109 avatar May 09 '18 21:05 Michael2109

OK, I see that the semantics are different from what I expected. Anyway, how does splitting expressions across multiple lines currently work? Because if it is possible, then the functionality of the first example can be achieved by just using some string concatenation operator. Apart from that, is there a similar issue with the third example? I agree that unlines is an interesting option too.

FilipJanitor avatar May 10 '18 12:05 FilipJanitor

Yes, a string concat operator would work. There's an issue for adding a ++ operator for arrays and strings. We would have to do some special stuff to make it work across multiple lines. I imagine it would be something to do with lineFold again, but inside the expression parser. I'm not 100% sure what this consists of.

let x() = do 
    "x" ++
    "y" ++
    "z"

The third example would be fine as we would be saying that if """ occurs then parse in a certain way until the next """ occurs. The difference is that this is one expression.

Michael2109 avatar May 10 '18 12:05 Michael2109

OK, and what if there is a long expression (inside some if, for example)? Can it be split across multiple lines? For example, I think in Python there is an implied line continuation inside parentheses or brackets, which makes things simpler.
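
For reference, a short Python sketch of that rule (variable names are made up for the example). Inside parentheses a newline does not end the expression, and adjacent string literals are joined at compile time, so the first example from this issue works without any operator.

```python
# Adjacent string literals inside parentheses are concatenated by the compiler.
ss = ("foo"
      "bar"
      "baz")

# A long condition can be split across lines the same way.
value = 7
in_range = (value > 0 and
            value < 10)

print(ss, in_range)
```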

FilipJanitor avatar May 10 '18 12:05 FilipJanitor

That is a good point. That could be a good way of dealing with it. Will need to put some more thought into this.

Michael2109 avatar May 10 '18 12:05 Michael2109

I quite like how Scala deals with multiline strings.

val x = """This
            |is multiline
            |text
           """.stripMargin

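For comparison with the other approaches in this thread, the margin-stripping idea (stripMargin in Scala's standard library) can be sketched in Python roughly as follows. The helper strip_margin is hypothetical and only approximates the Scala behaviour: on each line, leading blanks followed by the margin character are removed, and lines without the margin character are left untouched.

```python
import re


def strip_margin(text, margin="|"):
    # Hypothetical helper: drop leading spaces/tabs plus the margin character
    # at the start of each line; lines without the margin are unchanged.
    pattern = re.compile(r"^[ \t]*" + re.escape(margin), re.MULTILINE)
    return pattern.sub("", text)


text = strip_margin("""This
            |is multiline
            |text""")
print(repr(text))
```

The margin character makes the start of the actual content explicit, so the result does not depend on how deeply the literal is indented.
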
Michael2109 avatar Jun 25 '18 14:06 Michael2109

If I could add anything: JavaScript's

let string = `
 a
 b
 c
`;

is quite great! :D

beProsto avatar Apr 10 '22 16:04 beProsto