sublime-markdown-popups icon indicating copy to clipboard operation
sublime-markdown-popups copied to clipboard

Breaks with a comma character in the code block info string

Open rwols opened this issue 4 years ago • 1 comments

The following markdown breaks code blocks:

```foo,bar
hello
```

While I don't expect proper highlighting, it'd be nice if the block didn't break.

Found in this markdown:


```rust
alloc::string
```

```rust
pub struct String
```

---

A UTF-8 encoded, growable string.

The `String` type is the most common string type that has ownership over the
contents of the string. It has a close relationship with its borrowed
counterpart, the primitive [`str`](https://doc.rust-lang.org/nightly/std/primitive.str.html).

# Examples

You can create a `String` from a literal string with [`String::from`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.from):

```rust
let hello = String::from("Hello, world!");
```

You can append a [`char`](https://doc.rust-lang.org/nightly/std/primitive.char.html) to a `String` with the [`push`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.push) method, and
append a [`&str`](https://doc.rust-lang.org/nightly/std/primitive.str.html) with the [`push_str`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.push_str) method:

```rust
let mut hello = String::from("Hello, ");

hello.push(\'w\');
hello.push_str("orld!");
```

If you have a vector of UTF-8 bytes, you can create a `String` from it with
the [`from_utf8`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.from_utf8) method:

```rust
// some bytes, in a vector
let sparkle_heart = vec![240, 159, 146, 150];

// We know these bytes are valid, so we\'ll use `unwrap()`.
let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);
```

# UTF-8

`String`s are always valid UTF-8. This has a few implications, the first of
which is that if you need a non-UTF-8 string, consider [`OsString`](https://doc.rust-lang.org/nightly/std/ffi/struct.OsString.html). It is
similar, but without the UTF-8 constraint. The second implication is that
you cannot index into a `String`:

```compile_fail,E0277
let s = "hello";

println!("The first letter of s is {}", s[0]); // ERROR!!!
```

Indexing is intended to be a constant-time operation, but UTF-8 encoding
does not allow us to do this. Furthermore, it\'s not clear what sort of
thing the index should return: a byte, a codepoint, or a grapheme cluster.
The [`bytes`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.bytes) and [`chars`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.chars) methods return iterators over the first
two, respectively.

# Deref

`String`s implement [`Deref`](https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html)`<Target=str>`, and so inherit all of [`str`](https://doc.rust-lang.org/nightly/std/primitive.str.html)\'s
methods. In addition, this means that you can pass a `String` to a
function which takes a [`&str`](https://doc.rust-lang.org/nightly/std/primitive.str.html) by using an ampersand (`&`):

```rust
fn takes_str(s: &str) { }

let s = String::from("Hello");

takes_str(&s);
```

This will create a [`&str`](https://doc.rust-lang.org/nightly/std/primitive.str.html) from the `String` and pass it in. This
conversion is very inexpensive, and so generally, functions will accept
[`&str`](https://doc.rust-lang.org/nightly/std/primitive.str.html)s as arguments unless they need a `String` for some specific
reason.

In certain cases Rust doesn\'t have enough information to make this
conversion, known as [`Deref`](https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html) coercion. In the following example a string
slice [`&\'a str`](https://doc.rust-lang.org/nightly/std/primitive.str.html) implements the trait `TraitExample`, and the function
`example_func` takes anything that implements the trait. In this case Rust
would need to make two implicit conversions, which Rust doesn\'t have the
means to do. For that reason, the following example will not compile.

```compile_fail,E0277
trait TraitExample {}

impl<\'a> TraitExample for &\'a str {}

fn example_func<A: TraitExample>(example_arg: A) {}

let example_string = String::from("example_string");
example_func(&example_string);
```

There are two options that would work instead. The first would be to
change the line `example_func(&example_string);` to
`example_func(example_string.as_str());`, using the method [`as_str()`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.as_str)
to explicitly extract the string slice containing the string. The second
way changes `example_func(&example_string);` to
`example_func(&*example_string);`. In this case we are dereferencing a
`String` to a [`str`](https://doc.rust-lang.org/nightly/std/primitive.str.html), then referencing the [`str`](https://doc.rust-lang.org/nightly/std/primitive.str.html) back to
[`&str`](https://doc.rust-lang.org/nightly/std/primitive.str.html). The second way is more idiomatic, however both work to do the
conversion explicitly rather than relying on the implicit conversion.

# Representation

A `String` is made up of three components: a pointer to some bytes, a
length, and a capacity. The pointer points to an internal buffer `String`
uses to store its data. The length is the number of bytes currently stored
in the buffer, and the capacity is the size of the buffer in bytes. As such,
the length will always be less than or equal to the capacity.

This buffer is always stored on the heap.

You can look at these with the [`as_ptr`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.as_ptr), [`len`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.len), and [`capacity`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.capacity)
methods:

```rust
use std::mem;

let story = String::from("Once upon a time...");

// Prevent automatically dropping the String\'s data
let mut story = mem::ManuallyDrop::new(story);

let ptr = story.as_mut_ptr();
let len = story.len();
let capacity = story.capacity();

// story has nineteen bytes
assert_eq!(19, len);

// We can re-build a String out of ptr, len, and capacity. This is all
// unsafe because we are responsible for making sure the components are
// valid:
let s = unsafe { String::from_raw_parts(ptr, len, capacity) } ;

assert_eq!(String::from("Once upon a time..."), s);
```

If a `String` has enough capacity, adding elements to it will not
re-allocate. For example, consider this program:

```rust
let mut s = String::new();

println!("{}", s.capacity());

for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
    }
```

This will output the following:

```text
0
5
10
20
20
40
```

At first, we have no memory allocated at all, but as we append to the
string, it increases its capacity appropriately. If we instead use the
[`with_capacity`](https://doc.rust-lang.org/nightly/alloc/string/struct.String.html#method.with_capacity) method to allocate the correct capacity initially:

```rust
let mut s = String::with_capacity(25);

println!("{}", s.capacity());

for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
    }
```

We end up with a different output:

```text
25
25
25
25
25
25
```

Here, there\'s no need to allocate more memory inside the loop.

From: https://github.com/rust-analyzer/rust-analyzer/issues/6655

rwols avatar Mar 06 '21 22:03 rwols

Not all parsers allow this, but I'll consider it. This would need to occur down in pymdown-extensions's code handling.

facelessuser avatar Mar 06 '21 23:03 facelessuser