nom-tutorial
updates for WSL support
Hi,
I found this tutorial while looking for nom tutorials, and for Rust tutorials. This one seems to be the most up-to-date and well-written, and it is essentially my first try at Rust. So thank you!
I tried the tutorial on my WSL2 install (Ubuntu) on my Windows machine, but the parser fails at runtime because of this weirdness in /proc/mounts (pasted verbatim):
C:\134 /mnt/c 9p rw,dirsync,noatime,aname=drvfs;path=C:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=65536,trans=fd,rfd=8,wfd=8 0 0
You may understand why this fails at first glance, but I didn't, so I needed to figure it out (which makes for an even better tutorial, IMO).
This is how I solved it.
Like our escaped_space, we need a new function to handle \134 (it's a backslash, which is totally not confusing at all). I think it should display as just C:\:
fn windows_backslash(i: &str) -> nom::IResult<&str, &str> {
    nom::combinator::value("\\", nom::bytes::complete::tag("134"))(i)
}
I added this to the tuple passed into nom::branch::alt():
nom::branch::alt((
    escaped_backslash,
    windows_backslash,
    escaped_space
)),
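For readers who want to see the decoding rule in isolation, here is a minimal sketch using only the standard library rather than nom. The function name decode_mount_escapes is hypothetical and not part of the tutorial. It assumes the escapes are a backslash followed by exactly three octal digits (\040 for a space, \134 for a backslash), and, unlike the nom parser above, it keeps any other backslash literally instead of failing.

```rust
/// Decode three-digit octal escapes such as `\040` (space) and `\134`
/// (backslash) in a /proc/mounts field. Hypothetical helper, std only.
fn decode_mount_escapes(field: &str) -> String {
    let mut out = String::with_capacity(field.len());
    let mut rest = field;
    while let Some(pos) = rest.find('\\') {
        // Copy everything before the backslash unchanged.
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // Try to read exactly three octal digits that fit in one byte.
        let decoded = after
            .get(..3)
            .filter(|d| d.bytes().all(|b| (b'0'..=b'7').contains(&b)))
            .and_then(|d| u8::from_str_radix(d, 8).ok());
        match decoded {
            Some(byte) => {
                // Assumption: escaped bytes are ASCII, so one byte is one char.
                out.push(byte as char);
                rest = &after[3..];
            }
            None => {
                // Not an octal escape (for example `\;`): keep the backslash.
                out.push('\\');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}
```

Calling decode_mount_escapes on the device field C:\134 yields C:\, while a fragment such as path=C:\;uid=1000 passes through unchanged. Whether an unknown escape should be tolerated or rejected is a design choice; the nom version in this thread rejects it.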
This seemed to work OK, until I had to parse the mount options, which failed because \; is invalid (see the paste from /proc/mounts above). I added yet another parser, and added it to the nom::branch::alt call:
fn windows_options_backslash(i: &str) -> nom::IResult<&str, &str> {
    nom::combinator::value("\\;", nom::bytes::complete::tag(";"))(i)
}
and
nom::branch::alt((
    escaped_backslash,
    windows_backslash,
    escaped_space,
    windows_options_backslash,
)),
(I was not sure what to name either of these functions.)
At any rate, the tests I've written against the /proc/mounts entry pass (but I haven't had a chance to actually run it on my Windows box yet). Here's the one for parse_line() with different data:
#[test]
fn test_parse_line_wsl2() {
    // Expected result: the device's \134 escape is decoded to a single backslash.
    let mount3 = Mount {
        device: "C:\\".to_string(),
        mount_point: "/mnt/c".to_string(),
        file_system_type: "9p".to_string(),
        options: vec![
            "rw".to_string(),
            "dirsync".to_string(),
            "noatime".to_string(),
            "aname=drvfs;path=C:\\;uid=1000;gid=1000;symlinkroot=/mnt/".to_string(),
            "mmap".to_string(),
            "access=client".to_string(),
            "msize=65536".to_string(),
            "trans=fd".to_string(),
            "rfd=8".to_string(),
            "wfd=8".to_string(),
        ],
    };
    let (_, mount4) =
        parse_line("C:\\134 /mnt/c 9p rw,dirsync,noatime,aname=drvfs;path=C:\\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=65536,trans=fd,rfd=8,wfd=8 0 0").unwrap();
    assert_eq!(mount3.device, mount4.device);
    assert_eq!(mount3.mount_point, mount4.mount_point);
    assert_eq!(mount3.file_system_type, mount4.file_system_type);
    assert_eq!(mount3.options, mount4.options);
}
Note: I found that the following causes my test to break, because it's not returning the correct type of Err result (the test expects one from tag(), not char(), just like test_escaped_space()):
fn windows_options_backslash(i: &str) -> nom::IResult<&str, &str> {
    value("\\;", char(';'))(i)
}
The compiler didn't complain about this, which I found unusual, since it usually complains about everything. Assuming we're not deleting this parser, what would you have done? Update the unit test, write a new trait, etc.? I don't know what's idiomatic (yet).
I'd love if you could show how you may have tackled this problem. If you like, I can send a PR with changes for this environment, and we could also discuss the implementation that way. Or not!
Anyway, thanks again for this tutorial.
I am by no means an expert in what is the most idiomatic use of nom, but I'll give you my 2¢.
It appears that under Windows Subsystem for Linux the output of /proc/mounts is close to, but not quite the same as, that of a typical Linux system, with two major differences:
- Backslash is escaped as \134. I think your strategy of creating an additional windows_backslash combinator similar to the escaped_space combinator and adding it to the nom::branch::alt in escaped_transform is a good strategy.
- It looks like on WSL the mount options are separated by semicolons (;) instead of commas (,). Rather than creating another escape sequence for semicolons, it would probably make more sense to modify the mount_opts parser to separate mount options by commas or semicolons. This could be done by replacing nom::multi::separated_list(nom::character::complete::char(','), ...) with nom::multi::separated_list(nom::character::complete::one_of(",;"), ...).
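To make the effect of splitting on either separator concrete, here is a minimal sketch using only the standard library rather than nom. The function name split_mount_opts is hypothetical, and the sketch assumes that every comma or semicolon is a separator and that empty pieces can be dropped. It does not decode escapes.

```rust
/// Split a mount-options field on commas or semicolons.
/// Hypothetical helper, std only; escapes are left untouched.
fn split_mount_opts(opts: &str) -> Vec<String> {
    opts.split(|c: char| c == ',' || c == ';')
        // Drop empty pieces produced by doubled or trailing separators.
        .filter(|piece| !piece.is_empty())
        .map(String::from)
        .collect()
}
```

With this rule, the WSL fragment aname=drvfs;path=C:\;uid=1000 becomes three separate options (aname=drvfs, path=C:\ and uid=1000) instead of one, so a test that expects the whole semicolon-joined string as a single option would need to be updated.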
Hope this helps!
@benkay86 Thanks for the feedback. I guess I didn't realize the semicolon-delimited stuff was proper mount options (as /bin/mount would recognize them), and thought they were just some weird Windows-specific bunkum. But I suppose gid and uid are valid options!
Would you like me to send a PR? I'm probably not going to be the only person using WSL to go through this tutorial.
If you would like to send a PR I would be happy to test it on my native Linux system too.