rune Testing against GNU Emacs

We are striving to be “bug compatible” with GNU Emacs, insofar as it makes sense to do so. We might break with some behavior if it is obscure enough, or if breaking with it adds enough value to justify it. But the right answer to “what should this function do?” is almost always “whatever GNU Emacs does”.

Given that, we would like to have a way to test against Emacs and compare behavior. This issue is to brainstorm the best way to do that.

Currently the plan is to create a new Rust binary that feeds a test file into both Rune and GNU Emacs and makes sure they produce the same output: if one throws an error, so must the other, and the result of each expression must be the same.
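A minimal sketch of what "the same output" could mean in code. The `Outcome` type and both function names are hypothetical, and the sketch assumes that two errors count as a match without comparing their messages, which the plan above does not specify.

```rust
/// Result of evaluating one expression in either implementation.
/// (Hypothetical type, not taken from the Rune source.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// Printed representation of the returned value.
    Value(String),
    /// The expression signaled an error; the payload is the error text.
    Error(String),
}

/// Decide whether two outcomes count as "the same".
pub fn outcomes_match(rune: &Outcome, emacs: &Outcome) -> bool {
    match (rune, emacs) {
        // Both returned a value: the printed forms must be identical.
        (Outcome::Value(a), Outcome::Value(b)) => a == b,
        // Assumption: it is enough that both fail; messages are not compared.
        (Outcome::Error(_), Outcome::Error(_)) => true,
        // One side returned and the other signaled an error.
        _ => false,
    }
}

/// Compare results pairwise and return the indices of mismatching expressions.
pub fn mismatches(rune: &[Outcome], emacs: &[Outcome]) -> Vec<usize> {
    let shared = rune.len().min(emacs.len());
    let mut bad: Vec<usize> = (0..shared)
        .filter(|&i| !outcomes_match(&rune[i], &emacs[i]))
        .collect();
    // An expression that only one side produced a result for is also a mismatch.
    bad.extend(shared..rune.len().max(emacs.len()));
    bad
}
```

Whether error messages should also be compared is a design choice; matching only on "both failed" is more forgiving while the two implementations still word their errors differently.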

This can be expanded to include fuzzing and property testing. Some code could parse each built-in function in Rune and extract its type signature. We could then test the arity, accepted types, and random values against GNU Emacs. This would help flush out edge cases and differences in behavior. It could also help us catch changes between major versions of GNU Emacs.

We could also create dedicated fuzzers for specific functionality. For example, we have some code to convert a Lisp regex to a Rust regex. We could send in random regexes and ensure that if Emacs considers one valid, it is also converted to a valid Rust regex. Another example is printing: ensuring that the printed representation of everything is the same.

Dec 06 '23 21:12 CeleritasCelery

> Currently the plan is to create a new rust binary that can feed a test file into Rune and GNU Emacs and make sure they get the same output. If one throws an error, so does the other. And the result of each expression is the same.

I'm thinking about this. I really like this GitHub Action to set up Emacs at a specific version. We can then prop test by generating some elisp that the emacs command on CI can run. There are different ways to do it, but probably the easiest is to generate an .el file and have Emacs run it with -q -x. We can generate an .el file from all the defuns we generate. It could get complicated fast if, say, defuns need to run in a specific order to work correctly. In that case I would imagine GNU Emacs would fail as well, so the output would still be consistent with Rune.
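A rough sketch of generating such an .el file. Each form is wrapped in `condition-case` so that an error prints a marker instead of aborting the script. The helper names are hypothetical, and the sketch assumes Rune supports `condition-case`, `prin1`, and `terpri` well enough to run the same file. The command uses `emacs -Q --batch -l FILE`, which are standard Emacs options, rather than the `-q -x` combination mentioned above.

```rust
use std::process::Command;

/// Wrap one elisp form so it prints either its value or `(error SYMBOL)`,
/// followed by a newline. One output line per form keeps results easy to diff.
pub fn wrap_form(form: &str) -> String {
    format!(
        "(condition-case err\n    (prin1 {})\n  (error (prin1 (list 'error (car err)))))\n(terpri)\n",
        form
    )
}

/// Build the full script text from a list of forms.
pub fn build_script(forms: &[&str]) -> String {
    forms.iter().map(|f| wrap_form(f)).collect()
}

/// Prepare (but do not run) an Emacs batch invocation for a script on disk.
pub fn emacs_command(script_path: &str) -> Command {
    let mut cmd = Command::new("emacs");
    // -Q skips init files, --batch runs without a UI, -l loads the file.
    cmd.args(&["-Q", "--batch", "-l", script_path]);
    cmd
}
```

In batch mode `prin1` writes to standard output, so the caller could capture stdout from both programs and compare it line by line.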

Dec 07 '23 08:12 Qkessler

@Ki11erRabbit As you have seen with string-version-lessp (#78), it can be hard to match the behavior of GNU Emacs when porting code, even when you have the source code in front of you. The idea in this issue is to create a separate utility to compare Rune and GNU Emacs with property testing to help flush out issues. I have started working on this with a little CLI tool here. All it does right now is check whether the same functions exist, but we could expand it to generate input to the functions and compare the outputs. I have started writing it in Rust with proptest, but we could in theory write it in any other language (like Python with hypothesis).

If you are interested, this might be something you could tackle, because you have already seen how hard it is to get the functionality right. I have hit many behavior mismatches before, but usually they are deep in some other code and it takes me hours to find the source. With this utility we could fuzz new functions as they are created and hopefully make the quality much higher and bugs easier to flush out. I am sure our current defuns are loaded with issues that we just haven't seen yet. Let me know what you think.

Jun 06 '24 20:06 CeleritasCelery

> If you are interested, this might be something you could tackle, because you have already seen how hard it is to get the functionality right. I have hit many behavior mismatches before, but usually they are deep in some other code and it takes me hours to find the source. With this utility we could fuzz new functions as they are created and hopefully make the quality much higher and bugs easier to flush out. I am sure our current defuns are loaded with issues that we just haven't seen yet. Let me know what you think.

I wouldn't mind helping, although I will be very busy the next 2 weeks or so. But after that I should be able to help with this.

Jun 06 '24 21:06 Ki11erRabbit

I am now free to help. What would you like me to do @CeleritasCelery?

Jun 24 '24 22:06 Ki11erRabbit

It depends on what you want to do 😄. If working on this particular item interests you, then take a look here at what I have started with. If you want to start over, that is fine too. Essentially this code runs both Rune and GNU Emacs and compares the output. Right now it only checks whether they both have the same functions defined, but we can expand it to do more. Some next steps:

We need to extract more information about the function definitions from the Rust source, probably using syn. This includes the number of arguments, number of optional arguments, return type, and argument types.
Make sure that function arity matches between the two implementations (using func-arity).
Use a prop testing library to generate random inputs and ensure the functions return the same outputs. If we know the types a function expects, we can narrow down the types that we generate to test more interesting properties. For example, we could test a bunch of random strings with string-version-lessp to help catch any corner cases that we missed.
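As an illustration of the last step, here is a dependency-free sketch that generates random two-string calls to a function such as string-version-lessp. It uses a hand-rolled xorshift generator instead of proptest, purely to stay self-contained; the alphabet, the length limit, and all names are assumptions made for the example.

```rust
/// Tiny deterministic xorshift64 generator, so a failing case can be
/// replayed from its seed. A stand-in for a real prop testing library.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from a zero state.
        XorShift(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `0..n` (slightly biased, which is fine for a sketch).
    pub fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Characters chosen to exercise version-like strings plus the two
/// characters that need escaping inside an elisp string literal.
const ALPHABET: &[u8] = b"abcXYZ0129.-_ \"\\";

pub fn random_string(rng: &mut XorShift, max_len: usize) -> String {
    let len = rng.below(max_len + 1);
    (0..len)
        .map(|_| ALPHABET[rng.below(ALPHABET.len())] as char)
        .collect()
}

/// Render a Rust string as an elisp string literal.
pub fn elisp_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Generate `count` calls of the form `(FUNCTION "a" "b")`.
pub fn random_calls(function: &str, count: usize, seed: u64) -> Vec<String> {
    let mut rng = XorShift::new(seed);
    (0..count)
        .map(|_| {
            let a = random_string(&mut rng, 8);
            let b = random_string(&mut rng, 8);
            format!("({} {} {})", function, elisp_string(&a), elisp_string(&b))
        })
        .collect()
}
```

A real prop testing library would add shrinking, which this sketch does not attempt; the fixed seed only makes a failing run reproducible.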

Of course this all depends on whether this is something you want to work on. I think it would be a good task because it is open-ended and does not require a lot of context about the current system. But if there is something else you are more interested in, let me know.

Jun 25 '24 00:06 CeleritasCelery

Sure, I'll get to work on it. It should keep me busy.

Jun 25 '24 01:06 Ki11erRabbit

I have a few questions after thinking about the problem.

How aware of the types do we want the tester to be? If we just take an object, we don't actually know what the type is. It would be nice to have a list of possible types that a function could accept, even if the list is somewhat incomplete.

Should I make a type that represents a function and use that to generate arbitrary function calls that we could test against Emacs? If so it will rely on the above information to provide arbitrary input.

Jun 25 '24 18:06 Ki11erRabbit

> How aware of the types do we want the tester to be? If we just take an object, we don't actually know what the type is. It would be nice to have a list of possible types that a function could accept, even if the list is somewhat incomplete.

Agreed. The more specific the types, the more useful the input we can generate. Many builtin functions use specific types like &str and usize that we can extract, but for ones that take Object we don't have that info. We need to think of some way to include it. Maybe through a comment, annotation, or attribute?

> Should I make a type that represents a function and use that to generate arbitrary function calls that we could test against Emacs? If so it will rely on the above information to provide arbitrary input.

I think that is a good approach.
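One possible shape for such a type, as a sketch only. `ArgKind`, `FunctionSpec`, and `call_shapes` are hypothetical names, and the fixed sample value per kind is a placeholder for real random input. The method emits one call per valid argument count, so optional arguments get exercised too.

```rust
/// Kinds of argument the tester knows how to produce (hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgKind {
    Integer,
    Float,
    Str,
    Flag,
}

impl ArgKind {
    /// A fixed sample literal; a real tester would generate random values.
    fn sample(self) -> &'static str {
        match self {
            ArgKind::Integer => "0",
            ArgKind::Float => "0.0",
            ArgKind::Str => "\"\"",
            ArgKind::Flag => "t",
        }
    }
}

/// Description of one function under test.
pub struct FunctionSpec {
    pub name: String,
    pub required: Vec<ArgKind>,
    pub optional: Vec<ArgKind>,
}

impl FunctionSpec {
    /// One call for each valid argument count: required only, then
    /// required plus one optional argument, and so on.
    pub fn call_shapes(&self) -> Vec<String> {
        (0..=self.optional.len())
            .map(|n_opt| {
                let args: Vec<&str> = self
                    .required
                    .iter()
                    .chain(self.optional[..n_opt].iter())
                    .map(|kind| kind.sample())
                    .collect();
                if args.is_empty() {
                    format!("({})", self.name)
                } else {
                    format!("({} {})", self.name, args.join(" "))
                }
            })
            .collect()
    }
}
```

For a spec describing `substring` with one required string and two optional integers, this yields `(substring "")`, `(substring "" 0)`, and `(substring "" 0 0)`.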

Jun 25 '24 19:06 CeleritasCelery

While working on the tester I thought of a way to provide more type information.

I think that an attribute might be best if it can do these things:

State the positional arg (like an int). This way we can indicate multiple types for the same argument.
Give the type name. This is for convenience.
State whether or not the argument is optional.

I think that this would make parsing with syn much easier.

Jun 28 '24 03:06 Ki11erRabbit

Most of that information should already be there.

The Rust types should map fairly cleanly to the lisp types.

pub(crate) fn string_lessp<'ob>(
    string1: StringOrSymbol<'ob>,
    string2: StringOrSymbol<'ob>,
) -> Result<bool> {

This tells us that the argument type is a string or symbol (which means we can test a string against it)

pub(crate) fn less_than(number: Number, numbers: &[Number]) -> bool {

This tells us that the arguments are numbers (either int or float)

Optional arguments from lisp are Option in Rust.

pub(crate) fn require<'ob>(
    feature: &Rto<Gc<Symbol>>,
    filename: Option<&Rto<Gc<&LispString>>>,
    noerror: Option<()>,
    env: &mut Rt<Env>,
    cx: &'ob mut Context,
) -> Result<Symbol<'ob>> {

Here we know that feature is a symbol, filename is an optional string, and noerror is just optional (nil or t). This is the reason we use Option<()> for optionals instead of bool. It lets us distinguish between required boolean flags and optional lisp values.
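To make the mapping concrete, here is a string-based sketch that derives an arity from parameter types like the ones above. It is not how syn would do it. The sketch assumes that `Option<...>` parameters are optional, that a slice parameter such as `&[Number]` collects the remaining arguments, and that the `Env` and `Context` parameters are interpreter state rather than lisp arguments. All names are hypothetical.

```rust
/// Arity of a defun as seen from lisp.
#[derive(Debug, PartialEq, Eq)]
pub struct Arity {
    pub required: usize,
    pub optional: usize,
    pub rest: bool,
}

/// Classify each parameter type (given as source text) and count them.
pub fn arity_from_types(param_types: &[&str]) -> Arity {
    let mut arity = Arity { required: 0, optional: 0, rest: false };
    for ty in param_types {
        let ty = ty.trim();
        if ty.ends_with("Rt<Env>") || ty.ends_with("Context") {
            // Assumption: interpreter state, not visible as a lisp argument.
            continue;
        } else if ty.starts_with("Option<") {
            arity.optional += 1;
        } else if ty.starts_with("&[") {
            // Assumption: a slice collects all remaining arguments.
            arity.rest = true;
        } else {
            arity.required += 1;
        }
    }
    arity
}

impl Arity {
    /// Render in the `(MIN . MAX)` / `(MIN . many)` form that
    /// `func-arity` prints in GNU Emacs.
    pub fn as_func_arity(&self) -> String {
        if self.rest {
            format!("({} . many)", self.required)
        } else {
            format!("({} . {})", self.required, self.required + self.optional)
        }
    }
}
```

Applied to the `require` signature above this gives `(1 . 3)`, and to `less_than` it gives `(1 . many)`, which can then be compared with the output of func-arity on the Emacs side.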

Let me know if I am not understanding your question.

Jun 28 '24 19:06 CeleritasCelery

I think you are on point. Could you maybe make a list of all of the types and their equivalents in elisp?

Jun 29 '24 20:06 Ki11erRabbit

Sure thing.

| Rust Type | Elisp Type |
| --- | --- |
| usize | integer |
| i64 | integer |
| isize | integer |
| f64 | float |
| Number | integer or float |
| &str | string |
| StringOrSymbol | string |
| bool | `t` or `nil` |
| List | `nil` or cons |
| Function | function |
| Option<()> | `nil` or non-nil |
| ByteString | unibyte-string |
| LispVector | vector |
| LispHashTable | hash-table |
| Symbol | symbol |
| Cons | cons |
| Record | record |
| ByteFn | byte-code-function |
| SubrFn | subr |
| Buffer | buffer |
Some of these like string, integer, and float will be easiest to generate data for.
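The table could be encoded in the tester roughly like this. The function names are hypothetical, and stripping a trailing `<...>` to reach the base type name is a simplification that a syn-based parser would not need.

```rust
/// Look up the elisp type for a Rust parameter type written as source text.
/// Returns `None` for types the table does not cover (for example `Object`).
pub fn elisp_type_for(rust_type: &str) -> Option<&'static str> {
    let trimmed = rust_type.trim();
    // Drop a trailing generic or lifetime list such as `<'ob>`,
    // but keep `Option<()>` whole because it has its own table entry.
    let base = if trimmed == "Option<()>" {
        trimmed
    } else {
        trimmed.split('<').next().unwrap_or(trimmed)
    };
    let elisp = match base {
        "usize" | "i64" | "isize" => "integer",
        "f64" => "float",
        "Number" => "integer or float",
        "&str" | "StringOrSymbol" => "string",
        "bool" => "t or nil",
        "List" => "nil or cons",
        "Function" => "function",
        "Option<()>" => "nil or non-nil",
        "ByteString" => "unibyte-string",
        "LispVector" => "vector",
        "LispHashTable" => "hash-table",
        "Symbol" => "symbol",
        "Cons" => "cons",
        "Record" => "record",
        "ByteFn" => "byte-code-function",
        "SubrFn" => "subr",
        "Buffer" => "buffer",
        _ => return None,
    };
    Some(elisp)
}

/// The kinds singled out above as easiest to generate data for.
pub fn is_easy_to_generate(elisp_type: &str) -> bool {
    elisp_type == "string" || elisp_type == "integer" || elisp_type == "float"
}
```

Returning `None` for unknown types lets the tester skip a function, or fall back to fully random objects, instead of guessing.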

Jun 30 '24 03:06 CeleritasCelery

Thank you, that has been very helpful. Although, could we make an alias for Option<()>? It creates a slightly weird edge case in my code. It would also make the type much clearer.

Jun 30 '24 17:06 Ki11erRabbit

I am fine with that. What should we call the type?

Jun 30 '24 22:06 CeleritasCelery

> I am fine with that. What should we call the type?

I am thinking something like AnyOrNil or something along those lines.

Jun 30 '24 23:06 Ki11erRabbit

I added a type alias called OptionalFlag for that type.

Jul 01 '24 14:07 CeleritasCelery

> I added a type alias called OptionalFlag for that type.

Thank you

Jul 01 '24 16:07 Ki11erRabbit

I thought I would give an update. I have managed to get it to generate a very large test file that has random values. There are still some things to work out though.

I have one concern. I don't know how to handle randomly generated functions. Right now they have a random arity < 0 and return nil. I think that they should sometimes return something other than nil.

Jul 02 '24 03:07 Ki11erRabbit

That’s great to hear! Feel free to open a PR.

As far as functions go, I think we will need more info on what kind of function is needed. Otherwise you won’t actually be testing interesting properties of the defun. We could always just skip them for now. Maybe some attribute or comment could provide info on what kind of function to generate.

Jul 02 '24 04:07 CeleritasCelery

The only things left are to make it so that lists actually have elements in them, make a decent cmdline interface, and set up a test harness.

After I fix the list bug and give it a cmdline interface, should I submit a PR?

Jul 02 '24 05:07 Ki11erRabbit

Yes please!

Jul 02 '24 15:07 CeleritasCelery

I also thought of a way to solve the function arity issue. We could just make a type alias to Function that specifies the arity.
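However the expected arity ends up reaching the tester, the generator side could look something like the sketch below. `lambda_with_arity` is a hypothetical helper. It returns the list of its arguments (or `t` when there are none) so that the generated function does not always return nil.

```rust
/// Build an elisp lambda literal that takes exactly `arity` arguments.
pub fn lambda_with_arity(arity: usize) -> String {
    // Parameter names a0, a1, ... are arbitrary.
    let params: Vec<String> = (0..arity).map(|i| format!("a{}", i)).collect();
    let joined = params.join(" ");
    // `(list a0 ...)` is always non-nil when there is at least one argument;
    // with no arguments `(list)` would be nil, so return `t` instead.
    let body = if arity == 0 {
        "t".to_string()
    } else {
        format!("(list {})", joined)
    };
    format!("(lambda ({}) {})", joined, body)
}
```

For example, `lambda_with_arity(2)` produces `(lambda (a0 a1) (list a0 a1))` and `lambda_with_arity(0)` produces `(lambda () t)`.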

Jul 02 '24 15:07 Ki11erRabbit
