gosub-engine icon indicating copy to clipboard operation
gosub-engine copied to clipboard

Moved tokenizer.rs and tree_construction.rs tests into crates for issue #445

Open kaigidwani opened this issue 9 months ago • 27 comments

For issue #445, I have copied the tokenizer.rs over into crates/gosub_html5/src/tokenizer as test.rs. I have also moved the contents of tree_construction.rs onto the end of crates/gosub_html5/src/parser/tree_builder.rs. I made sure to add the module declarations and #[cfg(test)] above them.

kaigidwani avatar May 03 '24 18:05 kaigidwani

I'm seeing this in the CI logs too: error: couldn't read tests/tree_construction.rs: No such file or directory (os error 2) Any suggestions on how to fix this?

kaigidwani avatar May 05 '24 17:05 kaigidwani

Oh, yeah... I started the test run and then completely forgot about it... I'll take a look tomorrow.

Sharktheone avatar May 05 '24 21:05 Sharktheone

Okay, looks like, the old module definition is still there. Just delete this file: https://github.com/gosub-browser/gosub-engine/blob/c6e81c66e90e886bacf82eff35d44d8bd94da827/tests/lib.rs#L1-L5

and in Cargo.toml these lines

https://github.com/gosub-browser/gosub-engine/blob/c6e81c66e90e886bacf82eff35d44d8bd94da827/Cargo.toml#L21-L27

Sharktheone avatar May 06 '24 06:05 Sharktheone

Looks like I also had to do cargo add --path ./crates/gosub_testing --dev -p gosub_html5 to get some additional similar rust errors to go away, but then it seemed to mostly work (other than some tests failing because they cant seem to find the requisite data files). To rerun just the tests that arent working, rust was instructing me to run cargo test -p gosub_html5 --lib in case that helps speed up debugging

MoralCode avatar May 06 '24 13:05 MoralCode

looks like the testing data files for html5lib-tests also need to be moved since it seems like the test fixtures are looking for them in ./tests/data/html5lib-tests which is relative to the crate root (which got moved into the gosub_html5 crate as part of this PR.

MoralCode avatar May 06 '24 14:05 MoralCode

Okay, so let's just run CI again and see what fails...

Sharktheone avatar May 06 '24 18:05 Sharktheone

So, if you want to speed things up, I can push a commit to fix the rest of the checks. I don't know if you want that, since from as far as I know, this is in context of some sort of class.

Sharktheone avatar May 06 '24 18:05 Sharktheone

So, here is what is missing:

In crates/gosub_html5/Cargo.toml (just appended to the bottom)

[dev-dependencies]
test-case = "3.3.1"
gosub_testing = { path = "../gosub_testing", features = [] }

afterward, change the FIXTURE_ROOT in crates/gosub_testing/src/testing.rs to `../../tests/data/html5lib-tests

Here is the complete diff

diff --git a/Cargo.lock b/Cargo.lock
index 7d8f68a..0c729b3 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1324,9 +1324,11 @@ dependencies = [
  "derive_more",
  "gosub_css3",
  "gosub_shared",
+ "gosub_testing",
  "lazy_static",
  "log",
  "phf",
+ "test-case",
  "thiserror",
  "ureq",
  "url",
diff --git a/crates/gosub_html5/Cargo.toml b/crates/gosub_html5/Cargo.toml
index a06f337..22b9e39 100644
--- a/crates/gosub_html5/Cargo.toml
+++ b/crates/gosub_html5/Cargo.toml
@@ -20,3 +20,8 @@ log = { version = "0.4.21", features = [] }
 [target.'cfg(not(target_arch = "wasm32"))'.dependencies]
 ureq = "2.9.7"
 
+
+[dev-dependencies]
+test-case = "3.3.1"
+gosub_testing = { path = "../gosub_testing", features = [] }
+
diff --git a/crates/gosub_testing/src/testing.rs b/crates/gosub_testing/src/testing.rs
index e0deb4b..3bd5bd5 100644
--- a/crates/gosub_testing/src/testing.rs
+++ b/crates/gosub_testing/src/testing.rs
@@ -2,5 +2,5 @@
 pub mod tokenizer;
 pub mod tree_construction;
 
-pub const FIXTURE_ROOT: &str = "./tests/data/html5lib-tests";
+pub const FIXTURE_ROOT: &str = "../../tests/data/html5lib-tests";
 pub const TREE_CONSTRUCTION_PATH: &str = "tree-construction";

lastly, run cargo fmt --all in the root of the project.

You can push it in one commit if you want.

Sharktheone avatar May 06 '24 18:05 Sharktheone

So, if you want to speed things up, I can push a commit to fix the rest of the checks. I don't know if you want that, since from as far as I know, this is in context of some sort of class.

yep it is for a class! As the class TA i was planning to commit the rest of the fixes after the due date (maybe later this week) unless @kaigidwani wants to give it a shot.

MoralCode avatar May 06 '24 18:05 MoralCode

afterward, change the FIXTURE_ROOT in crates/gosub_testing/src/testing.rs to `../../tests/data/html5lib-tests

i ended up just moving the data files to the crate since it seems like FIXTURE_ROOT is relative to the crate anyway. Would you prefer one way or the other for solving this?

MoralCode avatar May 06 '24 18:05 MoralCode

Yes if possible I would like to give it a shot, but I wouldn't be able to get to this until Thursday, just because of how my schedule is working out with finals season. T-T I appreciate you guys letting me take on this issue even though you both clearly know how to do it easily haha

kaigidwani avatar May 06 '24 18:05 kaigidwani

afterward, change the FIXTURE_ROOT in crates/gosub_testing/src/testing.rs to `../../tests/data/html5lib-tests

i ended up just moving the data files to the crate since it seems like FIXTURE_ROOT is relative to the crate anyway. Would you prefer one way or the other for solving this?

I actually don't really care, but it would keep test-files in one directory, however as you said, the path is relative to the crate, which is a bit messy... Do it like you want. If you have the solution with moving the test files already, you can also use that.

Sharktheone avatar May 06 '24 19:05 Sharktheone

Yes if possible I would like to give it a shot, but I wouldn't be able to get to this until Thursday, just because of how my schedule is working out with finals season. T-T I appreciate you guys letting me take on this issue even though you both clearly know how to do it easily haha

Yeah, no problem, you can only learn things when making your hands dirty, and since you can't break anything, I don't see there any reason why not letting you (or others) doing that. When I think about my starts of coding, I don't think it was any better 😉

Sharktheone avatar May 06 '24 19:05 Sharktheone

Hey, sorry to go back on what I said, but my plans are getting more complicated and I'd like to take some time off before I have to start my Summer job. Since you already have that push would it be alright if you went through with it? Thanks so much for the help with this, I really appreciate it.

kaigidwani avatar May 10 '24 02:05 kaigidwani

Yup, I can do that

Sharktheone avatar May 10 '24 12:05 Sharktheone

@Sharktheone just pushed my commits to hopefully finish up this PR - currently rerunning the tests locally to double check they still work.

I figure since the tests and these data files are linked using a macro, it may be best to keep the data as close to the functions being tested as possible - similar to how it may be beneficial to keep the tests close to the code so that they are more likely to remain up to date.

MoralCode avatar May 10 '24 19:05 MoralCode

Ah, thanks, looks good

Sharktheone avatar May 10 '24 20:05 Sharktheone

Okay, somehow I can't push to the PR branch, neither over a new remote nor over the GitHub ref, maybe because it's the main branch?

So these are my changes to get also the benchmarks working

diff --git a/Cargo.lock b/Cargo.lock
index 0c729b3..e60b7be 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1321,6 +1321,7 @@ dependencies = [
 name = "gosub_html5"
 version = "0.1.0"
 dependencies = [
+ "criterion",
  "derive_more",
  "gosub_css3",
  "gosub_shared",
diff --git a/Cargo.toml b/Cargo.toml
index 8b5692d..087e20b 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -18,14 +18,6 @@ members = [
 [[example]]
 name = "html5-parser"
 
-[[bench]]
-name = "tokenizer"
-harness = false
-
-[[bench]]
-name = "tree_construction"
-harness = false
-
 [[bench]]
 name = "tree_iterator"
 harness = false
diff --git a/crates/gosub_html5/Cargo.toml b/crates/gosub_html5/Cargo.toml
index 9cfa992..8cb481e 100644
--- a/crates/gosub_html5/Cargo.toml
+++ b/crates/gosub_html5/Cargo.toml
@@ -23,4 +23,13 @@ ureq = "2.9.7"
 [dev-dependencies]
 gosub_testing = { path = "../gosub_testing" }
 test-case = "3.3.1"
+criterion = { version = "0.5.1", features = ["html_reports"] }
 
+
+[[bench]]
+name = "tokenizer"
+harness = false
+
+[[bench]]
+name = "tree_construction"
+harness = false
diff --git a/benches/tokenizer.rs b/crates/gosub_html5/benches/tokenizer.rs
similarity index 100%
rename from benches/tokenizer.rs
rename to crates/gosub_html5/benches/tokenizer.rs
diff --git a/benches/tree_construction.rs b/crates/gosub_html5/benches/tree_construction.rs
similarity index 100%
rename from benches/tree_construction.rs
rename to crates/gosub_html5/benches/tree_construction.rs

Sharktheone avatar May 10 '24 21:05 Sharktheone

There you go! TIL you can just copy paste the diff into a text file and git apply it

MoralCode avatar May 11 '24 00:05 MoralCode

Interesting... looks like it somehow didn't apply the

diff --git a/Cargo.toml b/Cargo.toml
index 8b5692d..087e20b 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -18,14 +18,6 @@ members = [
 [[example]]
 name = "html5-parser"
 
-[[bench]]
-name = "tokenizer"
-harness = false
-
-[[bench]]
-name = "tree_construction"
-harness = false
-
 [[bench]]
 name = "tree_iterator"
 harness = false

part from the diff

Sharktheone avatar May 11 '24 00:05 Sharktheone

Done

MoralCode avatar May 11 '24 22:05 MoralCode

Okay, looks good. Looks like the last problem is commit signing.

Sharktheone avatar May 11 '24 23:05 Sharktheone

Okay, looks good. Looks like the last problem is commit signing.

Is there a particular reason this is required? I couldn't see much in CONTRIBUTING.md and (as far as I've read at least) it seems like the linux kernel (and the OSS community in general) tends to prefer tag signing over commit signing

MoralCode avatar May 14 '24 02:05 MoralCode

Okay, looks good. Looks like the last problem is commit signing.

Is there a particular reason this is required? I couldn't see much in CONTRIBUTING.md and (as far as I've read at least) it seems like the linux kernel (and the OSS community in general) tends to prefer tag signing over commit signing

I don't know, why have set this up to block merging with unsigned commits, maybe @jaytaph can say anything about that.

This is how it looks on my side image

Sharktheone avatar May 14 '24 11:05 Sharktheone

it also seems like theres some issues with the check_authors CI job:

Screenshot_20240514_100523

Seems like it's:

  1. not detecting Kai as a contributor
  2. pulling Sharks name into the email address field for me
  3. failing to post a comment to the PR with the requested changes

MoralCode avatar May 14 '24 14:05 MoralCode

Hmm, that's interesting. I have no clue why that's happening... I guess, just add yourself to the AUTHORS file and call it a day?

Sharktheone avatar May 14 '24 21:05 Sharktheone

Don't worry about the authors file. The check isn't that good, and should be either be changed or even removed (and maybe there are other - better - actions available for this. I haven't done the research).

As far as signed commits: they are preferred to make sure that code committed is signed off by that specific user. Even though everybody can commit, we should make sure that it's committed by the same user. For instance, There could come PR that looks like (co-)authored by one of the core members even though it's written by "evil hacker" that includes a backdoor (fortunately, that never happens /s).

So with signed commits, we assure that that commit really comes from that specific user. This becomes more important with many contributers and PRs.

Btw, signed tags will be a thing later on as soon as we actually have taggable code :)

jaytaph avatar May 15 '24 10:05 jaytaph

finally figured out commit signing

MoralCode avatar May 27 '24 15:05 MoralCode

Okay, I'm able to merge this now! 🎉

Sharktheone avatar May 27 '24 15:05 Sharktheone