osv.dev

test: add unit tests for markdown filter

Open Vasuk12 opened this issue 3 weeks ago • 12 comments

Added 6 unit tests for markdown() template filter covering:

  • Empty anchor tag removal (both regular and self-closing)
  • Anchor tags with multiple attributes
  • Preservation of valid links with content
  • URL sanitization and HTML comment escaping
  • Edge cases (None, empty string)
  • XSS prevention

Tests validate markdown rendering including anchor tag sanitization, URL transformation, and security measures.

Vasuk12 avatar Dec 03 '25 03:12 Vasuk12

I wasn’t able to run the tests locally because ndb.Client() is instantiated at import time in frontend_handlers, which requires real GCP credentials even when the Datastore emulator is active. The website emulator works since it creates the emulator context before the client, but the test suite imports the module earlier, so local execution fails.
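For illustration, one common way to avoid this kind of import-time failure is to build the client lazily on first use. The sketch below is only an assumption about how that could look: `FakeNdbClient` and `get_client` are hypothetical names standing in for the real client, and this is not how frontend_handlers is actually written.

```python
import functools


class FakeNdbClient:
    """Hypothetical stand-in for a client whose constructor needs credentials."""

    instances = 0  # counts constructions so the lazy behavior is observable

    def __init__(self):
        FakeNdbClient.instances += 1


@functools.lru_cache(maxsize=None)
def get_client() -> FakeNdbClient:
    """Create the client on first call instead of at module import time."""
    return FakeNdbClient()


# Importing this module constructs nothing; the first get_client() call does,
# and later calls reuse the same instance.
```

With this shape, a test module can import the handlers and set up the emulator context before anything tries to construct a client.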

Vasuk12 avatar Dec 03 '25 03:12 Vasuk12

Hey, have you tried using make run-website-emulator? It shouldn't need Datastore creds.

jess-lowe avatar Dec 03 '25 03:12 jess-lowe

This should be fixed soon by #4462 when running the run_tests script. For now, you could set the variable manually in your terminal session like so: GOOGLE_CLOUD_PROJECT=anything poetry run python frontend_handlers_test.py

Currently, these 3 tests are failing:

  • test_removes_empty_anchor_tags: AssertionError: 'name="x"' unexpectedly found in '<p>&lt;a name="x" id="y" class="z"&gt;&lt;/a&gt;</p>\n'
  • test_removes_anchor_with_multiple_attributes: AssertionError: 'name="test"' unexpectedly found in '<p>Text &lt;a name="test"&gt;&lt;/a&gt; &lt;a name="foo"/&gt; more</p>\n'
  • test_sanitizes_urls_and_escapes_comments: AssertionError: '/+/' not found in '<p>&lt;a href="http://ex.com/ /branch"&gt;x&lt;/a&gt;&lt;!-- comment --&gt;</p>\n'

For the first two, it seems the anchor tags are being escaped rather than removed. For the third, it seems the URL replacement isn't being applied to this input; maybe double-check how this input flows through the function.

ashmod avatar Dec 03 '25 10:12 ashmod

The regex I previously had was fine for raw HTML, but the data in details is passed through markdown2 with safe_mode='escape', so by the time it reaches the template the <a …> tags are already escaped to &lt;a …&gt;. That’s why the old regex never matched in the unit test or the GHSA sample. I have now adjusted the regex to match the escaped form; the anchors disappear from the output, and the test suite passes with the real data. I will update PR #4431 with the new regex pattern and add the updated tests in #4460 as well.
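A rough sketch of the difference, using the rendered string from the failing test output above. Both patterns and the `strip_empty_anchors` helper are hypothetical and written only for illustration; the actual pattern is the one in PR #4431.

```python
import re

# Hypothetical pattern for raw HTML: <a ...></a> or <a .../>.
RAW_EMPTY_ANCHOR = re.compile(r'<a\s+[^<>]*?(?:/>|>\s*</a>)')

# Hypothetical pattern for the escaped form: &lt;a ...&gt;&lt;/a&gt; or
# &lt;a .../&gt;. The attribute part may not contain another &lt; or &gt;.
ESCAPED_EMPTY_ANCHOR = re.compile(
    r'&lt;a\s+(?:(?!&[lg]t;).)*?(?:/&gt;|&gt;\s*&lt;/a&gt;)')


def strip_empty_anchors(rendered: str) -> str:
    """Remove escaped empty anchor tags from already-rendered output."""
    return ESCAPED_EMPTY_ANCHOR.sub('', rendered)


# Rendered output taken from the failing assertion in the test log.
rendered = ('<p>Text &lt;a name="test"&gt;&lt;/a&gt; '
            '&lt;a name="foo"/&gt; more</p>\n')

# The raw-HTML pattern finds nothing, because no literal "<a" is left.
unchanged = RAW_EMPTY_ANCHOR.sub('', rendered)

# The escaped pattern removes both empty anchors.
cleaned = strip_empty_anchors(rendered)
```

An escaped link that has content, such as &lt;a href="http://ex.com"&gt;x&lt;/a&gt;, is left alone by this sketch, since the closing tag does not directly follow the opening one.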

Vasuk12 avatar Dec 04 '25 01:12 Vasuk12

/gcbrun

another-rex avatar Dec 05 '25 02:12 another-rex

I have updated both PRs, #4460 and #4431. I ran the unit tests locally and they are now working with the updated regex pattern. Let me know if any other changes are needed!

Vasuk12 avatar Dec 08 '25 04:12 Vasuk12

/gcbrun

another-rex avatar Dec 08 '25 04:12 another-rex

Hey, are any more changes needed?

Vasuk12 avatar Dec 12 '25 02:12 Vasuk12

The tests do not seem to pass:

Step #2 - "website-tests": Already have image (with digest): gcr.io/oss-vdb/ci
Step #2 - "website-tests": + poetry install
Step #2 - "website-tests": Installing dependencies from lock file
Step #2 - "website-tests": 
Step #2 - "website-tests": No dependencies to install or update
Step #2 - "website-tests": + poetry run python frontend_handlers_test.py
Step #2 - "website-tests": WARNING:root:Vuln has empty affected ranges and versions: OSV-0, PyPI/blah
Step #2 - "website-tests": WARNING:root:Vuln has empty affected ranges and versions: OSV-1, Debian:3.1/blah
Step #2 - "website-tests": WARNING:root:Vuln has empty affected ranges and versions: OSV-1, Debian:7/blah
Step #2 - "website-tests": WARNING:root:Vuln has empty affected ranges and versions: OSV-2, Debian:8/blah
Step #2 - "website-tests": WARNING:root:Vuln has empty affected ranges and versions: OSV-3, Debian:8/blah
Step #2 - "website-tests": ....FF.
Step #2 - "website-tests": ======================================================================
Step #2 - "website-tests": FAIL: test_removes_anchor_with_multiple_attributes (__main__.MarkdownFilterTest.test_removes_anchor_with_multiple_attributes)
Step #2 - "website-tests": Test anchor tags with name and other attributes are removed.
Step #2 - "website-tests": ----------------------------------------------------------------------
Step #2 - "website-tests": Traceback (most recent call last):
Step #2 - "website-tests":   File "/workspace/gcp/website/frontend_handlers_test.py", line 132, in test_removes_anchor_with_multiple_attributes
Step #2 - "website-tests":     self.assertNotIn('name="x"', result)
Step #2 - "website-tests":     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
Step #2 - "website-tests": AssertionError: 'name="x"' unexpectedly found in '<p>&lt;a name="x" id="y" class="z"&gt;&lt;/a&gt;</p>\n'
Step #2 - "website-tests": 
Step #2 - "website-tests": ======================================================================
Step #2 - "website-tests": FAIL: test_removes_empty_anchor_tags (__main__.MarkdownFilterTest.test_removes_empty_anchor_tags)
Step #2 - "website-tests": Test removal of empty anchor tags.
Step #2 - "website-tests": ----------------------------------------------------------------------
Step #2 - "website-tests": Traceback (most recent call last):
Step #2 - "website-tests":   File "/workspace/gcp/website/frontend_handlers_test.py", line 125, in test_removes_empty_anchor_tags
Step #2 - "website-tests":     self.assertNotIn('name="test"', result)
Step #2 - "website-tests":     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
Step #2 - "website-tests": AssertionError: 'name="test"' unexpectedly found in '<p>Text &lt;a name="test"&gt;&lt;/a&gt; &lt;a name="foo"/&gt; more</p>\n'
Step #2 - "website-tests": 
Step #2 - "website-tests": ----------------------------------------------------------------------
Step #2 - "website-tests": Ran 7 tests in 6.329s
Step #2 - "website-tests": 
Step #2 - "website-tests": FAILED (failures=2)

It should be possible to run the tests locally; let me know if you run into any trouble there.

another-rex avatar Dec 12 '25 04:12 another-rex

/gcbrun

another-rex avatar Dec 12 '25 05:12 another-rex

/gcbrun

another-rex avatar Dec 12 '25 05:12 another-rex

vasukhare@Vasus-MacBook-Air website % cd gcp/website
DATASTORE_EMULATOR_PORT=8004 DATASTORE_EMULATOR_HOST=localhost:8004 GOOGLE_CLOUD_PROJECT=test-project ./run_tests.sh
cd: no such file or directory: gcp/website
+ poetry install
The currently activated Python version 3.10.13 is not supported by the project (>=3.13,<4.0).
Trying to find and use a compatible version. 
Using python3.14 (3.14.0)
Installing dependencies from lock file

No dependencies to install or update
+ poetry run python frontend_handlers_test.py
The currently activated Python version 3.10.13 is not supported by the project (>=3.13,<4.0).
Trying to find and use a compatible version. 
Using python3.14 (3.14.0)
WARNING:root:Vuln has empty affected ranges and versions: OSV-0, PyPI/blah
WARNING:root:Vuln has empty affected ranges and versions: OSV-1, Debian:3.1/blah
WARNING:root:Vuln has empty affected ranges and versions: OSV-1, Debian:7/blah
WARNING:root:Vuln has empty affected ranges and versions: OSV-2, Debian:8/blah
WARNING:root:Vuln has empty affected ranges and versions: OSV-3, Debian:8/blah
.......
----------------------------------------------------------------------
Ran 7 tests in 1.980s

OK
vasukhare@Vasus-MacBook-Air website % 

I am not having any issues running the tests. Could you tell me whether there is another way to reproduce the error you are seeing? Also, the regex pattern is added in a different PR, so you might have to apply that manually.

Vasuk12 avatar Dec 12 '25 05:12 Vasuk12

Ah seems to work fine, just required merging the master branch. Thanks!

another-rex avatar Dec 14 '25 23:12 another-rex