ladder Ruleset testing

As per https://github.com/everywall/ladder-rules/pull/3, we'll need to implement some robust testing.

The main challenge is to test after client-side JS rendering happens, which will probably mean we'll need a headless browser.

A test could look like this: https://github.com/everywall/ladder/blob/ladder_tests/tests/tests/www-wellandtribune-ca.spec.ts

And the results like this.

Perhaps we'll need some codegen in order to go from ruleset to test?

Nov 26 '23 22:11 deoxykev

Yes, this would be cool.

Nov 27 '23 21:11 ladddder

This seems like a very cool idea.

I played around with your example a bit, and I think we may be able to leverage GitHub Actions to run a shell script that generates and runs a test whenever a ruleset YAML is uploaded or changed. From your example, I changed await expect(page.getByText(paywallText)).toBeVisible(); to await expect(page.getByText(paywallText)).not.toBeVisible(); for the test with ladder, so both tests pass and could be used for CI.

I used the following bash script to generate a Playwright test.

./generate_test.sh -i rulesets/ca/_multi-metroland-media-group.yaml > tests/_multi-metroland-media-group.spec.ts

generate_test.sh

#!/bin/bash

# Command-line argument parsing
# (the leading ":" puts getopts in silent mode, so the \? and : cases below are reached)
while getopts ":i:" opt; do
  case $opt in
    i)
      input_file=$OPTARG
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      exit 1
      ;;
    :)
      echo "Option -$OPTARG requires an argument." >&2
      exit 1
      ;;
  esac
done

# Check if the input file is provided
if [ -z "$input_file" ]; then
  echo "Usage: $0 -i <input_yaml_file>" >&2
  exit 1
fi

# Extract information from the "tests" section (assumes a single url/test entry)
url=$(awk '/- url:/ {sub(/- url: /, ""); sub(/^[[:space:]]*/, ""); print}' "$input_file")
domain=$(echo "$url" | awk -F/ '{print $3}')
test=$(awk '/test:/ {sub(/test: /, ""); sub(/^[[:space:]]*/, ""); print}' "$input_file")

# Generate Playwright test script
echo "import { expect, test } from '@playwright/test';"
echo
echo "test('$domain has paywall by default', async ({ page }) => {"
echo "  await page.goto('$url');"
echo "  await expect($test).toBeVisible();"
echo "});"
echo
echo "test('$domain + Ladder does not have paywall', async ({ page }) => {"
echo "  await page.goto('http://localhost:8080/$url');"
echo "  await page.waitForLoadState();"
echo "  await expect($test).not.toBeVisible();"
echo "});"
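
For the CI idea mentioned earlier, a small wrapper could regenerate a spec for each changed ruleset. This is only a sketch: spec_path_for and generate_specs are hypothetical helper names, and the tests/ output directory is taken from the example invocation above rather than from any agreed layout.

```shell
#!/bin/bash

# Map a ruleset path to a spec path, mirroring the example invocation:
# rulesets/ca/foo.yaml -> tests/foo.spec.ts (assumed layout)
spec_path_for() {
  local name
  name=$(basename "$1" .yaml)
  printf 'tests/%s.spec.ts\n' "$name"
}

# Regenerate a spec for every ruleset path passed as an argument.
# Assumes generate_test.sh sits in the current directory.
generate_specs() {
  local ruleset
  for ruleset in "$@"; do
    ./generate_test.sh -i "$ruleset" > "$(spec_path_for "$ruleset")"
  done
}
```

A CI job could then pass in the ruleset files changed by a commit (for example, the output of git diff --name-only limited to rulesets/) and run Playwright afterwards.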

In the ruleset yaml, I put a Playwright locator in the test portion:

tests:
  - url: https://www.wellandtribune.ca/news/niagara-region/niagara-transit-commission-rejects-council-request-to-reduce-its-budget-increase/article_e9fb424c-8df5-58ae-a6c3-3648e2a9df66.html
    test: page.getByText("This article is exclusive to subscribers.")
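
One caveat with the awk extraction in generate_test.sh: if a ruleset lists more than one entry under tests, the url and test variables end up holding several lines each. A possible way around that, sketched below, is to emit one tab-separated url/locator pair per entry. The function name extract_tests is hypothetical, and the sketch assumes every "- url:" line is followed by its own "test:" line.

```shell
#!/bin/bash

# Print one "url<TAB>locator" line per entry in the tests section.
# Assumes each "- url:" line is followed by a matching "test:" line.
extract_tests() {
  awk '
    /^[ \t]*- url:/ {
      sub(/^[ \t]*- url:[ \t]*/, "")   # strip the key, keep the URL
      url = $0
      next
    }
    /^[ \t]*test:/ {
      sub(/^[ \t]*test:[ \t]*/, "")    # strip the key, keep the locator
      if (url != "") print url "\t" $0
      url = ""
    }
  ' "$1"
}

# Example use (commented out so the sketch has no side effects):
# while IFS=$'\t' read -r url locator; do
#   echo "would generate tests for $url using $locator"
# done < <(extract_tests rulesets/ca/_multi-metroland-media-group.yaml)
```

The generator could then loop over these pairs and write one pair of Playwright tests per entry.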

At the moment, the bash script is limited to checking whether a specified element is or is not visible. If we continue this way, we may want the script to be more general so it can cover other scenarios. That may require anyone contributing a rule to be more explicit in their ruleset's tests section: rather than contributing only a Playwright locator, they may need to provide the full expectation, with both a locator and an assertion:

    test: expect(page.getByText("This article is exclusive to subscribers.")).toBeVisible()

Some additional parsing in the bash script could insert a .not before the assertion for the ladder test.
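
As a rough idea of what that extra parsing might look like, the sketch below inserts .not before the final matcher call of a full expectation string. The function name negate_assertion is hypothetical, and the sketch assumes the matcher is the last call on the line and that its arguments contain no parentheses; anything else is passed through unchanged.

```shell
#!/bin/bash

# Insert ".not" before the trailing matcher call, e.g. ".toBeVisible()".
# Assumes the matcher name starts with "to" and its arguments
# contain no parentheses; non-matching input is printed unchanged.
negate_assertion() {
  printf '%s\n' "$1" | sed -E 's/\.(to[A-Z][A-Za-z]*\([^()]*\);?)$/.not.\1/'
}
```

For the example above, negate_assertion would turn the .toBeVisible() call at the end into .not.toBeVisible(), leaving the locator untouched.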

Nov 28 '23 04:11 joncrangle

Nice work!

I've been thinking about how to generate rules for any site in an automated fashion. One of the main roadblocks is figuring out whether or not a site is paywalled, and then generating a test for it. I wonder if it's as simple as extracting the visible text from a page and asking an LLM whether or not it is paywall text.

Nov 28 '23 13:11 deoxykev

I wonder if it's as simple as extracting visible text from a page, and asking an LLM whether or not it is paywall text is sufficient

@deoxykev I know LLMs are a blunt instrument, but you could even use a screenshot instead of text. Headless browsers usually support this out of the box, and visual inspection is often easier for an LLM than code logic.

This could even be integrated into a Docker Compose setup with ollama so the LLM calls stay local.

Aug 31 '24 11:08 actuallymentor
