TypeError: Cannot read properties of undefined (reading 'defaults') in auto-playwright

Open galaczi opened this issue 1 year ago • 6 comments

I am encountering a TypeError when using the auto function from the auto-playwright library in conjunction with crawlee. The error message indicates that the defaults property is undefined in the sanitize-html library, which is used internally by auto-playwright.

Steps to Reproduce:

Install the latest versions of auto-playwright and sanitize-html.

Set up a route handler using crawlee (routes.ts) and auto-playwright as shown below:

import { createPlaywrightRouter } from 'crawlee';
import { auto } from 'auto-playwright';

export const router = createPlaywrightRouter();

router.addDefaultHandler(async ({ page, log }) => {
    try {
        await auto('Click the start button', { page });
    } catch (error) {
        log.error(`Error in auto-playwright: ${(error as Error).message}`);
    }
});

Run the crawler.

The following error is thrown:

TypeError: Cannot read properties of undefined (reading 'defaults')
    at sanitizeHtml (/path/to/node_modules/auto-playwright/dist/sanitizeHtml.js:23:46)
    at getSnapshot (/path/to/node_modules/auto-playwright/dist/getSnapshot.js:7:46)
    at async runTask (/path/to/node_modules/auto-playwright/dist/auto.js:37:19)
    at async auto (/path/to/node_modules/auto-playwright/dist/auto.js:14:16)
    at <anonymous> (/path/to/project/src/routes.ts:8:5)
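
For readers unfamiliar with this message: it means the expression to the left of `.defaults` evaluated to `undefined` at runtime. The sketch below reproduces the same class of error with a hypothetical stand-in object (`fakeModule` and `readAllowedTags` are made up for illustration). Whether this is exactly what happens inside auto-playwright is an assumption, not something confirmed here.

```javascript
// Hypothetical stand-in for a module object that has no `default` property.
// This is NOT the real sanitize-html module, only an illustration.
const fakeModule = {};

function readAllowedTags(mod) {
    // Same shape as the failing access: <something>.defaults.allowedTags,
    // where <something> (here mod.default) turns out to be undefined.
    return mod.default.defaults.allowedTags;
}

let message = "";
try {
    readAllowedTags(fakeModule);
} catch (error) {
    // Keep the message so it can be compared with the one in the report.
    message = error instanceof Error ? error.message : String(error);
}
console.log(message);
```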

Additional Information:

The issue seems related to the sanitize-html library and its type definitions. I have verified that I am using the latest versions of both auto-playwright and sanitize-html. The problem persists despite the issue being marked as resolved in version 1.12.2 of auto-playwright.

Environment:

Node.js version: v20.13.1
auto-playwright version: 1.15.0
sanitize-html version: 2.13.0
Operating System: Ubuntu

References:

GitHub Issue #7 - Similar issue reported and marked as resolved.

Possible Workaround:

Manually adjust the sanitizeHtml function in the auto-playwright library to ensure it correctly references sanitize-html:

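// Load sanitize-html directly so that `sanitize` is the callable export
// that carries the `defaults` property.
const sanitize = require("sanitize-html");

const sanitizeHtml = (subject) => {
    return sanitize(subject, {
        allowedTags: sanitize.defaults.allowedTags.concat([
            "button",
            "form",
            "img",
            "input",
            "select",
            "textarea",
        ]),
        allowedAttributes: false,
    });
};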
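
The workaround above depends on `sanitize` being the callable export that exposes `defaults`. As a rough sketch only, a small guard could accept either the function itself or an object that wraps it under `default`, and fail with a clearer message otherwise. `resolveSanitize` and `fakeSanitize` are hypothetical names invented for this example; neither is part of auto-playwright or sanitize-html, and the wrapped-export shape is an assumption about the cause.

```javascript
// Hypothetical helper: returns the first candidate that looks like the
// sanitize function (callable, with defaults.allowedTags as an array).
function resolveSanitize(mod) {
    const candidates = [mod, mod && mod.default];
    for (const candidate of candidates) {
        if (
            typeof candidate === "function" &&
            candidate.defaults &&
            Array.isArray(candidate.defaults.allowedTags)
        ) {
            return candidate;
        }
    }
    throw new TypeError("No sanitize function exposing defaults.allowedTags was found");
}

// Stand-in so the sketch runs without installing anything.
const fakeSanitize = Object.assign((html) => html, {
    defaults: { allowedTags: ["p", "a"] },
});

// Both shapes resolve to the same function.
const direct = resolveSanitize(fakeSanitize);
const wrapped = resolveSanitize({ default: fakeSanitize });
console.log(direct === wrapped);
```

In a real project the call would presumably be `resolveSanitize(require("sanitize-html"))`, with the result used in place of `sanitize` in the function above.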

Thank you!

galaczi avatar Jun 24 '24 07:06 galaczi

I think this might be due to the completeTask file not working properly. I have fixed it in my latest PR. Can you check whether that fixes it?

rajeshdavidbabu avatar Jul 08 '24 08:07 rajeshdavidbabu

I've encountered this issue as well this morning. Using version 1.16.0.

billw4 avatar Jul 11 '24 15:07 billw4

Got the same issue as well. Using version: 1.16.0

Randy705 avatar Jul 17 '24 11:07 Randy705

Yes, me too, on 1.16.0.

rubickecho avatar Jul 18 '24 10:07 rubickecho

Yeah, you are right, it is sanitizeHtml.

rajeshdavidbabu avatar Jul 18 '24 10:07 rajeshdavidbabu

Just raised a PR, https://github.com/lucgagan/auto-playwright/pull/40, with an explanation. Let's wait for @lucgagan to approve.

rajeshdavidbabu avatar Jul 18 '24 11:07 rajeshdavidbabu

Could we get a release that includes this change? I tested it by using the repository directly, and I can confirm that it resolved the issue on my end.

jvrdelafuente avatar Oct 29 '24 18:10 jvrdelafuente

There was a great suggestion in another thread: until a new release is ready, replace the contents of:

node_modules\auto-playwright\dist\sanitizeHtml.js

With:

"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.sanitizeHtml = void 0;
const sanitize = require("sanitize-html");
/**
 * The reason for sanitization is because OpenAI does not need all of the HTML tags
 * to know how to interpret the website, e.g. it will not make a difference to AI if
 * we include or exclude <script> tags as they do not impact the already rendered DOM.
 *
 * In my experience, reducing HTML only to basic tags produces faster and more reliable prompts.
 *
 * Note that the output of this function is designed to interpret only the HTML tags.
 * For instructions that rely on visual cues (e.g. "click red button") we intend to
 * combine HTML with screenshots in the future versions of this library.
 */
const sanitizeHtml = (subject) => {
    return sanitize(subject, {
        // The default allowedTags list already includes _a lot_ of commonly used tags.
        // https://www.npmjs.com/package/sanitize-html#default-options
        //
        // I don't see a need for this to be configurable at the moment,
        // as it already covers all the layout tags, but we can revisit this if necessary.
        allowedTags: sanitize.defaults.allowedTags.concat([
            "button",
            "form",
            "img",
            "input",
            "select",
            "textarea",
        ]),
        // Setting allowedAttributes to false will allow all attributes.
        allowedAttributes: false,
    });
};
exports.sanitizeHtml = sanitizeHtml;
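
Edits made inside node_modules are lost whenever the package is reinstalled, so the replacement has to be re-applied each time. One possible way to do that is a small copy script, sketched below. `applyPatch` and the file names are hypothetical, and the demonstration runs against a temporary directory rather than a real install.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: copies a saved replacement file over the installed one.
function applyPatch(patchedFile, targetFile) {
    if (!fs.existsSync(targetFile)) {
        // Refuse to create files in unexpected places.
        throw new Error(`Target not found: ${targetFile}`);
    }
    fs.copyFileSync(patchedFile, targetFile);
}

// Demonstration in a throwaway directory with placeholder contents.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "patch-demo-"));
const patched = path.join(tmp, "sanitizeHtml.patched.js");
const target = path.join(tmp, "sanitizeHtml.js");
fs.writeFileSync(patched, "// patched contents\n");
fs.writeFileSync(target, "// original contents\n");

applyPatch(patched, target);
console.log(fs.readFileSync(target, "utf8"));
```

In a project, the target would be the sanitizeHtml.js path named above, the source would be a copy of the replacement kept somewhere outside node_modules, and the script could be run from an npm `postinstall` entry.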

TomSelleck101 avatar Feb 07 '25 12:02 TomSelleck101