Lambda Execution Issues
Hey there! Awesome library! I am running into some issues and hope the community here can help me troubleshoot them. I am trying to run hrequests in AWS Lambda to interact with specific web pages when a function URL is called.
I am using the AWS SDK to deploy a Docker container similar to the following to ECR -> Lambda:
FROM mcr.microsoft.com/playwright/python:v1.34.0-jammy
# Include global arg in this stage of the build
ARG FUNCTION_DIR
RUN mkdir -p ${FUNCTION_DIR}
COPY app.py ${FUNCTION_DIR}
WORKDIR /app
COPY ./mytool/pyproject.toml ./mytool/poetry.lock /app/
COPY ./mytool/. /app
# Install dependencies using poetry
RUN pip install --no-cache-dir poetry awslambdaric aws-xray-sdk sh \
    && poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi
RUN python -m playwright install-deps
RUN python -m playwright install
WORKDIR ${FUNCTION_DIR}
ENTRYPOINT [ "/usr/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]
An app.py file similar to the following is then invoked through the function URL via awslambdaric:
def handler(event, context):
    logger.debug(msg=f"Initial event: {event}")
    headers = event["headers"]
    header_validation = validate_headers(headers)
    input = headers["x-input"]
    try:
        command = headers["x-command"].split()
        command.extend(input.split())
    except Exception as e:
        logger.error(msg=f"Error parsing command: {e}")
        return {
            "statusCode": 500,
            "body": f"Error parsing command: {e}",
        }
    parsed = []
    try:
        logger.debug(msg=f"Running command: {command}")
        # Set HOME=/tmp to avoid writing to the container filesystem
        # Set LD_LIBRARY_PATH to include /usr/lib64 to avoid issues with the AWS X-Ray daemon
        os.environ["HOME"] = "/tmp"
        os.environ["LD_LIBRARY_PATH"] = "/usr/lib64"
        results = subprocess.run(command, capture_output=True, text=True, env=os.environ.copy())
        logger.debug(msg=f"Results stdout: {results.stdout}")
        logger.debug(msg=f"Results stderr: {results.stderr}")
        logger.debug(msg=f"Command exited with code: {results.returncode}")
    except subprocess.TimeoutExpired as e:
        logger.error(msg=f"Command timed out: {e}")
        return {
            "statusCode": 408,  # HTTP status code for Request Timeout
            "body": json.dumps({
                "stdout": str(e.stdout),
                "stderr": str(e.stderr),
                "e": str(e),
                "error": "Command timed out"
            }),
        }
    except Exception as e:
        logger.error(msg=f"Error executing command: {e}")
        return {
            "statusCode": 500,
            "body": f"Error executing command: {e}",
        }
    try:
        for line in results.stdout.splitlines():
            parsed_json = json.loads(line)
            logger.debug(msg=f"Output: {parsed_json}")
            parsed.append(parsed_json)
    except Exception as e:
        logger.error(msg=f"Error parsing output: {e}")
        return {
            "statusCode": 500,
            "body": f"Error parsing output: {e}",
        }
    xray_recorder.end_segment()
    return {"statusCode": 200, "body": json.dumps(parsed)}
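One note on the handler above: subprocess.run only raises subprocess.TimeoutExpired when a timeout argument is passed, so as written the 408 branch cannot fire and a hung child runs until Lambda ends the invocation. A minimal sketch of one way to give the child a deadline follows. The names run_tool, child_timeout_s, and SAFETY_MARGIN_S are hypothetical, and the 5-second margin is an arbitrary assumption, not a recommended value.

```python
import os
import subprocess
import sys

# Hypothetical safety margin so the handler can still return a response
# before Lambda ends the invocation. The value is an assumption.
SAFETY_MARGIN_S = 5.0


def child_timeout_s(context, margin_s=SAFETY_MARGIN_S):
    """Seconds the child may run, derived from the time Lambda has left."""
    remaining_s = context.get_remaining_time_in_millis() / 1000.0
    return max(remaining_s - margin_s, 1.0)


def run_tool(command, timeout_s):
    """Run command with a deadline; returns (timed_out, stdout, stderr)."""
    # Build the child's environment as a copy instead of mutating os.environ.
    env = {**os.environ, "HOME": "/tmp"}
    try:
        results = subprocess.run(
            command, capture_output=True, text=True, env=env, timeout=timeout_s
        )
    except subprocess.TimeoutExpired as e:
        # Partial output captured before the kill; either value may be None.
        return True, e.stdout, e.stderr
    return False, results.stdout, results.stderr
```

Inside the handler this could be called as run_tool(command, child_timeout_s(context)). The timeout only stops the direct child; processes that the child started itself, such as a browser, may need separate cleanup.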
This app.py code calls a separate tool I have created that uses hrequests to navigate and interact with web pages. However, when app.py is invoked through the function URL, hrequests raises the following error:
Exception in thread Thread-1 (spawn_main):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hrequests/browser.py", line 128, in spawn_main
    asyncio.new_event_loop().run_until_complete(self.main())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/hrequests/browser.py", line 135, in main
    self.context = await self.client.new_context(
  File "/usr/local/lib/python3.10/dist-packages/hrequests/playwright_mock/playwright_mock.py", line 38, in new_context
    _browser = await context.new_context(
  File "/usr/local/lib/python3.10/dist-packages/hrequests/playwright_mock/context.py", line 6, in new_context
    context = await inst.main_browser.new_context(
  File "/usr/local/lib/python3.10/dist-packages/playwright/async_api/_generated.py", line 14154, in new_context
    await self._impl_obj.new_context(
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_browser.py", line 127, in new_context
    channel = await self._channel.send("newContext", params)
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 482, in wrap_api_call
    return await cb()
  File "/usr/local/lib/python3.10/dist-packages/playwright/_impl/_connection.py", line 97, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Target page, context or browser has been closed
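To help tell a Chromium launch problem apart from an hrequests one, a probe along these lines could be run inside the Lambda container. This is only a sketch: probe_chromium, LAMBDA_CHROMIUM_ARGS, and the example URL are hypothetical, and the Chromium flags are ones commonly tried in restricted containers, not a confirmed fix for this error. hrequests may launch the browser with different options.

```python
# Chromium switches often tried in constrained containers.
# Assumption: whether any of them affects this failure is untested.
LAMBDA_CHROMIUM_ARGS = [
    "--no-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--single-process",
    "--no-zygote",
]


def probe_chromium(url="https://example.com", args=None):
    """Launch Chromium through Playwright directly and return the page title."""
    # Imported lazily so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    launch_args = list(LAMBDA_CHROMIUM_ARGS if args is None else args)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, args=launch_args)
        try:
            # new_context is the call that fails in the traceback above.
            context = browser.new_context()
            page = context.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()
```

If this probe fails with the same "has been closed" error, the browser process itself is probably exiting in the Lambda environment. If it succeeds, the difference more likely lies in how the browser is started through hrequests.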
Some notes on what has already been attempted:
- The container image runs fine on my local system with similar resource allocations.
- I can call my tool remotely, and it appears to run partially before hitting this exception.
- I have increased the Lambda function's memory allocation several times without success.
- My tool always hits the configured Lambda timeout, no matter how high I set it, so I suspect this exception leaves the application hanging.
I am not experienced with Playwright or headless browsers, so any help would be greatly appreciated. I understand this is not directly related to hrequests, but I hope the community here is familiar enough with these frameworks to assist. Thanks!