
spot_monitor throwing errors at record pace

Open · TheToddLuci0 opened this issue 3 years ago · 1 comment

Every invocation of the spot_monitor lambda throws the following errors:

2022-11-11T20:40:46.828Z	56d0e64e-0a6f-4f66-857b-576273621e67	INFO	TypeError: Cannot read property 'instances' of undefined
    at /var/task/main.js:87:22
    at Array.forEach (<anonymous>)
    at /var/task/main.js:72:31
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async Promise.all (index 0)
    at async Runtime.exports.main [as handler] (/var/task/main.js:100:3)
2022-11-11T20:30:53.928Z	ad91e5e7-ba38-452a-90dd-2d03718bbd6f	ERROR	Invoke Error 	{
    "errorType": "Error",
    "errorMessage": "[!] Failed to retreive spot instance statuses: TypeError: Cannot read property 'instances' of undefined",
    "stack": [
        "Error: [!] Failed to retreive spot instance statuses: TypeError: Cannot read property 'instances' of undefined",
        "    at intoError (/var/runtime/Errors.js:27:12)",
        "    at postError (/var/runtime/CallbackContext.js:21:47)",
        "    at callback (/var/runtime/CallbackContext.js:42:7)",
        "    at /var/runtime/CallbackContext.js:113:16",
        "    at Runtime.exports.main [as handler] (/var/task/main.js:104:10)",
        "    at runMicrotasks (<anonymous>)",
        "    at processTicksAndRejections (internal/process/task_queues.js:95:5)"
    ]
}

/lambda_functions/spot_monitor/main.js#L72

Getting the matching email every minute is not my favorite thing.

{"version":"0","id":"20c1c82d-1f7f-168a-9d40-bf697b8b84e9","detail-type":"Scheduled Event","source":"aws.events","account":"XXXXXXX","time":"2022-11-11T20:54:14Z","region":"us-east-2","resources":["arn:aws:events:us-east-2:XXXXXXXXX:rule/npkSpotMonitor"],"detail":{}}

There are no open spot requests and no running instances.

┌──(user㉿devbox)-[~]
└─$ aws ec2 describe-spot-instance-requests --filters Name=state,Values=open,active
{
    "SpotInstanceRequests": []
}
                                                                                                                                                                                                                                            
┌──(user㉿devbox)-[~]
└─$ aws ec2 describe-instances --filters Name=instance-state-name,Values=pending,running 
{
    "Reservations": []
}

Update

The issue appears to have been caused by a campaign that ended (keyspace exhausted) but wasn't properly cleaned up. Canceling the campaign via the UI has (for the moment) stopped the errors and given my inbox some peace. Not sure what cosmic bit-flip resulted in that campaign not being properly closed down.
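
For illustration only, here is a minimal sketch of the kind of guard that might stop a stale campaign from crashing the whole invocation. I have not reproduced the real code at main.js#L72, so every name here (`collectInstances`, `spotRequests`, `campaignsById`, `campaignId`) is a hypothetical stand-in; the only thing taken from the logs is that `.instances` is being read from something undefined inside a `forEach`.

```javascript
// Hypothetical sketch: none of these names come from the real main.js.
// Assumes each spot request carries a campaignId and that campaign records
// are looked up in a plain object keyed by that id.
function collectInstances(spotRequests, campaignsById) {
  const instances = [];
  const orphaned = [];

  spotRequests.forEach((request) => {
    const campaign = campaignsById[request.campaignId];

    // A campaign that ended without cleanup may have no record (or no
    // instances list). Record it and move on instead of throwing.
    if (campaign === undefined || !Array.isArray(campaign.instances)) {
      orphaned.push(request.campaignId);
      return;
    }

    instances.push(...campaign.instances);
  });

  return { instances, orphaned };
}
```

With something like this, the orphaned ids could be logged once for cleanup rather than failing every scheduled run.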

TheToddLuci0 · Nov 11 '22 21:11

This issue has reared its head again. I'm running the latest version, and it seems to be the same issue, though this time the erroneous job failed due to a timeout rather than keyspace exhaustion.

TheToddLuci0 · Feb 15 '23 22:02