streamalert Feature: HTTP Endpoint Support

StreamAlert will also support receiving data via an HTTP endpoint. This is for service providers or appliances that can send their logs to an HTTP endpoint. Examples: Akamai and OneLogin (https://support.onelogin.com/hc/en-us/articles/215214143-Streaming-Real-Time-OneLogin-Event-Data-to-your-SIEM-Solution).

Jan 30 '17 20:01 ghost

I'm interested in this in order to provide https://canary.tools/ and https://canarytokens.org/generate with a webhook that will result in StreamAlert being notified of the canaries being triggered. This is especially important because I want StreamAlert to contain more detailed information about where this canarytoken was placed and therefore what reaction I should take if it is triggered.

@jacknagz mentioned that API Gateway could be used for this, and I've confirmed that it works by following the rough guidance at https://medium.com/@tombray/using-amazon-api-gateway-as-a-proxy-for-kinesis-6242ce132e3d

Here is a simple demonstration of the end result for what I've set up with API Gateway:

# Get shard iterator to read kinesis stream
$ aws kinesis get-shard-iterator --shard-id shardId-000000000000 --shard-iterator-type LATEST --stream-name test_prod_stream_alert_kinesis
{
    "ShardIterator": "AAAAAAAAAAF2OY...="
}

# Send data to the API Gateway
$ curl -H "Content-Type: application/json" -X POST -d '{"test":"testdata"}'  https://REDACTED.execute-api.us-east-1.amazonaws.com/Prod
{"SequenceNumber":"495...8","ShardId":"shardId-000000000000"}

# Read the latest item from the kinesis stream to a file
$ aws kinesis get-records --shard-iterator "AAAAAAAAAAF2OY...=" > get-records.json

# Extract the record from the file
$ cat get-records.json | jq -r '.Records[0].Data' | base64 --decode | jq '.'
{
  "test": "testdata"
}
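
The last two commands above can also be done in Python. This is only a minimal sketch: it assumes the output of `aws kinesis get-records` has already been loaded as a dict, and `decode_kinesis_records` is a hypothetical helper name, not part of StreamAlert or the AWS SDK.

```python
import base64
import json


def decode_kinesis_records(get_records_output):
    """Return the JSON payload of each record in `aws kinesis get-records` output."""
    payloads = []
    for record in get_records_output.get("Records", []):
        # The CLI returns each record's Data field base64-encoded.
        raw = base64.b64decode(record["Data"])
        payloads.append(json.loads(raw))
    return payloads


# Stand-in for the contents of get-records.json (assumed shape).
sample_output = {
    "Records": [
        {"Data": base64.b64encode(b'{"test":"testdata"}').decode("ascii")}
    ]
}
print(decode_kinesis_records(sample_output))
```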

To set this up, I created an API Gateway with an integration to Kinesis.

The role for this simply needs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "kinesis:PutRecord"
            ],
            "Resource": "*"
        }
    ]
}

Then you need to set up the body mapping template. I used:

{
    "Data": "$util.base64Encode("$input.json('$')")",
    "PartitionKey": "0",
    "StreamName": "test_prod_stream_alert_kinesis"
}

Note that I should change that partition key to a random value. I should also include more information in the Data element so that it has a schema StreamAlert can identify more easily.
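
To illustrate what the mapping template produces, here is a rough Python equivalent of the PutRecord body, this time with a random partition key. It is a sketch rather than the template itself: `build_put_record_payload` is a hypothetical name, and using a UUID as the partition key is my assumption about what "random" could look like.

```python
import base64
import json
import uuid


def build_put_record_payload(body, stream_name):
    """Build a Kinesis PutRecord request body for a JSON-serializable payload."""
    data = base64.b64encode(json.dumps(body).encode("utf-8")).decode("ascii")
    return {
        "Data": data,
        # A random key spreads records across shards (assumed approach).
        "PartitionKey": uuid.uuid4().hex,
        "StreamName": stream_name,
    }


payload = build_put_record_payload(
    {"test": "testdata"}, "test_prod_stream_alert_kinesis"
)
print(json.dumps(payload, indent=4))
```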

Next I deployed it so it can be invoked.

I plan on first documenting this setup as CLI commands, confirming that canary tokens really can hit this endpoint, and writing a rule that picks them up. Then we can look into integrating this into StreamAlert in such a way that it can be stood up automatically via configuration.

Note that there is no authentication on this webhook, as canary tokens don't supply anything. I do plan on making the URL at least randomized so it can't be easily found.

Feb 27 '18 19:02 0xdabbad00

The first step is setting up the IAM role that API Gateway will use:

cat << EOF > assume_role.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "apigateway.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
EOF

aws iam create-role --role-name StreamWriter --assume-role-policy-document file://assume_role.json --description "Allows API Gateway to write to Kinesis"


aws iam attach-role-policy --role-name StreamWriter --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs"


cat << EOF > AllowPutRecord.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:PutRecord"
            ],
            "Resource": "*"
        }
    ]
}
EOF

aws iam put-role-policy --role-name StreamWriter --policy-name AllowPutRecord --policy-document file://AllowPutRecord.json

Creating the webhook looks like this:

aws apigateway create-rest-api --name StreamWriter --description "Webhook that writes to the Kinesis Stream for StreamAlert" --endpoint-configuration types=REGIONAL
{
    "id": "API_ID",
    "name": "StreamWriter",
    "description": "Webhook that writes to the Kinesis Stream for StreamAlert",
    "createdDate": 1519762634,
    "apiKeySource": "HEADER",
    "endpointConfiguration": {
        "types": [
            "REGIONAL"
        ]
    }
}

# Need to get the resource id
aws apigateway get-resources --rest-api-id API_ID
{
    "items": [
        {
            "id": "RESOURCE_ID",
            "path": "/"
        }
    ]
}

aws apigateway put-method --rest-api-id API_ID --resource-id RESOURCE_ID --http-method POST --authorization-type NONE --request-parameters {}


cat << EOF > requestTemplate.json
{ 
    "application/json": "{\n    \"Data\": \"\$util.base64Encode(\"\$input.json('$')\")\",\n    \"PartitionKey\": \"0\",\n    \"StreamName\": \"test_prod_stream_alert_kinesis\"\n}"
}
EOF

# The partition key should be a random value. This template uses the Velocity Template Language.
# http://velocity.apache.org/engine/devel/vtl-reference.html
# For my needs these webhooks will be triggered infrequently enough that I'm not concerned
# about randomizing the partition key.

aws apigateway put-integration \
    --rest-api-id API_ID \
    --resource-id RESOURCE_ID \
    --http-method POST \
    --integration-http-method POST \
    --type AWS \
    --uri "arn:aws:apigateway:us-east-1:kinesis:action/PutRecord" \
    --credentials "arn:aws:iam::ACCOUNT_ID:role/StreamWriter" \
    --request-templates file://requestTemplate.json \
    --passthrough-behavior NEVER

aws apigateway put-method-response  --rest-api-id API_ID --resource-id RESOURCE_ID --http-method POST --status-code 200 --response-models '{"application/json": "Empty"}'

aws apigateway put-integration-response  --rest-api-id API_ID --resource-id RESOURCE_ID  --http-method POST --status-code 200 --response-templates '{"application/json":""}'

# Create the deployment only after the method response and integration response exist
aws apigateway create-deployment --rest-api-id API_ID --stage-name deployed

Calling this looks like:

curl -H "Content-Type: application/json" -X POST -d '{"test":"test1"}' https://API_ID.execute-api.us-east-1.amazonaws.com/deployed
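
For callers that are scripts rather than a shell, the same request can be built with the Python standard library. This is a sketch with the same API_ID placeholder as the curl command; `build_webhook_request` is a hypothetical helper, and the actual send is left commented out so the example has no side effects.

```python
import json
import urllib.request


def build_webhook_request(url, payload):
    """Build a JSON POST request equivalent to the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(
    "https://API_ID.execute-api.us-east-1.amazonaws.com/deployed",
    {"test": "test1"},
)
# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```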

Feb 27 '18 23:02 0xdabbad00

I created a web token canary token, as they trigger the fastest, and this immediately resulted in a new record in my kinesis stream:

aws kinesis get-shard-iterator --shard-id shardId-000000000000 --shard-iterator-type LATEST --stream-name test_prod_stream_alert_kinesis
{
    "ShardIterator": "AAAAAAAAAAF0...="
}

aws kinesis get-records --shard-iterator "AAAAAAAAAAF0...=" > get-records.json
cat get-records.json | jq -r '.Records[0].Data' | base64 --decode | jq '.'
{
  "manage_url": "http://canarytokens.org/manage?token=5p...5",
  "memo": "You hit my webhook",
  "additional_data": {
    "src_ip": "6.6.6.666",
    "useragent": "Mozilla/5.0....",
    "referer": null,
    "location": null
  },
  "channel": "HTTP",
  "time": "2018-02-27 23:35:26"
}

Feb 27 '18 23:02 0xdabbad00

Things still to do:

Add the schema for this.
Create a StreamAlert rule that can trigger from this.

Nice to have:

Add more info to the record that is written to kinesis so this can be better identified by StreamAlert. This will involve changing the requestTemplate.json above.
Set a randomized partition key.
Use a resource so this webhook won't be randomly hit, as it may be possible to brute-force API Gateway subdomains. For example, I want the webhook URL to be https://ffffffffff.execute-api.us-east-1.amazonaws.com/deployed/mysecrethookname. It would also be good to use a regex in the resource name, or to have query parameters end up in the record, so you could create a webhook like: https://ffffffffff.execute-api.us-east-1.amazonaws.com/deployed/mysecrethookname?laptop=bobs_macbook&file=/home/bob/canary.html

Then ultimately we need to decide to either do a better write-up of how to set this up, or incorporate it directly into StreamAlert (which would be better, but harder).

Feb 27 '18 23:02 0xdabbad00

This is super cool. We use Canary as well. @jacknagz - thoughts on impl?

Feb 28 '18 00:02 ghost

Using a resource name (i.e. a path such as /secrethook) is easy, but you MUST create the resource before making the deployment. Otherwise you get {"message": "Internal server error"}, and the CloudWatch Logs will show "No match for output mapping and no default output mapping configured". (I'm noting that error here for posterity, as it took me a long time to figure out.)

This is how you use a resource:

aws apigateway create-resource --rest-api-id XXX --parent-id YYY --path-part mysecrethook

aws apigateway put-method-response  --rest-api-id XXX --resource-id YYY --http-method POST --status-code 200 --response-models '{"application/json": "Empty"}'

aws apigateway put-integration-response  --rest-api-id XXX --resource-id YYY  --http-method POST --status-code 200 --response-templates '{"application/json":""}' 

# Then, only after you've done the above, create the deployment
aws apigateway create-deployment --rest-api-id XXX --stage-name deployed

Then you can hit /mysecrethook with curl as follows:

curl -H "Content-Type: application/json" -X POST -d '{"test":"test1"}' https://XXX.execute-api.us-east-1.amazonaws.com/deployed/mysecrethook

There isn't a ton of value in doing this, as it's just security through obscurity. Still, it's nice to have for services that don't use any other authentication in their requests. The only effect of this being "abused" is that you would get alerts you didn't care about, so security through obscurity is acceptable here. You could always add stronger security to this if you wanted.
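
If the path is meant to be hard to guess, it could be generated rather than chosen by hand. A minimal sketch, assuming a hex suffix is acceptable as a value for --path-part; `random_path_part` and the "hook" prefix are hypothetical.

```python
import secrets


def random_path_part(prefix="hook", nbytes=16):
    """Return a hard-to-guess path segment such as 'hook-<32 hex chars>'."""
    # secrets draws from the OS CSPRNG, unlike the random module.
    return "{}-{}".format(prefix, secrets.token_hex(nbytes))


print(random_path_part())
```

The printed value would then be used in place of mysecrethook in the create-resource call.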

Feb 28 '18 20:02 0xdabbad00

Getting a requestTemplate.json to work was pretty tricky due to the quote escaping needed, and the lack of functionality (or my lack of knowledge) of VTL: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html

I set my requestTemplate.json with:

cat << EOF > requestTemplate.json
{ 
    "application/json": "{\n    \"Data\": \"\$util.base64Encode(\"{\"\"url\"\": \"\"\$context.path\"\", \"\"sourceIp\"\":\"\"\$context.identity.sourceIp\"\", \"\"userAgent\"\":\"\"\$context.identity.userAgent\"\", \"\"requestTime\"\":\"\"\$context.requestTime\"\", \"\"querystring\"\":\"\"\$util.urlDecode(\$input.params().querystring)\"\",\"\"detail\"\":\$input.json('$')}\")\",\n    \"PartitionKey\": \"0\",\n    \"StreamName\": \"test_prod_stream_alert_kinesis\"\n}"
}
EOF

That nightmare of escaping and lack of white-space can be deciphered as:

{
    "Data": "$util.base64Encode("{
        'url': '$context.path', 
        'sourceIp':'$context.identity.sourceIp',
        'userAgent':'$context.identity.userAgent',
        'requestTime':'$context.requestTime',
        'querystring':'$util.urlDecode($input.params().querystring)',
        'detail':$input.json('$')
    }")",
    "PartitionKey": "0",
    "StreamName": "test_prod_stream_alert_kinesis"
}
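
One way to sidestep the manual escaping is to build the template string in Python and let `json.dumps` handle the JSON layer. This is a sketch, not what I actually ran: `build_request_template` and `CONTEXT_FIELDS` are hypothetical names, and the doubled double quotes simply copy the escaping used in my requestTemplate.json.

```python
import json

# (field name, VTL expression) pairs copied from the template above.
CONTEXT_FIELDS = [
    ("url", "$context.path"),
    ("sourceIp", "$context.identity.sourceIp"),
    ("userAgent", "$context.identity.userAgent"),
    ("requestTime", "$context.requestTime"),
    ("querystring", "$util.urlDecode($input.params().querystring)"),
]


def build_request_template(stream_name, partition_key="0"):
    """Return the dict to be saved as requestTemplate.json."""
    # Doubled quotes stand for a literal quote inside the VTL string.
    pairs = ['""{}"": ""{}""'.format(name, expr) for name, expr in CONTEXT_FIELDS]
    # The request body is embedded as JSON, so it is not wrapped in quotes.
    pairs.append('""detail"": $input.json(\'$\')')
    inner = "{" + ", ".join(pairs) + "}"
    vtl = (
        "{\n"
        '    "Data": "$util.base64Encode("' + inner + '")",\n'
        '    "PartitionKey": "' + partition_key + '",\n'
        '    "StreamName": "' + stream_name + '"\n'
        "}"
    )
    return {"application/json": vtl}


print(json.dumps(build_request_template("test_prod_stream_alert_kinesis"), indent=4))
```

Redirecting the output to requestTemplate.json gives a file that can be passed to --request-templates.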

Here is what I have when I provide a canary token webhook of: https://XXX.execute-api.us-east-1.amazonaws.com/deployed/mysecrethook?device=bobs_laptop&location=secret.txt

You can see I've added query strings of device=bobs_laptop&location=secret.txt as I might want to use something like that in my webhook that I provide to the service so I know what the purpose of this was.

When the canary is triggered, I end up with the following record data in my kinesis stream:

{
  "url": "/deployed/mysecrethook",
  "sourceIp": "52.18.63.80",
  "userAgent": "python-requests/2.7.0 CPython/2.7.12 Linux/3.13.0-61-generic",
  "requestTime": "01/Mar/2018:04:05:37 +0000",
  "querystring": "{device=bobs_laptop, location=secret.txt}",
  "detail": {
    "manage_url": "http://canarytokens.org/manage?token=XXX&auth=YYY",
    "memo": "My StreamAlert test",
    "additional_data": {
      "src_ip": "6.6.6.666",
      "useragent": "Mozilla/5.0 ...",
      "referer": null,
      "location": null
    },
    "channel": "HTTP",
    "time": "2018-03-01 04:05:37"
  }
}

You can see that I'm getting some relevant info about who called this webhook and what the webhook is, along with the data that was sent to the webhook inside the detail element.

Some comments:

"url": "/deployed/mysecrethook": I don't seem to be able to find a way of getting the whole URL.
sourceIp and userAgent: These are fine
"requestTime": "01/Mar/2018:04:05:37 +0000": I don't have the ability to change this to a better format.
"querystring": "{device=bobs_laptop, location=secret.txt}": I'm upset that this isn't all json, but it doesn't look like I have an ability to do anything better.
"detail": This is just a blob of exactly what canary tools sent.

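Since the querystring and requestTime formats can't be changed in the template, they could be normalized later, for example inside a rule. This is a minimal sketch with hypothetical helper names. It assumes the querystring always looks like "{k=v, k=v}" and will misparse values that themselves contain ", ".

```python
from datetime import datetime


def parse_querystring(value):
    """Parse '{a=1, b=2}' into {'a': '1', 'b': '2'}."""
    inner = value.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    result = {}
    for pair in inner.split(", "):
        if not pair:
            continue
        key, _, val = pair.partition("=")
        result[key] = val
    return result


def parse_request_time(value):
    """Convert '01/Mar/2018:04:05:37 +0000' to an ISO 8601 string."""
    # %b relies on English month abbreviations (the default C locale).
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z").isoformat()


print(parse_querystring("{device=bobs_laptop, location=secret.txt}"))
print(parse_request_time("01/Mar/2018:04:05:37 +0000"))
```
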
Mar 01 '18 04:03 0xdabbad00

It works!

I updated my requestTemplate.json to:

cat << EOF > requestTemplate.json
{ 
    "application/json": "{\n    \"Data\": \"\$util.base64Encode(\"{\"\"webhookApiId\"\": \"\"\$context.apiId\"\", \"\"url\"\": \"\"\$context.path\"\", \"\"sourceIp\"\":\"\"\$context.identity.sourceIp\"\", \"\"userAgent\"\":\"\"\$context.identity.userAgent\"\", \"\"requestTime\"\":\"\"\$context.requestTime\"\", \"\"querystring\"\":\"\"\$util.urlDecode(\$input.params().querystring)\"\",\"\"detail\"\":\$input.json('$')}\")\",\n    \"PartitionKey\": \"0\",\n    \"StreamName\": \"test_prod_stream_alert_kinesis\"\n}"
}
EOF

Then I added the following to my logs.json:

  "webhook": {
    "schema": {
      "webhookApiId": "string",
      "url": "string",
      "sourceIp": "string",
      "userAgent": "string",
      "requestTime": "string",
      "querystring": "string",
      "detail": {}
    },
    "parser": "json"
  }

and updated my sources.json to include webhook as a log for my Kinesis stream.
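
Before deploying, a sample record can be compared against the top-level keys of this schema. This is only a sanity-check sketch: it assumes the record's top-level keys need to line up with the schema for the log to be classified, and `schema_key_mismatch` is a hypothetical helper, not part of StreamAlert.

```python
# Top-level keys of the "webhook" schema in logs.json.
WEBHOOK_SCHEMA_KEYS = {
    "webhookApiId",
    "url",
    "sourceIp",
    "userAgent",
    "requestTime",
    "querystring",
    "detail",
}


def schema_key_mismatch(record, expected_keys=WEBHOOK_SCHEMA_KEYS):
    """Report keys that are missing from, or unexpected in, a record."""
    actual = set(record)
    return {
        "missing": sorted(expected_keys - actual),
        "unexpected": sorted(actual - expected_keys),
    }


sample_record = {
    "url": "/deployed/mysecrethook",
    "sourceIp": "52.18.63.80",
    "userAgent": "python-requests/2.7.0",
    "requestTime": "01/Mar/2018:04:05:37 +0000",
    "querystring": "{device=bobs_laptop, location=secret.txt}",
    "detail": {},
}
# This sample lacks webhookApiId, so it is reported as missing.
print(schema_key_mismatch(sample_record))
```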

Then I made a rule webhook.py:

"""Alert on webhook being called."""
from stream_alert.rule_processor.rules_engine import StreamRules

rule = StreamRules.rule

@rule(logs=['webhook'],
      matchers=[],
      outputs=['slack:alerts'])
def webhook(rec):
    return True
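
The rule above fires on every webhook record. A narrower check could look at the detail element instead. The sketch below is a plain function so that it runs without StreamAlert installed; `is_canarytoken_http_trigger` is a hypothetical name, and it assumes rec has the shape of the canarytokens record shown earlier in this thread.

```python
def is_canarytoken_http_trigger(rec):
    """Return True for records that look like a canarytokens.org HTTP trigger."""
    detail = rec.get("detail") or {}
    manage_url = detail.get("manage_url") or ""
    return detail.get("channel") == "HTTP" and "canarytokens.org" in manage_url


sample_rec = {
    "url": "/deployed/mysecrethook",
    "detail": {
        "manage_url": "http://canarytokens.org/manage?token=XXX&auth=YYY",
        "memo": "My StreamAlert test",
        "channel": "HTTP",
    },
}
print(is_canarytoken_http_trigger(sample_rec))
```

The body of webhook(rec) could return the result of this function instead of True.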

Mar 01 '18 17:03 0xdabbad00

The PR #615 collects the comments in this ticket into a single, more coherent, guide, along with changes to the logs.json for the schema of the webhook and a sample rule that will fire anytime a webhook is triggered.

Mar 01 '18 19:03 0xdabbad00

hey @0xdabbad00 I'll take some more time to read through this, but it looks great so far. can you adjust the IAM policy for API gateway to allow sending to specific streams vs *?

Mar 01 '18 20:03 jacknagz

Yes. The API Gateway is configured such that it will only write to the stream you configure it for, but once we decide how to integrate this into StreamAlert, I'll tighten those permissions in a PR. This would be better if integrated into the terraform config of StreamAlert so we know the name of the Stream it will be sending the records to and can automatically set the policy accordingly.
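
As a sketch of what the tightened policy might look like, the policy document can be generated with the stream ARN filled in. `put_record_policy` is a hypothetical helper, and the region and account id in the usage line are placeholders.

```python
import json


def put_record_policy(region, account_id, stream_name):
    """Return an IAM policy allowing kinesis:PutRecord on a single stream."""
    stream_arn = "arn:aws:kinesis:{}:{}:stream/{}".format(
        region, account_id, stream_name
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord"],
                # Scoped to one stream instead of "*".
                "Resource": stream_arn,
            }
        ],
    }


policy = put_record_policy(
    "us-east-1", "123456789012", "test_prod_stream_alert_kinesis"
)
print(json.dumps(policy, indent=4))
```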

Mar 01 '18 22:03 0xdabbad00

I corrected the IAM policy, fixed a step I had forgotten, swapped the ordering of another step I discovered needed to be swapped, and fixed the formatting.

Mar 05 '18 21:03 0xdabbad00

We tried integrating Facebook's Certificate Transparency into this webhook flow: https://developers.facebook.com/docs/certificate-transparency/certificates-webhook

Unfortunately, the webhook I created only handles POST requests, and Facebook sends an initial GET request with a parameter that must be echoed back. I assume this is done to avoid sending unsolicited requests to someone. The parameter is sent in a query string such as ?hub.mode=subscribe&hub.challenge=123456&hub.verify_token=token_you_provide, so the GET needs to respond with 123456 as the body. I think cases like this are going to be one-offs, and probably should be handled independently of StreamAlert, but I wanted to mention it as something to consider for this.
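
The handshake logic itself is small. Here is a sketch of what a GET handler needs to compute, using the example query string above. `challenge_response` is a hypothetical function, and checking hub.verify_token before echoing is an extra step that I am assuming is desirable.

```python
from urllib.parse import parse_qs


def challenge_response(query_string, expected_verify_token):
    """Return the challenge to echo back, or None if the request is not valid."""
    params = parse_qs(query_string.lstrip("?"))
    mode = params.get("hub.mode", [""])[0]
    token = params.get("hub.verify_token", [""])[0]
    challenge = params.get("hub.challenge", [None])[0]
    if mode == "subscribe" and token == expected_verify_token:
        return challenge
    return None


print(challenge_response(
    "?hub.mode=subscribe&hub.challenge=123456&hub.verify_token=token_you_provide",
    "token_you_provide",
))
```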

Mar 07 '18 21:03 0xdabbad00

For those following along, the challenge response to Facebook can be set up as follows. I perform most of the same steps as before, except this time for a GET request, and I echo back the challenge that is sent as a query parameter.

aws apigateway put-method --rest-api-id REST_API_ID --resource-id RESOURCE_ID --http-method GET --authorization-type NONE

# Use the same requestTemplate.json as described previously

aws apigateway put-integration \
    --rest-api-id REST_API_ID \
    --resource-id RESOURCE_ID \
    --http-method GET \
    --integration-http-method POST \
    --type AWS \
    --uri "arn:aws:apigateway:REGION:kinesis:action/PutRecord" \
    --credentials "arn:aws:iam::ACCOUNT_ID:role/StreamWriter" \
    --request-templates file://requestTemplate.json \
    --passthrough-behavior NEVER

# This `put-integration-response` is the key part of this that will echo back the challenge to Facebook.

cat << EOF > response-template.json
{"text/plain":"\$input.params().get('querystring').get('hub.challenge')"}
EOF

aws apigateway put-integration-response --rest-api-id REST_API_ID --resource-id RESOURCE_ID --http-method GET --status-code 200 --response-templates file://response-template.json

aws apigateway put-method-response --rest-api-id REST_API_ID --resource-id RESOURCE_ID --http-method GET --status-code 200 --response-models '{"application/json": "Empty"}'

aws apigateway create-deployment --rest-api-id REST_API_ID --stage-name deployed

Mar 09 '18 15:03 0xdabbad00
