ApplicationInsights-Java icon indicating copy to clipboard operation
ApplicationInsights-Java copied to clipboard

The trace logs of app-insights in azure portal don't reflect the effective container logs (masked out sensitive data)

Open bvahdat opened this issue 2 years ago • 1 comments

Expected behavior

The trace logs provided by app-insights should reflect the effective container logs of a given application.

Actual behavior

To have custom dimensions for our structured logging we make use of:

https://github.com/logfellow/logstash-logback-encoder

And we configure it to mask out sensitive data as per documentation here:

https://github.com/logfellow/logstash-logback-encoder#identifying-field-values-to-mask-by-value

And it actually works pretty well as because looking into the container logs we see that sensitive fields inside JSON are properly masked out, for example the container log entries look like:

{
  "@timestamp": "2022-07-21T12:15:41.181+02:00",
  "@version": "1",
  "message": "Processing of the event with cloudEventId: 6415d31r-b0o3-5cx8w-d934-d5320981278z and salesforceExternalId: 00525000008yPMSCC9 took 294 ms. which was successful.",
  "logger_name": "com.foo.bar.servicebus.salesforce.policy.MdcRoutePolicy",
  "thread_name": "main",
  "level": "INFO",
  "level_value": 20000,
  "forgerockEventBody": "{\n  \"data\": {\n    \"FrIndexedString1\": \"00525000008yPMSCC9\",\n    \"FrUnindexedString1\": \"George Michael\",\n    \"FrUnindexedString2\": \"en-GB\",\n    \"mail\": \"******\",\n    \"telephoneNumber\": \"******\"\n  },\n  \"datacontenttype\": \"application/json\",\n  \"dataschema\": \"foo:integration-hubs\",\n  \"id\": \"6415d31r-b0o3-5cx8w-d934-d5320981278z\",\n  \"source\": \"forgerock:idm\",\n  \"specversion\": \"1.0\",\n  \"time\": \"2018-04-05T17:31:00Z\",\n  \"type\": \"com.forgerock.user.migrated\"\n}\n",
  "trace_id": "00000000000000000000000000000000",
  "camel.contextId": "foo-ihub-connector-servicebus-salesforce",
  "salesforceMigrateAccountsSfResponseCode": "0",
  "span_id": "0000000000000000",
  "salesforceMigrateIndividualRequestBody": "{\"allOrNone\":true,\"compositeRequest\":[{\"body\":{\"FOO_Customer_Identity_External_ID__c\":\"00525000008yPMSCC9\",\"FirstName\":\"George Michael\",\"LastName\":\"Unknown\",\"REIDP_Email__c\": \"******\",\"REIDP_ForgeRock_Identity__c\":\"true\",\"REIDP_LanguageLocaleKey__c\":\"en-GB\",\"REIDP_Mobilephone__c\": \"******\"},\"method\":\"POST\",\"referenceId\":\"refIndividual\",\"url\":\"/services/data/v54.0/sobjects/Individual\"},{\"method\":\"GET\",\"referenceId\":\"refAccount\",\"url\":\"/services/data/v54.0/query/?q=SELECT+id+from+Account+where+FOO_Customer_Identity_External_ID__pc+=+%2700525000008yPMSCC9%27+AND+IsPersonAccount+=+TRUE\"}]}",
  "camel.routeId": "id-servicebus-to-salesforce",
  "cloudEventId": "6415d31r-b0o3-5cx8w-d934-d5320981278z",
  "salesforceMigrateIndividualResponseBody": "{\n  \"compositeResponse\": [\n    {\n      \"body\": {\n        \"id\": \"00525000008yPMSCC9\",\n        \"success\": true,\n        \"errors\": []\n      },\n      \"httpHeaders\": {\n        \"Location\": \"/services/data/v54.0/sobjects/Individual/0PK25000002zW7tGAE\"\n      },\n      \"httpStatusCode\": 201,\n      \"referenceId\": \"refIndividual\"\n    },\n    {\n      \"body\": {\n        \"totalSize\": 1,\n        \"done\": true,\n        \"records\": [\n          {\n            \"attributes\": {\n              \"type\": \"Account\",\n              \"url\": \"/services/data/v54.0/sobjects/Account/0012500001i9vb3AAA\"\n            },\n            \"Id\": \"0012500001i9vb3AAA\"\n          }\n        ]\n      },\n      \"httpHeaders\": {},\n      \"httpStatusCode\": 200,\n      \"referenceId\": \"refAccount\"\n    }\n  ]\n}",
  "camel.exchangeId": "F577AC290CF4B5D-0000000000000000",
  "salesforceExternalId": "00525000008yPMSCC9",
  "salesforceInvocationSucceeded": "true",
  "camel.messageId": "F577AC290CF4B5D-0000000000000000",
  "salesforceMigrateAccountsResponseBody": "{\"allOrNone\":true,\"compositeRequest\":[{\"body\":{\"FirstName\":\"George Michael\",\"LanguageLocaleKey__pc\":\"en-GB\",\"LastName\":\"Unknown\",\"PersonEmail\": \"******\",\"PersonIndividualId\":\"00525000008yPMSCC9\",\"PersonMobilePhone\": \"******\",\"REIDP_ForgeRock_Identity__pc\":\"true\"},\"method\":\"PATCH\",\"referenceId\":\"referenceId0012500001i9vb3AAA\",\"url\":\"/services/data/v54.0/sobjects/Account/0012500001i9vb3AAA\"}]}",
  "salesforceMigrateIndividualSfResponseCode": "201",
  "salesforceMigrateAccountsRequestBody": "{\"allOrNone\":true,\"compositeRequest\":[{\"body\":{\"FirstName\":\"George Michael\",\"LanguageLocaleKey__pc\":\"en-GB\",\"LastName\":\"Unknown\",\"PersonEmail\": \"******\",\"PersonIndividualId\":\"00525000008yPMSCC9\",\"PersonMobilePhone\": \"******\",\"REIDP_ForgeRock_Identity__pc\":\"true\"},\"method\":\"PATCH\",\"referenceId\":\"referenceId0012500001i9vb3AAA\",\"url\":\"/services/data/v54.0/sobjects/Account/0012500001i9vb3AAA\"}]}",
  "processingTimeInMillis": "294"
}

Notice the masked out fields which are:

\"mail\": \"******\"
\"REIDP_Email__c\": \"******\"
\"PersonEmail\": \"******\"
\"telephoneNumber\": \"******\"
\"PersonMobilePhone\": \"******\"
\"REIDP_Mobilephone__c\": \"******\"

But when we look at the corresponding trace logs of AppInsights in portal they are not masked out and show the sensitive data we are actually trying to mask out!

The corresponding logback-spring.xml of the spring-boot application is as the following (the used spring-boot profile at runtime is the default profile and not local):

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <springProfile name="local">
        <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
        <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>
        <root level="INFO">
            <appender-ref ref="CONSOLE"/>
        </root>
    </springProfile>

    <springProfile name="!local">
        <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
            <encoder class="net.logstash.logback.encoder.LogstashEncoder">
                <jsonGeneratorDecorator class="net.logstash.logback.mask.MaskingJsonGeneratorDecorator">
                    <valueMask>
                        <value>"mail"\s*:\s*"[^"]+"</value>
                        <mask>"mail":"******"</mask>
                    </valueMask>
                    <valueMask>
                        <value>"REIDP_Email__c"\s*:\s*"[^"]+"</value>
                        <mask>"REIDP_Email__c":"******"</mask>
                    </valueMask>
                    <valueMask>
                        <value>"PersonEmail"\s*:\s*"[^"]+"</value>
                        <mask>"PersonEmail":"******"</mask>
                    </valueMask>
                    <valueMask>
                        <value>"telephoneNumber"\s*:\s*"[^"]+"</value>
                        <mask>"telephoneNumber":"******"</mask>
                    </valueMask>
                    <valueMask>
                        <value>"REIDP_Mobilephone__c"\s*:\s*"[^"]+"</value>
                        <mask>"REIDP_Mobilephone__c":"******"</mask>
                    </valueMask>
                    <valueMask>
                        <value>"PersonMobilePhone"\s*:\s*"[^"]+"</value>
                        <mask>"PersonMobilePhone":"******"</mask>
                    </valueMask>
                </jsonGeneratorDecorator>
            </encoder>
        </appender>
        <root level="INFO">
            <appender-ref ref="JSON"/>
        </root>
    </springProfile>
</configuration>

So to says from a timeline perspective, app-insights seems to infer the written logs after the encoder tag by the configuration above but before the jsonGeneratorDecorator tag.

To Reproduce

Steps to reproduce the behavior:

Sample Application

If applicable, provide a sample application which reproduces the issue.

System information

Please provide the following information:

  • SDK Version: Tried out 3.2.11, 3.3.0 and 3.3.1 having the same behaviour.
  • OS type and version: Running in AKS using the azul/zulu-openjdk-alpine:11-jre-headless base image.
  • Application Server type and version (if applicable):
  • Using spring-boot? Yes, version 2.7.2
  • Additional relevant libraries (with version, if applicable): Apache Camel 3.18.1

Logs

Turn on SDK logs and attach/paste them to the issue. If using an application server, also attach any relevant server logs.

The Turn on SDK logs link above doesn't work for me.

Screenshots

If applicable, add screenshots to help explain your problem.

bvahdat avatar Aug 25 '22 05:08 bvahdat

@bvahdat was asking me if this has been resolved. I understand that #6541 addresses sanitization, and wondered if it resolves this issue?

richorama avatar Sep 22 '22 10:09 richorama

Same problem here, the app-insights doesn´t take the mask logs. I can see they are mask on pod logs but unmasked on app-insights

Thanks

srferron avatar Oct 19 '22 11:10 srferron

Hi, I am facing the same problem. Is this solved please?

sahil-goel avatar Nov 17 '23 09:11 sahil-goel

@sahil-goel @srferron @richorama @bvahdat Sorry for taking so long to get to this issue.

Can you try our TelemetryProcessor preview feature to mask sensitive data in log message: Here is a sample of the json config.

The whole test is #3408. Please let me know if this works for you and ping me if you have further questions.

Essentially, sensitive data can be extracted out as a Log attribute and then delete it on the fly. It will have {redactedKeyName} instead of "*****".

If this still doesn't meet your need, I can try to apply mask feature to log body. That will require feature improvement and a new release.

heyams avatar Nov 29 '23 21:11 heyams

@heymas - what you explained don't really help

I have 100 microservices running in a AKS cluster and each connecting with the same application insights. Every microservice has its own configuration on what to mask in the logs. As of now, things are getting masked in AKS logs as per microservice configuration but when it comes to Application Insights logs, the fields are not masked.

Application insights.json is a centralised configuration and is not per microservice.

Ideally, I would want logs to be exactly same in AKS and App Insights

sahil-goel avatar Dec 11 '23 10:12 sahil-goel

@sahil-goel how do you attach Java agent? at the microservice level or AKS cluster level?

heyams avatar Dec 11 '23 21:12 heyams

this is related to https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/5969

heyams avatar Dec 12 '23 18:12 heyams

https://github.com/open-telemetry/opentelemetry-java/issues/5187#issuecomment-1689923356

logstash-logback-encoder library offers some powerful configuration. can you give it a try?

heyams avatar Dec 12 '23 23:12 heyams

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.