Summary

At the moment access logs only print a limited number of information. Moreover, there is no way to select the type of information that needs to be printed or the format of the log entry. This proposal is to address these limitations.

Goals

Make access log information configurable
Add new information to access log
Introduce JSON format for access log

Non-Goals

Make the order of access log information configurable

Motivation

It is common to have different information in the access log based on the organization. Therefore, current limitations in access log, hinders the usability of Ballerina.

Description

At the moment one can enable access log using the below configuration for a given HTTP service.

[ballerina.http.accessLogConfig]
# Enable printing access logs on the Console. The default value is `false`.
console = true
# Specify the file path to save the access logs. This is optional.  
path = "testAccessLog.txt"

These configurations need to be in the config.toml and the resulting output looks as follows.

127.0.0.1 - - [15/Dec/2022:11:39:42 +0530] "GET /albums HTTP/1.1" 200 95 "-" "curl/7.79.1"

The above output is fixed and the user has no control over it. First thing that is needed to make it configurable is to list down the type information. Following is the key and description of each information that can be printed with the new proposal.

Name	Description
remote_ip	IP of the remote address
verb	HTTP verb of the request
path	Path of the request
schema_and_version	Schema of the request and HTTP version
status_code	HTTP status code
response_size	Size of the response body
request_size	Size of the request body
response_time	Time to took send the response
agent	Name of the agent who sent the request
upstream_ip	IP of the upstream server
upstream_response_time	Upstream server response time

Please note that time and the date of log is not configurable. Then to make this information configurable, a new key is introduced to the config.toml configuration as attributes.

[ballerina.http.accessLogConfig]
# The list of attributes to be printed
attributes = remote_ip, verb, path, schema_and_version, status_code, response_size, agent

One can configure the list of attributes as needed. The above is the default list. Finally, to change the format of access log entry another new configuration is introduced as format. The two formats are flat and json.

[ballerina.http.accessLogConfig]
# The list of attributes to be printed
format = flat

The default value of the format key is flat, which is concise and has least impact on the performance. This can be changed by setting the value to json, which results in the same information in json format.

{
  "remote_ip": "127.0.0.1",
  "verb": "GET",
  "path": "/albums",
  "schema_and_version": "HTTP/1.1",
  "status_code": 200,
  "response_size": 95,
  "agent": "curl/7.79.1",
  "time_stamp": "[15/Dec/2022:11:39:42 +0530]"
}

Feb 28 '24 11:02 shafreenAnfar

Hi Shafreen,

Thank you for the proposal. Let me add a few questions that I need to clarify.

Regarding path Path of the request I think this is the downstream request. This is important to us. And also, there are some customers who does some changes to the path before sending the upstream request. So its good if we can see the upstream request path too.
status_code - we need to confirm whether this status code originated from the upstream or due to some reason mediation service initiated this. Do we have a way for that? Envoy has an attribute called RESPONSE_CODE_DETAILS which solves a similar problem.
response_time Time to took send the response Is this the total response time on the listener side right? This is important to us. This means response_time - upstream_response_time gives the latency added by the ballerina mediation service, which is an important measure we want to identify. Hope my understanding is correct.
agent - this is the UserAgent header?
What is the difference between remote_ip and upstream_ip? Is this client IP vs upstream IP?
The other parameters you mentioned (verb, schema_and_version, response_size, request_size) are required too. +1 to include them.

Anther important facts we need to see is:

Can we have another field called requestId which is logged by using an incoming header value. This is important for us to correlate requests throughout the path (nginx LB -> Gateway -> Mediation service) we use the same request id. By default we use the x-request-id header.
What happens to the above log when there's a transport failure during the upstream call (connection refuse, connection timeout, read timeout) and can we identify that from the log?

The default value of the format key is flat, which is concise and has least impact on the performance. This can be changed by setting the value to json, which results in the same information in json format.

If we used json or flat, we will get all the details in a single log line right? Hope we'll not format the json with multiline (coz logs will be overlapped in high concurrency).

Mar 01 '24 05:03 malinthaprasan

+1 for this =) we have the same requirement for BCentral.

Mar 19 '24 06:03 xlight05

[Proposal] Improve AccessLog in Ballerina Services

Ballerina's current HTTP access logging mechanism provides a static and limited view of interactions, lacking the flexibility required by diverse organizational needs. This proposal advocates for a more configurable logging system, introducing options for log content customization, format variation (including JSON), and adding new information types to access logs.

Objectives

Configurable Log Information: Empower users to select and configure the log information relevant to their needs.
Enhanced Log Content: Enrich the access logs by adding new types of information.
Format Flexibility: Introduce JSON as an alternative log format, alongside the existing flat format.

Motivation

Different organizations have unique requirements for logging web access data, driven by varied use cases such as security analysis, performance monitoring, and compliance auditing. The inflexibility in Ballerina's access logging fails to meet these diverse needs, thus limiting its utility and adoption. By enhancing customization capabilities, Ballerina can better serve a wider array of enterprise and developer scenarios.

Description

Current Limitations

Presently, enabling HTTP access logs in Ballerina is constrained to minimal configuration options, as shown below:

[ballerina.http.accessLogConfig]
console = true  # Enables logging to the console.
path = "testAccessLog.txt"  # Optional: Specifies a file path for logging.

The resultant log entry is rigid:

127.0.0.1 - - [15/Dec/2022:11:39:42 +0530] "GET /albums HTTP/1.1" 200 95 "-" "curl/7.79.1"

AccessLog in other Web Servers

Nginx

Nginx offers a highly customizable logging mechanism. Users can define custom log formats by specifying the desired variables in the nginx.conf configuration file. Here’s an example of how to define a custom log format:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

In addition to this, since Nginx supports a customizable logging mechanism, we can customize the logging format to be JSON as well as below:

log_format json_combined escape=json '{'
    '"time_local":"$time_local",'
    '"remote_addr":"$remote_addr",'
    '"remote_user":"$remote_user",'
    '"request":"$request",'
    '"status": "$status",'
    '"body_bytes_sent":"$body_bytes_sent",'
    '"request_time":"$request_time",'
    '"http_referrer":"$http_referer",'
    '"http_user_agent":"$http_user_agent"'
'}';

In this configuration, escape=json ensures that strings are properly escaped to conform to JSON standards.

Apache HTTP Server

Apache uses modules to extend its functionality, with mod_log_config providing flexible logging capabilities. This module allows for a high degree of customization in how access logs are formatted and what information is logged.

Apache allows administrators to define custom log formats using the LogFormat directive. This directive can include a wide variety of variables, enabling detailed control over what information is logged and its presentation:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" my_custom_format

While Apache does not natively support JSON logging out-of-the-box as simply as Nginx does with its escape=json parameter, you can achieve JSON logging by creatively using its custom log format capabilities.

LogFormat "{ \"remote_ip\": \"%h\", \"time\": \"%t\", \"request\": \"%U%q\", \"status\": %>s, \"referrer\": \"%{Referer}i\", \"user_agent\": \"%{User-Agent}i\" }" json

Proposed Enhancements

Configurable Attributes

To introduce flexibility, we propose listing specific types of log information (e.g., ip, request_method, patrequest_uri, status, response_body_size) that users can selectively include in their logs. The introduction of an attributes configuration in config.toml will allow for such customization:

Name	Description
ip	The IP address of the client making the request
request	The full HTTP request line, including the method, URI, and protocol version
request_method	The HTTP method of the request
request_uri	The URI of the request, including parameters
scheme	Scheme of the request and HTTP version
status	The HTTP status code sent to the client
request_body_size	Size of the request body
response_body_size	The size of the HTTP response body sent to the client, in bytes
request_time	The total request processing time, from reading the client's request to sending the last byte of the response
http_referrer	The Referer header field of the request, indicating the page from which the user navigated
http_user_agent	The User-Agent header field, name of the agent who sent the request
http_x_forwarded_for	The originating IP address of a client connecting through an HTTP proxy or a load balancer
http_(X-Custom-Header)	Header fields. Referring to them with `http` followed by the header name (`x-request-id` -> `http_x-request-id`)
host	Host name from the request that the client sent

Default fields: ip, request, status, response_body_size, http_referer, http_user_agent

Note: All the above fields are available for upstream as well.

Note: Most of these field names are common across all other services, we wanted to stick with the common terminology, in most cases we have followed nginx terminology.

[ballerina.http.accessLogConfig]
attributes = ["ip", "request", "status", "response_body_size", "http_user_agent"]

Log Formats

In addition to the flat format, we propose introducing a JSON format for access logs to provide structured and easily parseable log data. Users can select their preferred format via a new format configuration:

[ballerina.http.accessLogConfig]
format = "json"  # Options: "flat", "json"

The FLAT format example:

No upstream server:

127.0.0.1 - - [15/Dec/2022:11:39:42 +0530] "GET /albums HTTP/1.1" 200 95 "-" "curl/7.79.1"

In the presence of upstream servers: We use "~" to differentiate downstream and upstream. In the below case, there are two upstream servers, and the information on that is logged as per the defined attributes.

127.0.0.1 - - [15/Dec/2022:11:39:42 +0530] "GET /albums HTTP/1.1" 200 95 "-" "curl/7.79.1" "~" 192.168.3.7 - - [15/Dec/2022:11:39:48 +0530] "POST /myAlbums HTTP/1.1" 201 235 "-" "ballerina" 192.168.5.5 - - [15/Dec/2022:11:39:55 +0530] "GET /yourAlbums HTTP/1.1" 200 222 "-" "ballerina"

The JSON format example:

{
  "ip": "127.0.0.1",
  "verb": "GET",
  "path": "/albums",
  "schema": "HTTP/1.1",
  "status": 200,
  "response_size": 95,
  "agent": "curl/7.79.1",
  "time_stamp": "[15/Dec/2022:11:39:42 +0530]",
  "upstream": [
    {
      "ip": "192.168.3.7",
      "verb": "POST",
      "path": "/myAlbums",
      "schema": "HTTP/1.1",
      "status": 201,
      "response_size": 36,
      "agent": "ballerina",
      "time_stamp": "[15/Dec/2022:11:39:42 +0530]"
    },
    {
      "ip": "192.168.5.5",
      "verb": "GET",
      "path": "/yourAlbums",
      "schema": "HTTP/1.1",
      "status": 200,
      "response_size": 40,
      "agent": "ballerina",
      "time_stamp": "[15/Dec/2022:11:39:42 +0530]"
    }
  ]
}

Conclusion

By enhancing the Ballerina HTTP access log's customization capabilities, we aim to address current limitations and better fulfill the diverse requirements of users and organizations.

Mar 21 '24 12:03 AzeemMuzammil

@AzeemMuzammil Can we clearly differentiate between the response_time and request_time in the proposal?

Mar 26 '24 04:03 TharmiganK

Sorry, now only could look at the proposal.

As @TharmiganK also mentioned, "response_time Time took to send the response" is not clear to me too. Can we elaborate a bit more on this?
Is it possible to add upstream_hostname too? (Do not try to reverse dns look up and resolve using the IP. We can log this only if the upstream host is provided)
And also, with mediation services, we are using client->execute method and only once during an request. (So the upstream array will only have one json object). Hope this path doesn't cause any issues.
Are we using the existing ballerina access log config to enable this too? or do we need additional configs?
As I mentioned earlier too, hope JSON format is printed with a single line by default (I think you mentioned a formatted multiline json only for reading purposes).
Can you also tryout a few different scenarios where the backend doesn't response properly, eg:
- Connection failure (this can be a connection refuse, failed TLS handshake etc)
- Connection timeout (ballerina client tries to initiate a connection with the backend but the server doesn't respond to the connection request)
- Response timeout (initial TCP connection with the backend is successful, but server doesn't respond on time with the HTTP response)
- Client closes the connection before backend responds (the client who calls the ballerina service closes the connection before backend responds)
These are commonly occurring failures when connecting to backends. Its important to reflect these from access logs. These cases I mentioned are mainly for HTTP/1.1. I think @TharmiganK might know more scenarios which are related to HTTP/2.0.

Other than that, other available fields looks good to me.

@VirajSalaka @Thushani-Jayasekera @renuka-fernando please have a look at the proposal and see if there are anything important missing.

Mar 26 '24 04:03 malinthaprasan

The following are some other suggestions.

In the access log config response_body_size refers to The size of the HTTP response body sent to the client, in bytes. Is it possible to add upstream_response_body_size as well for each upstream?
request_body_size refers to the Size of the request body (from downstream). Is it possible to add upstream_request_body_size where the payload size of the request is sent to the upstream?

Apr 07 '24 13:04 renuka-fernando

@AzeemMuzammil Can we clearly differentiate between the response_time and request_time in the proposal?

As we discussed, we will be removing request_time field as there is no consistent interpretation for this, and response_time captures what are we interested in. (Updated the proposal as per this)

* Is it possible to add upstream_hostname too? (Do not try to reverse dns look up and resolve using the IP. We can log this only if the upstream host is provided)

Yes, we will and we can add the host name as well under the filed name host if host is provided.

* And also, with mediation services, we are using client->execute method and only once during an request. (So the upstream array will only have one json object). Hope this path doesn't cause any issues.

No, this would not cause any issues and should not,

* Are we using the existing ballerina access log config to enable this too? or do we need additional configs?

The existing log config should work fine without any issues, but to enable any additional properties to lot, we have to enable them under the new property attributes introduced in Config.toml, in addition to this I have updated the proposal with default fields and all existing fields.

* As I mentioned earlier too, hope JSON format is printed with a single line by default (I think you mentioned a formatted multiline json only for reading purposes).

Yes, correct

* Can you also tryout a few different scenarios where the backend doesn't response properly, eg:
  
  * Connection failure (this can be a connection refuse, failed TLS handshake etc)
  * Connection timeout (ballerina client tries to initiate a connection with the backend but the server doesn't respond to the connection request)
  * Response timeout (initial TCP connection with the backend is successful, but server doesn't respond on time with the HTTP response)
  * Client closes the connection before backend responds (the client who calls the ballerina service closes the connection before backend responds)
  
  These are commonly occurring failures when connecting to backends. Its important to reflect these from access logs. These cases I mentioned are mainly for HTTP/1.1. I think @TharmiganK might know more scenarios which are related to HTTP/2.0.

Sure will check this as well,

* In the access log config `response_body_size` refers to `The size of the HTTP response body sent to the client, in bytes`. Is it possible to add `upstream_response_body_size` as well for each upstream?

Yes, every fields which are available under downstream path will be available for upstream path as well, I have updated this in the proposal to reflect this.

* `request_body_size` refers to the `Size of the request body` (from downstream). Is it possible to add `upstream_request_body_size` where the payload size of the request is sent to the upstream?

Yes, we will have that filed as well.

Please check the responses @TharmiganK @malinthaprasan @renuka-fernando, feel free to ask any questions if available.

Apr 25 '24 05:04 AzeemMuzammil

Hi @AzeemMuzammil,

Could you please provide an update on the current status?

I am sorry, this request might be coming in a bit late. In recent customer issues, when we’ve noticed delays in their API responses we have to analyse the network traces etc to get further details. After analyzing the situation using TCP dumps and other network tools, we usually find that these delays are often occurring at the connection establishment level. Is it possible for you to log the time taken for connection establishment in the access log itself? It will be a great time saver for us. For connection reuse, the initial request should log a positive value, while subsequent requests using the same connection should log a time of 0, as there's no need for re-establishment.

Jun 10 '24 06:06 malinthaprasan

Also reminder about below as they are very important.

* Can you also tryout a few different scenarios where the backend doesn't response properly, eg:
  
  * Connection failure (this can be a connection refuse, failed TLS handshake etc)
  * Connection timeout (ballerina client tries to initiate a connection with the backend but the server doesn't respond to the connection request)
  * Response timeout (initial TCP connection with the backend is successful, but server doesn't respond on time with the HTTP response)
  * Client closes the connection before backend responds (the client who calls the ballerina service closes the connection before backend responds)
  
  These are commonly occurring failures when connecting to backends. Its important to reflect these from access logs. These cases I mentioned are mainly for HTTP/1.1. I think @TharmiganK might know more scenarios which are related to HTTP/2.0.

Jun 10 '24 06:06 malinthaprasan

Hi @AzeemMuzammil,

Could you please provide an update on the current status?

I am sorry, this request might be coming in a bit late. In recent customer issues, when we’ve noticed delays in their API responses we have to analyse the network traces etc to get further details. After analyzing the situation using TCP dumps and other network tools, we usually find that these delays are often occurring at the connection establishment level. Is it possible for you to log the time taken for connection establishment in the access log itself? It will be a great time saver for us. For connection reuse, the initial request should log a positive value, while subsequent requests using the same connection should log a time of 0, as there's no need for re-establishment.

Hi @malinthaprasan as we discussed offline, we will look at this as a separate issue as we are already in line to merge the current changes. To track that I will create an issue.

Also reminder about below as they are very important.

* Can you also tryout a few different scenarios where the backend doesn't response properly, eg:
  
  * Connection failure (this can be a connection refuse, failed TLS handshake etc)
  * Connection timeout (ballerina client tries to initiate a connection with the backend but the server doesn't respond to the connection request)
  * Response timeout (initial TCP connection with the backend is successful, but server doesn't respond on time with the HTTP response)
  * Client closes the connection before backend responds (the client who calls the ballerina service closes the connection before backend responds)
  
  These are commonly occurring failures when connecting to backends. Its important to reflect these from access logs. These cases I mentioned are mainly for HTTP/1.1. I think @TharmiganK might know more scenarios which are related to HTTP/2.0.

I have tested out the scenarios mentioned above and we do still log these but not part of access logs, incorporating these into access logs requires a design reconsideration, we will look into this if possible,

Jun 18 '24 05:06 AzeemMuzammil

I have tested out the scenarios mentioned above and we do still log these but not part of access logs, incorporating these into access logs requires a design reconsideration, we will look into this if possible

In what type of log do we have this information? And how do we correlate them? Bdw, we need to consider this as an important aspect of access logs. One of the main values of these logs is for troubleshooting. If we don't have this data in the access logs, we'll be missing out on important details..

Jun 20 '24 11:06 malinthaprasan

@malinthaprasan as I mentioned these are not logged as part of access logs, and we have to implement these as part of access logs if required, can you create a new issue for any other requirements that are needed to be part of the access log other than what is mentioned in the proposal. whatever is mentioned in the proposal will be released as part of update 10. and anything in addition we can incorporate in upcoming updates. Thanks,

For the moment, since the feature is implemented and being shipped with update 10, I'll close the issue, if required feel free to open it.

Aug 02 '24 06:08 AzeemMuzammil

Ack, that is fine.

Is the feature released in an 2201.5.x version now? (Asking because the issue was closed)

Aug 02 '24 06:08 malinthaprasan

No, the feature will be released with 2201.10.0, the reason to close the issue is, that the relevant change is already merged and in the pipeline to release with 2201.10.0 as we planned.

Aug 05 '24 05:08 AzeemMuzammil

ballerina-library
ballerina-library copied to clipboard

[Proposal] Add and make access logs attributes configurable

Summary

Goals

Non-Goals

Motivation

Description

[Proposal] Improve AccessLog in Ballerina Services

Objectives

Motivation

Description

Current Limitations

AccessLog in other Web Servers

Nginx

Apache HTTP Server

Proposed Enhancements

Configurable Attributes

Log Formats

Conclusion

ballerina-library ballerina-library copied to clipboard

[Proposal] Add and make access logs attributes configurable

Summary

Goals

Non-Goals

Motivation

Description

[Proposal] Improve AccessLog in Ballerina Services

Objectives

Motivation

Description

Current Limitations

AccessLog in other Web Servers

Nginx

Apache HTTP Server

Proposed Enhancements

Configurable Attributes

Log Formats

Conclusion

ballerina-library
ballerina-library copied to clipboard