Automated-Fact-Checking-Resources
Automated-Fact-Checking-Resources copied to clipboard
Issue - No log entry written when message is too long
Bug Report / Support Request Template
Please clearly describe what the SQL Sink is doing incorrectly:
We have an existing logs table which has some indexed columns for lookup and searching. Due to the required index, those columns (e.g. the message column) also have a max-length parameter set.
However when logs are written that have a message which is longer, these messages are not written to the server.
The sink actually fails in the bulk copy operation WriteToServerAsync
because of an
InvalidOperationException:
The given value of type String from the data source cannot be converted to type nvarchar of the specified target column.
The same message you will see when enabling Serilogs SelfLog. However, the inner exception message would be more helpful in my opinion:
InvalidOperationException: String or binary data would be truncated.
Please clearly describe the expected behavior: I would have expected, that I'm able to specify the length of data (e.g. the message column) in the configuration and that this property is considered for formatting the message. (Note: I've seen in the current dev branch that the new class
SqlColumn
has theDataLength
property - however this will not make a difference in this scenario, because the message is still being written into the internal DataTable withlogEvent.RenderMessage
not taking this new property into account)
Maybe in combination with a SelfLog.WriteLine
warning about truncated messages.
Currently however I don't seem to have a possibility to control this from outside or modify the log message before it enters the sink (which is actually not an option as we use a file sink simultaneously where the entire message shall be written to).
List the names and versions of all Serilog packages used in the project:
- Serilog: 2.7.1
- Serilog.Sinks.MSSqlServer: 5.1.2
- Configuration:
private static LoggerConfiguration CreateMsSQLLogConfiguration(LoggerConfiguration loggerConfig)
{
Serilog.Sinks.MSSqlServer.ColumnOptions tableColumnOptions = new Serilog.Sinks.MSSqlServer.ColumnOptions();
tableColumnOptions.DisableTriggers = true;
tableColumnOptions.Store.Remove(Serilog.Sinks.MSSqlServer.StandardColumn.MessageTemplate);
tableColumnOptions.Store.Remove(Serilog.Sinks.MSSqlServer.StandardColumn.Properties);
tableColumnOptions.TimeStamp.ConvertToUtc = true;
tableColumnOptions.TimeStamp.ColumnName = "Date";
tableColumnOptions.AdditionalDataColumns = new Collection<System.Data.DataColumn>
{
new System.Data.DataColumn("UserName", typeof(string)),
new System.Data.DataColumn("Logger", typeof(string))
};
return loggerConfig.WriteTo
.MSSqlServer(AppSettings.SqlConnectionString, "Logs",
batchPostingLimit: 1,
period: TimeSpan.FromSeconds(1),
columnOptions: tableColumnOptions);
// autoCreateSqlTable: true);
}
Target framework and operating system:
- [ ] .NET Core 2.0
- [x] .NET Framework 4.7
- [ ] .NET Framework 4.6.x
- [ ] .NET Framework 4.5.x OS: Windows 10
Provide a simple reproduction of your Serilog configuration code:
- Create a table with Message column configured as nvarchar(10)
- Write a log message with more than 10 characters
- Note the exception happening in
WriteToServerAsync
Possible solution to this problem would be:
In MSSqlServerSinkTraits.GetStandardColumnNameAndValue
instead of calling logEvent.RenderMessage(formatProvider)
and take the result as is, refactor this call to a method which might look similar to this (just to get an idea):
private string GetLogMessage(LogEvent logEvent)
{
string rawMessage = logEvent.RenderMessage(FormatProvider);
Int32 maxAllowedMessageLength = ColumnOptions.Message.DataLength;
if (0 < maxAllowedMessageLength && rawMessage.Length > maxAllowedMessageLength)
{
rawMessage = rawMessage.Substring(0, maxAllowedMessageLength - 3);
rawMessage = $"{rawMessage}...";
}
return rawMessage;
}
Of course this approach needs to be generalized and the max length and the string should be the parameter. This way such a utility method (e.g. TruncateInputToAllowedLength
or something) could be used for any data being written to the table while comparing against the DataLength
property of the respective SqlColumn configuration (or against the MaxLength
of the DataColumn in case of AdditionalColumns when handling the logEvents properties).
Furthermore, I think one should use the SelfLog to inform about truncated messages so the consumer of the library has a chance to find out what is going on under the hood.
Thank you for reporting this. Is it possible for you to send us a pull request including tests?
Is there a way to highlite this issue as critical bug (according to principle of less surprise) and mention in documentation in red? Production databases rarely refactored, so there is no way for me to use nvarchar(max) in nearest years.
This issue also makes main feature (structured logs) - auto-serialization not usable, for every column I need to serialize data manually and then truncacte it to specific value.
When using the batched sink (WriteTo
) there is a big caveat whith wrong input data. If one log event in a batch cannot be written to the database, then the whole batch will be lost. This is not obvious and therefore I have now documented this behavior in the README: https://github.com/serilog/serilog-sinks-mssqlserver/blob/dev/README.md#consider-batched-sink-sqlbulkcopy-behavior
I'm still open for a PR like suggested above to do at least a truncation of input data when configured in the column options.
Furthermore I don't think that this is a critical bug of the sink but it is by design of the underlying SqlBulkCopy
class. If invalid input data (according to the database schema) is supplied, then the batch will be lost. This can not only occurr with length exceeding but also when supplying invalid values according to constraints or supplying null for fields that are non nullable in the database. It is not easily possble, nor desireable considering performance, for the sink to do all these checks. If you regularily run into invalid input data, I recommend using a wrapper around the log methods or using Serilog Enrichers to validate/check and correct the log event data before it gets written to the database.
Message trimming can be used to avoid this case. Closing this issue.