logstash-input-http

Error in handling charsets different from UTF-8

Open andsel opened this issue 4 years ago • 2 comments

  • Version: 3.3.5
  • Operating System:
  • Config File (if you have sensitive info, please remove it):
input {
	http {
		port => 9006
		codec => plain {
			charset => "CP1254"
		}
	}
}	

output {
	stdout {
		codec => json {charset => "UTF-8"}
	}
}
  • Sample Data: Python script used as a client to send the encoded data
import requests

# Logstash http input from the pipeline above
API_ENDPOINT = "http://127.0.0.1:9006"
message = 'TÜRKÇE karakter test : ĞÜŞİÇÖışüğöç'
# Send the raw CP1254-encoded bytes as the request body
r = requests.post(url=API_ENDPOINT, data=message.encode('cp1254'))
  • Steps to Reproduce:
    • run logstash with the pipeline
    • execute the python script
    • the console output is:
{"message":"T�RK�E karakter test : ������������","@version":"1","@timestamp":"2020-11-30T10:38:55.338Z","headers":{"connection":"keep-alive","request_method":"POST","http_accept":"*/*","http_user_agent":"python-requests/2.21.0","content_length":"35","http_version":"HTTP/1.1","http_host":"127.0.0.1:9006","request_path":"/","accept_encoding":"gzip, deflate"},"host":"127.0.0.1"}
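
A minimal sketch of what this output suggests, assuming the request body is interpreted as UTF-8 somewhere before the CP1254 codec is applied (an assumption, not confirmed against the plugin source). Decoding the CP1254 bytes as UTF-8 with replacement in plain Python reproduces the same pattern of � characters seen in the message field:

```python
# Same string the client script sends.
message = "TÜRKÇE karakter test : ĞÜŞİÇÖışüğöç"

# CP1254 is a single-byte encoding, so the payload is 35 bytes,
# matching the content_length header in the console output.
payload = message.encode("cp1254")

# Assumption: the receiver treats the bytes as UTF-8. Every non-ASCII
# CP1254 byte is an invalid UTF-8 sequence here and becomes U+FFFD.
garbled = payload.decode("utf-8", errors="replace")

# Decoding with the charset the payload was actually written in
# recovers the original text.
recovered = payload.decode("cp1254")

print(len(payload))
print(garbled)
print(recovered)
```

If that assumption holds, the original bytes are already lost by the time the event reaches the output, since each U+FFFD replaces a byte that can no longer be recovered.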

This does not seem to be a problem in the codec, because the same codec works correctly with a different input. I tried this pipeline:

input {
	file {
		path => "/tmp/cp1254_encoded.txt"
		mode => "read"
		sincedb_path => "/dev/null"
		file_completed_log_path => "/tmp/file_actions.log"
		file_completed_action => "log"
		codec => plain {
			charset => "CP1254"
		}
	}
}	

output {
	stdout {
		codec => json {charset => "UTF-8"}
	}
}

using the attached file cp1254_encoded.txt as input data, and the console output is what's expected (TÜRKÇE karakter test : ĞÜŞİÇÖışüğöç).

NB: to reproduce the text file, copy and paste the string above into a text editor and save it with CP1254 encoding.
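
As an alternative to a text editor, the file can also be generated with a short script. This is only a sketch: the helper name `write_cp1254_sample` is hypothetical, and the trailing newline is an assumption made so that a line-oriented reader sees one complete line.

```python
from pathlib import Path

# Same string used in the client script above.
MESSAGE = "TÜRKÇE karakter test : ĞÜŞİÇÖışüğöç"


def write_cp1254_sample(path: str) -> bytes:
    """Write MESSAGE plus a newline to `path`, encoded as CP1254."""
    # Assumption: one trailing newline terminates the single line.
    data = (MESSAGE + "\n").encode("cp1254")
    Path(path).write_bytes(data)
    return data
```

Calling `write_cp1254_sample("/tmp/cp1254_encoded.txt")` writes the file at the path the file input pipeline above reads from.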

andsel, Nov 30 '20 10:11

Hi guys, any progress on this issue?

GokcerBelgusen, Apr 05 '21 09:04

Hi @GokcerBelgusen, no news on this yet, but I'll keep it on my radar.

andsel, May 03 '21 09:05