Source Marketo: handle Chinese characters
Tell us about the problem you're trying to solve
The Marketo connector needs to handle Chinese character encoding. This was requested by a user in a Slack conversation. It looks like the underlying Singer tap-marketo doesn't handle this encoding.
The issue is with the Marketo API call:
https://
- Authorization: <Bearer token>
- User-Agent: Singer.io/tap-marketo
The response.encoding that requests derives from Marketo's response headers is ISO-8859-1. This causes Python's requests.models.Response.iter_content(decode_unicode=True) to use the wrong decoder, so UTF-8 encoded Chinese characters come out garbled.
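To illustrate the symptom, here is a minimal stdlib-only sketch. The CSV content and the `decode_body` helper are made up for the example (they are not part of tap-marketo), and it assumes the export body is really UTF-8 on the wire:

```python
# Hypothetical sample data: a tiny CSV export containing Chinese text.
ORIGINAL = "name,city\n张伟,北京\n"

# Assumption: the server sends UTF-8 bytes regardless of the declared charset.
PAYLOAD = ORIGINAL.encode("utf-8")


def decode_body(payload: bytes, encoding: str) -> str:
    """Decode a response body with the given codec (illustrative helper)."""
    return payload.decode(encoding)


# ISO-8859-1 maps every byte to a character, so this never raises.
# It silently produces one garbage character per byte instead.
garbled = decode_body(PAYLOAD, "iso-8859-1")

# Decoding with the codec the bytes were actually written in round-trips cleanly.
correct = decode_body(PAYLOAD, "utf-8")

print(repr(garbled))
print(repr(correct))
```

Because the ISO-8859-1 decode never fails, nothing is raised or logged; the bad rows would simply flow downstream, which may be why this only surfaces when a user notices the data.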
Possible fixes:
- Have the Marketo API return encoding=utf-8
- In tap-marketo.sync.stream_rows: override response.encoding from ISO-8859-1 to utf-8 before the body is decoded
tap-marketo.sync.py
```python
import csv
import tempfile

import singer

# CHUNK_SIZE_BYTES is assumed to be defined at module level in the tap.


def stream_rows(client, stream_type, export_id):
    with tempfile.NamedTemporaryFile(mode="w+", encoding="utf8", delete=False) as csv_file:
        singer.log_info("Download starting.")
        resp = client.stream_export(stream_type, export_id)
        # Force UTF-8 so iter_content does not decode with ISO-8859-1
        resp.encoding = 'utf-8'
        for chunk in resp.iter_content(chunk_size=CHUNK_SIZE_BYTES, decode_unicode=True):
            if chunk:
                # Replace CR
                chunk = chunk.replace('\r', '')
                csv_file.write(chunk)

        singer.log_info("Download completed. Begin streaming rows to file: " + csv_file.name)
        csv_file.seek(0)

        # Strip NUL characters, which csv.reader cannot handle
        reader = csv.reader((line.replace('\0', '') for line in csv_file), delimiter=',', quotechar='"')
        headers = next(reader)
        for line in reader:
            yield dict(zip(headers, line))
```
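For anyone wondering why setting the encoding before calling iter_content matters: as far as I can tell, requests decodes streamed chunks with an incremental decoder built from response.encoding. The sketch below imitates that with the standard library only. `decode_chunks` and the sample payload are hypothetical and are not code from requests or tap-marketo.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes], encoding: str) -> Iterator[str]:
    """Decode byte chunks incrementally, roughly like decode_unicode=True.

    Illustrative stand-in only; not the requests implementation.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush any bytes still buffered at the end of the stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Hypothetical export body, split into 3-byte chunks so that a multi-byte
# character straddles a chunk boundary.
payload = "id,name\r\n1,张伟\r\n".encode("utf-8")
chunks = [payload[i:i + 3] for i in range(0, len(payload), 3)]

# Same CR stripping as stream_rows, applied after decoding.
as_utf8 = "".join(decode_chunks(chunks, "utf-8")).replace("\r", "")
as_latin1 = "".join(decode_chunks(chunks, "iso-8859-1")).replace("\r", "")

print(repr(as_utf8))
print(repr(as_latin1))
```

The UTF-8 decoder buffers partial characters across chunk boundaries, so the chunk size does not affect the result. With ISO-8859-1 every byte is decoded on its own, so the Chinese characters are lost before the rows ever reach the CSV reader.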
https://github.com/singer-io/tap-marketo/issues/74
The fix has been checked in to the main tap_marketo.
Hey @YowanR, should we validate Singer-based connector issues against the CDK-based ones? If so, should I treat this issue as a bug and include it in the certification scope?
This one is out of scope for the certification process. @davydov-d We will look at this issue again if there are more requests for it.
Duplicate of https://github.com/airbytehq/airbyte/issues/20641