redis-js
Unicode converting from UTF-16 to UTF-32 then failing on JSON.parse
We're storing an object via `hset` that contains the following unicode character (an emoji): `\ud83e\udef6`. When we retrieve the value via `hgetall`, we get `\U0001faf6` back, which is the UTF-32 encoding of the original character. This causes the deserializer to fail.
Is there a reason these characters are being converted between encodings? And what would be the recommended workaround here?
Error we receive back from `hgetall`:
FetchError: invalid json response body at [URL] reason: Unexpected token U in JSON at position 330
We then fetched the response via the `/hgetall/` REST API to inspect it (`res.text()`), and it shows the unicode character was transformed to UTF-32.
Perhaps the solution here is to be able to define a custom serializer? I notice we can do that for deserializers.
Hmm, we're not doing anything other than calling `JSON.parse` when deserializing. I'm not sure what's causing this, but I'll try it myself.
Yeah, can you try it without deserialization:

```ts
new Redis({
  // ...
  automaticDeserialization: false,
});
```

That will give you the string response straight from redis. If that also converts the string into UTF-32, then we'll need to fix it in our redis server.
Here's a minimal repro case:

```ts
const key = "repro";

// Unicode goes in as \ud83e\udef6
await redis.hset(key, { key: { jsonKey: "Some text \ud83e\udef6" } });

try {
  // This throws: FetchError: invalid json response body
  const userInfo = await redis.hgetall(key);
} catch (e) {
  const res = await fetch(`${process.env.UPSTASH_REDIS_REST_URL}/hgetall/${key}`, {
    headers: {
      Authorization: `Bearer ${process.env.UPSTASH_REDIS_REST_TOKEN}`,
    },
  });
  console.error(await res.text());
  // Unicode returned as \U0001faf6:
  // {"result":["key","{\"jsonKey\":\"Some text \U0001faf6\"}"]}
}
```
Using `automaticDeserialization: false` results in the same!
Thanks, I will try to debug this asap, but I can't promise you anything until the end of next week.
Hey @cathykc, I did some more testing and I believe it is actually the `JSON.stringify` method that causes the issue. The easiest way around that is to escape the backslashes in your value.
```ts
import { Redis } from "@upstash/redis";
import "isomorphic-fetch";

const value = "Some text \\ud83e\\udef6";

async function main() {
  const redis = Redis.fromEnv();
  await redis.hset("upstash", { value });
  console.log(await redis.hgetall("upstash"));
}

main();
```
I tried some other ways to include a custom serializer but that didn't solve the problem unfortunately. So I hope this works for you.
I'll close this due to inactivity, please reopen if you have more questions
Hey! Sorry for not replying - we ended up solving this with `encodeURI`.

Setting: `encodeURI(JSON.stringify(VALUE_TO_STORE_IN_REDIS))`

Getting: `JSON.parse(decodeURI(VALUE_RETRIEVED_FROM_REDIS))` - this returns the original unicode characters.
Thanks for looking into this!
Running into the same issue!
Very easy to reproduce, just run:

```
➜ SET bugtest 🫠
OK
➜ GET bugtest
Bad escaped character in JSON at position 12
```
Running into the same issue as @cvle. On the backend I replaced `upstash/redis` with `ioredis` because setting `automaticDeserialization = false` did not help. But on the frontend the issue is the same, and there I can't use anything other than the Upstash REST API...

@chronark do you think there will be an option in the future for the REST API to return un-deserialized (raw) data from redis?
The problem is that the HTTP API speaks JSON, and thus we need to call `JSON.parse` on it, which automatically messes with the encoding.

To prevent this we have started to encode the response in base64, and you can opt out of automatic deserialization.
I think this might be on the server side and we will take another look into it
cc @mdogan