bootleg
Converting string that contains Japanese text breaks on Windows
I'm using bootleg both via babashka v1.2.174 and by running the jar file (Java 11.0.13) on Windows 10 Home Single Language.
Here is a reproducible example via babashka:
(ns user)
(require '[babashka.pods :as pods])
(pods/load-pod 'retrogradeorbit/bootleg "0.1.9")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])
(require '[pod.retrogradeorbit.hickory.select :as s])
(let [jp-html "<div>読</div>"]
  (spit "test-jp-str.txt" jp-html)
  (spit "test-jp-converted.txt"
        (utils/convert-to jp-html :hickory)))
I also tried running this with the jar via the java -jar command, redirecting stdout to converted.txt:
(let [jp-html "<div>読</div>"]
  (convert-to jp-html :hickory))
Both break the string 読, but with different results:
- via babashka, it turns into
{:type :element, :attrs nil, :tag :div, :content ["èª"]}
- via java -jar, it turns into
{:type :element, :attrs nil, :tag :div, :content ["��"]}
(I'm not sure whether these characters will be displayed correctly here.)
Note: I tried this on macOS and there is no conversion problem.
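The babashka result looks like a charset mismatch, though that is only a guess and not something confirmed in bootleg's code. 読 (U+8AAD) encodes to the UTF-8 bytes E8 AA AD, and reading those bytes as ISO-8859-1 (windows-1252 maps these three bytes the same way) gives "èª" followed by an invisible soft hyphen, which matches the output above. Here is a minimal sketch to check this on the affected machine. The helper names utf8-hex and misdecode are made up for illustration and are not part of bootleg.

```clojure
;; Diagnostic sketch only; utf8-hex and misdecode are hypothetical helpers.
;; The character is written as the escape \u8AAD so that the encoding used to
;; read this script cannot affect the test itself.

(defn utf8-hex
  "Returns the UTF-8 bytes of s as a vector of uppercase hex strings."
  [^String s]
  (mapv #(format "%02X" (bit-and % 0xFF)) (.getBytes s "UTF-8")))

(defn misdecode
  "Encodes s as UTF-8, then decodes those bytes as ISO-8859-1."
  [^String s]
  (String. ^bytes (.getBytes s "UTF-8") "ISO-8859-1"))

(println "Default charset:" (str (java.nio.charset.Charset/defaultCharset)))
(println "UTF-8 bytes:" (utf8-hex "\u8AAD"))
(println "Misdecoded:" (pr-str (misdecode "\u8AAD")))
```

If the default charset printed on Windows is not UTF-8, it may be worth trying the jar with java -Dfile.encoding=UTF-8 -jar to see whether the result changes.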