HookedBehemoth
So far I've only fleshed this out for UTF-16 to UTF-8. Do you think there is a branchless way to do this? ```cpp const __m128i escape_bs = _mm_set1_epi8('\\'); const __m128i escape_ap =...
I only have an AVX2 capable CPU so that's that. While the json spec wants all of " \ / \b \f \n \r \t, I'd really only need the...
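For illustration, here is a minimal SSE2 sketch of the detection step only, not the code from this thread. It assumes the bytes of interest are `"`, `\`, and control characters below 0x20; the function name `json_escape_mask16` is made up for the example. It builds the mask without branches, and the caller still has to decide what to do with a non-zero result.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: returns a 16-bit mask with bit i set when byte i of
// the 16-byte block at p is '"', '\\', or a control character (< 0x20).
// Assumes at least 16 readable bytes at p.
inline std::uint32_t json_escape_mask16(const char* p) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));

    const __m128i is_quote = _mm_cmpeq_epi8(v, _mm_set1_epi8('"'));
    const __m128i is_bs    = _mm_cmpeq_epi8(v, _mm_set1_epi8('\\'));

    // Unsigned "v <= 0x1F": min(v, 0x1F) equals v only for bytes 0x00..0x1F.
    const __m128i is_ctrl =
        _mm_cmpeq_epi8(_mm_min_epu8(v, _mm_set1_epi8(0x1F)), v);

    const __m128i any = _mm_or_si128(_mm_or_si128(is_quote, is_bs), is_ctrl);

    // One bit per byte, taken from each byte's top bit.
    return static_cast<std::uint32_t>(_mm_movemask_epi8(any));
}
```

A zero mask means the whole block can be copied as-is; otherwise the lowest set bit gives the offset of the first byte that needs escaping.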
Ah I see now. Doing it in an extra step wouldn't be desirable for me I think. I don't currently have any spare memory and checking after every X amount...
I went with the array approach rapidjson uses. https://github.com/Tencent/rapidjson/blob/06d58b9e848c650114556a23294d0b6440078c61/include/rapidjson/writer.h#L380-L388 Unsurprisingly, this is very slow, taking up ~61% of the execution time, according to AMDuProf. Firefox's SpiderMonkey & rapidjson use such...
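For readers unfamiliar with the table-driven approach, a rough sketch of the idea follows. It is not rapidjson's code: `make_escape_table` and `escape_json` are hypothetical names, and the sketch assumes byte-oriented input where only `"`, `\`, and control characters below 0x20 are escaped.

```cpp
#include <array>
#include <string>

// Hypothetical table: 0 means "copy the byte unchanged", 'u' means
// "emit \u00XX", any other value is the character written after '\'.
inline std::array<char, 256> make_escape_table() {
    std::array<char, 256> t{};
    for (int i = 0; i < 0x20; ++i) t[i] = 'u';  // control characters
    t['\b'] = 'b';
    t['\t'] = 't';
    t['\n'] = 'n';
    t['\f'] = 'f';
    t['\r'] = 'r';
    t['"']  = '"';
    t['\\'] = '\\';
    return t;
}

// Escapes one byte at a time with a single table lookup per byte.
inline std::string escape_json(const std::string& in) {
    static const std::array<char, 256> table = make_escape_table();
    static const char hex[] = "0123456789ABCDEF";

    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        const char e = table[c];
        if (e == 0) {
            out.push_back(static_cast<char>(c));
            continue;
        }
        out.push_back('\\');
        out.push_back(e);
        if (e == 'u') {  // finish the \u00XX sequence
            out += "00";
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0xF]);
        }
    }
    return out;
}
```

The per-byte lookup and branch is what makes this simple but slow on long runs of clean text, which is where a vectorized pre-check tends to help.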
I'd be interested to see this too. I'm unsure if this helps with my use case though.
This was already addressed in the original PR but that change was reverted. https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/1752/commits/5f12e7efd92ad802742f96788b4be3249ad02829 On *nix, multiprocessing prefers fork over spawn.
> When running the Deepdanbooru model, TensorFlow tries to initiate the same primary GPU that the WebUI is using, which causes the crash. I was able to resolve this bug...
To aid with size reduction it would also be nice to have a "smallest" size annotation. The assembler could then choose to use add/sub/jmp instructions with the smallest possible size...
The time it takes to store the bitmaps is mostly determined by how much memory I allocate. If you increase this variable, saving should be significantly faster. https://github.com/HookedBehemoth/bitmap-printer/blob/5a15742c1c2fd2c1aab43f4c7d4d629a0716b55b/source/main.cpp#L92 It should...
code sucks too much to fix this