Yilong Guo
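+1 to the comment above. If the input is unsorted, you can also use a hash map, trading space for time, for an average time complexity of O(n). ```cpp class Solution { public: vector<int> twoSum(vector<int>& nums, int target) { unordered_map<int, int> imap; for (int i = 0; i < nums.size(); ++i) { auto it = imap.find(target -...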
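The snippet above is cut off, so here is a minimal sketch of the hash-map approach it describes, not the original author's exact code. It assumes the usual two-sum contract (return the indices of two elements that add up to `target`); the name `seen` and the empty-vector fallback are illustrative choices.

```cpp
#include <unordered_map>
#include <vector>

class Solution {
public:
    // Returns the indices of two elements whose sum is `target`,
    // or an empty vector if no such pair exists (assumed fallback).
    std::vector<int> twoSum(const std::vector<int>& nums, int target) {
        std::unordered_map<int, int> seen;  // value -> index of that value
        for (int i = 0; i < static_cast<int>(nums.size()); ++i) {
            // Look for the complement among the elements visited so far.
            auto it = seen.find(target - nums[i]);
            if (it != seen.end()) {
                return {it->second, i};
            }
            // Record the current value only after the lookup, so an
            // element is never paired with itself.
            seen[nums[i]] = i;
        }
        return {};
    }
};
```

Each element is looked up and inserted once, so the average cost is O(n) time at the price of O(n) extra space, which is the trade-off the comment mentions.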
Intel A750 8G (IPEX backend): this improves the performance from 0.7it/s to 1.5it/s with no significant VRAM usage increase.
It is a known issue with the Arc driver that garbage images are generated at 1024x1024 resolution. Try 1080x1080 instead. BTW, if you generate 512x512 images of batch size 4...
> The images produced by the ARC770 at 832x832 using SAI's Base 1.0 SDXL model are "okay" although generating speed is 2.5sec/it (much slower than others are reporting) Invoke AI...
I see the ratio for Linux has been increased to 98%: https://github.com/intel/compute-runtime/commit/e8ac22c26508f1f32eba5f4057d3d9917bf352ff Is there a plan to increase it for Windows as well? The 80% ratio is really limiting AI workload...
Good news! I was able to override the default ratio with the following environment variables:

```
NEOReadDebugKeys=1
ClDeviceGlobalMemSizeAvailablePercent=100
```

https://github.com/intel/compute-runtime/blob/8ed2cb2bfe7d749a8f5958da83e431fed1af0564/shared/source/device/device.cpp#L597-L602
Fix was merged to OCL CPU driver. XFAIL can be removed on next OCL CPU driver uplift.
Got 2 fails with `test_bruteforce -f -r -w -1` on our device. Investigating: ``` fract... 57: fract fp64 ................Wimp pass { 0.00, 0.00} @ {0x0p+0, 0x0p+0} 58: fract fp16 ERROR:...
> Got 2 fails with `test_bruteforce -f -r -w -1` on our device. Investigating: > > ``` > fract... > 57: fract fp64 ................Wimp pass { 0.00, 0.00} @ {0x0p+0,...
Give https://github.com/Tencent/HunyuanDiT/pull/71 a try, with something like `python .\app\multiTurnT2I_app.py --device cuda:0 --enhance-device cuda:1 --load-4bit`