Da Li (李达)
Here is a mini-benchmark (I guess you forgot this?) https://github.com/numba/numba/pull/9520#issuecomment-2033243981
On second thought, perhaps this approach is too tricky to maintain? I won't mind if you don't want to merge this into the next release.
Hi, @guilhermeleobas. Sure, I should provide an example to show it. I will give one after I open the corresponding PR, and then we can compare the performance changes with and...
Any update on this feature? cc @gmarkall @spenczar. It sounds like it would be very useful if it were possible to use.
Hi, I tried to install everything and run the demo in https://github.com/spenczar/numba_stap_demo, but I ran into an import issue with `from bcc import USDT` in `find_stap_lib.py`. I searched PyPI for bcc-related...
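For reference, a minimal sanity check of the import that fails for me (assuming the intended form is `from bcc import USDT`; the `bcc` Python module is normally provided by the system BPF Compiler Collection packages, e.g. `python3-bpfcc`, rather than anything on PyPI):

```python
# Hypothetical quick check, not from the demo repo: verify the bcc bindings
# that find_stap_lib.py relies on are importable at all.
try:
    from bcc import USDT  # noqa: F401  # provided by the system bcc package
    print("bcc and USDT import fine")
except ImportError as exc:
    print("bcc is not importable:", exc)
```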
I would like to give this a try. After a first look, I found some useful comments on the `cuda.jit` decorator:
```python
:param debug: If True, check for exceptions thrown when executing...
```
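As a rough illustration of that option, a hedged sketch of a kernel compiled with `debug=True` (the kernel body, launch configuration, and the `opt=False` pairing are my own assumptions for the example, not taken from the PR):

```python
import numpy as np
from numba import cuda

# With debug=True, exceptions raised inside the kernel are checked when it
# executes; opt=False keeps the generated code closer to the source.
@cuda.jit(debug=True, opt=False)
def scale(out, x, factor):
    i = cuda.grid(1)
    if i < out.size:
        out[i] = x[i] * factor

x = np.arange(16, dtype=np.float32)
out = np.zeros_like(x)
scale[1, 32](out, x, np.float32(2.0))
```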
I also tested this case locally. If just using `@cuda.jit`, the test passes, so the issue comes with the `lineinfo=True` option. It also seems related to this complex branch structure,...
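To make the shape of the failing case concrete, a hedged sketch (this is not the actual test from the issue; the nested branches below are only an assumed stand-in for the "complex branch structure" mentioned above):

```python
import numpy as np
from numba import cuda

# Per the comment above, the same kernel is reported to compile fine with a
# bare @cuda.jit; the problem appears once lineinfo=True is added to a kernel
# with nested branching like this.
@cuda.jit(lineinfo=True)
def classify(out, x):
    i = cuda.grid(1)
    if i < x.size:
        if x[i] > 0:
            if x[i] > 10:
                out[i] = 2
            else:
                out[i] = 1
        elif x[i] < -10:
            out[i] = -2
        else:
            out[i] = 0

x = np.array([-20, -5, 0, 5, 20], dtype=np.int32)
out = np.zeros_like(x)
classify[1, 32](out, x)
```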
BTW, when I want to see the optimized NVVM IR, which envvar should I use? I tried:
```python
# os.environ["NUMBA_CUDA_DEBUGINFO"] = "1"
# os.environ["NUMBA_DEBUG_TYPEINFER"] = "1"
os.environ["NUMBA_DUMP_LLVM"] = "1"
# ...
```
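Separately, a hedged alternative to environment variables that I could fall back on: the compiled kernel's inspection helpers (assuming the CUDA dispatcher's `inspect_llvm()` / `inspect_asm()` behave as documented, they show the IR Numba hands to NVVM and the resulting PTX, not the post-optimization NVVM IR itself):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_one(out, x):
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] + 1

x = np.arange(8, dtype=np.float32)
out = np.zeros_like(x)
add_one[1, 32](out, x)

# LLVM IR as generated by the Numba frontend, keyed by compiled signature.
for sig, ir in add_one.inspect_llvm().items():
    print(sig, ir.splitlines()[0])

# PTX produced by NVVM from that IR.
for sig, ptx in add_one.inspect_asm().items():
    print(sig, ptx.splitlines()[0])
```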
So I took a quick look at https://github.com/numba/numba/blob/df07de114404225e64eea3c0622d3aee4a12e0c8/numba/cuda/codegen.py#L138-L150. I think `llvm_strs` should be the unoptimized LLVM IR from the Numba frontend? The CUDA codegen then converts it directly to PTX, so in...
Dropping this for now, since the main PR in numba-cuda is stalled.