Vineet Gupta
Per ISA spec: **FNMSUB.S** multiplies the values in rs1 and rs2, negates the product, **adds** the value in rs3, and writes the final result to rd. FNMSUB.S computes -(rs1×rs2)+rs3. **FNMADD.S**...
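To make the sign convention quoted above concrete, here is a minimal value-level sketch of -(rs1×rs2)+rs3 in C. The function name `fnmsub_s_model` is hypothetical. The product of two floats is exact in double precision, but the final result is rounded twice (to double, then to float), so this models the arithmetic rather than the instruction's single fused rounding.

```c
/* Value-level model of FNMSUB.S as quoted from the spec: -(rs1 * rs2) + rs3.
 * Assumption: intermediate math is done in double. The float*float product
 * is exact there, but the result is not guaranteed bit-identical to a
 * hardware fused operation in every rounding corner case. */
static float fnmsub_s_model(float rs1, float rs2, float rs3)
{
    double product = (double)rs1 * (double)rs2;  /* exact for float inputs */
    return (float)(-product + (double)rs3);      /* negate product, then add rs3 */
}
```

For example, `fnmsub_s_model(2.0f, 3.0f, 10.0f)` negates the product 6 and adds 10.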
Since we are on the topic of wider accesses generated by codegen, here's another issue that we have but got lost along the way. When built with -munaligned-access -O2, a...
We have some code of the form
```asm
ST a, [@global, 0]
```
gas barfs with "Error: inappropriate arguments for opcode 'st'". It can simply relax the ST instruction and...
An ARCv2 binary was incorrectly flagged as having hw float instructions despite the soft-float build. Turns out that objdump can incorrectly disassemble random fragments of jump tables - embedded inline...
Busybox free prints stray characters ``` Linux version 5.6.0-00223-gf03b92a6f9a7 (vineetg@vineetg-Latitude-7400) (gcc version 10.2.0 (Buildroot 2021.02-6-g5e29ba7bf732)) #1 PREEMPT Tue Apr 20 11:54:40 PDT 2021 Memory @ 80000000 [1024M] Memory @ 100000000...
On ARC, ST encodings of the form ST a, [b, off] only take an s9 offset. The workaround is to change this into ST.as (if the offset is a multiple of 4). A bunch...
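The offset check described above can be sketched as a small classifier. This is an illustration only: the enum and function names are hypothetical, and it assumes that ST.as on a word-sized store scales the encoded s9 offset by 4, which matches the "multiple of 4" condition in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for how a word store at [b, off] could be emitted. */
enum st_form {
    ST_PLAIN,       /* ST a, [b, off] with off in s9 range */
    ST_SCALED,      /* ST.as a, [b, off/4], assuming .as scales by 4 */
    ST_NEEDS_FIXUP  /* neither form can encode the offset directly */
};

/* s9 is a signed 9-bit immediate: -256 .. 255. */
static bool fits_s9(int32_t v)
{
    return v >= -256 && v <= 255;
}

static enum st_form pick_st_form(int32_t off)
{
    if (fits_s9(off))
        return ST_PLAIN;
    if (off % 4 == 0 && fits_s9(off / 4))
        return ST_SCALED;
    return ST_NEEDS_FIXUP;
}
```

Under these assumptions an offset of 256 no longer fits the plain form but can use the scaled form, while 258 fits neither.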
Currently mdb doesn't show the call stack for the ARC64 kernel because .eh_frame is missing from "C" code despite -fasynchronous-unwind-tables. FWIW, asm code with hand-edited .cfi_* seems to work. There's also...
LMBench memory bandwidth tests frd() and fwr() access consecutive 512 bytes to compute memory subsystem bandwidth. ``` void fwr(iter_t iterations, void *cookie) buf; p[0]= p[1]= p[2]= p[3]= p[4]= p[5]= p[6]=...
We currently copyout/copyin each register, which generates horrible code. Instead, use a tmp on-stack structure, populate it with pt_regs, and then do a single copyout/copyin (see the xtensa port, for example).
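A rough userland sketch of the "populate a temporary, then copy once" pattern follows. Every name and struct layout here is hypothetical, and `copyout_sketch` is a `memcpy` stand-in for the kernel's user-copy primitive, not the real API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical kernel-side and user-visible register layouts. */
struct pt_regs_sketch   { unsigned long r[4]; unsigned long sp, pc; };
struct user_regs_sketch { unsigned long r[4]; unsigned long sp, pc; };

/* Stand-in for a copy-to-user primitive: returns the number of bytes
 * NOT copied (always 0 in this userland model). */
static unsigned long copyout_sketch(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Fill an on-stack temporary from pt_regs, then do one bulk copy
 * instead of one user access per register. */
static int save_regs_sketch(void *ubuf, const struct pt_regs_sketch *regs)
{
    struct user_regs_sketch tmp;
    size_t i;

    for (i = 0; i < 4; i++)
        tmp.r[i] = regs->r[i];
    tmp.sp = regs->sp;
    tmp.pc = regs->pc;

    return copyout_sketch(ubuf, &tmp, sizeof(tmp)) ? -1 : 0;
}
```

The restore direction would mirror this: one bulk copy into a temporary, then field-by-field assignment into pt_regs.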
strace is not able to decipher some syscalls. ``` # strace -f ./lmbench/bin/arc64/bin/memsize 16 ... [pid 69] syscall_0x104(0xffffffffffffffff, 0x5ffffbb4, 0, 0, 0x201df0d8, 0x201df008 ... syscall_0x71(0, 0x5ffffb18, 0, 0x8, 0x40, 0x6)...