llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

clang16 SelectionDAG caused bpf program stack exceeding limit error

Open yonghong-song opened this issue 3 years ago • 4 comments

Hi, we found a regression with some bpf code with patch https://reviews.llvm.org/D77804.

commit 69d5a038b90d3616a5c22d4e0e682aad4679758d
Author: Simon Pilgrim <[email protected]>
Date:   Thu Jul 28 14:10:44 2022 +0100

    [DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
    
    This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the sh
ifted value if we don't demand all the bits/elts.
    
    This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.
    
    There a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.
    
    Differential Revision: https://reviews.llvm.org/D77804

The following shows the problem:

[$ ~/tmp] cat run.sh
/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang -target bpf -O2 -g -c t.c
[$ ~/tmp] cat t.c
typedef unsigned char u8;
struct event {
  u8 tag;
  u8 hostname[84];
};

void *g;
void bar(void *);

int foo() {
  struct event event = {};

  event.tag = 1;
  __builtin_memcpy(&event.hostname, g, 84);
  bar(&event);
  return 0;
}

[$ ~/tmp] ./run.sh
t.c:14:3: error: Looks like the BPF stack limit of 512 bytes is exceeded. Please move large on stack variables into BPF per-cpu array map.

  __builtin_memcpy(&event.hostname, g, 84);
  ^
t.c:14:3: error: Looks like the BPF stack limit of 512 bytes is exceeded. Please move large on stack variables into BPF per-cpu array map.

2 errors generated.
[$ ~/tmp]

The BPF program enforces the stack size <= 512 bytes. For the above program, with this patch, the code after dag insn selection is worse and eventually in register allocation stage, the stack size is more than 512 and caused the above issue.

To illustrate the problem in more details, without this patch, the lowered machine code looks like

  STB killed %7:gpr, %stack.1.event.i, 0, debug-location !21355 :: (store (s8) into %ir.event.i, align 8, !tbaa !21356); tracecon/src/bpf/tracecon.bpf.c:78:12 @[ tracecon/src/b
pf/tracecon.bpf.c:68:5 ]
  %8:gpr = LDB %6:gpr, 7, debug-location !21358 :: (load (s8) from %ir.call1.i + 7); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %8:gpr, %stack.1.event.i, 12, debug-location !21358 :: (store (s8) into %ir.hostname.i + 7); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %9:gpr = LDB %6:gpr, 6, debug-location !21358 :: (load (s8) from %ir.call1.i + 6); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %9:gpr, %stack.1.event.i, 11, debug-location !21358 :: (store (s8) into %ir.hostname.i + 6); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %10:gpr = LDB %6:gpr, 5, debug-location !21358 :: (load (s8) from %ir.call1.i + 5); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %10:gpr, %stack.1.event.i, 10, debug-location !21358 :: (store (s8) into %ir.hostname.i + 5); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf
.c:68:5 ]
  %11:gpr = LDB %6:gpr, 4, debug-location !21358 :: (load (s8) from %ir.call1.i + 4); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %11:gpr, %stack.1.event.i, 9, debug-location !21358 :: (store (s8) into %ir.hostname.i + 4); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %12:gpr = LDB %6:gpr, 3, debug-location !21358 :: (load (s8) from %ir.call1.i + 3); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %12:gpr, %stack.1.event.i, 8, debug-location !21358 :: (store (s8) into %ir.hostname.i + 3); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %13:gpr = LDB %6:gpr, 2, debug-location !21358 :: (load (s8) from %ir.call1.i + 2); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %13:gpr, %stack.1.event.i, 7, debug-location !21358 :: (store (s8) into %ir.hostname.i + 2); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %14:gpr = LDB %6:gpr, 1, debug-location !21358 :: (load (s8) from %ir.call1.i + 1); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %14:gpr, %stack.1.event.i, 6, debug-location !21358 :: (store (s8) into %ir.hostname.i + 1); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.
c:68:5 ]
  %15:gpr = LDB %6:gpr, 0, debug-location !21358 :: (load (s8) from %ir.call1.i); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %15:gpr, %stack.1.event.i, 5, debug-location !21358 :: (store (s8) into %ir.hostname.i); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68
:5 ]
  %16:gpr = LDB %6:gpr, 15, debug-location !21358 :: (load (s8) from %ir.call1.i + 15); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %16:gpr, %stack.1.event.i, 20, debug-location !21358 :: (store (s8) into %ir.hostname.i + 15); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bp
f.c:68:5 ]
  %17:gpr = LDB %6:gpr, 14, debug-location !21358 :: (load (s8) from %ir.call1.i + 14); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %17:gpr, %stack.1.event.i, 19, debug-location !21358 :: (store (s8) into %ir.hostname.i + 14); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bp
f.c:68:5 ]
...
  %88:gpr = LDB %6:gpr, 83, debug-location !21358 :: (load (s8) from %ir.call1.i + 83); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %88:gpr, %stack.1.event.i, 88, debug-location !21358 :: (store (s8) into %ir.hostname.i + 83); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %89:gpr = LDB %6:gpr, 82, debug-location !21358 :: (load (s8) from %ir.call1.i + 82); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %89:gpr, %stack.1.event.i, 87, debug-location !21358 :: (store (s8) into %ir.hostname.i + 82); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %90:gpr = LDB %6:gpr, 81, debug-location !21358 :: (load (s8) from %ir.call1.i + 81); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %90:gpr, %stack.1.event.i, 86, debug-location !21358 :: (store (s8) into %ir.hostname.i + 81); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %91:gpr = LDB %6:gpr, 80, debug-location !21358 :: (load (s8) from %ir.call1.i + 80); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB killed %91:gpr, %stack.1.event.i, 85, debug-location !21358 :: (store (s8) into %ir.hostname.i + 80); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]

Pretty straightforward byte load and stores and the corresponding stack after register allocation,

# *** IR Dump After Greedy Register Allocator (greedy) ***:
# Machine code for function tcp_v4_connect_exit: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues
Frame Objects:
  fi#0: size=4, align=4, at location [SP]
  fi#1: size=89, align=8, at location [SP]
  fi#2: size=4, align=4, at location [SP]
Function Live Ins: $r1 in %0

But this patch, the code becomes very complex,

  %8:gpr = LDB %6:gpr, 71, debug-location !21358 :: (load (s8) from %ir.call1.i + 71); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %9:gpr = SLL_ri %8:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %10:gpr = LDB %6:gpr, 70, debug-location !21358 :: (load (s8) from %ir.call1.i + 70); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %11:gpr = OR_rr %9:gpr(tied-def 0), killed %10:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %12:gpr = LDB %6:gpr, 15, debug-location !21358 :: (load (s8) from %ir.call1.i + 15); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %13:gpr = SLL_ri %12:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %14:gpr = LDB %6:gpr, 14, debug-location !21358 :: (load (s8) from %ir.call1.i + 14); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %15:gpr = OR_rr %13:gpr(tied-def 0), killed %14:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
...
  %71:gpr = OR_rr %69:gpr(tied-def 0), killed %70:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %72:gpr = LDB %6:gpr, 77, debug-location !21358 :: (load (s8) from %ir.call1.i + 77); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %73:gpr = SLL_ri %72:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %74:gpr = LDB %6:gpr, 76, debug-location !21358 :: (load (s8) from %ir.call1.i + 76); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %75:gpr = OR_rr %73:gpr(tied-def 0), killed %74:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %76:gpr = LDB %6:gpr, 79, debug-location !21358 :: (load (s8) from %ir.call1.i + 79); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %77:gpr = SLL_ri %76:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %78:gpr = LDB %6:gpr, 78, debug-location !21358 :: (load (s8) from %ir.call1.i + 78); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %79:gpr = OR_rr %77:gpr(tied-def 0), killed %78:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %80:gpr = SLL_ri %27:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %81:gpr = OR_rr %80:gpr(tied-def 0), killed %23:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %82:gpr = SLL_ri %19:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %83:gpr = OR_rr %82:gpr(tied-def 0), killed %71:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %84:gpr = SLL_ri %15:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %85:gpr = OR_rr %84:gpr(tied-def 0), killed %67:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %86:gpr = SLL_ri %11:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %87:gpr = OR_rr %86:gpr(tied-def 0), killed %63:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %88:gpr = SLL_ri %43:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
...
  %100:gpr = LDB %6:gpr, 74, debug-location !21358 :: (load (s8) from %ir.call1.i + 74); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %101:gpr = LDB %6:gpr, 75, debug-location !21358 :: (load (s8) from %ir.call1.i + 75); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %102:gpr = SLL_ri %101:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %103:gpr = OR_rr %102:gpr(tied-def 0), %100:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %104:gpr = SLL_ri %103:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %105:gpr = OR_rr %104:gpr(tied-def 0), killed %99:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %106:gpr = SLL_ri %79:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %107:gpr = OR_rr %106:gpr(tied-def 0), killed %75:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %108:gpr = LDB %6:gpr, 64, debug-location !21358 :: (load (s8) from %ir.call1.i + 64); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %109:gpr = LDB %6:gpr, 65, debug-location !21358 :: (load (s8) from %ir.call1.i + 65); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %110:gpr = SLL_ri %109:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %111:gpr = OR_rr %110:gpr(tied-def 0), %108:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %112:gpr = LDB %6:gpr, 66, debug-location !21358 :: (load (s8) from %ir.call1.i + 66); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %113:gpr = LDB %6:gpr, 67, debug-location !21358 :: (load (s8) from %ir.call1.i + 67); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %114:gpr = SLL_ri %113:gpr(tied-def 0), 8, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %115:gpr = OR_rr %114:gpr(tied-def 0), %112:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %116:gpr = SLL_ri %115:gpr(tied-def 0), 16, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %117:gpr = OR_rr %116:gpr(tied-def 0), killed %111:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
...
  %225:gpr = OR_rr %224:gpr(tied-def 0), killed %117:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %226:gpr = SLL_ri %107:gpr(tied-def 0), 32, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  %227:gpr = OR_rr %226:gpr(tied-def 0), killed %105:gpr, debug-location !21358; tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB %193:gpr, %stack.1.event.i, 8, debug-location !21358 :: (store (s8) into %ir.hostname.i + 3); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5
 ]
  STB %192:gpr, %stack.1.event.i, 7, debug-location !21358 :: (store (s8) into %ir.hostname.i + 2); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB %189:gpr, %stack.1.event.i, 6, debug-location !21358 :: (store (s8) into %ir.hostname.i + 1); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB %188:gpr, %stack.1.event.i, 5, debug-location !21358 :: (store (s8) into %ir.hostname.i); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68:5 ]
  STB %183:gpr, %stack.1.event.i, 16, debug-location !21358 :: (store (s8) into %ir.hostname.i + 11); tracecon/src/bpf/tracecon.bpf.c:79:2 @[ tracecon/src/bpf/tracecon.bpf.c:68
:5 ]
...

And the code becomes very complex and inefficient and this caused later larger stack size,

# *** IR Dump After Greedy Register Allocator (greedy) ***:
# Machine code for function tcp_v4_connect_exit: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues
Frame Objects:
  fi#0: size=4, align=4, at location [SP]
  fi#1: size=89, align=8, at location [SP]
  fi#2: size=4, align=4, at location [SP]
  fi#3: size=8, align=8, at location [SP]
...
  fi#57: size=8, align=8, at location [SP]
  fi#58: size=8, align=8, at location [SP]
  fi#59: size=8, align=8, at location [SP]
  fi#60: size=8, align=8, at location [SP]
Function Live Ins: $r1 in %0

Could you help take a look at this problem and suggest how to fix it?

yonghong-song avatar Sep 21 '22 14:09 yonghong-song

Reduced IR:

%struct.event = type { i8, [84 x i8] }

define void @foo(ptr %g) {
entry:
  %event = alloca %struct.event, align 1
  %hostname = getelementptr inbounds %struct.event, ptr %event, i64 0, i32 1
  %0 = load ptr, ptr %g, align 8
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(84) %hostname, ptr noundef nonnull align 1 dereferenceable(84) %0, i64 84, i1 false)
  call void @bar(ptr noundef nonnull %event)
  ret void
}

declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #2
declare void @bar(ptr noundef)

RKSimon avatar Sep 21 '22 16:09 RKSimon

I know almost nothing about BPF, but this problem does go away if BPF overrides allowsMisalignedMemoryAccesses to allow fast unaligned load/stores.

So - does BPF support unaligned loads?

RKSimon avatar Sep 24 '22 17:09 RKSimon

@RKSimon BPF does not support unaligned loads.

yonghong-song avatar Sep 26 '22 14:09 yonghong-song

Candidate Patch: https://reviews.llvm.org/D136042

RKSimon avatar Oct 17 '22 09:10 RKSimon

@llvm/issue-subscribers-backend-amdgpu

llvmbot avatar Oct 29 '22 14:10 llvmbot