llvm
llvm copied to clipboard
cloning functions
Is there a simple way how to clone functions?
I'd like to duplicate a function like this:
func duplicate(m *ir.Module, foo string) {
for _, f := range m.Funcs {
if f.GlobalIdent.GlobalName == foo {
DuplicateFunc(m, f, foo + "_v2")
}
}
}
And afterwards I'd do some minor changes in the function code.
Hi @db7,
The way to clone functions would be to do a deep copy of the *ir.Func structure. There should be several Go packages which provide deep copy (e.g. mohae/deepcopy), so use the one you prefer.
That's at least in theory. In practice, when I tried this out, there were quite a few issues. I'll post the source code I've been using to test this so you can get more insight into the various approaches tried. In the end, we probably need to do this manually for the ir structures to resolve issues with cyclic references, etc.
Given the LLVM IR assembly file foo.ll:
define i32 @f(i32 %x, i32 %y) {
%result = add i32 %x, %y
ret i32 %result
}
The following source code may be used to clone the function @f to a new function @g, and made independent modifications to @g. The code below is a proof of concept to get a feel for what approach may be taken.
// Note, the proper solution would be to do something like
// https://godoc.org/github.com/go-toolsmith/astcopy but for llir/llvm/ir, e.g.
// ircopy.
package main
import (
//"bytes"
//"encoding/json" // unmarshal error: `json: cannot unmarshal object into Go struct field FuncType.Sig.RetType of type types.Type`
"fmt"
"log"
"github.com/llir/llvm/asm"
"github.com/llir/llvm/ir"
//"github.com/mohae/deepcopy" // this library didn't work as we get an infinite recursion during deepcopy.Copy; `goroutine stack exceeds 1000000000-byte limit`
//"github.com/ybriffa/deepcopy" // this library didn't work as it cannot handle enums; `value of type uint8 cannot be converted to type ir.FuncAttribute`
//"github.com/rogpeppe/rog-go/exp/deepcopy" // cannot use this library, build error; deepcopy.go:218:25: undefined: unsafe.Unreflect
//"github.com/jinzhu/copier" // seems to work (but the copy is shallow). however, not deep copy, so changes to instructions are mirrored to both functions.
//"github.com/mitchellh/copystructure" // this library didn't work as we get an infinite recursion during copystructure.Copy; `goroutine stack exceeds 1000000000-byte limit`
//"github.com/getlantern/deepcopy" // unmarshal error: `json: cannot unmarshal object into Go struct field FuncType.Sig.RetType of type types.Type`
)
func main() {
// parse LLVM IR module.
m, err := asm.ParseFile("foo.ll")
if err != nil {
log.Fatalf("%+v", err)
}
// locate function "@f".
const funcName = "f"
f, ok := findFunc(m, funcName)
if !ok {
log.Fatalf("unable to locate function %q", funcName)
}
// create clone of function "@f" with new function name "@g".
g, err := dupFunc(f, "g")
if err != nil {
log.Fatalf("unable to clone function %q; %v", f.Name(), err)
}
modifyFunc(g)
// append new function "@g" to LLVM IR module.
m.Funcs = append(m.Funcs, g)
// print LLVM IR module to standard output.
fmt.Println(m)
}
// modifyFunc modified g by setting the name of each basic block and swapping
// the order of operands to add instructions.
func modifyFunc(g *ir.Func) *ir.Func {
for i, block := range g.Blocks {
blockName := fmt.Sprintf("block_%d", i)
block.SetName(blockName)
for _, inst := range block.Insts {
switch inst := inst.(type) {
case *ir.InstAdd:
// swap X and Y.
inst.X, inst.Y = inst.Y, inst.X
}
}
}
return g
}
// findFunc tries to locate the function with the given name in the LLVM IR
// module. The boolean return value indicates success.
func findFunc(m *ir.Module, funcName string) (*ir.Func, bool) {
for _, f := range m.Funcs {
if f.Name() == funcName {
return f, true
}
}
return nil, false
}
// dupFunc returns a clone of the given function with a new function name.
func dupFunc(origFunc *ir.Func, newFuncName string) (*ir.Func, error) {
// Attempt using deepcopy.
//
//newFunc := deepcopy.Copy(origFunc).(*ir.Func)
// Attempt using copier.
//
//newFunc := new(ir.Func)
//if err := copier.Copy(newFunc, origFunc); err != nil {
// return nil, err
//}
// Attempt using copystructure.
//v, err := copystructure.Copy(origFunc)
//if err != nil {
// return nil, err
//}
//newFunc := v.(*ir.Func)
// Use json marshal/unmarshal as workaround for deep copy.
// This should work but may be slow.
//newFunc := new(ir.Func)
//if err := deepcopyWorkaround(newFunc, origFunc); err != nil {
// return nil, err
//}
// Attempt using getlantern/deepcopy.
//newFunc := new(ir.Func)
//if err := deepcopy.Copy(newFunc, origFunc); err != nil {
// return nil, err
//}
newFunc := cloneFunc(origFunc)
newFunc.SetName(newFuncName)
return newFunc, nil
}
func cloneFunc(origFunc *ir.Func) *ir.Func {
newFunc := new(ir.Func)
*newFunc = *origFunc
newFunc.Blocks = make([]*ir.Block, len(origFunc.Blocks))
for i, origBlock := range origFunc.Blocks {
newBlock := cloneBlock(origBlock)
newFunc.Blocks[i] = newBlock
}
return newFunc
}
func cloneBlock(origBlock *ir.Block) *ir.Block {
newBlock := new(ir.Block)
*newBlock = *origBlock
newBlock.Insts = make([]ir.Instruction, len(origBlock.Insts))
for j, origInst := range origBlock.Insts {
newInst := cloneInst(origInst)
newBlock.Insts[j] = newInst
}
newTerm := cloneTerm(origBlock.Term)
newBlock.Term = newTerm
return newBlock
}
func cloneInst(origInst ir.Instruction) ir.Instruction {
var newInst ir.Instruction
switch origInst := origInst.(type) {
case *ir.InstAdd:
v := new(ir.InstAdd)
*v = *origInst
newInst = v
default:
panic(fmt.Errorf("support for instruction %T not yet implemented", origInst))
}
return newInst
}
func cloneTerm(origTerm ir.Terminator) ir.Terminator {
var newTerm ir.Terminator
switch origTerm := origTerm.(type) {
case *ir.TermRet:
v := new(ir.TermRet)
*v = *origTerm
newTerm = v
default:
panic(fmt.Errorf("support for terminator %T not yet implemented", origTerm))
}
return newTerm
}
/*
func deepcopyWorkaround(dst, src interface{}) error {
buf := &bytes.Buffer{}
enc := json.NewEncoder(buf)
if err := enc.Encode(src); err != nil {
return err
}
dec := json.NewDecoder(buf)
if err := dec.Decode(dst); err != nil {
return err
}
return nil
}
*/
@mewmew, thanks for the detailed reply. Indeed, I got issues with infinite recursions with all approaches you have tried (including json encoding).
I've noticed you marked this issue with "util". Do you intend to implement something on this direction as part of this project?
@mewmew, thanks for the detailed reply. Indeed, I got issues with infinite recursions with all approaches you have tried (including json encoding).
I know, the self-referential values are problematic. We need to do some brain storming and figure out how to implement this in a way that can handle such recursions.
I've noticed you marked this issue with "util". Do you intend to implement something on this direction as part of this project?
Indeed, the intention is to implement it as part of the irutil repository. Having the ability to clone and arbitrarily modify functions is functionality that will be of common use and thus belongs in irutil.
However, I still don't know of a good way to implement this, so will leave this issue open to invite others to contribute to the discussion. Once we've determined how to implement this, I'd be more than happy to put a solid implementation in irutil.
So for now, lets continue to brain storm and try different approaches. Let me know if you find or think of something that could work!
Looks into the LLVM implementation: https://llvm.org/doxygen/IndirectionUtils_8cpp_source.html#l00286, it basically creates new instances from old values one by one. I also have no idea how to clone automatically correctly.
I haven't tested it yet but the deep-copy project was recently announced, which generates deep copy functions for a given type. With some luck it could suit our needs.
Do we have any updates here?
Do we have any updates here?
Not yet. Deep copy is difficult to handle correctly. And writing it manually will be difficult to update whenever we update the llir/llvm/ir struct representation. Therefore, we would like to find an automated approach. Either by code generation (after parsing the type definitions of llir/llvm/ir) or by using reflection (e.g. as is done in the deep-copy library).
More experimentation is needed to figure out what approach works best, both for future maintenance, and for other aspects such as preventing infinite cyclic recursion and also in part performance.
https://github.com/jinzhu/copier have a try?
https://github.com/jinzhu/copier have a try?
@db7, have you tried using the copier package for deep copy? I think it should be possible to adapt the code example outlined in https://github.com/llir/llvm/issues/115#issuecomment-563259036 to verify if this library works or not.
@dannypsnl, if you wish, you can also try to adapt the code example of https://github.com/llir/llvm/issues/115#issuecomment-563259036 to see if the copier library works. Most deep copy libraries we've tried got stuck with infinite recursion problems.
I'm out hiking, and had a stop at a local cafe :)
Wish you both a good summer vacation!
Cheers, Robin