cmd/link: rearrange compiler/linker generated data for clarity

Open ianlancetaylor opened this issue 2 months ago • 30 comments

A Go executable contains information generated by the compiler and linker that is not program code or data. This data currently appears in various different locations in the executable. I think we should try to clarify it to make it easier for users to understand.

This information I am discussing is at least the following:

  • buildinfo, which includes module dependencies and build settings. In ELF this is found in the .go.buildinfo section. It can be found without a section table by searching for a magic string, which is "\xff Go buildinf:". It contains a 32-byte header followed by a pair of strings. It's written out by (*Link).buildinfo in cmd/link/internal/ld/data.go. It's parsed by readRawBuildInfo in debug/buildinfo/buildinfo.go.
  • The go:buildinfo.ref symbol is a pointer-sized symbol that refers to the buildinfo data, to keep the C linker from removing the .go.buildinfo section. This symbol appears in the .rodata section. It's written out by (*Link).buildinfo in cmd/link/internal/ld/data.go.
  • moduledata, which has pointers to most of the other generated information. In ELF this is found at the symbol runtime.firstmoduledata (for an executable). There is no way to find it if the symbol table is missing. The format is simply the struct runtime.moduledata. It appears in the .noptrdata section. It is written out by (*Link).symtab in cmd/link/internal/ld/symtab.go.
  • pcheader, which points to more of the generated information. The moduledata points to the pcheader. The format is the struct runtime.pcHeader. In ELF it appears at the start of the .gopclntab section. It starts with a four byte magic number 0xfffffff1. It is written out by (*pclntab).generatePCHeader in cmd/link/internal/ld/pcln.go.
  • The function offset table, which has one entry for each function and method in the executable, plus one more entry that records the end of the last function. This can be found via a slice in moduledata. The format is runtime.functab, which is just a pair of uint32 offsets. The number of functions can also be found in the PC header, although as noted the slice has one more entry. This appears in the .gopclntab section. It is written out by writePCToFunc in cmd/link/internal/ld/pcln.go.
  • The PC to function lookup table. This divides the address space devoted to functions into buckets, where each bucket covers 4096 bytes. See runtime.findfuncbucket for details. It can be found via a pointer in the moduledata. The size doesn't seem to be recorded, it must be computed based on minpc and maxpc in moduledata (or the addresses in the function offset table). It appears in the .rodata section. It is written out by (*pcln).findfunctab.
  • The function table. This records information for each function. It can be found via the function offset table, or by the pclntable field in moduledata or the pclnOffset field in pcheader. The format starts with runtime._func, and is followed by variable length arrays containing pcdata and funcdata offsets. This appears in the .gopclntab section. It is written out by writeFuncs in cmd/link/internal/ld/pcln.go.
  • The function name table. This is simply the name of each function as a series of NUL terminated strings. The function table has an offset into this table for each function. This can also be found as the funcnametab field of moduledata, and the funcnameOffset field of pcheader. It appears in the .gopclntab section. It is written out by (*pclntab).generateFuncnametab.
  • The compilation unit table, which maps file numbers for a compilation unit to offsets into the file name table. See runtime.funcfile for how this is used. Basically it lets each function store small numbers when mapping from PC to file name. This can be found via the cutab field of moduledata or the cuOffset field of pcheader. The format is just a slice of uint32 values. It appears in the .gopclntab section. It is written out by (*pclntab).generateFilenameTabs.
  • The file name table, which contains the actual file names, as a series of NUL terminated strings. Entries are found via the compilation unit table. This can also be found via the filetab field of moduledata or the filetabOffset field of pcheader. It appears in the .gopclntab section. It is written out by (*pclntab).generateFilenameTabs.
  • The pcdata information, also known as pctab. The function table pcdata offsets point here, as does the function table pcsp field. The table can also be found via the pctab field of moduledata or the pctabOffset field of pcheader. The format of this table is complex and is described at http://go.dev/s/go12symtab in the PC-Value Table Encoding appendix. It appears in the .gopclntab section. The data is created by the compiler.
  • The funcdata information. The function table funcdata offsets point here, and it can also be found via the gofunc field of moduledata. The format of this table varies depending on the exact funcdata information being recorded. It appears in the .rodata section (not the .gopclntab section). There doesn't seem to be a way to know the complete size of the data. The data is created by the compiler, though the linker seems to compute some of the inline symbol information.
  • Type descriptors used for the reflect package. Besides direct references from code, this can be found via the typelinks field of moduledata, which indexes into the types (and etypes) pointers in moduledata. These appear in the .rodata section.
  • The type links used by the reflect package to find type descriptors. This is a sequence of int32 offsets into the types section, sorted by type string. This appears in the .typelink section. It is written by (*Link).typelink.
  • Type descriptors have sizes that depend on their kind. Many types also have a GC bitmask. The size of each GC bitmask depends on the type. These appear in the .rodata section.
  • Larger types will compute the GC bitmask at runtime as needed. For these types, the program will contain a pointer that is filled out as needed (by runtime.getGCMaskOnDemand). These pointers appear in the .noptrdata section (though they could be .noptrbss; there is a TODO in dgcptrmaskOnDemand in cmd/compile/internal/reflectdata/reflect.go).
  • The data and BSS sections have their own GC information. These can be found via the gcdata and gcbss fields of moduledata. They appear in the .rodata section. These are GC programs, not bitmasks; they are expanded into bitmasks by runtime.progToPointerMask.
  • Interface tables computed by the compiler. These can be found via the itablink field of moduledata, which is a slice of pointers to runtime.itab structs. itablinks appears in the .itablink section. The itabs themselves appear in the .rodata section. The itabs are variable length, as they have a list of pointers to methods.
  • FIPS checking information used by crypto/internal/fips140/check to verify the checksum of FIPS sections when in FIPS mode. This can be found at the go:fipsinfo symbol. It appears in the .go.fipsinfo section. Currently it is 120 bytes.
  • Function descriptors are created whenever a top-level function or method expression is converted to a function type. A value of function type is a pointer to memory such that the first word of that memory is the address of the function code. Remaining words are closure or method pointers, but a top-level function or method expression doesn't have any of those. So a function descriptor is just a word of memory pointing to the function code. These appear in the .rodata section. They are created by WriteFuncSyms in cmd/compile/staticdata/data.go.
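
As a rough illustration of the two magic values mentioned in the list above, the following Go sketch searches a byte slice for the buildinfo magic string and checks whether a buffer starts with the pcHeader magic number. The names findBuildInfo, hasPCHeaderMagic, and targetOrder are hypothetical, and the little-endian byte order is an assumption about the target. The sketch does not parse the rest of either header.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// buildInfoMagic is the buildinfo magic string quoted in the list above.
var buildInfoMagic = []byte("\xff Go buildinf:")

// pcHeaderMagic is the four-byte pcHeader magic number quoted in the list above.
const pcHeaderMagic = 0xfffffff1

// targetOrder is an assumption: the byte order of the target architecture.
var targetOrder binary.ByteOrder = binary.LittleEndian

// findBuildInfo returns the offset of the first occurrence of the buildinfo
// magic in data, or -1 if it is absent. It does not validate the header.
func findBuildInfo(data []byte) int {
	return bytes.Index(data, buildInfoMagic)
}

// hasPCHeaderMagic reports whether pclntab begins with the pcHeader magic
// when read in the given byte order.
func hasPCHeaderMagic(pclntab []byte, order binary.ByteOrder) bool {
	if len(pclntab) < 4 {
		return false
	}
	return order.Uint32(pclntab) == pcHeaderMagic
}

func main() {
	// Synthetic image: 16 bytes of padding, then a fake buildinfo magic.
	image := append(make([]byte, 16), buildInfoMagic...)
	fmt.Println("buildinfo offset:", findBuildInfo(image))

	// Synthetic start of a .gopclntab section.
	pclntab := []byte{0xf1, 0xff, 0xff, 0xff, 0, 0, 0, 0}
	fmt.Println("pcHeader magic present:", hasPCHeaderMagic(pclntab, targetOrder))
}
```

For real binaries, debug/buildinfo already does the buildinfo search and parsing; this is only meant to show why no section table is needed for that step.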

ianlancetaylor avatar Oct 24 '25 04:10 ianlancetaylor

Today, when using ELF, when generating an ordinary executable, the Go linker generates the following Go-specific sections:

  • .gopclntab. This holds the function names, the file names, the function table, the function lookup table, the filename offset table, and the pcdata.
  • .gosymtab. This is always empty. The symbols runtime.symtab and runtime.esymtab point here; they always have the same value, and they don't point to anything meaningful. It's been a long time since anything appeared in this section, maybe before Go 1.
  • .go.buildinfo. This holds the go:buildinfo symbol which holds the buildinfo data: the Go version and the module dependencies.
  • .go.fipsinfo. This holds the go:fipsinfo symbol used to verify FIPS information in crypto/internal/fips140/check.
  • .typelink. This holds the runtime.typelink symbol, which is a series of int32 offsets into the type section of the module. Note that although the field seems to consistently have type int32 the actual use converts to uintptr, so negative numbers are inadvisable.
  • .itablink. This holds the runtime.itablink symbol, which is a series of pointers (not offsets) to itab tables.
  • .noptrdata. This holds variables that can't contain pointers.
  • .noptrbss. This holds zero-initialized variables that can't contain pointers.
  • .note.go.buildid. This holds the value of the linker's -buildid option, which is normally computed by cmd/go. This is allocated in memory but I'm not sure why.
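
To see which of these sections a given binary actually has, something like the following Go sketch could be used. The helper names goSpecificSections and sectionNames are hypothetical; the set of names is copied from the list above and will change as sections are added or removed.

```go
package main

import (
	"debug/elf"
	"fmt"
	"strings"
)

// goSectionNames is the set of Go-specific ELF section names listed above.
var goSectionNames = map[string]bool{
	".gopclntab":       true,
	".gosymtab":        true,
	".go.buildinfo":    true,
	".go.fipsinfo":     true,
	".typelink":        true,
	".itablink":        true,
	".noptrdata":       true,
	".noptrbss":        true,
	".note.go.buildid": true,
}

// goSpecificSections returns the names that appear in goSectionNames,
// preserving their input order.
func goSpecificSections(names []string) []string {
	var out []string
	for _, n := range names {
		if goSectionNames[n] {
			out = append(out, n)
		}
	}
	return out
}

// sectionNames returns the section names of the ELF file at path.
func sectionNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	names := make([]string, 0, len(f.Sections))
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	// A made-up section list; use sectionNames(path) for a real binary.
	names := []string{".text", ".rodata", ".gopclntab", ".noptrdata", ".bss"}
	fmt.Println(strings.Join(goSpecificSections(names), " "))
}
```

Feeding the result of sectionNames for a Go executable into goSpecificSections gives the subset of the list above that is present in that binary.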

ianlancetaylor avatar Oct 24 '25 18:10 ianlancetaylor

The current Delve implementation looks for at least the .gopclntab and .noptrdata sections (among other sections not specific to Go). It seems to use the .noptrdata section to find the functab information.

ianlancetaylor avatar Oct 24 '25 19:10 ianlancetaylor

Thanks for writing all of this down!

This is relevant to #74396, where we've considered putting runtime.moduledata in its own section, or pointing to the moduledata from go:buildinfo, to make it easier to find in a stripped binary.

prattmic avatar Oct 24 '25 19:10 prattmic

I suggest that we do the following. I'll use ELF names for the sections, and the names can be tweaked as appropriate for other platforms.

  • Keep the current .go.buildinfo section and go:buildinfo symbol unchanged, holding the buildinfo.
    • At some point I believe that we should add a pointer from buildinfo to moduledata, so that moduledata can be found without a symbol table.
    • At some point we can set the SHF_GNU_RETAIN section flag and drop go:buildinfo.ref.
  • Keep the current .go.fipsinfo section and go:fipsinfo symbol unchanged.
  • Keep the current .noptrdata and .noptrbss sections unchanged.
  • ~~See if we can make .note.go.buildid an unallocated section.~~
    • We can't, at least not easily. cmd/internal/buildid expects it to be near the start of the file.
  • Remove the .gosymtab section.
    • Done in https://go.dev/cl/717200.
  • Keep the .gopclntab section, and keep it starting with the pcHeader information, as Delve uses it.
    • Move the PC to function lookup table into .gopclntab.
      • Done in https://go.dev/cl/719743.
    • Move the funcdata information into .gopclntab.
      • Done in https://go.dev/cl/719440.
    • ~~Add an offset to the funcdata at the end of the current pcHeader.~~
      • Postponing until something would use this.
  • Add a new .go.module section in the data segment.
    • Move the moduledata information there.
      • Done in https://go.dev/cl/720660.
    • Longer term, we should move this to the rodata segment. It is only in the data segment because the next field is updated at runtime as shared libraries and plugins are loaded. We can split that out.
  • Add a new .go.type section in the read-only data segment.
    • Move all type descriptors to this section.
      • Done in https://go.dev/cl/723580.
    • Move typelinks to this section; remove the .typelink section.
      • https://go.dev/cl/724261 removes typelinks entirely.
    • Move the itabs to this section.
      • Done in https://go.dev/cl/729200.
    • Move itablinks to this section; remove the .itablink section.
      • https://go.dev/cl/729200 removes itablinks entirely.
  • Add a new .go.gc section in the read-only data segment.
    • Move all type GC bitmasks to this section.
    • Move the global data and global BSS GC programs to this section.
    • Turns out they are already grouped under runtime.gcbits.*.
    • That seems good enough for now.
  • Add a new .go.gccbss section to BSS.
    • Store the GC program data words there.
    • Or just use .noptrbss?
    • https://go.dev/cl/729880 groups the gcmask symbols in noptrbss under the name runtime.gcmask.*.
    • That seems good enough, we can add a new section later if that seems useful.
  • Perhaps add a new .go.func section to hold function descriptors.
    • I don't know whether this is worth it.
    • Did it anyhow in https://go.dev/cl/723580, because that made relro handling easier.

CC @golang/compiler @golang/runtime

ianlancetaylor avatar Oct 24 '25 21:10 ianlancetaylor

I took a look at relro sections. relro means relocation read-only. A relro section is initially loaded as a writable section. After dynamic relocations are applied, the section is changed to be read-only. This is a security measure. It permits variables to contain pointers to other variables whose address is not known at link time (perhaps because the code is going into a position independent executable (PIE) or a shared library), and for those variables to then become immutable after they have been initialized at program load time. In general this concept does not apply to Go variables, which are never immutable, but it does apply to some of the compiler/linker generated data.

As a side note, as a lot of compiler/linker generated data uses offsets, relro must be used with care when using external linking. Most object file formats don't provide relocations for the necessary offsets, so they must be computed by cmd/link. That means that offsets should not cross sections; if they do, the external linker may rearrange the sections such that the cmd/link computed offsets are incorrect.

Specifically, cmd/link currently puts the following compiler/linker generated data into relro sections when generating a PIE or shared library or plugin or a c-archive:

  • type descriptors
    • Type descriptors contain several pointers:
      • GC bitmaps
      • Equality function
      • Pointers to type descriptors for element types, such as slice elements, map keys, etc.
      • Interface methods
      • Struct fields
  • type links
    • These no longer contain any pointers, just offsets from the start of the type descriptors.
    • So these don't need to be relro.
  • itab links
    • These currently contain pointers to itabs.
  • the entire pclntab section
    • Currently I don't think this needs to be relro, but further investigation is needed.
  • function descriptors
    • These contain pointers to function code.

ianlancetaylor avatar Oct 26 '25 00:10 ianlancetaylor

A quick rundown of what is stored in pcdata and funcdata. Currently pcdata appears in the .gopclntab section, and funcdata appears in the .rodata section. I am suggesting that funcdata move to .gopclntab.

Recall that pcdata points to tables that map from PC to value as described at https://go.dev/s/go12symtab. In pcdata we currently find:

  • The _func.pcsp field points to a mapping from PC to the offset between the stack pointer and the frame pointer.
  • The _func.pcfile field points to a mapping from PC to the current file number. This is added to _func.cuOffset to get an offset into cutab, which provides an offset into the file name table.
  • The _func.pcln field points to a mapping from PC to line number.
  • Optional pcdata field 0 is UnsafePoint, a mapping from PC to whether that PC is a safe preemption point. The valid values appear in internal/abi/symtab.go (UnsafePointSafe and so forth).
  • Optional pcdata field 1 is StackMapIndex, a mapping from PC to an index into the funcdata entries LocalsPointerMaps and ArgsPointerMaps.
  • Optional pcdata field 2 is InlTreeIndex, a mapping from PC to an index into the funcdata entry InlTree.
  • Optional pcdata field 3 is ArgLiveIndex, a mapping from PC to an index into the funcdata entry ArgLiveInfo.
  • Optional pcdata field 4 is PanicBounds, a mapping from PC to a code that is a compressed record of an out of range slice or index operation. See internal/abi/bounds.go or https://go.dev/cl/682396 for details.
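
Each of the mappings above is a PC-value table. As a minimal sketch of how such a table can be walked, assuming the encoding is a sequence of (zig-zag varint value delta, varint PC delta) pairs, that the value starts at -1, and that a zero value delta after the first pair ends the table: the function name pcValue and the quantum parameter are hypothetical, and https://go.dev/s/go12symtab remains the authoritative description.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcValue returns the value that the encoded table tab associates with
// targetPC. entry is the function's start PC and quantum is the unit that
// PC deltas are scaled by (1 on amd64 in this sketch's assumptions).
func pcValue(tab []byte, entry, targetPC, quantum uint64) (int32, bool) {
	if targetPC < entry {
		return 0, false
	}
	pc := entry
	val := int32(-1) // assumed initial value
	first := true
	for len(tab) > 0 {
		uvdelta, n := binary.Uvarint(tab)
		if n <= 0 {
			return 0, false // malformed varint
		}
		if uvdelta == 0 && !first {
			return 0, false // end of table, targetPC not covered
		}
		tab = tab[n:]
		// Zig-zag decode the value delta.
		val += int32(uvdelta>>1) ^ -int32(uvdelta&1)

		pcdelta, n := binary.Uvarint(tab)
		if n <= 0 {
			return 0, false
		}
		tab = tab[n:]
		pc += pcdelta * quantum

		// val applies to every PC before the new pc.
		if targetPC < pc {
			return val, true
		}
		first = false
	}
	return 0, false
}

func main() {
	// Hand-built table: value 0 for 4 bytes, then value 8 for 10 bytes.
	tab := []byte{2, 4, 16, 10, 0}
	v, ok := pcValue(tab, 0x1000, 0x1005, 1)
	fmt.Println(v, ok)
}
```

The same walk serves pcsp, pcfile, pcln, and the optional pcdata fields; only the interpretation of the resulting value differs.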

The funcdata information is always optional. It is not a mapping from PC, it's just any random data.

  • 0 ArgsPointerMaps: A stackmap of type runtime.stackmap: number of bitmaps, number of bits in a bitmap, and then that many bitmaps of that size. Indexed by pcdata StackMapIndex. This holds a bitmap of pointers in the arguments, used by the GC when walking the stack. It is also used when copying the stack.
  • 1 LocalsPointerMaps: Just like ArgsPointerMaps, but for local variables.
  • 2 StackObjects: A number of entries, followed by that many entries of type runtime.stackObjectRecord. These record a GC mask for stack objects, so that the GC can ignore stack objects that nothing refers to. The compiler emits this information for variables on the stack whose address is taken. This is separate from stack liveness in that it refers to variables that may or may not be live. See #22350.
  • 3 InlTree: A list of type runtime.inlinedCall. The number of entries is not specified. This is indexed by pcdata InlTreeIndex. This stores information about inlined calls, used when unwinding the stack.
  • 4 OpenCodedDeferInfo: a pair of varints holding the stack offsets to two variables used to run open coded defer statements.
  • 5 ArgInfo: a sequence of bytes used to print function arguments in a stack traceback. See TraceArgsEndSeq and friends in internal/abi/type.go.
  • 6 ArgLiveInfo: A one byte stack offset followed by a bitmap of which register arguments are live. The bitmap is indexed by both pcdata ArgLiveIndex and a slot index from the funcdata ArgInfo. See cmd/compile/internal/liveness/arg.go for details.
  • 7 WrapInfo: A uint32 text offset that is present for wrapper functions. The text offset points to the wrapped function. See #50622.
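
To make the stackmap layout used by ArgsPointerMaps and LocalsPointerMaps concrete, here is a small Go sketch that decodes it from raw bytes. It assumes two int32 header fields (number of bitmaps, bits per bitmap), each bitmap rounded up to whole bytes, and bit i stored in byte i/8 at position i%8. The names stackMap, parseStackMap, isPointer, and targetOrder are hypothetical, and the little-endian order is an assumption about the target.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// targetOrder is an assumption: the byte order of the target architecture.
var targetOrder binary.ByteOrder = binary.LittleEndian

// stackMap is a decoded view of the layout described above.
type stackMap struct {
	nbit    int      // number of bits in each bitmap
	bitmaps [][]byte // one byte slice per bitmap
}

// parseStackMap decodes the header and slices out each bitmap.
func parseStackMap(data []byte, order binary.ByteOrder) (stackMap, error) {
	if len(data) < 8 {
		return stackMap{}, errors.New("stackmap: short header")
	}
	n := int(int32(order.Uint32(data[0:4])))
	nbit := int(int32(order.Uint32(data[4:8])))
	if n < 0 || nbit < 0 {
		return stackMap{}, errors.New("stackmap: negative count")
	}
	size := (nbit + 7) / 8 // bytes per bitmap
	if len(data)-8 < n*size {
		return stackMap{}, errors.New("stackmap: truncated bitmaps")
	}
	m := stackMap{nbit: nbit}
	for i := 0; i < n; i++ {
		start := 8 + i*size
		m.bitmaps = append(m.bitmaps, data[start:start+size])
	}
	return m, nil
}

// isPointer reports whether bit slot is set in the bitmap with index idx,
// where idx would come from pcdata StackMapIndex.
func (m stackMap) isPointer(idx, slot int) bool {
	return m.bitmaps[idx][slot/8]>>(slot%8)&1 != 0
}

func main() {
	// Two bitmaps of three bits each: 0b101 and 0b010.
	data := []byte{2, 0, 0, 0, 3, 0, 0, 0, 0b101, 0b010}
	m, err := parseStackMap(data, targetOrder)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.isPointer(0, 0), m.isPointer(0, 1), m.isPointer(1, 1))
}
```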

It's worth noting that none of the pcdata or funcdata fields contain pointers. Some of the funcdata fields contain offsets.

As far as I can tell the only pointer in all of pclntab is that the pcheader holds the address of the text section. Interestingly, Delve does not use that information, with the comment "use the start PC instead of reading from the table, which may be unrelocated". And the runtime doesn't use it either, preferring the moduledata text field. Getting rid of that pointer will let us avoid moving .gopclntab into relro.

ianlancetaylor avatar Oct 28 '25 00:10 ianlancetaylor

Change https://go.dev/cl/717200 mentions this issue: cmd/link: don't generate .gosymtab section

gopherbot avatar Nov 02 '25 21:11 gopherbot

Change https://go.dev/cl/717240 mentions this issue: cmd/link, runtime: don't store text start in pcHeader

gopherbot avatar Nov 03 '25 04:11 gopherbot

Change https://go.dev/cl/718065 mentions this issue: cmd/link: move pclntab out of relro section

gopherbot avatar Nov 06 '25 00:11 gopherbot

Go now uses the following linker segment order:

text
rodata (contains pclntab)
data
bss
... (others)

pclntab is usually quite large. Relocations between .text and .bss may overflow in some huge programs. If pclntab can be moved from the regular rodata segment to another location, that may reduce the relocation overflows.

wdvxdr1123 avatar Nov 06 '25 04:11 wdvxdr1123

@wdvxdr1123 References between text, data, and BSS segments normally use pointers, not offsets. There shouldn't be any risk of relocation overflow due to a large program. Please let us know if you know of a specific example where a large pclntab can cause a relocation overflow. Thanks.

ianlancetaylor avatar Nov 06 '25 04:11 ianlancetaylor

I encountered the PCREL link overflow in the company's internal code: internal/cpu/cpu.go:200:(.text+0x6e1): relocation truncated to fit: R_X86_64_PC32 against symbol internal/cpu.options

wdvxdr1123 avatar Nov 06 '25 04:11 wdvxdr1123

Thanks. You are correct, and I was mistaken. For GOARCH=amd64 the gc compiler does currently generate 32-bit PC relative relocations for references to package-scope variables. This requires that the total address space used by the program not be much larger than 2GB.
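
For concreteness, the range limit can be sketched in Go as follows. R_X86_64_PC32 stores S + A - P in a signed 32-bit field, where S is the symbol address, A the addend, and P the address of the field being relocated. The function name fitsPC32 and the addresses below are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// fitsPC32 reports whether S + A - P can be stored in a signed 32-bit field.
// s is the symbol address, a the addend, and p the address being relocated.
func fitsPC32(s uint64, a int64, p uint64) bool {
	d := int64(s) + a - int64(p)
	return d >= math.MinInt32 && d <= math.MaxInt32
}

func main() {
	const text = 0x401000 // hypothetical address of the referencing instruction

	// A variable 1 GiB away is reachable.
	fmt.Println(fitsPC32(text+(1<<30), -4, text))

	// A variable 3 GiB away is not; the linker would report
	// "relocation truncated to fit".
	fmt.Println(fitsPC32(text+(3<<30), -4, text))
}
```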

Moving pclntab can help a little bit, but not much.

Please open a separate issue for this problem. Please include as many details about your program as you can. I doubt there is a simple fix. Thanks.

ianlancetaylor avatar Nov 06 '25 17:11 ianlancetaylor

@wdvxdr1123 By the way, note that you are using external linking, as is the default when building a program that uses cgo, so moving the pclntab section will do nothing. It is the external linker that will determine section placement.

ianlancetaylor avatar Nov 06 '25 19:11 ianlancetaylor

Change https://go.dev/cl/719440 mentions this issue: cmd/link: put funcdata symbols in .gopclntab section

gopherbot avatar Nov 11 '25 00:11 gopherbot

Change https://go.dev/cl/719743 mentions this issue: cmd/link: put runtime.findfunctab in the .gopclntab section

gopherbot avatar Nov 11 '25 23:11 gopherbot

Change https://go.dev/cl/720660 mentions this issue: cmd/link: put moduledata in its own .go.module section

gopherbot avatar Nov 14 '25 19:11 gopherbot

Change https://go.dev/cl/721460 mentions this issue: cmd/link: test that funcdata values are in gopclntab section

gopherbot avatar Nov 18 '25 02:11 gopherbot

Change https://go.dev/cl/721480 mentions this issue: cmd/link: test that moduledata is in its own section

gopherbot avatar Nov 18 '25 04:11 gopherbot

Change https://go.dev/cl/723580 mentions this issue: cmd/link: put type descriptors in .go.type section

gopherbot avatar Nov 24 '25 04:11 gopherbot

https://go.dev/cl/718065 is breaking a lot of Google internal targets with relocation R_X86_64_PC32 out of range errors. At least most of these targets use external linking and can't be released anymore. This is significantly impacting our ability to release an internal toolchain. I suspect that the number of failures is a sign that this change will affect people outside of Google too.

znkr avatar Dec 09 '25 16:12 znkr

@znkr Is there anything else you can say about when this occurs? Like, which symbol is being referenced? I don't understand how this change could cause this problem. This is moving pclntab from relro to rodata. All references to pclntab should be offsets from addresses. So I don't see how we could get relocation errors there. And the change keeps pclntab in the data segment. It is true that this may make references from function code to relro variables farther away (because relro follows rodata in the address space), and thus possibly out of range. But that could only happen if the references were very close to being out of range already. But maybe I am missing something.

ianlancetaylor avatar Dec 09 '25 18:12 ianlancetaylor

@stapelberg looked into this last week. He found that the C linker takes care of placing .rodata and .text close to each other (so that relative references work as well as possible). He also came up with a patch to work around the problem by placing pclntab in .lrodata.

diff --git a/src/cmd/link/internal/ld/data.go b/src/cmd/link/internal/ld/data.go
index 5b6dabb62b..c9fe261515 100644
--- a/src/cmd/link/internal/ld/data.go
+++ b/src/cmd/link/internal/ld/data.go
@@ -2140,7 +2140,7 @@ func (state *dodataState) allocateDataSections(ctxt *Link) {
 	}
 
 	/* gopclntab */
-	sect = state.allocateNamedSectionAndAssignSyms(segro, ".gopclntab", sym.SPCLNTAB, sym.SRODATA, 04)
+	sect = state.allocateNamedSectionAndAssignSyms(segro, ".lrodata.gopclntab", sym.SPCLNTAB, sym.SRODATA, 04)
 	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.pclntab", 0), sect)
 	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.pcheader", 0), sect)
 	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.funcnametab", 0), sect)
diff --git a/src/cmd/link/internal/ld/symtab.go b/src/cmd/link/internal/ld/symtab.go
index f9bc7007ed..c5214b8e4f 100644
--- a/src/cmd/link/internal/ld/symtab.go
+++ b/src/cmd/link/internal/ld/symtab.go
@@ -656,6 +656,7 @@ func (ctxt *Link) symtab(pcln *pclntab) []sym.SymKind {
 	moduledata.AddAddr(ctxt.Arch, ldr.Lookup("runtime.etypes", 0))
 	moduledata.AddAddr(ctxt.Arch, ldr.Lookup("runtime.rodata", 0))
 	moduledata.AddAddr(ctxt.Arch, ldr.Lookup("go:func.*", 0))
+	moduledata.AddAddr(ctxt.Arch, ldr.Lookup("runtime.epclntab", 0))
 
 	if ctxt.IsAIX() && ctxt.IsExternal() {
 		// Add R_XCOFFREF relocation to prevent ld's garbage collection of
diff --git a/src/runtime/stack.go b/src/runtime/stack.go
index d1c80276a5..3c0f11f643 100644
--- a/src/runtime/stack.go
+++ b/src/runtime/stack.go
@@ -1357,7 +1357,12 @@ func (r *stackObjectRecord) gcdata() (uintptr, *byte) {
 	ptr := uintptr(unsafe.Pointer(r))
 	var mod *moduledata
 	for datap := &firstmoduledata; datap != nil; datap = datap.next {
-		if datap.gofunc <= ptr && ptr < datap.end {
+		// Check if ptr is in pclntab section (gofunc to epclntab).
+		if datap.gofunc <= ptr && ptr < datap.epclntab {
+			mod = datap
+			break
+		}
+		if datap.noptrbss <= ptr && ptr < datap.enoptrbss {
 			mod = datap
 			break
 		}
diff --git a/src/runtime/stkframe.go b/src/runtime/stkframe.go
index d6e7e0371c..14e0e3f8e6 100644
--- a/src/runtime/stkframe.go
+++ b/src/runtime/stkframe.go
@@ -269,7 +269,12 @@ func stkobjinit() {
 	ptr := uintptr(unsafe.Pointer(&methodValueCallFrameObjs[0]))
 	var mod *moduledata
 	for datap := &firstmoduledata; datap != nil; datap = datap.next {
-		if datap.gofunc <= ptr && ptr < datap.end {
+		// Check if ptr is in pclntab section (gofunc to epclntab).
+		if datap.gofunc <= ptr && ptr < datap.epclntab {
+			mod = datap
+			break
+		}
+		if datap.noptrbss <= ptr && ptr < datap.enoptrbss {
 			mod = datap
 			break
 		}
diff --git a/src/runtime/symtab.go b/src/runtime/symtab.go
index c1643c1b39..0581655530 100644
--- a/src/runtime/symtab.go
+++ b/src/runtime/symtab.go
@@ -422,6 +422,7 @@ type moduledata struct {
 	types, etypes         uintptr
 	rodata                uintptr
 	gofunc                uintptr // go.func.*
+	epclntab              uintptr
 
 	textsectmap []textsect
 	typelinks   []int32 // offsets from types

IIUC, this means that .gopclntab might be inserted between the C++ .rodata and .text in the binary and cause relocation out of range errors.

znkr avatar Dec 09 '25 18:12 znkr

Any idea why there aren't problems with references to relro data? Is the system not using relro?

ianlancetaylor avatar Dec 09 '25 19:12 ianlancetaylor

Inside Google, (at least some) mixed C++/Go binaries are built with -mcmodel=medium -mlarge-data-threshold. With LLD, the sections are laid out in the order .rodata, ..., .text, ..., .data.rel.ro, ..., .lrodata, .... That is, relro data is not between text and rodata. Once pclntab is moved into rodata, it can sit in the middle of a relocation from a text symbol to an early part of rodata. Putting it into lrodata works around the issue, as it is then not between text and rodata.

The failures we see inside Google are all C++/Go mixed binaries, with very large rodata and text sections such that the relocations are already on the edge of overflowing even with the small/large data split. The relocations are from a C++ text symbol to a C++ rodata symbol. They don't reference Go symbols; it is just that the Go part happens to be in between.
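
A toy model of that effect, with entirely made-up section sizes: the Go sketch below lays sections out back to back in a given order and checks whether the span from the start of .rodata to the end of .text still fits in a signed 32-bit displacement. The names section, span, and reachable are hypothetical, and real linkers also add alignment and other sections.

```go
package main

import (
	"fmt"
	"math"
)

const MiB = 1 << 20

// section is a named chunk of address space in the toy layout.
type section struct {
	name string
	size uint64
}

// span lays the sections out back to back in the given order and returns the
// distance from the lowest start to the highest end of sections a and b.
func span(order []section, a, b string) uint64 {
	var addr, hi uint64
	lo := uint64(math.MaxUint64)
	for _, s := range order {
		if s.name == a || s.name == b {
			if addr < lo {
				lo = addr
			}
			if addr+s.size > hi {
				hi = addr + s.size
			}
		}
		addr += s.size
	}
	if lo == math.MaxUint64 {
		return 0 // neither section present
	}
	return hi - lo
}

// reachable reports whether every .text byte can reach every .rodata byte
// with a signed 32-bit PC-relative displacement in this toy layout.
func reachable(order []section) bool {
	return span(order, ".rodata", ".text") <= math.MaxInt32
}

func main() {
	// pclntab placed between .rodata and .text.
	between := []section{
		{".rodata", 1024 * MiB}, {".gopclntab", 300 * MiB},
		{".text", 900 * MiB}, {".data.rel.ro", 10 * MiB},
	}
	// pclntab moved into a large-data section after relro.
	moved := []section{
		{".rodata", 1024 * MiB}, {".text", 900 * MiB},
		{".data.rel.ro", 10 * MiB}, {".lrodata.gopclntab", 300 * MiB},
	}
	fmt.Println(reachable(between), reachable(moved))
}
```

With these sizes the first layout spans 2224 MiB and the second 1924 MiB, which is why only the second stays within the roughly 2 GiB limit.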

cherrymui avatar Dec 09 '25 19:12 cherrymui

Putting pclntab in lrodata is a reasonable workaround, as our pclntab could very well be larger than the -mlarge-data-threshold. I'd think we want to do it only when linking externally in -mcmodel=medium -mlarge-data-threshold mode, though.

As seen from the patch above, we made assumptions about the order of sections in the runtime. By moving things around, certain markers may no longer be valid (like moduledata.end, which currently points to the end of bss, assuming no Go data is laid out after that). Given the unusual order the C linker uses, we probably cannot depend on the order between sections. Instead, we may need start/end markers for each section we need and check them separately. We'll need to carefully look into this.
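
One way to picture per-section markers is the following standalone Go sketch. It is not runtime code: the types addrRange and module and the function findModule are hypothetical, and the fields only loosely mirror the gofunc/epclntab and noptrbss/enoptrbss pairs from the patch above.

```go
package main

import "fmt"

// addrRange is a half-open address range [start, end).
type addrRange struct {
	start, end uintptr
}

// contains reports whether p lies inside the range.
func (r addrRange) contains(p uintptr) bool {
	return r.start <= p && p < r.end
}

// module is a stand-in for moduledata with explicit per-section markers.
type module struct {
	name     string
	pclntab  addrRange // like gofunc..epclntab in the patch
	noptrbss addrRange // like noptrbss..enoptrbss
	next     *module
}

// findModule returns the module whose pclntab or noptrbss range holds p.
// It makes no assumption about the relative order of the sections.
func findModule(first *module, p uintptr) *module {
	for m := first; m != nil; m = m.next {
		if m.pclntab.contains(p) || m.noptrbss.contains(p) {
			return m
		}
	}
	return nil
}

func main() {
	// noptrbss deliberately sits below pclntab, an order that a single
	// "gofunc <= p && p < end" check would not handle.
	plugin := &module{
		name:     "plugin",
		pclntab:  addrRange{0x9000, 0xa000},
		noptrbss: addrRange{0x8000, 0x8800},
	}
	mainMod := &module{
		name:     "main",
		pclntab:  addrRange{0x5000, 0x6000},
		noptrbss: addrRange{0x1000, 0x2000},
		next:     plugin,
	}
	fmt.Println(findModule(mainMod, 0x1800).name, findModule(mainMod, 0x9100).name)
}
```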

Besides the runtime, there may be other tools (debugger, code analyzer) that could make implicit assumptions about section orders. The layout is never guaranteed, so we can make the change. But there could be friction.

cherrymui avatar Dec 09 '25 19:12 cherrymui

What can you share about the linker script that you are using? Right now the pclntab is going into a separate section named .gopclntab. That section is SHF_ALLOC but not SHF_WRITE or SHF_EXECINSTR. Thus in a normal link the section will wind up in the read-only segment.

The proposed patch changes that to .lrodata.gopclntab and doesn't touch the flags. Why does that make a difference? Does the linker script look for sections that start with .lrodata? Can we just change the linker script to handle .gopclntab?

For the changes to runtime/stack.go, the change to look between gofunc and the new epclntab field makes sense to me. But I don't understand why it is also looking between noptrbss and enoptrbss. The pointers in question are funcdata and should always be between gofunc and epclntab. They should always be between gofunc and end, but the tighter bound makes sense.

For the changes to runtime/stkframe.go we should just always look between noptrbss and enoptrbss. The current test looks at a broader range and should always work. But it will fail if .gopclntab somehow winds up after noptrbss.

ianlancetaylor avatar Dec 10 '25 02:12 ianlancetaylor

Oh, I see, runtime/stack.go needs to look in noptrbss specifically because of methodValueCallFrameObjs.

ianlancetaylor avatar Dec 10 '25 02:12 ianlancetaylor

Change https://go.dev/cl/728840 mentions this issue: runtime, cmd/link: tighten search for stackObjectRecord

gopherbot avatar Dec 10 '25 03:12 gopherbot