
Memory leak with nested plsql collections

Open sergej-klimenko opened this issue 2 months ago • 44 comments

I created a simple test using nested lists of objects and measured the memory usage (heap and rss).
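The measurement loop looks roughly like this (a minimal sketch; getRSS, logMem, and the /proc/self/statm-based RSS read are placeholder assumptions, not necessarily what the test file actually uses):

package main

import (
    "fmt"
    "os"
    "runtime"
)

// getRSS returns the process RSS in bytes by reading /proc/self/statm
// (Linux only; the real test may use a different mechanism).
func getRSS() (uint64, error) {
    b, err := os.ReadFile("/proc/self/statm")
    if err != nil {
        return 0, err
    }
    var size, resident uint64
    if _, err := fmt.Sscan(string(b), &size, &resident); err != nil {
        return 0, err
    }
    return resident * uint64(os.Getpagesize()), nil // statm counts pages
}

// logMem prints the same kind of numbers as the test output below.
func logMem(iter int) {
    var ms runtime.MemStats
    runtime.GC()
    runtime.ReadMemStats(&ms)
    rss, _ := getRSS()
    fmt.Printf("%d; Alloc: %.3f MiB, Heap: %.3f MiB, Sys: %.3f MiB, NumGC: %d; rss: %.3f MiB\n",
        iter, float64(ms.Alloc)/(1<<20), float64(ms.HeapAlloc)/(1<<20),
        float64(ms.Sys)/(1<<20), ms.NumGC, float64(rss)/(1<<20))
}

func main() { logMem(0) }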

z_plsql_nested_mem_test.go

These are the results obtained:

go test -run "TestPlSqlNestedObj" -v

eDSN=oracle://demo:demo@localhost:1521/ORA19EE
OSuser=&{502 20 sergej sergej /Users/sergej}
export GODROR_TEST_DSN="user=demo password=SECRET-*** connectString=localhost:1521/ORA19EE connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=16 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s"
# user=demo password=SECRET-*** connectString=localhost:1521/ORA19EE connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=16 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s
Version: v0.49.3+ODPI-5.6.2
Client: 23.3.0.23.9 Timezone: Local
Server: 19.19.0.0.0 [Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production; Version 19.19.0.0.0] Timezone: Local
=== RUN   TestPlSqlNestedMem
    z_plsql_nested_mem_test.go:128: Alloc: 0.413 MiB, Heap: 1.359 MiB, Sys: 13.893 MiB, NumGC: 4
    z_plsql_nested_mem_test.go:135: 0; process memory (rss): 41.469 MiB
testDb: {MaxOpenConnections:16 OpenConnections:2 InUse:2 Idle:0 WaitCount:0 WaitDuration:0s MaxIdleClosed:13 MaxIdleTimeClosed:0 MaxLifetimeClosed:0}: busy=0 open=0 max=0 maxLifetime=0s timeout=0s waitTimeout=0s <nil>
    z_plsql_nested_mem_test.go:128: Alloc: 0.503 MiB, Heap: 1.727 MiB, Sys: 14.143 MiB, NumGC: 188
    z_plsql_nested_mem_test.go:135: 100; process memory (rss): 53.938 MiB
    z_plsql_nested_mem_test.go:128: Alloc: 0.542 MiB, Heap: 1.875 MiB, Sys: 14.393 MiB, NumGC: 373
    z_plsql_nested_mem_test.go:135: 200; process memory (rss): 64.609 MiB
    z_plsql_nested_mem_test.go:128: Alloc: 0.575 MiB, Heap: 1.953 MiB, Sys: 18.393 MiB, NumGC: 561
    z_plsql_nested_mem_test.go:135: 300; process memory (rss): 70.438 MiB
    z_plsql_nested_mem_test.go:128: Alloc: 0.585 MiB, Heap: 1.984 MiB, Sys: 18.393 MiB, NumGC: 749
    z_plsql_nested_mem_test.go:135: 400; process memory (rss): 74.562 MiB
    z_plsql_nested_mem_test.go:128: Alloc: 0.586 MiB, Heap: 1.969 MiB, Sys: 18.393 MiB, NumGC: 940
    z_plsql_nested_mem_test.go:135: 500; process memory (rss): 79.219 MiB
    z_plsql_nested_mem_test.go:128: Alloc: 0.595 MiB, Heap: 2.016 MiB, Sys: 18.393 MiB, NumGC: 1131
    z_plsql_nested_mem_test.go:135: 600; process memory (rss): 83.172 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 87212032
    z_plsql_nested_mem_test.go:128: Alloc: 0.598 MiB, Heap: 2.008 MiB, Sys: 18.393 MiB, NumGC: 1320
    z_plsql_nested_mem_test.go:135: 700; process memory (rss): 86.844 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 91062272
    z_plsql_nested_mem_test.go:128: Alloc: 0.608 MiB, Heap: 2.055 MiB, Sys: 18.393 MiB, NumGC: 1511
    z_plsql_nested_mem_test.go:135: 800; process memory (rss): 90.438 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 94830592
    z_plsql_nested_mem_test.go:128: Alloc: 0.606 MiB, Heap: 2.016 MiB, Sys: 18.643 MiB, NumGC: 1704
    z_plsql_nested_mem_test.go:135: 900; process memory (rss): 94.312 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 98893824
    z_plsql_nested_mem_test.go:128: Alloc: 0.608 MiB, Heap: 2.078 MiB, Sys: 18.643 MiB, NumGC: 1896
    z_plsql_nested_mem_test.go:135: 1000; process memory (rss): 98.094 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 102858752
    z_plsql_nested_mem_test.go:128: Alloc: 0.612 MiB, Heap: 2.094 MiB, Sys: 18.643 MiB, NumGC: 2089
    z_plsql_nested_mem_test.go:135: 1100; process memory (rss): 101.625 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 106561536
    z_plsql_nested_mem_test.go:128: Alloc: 0.615 MiB, Heap: 2.055 MiB, Sys: 18.643 MiB, NumGC: 2282
    z_plsql_nested_mem_test.go:135: 1200; process memory (rss): 105.344 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 110460928
    z_plsql_nested_mem_test.go:128: Alloc: 0.609 MiB, Heap: 2.039 MiB, Sys: 18.643 MiB, NumGC: 2305
    z_plsql_nested_mem_test.go:135: 1213; process memory (rss): 105.766 MiB
    z_plsql_nested_mem_test.go:137: started with RSS 43417600, got 110903296
--- FAIL: TestPlSqlNestedMem (57.16s)
FAIL
exit status 1
FAIL	github.com/godror/godror	58.178s

There is clearly a memory leak here, but I couldn't figure it out. The heap dump shows nothing, and all the usual suspects are correctly closed along the way. I also tried a variant with manually mapped structs, without using struct tags, with the same result. Using standaloneConnection=1 gives the same result. The larger the internal list becomes, the greater the RSS jump. I would appreciate any help.

sergej-klimenko avatar Oct 09 '25 17:10 sergej-klimenko

I discovered something strange in the debug output of the test. None of the lines like

handlers.go:172: 20:33:59.357 DBG "Object.Close" attrs={"object":"0x600001b33570"}

are followed by "Object.Close close sub-object". However, those should be in the output, right? (For closing collections that are attributes of the parent object.)

sergej-klimenko avatar Oct 09 '25 17:10 sergej-klimenko

Try using the same s (pslice) for each call - that'll eliminate any Go-side allocations.
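Roughly like this (a sketch only; PairList, makePairs, and test_pkg.proc are placeholder names, not your test's real identifiers, and it assumes the usual "context" and "database/sql" imports):

// Sketch: bind the same Go value on every call so no new Go-side
// slice/struct is allocated per iteration.
func reuseBind(ctx context.Context, db *sql.DB, iterations int) error {
    s := PairList{Pairs: makePairs(100)} // built once, outside the loop
    for i := 0; i < iterations; i++ {
        if _, err := db.ExecContext(ctx,
            "BEGIN test_pkg.proc(:1); END;",
            sql.Out{Dest: &s, In: true}, // same &s every time
        ); err != nil {
            return err
        }
    }
    return nil
}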

Re sub.Close: does https://github.com/godror/godror/issues/319#issuecomment-1827090539 help?

tgulacsi avatar Oct 10 '25 05:10 tgulacsi

Try using the same s (pslice) for each call - that'll eliminate any Go-side allocations.

Unfortunately, that changes nothing - it just takes slightly longer for the RSS to double: 8face74

Re sub.Close: does https://github.com/godror/godror/issues/319#issuecomment-1827090539 help?

I've tried this on master, so that doesn't help either.

sergej-klimenko avatar Oct 10 '25 07:10 sergej-klimenko

Commenting out

  • L 3329 in stmt.go: does not help
  • L 3320: does not help.
  • dataGetObjectStruct: does not help
  • dataSetObjectStruct: panics

I've tried to add obj/coll.Close everywhere possible, without success.

For me, the allocation is enormous: it starts at 60MiB and ends at 1475MiB.

The strange thing is that the memprofile says it's mostly dataGetObjectStructObj and dataSetObjectStructObj (see the attached profile graph). As it appears in Go's profile, it must be on the Go side, NOT C/OCI/ODPI!
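For reproduction: a profile like the listing in the next comment comes from the standard memprofile flow, roughly (the exact test name and regexp may differ):

go test -run 'TestPlSqlNestedObj' -memprofile mem.pprof -v
go tool pprof -list 'dataSetObjectStructObj' ./godror.test mem.pprof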

tgulacsi avatar Oct 13 '25 17:10 tgulacsi

         .          .   2937:		} else {
         .          .   2938:			for i, n := 0, rv.Len(); i < n; i++ {
         .          .   2939:				if logger != nil && logger.Enabled(ctx, slog.LevelDebug) {
         .          .   2940:					logger.Debug("dataSetObjectStructObj", "i", i, "collectionOf", ot.CollectionOf, "elt", rv.Index(i).Interface())
         .          .   2941:				}
         .   493.02MB   2942:				sub, err := c.dataSetObjectStructObj(ctx, ot.CollectionOf, rv.Index(i))
         .          .   2943:				if err != nil {
         .          .   2944:					coll.Close()
         .          .   2945:					return nil, fmt.Errorf("%d. dataSetObjectStructObj: %w", i, err)
         .          .   2946:				}
         .   512.56kB   2947:				err = coll.Append(sub)
         .          .   2948:				sub.Close()
         .          .   2949:				if err != nil {
         .          .   2950:					coll.Close()
         .          .   2951:					return nil, err
         .          .   2952:				}
         .          .   2953:			}
         .          .   2954:		}
         .          .   2955:		return coll.Object, nil
         .          .   2956:
         .          .   2957:	case reflect.Struct:
         .          .   2958:		if ot.CollectionOf != nil {
         .          .   2959:			for i, n := 0, rvt.NumField(); i < n; i++ {
         .          .   2960:				f := rvt.Field(i)
         .          .   2961:				if !f.IsExported() || fieldIsObjectTypeName(f) {
         .          .   2962:					continue
         .          .   2963:				}
         .          .   2964:				rf := rv.FieldByIndex(f.Index)
         .          .   2965:				// we must find the slice in the struct
         .          .   2966:				if f.Type.Kind() != reflect.Slice {
         .          .   2967:					continue
         .          .   2968:				}
         .          .   2969:				if logger != nil && logger.Enabled(ctx, slog.LevelDebug) {
         .          .   2970:					logger.Debug("dataSetObjectStructObj", "sliceInStruct", f.Name, "ot", ot.FullName(), "rf", rf.Interface())
         .          .   2971:				}
         .          .   2972:				return c.dataSetObjectStructObj(ctx, ot, rf)
         .          .   2973:			}
         .          .   2974:		}
         .          .   2975:
         .    67.50MB   2976:		obj, err := ot.NewObject()
         .          .   2977:		if err != nil || obj == nil {
         .          .   2978:			return nil, err
         .          .   2979:		}
         .          .   2980:		for i, n := 0, rvt.NumField(); i < n; i++ {
         .          .   2981:			f := rvt.Field(i)
         .          .   2982:			if !f.IsExported() || fieldIsObjectTypeName(f) {
         .          .   2983:				continue
         .          .   2984:			}
         .          .   2985:			rf := rv.FieldByIndex(f.Index)
         .     1.50MB   2986:			nm, typ, _ := parseStructTag(f.Tag)
         .          .   2987:			if nm == "-" {
         .          .   2988:				continue
         .          .   2989:			}
         .          .   2990:			if nm == "" {
         .        1MB   2991:				nm = strings.ToUpper(f.Name)
         .          .   2992:			}
         .          .   2993:			attr, ok := obj.Attributes[nm]
         .          .   2994:			if !ok {
         .          .   2995:				return nil, fmt.Errorf("copy %s to %s.%s: %w (have: %q)",
         .          .   2996:					f.Name,
         .          .   2997:					obj.Name, nm, ErrNoSuchKey, obj.AttributeNames())
         .          .   2998:			}
  361.02MB   361.02MB   2999:			var ad Data
         .          .   3000:			if !attr.IsObject() {
         .    59.50MB   3001:				if err := ad.Set(rf.Interface()); err != nil {
         .          .   3002:					return nil, fmt.Errorf("set %q with %T: %w", nm, rv.Interface(), err)
         .          .   3003:				}
         .          .   3004:			} else {
         .          .   3005:				ot := attr.ObjectType
         .          .   3006:				if ot == nil && typ != "" {
         .          .   3007:					var err error
         .          .   3008:					if ot, err = c.GetObjectType(typ); err != nil {
         .          .   3009:						return nil, err
         .          .   3010:					}
         .          .   3011:				}
         .          .   3012:				if logger != nil && logger.Enabled(ctx, slog.LevelDebug) {
         .          .   3013:					logger.Debug("dataSetObjectStructObj", "name", nm, "tag", f.Tag, "ot", ot, "typ", typ)
         .          .   3014:				}
         .   478.02MB   3015:				sub, err := c.dataSetObjectStructObj(ctx, ot, rf)
         .          .   3016:				if err != nil {
         .          .   3017:					return nil, err
         .          .   3018:				}
         .          .   3019:				ad.SetObject(sub)
  512.01kB   512.01kB   3020:				defer sub.Close()
         .          .   3021:			}
         .          .   3022:			if err := obj.SetAttribute(nm, &ad); err != nil {
         .          .   3023:				if logger != nil {
         .          .   3024:					logger.Error("SetAttribute", "obj", ot.Name, "nm", nm,
         .          .   3025:						"index", f.Index, "kind", f.Type.Kind(),

tgulacsi avatar Oct 13 '25 17:10 tgulacsi

Commenting out stmt.go L 2943 eliminates the memory consumption! (RSS stays at 64MiB).

tgulacsi avatar Oct 13 '25 17:10 tgulacsi

x.pprof.gz

tgulacsi avatar Oct 13 '25 17:10 tgulacsi

Ok, without logging in SetAttribute, it's way better:

x.pprof.gz

tgulacsi avatar Oct 13 '25 17:10 tgulacsi

Ok, without logging in SetAttribute, it's way better:

I've got more or less the same RSS results as before this commit...

=== RUN   TestPlSqlNestedObj
    z_plsql_types_i390_test.go:114: Alloc: 0.485 MiB, Heap: 1.258 MiB, Sys: 12.893 MiB, NumGC: 2
    z_plsql_types_i390_test.go:121: 0; process memory (rss): 36.891 MiB
testDb: {MaxOpenConnections:16 OpenConnections:2 InUse:2 Idle:0 WaitCount:0 WaitDuration:0s MaxIdleClosed:16 MaxIdleTimeClosed:0 MaxLifetimeClosed:0}: busy=0 open=0 max=0 maxLifetime=0s timeout=0s waitTimeout=0s <nil>
    z_plsql_types_i390_test.go:114: Alloc: 0.531 MiB, Heap: 1.344 MiB, Sys: 13.393 MiB, NumGC: 28
    z_plsql_types_i390_test.go:121: 100; process memory (rss): 70.188 MiB
    z_plsql_types_i390_test.go:114: Alloc: 0.547 MiB, Heap: 1.352 MiB, Sys: 13.393 MiB, NumGC: 54
    z_plsql_types_i390_test.go:121: 200; process memory (rss): 80.797 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 80 (221.551%)
    z_plsql_types_i390_test.go:114: Alloc: 0.557 MiB, Heap: 1.414 MiB, Sys: 13.393 MiB, NumGC: 79
    z_plsql_types_i390_test.go:121: 300; process memory (rss): 89.469 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 89 (245.330%)
    z_plsql_types_i390_test.go:114: Alloc: 0.561 MiB, Heap: 1.359 MiB, Sys: 13.393 MiB, NumGC: 106
    z_plsql_types_i390_test.go:121: 400; process memory (rss): 98.109 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 98 (269.023%)
    z_plsql_types_i390_test.go:114: Alloc: 0.566 MiB, Heap: 1.367 MiB, Sys: 13.393 MiB, NumGC: 132
    z_plsql_types_i390_test.go:121: 500; process memory (rss): 106.891 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 106 (293.102%)
    z_plsql_types_i390_test.go:114: Alloc: 0.577 MiB, Heap: 1.422 MiB, Sys: 13.393 MiB, NumGC: 158
    z_plsql_types_i390_test.go:121: 600; process memory (rss): 115.406 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 115 (316.452%)
    z_plsql_types_i390_test.go:114: Alloc: 0.584 MiB, Heap: 1.422 MiB, Sys: 14.143 MiB, NumGC: 184
    z_plsql_types_i390_test.go:121: 700; process memory (rss): 124.656 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 124 (341.817%)
    z_plsql_types_i390_test.go:114: Alloc: 0.589 MiB, Heap: 1.422 MiB, Sys: 14.143 MiB, NumGC: 209
    z_plsql_types_i390_test.go:121: 800; process memory (rss): 133.438 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 133 (365.895%)
    z_plsql_types_i390_test.go:114: Alloc: 0.597 MiB, Heap: 1.430 MiB, Sys: 14.143 MiB, NumGC: 234
    z_plsql_types_i390_test.go:121: 900; process memory (rss): 141.516 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 141 (388.046%)
    z_plsql_types_i390_test.go:114: Alloc: 0.499 MiB, Heap: 1.297 MiB, Sys: 14.143 MiB, NumGC: 241
    z_plsql_types_i390_test.go:121: 924; process memory (rss): 143.375 MiB
    z_plsql_types_i390_test.go:123: started with RSS 36, got 143 (393.145%)
--- FAIL: TestPlSqlNestedObj (57.16s)
FAIL
exit status 1
FAIL	github.com/godror/godror	57.889s

sergej-klimenko avatar Oct 13 '25 18:10 sergej-klimenko

I also tried passing the parameter as IN-only, with no difference...

sergej-klimenko avatar Oct 13 '25 19:10 sergej-klimenko

Maybe you can try running this test with AddressSanitizer (unfortunately it's not supported on my arm64)?

GO_LDFLAGS='-fsanitize=address' CGO_CFLAGS='-O0 -g3 -fsanitize=address' go test -asan -run "TestPlSqlNestedObj" -v

sergej-klimenko avatar Oct 13 '25 20:10 sergej-klimenko

$ CC=gcc go test -asan -run "TestPlSqlNestedObj" -v
Version: v0.49.5+ODPI-5.6.3
Client: 23.8.0.25.4 Timezone: Local
Server: 23.9.0.25.7 [Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free; Version 23.9.0.25.07] Timezone: UTC
=== RUN   TestPlSqlNestedObj
    z_plsql_types_i390_test.go:114: Alloc: 0.405 MiB, Heap: 1.109 MiB, Sys: 12.334 MiB, NumGC: 3
    z_plsql_types_i390_test.go:121: 0; process memory (rss): 155.680 MiB
testDb: {MaxOpenConnections:2 OpenConnections:2 InUse:2 Idle:0 WaitCount:0 WaitDuration:0s MaxIdleClosed:10 MaxIdleTimeClosed:0 MaxLifetimeClosed:0}: busy=0 open=0 max=0 maxLifetime=0s timeout=0s waitTimeout=0s <nil>
    z_plsql_types_i390_test.go:114: Alloc: 0.430 MiB, Heap: 1.141 MiB, Sys: 12.334 MiB, NumGC: 162
    z_plsql_types_i390_test.go:121: 100; process memory (rss): 465.246 MiB
    z_plsql_types_i390_test.go:123: started with RSS 155, got 465 (298.848%)
    z_plsql_types_i390_test.go:114: Alloc: 0.433 MiB, Heap: 1.133 MiB, Sys: 12.334 MiB, NumGC: 317
    z_plsql_types_i390_test.go:121: 200; process memory (rss): 492.793 MiB
    z_plsql_types_i390_test.go:123: started with RSS 155, got 492 (316.543%)
    z_plsql_types_i390_test.go:114: Alloc: 0.337 MiB, Heap: 1.047 MiB, Sys: 12.334 MiB, NumGC: 390
    z_plsql_types_i390_test.go:121: 247; process memory (rss): 501.500 MiB
    z_plsql_types_i390_test.go:123: started with RSS 155, got 501 (322.136%)
--- FAIL: TestPlSqlNestedObj (57.49s)

tgulacsi avatar Oct 14 '25 05:10 tgulacsi

I've translated this test one-to-one to python-oracledb (using thick mode) - there is no memory leak here.

sergej-klimenko avatar Oct 14 '25 06:10 sergej-klimenko

With justOut, the memory growth stays around 10% (with inOut, it's more than 100%) - so it's dataSet...

With an empty dataSetObjectStruct, the memory growth stays around 20%.

With breaking out of the Append loops in dataSetObjectStructObj, it stays at 10%.

Ok, it's the loop at stmt.go#L
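For context: justIn / justOut / inOut only differ in the bind direction - roughly this, with test_pkg.proc and the PairList value s as placeholders:

// Sketch of the three bind directions; only the sql.Out wrapper changes.
// Error handling omitted for brevity.
func bindVariants(ctx context.Context, db *sql.DB, s *PairList) {
    const call = "BEGIN test_pkg.proc(:1); END;" // placeholder procedure
    db.ExecContext(ctx, call, *s)                         // justIn:  plain IN bind
    db.ExecContext(ctx, call, sql.Out{Dest: s})           // justOut: OUT only
    db.ExecContext(ctx, call, sql.Out{Dest: s, In: true}) // inOut:   IN/OUT
}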

tgulacsi avatar Oct 28 '25 06:10 tgulacsi

Perhaps you meant the memory growth stays around 10% with justOut, not with justIn? Because in my run: justIn - 380%, justOut < 10%

z_plsql_types_i390_test.go:152: TestPlSqlNestedObj/justIn: started with RSS 42, got 160 (381.659%)

z_plsql_types_i390_test.go:150: TestPlSqlNestedObj/justOut: 1554; process memory (rss): 71.766 MiB

sergej-klimenko avatar Oct 28 '25 09:10 sergej-klimenko

Yes, the input (inOut) allocates that much on the C side - the Go memprofile says it's the same with input and output (data{Get,Set}ObjectStruct).

I'll try some eBPF tracers and some more aggressive closing of Objects.

tgulacsi avatar Oct 28 '25 10:10 tgulacsi

github.com/poonai/cgoleak panics on DWARF:

./godror.test -test.run=NestedObj/inOut -test.v & sudo cgoleak --pid $!

mimalloc:

env MIMALLOC_VERBOSE=1 MIMALLOC_SHOW_STATS=1 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libmimalloc.so.3.1 ./godror.test -test.run=NestedObj/inOut -test.v

says nothing - nor running with valgrind:

valgrind --leak-check=yes ./godror.test -test.run=NestedObj/inOut

(it says a lot, but reports only about 1.5MiB of leaks).

I'm getting short on ideas...

tgulacsi avatar Oct 29 '25 06:10 tgulacsi

Strange.

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libmimalloc.so.3 go test -run=Nested -v

passes (RSS goes from 95MiB to 120MiB)

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 go test -run=Nested -v

does not pass (RSS goes from 96MiB to 221MiB)

This underlines that the allocation is on the C side - but I don't know what else has to be released, and when... @cjbj, @anthony-tuininga, @sudarshan12s can you help?

tgulacsi avatar Nov 02 '25 19:11 tgulacsi

I've simplified the test, replacing the nested collection with some simple char fields of sufficient length: z_plsql_types_coll_test. And this variation also fails:

z_plsql_types_coll_test.go:126: TestPlSqlObjColl/inOut: started with RSS 43, got 107 (247.925%)
--- FAIL: TestPlSqlObjColl (57.10s)
    --- FAIL: TestPlSqlObjColl/inOut (56.91s)

The more of these fields we add, the faster the RSS grows, and it grows indefinitely if we extend the deadline. However, it is not as fast as in the case of nested collections.

sergej-klimenko avatar Nov 04 '25 10:11 sergej-klimenko

Based on my testing, I didn’t observe actual memory leaks at the C layer. Yes there is a gradual RSS increase and it is most likely due to frequent malloc and free operations.

In each iteration, a new connection is acquired, bind variables are created and destroyed, and then the connection is released. This triggers repeated memory allocations and deallocations inside C.dpiConn_newVar. Reusing the in/out bind variable could help reduce these allocations, but I think implementing that may not be straightforward.
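A rough sketch of what reuse would look like on the caller side (PairList and the procedure name are placeholders; this only removes the per-iteration acquire/prepare/release churn - whether the underlying dpiVar is actually reused across executions is exactly the non-trivial part):

func runReusingStmt(ctx context.Context, db *sql.DB, s *PairList, iterations int) error {
    // Hold one dedicated connection and one prepared statement for the
    // whole loop instead of acquiring and releasing them each iteration.
    conn, err := db.Conn(ctx)
    if err != nil {
        return err
    }
    defer conn.Close()
    stmt, err := conn.PrepareContext(ctx, "BEGIN test_pkg.proc(:1); END;")
    if err != nil {
        return err
    }
    defer stmt.Close()
    for i := 0; i < iterations; i++ {
        if _, err := stmt.ExecContext(ctx, sql.Out{Dest: s, In: true}); err != nil {
            return err
        }
    }
    return nil
}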

sudarshan12s avatar Nov 04 '25 14:11 sudarshan12s

Yes there is a gradual RSS increase and it is most likely due to frequent malloc and free operations.

But that's not the case with the same test in python-oracledb (thick mode), which uses the same ODPI-C library...

sergej-klimenko avatar Nov 04 '25 14:11 sergej-klimenko

https://github.com/godror/godror/issues/390#issuecomment-3486336538 : if you close the connection, it releases all of its unreleased resources. I think we're hunting some objects that are not freed here...

https://github.com/godror/godror/issues/390#issuecomment-3485034900 : I've tried explicitly closing all collection items and trimming the collection (8637bd5), without success.

tgulacsi avatar Nov 04 '25 18:11 tgulacsi

5d7d9d2: if I print out dpiObject.refCount in Object.Close, it seems that those PAIR objects are one instance with a hell of a lot of references - and dpiObject_release does not decrease it!

tgulacsi avatar Nov 08 '25 09:11 tgulacsi

5d7d9d2: if I print out dpiObject.refCount in Object.Close, it seems that those PAIR objects are one instance with a hell of a lot of references - and dpiObject_release does not decrease it!

After running the test with dpiObject.refCount printed out, I only get lines like this for PAIR objects, as it should be...

PAIR[0x600002760370] refCount=1 -> 0

sergej-klimenko avatar Nov 08 '25 10:11 sergej-klimenko

3531147:

$ env VERBOSE=1 GODROR_WARN_DPI_OBJECT_REFCOUNT=1 go test -run=Nested/inOut  2>&1 | uniq -c

export GODROR_TEST_DSN="user=demo password=SECRET-*** connectString=localhost:1521/freepdb1 connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=2 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s"
# user=demo password=SECRET-*** connectString=localhost:1521/freepdb1 connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=2 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s
Version: v0.49.5+ODPI-5.6.3
Client: 23.26.0.0.0 Timezone: Local
Server: 23.26.0.0.0 [Oracle AI Database 26ai Free Release 23.26.0.0.0 - Develop, Learn, and Run for Free; Version 23.26.0.0.0] Timezone: UTC
    200 PAIR[0x7fa0a404a360] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0a404a2f0] refCount 1 -> 98261597
    100 PAIR[0x7fa0a404a360] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0a404a2f0] refCount 1 -> 98261597
      1 POBJ[0x7fa0a404a280] refCount 1 -> 98261597
    200 PAIR[0x7fa0a404a360] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0a404a2f0] refCount 1 -> 98261597
    100 PAIR[0x7fa0a404a360] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0a404a2f0] refCount 1 -> 98261597
      1 POBJ[0x7fa0a404a280] refCount 1 -> 98261597

...

    200 PAIR[0x7fa070141730] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0701205c0] refCount 1 -> 98261597
    100 PAIR[0x7fa070141730] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0701205c0] refCount 1 -> 98261597
      1 POBJ[0x7fa0700e90d0] refCount 1 -> 98261597
    200 PAIR[0x7fa070141730] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0701205c0] refCount 1 -> 98261597
    100 PAIR[0x7fa070141730] refCount 1 -> 98261597
      1 PAIR_LIST[0x7fa0701205c0] refCount 1 -> 98261597
    201 POBJ[0x7fa0700e90d0] refCount 1 -> 98261597
      1 POBJ_T[0x7fa0a81eaaa0] refCount 2 -> 1
      1 --- FAIL: TestPlSqlNestedObj (29.24s)
      1     --- FAIL: TestPlSqlNestedObj/inOut (28.77s)
      1         z_plsql_types_i390_test.go:250: dl: 2025-11-08 15:43:30.399895323 +0100 CET m=+28.740652034 dur:28.364931356s
      1         z_plsql_types_i390_test.go:144: TestPlSqlNestedObj/inOut: Alloc: 0.707 MiB, Heap: 1.688 MiB, Sys: 13.334 MiB, NumGC: 5
      1         z_plsql_types_i390_test.go:151: TestPlSqlNestedObj/inOut: 0; process memory (rss): 66.449 MiB
      1         z_plsql_types_i390_test.go:157: TestPlSqlNestedObj/inOut: started with RSS 66, got 66 (100.000%)
      1         z_plsql_types_i390_test.go:144: TestPlSqlNestedObj/inOut: Alloc: 0.743 MiB, Heap: 1.727 MiB, Sys: 13.709 MiB, NumGC: 96
      1         z_plsql_types_i390_test.go:151: TestPlSqlNestedObj/inOut: 30; process memory (rss): 232.887 MiB
      1         z_plsql_types_i390_test.go:153: TestPlSqlNestedObj/inOut: started with RSS 66, got 232 (350.473%)
 FAIL

tgulacsi avatar Nov 08 '25 13:11 tgulacsi

My output

VERBOSE=1 GODROR_WARN_DPI_OBJECT_REFCOUNT=1 go test -run=Nested/inOut 2>&1 | uniq -c

1 export GODROR_TEST_DSN="user=demo password=SECRET-*** connectString=localhost:1521/ORA19EE connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=16 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s"
   1 # user=demo password=SECRET-*** connectString=localhost:1521/ORA19EE connectionClass=TestClassName enableEvents=1 poolIncrement=1 poolMaxSessions=16 poolMinSessions=1 poolSessionMaxLifetime=1h0m0s poolSessionTimeout=5m0s poolWaitTimeout=30s
   1 Version: v0.49.5+ODPI-5.6.3
   1 Client: 23.3.0.23.9 Timezone: Local
   1 Server: 19.19.0.0.0 [Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production; Version 19.19.0.0.0] Timezone: Local
 200 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
 100 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
   1 POBJ[0x600002fc14f0] refCount 1 -> 0
 200 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
 100 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
   1 POBJ[0x600002fc14f0] refCount 1 -> 0
 200 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
 100 PAIR[0x600002fc1400] refCount 1 -> 0
   1 PAIR_LIST[0x600002fc14a0] refCount 1 -> 0
   1 POBJ[0x600002fc14f0] refCount 1 -> 0
....

100 PAIR[0x600002ff80a0] refCount 1 -> 0
   1 PAIR_LIST[0x600002ff8050] refCount 1 -> 0
 201 POBJ[0x600002ff8000] refCount 1 -> 0
   1 POBJ_T[0x600002ff4000] refCount 2 -> 1
   1 --- FAIL: TestPlSqlNestedObj (28.82s)
   1     --- FAIL: TestPlSqlNestedObj/inOut (28.50s)
   1         z_plsql_types_i390_test.go:250: dl: 2025-11-08 15:32:21.329436375 +0200 EET m=+28.988519501 dur:28.425921375s
   1         z_plsql_types_i390_test.go:144: TestPlSqlNestedObj/inOut: Alloc: 0.768 MiB, Heap: 1.734 MiB, Sys: 13.393 MiB, NumGC: 5
   1         z_plsql_types_i390_test.go:151: TestPlSqlNestedObj/inOut: 0; process memory (rss): 46.094 MiB
   1         z_plsql_types_i390_test.go:157: TestPlSqlNestedObj/inOut: started with RSS 46, got 46 (100.102%)
   1         z_plsql_types_i390_test.go:144: TestPlSqlNestedObj/inOut: Alloc: 0.808 MiB, Heap: 1.758 MiB, Sys: 14.018 MiB, NumGC: 189
   1         z_plsql_types_i390_test.go:151: TestPlSqlNestedObj/inOut: 62; process memory (rss): 109.250 MiB
   1         z_plsql_types_i390_test.go:153: TestPlSqlNestedObj/inOut: started with RSS 46, got 109 (237.258%)
   1 FAIL
   1 exit status 1
   1 FAIL	github.com/godror/godror	29.580s

sergej-klimenko avatar Nov 08 '25 13:11 sergej-klimenko

I've tried with Instant Client 21.20 and 19.29, even with a real server (19.28): no change, I get this nonsense in dpiObject.refCount.

And the uniq -c counts add up: we free as many objects as we need (200/100).

So I still don't see the leak's cause (just the leak)...

tgulacsi avatar Nov 08 '25 14:11 tgulacsi

I tried a similar script in python-oracledb and see a similar increase in memory. It appears to be an issue with the Oracle Client libraries (or my understanding of them in ODPI-C!), but that is still being investigated. Using DPI_DEBUG_LEVEL=36, I can see that memory is allocated by OCI but never freed until a call to dpiConn_close() is made.
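For anyone reproducing that trace on the Go side, it amounts to running the test with ODPI-C's debug output enabled, something like:

env DPI_DEBUG_LEVEL=36 go test -run=Nested/inOut -v 2>odpi-debug.log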

anthony-tuininga avatar Nov 08 '25 15:11 anthony-tuininga

@anthony-tuininga

I tried a similar script in python-oracledb and see a similar increase in memory.

This is my variant of the test using python-oracledb (z_plsql_types_i390_test.py), and the resulting output shows no increase in memory:

0; process memory (rss): 58.203 MiB
100; process memory (rss): 60.047 MiB
200; process memory (rss): 60.109 MiB
300; process memory (rss): 60.125 MiB
400; process memory (rss): 60.141 MiB
500; process memory (rss): 60.141 MiB
600; process memory (rss): 60.141 MiB
700; process memory (rss): 60.141 MiB
800; process memory (rss): 60.141 MiB
900; process memory (rss): 60.141 MiB
1000; process memory (rss): 60.156 MiB
1100; process memory (rss): 60.250 MiB
1200; process memory (rss): 60.250 MiB
1300; process memory (rss): 60.250 MiB
1400; process memory (rss): 60.250 MiB
1500; process memory (rss): 60.250 MiB
1600; process memory (rss): 60.250 MiB
1700; process memory (rss): 60.250 MiB
1800; process memory (rss): 60.250 MiB
1854; process memory (rss): 60.250 MiB

sergej-klimenko avatar Nov 08 '25 16:11 sergej-klimenko

@anthony-tuininga

I tried a similar script in python-oracledb and see a similar increase in memory.

This is my variant of the test using python-oracledb (z_plsql_types_i390_test.py), and the resulting output shows no increase in memory.


@sergej-klimenko, can you confirm the platform (Linux or macOS)?

sudarshan12s avatar Nov 08 '25 16:11 sudarshan12s