
rows.Next() halts with no exceptions

Open Selenion opened this issue 3 years ago • 13 comments

On go-ora/v2 v2.4.27 I have a problem. When I iterate through sql.Rows, it sometimes halts with no errors and no timeouts. I did the same on v1.2.1 with no problems.

The discussion was here: https://stackoverflow.com/questions/73048029/rows-next-halts-after-some-number-of-rows

The max length of the field is 7000, but it doesn't determine where the iteration stops: it stops even when I ORDER BY length(TEXT) ASC.

DDL of TICKET table:

-- auto-generated definition
create table TICKET
(
    ID                   NUMBER           not null
        constraint TICKET_ID_PK
            primary key,
    APPLICANT_ID         NUMBER           not null
        constraint TICKET_APPLICANT_FK
            references USERS,
    SOLUTION_USER_PHONE  VARCHAR2(56),
    PRIORITY_ID          NUMBER
        constraint TICKET_PRIORITY_FK
            references TICKET_PRIORITY,
    STATE_ID             NUMBER           not null
        constraint TICKET_STATE_FK
            references TICKET_STATE,
    SERVICE_ID           NUMBER           not null
        constraint TICKET_SERVICE_FK
            references SERVICE,
    CHANGE_ID            VARCHAR2(255),
    GUID                 VARCHAR2(56),
    REMEDY_ID            VARCHAR2(50),
    HEADER               VARCHAR2(100)    not null,
    CREATE_DATE          NUMBER           not null,
    LAST_EDIT_DATE       NUMBER,
    SLA_DATE             NUMBER,
    FINISH_DATE          NUMBER,
    SYNC_MASK            NUMBER default 0 not null,
    TEXT                 CLOB,
    SOLUTION             CLOB,
    TEXT_COPY            CLOB,
    USER_CONTACT_ID      NUMBER
        constraint TICKET_USER_CONTACT_ID_FK
            references USER_CONTACT,
    LAST_SOLUTION_DATE   NUMBER,
    WAIT_PERIOD          NUMBER,
    STATE_REASON         VARCHAR2(255),
    IMPACT_ID            NUMBER
        constraint TICKET_IMPACT_FK
            references IMPACT,
    URGENCY_ID           NUMBER
        constraint TICKET_URGENCY_FK
            references URGENCY,
    WEIGHT_ID            NUMBER,
    REQUEST_TYPE_ID      NUMBER
        constraint TICKET_REQUEST_TYPE_FK
            references REQUEST_TYPE,
    SOURCE_ID            NUMBER
        constraint TICKET_SOURCE_FK
            references SOURCE,
    STATE_REASON_ID      NUMBER
        constraint TICKET_STATE_REASON_FK
            references TICKET_STATE_REASON,
    OPER_CAT_1           VARCHAR2(60),
    OPER_CAT_2           VARCHAR2(60),
    OPER_CAT_3           VARCHAR2(60),
    PROD_CAT_1           VARCHAR2(60),
    PROD_CAT_2           VARCHAR2(60),
    PROD_CAT_3           VARCHAR2(60),
    WORKK                VARCHAR2(255),
    WORKK_RE             VARCHAR2(255),
    SUB_WORK             VARCHAR2(255),
    SUB_WORK_RE          VARCHAR2(255),
    SOLUTION_CAT_1       VARCHAR2(60),
    SOLUTION_CAT_2       VARCHAR2(60),
    SOLUTION_CAT_3       VARCHAR2(60),
    ASSIGNED_NOTE        VARCHAR2(256),
    SOLUTION_USER_LOGIN  VARCHAR2(128),
    SOLUTION_GROUP_NAME  VARCHAR2(128),
    SOLUTION_USER_ID     NUMBER
        constraint TICKET_SOLUTION_USER_FK
            references USERS,
    SOLUTION_GROUP_ID    NUMBER
        constraint TICKET_SOLUTION_GROUP_FK
            references SUPPORT_GROUP,
    LAST_EDIT_USER_LOGIN VARCHAR2(255)
)
/

create index IDX_TICKET
    on TICKET (STATE_ID, APPLICANT_ID)
/

create index IDX_TICKET_SOL_GROUP_ID
    on TICKET (SOLUTION_GROUP_ID)
/

create index IDX_TICKET_SOL_USER_ID
    on TICKET (SOLUTION_USER_ID)
/

create index IDX_TICKET_STATE_ID
    on TICKET (STATE_ID)
/

create index IDX_TICKET_GUID
    on TICKET (GUID)
/

create index IDX_TICKET_REMEDY_ID
    on TICKET (REMEDY_ID)
/

create index IDX_TICKET_CREATE_DATE
    on TICKET (CREATE_DATE)
/

Selenion avatar Jul 21 '22 09:07 Selenion

Unfortunately the above is a bit low on detail, so I'll copy additional info from Stack Overflow in an attempt to make this stand-alone. The OP did enable tracing but has not provided any info from that. I have not been able to replicate this (I would need the OP's data for that).

Oracle version: "Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production"
Working go-ora version: v1.2.1
Failing go-ora version: v2.4.27

The code below freezes (it's not clear whether this happens at rows.Next or at the Scan). While it fails with v2.4.27, the same code runs successfully with v1.2.1. The code is run against a table with the schema shown above; the query returns circa 400k rows.

On Stack Overflow we narrowed the issue down to the column TEXT CLOB (as per the code below; this field contains up to 7000 bytes/row). The point at which things freeze varies (the OP reports that sometimes it "stops right after the first 25 batch, sometimes when it reaches 1200+"). I suggested ordering the result to see if the fault was due to a particular row, but I am not really clear on what happened when that was tried (other than that it still froze).

Sorry that this is still a little vague, but I think it covers off everything in the Stack Overflow question. I suggested the OP raise this issue as it does seem likely the issue is within this library (and it's difficult to get sufficient detail via Stack Overflow comments).

package main

import (
    "database/sql"
    _ "github.com/sijms/go-ora/v2"
    "log"
)

var oraDB *sql.DB

var con = "oracle://login:password@ora_db:1521/database"

func InitOraDB(dataSourceName string) error {
    var err error
    oraDB, err = sql.Open("oracle", dataSourceName)
    if err != nil {
        return err
    }
    return oraDB.Ping()
}

func GetHrTicketsFromOra() {
    var ot string
    rows, err := oraDB.Query("select TEXT from TICKET where SOLUTION_GROUP_ID = 5549")
    if err != nil {
        log.Println("Error while getting rows from Ora:", err)
        return
    }
    defer rows.Close()
    for rows.Next() {
        log.Println("Reading the ticket")

        err := rows.Scan(&ot)
        if err != nil {
            log.Println("Reading failed", err)
        }
        log.Println("Read:")
    }
    // rows.Err reports any error that ended the loop early; without this
    // check a failed iteration is indistinguishable from a normal finish.
    if err := rows.Err(); err != nil {
        log.Println("Iteration aborted:", err)
    }
    log.Println("Finished legacy tickets export")
}

func main() {
    err := InitOraDB(con)
    if err != nil {
        log.Fatal("Error connecting to Ora:", err)
    }

    GetHrTicketsFromOra()
}

MattBrittan avatar Jul 21 '22 10:07 MattBrittan

Here is trace.log, the log from when I tried to extract the data ordered by text length ASC, so the first rows were very short, but I still got a freeze.

Selenion avatar Jul 21 '22 10:07 Selenion

Try defining PREFETCH_ROWS around 100 and see. Do you use a high-speed connection? I think the program freezes because it tries to download a large amount of data in one fetch, so decrease the PREFETCH_ROWS value to avoid this freeze.

v1 uses a fixed PREFETCH_ROWS = 25.
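
For reference, a minimal sketch of how PREFETCH_ROWS can be passed as a URL option (this assumes the go_ora.BuildUrl helper from this driver; host, service, and credentials are placeholders):

package main

import (
    "database/sql"
    "log"

    go_ora "github.com/sijms/go-ora/v2"
)

func main() {
    // PREFETCH_ROWS controls how many rows the driver requests per fetch;
    // a smaller value means less data per network round trip.
    urlOptions := map[string]string{
        "PREFETCH_ROWS": "100",
    }
    // BuildUrl encodes the options into the oracle:// connection string.
    // Placeholder host/service/credentials.
    connStr := go_ora.BuildUrl("ora_db", 1521, "database", "login", "password", urlOptions)

    db, err := sql.Open("oracle", connStr)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    if err := db.Ping(); err != nil {
        log.Fatal(err)
    }
}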

sijms avatar Jul 24 '22 16:07 sijms

I think I found the place where the problem occurs. The reading stops, even with PREFETCH_ROWS=1, on a row where the column SOLUTION is 35000 characters long. I tried various PREFETCH_ROWS values, but 50+ produces a TTC error.

Selenion avatar Jul 24 '22 17:07 Selenion

SOLUTION is a CLOB; I have tested before with up to 100000 length and it was OK. I will try to create the table and put some data in it. How many rows do you try to fetch from the table?

sijms avatar Jul 25 '22 08:07 sijms

The result of the query is about 400k rows. The minimum PREFETCH_ROWS I tried is 1. Here is an example of one SOLUTION value (depersonalized) where I got stuck: example.txt

Selenion avatar Jul 25 '22 10:07 Selenion

I got the same behavior when exporting another field (file content up to 10 MB). Sometimes it halts, sometimes it exports successfully, sometimes it ends with the error "the packet received is not data packet". It can happen on the same row. I asked the people who support the database, but they said there is no strange info in the logs when I execute the query.

Selenion avatar Aug 04 '22 12:08 Selenion

I got an update. When I launched the application in the same data center where the database is, the first example (where the rows with TEXT and SOLUTION are up to 35k) was successfully extracted. But the second table with files (up to 10 MB) has the same problem. So I think the behavior somehow depends on connection speed or active connection time.

Selenion avatar Aug 11 '22 11:08 Selenion

This is repeated in issues #112 and #138: large data needs a rapid and stable network.

sijms avatar Aug 11 '22 12:08 sijms

Nope. I extract only one row at a time. And here is one more error I get from a place where the connection is definitely stable: panic: runtime error: slice bounds out of range [-17:]

goroutine 1 [running]:
github.com/sijms/go-ora/v2/network.(*Session).GetInt64(0xc000392000?, 0xc0008368d7?, 0x7c?, 0x1f?)
        C:/Users/Grigoriy.Shiporin/go/pkg/mod/github.com/sijms/go-ora/[email protected]/network/session.go:1036 +0x192
github.com/sijms/go-ora/v2/network.(*Session).GetInt(...)
        C:/Users/Grigoriy.Shiporin/go/pkg/mod/github.com/sijms/go-ora/[email protected]/network/session.go:1052
github.com/sijms/go-ora/v2.(*Lob).readData(0xc0004b6e80)

Selenion avatar Aug 11 '22 12:08 Selenion

One more thing: the best network conditions under which this issue still occurred are 6 ms ping and 285 Mbit/s.

Selenion avatar Aug 11 '22 13:08 Selenion

@howmp helped with a pull request that fixes reading a large blob spread over multiple data packets, and I created a new release. You can test again with v2.5.3.

sijms avatar Aug 23 '22 20:08 sijms

Thanks for the update. No changes in the new version. The same 3 behaviors as before:

  1. Halt with no error
  2. panic: runtime error: slice bounds out of range
  3. panic: the packet received is not data packet

I could not reproduce each error separately from the other two.

Selenion avatar Aug 24 '22 06:08 Selenion

I found a bug related to Oracle 11g Summary Object reading and fixed it in version 2.5.11. Please test it and see if everything is OK.

sijms avatar Nov 23 '22 16:11 sijms

First of all, thanks for the work on this driver!

We are facing (kind of) the same issue. rows.Next() fails silently, sometimes receiving e.g. only 50 records (2 fetch iterations) instead of the expected 240 for our specific query.

We're using v2.5.17 of the driver. The Oracle version we're using is 'Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production Version 18.7.0.0.0'.

We tested setting prefetch_rows to 20, at which point the result sometimes had 40 records and sometimes all of them.

Now we 'fixed' this by setting the prefetch_rows to 1000, but this is of course not a solution.

We have our Oracle instance in our 'own' datacenter and the functions using this in an AWS Lambda function. Running the same functions in our internal network (with less latency to the Oracle DB) doesn't cause a problem.

If rows.Next() failed with a catchable error, that would already work for us. Please let us know what information we can provide to help out.
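
For what it's worth, a minimal sketch of the standard database/sql pattern for surfacing such failures (the query and DSN are placeholders, and this assumes the driver honors context cancellation): rows.Err() after the loop reports any error that ended iteration early, and a context deadline turns a silent hang inside rows.Next into a catchable error.

package main

import (
    "context"
    "database/sql"
    "log"
    "time"

    _ "github.com/sijms/go-ora/v2"
)

func fetchAll(db *sql.DB) error {
    // A deadline turns an indefinite hang inside rows.Next into a
    // context.DeadlineExceeded error instead of a silent freeze.
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
    defer cancel()

    // Placeholder query; any SELECT works here.
    rows, err := db.QueryContext(ctx, "select VALUE from SYSADM.MAPPING_TBL")
    if err != nil {
        return err
    }
    defer rows.Close()

    var v sql.NullString // VALUE is a nullable VARCHAR2
    for rows.Next() {
        if err := rows.Scan(&v); err != nil {
            return err
        }
    }
    // nil means the result set was exhausted normally; non-nil means the
    // loop was aborted, e.g. by the context deadline.
    return rows.Err()
}

func main() {
    db, err := sql.Open("oracle", "oracle://user:pass@host:1521/service") // placeholder DSN
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    if err := fetchAll(db); err != nil {
        log.Fatal(err)
    }
}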

The Table structure we use is:

CREATE TABLE 
    MAPPING_TBL 
    ( 
        RECNO                          NUMBER(10) NOT NULL, 
        TRANSFORMCODE                  VARCHAR2(10) NOT NULL, 
        FILTERROW                      NUMBER(10) NOT NULL, 
        COLUMNNAME                     VARCHAR2(60) NOT NULL, 
        VALUE                          VARCHAR2(254), 
        SYS_STSLUAFVULIWJ#2ABCQY09S_4H NUMBER GENERATED ALWAYS AS (SYS_OP_COMBINED_HASH 
        ("TRANSFORMCODE","COLUMNNAME")), 
        SYS_STS_N$BNEAJEV0X66MP#LA9RKR NUMBER GENERATED ALWAYS AS (SYS_OP_COMBINED_HASH 
        ("TRANSFORMCODE","COLUMNNAME","VALUE")), 
        CONSTRAINT PK_MAPPING_TBL_TRANSFORMCODEFILTER PRIMARY KEY (TRANSFORMCODE, FILTERROW, COLUMNNAME), 
        CONSTRAINT UX_MAPPING_TBLFILTERECNO UNIQUE (RECNO) 
    );

The Query we use is:

SELECT
    f.value source, was.value from_productcode, wase.value from_variantcode, wordt.value to_productcode, wordte.value to_variantcode
FROM
    SYSADM.MAPPING_TBL f
	, SYSADM.MAPPING_TBL was
	, SYSADM.MAPPING_TBL wordt
	, SYSADM.MAPPING_TBL wordte
	, SYSADM.MAPPING_TBL wase 
WHERE 
    f.transformcode = 'PRODUCT' AND wase.transformcode = 'PRODUCT' AND wordte.transformcode = 'PRODUCT' 
	AND f.columnname = '0#ADDRESSORIGINCODE' 
	AND was.filterrow = f.filterrow AND wordt.filterrow = f.filterrow
	AND wase.filterrow = f.filterrow AND wordte.filterrow = f.filterrow
	AND was.transformcode = 'PRODUCT' AND wordt.transformcode = 'PRODUCT' 
	AND was.columnname = '0#PRODUCTCODE'
	AND wase.columnname = '0#VARIANTCODE'
	AND wordt.columnname = '1#PRODUCTCODE'
	AND wordte.columnname = '1#VARIANTCODE'

The number of rows is approximately 8,000 in our data set.

rvanduiven avatar Dec 13 '22 10:12 rvanduiven

Would you see what the last messages in the log file are?

sijms avatar Dec 13 '22 22:12 sijms

The same issue here. Any solution to this?

LawyZheng avatar Feb 07 '23 01:02 LawyZheng

Would you see what the last messages in the log file are?

Unfortunately I wasn't able to get the messages in the log file from our data team; there was nothing they could find pointing towards this error.

The same issue here. Any solution to this?

Not apart from setting prefetch_rows (which only helps for smaller result sets that stay below this limit).

rvanduiven avatar Feb 07 '23 07:02 rvanduiven

You can see the log file of the program using the following URL option:

urlOptions := map[string]string{
    "TRACE FILE": "trace.log",
}
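
A minimal sketch of where that options map goes (this assumes the go_ora.BuildUrl helper from this driver; host, service, and credentials are placeholders):

package main

import (
    "database/sql"
    "log"

    go_ora "github.com/sijms/go-ora/v2"
)

func main() {
    urlOptions := map[string]string{
        "TRACE FILE": "trace.log", // the driver writes a wire-level trace to this file
    }
    // BuildUrl encodes the options into the oracle:// connection string.
    // Placeholder host/service/credentials.
    connStr := go_ora.BuildUrl("ora_db", 1521, "database", "login", "password", urlOptions)

    db, err := sql.Open("oracle", connStr)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
}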

sijms avatar Feb 07 '23 07:02 sijms

I am updating the driver, fixing many errors and issues. Would you keep testing to see if the issue is resolved?

sijms avatar Feb 24 '23 16:02 sijms