mysqlclient icon indicating copy to clipboard operation
mysqlclient copied to clipboard

Feature request: let client get the correct lastrowid when bulk insert

Open cbi-gshen opened this issue 5 years ago • 0 comments

Hi, When bulk insert a long list of data(thousands of rows) using cursor.executemany(), the cursor.lastrowid is not correct. code is like this:

def get_data():
    return [
        ("db_tests_112312313", 1, 1, "testsetestst", i)
        for i in range(1, 3000)
    ]


def using_mysql_db(insert):
    db = MySQLdb.connect(hostname, username, password)

    cursor = db.cursor()
    sql_statement = """
        INSERT INTO test (name, id_creator, id_owner, description, id_collection)
        VALUES (%s, %s, %s, %s, %s)
    """

    results = cursor.executemany(sql_statement, get_data())

    print("lastrowid ", cursor.lastrowid)

I think this is an issue a lot of people experienced. I am wondering if it's possible the PyMySQL can support that. I did a small experiment like this, only assign the lastrowid once(the first execute() will set the correct lastrowid) when using executemany. and it works.

    def _do_execute_many(self, prefix, values, postfix, args, max_stmt_length, encoding):
        conn = self._get_db()
        escape = self._escape_args
        if isinstance(prefix, str):
            prefix = prefix.encode(encoding)
        if isinstance(values, str):
            values = values.encode(encoding)
        if isinstance(postfix, str):
            postfix = postfix.encode(encoding)
        sql = bytearray(prefix)
        args = iter(args)
        v = values % escape(next(args), conn)
        sql += v
        rows = 0
        tmp_lastrowid = 0  # added this line
        for arg in args:
            v = values % escape(arg, conn)
            if len(sql) + len(v) + len(postfix) + 1 > max_stmt_length:
                rows += self.execute(sql + postfix)
                if tmp_lastrowid:                              # added this line
                    self.lastrowid = tmp_lastrowid  # added this line
                else:                                                    # added this line
                    tmp_lastrowid = self.lastrowid   # added this line
                sql = bytearray(prefix)
            else:
                sql += b","
            sql += v
        rows += self.execute(sql + postfix)
        if tmp_lastrowid:                              # added this line
            self.lastrowid = tmp_lastrowid    # added this line
        self.rowcount = rows
        return rows

I am wordering if this is good idea. are there bigger concerns to do this?

Thanks,

cbi-gshen avatar May 23 '20 00:05 cbi-gshen