go-mssqldb

perf: reduce goroutines used when reading a response


Is this goroutine overhead normal? I have been working with other database drivers (postgres) and did not have this problem. This is a picture extracted from go tool trace trace.out: [image: go tool trace output showing the spawned goroutines]. All of these goroutines were generated during a 15s run of code that does about 10,000 updates per second against the database. It seems like each SQL query spawns a new goroutine, which slows performance.

victorterraga avatar Mar 05 '19 13:03 victorterraga
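For reference, a trace like the screenshot above can be produced with the standard library's runtime/trace package; the sketch below is purely illustrative, with a placeholder file name and workload:

package main

import (
	"os"
	"runtime/trace"
)

func main() {
	// Write an execution trace covering the whole run of the program.
	f, err := os.Create("trace.out")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := trace.Start(f); err != nil {
		panic(err)
	}
	defer trace.Stop()

	runWorkload() // e.g. the loop issuing ~10,000 updates per second
}

func runWorkload() { /* issue the database updates here */ }

The resulting file is then inspected with go tool trace trace.out.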

This driver does indeed use way too many goroutines. So in a sense, yes, that is how it was implemented. Should it be? Probably not.

kardianos avatar Mar 06 '19 05:03 kardianos

Hi, I've tried changing the driver by removing the channels involved in the goroutine pipeline. The driver's performance seems to improve when blocking queues are used instead of channels. Using goroutines and channels appears to slow down the parsing of very large result sets; the effect is especially visible when reading a result set of about 1M rows. The implementation that uses blocking queues is faster.

Here you can find the forked repository where I tried these changes: https://github.com/maxzerbini/go-mssqldb

If you are interested in this approach I can provide other information or I can make a pull request. Thanks for your support.

All the best, Massimo

maxzerbini avatar Jun 10 '19 10:06 maxzerbini
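For readers unfamiliar with the alternative being described, a blocking queue in Go can be built from a mutex and a condition variable rather than a channel. The sketch below is only a generic illustration of that structure, not the forked driver's actual code:

package main

import (
	"fmt"
	"sync"
)

// BlockingQueue is a simple unbounded FIFO protected by a mutex; readers
// block on a condition variable until an item arrives or the queue is closed.
type BlockingQueue struct {
	mu     sync.Mutex
	cond   *sync.Cond
	items  []interface{}
	closed bool
}

func New() *BlockingQueue {
	q := &BlockingQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// Put appends an item and wakes one waiting reader.
func (q *BlockingQueue) Put(item interface{}) {
	q.mu.Lock()
	q.items = append(q.items, item)
	q.mu.Unlock()
	q.cond.Signal()
}

// Get blocks until an item is available; the second return value is false
// once the queue has been closed and drained.
func (q *BlockingQueue) Get() (interface{}, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 && !q.closed {
		q.cond.Wait()
	}
	if len(q.items) == 0 {
		return nil, false
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item, true
}

// Close wakes all waiting readers; remaining items can still be drained.
func (q *BlockingQueue) Close() {
	q.mu.Lock()
	q.closed = true
	q.mu.Unlock()
	q.cond.Broadcast()
}

func main() {
	q := New()
	go func() {
		for i := 0; i < 3; i++ {
			q.Put(i)
		}
		q.Close()
	}()
	for {
		v, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println(v)
	}
}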

Send a PR, I will take a look.

denisenkom avatar Sep 10 '19 00:09 denisenkom

I have created an experimental branch https://github.com/denisenkom/go-mssqldb/tree/no-goroutines. It implements response handling without the use of goroutines. It does not support request cancellation, and I don't yet know how that can be done without a goroutine, since cancellation is implemented by waiting on the response and cancel channels simultaneously.

denisenkom avatar Nov 05 '20 00:11 denisenkom
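For context, "waiting on the response and cancel channels simultaneously" refers to running the blocking read in its own goroutine so the caller can select on either outcome. Below is a minimal, self-contained sketch of that pattern; the blocking read is simulated and all names are illustrative, not the driver's actual API:

package main

import (
	"context"
	"fmt"
	"time"
)

// readResponse stands in for the blocking network read and token parsing.
func readResponse() (string, error) {
	time.Sleep(2 * time.Second)
	return "rows...", nil
}

// awaitResponse races the blocking read against ctx.Done(); the extra
// goroutine needed to make the read selectable is the overhead being
// discussed in this issue.
func awaitResponse(ctx context.Context) (string, error) {
	type result struct {
		resp string
		err  error
	}
	ch := make(chan result, 1)
	go func() {
		r, err := readResponse()
		ch <- result{r, err}
	}()

	select {
	case res := <-ch:
		return res.resp, res.err
	case <-ctx.Done():
		// In a real driver this is where a cancel request would be sent
		// to the server before returning.
		return "", ctx.Err()
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	_, err := awaitResponse(ctx)
	fmt.Println("err:", err) // context deadline exceeded
}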

Could this be the reason for somewhat slow reading? In our case we have a small utility, sql2csv, whose only job is to export query results to a CSV file; compared to the .NET implementation, the Go one takes forever to complete.

mac2000 avatar Dec 09 '21 14:12 mac2000

@mac2000 I would advise you to look at buffering your results to the file. .NET buffers file writes by default, whereas Go does not, so in Go you need to write through a bufio.Writer to get good performance when streaming large outputs, especially on Windows, which has much slower file operations in general.

kardianos avatar Dec 09 '21 15:12 kardianos
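A minimal sketch of the buffered export approach described above for an sql2csv-style tool; the connection string, query, and output path are placeholders:

package main

import (
	"bufio"
	"database/sql"
	"encoding/csv"
	"log"
	"os"

	_ "github.com/denisenkom/go-mssqldb"
)

func main() {
	db, err := sql.Open("sqlserver", "...")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query("...")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	f, err := os.Create("out.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Buffer writes so that each row does not hit the OS individually;
	// .NET's file streams do this by default, Go's os.File does not.
	bw := bufio.NewWriterSize(f, 1<<20)
	w := csv.NewWriter(bw)

	cols, err := rows.Columns()
	if err != nil {
		log.Fatal(err)
	}
	if err := w.Write(cols); err != nil {
		log.Fatal(err)
	}

	vals := make([]sql.NullString, len(cols))
	ptrs := make([]interface{}, len(cols))
	for i := range vals {
		ptrs[i] = &vals[i]
	}
	record := make([]string, len(cols))

	for rows.Next() {
		if err := rows.Scan(ptrs...); err != nil {
			log.Fatal(err)
		}
		for i, v := range vals {
			record[i] = v.String // NULLs become empty strings
		}
		if err := w.Write(record); err != nil {
			log.Fatal(err)
		}
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}

	w.Flush()
	if err := w.Error(); err != nil {
		log.Fatal(err)
	}
	if err := bw.Flush(); err != nil {
		log.Fatal(err)
	}
}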

@denisenkom That looks great! I think that to support cancellation we should use another goroutine. There are three levels at which it could be implemented: (1) per connector, (2) per connection, (3) per query. I think the best option is (2), per connection: when a new connection is created, it starts one long-running goroutine that handles cancellation.

I'll see if I can add this in and make the test suite pass again.

kardianos avatar Dec 09 '21 15:12 kardianos
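An illustrative sketch of option (2), a single long-lived cancellation watcher per connection; every type and name here is hypothetical, not the driver's actual code:

package main

import "context"

// cancelRequest registers the currently running query with the watcher.
type cancelRequest struct {
	ctx  context.Context
	done chan struct{} // closed when the query finishes normally
}

type conn struct {
	watch chan cancelRequest
}

func newConn() *conn {
	c := &conn{watch: make(chan cancelRequest)}
	go c.cancelWatcher() // started once, lives as long as the connection
	return c
}

// cancelWatcher handles cancellation for whichever query is currently
// running; queries on a single connection are serialized, so at most one
// request is in flight at a time. In a real driver the channel would be
// closed when the connection closes.
func (c *conn) cancelWatcher() {
	for req := range c.watch {
		select {
		case <-req.ctx.Done():
			c.sendCancel() // hypothetical: ask the server to abandon the request
			<-req.done     // wait for the reader to observe the cancellation
		case <-req.done:
			// query completed before the context was cancelled
		}
	}
}

func (c *conn) sendCancel() { /* write a cancel signal to the server */ }

// runQuery registers the query's context with the watcher, reads the
// response synchronously on the calling goroutine (no per-query goroutine),
// then signals completion.
func (c *conn) runQuery(ctx context.Context) error {
	done := make(chan struct{})
	c.watch <- cancelRequest{ctx: ctx, done: done}
	defer close(done)
	return c.readResponse()
}

func (c *conn) readResponse() error { return nil /* blocking read */ }

func main() {
	c := newConn()
	_ = c.runQuery(context.Background())
}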

@kardianos indeed, sorry, my mistake; I just double-checked everything, and the Go library is even faster.

dotnet - read 1414484 rows in 00:01:47.1451433
using System.Data.SqlClient;
using System.Diagnostics;

var counter = 0;
var timer = Stopwatch.StartNew();
using var connection = new SqlConnection("...");
using var command = new SqlCommand("...", connection) { CommandTimeout = 0 };
connection.Open();
using var reader = command.ExecuteReader();
if (reader.HasRows)
{
    while (reader.Read())
    {
        counter += 1;
    }
}

Console.WriteLine($"Read {counter} rows in {timer.Elapsed}");
golang - read 1414530 rows in 1m41.883204s
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/denisenkom/go-mssqldb"
)

func main() {
	ctx := context.Background()
	db, err := sql.Open("sqlserver", "...")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	rows, err := db.QueryContext(ctx, "...")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	count := 0
	start := time.Now()
	for rows.Next() {
		count += 1
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Read %v rows in %v\n", count, time.Since(start))
}

It seems I did indeed miss some Go/Windows-specific things here and misread the results; sorry again, and thanks for the library 👍

After more experiments I realized that a regex was slowing everything down. Also, thanks for the pointer to the buffered writer; I now get almost the same results, 4 min in .NET vs 5 min in Go for a 4 GB resulting CSV file, which is not ideal but much better than the previous attempts.

mac2000 avatar Dec 09 '21 16:12 mac2000