MySqlConnector icon indicating copy to clipboard operation
MySqlConnector copied to clipboard

ConnectionTimeout is not working

Open radian-jp opened this issue 5 years ago • 14 comments

If the MySqlConnection.Open method is called when the LAN cable is disconnected, it will stop for a long time.

[OS] Windows10 [DLL] MySqlConnector 0.61.0 [Framework] .NET Core 2.1

[TestCode]

using System;
using System.Diagnostics;
using MySql.Data.MySqlClient;

namespace MySqlConnectorTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var timeFormat = "yyyy/MM/dd HH:mm:ss.fff";
            var builder = new MySqlConnectionStringBuilder();
            builder.Server = "testserver";
            builder.Database = "mydb";
            builder.Port = 3306;
            builder.UserID = "user";
            builder.Password = "password";
            builder.ConnectionTimeout= 5;
            builder.DefaultCommandTimeout = 5;
            var conStr = builder.ToString();
            Debug.WriteLine("conStr:" + conStr);

            while (true)
            {
                try
                {
                    using (var con = new MySqlConnection(conStr))
                    {
                        Debug.WriteLine("Open Start:" + DateTime.Now.ToString(timeFormat));
                        con.Open();
                        Debug.WriteLine("Open End:" + DateTime.Now.ToString(timeFormat));
                        using (var cmd = con.CreateCommand())
                        {
                            cmd.CommandText = "select now(3)";
                            cmd.ExecuteScalar();
                        }
                        con.Close();
                    }
                }
                catch (Exception ex)
                {
                    Debug.WriteLine("Catch Exception:" + DateTime.Now.ToString(timeFormat));
                    Debug.WriteLine(ex.ToString());
                }
                System.Threading.Thread.Sleep(1000);
            }
        }
    }
}

[Result] result.txt

radian-jp avatar Nov 20 '19 06:11 radian-jp

I see two exceptions in your log. The first takes ~19 seconds to time out:

Open Start:2019/11/20 14:45:14.859 <<< disconnect LAN cable
スレッド 0x18504 はコード 0 (0x0) で終了しました。
'dotnet.exe' (CoreCLR: clrhost): 'C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.1.12\System.Security.Cryptography.X509Certificates.dll' が読み込まれました。シンボルの読み込みをスキップしました。モジュールは最適化されていて、デバッグ オプションの [マイ コードのみ] 設定が有効になっています。
例外がスローされました: 'MySql.Data.MySqlClient.MySqlException' (System.Private.CoreLib.dll の中)
Catch Exception:2019/11/20 14:45:33.814
MySql.Data.MySqlClient.MySqlException (0x80004005): Connect Timeout expired. ---> System.ObjectDisposedException: Cannot access a disposed object.

The second appears to take ~125 seconds. I assume your expectation is that both timeouts would be much closer to five seconds.

Open Start:2019/11/20 14:45:57.986 <<< disconnect LAN cable
スレッド 0x1ab98 はコード 0 (0x0) で終了しました。
スレッド 0x1a780 はコード 0 (0x0) で終了しました。
例外がスローされました: 'MySql.Data.MySqlClient.MySqlException' (System.Private.CoreLib.dll の中)
Catch Exception:2019/11/20 14:48:02.103
MySql.Data.MySqlClient.MySqlException (0x80004005): Connect Timeout expired. ---> System.ObjectDisposedException: Cannot access a disposed object.

bgrainger avatar Dec 07 '19 16:12 bgrainger

I'm having similar issues calling this from within an AWS Lambda. No matter what I do, I will not get a timeout of the Open command; it will instead always timeout the Lambda script when I run it. (using .NET Core 2.1)

Any tips? I even log the timeout variable, but still no luck.

joelbyren avatar Feb 12 '20 07:02 joelbyren

There are integration tests to ensure that ConnectionTimeout is working, e.g., https://github.com/mysql-net/MySqlConnector/blob/a0da7960974948aa89754ce00313db6ce7a4c2eb/tests/SideBySide/ConnectSync.cs#L226-L240

However, it sounds like there may be some cases that aren't handled by the existing code, which may need further investigation.

bgrainger avatar Feb 12 '20 16:02 bgrainger

As a hint to check, this could very well be due to AWS implementation of their serverless DB. They probably have a "frontend" that is accepting the incoming connection, but because the DB "backend" is not actually running (it is being started when the call is performed) it waits on the other side after the initial connect is made. Does this make sense?

joelbyren avatar Feb 13 '20 08:02 joelbyren

First off, thank you @bgrainger and team very much for a great open-source connector! Sorry for the wall of text, but I wanted to share my findings as I think they should help a lot in fixing this issue.

We were also seeing problems in an AWS Lambda + Aurora serverless stack, although running with a netcore3.1 custom runtime and through Pomelo's EF Core provider. This is a very niche problem but it seems to occur with Aurora Serverless, and causes our Lambdas to hit the 5 min compute limit rather than gracefully handling a DB timeout.

After quite a bit of digging I was able to recreate the problem synthetically with the following steps:

  • Run a MySQL stub program that blocks indefinitely just after accepting a TCP connection (this is basically just a TCP listener with a breakpoint after accept connection)
  • In a clean project setup new MySqlConnection with a connection string that will point to the above stub
    • Tried both default timeouts and explicitly setting them
  • run connector.Open() or awaited connector.OpenAsync()

The last call never completes, and never throws an exception; it blocks forever

Digging through the stack frames and connector source, it seems there is no timeout for the handshake packets when connecting with TCP

So, what will happen is the TCP connection is opened but the receive call for the initial server hello packet blocks indefinitely.

AFAICT the offending line is in ServerSession.cs. This hello payload receive is made with a SocketByteHandler that never has its RemainingTimeout set, so it will have the default timeout of infinite.

So from all that I think setting a timeout before that receive call (and other handshake calls) would fix the issue. A timeout equal to Connection Timeout would work in a pinch, but maybe better would be ConnectionTimeout - <time-used-during-socket-open>

Thanks!

Some reference:

  • MySqlConnector version 0.62.0
  • .NET Core 2.1 and 3.1 both tested
  • Running on macOS 10.14.6

Test Main:

static async Task Main(string[] args)
{
    MySqlConnectorLogManager.Provider = new ConsoleLoggerProvider(MySqlConnectorLogLevel.Trace);
            
    var connector = new MySqlConnection("Server=localhost;Port=3305;ConnectionTimeout=10;Command Timeout=10");
    await connector.OpenAsync();

    // The following line is never reached
    Console.WriteLine("Connection opened");
}

Log output: (this is after leaving for ~10min)

[INFO]  ConnectionPool  Pool1 creating new connection pool for ConnectionString: Server=localhost;Port=3305;Connection Timeout=10;Default Command Timeout=10
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 waiting for an available session
[DEBUG] ServerSession   Session1.1 created new session
[INFO]  ConnectionPool  Pool1 no pooled session available; created new Session1.1
[INFO]  ServerSession   Session1.1 connecting to IpAddress ::1 for HostName 'localhost'
[INFO]  ServerSession   Session1.1 connecting to IpAddress 127.0.0.1 for HostName 'localhost'
[DEBUG] ServerSession   Session1.1 connected to IpAddress 127.0.0.1 for HostName 'localhost' with local Port 56660
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions
[DEBUG] ConnectionPool  Pool1 reaping connection pool
[DEBUG] ConnectionPool  Pool1 recovered no sessions

DillonN avatar Mar 17 '20 03:03 DillonN

@DillonN Thank you for the detailed report! It sounds like I should be able to reproduce this easily (using https://github.com/mysql-net/MySqlConnector/blob/master/tests/MySqlConnector.Tests/FakeMySqlServer.cs or an even simpler server), so I'll try to look into this and fix it soon.

bgrainger avatar Mar 17 '20 04:03 bgrainger

@bgrainger thanks for the quick response, let me know if you need anything more

DillonN avatar Mar 17 '20 14:03 DillonN

The scenario @DillonN should be fixed in 0.63.0.

@radian-jp can you confirm your original scenario is improved?

bgrainger avatar Mar 26 '20 23:03 bgrainger

Latency after repeated disconnections has been improved, but seems to be different from the ConnectionTimeout setting. (ConnectionTimeout= 5) The results of comparison with Official MySql.Data will be posted.

[MySql.Data 8.0.19] Open Start:2020/03/27 14:17:42.058 MySqlConnectorTest Error: 0 : Unable to connect to any of the specified MySQL hosts. 例外がスローされました: 'MySql.Data.MySqlClient.MySqlException' (MySql.Data.dll の中) Catch Exception:2020/03/27 14:17:47.114 (ElapsedMilliseconds) 5056

[MySqlConnector 0.63.0] Open Start:2020/03/27 14:12:00.483 '例外がスローされました: 'MySql.Data.MySqlClient.MySqlException' (System.Private.CoreLib.dll の中) Catch Exception:2020/03/27 14:12:19.431 (ElapsedMilliseconds) 18948

radian-jp avatar Mar 27 '20 05:03 radian-jp

Success! The connection from my AWS Lambda now correctly timed out! Thank you so much for the fast response!

joelbyren avatar Mar 27 '20 07:03 joelbyren

Reopening as the issue @radian-jp reported needs to be investigated further.

bgrainger avatar Mar 27 '20 13:03 bgrainger

Commenting out the ExecuteScalar in the first post of the test code worked correctly for the timeout. It is possible that another socket operation is affecting the MySqlConnection.Open.

radian-jp avatar Mar 31 '20 04:03 radian-jp

@bgrainger I re-tested in my synthetic setup and the timeout works correctly!

It will take a bit to get the change merged in our dev environment and watch for issues, but I am thinking it is fixed. So unless I come back in a couple days it is fixed on my end.

Thanks again for the quick action!

DillonN avatar Mar 31 '20 18:03 DillonN

After executing ExecuteScalar and reconnecting with the network disconnected, m_socket.Receive of SocketByteHandler is called and seems to be blocking. trace.txt

radian-jp avatar Apr 09 '20 01:04 radian-jp