
[DRAFT] CI: Enable testing on Windows

Open amotl opened this issue 5 years ago • 9 comments

This is just a WIP based on #386. Let's see how it goes.

amotl avatar Nov 03 '20 11:11 amotl

CrateDB nightlies are not available for macOS and Windows, see [1], bummer. Let's mitigate that by only adjusting the CrateDB version to "nightly" on Linux through c99d01d.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347098559#step:4:124

amotl avatar Nov 03 '20 11:11 amotl

We are seeing "An existing connection was forcibly closed by the remote host" again [1], as in https://github.com/crate/crate-python/pull/386#issuecomment-718101852.

Error: Error downloading extends for URL https://cdn.crate.io/downloads/releases/cratedb/x64_windows/crate-4.3.0.zip: [WinError 10054] An existing connection was forcibly closed by the remote host
Error: Process completed with exit code 1.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347231174#step:4:115

amotl avatar Nov 03 '20 11:11 amotl

While making some progress on this, the last command, bin/coverage run $test_program -vv1, fails on Windows with one of the following errors:

No file to run: 'bin/test-quick'

or

No file to run: 'bin\test-quick'

or

No file to run: 'D:\a\crate-python\crate-python\bin\test-quick'

See also [1], [2] and [3].

This is somehow related to how the system path and the Python path are handled on ~~various operating systems~~ Windows [4].

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347349770#step:5:16
[2] https://github.com/crate/crate-python/runs/1347349770?check_suite_focus=true#step:5:16
[3] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347464565#step:5:16
[4] https://stackoverflow.com/questions/3312451/how-can-you-get-unittest2-and-coverage-py-working-together

amotl avatar Nov 03 '20 12:11 amotl

The reason that bin/coverage has not been able to find bin/test-quick is that it gets installed as bin/test-quick.exe on Windows. However, addressing that file obviously will not work as intended [1]:

Couldn't run 'bin/test-quick.exe' as Python code: SyntaxError: invalid or missing encoding declaration

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347537748#step:5:17
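
One possible way around this is to hand coverage a Python source file instead of the .exe launcher. The following is only a sketch under an assumption: that the Windows installation places a `<name>-script.py` file next to `<name>.exe`, as setuptools-style launchers commonly do. The helper name `resolve_test_program` is hypothetical and not part of this repository.

```python
from pathlib import Path


def resolve_test_program(bin_dir, name, is_windows):
    """Return the path that `coverage run` should be given for a test runner.

    Hypothetical helper. Assumption: on Windows, a `<name>-script.py` file
    sits next to the `<name>.exe` launcher.
    """
    bin_dir = Path(bin_dir)
    if is_windows:
        script = bin_dir / f"{name}-script.py"
        if script.is_file():
            return script
    # On other platforms, or when no script file is present, use the plain name.
    return bin_dir / name
```

If the assumed `-script.py` file does not exist, the helper falls back to the plain name, so the behaviour on Linux and macOS stays unchanged.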

amotl avatar Nov 03 '20 13:11 amotl

After some more adjustments, the test suite also starts on Windows. Now, it croaks with AttributeError: module 'time' has no attribute 'clock' from sqlalchemy.util.compat [1], at least with sqlalchemy-1.1.18 on Python 3.8.

Traceback (most recent call last):
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 39, in <module>
    from crate.client.sqlalchemy.dialect import CrateDialect
  File "D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\__init__.py", line 22, in <module>
    from .dialect import CrateDialect
  File "D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\dialect.py", line 25, in <module>
    from sqlalchemy import types as sqltypes
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\__init__.py", line 9, in <module>
    from .sql import (
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\__init__.py", line 8, in <module>
    from .expression import (
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\expression.py", line 33, in <module>
    from .visitors import Visitable
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\visitors.py", line 28, in <module>
    from .. import util
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\util\__init__.py", line 8, in <module>
    from .compat import callable, cmp, reduce,  \
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\util\compat.py", line 172, in <module>
    time_func = time.clock
AttributeError: module 'time' has no attribute 'clock'

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347946253#step:5:42


According to https://github.com/sqlalchemy/sqlalchemy/issues/5305, https://github.com/sqlalchemy/sqlalchemy/issues/4731 and https://github.com/sqlalchemy/sqlalchemy/issues/5444, the cause is that time.clock, deprecated since Python 3.3, has been removed in Python 3.8.

At https://github.com/sqlalchemy/sqlalchemy/issues/5305#issuecomment-623451647, @zzzeek recommends to

upgrade to a recent version of SQLAlchemy.
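
For illustration, the failing line in the traceback can be contrasted with a fallback lookup. This is a minimal sketch of the general technique, not the change SQLAlchemy actually shipped. Upgrading remains the recommended fix.

```python
import time

# Use time.clock where it still exists (Python < 3.8) and fall back to
# time.perf_counter, which is available on all Python 3 versions since 3.3.
time_func = getattr(time, "clock", time.perf_counter)

start = time_func()
elapsed = time_func() - start
print(f"timer in use: {time_func.__name__}, elapsed: {elapsed:.6f}s")
```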

amotl avatar Nov 03 '20 14:11 amotl

Introduction

There's a problem spinning up CrateDB on Windows.

Problem

Now, there's an error when trying to bootstrap CrateDB on Windows [1]. It goes like

b'[2020-11-03T15:11:44,184][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [crate] uncaught exception in thread [main]'
b'org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to create a child event loop'
[...]
b'Caused by: java.lang.IllegalStateException: failed to create a child event loop'
b'at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:88) ~[netty-common-4.1.53.Final.jar:4.1.53.Final]'
[...]
b'Caused by: io.netty.channel.ChannelException: failed to open a new selector'
[...]
b'Caused by: java.io.IOException: Unable to establish loopback connection'
[...]
b'Caused by: java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket'
[...]

java.lang.IllegalStateException: failed to create a child event loop

It might be related to the same thing happening to ES [2] or Netty in general [3]. Comments in both threads suggest turning off the firewall. See also https://github.com/netty/netty/issues/7243.

Maybe [4] is related to this?

GitHub hosts Linux and Windows runners on Standard_DS2_v2 virtual machines in Microsoft Azure with the GitHub Actions runner application installed. And due to security policy Azure blocks ICMP by default. Hence, you cannot get ICMP answer in workflow.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1348166824#step:5:212
[2] https://discuss.elastic.co/t/elasticsearch-fails-to-start-with-java-lang-illegalstateexception-failed-to-create-child-event-loop/218797
[3] https://stackoverflow.com/questions/27506788/failed-to-create-a-child-event-loop
[4] https://github.community/t/what-is-the-default-firewall-for-github-actions-runners/17732

java.io.IOException: Unable to establish loopback connection

That seems to be related to IPv6 somehow. People suggest making Java prefer the IPv4 stack by setting export _JAVA_OPTIONS="-Djava.net.preferIPv4Stack=true". However, there might be other root causes.

  • https://stackoverflow.com/questions/42547714/java-io-ioexception-unable-to-establish-loopback-connection/42548312#42548312
  • https://stackoverflow.com/questions/13698221/loopback-connection-fail
  • https://androidpedia.net/en/knowledge-base/42547714/java-io-ioexception--unable-to-establish-loopback-connection
  • https://www.wowza.com/community/t/server-error-unable-to-establish-loopback-connection/41453

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

This issue seems to be well known to the community and has been around for quite some time, dating back to Java 1.4. Most probably, it is a socket port conflict.

While SO #37643301 says

WinSock error 10106 is WSAEPROVIDERFAILEDINIT: "Service provider failed to initialize. The requested service provider could not be loaded or initialized. This error is returned if either a service provider's DLL could not be loaded (LoadLibrary failed) or the provider's WSPStartup or NSPStartup function failed."

GSE #163543 says

java.net.SocketException: Unrecognized Windows Sockets error: 10106: create means you are failing to open the port. Usually this has one of two reasons: either you (or any program you installed) blocked the port to protect you, or any program you have got is using the Port right now.

  • https://stackoverflow.com/questions/37643301/unrecognized-windows-sockets-error-10106-create-when-starting-tomcat-in-the-co
  • https://bz.apache.org/netbeans/show_bug.cgi?id=91532
  • https://coderanch.com/t/207423/java/java-net-SocketException-Unrecognized-Windows
  • https://www.programmersought.com/article/44024687455/
  • https://bugs.mojang.com/browse/MC-81575

amotl avatar Nov 03 '20 15:11 amotl

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

None of these brought any improvement:

# Disable firewall.
netsh advfirewall set allprofiles state off

# Reset IP stack.
netsh winsock reset

# Disable IPv6.
netsh interface teredo set state disabled
netsh interface ipv6 6to4 set state state=disabled undoonstop=disabled
netsh interface ipv6 isatap set state state=disabled

# Run Java in headless mode and prefer IPv4 stack.
set _JAVA_OPTIONS="-Djava.awt.headless=true -Djava.net.preferIPv4Stack=true"

amotl avatar Nov 05 '20 07:11 amotl

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

Some resources suggest setting the SystemRoot environment variable to C:\Windows:

  • https://bz.apache.org/netbeans/show_bug.cgi?id=91532
  • https://stackoverflow.com/questions/3947555/java-net-socketexception-unrecognized-windows-sockets-error-0-jvm-bind-jboss/60072492#60072492
  • https://community.microfocus.com/t5/StarTeam-Knowledge-Base/Network-problem-Unrecognized-Windows-Sockets-error-10106-create/ta-p/1755101

Indeed: With 4c05db6, CrateDB seems to be able to spin up successfully.
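
The commit is not reproduced here. As an illustration of the general idea only: Python's subprocess documentation notes that on Windows an explicitly passed env must include a valid SystemRoot, so a hand-built environment for a child process can be completed as sketched below. Whether this is the exact mechanism at play in this test setup is an assumption, and the helper name `with_system_root` is hypothetical.

```python
def with_system_root(env, default=r"C:\Windows"):
    """Return a copy of `env` that is guaranteed to define SystemRoot.

    Hypothetical helper. Environment variable names are case-insensitive
    on Windows, so an existing SYSTEMROOT entry is left untouched.
    """
    env = dict(env)
    if not any(key.upper() == "SYSTEMROOT" for key in env):
        env["SystemRoot"] = default
    return env
```

The result could then be passed along when launching the server, for example `subprocess.Popen(cmd, env=with_system_root(custom_env))`, where `cmd` and `custom_env` stand for whatever the test layer already uses.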

amotl avatar Nov 05 '20 07:11 amotl

Now, we are seeing this:

Error in test D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\doctests\itests.txt
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\unittest\case.py", line 601, in run
    self.setUp()
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\doctest.py", line 2167, in setUp
    self._dt_setUp(test)
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 160, in setUpCrateLayerAndSqlAlchemy
    setUpWithCrateLayer(test)
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 148, in setUpWithCrateLayer
    cursor.execute("copy locations from ?", (data_path,))
  File "D:\a\crate-python\crate-python\src\crate\client\cursor.py", line 54, in execute
    bulk_parameters)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 396, in sql
    content = self._json_request('POST', self.path, data=data)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 523, in _json_request
    _raise_for_status(response)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 207, in _raise_for_status
    error_trace=error_trace)
crate.client.exceptions.ProgrammingError: SQLParseException[Illegal character in opaque part at index 2: D:\a\crate-python\crate-python\src\crate\testing\testdata\data\test_a.json]

Regarding this, we are lucky @seut already suggested at https://github.com/crate/crate/issues/7781 to

Use a valid file URI on windows, see e.g.: https://blogs.msdn.microsoft.com/ie/2006/12/06/file-uris-in-windows/.

The place to adjust that is: https://github.com/crate/crate-python/blob/0ad90b0d7eff6b332fbe6a826603d15d425c8d5d/src/crate/client/tests.py#L146-L148
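
A minimal sketch of such a conversion, assuming a drive-letter path like the one in the error message above (UNC paths are not handled). The function name `windows_path_to_file_uri` is hypothetical. On a real Windows host, `pathlib.Path(path).as_uri()` should yield the same form.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def windows_path_to_file_uri(path):
    """Convert a drive-letter Windows path into a file:/// URI.

    Hypothetical helper; PureWindowsPath makes it usable on any platform.
    """
    # Turn backslashes into forward slashes: D:\a\b -> D:/a/b
    posix_style = PureWindowsPath(path).as_posix()
    # Percent-encode special characters but keep "/" and the drive colon.
    return "file:///" + quote(posix_style, safe="/:")


data_path = r"D:\a\crate-python\crate-python\src\crate\testing\testdata\data\test_a.json"
print(windows_path_to_file_uri(data_path))
```

The URI, rather than the raw path, would then be the value passed as the parameter to the "copy locations from ?" statement.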

amotl avatar Nov 05 '20 08:11 amotl