
antrea OvS not installing

Open jayunit100 opened this issue 2 years ago • 25 comments

Looks like, as of today, OVS somehow isn't starting.

Any ideas on why ?

Reproducing:

vagrant destroy --force ; make all; 

Debugging:

vagrant ssh winw1 ; Start-Service *ovs* ; 
# check C:/openvswitch logs for details

Nothing in the openvswitch logs....

PS C:\Users\vagrant> cat C:\openvswitch\var\log\openvswitch\
cat : Could not find a part of the path 'C:\openvswitch\var\log\openvswitch\'.
At line:1 char:1
+ cat C:\openvswitch\var\log\openvswitch\
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\openvswitch\var\log\openvswitch\:String) [Get-Content], DirectoryN
   otFoundException
    + FullyQualifiedErrorId : GetContentReaderDirectoryNotFoundError,Microsoft.PowerShell.Commands.GetContentCommand

PS C:\Users\vagrant> ls C:\openvswitch\var\log\openvswitch\
PS C:\Users\vagrant> ls C:\openvswitch\var\log\openvswitch\
PS C:\Users\vagrant>

The full logs I see are:

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----         7/4/2022  10:15 AM                net.d
7/4/2022 10:18 AM Installation log location: C:\k\antrea\install_ovs.log

7/4/2022 10:18 AM Downloading OVS package from https://downloads.antrea.io/ovs/ovs-2.15.2-antrea.0-win64.zip to C:\k\antrea\ovs-win64.zip

7/4/2022 10:18 AM Download OVS package success.

7/4/2022 10:18 AM Extract C:\k\antrea\ovs-win64.zip to C:\k\antrea

7/4/2022 10:18 AM Copying OVS package from C:\k\antrea\openvswitch to C:\openvswitch

7/4/2022 10:18 AM Installing OVS driver certificate.


PSPath                   : Microsoft.PowerShell.Security\Certificate::LocalMachine\TrustedPublisher\9E27F7B475263235DF5
                           A75A4936D0B21460F4630
PSParentPath             : Microsoft.PowerShell.Security\Certificate::LocalMachine\TrustedPublisher
PSChildName              : 9E27F7B475263235DF5A75A4936D0B21460F4630
PSIsContainer            : False
Archived                 : False
Extensions               : {System.Security.Cryptography.Oid, System.Security.Cryptography.Oid}
FriendlyName             :
IssuerName               : System.Security.Cryptography.X509Certificates.X500DistinguishedName
NotAfter                 : 10/12/2031 5:00:00 PM
NotBefore                : 10/13/2021 4:44:07 AM
HasPrivateKey            : False
PrivateKey               :
PublicKey                : System.Security.Cryptography.X509Certificates.PublicKey
RawData                  : {48, 130, 3, 18...}
SerialNumber             : 1F2362D86CB36598457D26A3A4317421
SubjectName              : System.Security.Cryptography.X509Certificates.X500DistinguishedName
SignatureAlgorithm       : System.Security.Cryptography.Oid
Thumbprint               : 9E27F7B475263235DF5A75A4936D0B21460F4630
Version                  : 3
Handle                   : 1963336863792
Issuer                   : CN="WDKTestCert appveyor,132785990465004813"
Subject                  : CN="WDKTestCert appveyor,132785990465004813"
EnhancedKeyUsageList     : {Code Signing (1.3.6.1.5.5.7.3.3)}
DnsNameList              : {WDKTestCert appveyor,132785990465004813}
SendAsTrustedIssuer      : False
EnrollmentPolicyEndPoint : Microsoft.CertificateServices.Commands.EnrollmentEndPointProperty
EnrollmentServerEndPoint : Microsoft.CertificateServices.Commands.EnrollmentEndPointProperty
PolicyId                 :


PSPath                   : Microsoft.PowerShell.Security\Certificate::LocalMachine\Root\9E27F7B475263235DF5A75A4936D0B2
                           1460F4630
PSParentPath             : Microsoft.PowerShell.Security\Certificate::LocalMachine\Root
PSChildName              : 9E27F7B475263235DF5A75A4936D0B21460F4630
PSIsContainer            : False
Archived                 : False
Extensions               : {System.Security.Cryptography.Oid, System.Security.Cryptography.Oid}
FriendlyName             :
IssuerName               : System.Security.Cryptography.X509Certificates.X500DistinguishedName
NotAfter                 : 10/12/2031 5:00:00 PM
NotBefore                : 10/13/2021 4:44:07 AM
HasPrivateKey            : False
PrivateKey               :
PublicKey                : System.Security.Cryptography.X509Certificates.PublicKey
RawData                  : {48, 130, 3, 18...}
SerialNumber             : 1F2362D86CB36598457D26A3A4317421
SubjectName              : System.Security.Cryptography.X509Certificates.X500DistinguishedName
SignatureAlgorithm       : System.Security.Cryptography.Oid
Thumbprint               : 9E27F7B475263235DF5A75A4936D0B21460F4630
Version                  : 3
Handle                   : 1963339904416
Issuer                   : CN="WDKTestCert appveyor,132785990465004813"
Subject                  : CN="WDKTestCert appveyor,132785990465004813"
EnhancedKeyUsageList     : {Code Signing (1.3.6.1.5.5.7.3.3)}
DnsNameList              : {WDKTestCert appveyor,132785990465004813}
SendAsTrustedIssuer      : False
EnrollmentPolicyEndPoint : Microsoft.CertificateServices.Commands.EnrollmentEndPointProperty
EnrollmentServerEndPoint : Microsoft.CertificateServices.Commands.EnrollmentEndPointProperty
PolicyId                 :

7/4/2022 10:18 AM Installing OVS kernel driver

7/4/2022 10:18 AM Hyper-V Virtual Machine Management service status: Running


C:\openvswitch\driver>netcfg -l .\ovsext.inf -c s -i OVSExt
Trying to install OVSExt ...

... .\ovsext.inf was copied to C:\Windows\INF\oem2.inf.

... done.


C:\openvswitch\driver>net stop vmms
The Hyper-V Virtual Machine Management service is stopping.
The Hyper-V Virtual Machine Management service was stopped successfully.


C:\openvswitch\driver>net start vmms
The Hyper-V Virtual Machine Management service is starting.
The Hyper-V Virtual Machine Management service was started successfully.

7/4/2022 10:18 AM Creating ovsdb file

7/4/2022 10:18 AM Create and start ovsdb-server service

[SC] CreateService SUCCESS
[SC] ChangeServiceConfig2 SUCCESS


Stderr from the command:

Start-Service : Service 'ovsdb-server (ovsdb-server)' cannot be started due to the following error: Cannot start
service ovsdb-server on computer '.'.
At C:\k\antrea\Install-OVS.ps1:214 char:5
+     Start-Service ovsdb-server
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
   ServiceCommandException
    + FullyQualifiedErrorId : CouldNotStartService,Microsoft.PowerShell.Commands.StartServiceCommand

jayunit100 avatar Jul 04 '22 17:07 jayunit100

Could you provide your steps to install Antrea and OVS?

wenyingd avatar Jul 06 '22 02:07 wenyingd

Hi wenying

The scripts in the repo run the Antrea install via PowerShell; they are in the fork/ directory.

If you run make all, you can see them run.

I can add more details if that doesn't help, or we can debug live together.

jayunit100 avatar Jul 07 '22 01:07 jayunit100

@jayunit100 Thanks for providing the OVS installation logs. It looks like the ovsdb-server service fails to start, which blocks the later installation steps. The issue is probably related to several OpenSSL DLLs missing on the Windows host. Could you try downloading and installing this exe file (https://slproweb.com/download/Win64OpenSSL-3_0_5.exe) on the affected host, and then reinstall OVS?

wenyingd avatar Jul 11 '22 02:07 wenyingd

Right now we use "Win64OpenSSL-3_0_3.exe". Is that OK, or do we need 3_0_5?

jayunit100 avatar Jul 13 '22 14:07 jayunit100

  1. @wenyingd, how did you find that SSL error?

  2. Testing with https://slproweb.com/download/Win64OpenSSL-3_0_5.exe now; will let you know if it works.

  3. Meanwhile, @wenyingd, do you see any flaws with our existing installer for this? https://github.com/kubernetes-sigs/sig-windows-dev-tools/tree/master/sync/windows

jayunit100 avatar Jul 13 '22 14:07 jayunit100

@wenyingd what's the "right" way to install OpenSSL from our PowerShell automation?

jayunit100 avatar Jul 13 '22 15:07 jayunit100

looks like

Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0

from @marosset works? Will see if that can be used for SSH; maybe I can do the same with SSL.

jayunit100 avatar Jul 13 '22 16:07 jayunit100

Per Mark Rossetti, something like:

  • add an openssl.ps1 script to https://github.com/kubernetes-sigs/sig-windows-dev-tools/tree/master/sync/windows
  • have it wget the OpenSSL MSI
  • then use msiexec.exe with no UI

will try that later, then will retry the antrea install
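
The last step above could be sketched roughly as follows. This is only an illustration in Python, assuming the MSI has already been downloaded: the function names and the example path are hypothetical, and whether a particular OpenSSL MSI installs cleanly without prompting has not been confirmed in this thread. The msiexec switches used are /i (install), /qn (no UI), /norestart, and /l*v (verbose log).

```python
import subprocess


def build_silent_msi_command(msi_path, log_path=None):
    """Build an msiexec command line that installs an MSI without showing any UI."""
    cmd = ["msiexec.exe", "/i", str(msi_path), "/qn", "/norestart"]
    if log_path is not None:
        # Verbose logging helps when a silent install fails with no visible error.
        cmd += ["/l*v", str(log_path)]
    return cmd


def install_msi(msi_path, log_path=None):
    """Run the silent install; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_silent_msi_command(msi_path, log_path), check=True)
```

For example, install_msi(r"C:\k\openssl.msi", r"C:\k\openssl-install.log") would run the install on a Windows node; both paths here are made up for the example.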

jayunit100 avatar Jul 13 '22 16:07 jayunit100

I had some offline discussions with @knabben, and we ran into a problem when installing the OpenSSL exe file: it shows an EULA popup on the console that waits for the user to agree, which is not acceptable for automation. @knabben has another patch that copies the DLL files directly to the system path, but this might introduce conflicts on different OS versions.

Personally, I prefer to install the SSL DLLs through a dedicated software installer, but the EULA popup is the biggest challenge for automation. Another thought: installing OpenSSH could also install the required DLL files, but it would bring in other files we don't need.

wenyingd avatar Jul 14 '22 06:07 wenyingd

I remember @knabben saying that the working version of OpenSSL is 1.0.2u, which is no longer available from the original download link. I think that version works because it has no EULA prompt to get stuck on. So maybe we could try downloading it from https://indy.fulgan.com/SSL/ (I found a link to download openssl_x64_1.0.2u on this site), which is also a link recommended for Windows OpenSSL when installing OVS.

wenyingd avatar Jul 14 '22 06:07 wenyingd

you mean https://indy.fulgan.com/SSL/openssl-1.0.2u-x64_86-win64.zip ?

jayunit100 avatar Jul 15 '22 01:07 jayunit100

OK, I uploaded https://storage.googleapis.com/jayunit100/openssl.exe to try it in the recipe, along with the OpenSSL license in the same bucket. Testing now.

jayunit100 avatar Jul 15 '22 01:07 jayunit100

> you mean https://indy.fulgan.com/SSL/openssl-1.0.2u-x64_86-win64.zip ?

Yes, this is a recommended link to download a version which works.

wenyingd avatar Jul 15 '22 02:07 wenyingd

That didn't seem to work. Trying:

choco install openssl --yes

... will update tomorrow...

PS @wenyingd ... why does Antrea need OpenSSL ?

jayunit100 avatar Jul 16 '22 01:07 jayunit100

I didn't have any luck with this either. We'll need to dig further into what the right, sustainable path is for installing ovsdb-server on Windows nodes.

jayunit100 avatar Jul 17 '22 02:07 jayunit100

> That didn't seem to work. Trying:
>
> choco install openssl --yes
>
> ... will update tomorrow...
>
> PS @wenyingd ... why does Antrea need OpenSSL ?

OpenSSL is required by OVS, not Antrea. The OVS binaries are compiled with OpenSSL support, so they also need it at runtime.

wenyingd avatar Jul 18 '22 03:07 wenyingd

So, when we say runtime here, why? Isn't OpenSSL just a protocol that is implemented by a library compiled into the binary for OVS? Sorry for the naive question...

I don't think we should need to do things like choco install openssl and so on to put a .exe onto the host to run the Antrea agent... but maybe I'm missing something.

jayunit100 avatar Jul 19 '22 03:07 jayunit100

My wording was not clear. My point is that when we run the ovsdb-server/ovs-vswitchd processes, they search for the DLL files on the Windows host. If they fail to find the DLLs they depend on, the binaries are not able to run.

choco install openssl actually places the two DLL files "libeay32.dll" and "ssleay32.dll" in the Windows system path (C:\windows\system32), which is one of the paths searched for DLLs. So after OpenSSL is installed, the ovsdb-server/ovs-vswitchd processes can find the DLL files when we run them. We don't want the OpenSSL exe file itself, but OVS needs the two DLL files that are installed along with the exe.
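
A minimal sketch of checking for those two DLLs, written in Python only for illustration (on the node itself, PowerShell's Test-Path would do the same job). The DLL file names come from the explanation above; the function name and the default directory are assumptions, and the sketch only looks in the directories it is given rather than reproducing the full Windows DLL search order.

```python
from pathlib import Path

# DLL names as given in this thread; the default directory is an assumption.
REQUIRED_DLLS = ("libeay32.dll", "ssleay32.dll")
DEFAULT_DIRS = (r"C:\Windows\System32",)


def find_missing_dlls(directories=DEFAULT_DIRS, dll_names=REQUIRED_DLLS):
    """Return the DLL names that are not present in any of the given directories."""
    missing = []
    for name in dll_names:
        if not any((Path(d) / name).is_file() for d in directories):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_dlls()
    if missing:
        print("Missing OpenSSL DLLs:", ", ".join(missing))
    else:
        print("Both OpenSSL DLLs were found.")
```

An empty result would suggest the DLLs are where ovsdb-server can find them; a non-empty one would be consistent with the start failure described in this issue.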

wenyingd avatar Jul 19 '22 03:07 wenyingd

Let me dig into this one. /assign

dougsland avatar Aug 06 '22 14:08 dougsland

thanks @dougsland !

jayunit100 avatar Aug 07 '22 14:08 jayunit100

OK,

  • so the key requirement is: install the OpenSSL DLL files into "C:/windows/system32"
  • if we use choco install openssl, we then expect the "Service 'ovsdb-server (ovsdb-server)' cannot be started due to the following error: Cannot start service ovsdb-server on computer '.'." error to go away?

@wenyingd ? If so, I think that will allow Doug to finish implementing this.

QUESTION

@wenyingd ... In one of the experiments above, I ran choco install openssl --yes, but it didn't seem to fix this issue. Was I missing some detail?

jayunit100 avatar Aug 07 '22 14:08 jayunit100

> OK,
>
>   • so the key requirement is: install the OpenSSL DLL files into "C:/windows/system32"
>   • if we use choco install openssl, we then expect the "Service 'ovsdb-server (ovsdb-server)' cannot be started due to the following error: Cannot start service ovsdb-server on computer '.'." error to go away?
>
> @wenyingd ? If so, I think that will allow Doug to finish implementing this.
>
> QUESTION
>
> @wenyingd ... In one of the experiments above, I ran choco install openssl --yes, but it didn't seem to fix this issue. Was I missing some detail?

You are correct: the key requirement is to install the OpenSSL DLL files into "C:/windows/system32". One thing I want to add is that, besides C:/windows/system32, installing the DLL files into "c:/openvswitch/usr/sbin" also resolves the ovsdb-server start failure.

As for your question about choco install openssl --yes not resolving the issue in your experience, could you check whether the two files "ssleay32.dll" and "libeay32.dll" are located in "C:\windows\system32"? If not, this is possibly not a valid resolution. Besides, upstream Antrea has merged a fix for the issue (https://github.com/antrea-io/antrea/issues/4027), so with the latest code on the Antrea main branch, no additional steps to copy the DLL files should be needed. However, the fix is not included in the release branches, so the issue may still happen if code from a release branch is used.
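
One way to try the second location is to copy the two DLLs next to the OVS binaries. A minimal sketch in Python, assuming the DLLs have already been extracted somewhere on the node; the function name and the source directory in the usage note are hypothetical, and copying DLLs by hand may conflict across OS versions, as mentioned earlier in this thread.

```python
import shutil
from pathlib import Path

# DLL names as given in this thread.
DLL_NAMES = ("libeay32.dll", "ssleay32.dll")


def copy_openssl_dlls(source_dir, target_dir, dll_names=DLL_NAMES):
    """Copy each named DLL from source_dir into target_dir and return the new paths."""
    source, target = Path(source_dir), Path(target_dir)
    # Fail before copying anything if the source is incomplete.
    missing = [name for name in dll_names if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"not found in {source}: {', '.join(missing)}")
    target.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(source / name, target / name)) for name in dll_names]
```

A call such as copy_openssl_dlls(r"C:\k\openssl-1.0.2u", r"C:\openvswitch\usr\sbin") would place the files beside ovsdb-server; the source directory there is made up for the example.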

wenyingd avatar Aug 08 '22 03:08 wenyingd

OK, so I think we're saying:

  • that choco install openssl --yes might install to the wrong path
  • the fix in (https://github.com/antrea-io/antrea/issues/4027) will download OpenSSL for us, so that if it's not there, we just get a fresh installation

So, there are two possible fixes we can do?

jayunit100 avatar Aug 10 '22 10:08 jayunit100

> OK, so I think we're saying
>
> So, there are two possible fixes we can do?

yes

wenyingd avatar Aug 10 '22 10:08 wenyingd

@wenyingd hey, tried updating the CNI to 1.8.0 and still no good. Please see: https://github.com/kubernetes-sigs/sig-windows-dev-tools/pull/204

If you can, please review the patch anyway; I believe we must update the CNI, as it's far behind. Going to run new tests tonight.

dougsland avatar Aug 29 '22 13:08 dougsland