webots_ros2

Webots Ros2 program (any of the demos) running in Ubuntu (Windows WSL2) is not able to connect to Webots host simulator in Windows 11.

Open andrespineda opened this issue 9 months ago • 4 comments

Describe the Bug

After installing Webots on a Windows 11 host configured with Ubuntu 22.04 on WSL2, every demo program fails when trying to connect to the Webots simulator running on Windows.

Steps to Reproduce

  1. Install the ROS 2 and Webots code as specified in the ROS 2 documentation: https://docs.ros.org/en/foxy/Tutorials/Advanced/Simulators/Webots/Installation-Windows.html#install-wsl2.
  2. After installation, attempt to run the provided demo script: ros2 launch webots_ros2_universal_robot multirobot_launch.py.

Expected behavior

The code should start and open the Webots simulator window on the Windows host, then start sending commands to the robot in the simulation to perform the demo actions. Instead, the program starts up but is unable to connect to the host. The screen continuously displays an error saying that it cannot connect and will retry in a few seconds. It never manages to connect.

Affected Packages

All of the demo programs, such as epuck, universal_robot, etc.

Screenshots

Not available

System

  • Webots Version: [R2025a]
  • ROS Version: [Jazzy]
  • Operating System: [Windows 11, WSL2 Ubuntu 24.04.2 LTS]
  • Graphics Card: [Unknown]

Additional context

I spent over a full day trying to figure this out and found a solution.

In my case, the issue was that the file "webots_ros2_driver/lib/python3.12/site-packages/webots-driver/utils.py" has a function named "get_wsl_ip_address()". This function reads the "nameserver" IP address from the WSL2 "/etc/resolv.conf" file and incorrectly returns it as the IP address of the Windows host. Because of this, the Ubuntu Webots controller keeps trying to connect to the Webots simulator running on the Windows side using the nameserver IP address as the TCP address. On my system, the nameserver was set to 127.0.0.53, so "webots-controller" kept trying to connect to TCP 127.0.0.53 port 1234.

To prove that this was the problem, I manually edited the "/etc/resolv.conf" file and changed the nameserver address to "127.0.0.1".

This worked: I was then able to run "ros2 launch webots_ros2_epuck robot_launch.py" and it connected to the Webots simulator using TCP 127.0.0.1 port 1234. Of course, this only works while the terminal is open. Once I close the terminal and open it again, /etc/resolv.conf changes back and it starts to fail again.

Changing the nameserver address is not the correct way to fix the problem, but it got me to a working demo so I can move on.

I consider the code in utils.py to be an error, as the nameserver IP address is not the correct value to use to connect to the Webots simulator on the Windows host.
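To make the failure mode concrete, here is a minimal Python sketch (not the package's actual code) that pulls the first nameserver entry out of resolv.conf-style text and checks whether it is a loopback address. The names first_nameserver and is_loopback and the sample text are made up for illustration.

```python
import ipaddress


def first_nameserver(resolv_conf_text):
    """Return the first 'nameserver' address found in the text, or None."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Commented-out entries start with '#', so they never match here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None


def is_loopback(address):
    """True if the address is in the 127.0.0.0/8 (or ::1) loopback range."""
    return ipaddress.ip_address(address).is_loopback


# Hypothetical sample mirroring the value reported above.
sample = "nameserver 127.0.0.53\n"
address = first_nameserver(sample)
print(address, is_loopback(address))
```

A loopback nameserver can only ever point back at the WSL instance itself, so it cannot be the address of the Windows host.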

andrespineda avatar Feb 16 '25 01:02 andrespineda

Hi, I ran into the same problem. I tried using "Execute_Process" to select the correct Windows IP for the webots_controller connection, and that worked, but then another problem occurred. It threw the exception below:

[webots-controller-2] terminate called after throwing an instance of 'std::runtime_error'
[webots-controller-2] what(): Error: The Python module with the WebotsNode class cannot be executed.
[webots-controller-2] [ros2run]: Aborted

Have you run into this problem too? Could you tell me how to fix it? Thanks.

pantagnun avatar Feb 28 '25 07:02 pantagnun

Hello, I was having the same problem. After hours of debugging, I think I have figured out the source of the issue.

WSL is configured by default to use a NAT architecture for the communication between Windows and WSL.

The official Microsoft documentation on WSL networking explains that the WSL system has its own IP address within a sub-network, and that Windows can connect to any WSL program on localhost as usual. But for a WSL program to connect to a Windows program, it has to use the proper IP within the WSL sub-network.

The Windows machine is configured as the gateway within this sub-network; we can look up its IP by running ip route within WSL:

$ ip route
default via 172.30.32.1 dev eth0 proto kernel
172.30.32.0/20 dev eth0 proto kernel scope link src 172.30.43.162

It's the one in the first line: 172.30.32.1.
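As a rough illustration of reading that value programmatically, the sketch below parses ip route output and returns the address after "default via". The function names are hypothetical and this is not the code used by webots_ros2; it only assumes the output format shown above.

```python
import subprocess


def parse_default_gateway(ip_route_output):
    """Return the gateway from the 'default via <ip> ...' line, or None."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


def default_gateway():
    """Run 'ip route' (Linux only) and return the default gateway, or None."""
    result = subprocess.run(
        ["ip", "route"], capture_output=True, text=True, check=True
    )
    return parse_default_gateway(result.stdout)


# The output quoted above, used here as sample input.
sample = (
    "default via 172.30.32.1 dev eth0 proto kernel\n"
    "172.30.32.0/20 dev eth0 proto kernel scope link src 172.30.43.162\n"
)
print(parse_default_gateway(sample))
```

Keeping the parsing separate from the subprocess call makes the logic easy to check against captured output without needing a WSL machine.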

The webots_ros2 package works in WSL by running Webots on Windows like any normal Windows program and connecting to it using TCP. In order to connect it has to use the IP mentioned earlier, 172.30.32.1.
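A quick way to check whether a candidate address is reachable at all is a plain TCP connection attempt. The helper below is a hypothetical sketch, independent of webots_ros2; port 1234 is the port reported in the first post of this thread.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, can_connect("172.30.32.1", 1234) should return True from inside WSL while Webots is running on Windows and listening on that port, assuming the Windows firewall allows the connection.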

The logic for determining the IP lives in .../webots_ros2_driver/utils.py. It also handles other kinds of systems, such as Docker and macOS, but we're only interested in WSL here.

In the following 2 functions: https://github.com/cyberbotics/webots_ros2/blob/329486a5b393da249165ad57c1160001e80e27a1/webots_ros2_driver/webots_ros2_driver/utils.py#L149-L151 and https://github.com/cyberbotics/webots_ros2/blob/329486a5b393da249165ad57c1160001e80e27a1/webots_ros2_driver/webots_ros2_driver/utils.py#L154-L158, we can see how the IP is determined:

get_host_ip() if has_shared_folder() else get_wsl_ip_address()

has_shared_folder() is True when there's an environment variable named WEBOTS_SHARED_FOLDER defined, which is not the case with the WSL setup.

https://github.com/cyberbotics/webots_ros2/blob/329486a5b393da249165ad57c1160001e80e27a1/webots_ros2_driver/webots_ros2_driver/utils.py#L110-L111

This leaves us with get_wsl_ip_address():

https://github.com/cyberbotics/webots_ros2/blob/329486a5b393da249165ad57c1160001e80e27a1/webots_ros2_driver/webots_ros2_driver/utils.py#L87-L107

It uses the IP address specified in /etc/resolv.conf for connecting to the Windows Webots instance.

Upon checking the file I have found the following:

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 10.255.255.254

10.255.255.254 is not the gateway IP! By checking ip addr we can see that it is actually an address assigned to the loopback interface lo (so it behaves like 127.0.0.1):

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.255.255.254/32 brd 10.255.255.254 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:62:37:e2 brd ff:ff:ff:ff:ff:ff
    inet 172.30.43.162/20 brd 172.30.47.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::215:5dff:fe62:37e2/64 scope link
       valid_lft forever preferred_lft forever

Why? Does WSL have a DNS server running within itself?

Well, it's a new feature called "DNS Tunneling", and as mentioned in https://github.com/microsoft/WSL/issues/12101#issuecomment-2381829266, WSL dnsTunneling became enabled by default.

So, using the IP address in resolv.conf is not a reliable method for WSL...

Actually the get_host_ip() method would have done the job as it uses ip route: https://github.com/cyberbotics/webots_ros2/blob/329486a5b393da249165ad57c1160001e80e27a1/webots_ros2_driver/webots_ros2_driver/utils.py#L129-L141

A quick, hackish way to validate the hypothesis is to modify resolv.conf and set the nameserver to the gateway IP:

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
#nameserver 10.255.255.254
nameserver 172.30.32.1

Launching the webots_ros2 example should now work fine, temporarily.

Once WSL is restarted, the resolv.conf file will be regenerated, overwriting these changes. Alternatively, if the configuration file is reloaded, DNS resolution in WSL might break.

Rami-Sabbagh avatar Mar 11 '25 14:03 Rami-Sabbagh

Disabling the "DNS Tunneling" feature and restarting WSL solves the issue, but we lose the feature. I think the feature can be detected in code with a hardcoded check on the IP extracted from resolv.conf: if it's 10.255.255.254, then the feature is active.
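For reference, DNS tunneling is controlled from the .wslconfig file in the Windows user profile folder. The fragment below is a sketch based on Microsoft's WSL configuration documentation; on some WSL versions the key may live under a different section (for example [experimental]), so treat the section name as an assumption to verify.

```ini
; %UserProfile%\.wslconfig on the Windows side
[wsl2]
dnsTunneling=false
```

After saving the file, run wsl --shutdown from Windows and reopen the distribution so the change takes effect.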

When it's active and NAT networking is used, ip route gives the right result.
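The check described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a patch for utils.py: the function names are hypothetical, 10.255.255.254 is simply the value observed in this thread rather than a documented constant, and the extra loopback test is added to cover the 127.0.0.53 case from the first post. Mirrored networking mode is not handled.

```python
import ipaddress

# Assumption: value seen in resolv.conf when DNS tunneling is active (from this thread).
DNS_TUNNELING_NAMESERVER = "10.255.255.254"


def nameserver_from(resolv_conf_text):
    """Return the first uncommented 'nameserver' address, or None."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None


def gateway_from(ip_route_text):
    """Return the address on the 'default via <ip>' line, or None."""
    for line in ip_route_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[:2] == ["default", "via"]:
            return fields[2]
    return None


def looks_unusable(nameserver):
    """True if the nameserver cannot be the Windows host address."""
    if nameserver is None or nameserver == DNS_TUNNELING_NAMESERVER:
        return True
    try:
        return ipaddress.ip_address(nameserver).is_loopback
    except ValueError:
        return True  # not a valid IP address at all


def pick_windows_host_ip(resolv_conf_text, ip_route_text):
    """Prefer the nameserver, but fall back to the default gateway (NAT mode)."""
    nameserver = nameserver_from(resolv_conf_text)
    if looks_unusable(nameserver):
        return gateway_from(ip_route_text)
    return nameserver


resolv_conf = "nameserver 10.255.255.254\n"
ip_route = "default via 172.30.32.1 dev eth0 proto kernel\n"
print(pick_windows_host_ip(resolv_conf, ip_route))
```

Both inputs are passed in as text so the decision logic can be tested against captured files and command output.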

This leaves the mirrored networking mode to solve.

Rami-Sabbagh avatar Mar 11 '25 15:03 Rami-Sabbagh

Are there any simpler solutions? I have the exact same issue.

myudak avatar Apr 24 '25 18:04 myudak