Explore curl support

Open adamziel opened this issue 11 months ago • 52 comments

What is this PR doing?

Explores building PHP with libcurl support

CURL builds, PHP builds with the --with-curl flag, and curl_init() etc. run as expected.

However, running the following PHP snippet fails:

<?php 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://wordpress.org');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
var_dump($output);
var_dump(curl_error($ch));
curl_close($ch);

[Screenshot: CleanShot 2024-03-22 at 18 17 12@2x]

Reproduction link: http://localhost:5400/website-server/?php=8.0&wp=6.4&storage=none&php-extension-bundle=kitchen-sink&url=/test-curl.php

Curl likely runs fork() internally, similarly to PHP's proc_open(). Getting it to work in Playground will require patching curl source code to remove that fork() call and, likely, replace it with a JavaScript function call – similarly to the proc_open() patch.

To rebuild curl, run:

cd packages/php-wasm/compile
rm -rf libcurl/dist
make libcurl
cd ../../../
nx reset; npm run recompile:php:web:kitchen-sink:8.0

cc @mho22 – I spent an hour here just to get to the first roadblock. I won't be able to spend more time here for now – you're more than welcome to take over. I'd love to see a functional CURL extension!

Related resources

  • https://github.com/WordPress/wordpress-playground/issues/85
  • https://github.com/WordPress/wordpress-playground/pull/1093

adamziel avatar Mar 22 '24 17:03 adamziel

File descriptors, fork and NTLM

An application that uses libcurl and invokes fork() gets all file descriptors duplicated in the child process, including the ones libcurl created. libcurl itself uses fork() and execl() if told to use the CURLAUTH_NTLM_WB authentication method which then invokes the helper command in a child process with file descriptors duplicated. Make sure that only the trusted and reliable helper program is invoked!

https://github.com/curl/curl/blob/647e86a3efe1eea7a2a456c009cfe1eb55fe48eb/docs/libcurl/libcurl-security.md?plain=1#L452C1-L462C1

bgrgicak avatar Mar 25 '24 10:03 bgrgicak

NTLM and NTLM_WB are both set to no in PHP info. I don't think that this is caused by using fork.

The error message (getaddrinfo() thread failed to start) is documented in curl. We could try to disable AsynchDNS and see if this resolves the issue.

bgrgicak avatar Apr 01 '24 08:04 bgrgicak

Disabling AsynchDNS resolved the "thread failed to start" error.

Now the test request times out and, from the sound of my fans, it keeps doing something in the background. I haven't debugged it.

bgrgicak avatar Apr 01 '24 09:04 bgrgicak

The verbose output has some insights:

* Trying 172.29.1.0:80...
* Could not set TCP_NODELAY: Protocol not available
* Connection timed out after 10000 milliseconds
* Closing connection 0
bool(false)
string(45) "Connection timed out after 10000 milliseconds"

I see that we use TCP_NODELAY in PHP-WASM, but I don't know where this is coming from: TCP_NODELAY: Protocol not available

bgrgicak avatar Apr 01 '24 11:04 bgrgicak

curl_setopt($ch, CURLOPT_TCP_NODELAY, 0); resolves the TCP_NODELAY warning, but now I'm back to timeouts.

bgrgicak avatar Apr 01 '24 11:04 bgrgicak

I need to wrap up now, this is what I found today:

  • NTLM and NTLM_WB are disabled so they won't trigger fork
  • AsynchDNS started a new thread and that was resolved by disabling it
  • TCP_NODELAY is enabled by default, but it doesn't work (Protocol not available), disabling it in the request works, but I'm not sure if it's required
  • Requests are now timing out without any errors. I assume that the request isn't properly sent to WASM, or that the response isn't properly returned, but I wasn't able to debug this part today.

bgrgicak avatar Apr 01 '24 11:04 bgrgicak

I attempted another approach on my end by trying to run php-wasm/node with curl. It does appear in my modules list when I run it. I created a file named curl.php:

<?php

$ch = curl_init();

curl_setopt( $ch, CURLOPT_URL, 'http://wordpress.org' );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );

var_dump( curl_version() );
var_dump( curl_getinfo( $ch ) );

$output = curl_exec( $ch );
var_dump( $output );
var_dump( curl_error( $ch ) );
curl_close( $ch );

I still get the same error of course :

bool(false)
string(37) "getaddrinfo() thread failed to start\n"

So I tried to make a comparison with built-in php, where no error occurs when running php curl.php.

curl_version() in php-wasm :

["ssl_version"]=>
 string(0) ""
["libz_version"]=>
 string(0) ""

curl_version() in PHP 8.3 :

["ssl_version"]=>
string(13) "OpenSSL/3.1.4"
["libz_version"]=>
string(5) "1.3.1"

Additionally, the following information is present in PHP 8.3 but missing in php-wasm when running curl_getinfo($ch):

["effective_method"]=>
 string(3) "GET"
["capath"]=>
 string(14) "/etc/ssl/certs"
["cainfo"]=>
 string(17) "/etc/ssl/cert.pem"

I'm not sure if this information could be helpful, but it's something I noticed.

mho22 avatar Apr 02 '24 10:04 mho22

Thank you @mho22! I have an hour now and can take a look at it.

bgrgicak avatar Apr 02 '24 10:04 bgrgicak

Fixed the link https://github.com/curl/curl/blob/master/configure.ac

bgrgicak avatar Apr 02 '24 11:04 bgrgicak

It looks like this will take some effort to find the correct combination of flags and link all required libraries. For example, the scp protocol which I assume we need, requires libssh. We need to add libssh, build it, and link it.

bgrgicak avatar Apr 02 '24 11:04 bgrgicak

@mho22 feel free to take over, I'm not sure if I will have time to work more on this.

bgrgicak avatar Apr 02 '24 11:04 bgrgicak

scp protocol which I assume we need

I only had the http:// and https:// support in mind for the first iteration here. That's about what the browser can support anyway. Anything beyond that would make a great follow-up effort, but I wouldn't block v1 on it.

adamziel avatar Apr 02 '24 12:04 adamziel

I don't know why it produces an error here but :

cd packages/php-wasm/compile
rm -rf libcurl/dist
make libcurl

returns :

#14 17.84   CC       ../lib/curl-nonblock.o
#14 17.91   CC       ../lib/curl-warnless.o
#14 17.98   CC       ../lib/curl-curl_ctype.o
#14 18.05   CCLD     curl
#14 18.15 wasm-ld: error: duplicate symbol: curlx_strtoofft
#14 18.15 >>> defined in ../lib/curl-strtoofft.o
#14 18.15 >>> defined in ../lib/.libs/libcurl.a(libcurl_la-strtoofft.o)
#14 18.15
#14 18.15 wasm-ld: error: duplicate symbol: curlx_nonblock
#14 18.15 >>> defined in ../lib/curl-nonblock.o
#14 18.15 >>> defined in ../lib/.libs/libcurl.a(libcurl_la-nonblock.o)
...

The duplicate symbol errors prevent the script from completing successfully.

mho22 avatar Apr 02 '24 12:04 mho22

I think that's fine, at that point libcurl.a is already created in the filesystem. This is why I put || true in this line:

RUN source /root/emsdk/emsdk_env.sh && EMCC_SKIP="-lc -lz -lcurl" EMCC_FLAGS="-sSIDE_MODULE" emmake make || true

It would be useful to have a comment in place to document that behavior.

adamziel avatar Apr 02 '24 12:04 adamziel

@adamziel Ok thank you. Is it possible that curl has no SSL and libz support even though this portion of code adds them [ or at least openssl ] :

RUN CPPFLAGS="-I/root/lib/include " \
    LDFLAGS="-L/root/lib/lib " \
    PKG_CONFIG_PATH=$PKG_CONFIG_PATH \
    source /root/emsdk/emsdk_env.sh && \
    emconfigure ./configure \
        --build i386-pc-linux-gnu \
        --target wasm32-unknown-emscripten \
        --prefix=/root/install/ \
        --disable-shared \
        --enable-static \
        --with-openssl \
        --enable-https \
        --enable-http

I suspect curl does not properly load openssl and zlib, as the output of var_dump( curl_version() ); suggests.

I currently don't have the tools to investigate this. But I should try :

  • I suppose make libcurl will run the script from compile/Makefile of course.
  • It will run base-image libz and libopenssl scripts before all.
  • create a dist directory dist/root/lib in libcurl
  • run the libcurl/Dockerfile and return the different resulting directories curl-7.69.1/libs/.libs -> libcurl/dist/root/lib/lib and curl-7.69.1/include -> ./libcurl/dist/root/lib/include.

This assumes that the Dockerfile script runs correctly. We can consider having curl with openssl [ openssl having zlib ].

  • running npm run recompile:php:node:8.0 should then add curl to php thanks to this :
# Add curl if needed
RUN if [ "$WITH_CURL" = "yes" ]; \
	then \
		echo -n ' --with-curl=/root/lib ' >> /root/.php-configure-flags; \
		echo -n ' /root/lib/lib/libcurl.a' >> /root/.emcc-php-wasm-sources; \
	fi;

And in fact if we display phpinfo() , curl exists. But something is missing between php-wasm phpinfo() and php phpinfo() :

php-wasm phpinfo() :

curl

cURL Information => 7.69.1
Age => 5
IPv6 => No
libz => No
NTLM => No
SSL => No
TLS-SRP => No
HTTP2 => No
HTTPS_PROXY => No
Host => i386-pc-linux-gnu

curl.cainfo => no value => no value

php8.3 phpinfo()

curl

cURL Information => 8.6.0
Age => 10
IPv6 => Yes
libz => Yes
NTLM => Yes
SSL => Yes
TLS-SRP => Yes
HTTP2 => Yes
HTTPS_PROXY => Yes
ALTSVC => Yes
HTTP3 => No
UNICODE => No
ZSTD => No
HSTS => Yes
GSASL => No
Protocols => ftps, gophers, https, imaps, ldap, ldaps, mqtt,pop3s, smb, smbs, smtps
Host => Darwin
SSL Version => OpenSSL/3.1.4
ZLib Version => 1.3.1

curl.cainfo => .../config/php/cacert.pem

I only displayed the differences between the two curls. libz, SSL are part of the main differences.
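A comparison like this can also be scripted instead of done by eye. The sketch below is only an illustration and not part of the Playground tooling: parseInfo and diffInfo are hypothetical helper names, and the sample strings are a small subset of the two phpinfo() listings above.

```javascript
// Hypothetical helper: turn "key => value" lines (phpinfo() CLI style)
// into a Map. Only the first value after "=>" is kept.
function parseInfo(text) {
  const map = new Map();
  for (const line of text.split('\n')) {
    const parts = line.split('=>').map((s) => s.trim());
    if (parts.length >= 2 && parts[0]) map.set(parts[0], parts[1]);
  }
  return map;
}

// Hypothetical helper: list every key whose value differs between two maps.
function diffInfo(a, b) {
  const out = {};
  for (const key of new Set([...a.keys(), ...b.keys()])) {
    const left = a.get(key) ?? '(missing)';
    const right = b.get(key) ?? '(missing)';
    if (left !== right) out[key] = [left, right];
  }
  return out;
}

// Sample data copied from the listings above (subset only).
const wasmInfo = parseInfo('IPv6 => No\nlibz => No\nSSL => No\nHTTP2 => No');
const nativeInfo = parseInfo('IPv6 => Yes\nlibz => Yes\nSSL => Yes\nHTTP2 => Yes\nHSTS => Yes');
console.log(diffInfo(wasmInfo, nativeInfo));
```

Each entry in the result pairs the php-wasm value with the native value, so keys that exist on one side only show up as "(missing)".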

But what next ?

How can I be sure the problem comes from compile/libcurl/Dockerfile or maybe libcurl/dist/root/lib/lib/libcurl.a or libcurl/dist/root/lib/include/Makefile ? Where should I investigate ?

mho22 avatar Apr 02 '24 14:04 mho22

I only displayed the differences between the two curls. libz, SSL are part of the main differences.

I forgot to push it yesterday. This commit adds zlib. https://github.com/WordPress/wordpress-playground/pull/1133/commits/0c84fd1b4089c8f15019fb37ddffed832d94c68e

bgrgicak avatar Apr 03 '24 06:04 bgrgicak

For SSL, we need to do something similar and provide a path with the --with-openssl flag. I'm trying this now.

bgrgicak avatar Apr 03 '24 06:04 bgrgicak

Done, OpenSSL was missing the includes folder. Here is a path that will print curl info.

Requests are still timing out. As a next step, we could add some breakpoints to see if the request gets "stuck" somewhere.

bgrgicak avatar Apr 03 '24 07:04 bgrgicak

I investigated a lot today. I couldn't find the answer yet but I came across a lot of information. I first had to copy files into the build process to be able to print data from them : lib/multi.c, lib/url.c and lib/connect.c :

libcurl/Dockerfile :

COPY ./libcurl/multi.c /root/$CURL_VERSION/lib/multi.c
COPY ./libcurl/url.c /root/$CURL_VERSION/lib/url.c
COPY ./libcurl/connect.c /root/$CURL_VERSION/lib/connect.c


WORKDIR /root/$CURL_VERSION

I could then inject a lot of logging flags to follow the behavior of our test script. So here is what I have understood so far :

It begins with this file :

multi.c

  1. function multi_socket() called
  2. function curl_multi_perform() called
  3. function multi_runsingle() called
  • CASE CURLM_STATE_INIT entered then while
  • CASE CURLM_STATE_CONNECT entered

function Curl_connect in url.c file is called within the CURLM_STATE_CONNECT case

url.c

  1. function Curl_connect() called
  2. function Curl_setup_conn() called

function Curl_connecthost in connect.c file is called in previous Curl_setup_conn() function

connect.c

  1. function Curl_connecthost() called
  2. function singleipconnect() returns CURLE_OK

It then goes back into the multi_runsingle function mentioned in point 3 and stays in the while loop indefinitely, endlessly entering the CURLM_STATE_WAITRESOLVE case.

multi.c

  1. function multi_runsingle INFINITE Loop While
  • CASE CURLM_STATE_WAITRESOLVE

The error is probably coming from the singleipconnect() function.

Here are multiple results I printed :

line 1191 : result = bindlocal(conn, sockfd, addr.family, Curl_ipv6_scope((struct sockaddr*)&addr.sa_addr)); EQUALS 0

line 1212 : if(!isconnected && (conn->transport != TRNSPRT_UDP)) EQUALS TRUE

line 1251 : rc = connect(sockfd, &addr.sa_addr, addr.addrlen); EQUALS -1 = [ rc = connect( 4, 11319420, 16 ); ]

line 1273 : switch(error) -> return result = CURLE_OK from line 1285;

line 1303 : return result EQUALS CURLE_OK == 0

I suppose something is probably going wrong with the sockfd parameter ?

data->set.fsockopt on line 1184 is false. Should we try to add that fsockopt ?

Another thing :

php-wasm 

Trying 172.29.1.0:80...

php

Trying 198.143.164.252:80...

I tried to add every option mentioned in the singleipconnect() function in the test script :

curl_setopt( $ch, CURLOPT_TCP_NODELAY, 1 );
curl_setopt( $ch, CURLOPT_TCP_KEEPALIVE, 1 );
curl_setopt( $ch, CURLOPT_TCP_FASTOPEN, 1 );

* Could not set TCP_NODELAY: Protocol not available
* Failed to set SO_KEEPALIVE on fd 4
* Failed to enable TCP Fast Open on fd 4

P.S. : if you want to add a new flag, don't forget to add a \n at the end of the infof( data, "message\n" ). Otherwise nothing will be displayed and you will waste a lot of time finding out why.

mho22 avatar Apr 04 '24 02:04 mho22

This is really good debugging @mho22!

The local IP is resolved likely due to this issue:

https://github.com/WordPress/wordpress-playground/issues/400

Regardless, file_get_contents( "https://wordpress.org" ) makes Emscripten start a WebSocket, so Curl should also be able to do that.

sockfd might be the right track – Playground applies a few patches on top of Emscripten to improve fd and sockopt handling.

Here are a few questions I'm thinking of:

  • Is sockfd a valid descriptor, or is it -1?
  • Does Emscripten create a new WebSocket instance? In other words, is this console.log statement triggered? console.log('Called constructor()!');
  • Is ___syscall_connect called in php_8_0.js? An unminified PHP build might be helpful here – to get one you could run emcc with -g2.
  • If it is, where does the execution stop? Is the catch ever triggered? What's the error?
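One low-effort way to answer questions like these is to wrap the function of interest and log each call. The following is a minimal sketch under assumptions: traceCalls and fakeModule are hypothetical names, and the fake connect() only stands in for a function such as ___syscall_connect in the generated php_8_0.js.

```javascript
// Hypothetical helper: wraps target[name] so every call, return value and
// thrown error is logged. Returns a function that reports the call count.
function traceCalls(target, name, log = console.log) {
  const original = target[name];
  let calls = 0;
  target[name] = function (...args) {
    calls += 1;
    log(`${name} call #${calls}`, args);
    try {
      const result = original.apply(this, args);
      log(`${name} returned`, result);
      return result;
    } catch (e) {
      log(`${name} threw`, e);
      throw e; // rethrow so the wrapped function behaves as before
    }
  };
  return () => calls;
}

// Stand-in object for illustration only; not the real Emscripten module.
const fakeModule = {
  connect(fd) {
    return fd >= 0 ? 0 : -1;
  },
};
const connectCalls = traceCalls(fakeModule, 'connect');
fakeModule.connect(18);
console.log('connect was called', connectCalls(), 'time(s)');
```

Because the wrapper rethrows, it shows whether the catch branch is reached without changing what the caller sees.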

adamziel avatar Apr 04 '24 11:04 adamziel

@adamziel Here are the answers :

  1. sockfd equals 18
  2. console.log('Called constructor()!') is never triggered
  3. ___syscall_connect is called 2 times in php_8_0.js
  4. The execution stops in the catch and this object is returned :
[Screenshot 2024-04-04 at 13 48 38] [Screenshot 2024-04-04 at 13 51 06]

Here is a copy of the ___syscall_connect where I added console.log to link with the screenshot :

function ___syscall_connect(fd, addr, addrlen, d1, d2, d3) {
  try {
    var sock = getSocketFromFD(fd);
    console.log(sock); // debug: socket resolved from the file descriptor

    var info = getSocketAddress(addr, addrlen);
    console.log(info); // debug: destination address and port

    sock.sock_ops.connect(sock, info.addr, info.port);
    return 0;
  } catch (e) {
    console.log(e); // debug: error thrown while connecting

    if (typeof FS == "undefined" || !(e.name === "ErrnoError")) throw e;
    return -e.errno;
  }
}

It seems the second time ___syscall_connect is called, we get data from our curl_exec .

[Screenshot 2024-04-04 at 14 00 40]

I hope this is helpful. I'm uncertain about the next steps to take, so I'm looking forward to hearing your insights.

mho22 avatar Apr 04 '24 11:04 mho22

ERRNO 26 is "EINPROGRESS":

https://github.com/WordPress/wordpress-playground/blob/096a01782fc73f7d0aad3ffa8913aa0163fb03f6/packages/php-wasm/node/public/php_8_0.js#L7059

It's documented as follows:

EINPROGRESS The socket socket is non-blocking and the connection could not be established immediately. You can determine when the connection is completely established with select; see Waiting for Input or Output. Another connect call on the same socket, before the connection is completely established, will fail with EALREADY.

It would be interesting to step through the ___syscall_connect execution and get to the point where it throws that exception. I also wonder why the socket at first says 127.0.0.1:5400, and only later says wordpress.org:80. Is CURL reusing the same file descriptor? Or perhaps that's related to Asyncify stack rewinding? It would be interesting to console.log Asyncify.state on each of these calls and compare it to all the entries from Asyncify.State (notice the capital letter), e.g. Asyncify.State.Normal.
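The Asyncify comparison could be done with a reverse lookup like the one sketched here. This is an illustration only: asyncifyStateName is a hypothetical helper, fakeAsyncify stands in for the object Emscripten generates, and the enum values (Normal 0, Unwinding 1, Rewinding 2) are assumed, so check them against the Asyncify.State object in the actual build.

```javascript
// Hypothetical helper: find the name in asyncify.State whose value matches
// the current asyncify.state, so logs show "Normal" instead of a number.
function asyncifyStateName(asyncify) {
  for (const [name, value] of Object.entries(asyncify.State)) {
    if (value === asyncify.state) return name;
  }
  return `Unknown(${asyncify.state})`;
}

// Stand-in for the generated object; the numeric values are an assumption.
const fakeAsyncify = {
  State: { Normal: 0, Unwinding: 1, Rewinding: 2 },
  state: 0,
};
console.log('Asyncify state:', asyncifyStateName(fakeAsyncify));
```

In a real build, the same call could be placed at the top of ___syscall_connect with the global Asyncify object passed in.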

adamziel avatar Apr 04 '24 12:04 adamziel

@adamziel I am on it.

The method sock.sock_ops.connect(sock, info.addr, info.port); throws the error 26 :

  connect(sock, addr, port) {
   if (sock.server) {
    throw new FS.ErrnoError(138);
   }
   if (typeof sock.daddr != "undefined" && typeof sock.dport != "undefined") {
    var dest = SOCKFS.websocket_sock_ops.getPeer(sock, sock.daddr, sock.dport);
    if (dest) {
     if (dest.socket.readyState === dest.socket.CONNECTING) {
      throw new FS.ErrnoError(7);
     } else {
      throw new FS.ErrnoError(30);
     }
    }
   }
   var peer = SOCKFS.websocket_sock_ops.createPeer(sock, addr, port);
   sock.daddr = peer.addr;
   sock.dport = peer.port;
   throw new FS.ErrnoError(26);
  },

The error is thrown on the last line.

As for Asyncify.state, both syscalls are of type Asyncify.State.Normal

[Screenshot 2024-04-04 at 14 27 28]

mho22 avatar Apr 04 '24 12:04 mho22

Aha, so error code 26 seems like the correct outcome – that function has no code path that finishes without throwing anyway. Still, in this case the SOCKFS.websocket_sock_ops.createPeer() method should be called, and that's where the WebSocket is created. It seems like it fails in some way and never gets to start that WS connection – I wonder why that is.
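To make the expected outcome concrete, here is a small sketch of how a caller could interpret the value returned by ___syscall_connect. It is not curl's or Emscripten's code: classifyConnectResult is a hypothetical helper, and apart from 26 (EINPROGRESS, confirmed above) the errno names attached to 7, 30 and 138 are assumptions based on Emscripten's errno table.

```javascript
// Errno numbers seen in the connect() listing above. Only 26 is confirmed
// in this thread; the other names are assumptions.
const ERRNO_NAMES = {
  7: 'EALREADY',
  26: 'EINPROGRESS',
  30: 'EISCONN',
  138: 'EOPNOTSUPP',
};

// Hypothetical helper: ret is 0 on success or a negated errno, matching the
// "return -e.errno" convention of ___syscall_connect.
function classifyConnectResult(ret) {
  if (ret === 0) return 'connected';
  const name = ERRNO_NAMES[-ret] ?? `errno ${-ret}`;
  // A non-blocking connect that is still under way is not a failure.
  if (name === 'EINPROGRESS' || name === 'EALREADY') return 'pending';
  if (name === 'EISCONN') return 'connected';
  return 'failed';
}

console.log(classifyConnectResult(-26));
```

Under this reading, a "pending" result means the caller still has to wait for the socket to become ready before sending anything.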

adamziel avatar Apr 04 '24 12:04 adamziel

@adamziel Do you have any suggestions for a comparison I could conduct to troubleshoot why our case is failing when running the ___syscall_connect method?

The var peer = SOCKFS.websocket_sock_ops.createPeer(sock, addr, port); returns a valid peer and the data from the peer is given to the sock, so after the catch , the sock should be operational.

the peer object :

[Screenshot 2024-04-04 at 15 17 17]

This probably indicates that the websocket is correctly created. However, communication is not established.

mho22 avatar Apr 04 '24 13:04 mho22

Oh! It's the FetchWebsocketConstructor, it doesn't have a console.log() in its constructor – so it's actually created correctly! It seems like console.log('Send called with ', data); never shows up in the console so libcurl never attempts to send any data.

The problem could be with how _wasm_poll_socket implements waiting for the connection to be ready. Also, I wonder which readyState values the WS instance reports?
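The readyState question can be explored in isolation with a polling helper like the sketch below. It does not show how _wasm_poll_socket works: waitUntilOpen and fakeSocket are hypothetical, and only the readyState numbers (0 to 3) come from the standard WebSocket API.

```javascript
// Standard WebSocket readyState values.
const READY_STATE = { CONNECTING: 0, OPEN: 1, CLOSING: 2, CLOSED: 3 };

// Hypothetical helper: resolves true once ws.readyState is OPEN, and false
// if the socket is closing/closed or the timeout expires first.
function waitUntilOpen(ws, timeoutMs = 1000, intervalMs = 10) {
  return new Promise((resolve) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (ws.readyState === READY_STATE.OPEN) return resolve(true);
      if (ws.readyState >= READY_STATE.CLOSING || Date.now() >= deadline) {
        return resolve(false);
      }
      setTimeout(check, intervalMs);
    };
    check();
  });
}

// Stand-in socket that "connects" after 20 ms; for illustration only.
const fakeSocket = { readyState: READY_STATE.CONNECTING };
setTimeout(() => {
  fakeSocket.readyState = READY_STATE.OPEN;
}, 20);
waitUntilOpen(fakeSocket).then((ok) => console.log('socket open:', ok));
```

Logging the readyState inside check() would show whether the instance ever leaves CONNECTING.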

adamziel avatar Apr 04 '24 14:04 adamziel

@adamziel Yes sorry, I thought my screenshot made it clear that this was a FetchWebsocketConstructor, so no console.log was needed to see that it probably created a socket.

It seems the _wasm_poll_socket is never called. I added a simple console.log in it and it never triggered.

I see wasm_poll_socket is called in the custom php_pollfd_for and wasm_select functions in php-wasm.c. Maybe curl_exec does not call these functions ?

mho22 avatar Apr 04 '24 14:04 mho22

Oooh I think you're right! That function is wired by patching PHP, not by replacing the libc function. That could be the root cause of this issue! There are two ways forward here:

  1. Patch libcurl (and run into this issue again in the future)
  2. Replace select with _wasm_select – I'm not sure how viable that is, though

adamziel avatar Apr 04 '24 14:04 adamziel

@adamziel What do you mean by replacing select with _wasm_select ? In fact, I am not sure I fully understand the first option either 😅.

mho22 avatar Apr 04 '24 15:04 mho22