PHP: Support http:// https:// and ssl:// stream wrappers

Open adamziel opened this issue 11 months ago • 8 comments

What is this PR doing?

Intercepts all network traffic coming from PHP and handles http:// and https:// requests using fetch(). This enables using the native networking features of PHP without implementing custom transport classes.

🚧 Work in progress – this PR needs discussion and cleaning up before it can be shipped 🚧

How is it implemented?

Emscripten can be configured to stream all network traffic through a WebSocket. @php-wasm/node and wp-now use that to access the internet via a local WebSocket->TCP proxy, but the in-browser version of WordPress Playground exposes no such proxy.

This PR ships a "fake" WebSocket implementation that doesn't initiate any ws:// connection. Instead, it mocks the WebSocket interface and analyzes the connection address and transmitted bytes to infer a corresponding fetch() call.
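To make the idea concrete, here is a minimal sketch of such a stand-in. The class name `FakeWebSocket`, the `handleOutgoingBytes` callback, and the `emitIncoming()` method are hypothetical names chosen for illustration and are not taken from this PR. It assumes the consumer only relies on `binaryType`, `readyState`, `send()`, `close()`, and the `on*` handlers.

```javascript
// Hypothetical sketch: looks like a WebSocket, but never opens a ws:// connection.
class FakeWebSocket {
	static CONNECTING = 0;
	static OPEN = 1;
	static CLOSED = 3;

	constructor(url, handleOutgoingBytes) {
		this.url = url;
		this.binaryType = 'arraybuffer';
		this.readyState = FakeWebSocket.CONNECTING;
		this.onopen = null;
		this.onmessage = null;
		this.onclose = null;
		this.handleOutgoingBytes = handleOutgoingBytes;
		// Report the "connection" as open asynchronously, like a real socket would.
		queueMicrotask(() => {
			if (this.readyState !== FakeWebSocket.CONNECTING) return;
			this.readyState = FakeWebSocket.OPEN;
			this.onopen?.({ type: 'open' });
		});
	}

	// Bytes written by the caller are handed to the handler instead of a network socket.
	send(data) {
		if (this.readyState !== FakeWebSocket.OPEN) {
			throw new Error('FakeWebSocket is not open');
		}
		this.handleOutgoingBytes(new Uint8Array(data), this);
	}

	// The handler calls this to pretend that bytes arrived from the remote end.
	emitIncoming(bytes) {
		this.onmessage?.({ type: 'message', data: bytes.slice().buffer });
	}

	close() {
		if (this.readyState === FakeWebSocket.CLOSED) return;
		this.readyState = FakeWebSocket.CLOSED;
		this.onclose?.({ type: 'close', code: 1000, wasClean: true });
	}
}
```

In this sketch the handler passed to the constructor is where the connection address and the transmitted bytes would be analyzed and turned into a fetch() call.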

In case of HTTP, it parses the request text, extracts the method, path, and headers (body TBD), and feeds that information to a fetch() call. Then, as the response status, headers, and the data stream come in, it rewrites them as raw bytes and pretends to emit them as incoming WebSocket data.
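A rough sketch of the two translation steps described above, under simplifying assumptions: the request head arrives as HTTP/1.1 text, the body is ignored (it is still TBD), and the helper names `parseHttpRequest` and `serializeResponseHead` are hypothetical rather than taken from the PR.

```javascript
// Hypothetical helper: turn raw request bytes into arguments for fetch().
function parseHttpRequest(bytes, fallbackHost, scheme = 'http') {
	const text = new TextDecoder().decode(bytes);
	const headerEnd = text.indexOf('\r\n\r\n');
	if (headerEnd === -1) return null; // The headers have not fully arrived yet.

	const [requestLine, ...headerLines] = text.slice(0, headerEnd).split('\r\n');
	const [method, path] = requestLine.split(' ');

	const headers = {};
	for (const line of headerLines) {
		const colon = line.indexOf(':');
		if (colon === -1) continue;
		headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
	}

	const host = headers.host ?? fallbackHost;
	return { method, url: `${scheme}://${host}${path}`, headers };
}

// Hypothetical helper: rewrite a response status and headers as raw HTTP/1.1 bytes.
// `headers` is any iterable of [name, value] pairs, e.g. a fetch() Headers object.
function serializeResponseHead(status, statusText, headers) {
	const lines = [`HTTP/1.1 ${status} ${statusText}`];
	for (const [name, value] of headers) {
		lines.push(`${name}: ${value}`);
	}
	return new TextEncoder().encode(lines.join('\r\n') + '\r\n\r\n');
}
```

The parsed result could then be passed along as `fetch(request.url, { method: request.method, headers: request.headers })`, and the bytes from `serializeResponseHead()` emitted as incoming data ahead of the response body chunks.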

In case of HTTPS, it uses the node-forge package to start an HTTPS server with a self-signed certificate that is also added to PHP's CA store via the openssl.cafile php.ini setting. The outbound traffic is piped to node-forge, which handles the SSL handshake and the encryption and yields unencrypted HTTP request bytes. Those bytes are handled the same way as in the previous paragraph. The response is then piped back to node-forge for encryption and, finally, back to PHP.

The current implementation is naive and optimistically assumes the data is intended for either HTTP or HTTPS use.

Upsides

  • file_get_contents(), fopen(), fsockopen() and other functions using stream wrappers will now work with http://, https://, and ssl:// URLs!
  • libcurl can now be supported without any customizations, as the network traffic handler doesn't care about the library that initiated the connection.

Downsides

  • node-forge doesn't seem to work well with the latest TLS version, so we'd have to force-downgrade it in PHP. That doesn't seem like a big deal in the browser because all SSL security is provided by fetch(), but it's something to note. This approach is, of course, unsuitable and unnecessary outside of web browsers.
  • Raw sockets and CORS-less URLs still can't be supported this way.
  • I'm not sure whether we can reliably distinguish between HTTP / HTTPS traffic and a byte transmission via fsockopen('ssl://somesite.com'). Perhaps not. In this case we could ship a naive heuristic that would cover the majority of cases, and then patch PHP to provide an explicit flag "I'm about to request something via HTTPS".
  • node-forge is slow and the certificates are generated synchronously. It would be far better to use the browser-native asynchronous API like crypto.subtle.generateKey().
  • HTTPS adds noticeable overhead. Sticking to the FetchTransport in WordPress sounds like a good idea since it triggers fetch() directly and without an encryption layer.
  • I think each domain requires a separate cert, which means a bit of slowness added to the first request to every domain. That sounds like a fair trade-off for providing the much-requested networking support.
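As a starting point for the naive heuristic mentioned in the list above, the first bytes of a connection can at least separate a TLS handshake from plaintext HTTP: a TLS handshake record starts with the content type byte 0x16 followed by the major version byte 0x03, while an HTTP/1.x request starts with a method token and a space. The function name `guessProtocol` is hypothetical. This sketch does not solve the harder problem of telling HTTPS apart from arbitrary bytes sent via ssl://, although the same check could be applied again to the decrypted bytes.

```javascript
const HTTP_METHODS = ['GET', 'HEAD', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'PATCH', 'CONNECT', 'TRACE'];

// Hypothetical heuristic: classify a connection by the first bytes written to it.
function guessProtocol(firstBytes) {
	// TLS record header: content type 0x16 (handshake), then major version 0x03.
	if (firstBytes.length >= 2 && firstBytes[0] === 0x16 && firstBytes[1] === 0x03) {
		return 'tls';
	}
	// HTTP/1.x request line: "<METHOD> <path> HTTP/1.1". The longest method plus a space is 8 bytes.
	const prefix = new TextDecoder().decode(firstBytes.subarray(0, 8));
	if (HTTP_METHODS.some((method) => prefix.startsWith(method + ' '))) {
		return 'http';
	}
	return 'unknown';
}
```

Anything classified as 'unknown' would still need either the explicit flag from a patched PHP or a fallback decision.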

Remaining work

  • [x] Discuss the approach
  • [ ] Clean up the code and the import structure
  • [ ] Remove as much synchronous processing as possible
  • [ ] Generate SSL certificates lazily
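
One possible shape for the lazy certificate generation item, sketched under assumptions: `generateCertificate(hostname)` is a hypothetical function standing in for whatever ends up creating a certificate (node-forge or a browser-native alternative), and `LazyCertificateStore` is an illustrative name. The store caches the promise per hostname, so the cost is paid only on the first request to each domain and concurrent requests share one generation.

```javascript
// Hypothetical sketch: create each domain's certificate on first use and reuse it afterwards.
class LazyCertificateStore {
	constructor(generateCertificate) {
		this.generateCertificate = generateCertificate;
		this.pending = new Map(); // hostname -> Promise resolving to the certificate
	}

	get(hostname) {
		const key = hostname.toLowerCase();
		if (!this.pending.has(key)) {
			const promise = Promise.resolve().then(() => this.generateCertificate(key));
			// Forget failed attempts so that a later request can retry.
			promise.catch(() => this.pending.delete(key));
			this.pending.set(key, promise);
		}
		return this.pending.get(key);
	}
}
```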

Testing Instructions

  • Run npm install
  • Go to http://localhost:5400/website-server/#{%22landingPage%22:%22/network-test.php%22,%22preferredVersions%22:{%22php%22:%228.0%22,%22wp%22:%22latest%22},%22phpExtensionBundles%22:[%22kitchen-sink%22],%22steps%22:[{%22step%22:%22writeFile%22,%22path%22:%22/wordpress/network-test.php%22,%22data%22:%22%3C?php%20echo%20'Hello-dolly.zip%20downloaded%20from%20https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip%20has%20this%20many%20bytes:%20';%20var_dump(strlen(file_get_contents('https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip')));%22}]}
  • Confirm the page says Hello-dolly.zip downloaded from https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip has this many bytes: int(1887)

The Blueprint above runs the following PHP code:

<?php
echo 'Hello-dolly.zip downloaded from https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip has this many bytes: ';
var_dump(strlen(file_get_contents('https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip')));

Related to https://github.com/WordPress/wordpress-playground/pull/724

cc @dmsnell @bgrgicak @brandonpayton @ThomasTheDane

adamziel avatar Mar 07 '24 12:03 adamziel

@adamziel When I opened the test link (on Linux) I got back an error Hello-dolly.zip downloaded from https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip has this many bytes: int(1887).

bgrgicak avatar Mar 08 '24 07:03 bgrgicak

That's the expected output and not an error :-)

adamziel avatar Mar 08 '24 07:03 adamziel

A naive question. Is HTTPS necessary or could we just use HTTP internally?

How would that work?

adamziel avatar Mar 08 '24 07:03 adamziel

That's the expected output and not an error :-)

I should read all the testing instructions before I comment 😕

bgrgicak avatar Mar 08 '24 08:03 bgrgicak

How would that work?

I imagined that we could just send these requests without having an SSL server. WordPress and CURL requests can be configured to ignore SSL certificates. But that's probably a bad idea.

bgrgicak avatar Mar 08 '24 13:03 bgrgicak

Were there any specific things you'd like review on?

Mostly the general idea and approach in case there was something off with it.

adamziel avatar Mar 11 '24 23:03 adamziel

Summarizing a conversation @adamziel and I had: this seems like a fine approach, though I think it's a long way away from where it needs to be with naming and documentation.

This is essentially a TLS proxy standing as a middleman between the PHP code and the JS code. It is not our own security layer, because ultimately all TLS connections on the JS side are secured by the mechanisms that the browser provides.

Proxying through fetch() is challenging because it requires that we grab the data out of the encrypted stream and present it to PHP in a new encrypted stream, acting like it was never translated, thus the middleman.

It looks like we can provide a ReadableStream to fetch as the body of the network request so I think it's even possible to go beyond what's here and provide a generalized socket interface into PHP. I was on my way to testing that out when I ran out of time. There's still the challenge of detecting when a TLS handshake is beginning, but maybe that's not too hard to do.
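
A small sketch of what passing a ReadableStream as the request body might look like. The helper name `createOutgoingBody` and the example URL are hypothetical. It assumes an environment where fetch() accepts a stream body together with the `duplex: 'half'` option, which not every browser supports.

```javascript
// Hypothetical helper: a request body that can be fed bytes as PHP writes them.
function createOutgoingBody() {
	let controller;
	const stream = new ReadableStream({
		// start() runs synchronously inside the constructor, so `controller` is set on return.
		start(c) {
			controller = c;
		},
	});
	return {
		stream,
		push: (bytes) => controller.enqueue(bytes),
		end: () => controller.close(),
	};
}

const outgoing = createOutgoingBody();
const request = new Request('https://example.invalid/upload', {
	method: 'POST',
	body: outgoing.stream,
	duplex: 'half', // Required when the body is a stream.
});

// Bytes can be pushed after the request object exists, e.g. as socket writes arrive.
outgoing.push(new TextEncoder().encode('hello '));
outgoing.push(new TextEncoder().encode('world'));
outgoing.end();
```

Calling `fetch(request)` would then send the bytes as they are pushed, where the environment supports it.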

dmsnell avatar Mar 25 '24 09:03 dmsnell

node-forge is quite slow when it comes to certificate generation, perhaps https://pkijs.org/docs/examples/certificates-and-revocation/create-and-validate-certificate would be faster

adamziel avatar May 20 '24 19:05 adamziel

Surfacing this comment:

I got this from Claude, it's probably incorrect but it's fast :D It may or may not be a good starting point for CA cert generation:

/**
 * Generate a CA.pem certificate pair dynamically in the browser with no dependencies
 * using just the Browser-native crypto API.
 */

async function generateCaPem() {
	const certInfo = {
		serialNumber: '1',
		validity: {
			notBefore: new Date(),
			notAfter: new Date(Date.now() + 1000 * 60 * 60 * 24 * 365),
		},
		subject: {
			commonName: 'Root CA',
		},
		issuer: {
			commonName: 'Root CA',
		},
		extensions: {
			basicConstraints: {
				critical: true,
				cA: true,
			},
			keyUsage: {
				digitalSignature: true,
				keyCertSign: true,
			},
		},
	};

	const crypto = window.crypto;
	const encoder = new TextEncoder();
	const decoder = new TextDecoder();

	const caKey = await crypto.subtle.generateKey(
		{
			name: 'RSASSA-PKCS1-v1_5',
			modulusLength: 2048,
			publicExponent: new Uint8Array([1, 0, 1]),
			hash: 'SHA-256',
		},
		true,
		['sign', 'verify']
	);

	// Placeholder for the TBSCertificate. NOTE: this is JSON, not DER-encoded ASN.1,
	// so the resulting PEM is not a valid X.509 certificate.
	const tbs = encoder.encode(JSON.stringify({
		version: 3,
		serialNumber: certInfo.serialNumber,
		issuer: certInfo.issuer,
		subject: certInfo.subject,
		validity: {
			notBefore: certInfo.validity.notBefore.toISOString(),
			notAfter: certInfo.validity.notAfter.toISOString(),
		},
		extensions: certInfo.extensions,
	}));

	const signature = await crypto.subtle.sign(
		{
			name: 'RSASSA-PKCS1-v1_5',
		},
		caKey.privateKey,
		tbs
	);

	// Combine TBS and signature into a simple certificate structure
	const cert = encoder.encode(JSON.stringify({
		tbsCertificate: decoder.decode(tbs),
		signatureAlgorithm: 'sha256WithRSAEncryption',
		signatureValue: btoa(String.fromCharCode(...new Uint8Array(signature))),
	}));

	const caPem = `-----BEGIN CERTIFICATE-----\n${btoa(decoder.decode(cert))}\n-----END CERTIFICATE-----`;
	const caKeyPem = await exportKeyToPem(caKey.privateKey);

	return { caPem, caKeyPem };
}

async function exportKeyToPem(key) {
	const exported = await crypto.subtle.exportKey('pkcs8', key);
	const exportedAsBase64 = btoa(String.fromCharCode(...new Uint8Array(exported)));
	return `-----BEGIN PRIVATE KEY-----\n${exportedAsBase64}\n-----END PRIVATE KEY-----`;
}

generateCaPem().then(({ caPem, caKeyPem }) => {
	console.log({ caPem, caKeyPem });
});

adamziel avatar Sep 09 '24 19:09 adamziel