reverse-proxy
Tunneling with YARP
How awesome would it be if we could use YARP as a reverse proxy over WebSockets?
I have one public-facing ASP.NET Core server. I also have multiple small Blazor Server applications running on multiple machines in customers' infrastructure (behind firewalls, etc.). It would be great if I could set up YARP as a reverse proxy over WebSockets/SignalR so that I can reach those sites from the public-facing server.
ASP.NET Core server (domain.com):
.AddYarpWebSocketServerEndpoint("/someSite1");
.AddYarpWebSocketServerEndpoint("/someSite2");
.AddYarpWebSocketServerEndpoint("/someSite3");
.NET Core proxy app on client machine 1, with YARP added:
.AddYarpWebSocketClient("https://domain.com/someSite1", "http://192.168.1.100:5000");
.AddYarpWebSocketClient("https://domain.com/someSite2", "http://192.168.1.101:5000");
.NET Core proxy app on client machine 2, with YARP added:
.AddYarpWebSocketClient("https://domain.com/someSite3", "http://192.168.1.100:5000");
Browser (domain.com/someSite1)->Public server (WebSocket/SignalR-server-Hub) | Client(WebSocket/SignalR) -> 192.168.1.100 (blazor server app)
This article explains it pretty well: https://dev.to/hgsgtk/reverse-http-proxy-over-websocket-in-go-part-1-13n4
Triage: If we understand it correctly, you use YARP to receive (and respond via) normal HTTP traffic, but you need custom transport to the destinations. Is that correct?
You can achieve that by customizing the HttpClient. You can plug it in via https://microsoft.github.io/reverse-proxy/articles/http-client-config.html#custom-iforwarderhttpclientfactory
@jsandv were you successful in following it via docs link above?
The provided link is not enough. YARP would have to live on the public server as part of the public app. When requests come in, they are tunneled over WebSockets to the already-connected proxy/agent. The tunnel is already established because YARP would also live in the client app, which executes the actual request against the endpoint.
@karelz - I think what @jsandv is trying to do is to use YARP as a tunnel to break through firewalls, similar to Azure Relay. This is one of the features that I have been looking at with @davidfowl. You would have an instance of YARP in both networks. The internal network instance would create a WebSocket connection to the external network instance. Routes would be configured on the external network instance to resources via the internal network instance. Requests to those routes would be tunneled from the external network instance to the internal network instance, which would then make local requests within its network.
Triage: It would come down to YARP tunneling feature (between 2 YARP instances) - config driven.
This is what I actually thought YARP was, but it's not. I want something like Azure Relay that we can build into our own APIs. WCF had something like this (netTcpRelay).
Roy
This would be a very cool feature and we have all of the pieces to build it. In fact @samsp-msft and I have discussed it in the past but it hasn't risen to be a priority as yet.
As an example, we actually have an Azure Relay server implementation for ASP.NET Core today that does this (https://github.com/Azure/azure-relay-aspnetserver).
If you're interested, it would be cool to build a prototype and plug it into YARP. We have all of the required extensibility points to build this as a plugin (I believe).
Was thinking of building this exact thing today as a side-project and looking at YARP.
I'm thinking that this could be solved with a custom IForwarderHttpClientFactory that returns an HttpMessageInvoker with a custom WebSocketsHandler : HttpMessageHandler. There we override SendAsync and push data down to a WebSocket client, which in turn creates an HttpMessageHandler on its side and sends the request to the local host. Am I way off base here? Am I thinking too simply and naively?
This would relay traffic from the external network to the internal network, so there would be one YARP instance in the external network. Traffic originating in the internal network would just go straight out through the firewall, so we wouldn't need YARP on the internal network. The internal agent would just be a local agent that connects to the external instance via WebSockets and forwards requests and responses back and forth.
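The handler idea above can be sketched without any YARP dependency, since a custom IForwarderHttpClientFactory ultimately hands back an HttpMessageInvoker. In this minimal sketch, `ITunnelChannel`, `TunnelHandler`, and `EchoChannel` are hypothetical names invented for illustration; a real channel would serialize the request onto the WebSocket and await the agent's reply, which is the hard part and is not shown here.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// The invoker is the kind of object a custom forwarder client factory would return.
using var invoker = new HttpMessageInvoker(new TunnelHandler(new EchoChannel()));
using var request = new HttpRequestMessage(HttpMethod.Get, "http://agent01/SomeApp1/index");
using var response = await invoker.SendAsync(request, CancellationToken.None);
var text = await response.Content.ReadAsStringAsync();
Console.WriteLine($"{(int)response.StatusCode}: {text}");

// Hypothetical abstraction over the WebSocket link to the on-prem agent.
public interface ITunnelChannel
{
    Task<HttpResponseMessage> RelayAsync(HttpRequestMessage request, CancellationToken ct);
}

// Relays every request through the tunnel instead of opening a socket itself.
public sealed class TunnelHandler : HttpMessageHandler
{
    private readonly ITunnelChannel _channel;

    public TunnelHandler(ITunnelChannel channel) => _channel = channel;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
        => _channel.RelayAsync(request, cancellationToken);
}

// Stand-in channel (assumption) so the sketch runs without a network or an agent.
public sealed class EchoChannel : ITunnelChannel
{
    public Task<HttpResponseMessage> RelayAsync(HttpRequestMessage request, CancellationToken ct)
    {
        var reply = new HttpResponseMessage(HttpStatusCode.OK)
        {
            RequestMessage = request,
            Content = new StringContent($"relayed {request.Method} {request.RequestUri!.PathAndQuery}")
        };
        return Task.FromResult(reply);
    }
}
```

Working at the HTTP message level like this means the handler has to deal with streaming bodies and protocol upgrades itself, which is one reason to consider plugging in at the connection layer instead.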
Would it make sense to use SignalR here instead of raw web sockets? According to docs SignalR has "no significant performance disadvantage compared to using raw WebSockets" for "most scenarios".
I was thinking that it would nest http calls inside of a web socket connection. Assuming we have 4 nodes:
- a web browser on a device connected to the internet - we'll call this the browser. It wants to access resources hosted on...
- OnPremServer - a web server, residing on premises of a company, and protected by a firewall. It doesn't have any internet facing access.
- OnPremProxy - an instance of yarp+ that acts as a gateway for requests from CloudProxy which come through a websocket connection to CloudProxy. It is located on premises and has internal network access to OnPremServer.
- CloudProxy - an instance of yarp+ that faces the internet and acts as the endpoint for the websocket connection from OnPremProxy, and is located at a cloud provider or somewhere that is internet addressable.
At startup, OnPremProxy will make an outbound HTTPS connection to CloudProxy, stating that it's the connection from OnPremProxy, and with appropriate credentials such as a client cert. That connection is then upgraded to a websocket connection. The connection can be initiated by OnPremProxy so it can break through the firewall as an outbound http request.
CloudProxy has a special route for OnPremServer that specifies that the requests need to be routed via OnPremProxy.
The browser makes a request to CloudProxy for a resource that CloudProxy's route configuration says is on OnPremServer. Rather than making a direct HTTP request, CloudProxy makes the HTTP request over the WebSocket connection from OnPremProxy. (HttpClient can do this by using the ConnectCallback of SocketsHttpHandler.) When the request is received by OnPremProxy, it has a route for OnPremServer, so it can forward the request and route back the results.
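As a rough illustration of the ConnectCallback idea, the sketch below makes SocketsHttpHandler take an already-established stream from a queue instead of opening a socket to the destination. The queue (`tunnelStreams`), the host name `onprem-server`, and the loopback TCP pair that stands in for the WebSocket tunnel are all assumptions made so the example runs on one machine; in the design described above the stream would wrap the connection that OnPremProxy opened.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;
using System.Text;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Streams supplied by the on-prem side wait here until a request needs one.
var tunnelStreams = Channel.CreateUnbounded<Stream>();

var handler = new SocketsHttpHandler
{
    UseProxy = false, // keep any system proxy settings out of the picture
    // Replace "open a TCP connection to the host" with "take a tunnel stream".
    ConnectCallback = (context, cancellationToken) =>
        tunnelStreams.Reader.ReadAsync(cancellationToken)
};
using var invoker = new HttpMessageInvoker(handler);

// Stand-in for the far end of the tunnel: answers one request with a fixed reply.
var listener = new TcpListener(IPAddress.Loopback, 0);
listener.Start();
var farEnd = Task.Run(async () =>
{
    using var accepted = await listener.AcceptTcpClientAsync();
    var stream = accepted.GetStream();
    var reader = new StreamReader(stream, Encoding.ASCII);
    while (!string.IsNullOrEmpty(await reader.ReadLineAsync())) { } // skip the request head
    var reply = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\nConnection: close\r\n\r\nhello";
    await stream.WriteAsync(Encoding.ASCII.GetBytes(reply));
});

// Near end of the stand-in tunnel, queued as if the agent had just connected.
var tunnel = new TcpClient();
await tunnel.ConnectAsync(IPAddress.Loopback, ((IPEndPoint)listener.LocalEndpoint).Port);
await tunnelStreams.Writer.WriteAsync(tunnel.GetStream());

// The host name is never resolved: the callback supplies the connection.
using var response = await invoker.SendAsync(
    new HttpRequestMessage(HttpMethod.Get, "http://onprem-server/SomeApp1"),
    CancellationToken.None);
var body = await response.Content.ReadAsStringAsync();
Console.WriteLine($"{(int)response.StatusCode} {body}");

await farEnd;
listener.Stop();
```

Because the substitution happens at the connection level, the HTTP framing is still handled by SocketsHttpHandler rather than by custom code.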
The advantage of this approach is that Browser and OnPremServer don't have to be aware at all of the special nature of how requests get routed to OnPremServer. The server stack for OnPremServer can be whatever stack is needed, and doesn't need to be .NET or have any special configuration.
Similar to "Browser", other servers in the Cloud datacenter can also make requests via CloudProxy to access OnPremServer. They just need to have the right url path that will match the route configuration.
In our scenario we would have many internal networks and one public proxy. An agent is installed at each on-prem location and registered with the public proxy. When the agent starts up, it opens the WebSocket connection on the route that identifies it, so the agent needs to know "who he is". When a request comes in for a specific route, the public proxy can look up the correct agent and pass the HTTP request to it.
"WebSocketHttpReverseProxyOptions": {
  "Clusters": [
    {
      "Name": "agent01",
      "Routes": [
        {
          "Path": "/SomeApp1",
          "Endpoint": "http://192.168.1.1:5005"
        },
        {
          "Path": "/SomeApp2",
          "Endpoint": "http://192.168.1.2:5005"
        }
      ]
    },
    {
      "Name": "agent02",
      "Routes": [
        {
          "Path": "/someApp3",
          "Endpoint": "http://192.168.1.1:5005"
        }
      ]
    }
  ]
}
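The lookup that the public proxy would perform against this configuration might look roughly like the sketch below. `AgentRoute` and `TunnelRouteTable` are hypothetical names, the routes are hardcoded rather than bound from configuration, and the matching rule (longest path prefix, case-insensitive) is an assumption rather than anything the proposal specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Routes mirror the proposed configuration: agent name, public path, local endpoint.
var table = new TunnelRouteTable(new[]
{
    new AgentRoute("agent01", "/SomeApp1", new Uri("http://192.168.1.1:5005")),
    new AgentRoute("agent01", "/SomeApp2", new Uri("http://192.168.1.2:5005")),
    new AgentRoute("agent02", "/someApp3", new Uri("http://192.168.1.1:5005")),
});

var match = table.Match("/SomeApp2/reports/1");
var miss = table.Match("/unknown");
Console.WriteLine(match is null ? "no route" : $"{match.Agent} -> {match.Endpoint}");
Console.WriteLine(miss is null ? "no route" : $"{miss.Agent} -> {miss.Endpoint}");

// One public path handled by one agent, which forwards to a local endpoint.
public sealed record AgentRoute(string Agent, string Path, Uri Endpoint);

public sealed class TunnelRouteTable
{
    private readonly IReadOnlyList<AgentRoute> _routes;

    // Longest paths first, so the most specific prefix wins.
    public TunnelRouteTable(IEnumerable<AgentRoute> routes) =>
        _routes = routes.OrderByDescending(r => r.Path.Length).ToList();

    // Returns the route whose path equals the request path or is a segment prefix of it.
    public AgentRoute? Match(string requestPath) =>
        _routes.FirstOrDefault(r =>
            requestPath.Equals(r.Path, StringComparison.OrdinalIgnoreCase) ||
            requestPath.StartsWith(r.Path + "/", StringComparison.OrdinalIgnoreCase));
}
```

Once the agent name is known, the proxy still needs a live connection registered under that name; the table only answers which agent should receive the request.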
The system must also support WebSockets. If the local app is a Blazor Server app, the public proxy has to accept a WebSocket request from the browser and ask the agent to create a WebSocket connection to the local app. The agent also opens a WebSocket connection to the public app; when the public app receives this connection, it maps it to the browser's socket and proxies the traffic between them.
I would also love to see a feature where a market reverse proxy (like YARP) allows LAN applications to connect outbound to a reverse proxy to get served through it. Currently most reverse proxies require that the proxy can call down to the application which is hard to achieve in LAN<->DMZ separations (firewall needs to be opened) and LAN <-> Cloud separations.
Here are some posts where I tried to initiate similar discussions in the past: https://github.com/Azure/azure-relay/issues/60 https://github.com/dotnet/aspnetcore/issues/6981 https://github.com/ThreeMammals/Ocelot/issues/1271
We have an in-house solution where one ASP.NET Core application hosts a custom reverse proxy. LAN applications can be configured with a custom IServer implementation that connects to this proxy through WebSockets and registers the application's configured endpoints. These LAN connections are put into a connection pool, and once a client comes along we pick a connection from the pool to process the request.
This is similar to Azure relay but with a home-brew protocol (using a combination of property bags with OWIN keys for HTTP support, and on-demand WebSocket upgrade).
If YARP supported such use cases out of the box, we could drop our custom solution.
Once I free up, I'll spike this. I have it mapped out in my head but this isn't the highest priority right now.
OK I hacked together a demo https://github.com/davidfowl/YarpTunnelDemo
- Shows how to implement this using our current extensibility
- The endpoint to register connections needs to be secured (see the comments)
- I hardcode a single backend, this could be expanded to support arbitrary clusters by having the backend provide the cluster id
- I wrote it in 2 hours but it seems to work well enough 😄 (I look forward to your pull requests)
I have a similar scenario, and currently I handle it using RabbitMQ and NSQ. One can call a REST endpoint on the main app (the only one that is publicly available); the request is serialized and sent to a message queue. The agent that is installed in the client infrastructure (behind a firewall) reads the messages, queries a local API or DB, and places the response in another message queue. The main app waits (with a timeout) for the response message and returns either the response or a GUID (so the client can ask for the reply later).
This works quite well, but a YARP version would be better and probably easier to maintain.
@davidfowl can the same technique that you have shown in your demo project be used to proxy REST requests? You have a public REST API that has all the security set up (authentication, authorization, rate limiting, etc) and an agent that is installed on-premise that is doing all the hard work (querying local API, doing file operations, calling SQL). The public app is responsible only for security, rate-limiting, passing the requests to the client, and returning the responses.
Yes, there's no real magic at all. The project has been cleaned up and the diagram updated to explain how it works. The backend is the agent, which has a custom Kestrel transport that uses an outbound connection instead of an inbound one. It supports HTTP/2 or WebSockets and uses those connections to proxy HTTP requests from the front end. This plugs in at the connection layer, so HTTP/1/2 and streaming protocols just work OOTB. These are handled by the SocketsHttpHandler and Kestrel.
If you use an MQ you'd need to decide what layer you want to extend, the HTTP protocol itself or the connection layer? Either way you can extend YARP to support that.
@davidfowl the requirement for using MQ was quite simple: some requests can take a long time to complete. For example, generating a report or doing a complex SQL query. So instead of waiting for the response, I return a 202 response code with a GUID that allows asking later for the response. Use case 1:
- do a request for the data
- it takes less than 3 seconds
- you get the response
Use case 2:
- do a request for the data that is taking a long time
- it takes more than 3 seconds so you get 202 response code with a GUID
- the request is still processed by the backend and the response is sent to MQ
- you do a GET request to a specific endpoint passing the GUID
- you get the actual response (from the MQ)
Step 3 of use case 2 is tricky, because I think that if I set a request timeout, the request will stop and I will have no way to store the response. Am I correct on that?
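One way to picture use case 2 is to race the backend work against a delay instead of cancelling it, so the work keeps running after the caller has received a 202. The sketch below is only an illustration under assumptions: `DeferredResults` and `Outcome` are hypothetical names, an in-memory dictionary stands in for the message queue, and the short delays stand in for the 3-second threshold. It says nothing about how a YARP request timeout behaves.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var broker = new DeferredResults();

// Use case 1: the work finishes within the waiting period.
var fast = await broker.RunAsync(
    async () => { await Task.Delay(10); return "fast result"; },
    TimeSpan.FromMilliseconds(500));
Console.WriteLine($"{fast.Status} {fast.Body}");

// Use case 2: the work outlasts the waiting period, so a ticket is issued.
var slow = await broker.RunAsync(
    async () => { await Task.Delay(300); return "slow result"; },
    TimeSpan.FromMilliseconds(50));
Console.WriteLine($"{slow.Status} ticket issued: {slow.Ticket.HasValue}");

await Task.Delay(600); // the client comes back later with the GUID
var later = await broker.PollAsync(slow.Ticket!.Value);
Console.WriteLine($"{later.Status} {later.Body}");

// Status mimics an HTTP status code; Ticket is set only for 202 responses.
public sealed record Outcome(int Status, string? Body, Guid? Ticket);

public sealed class DeferredResults
{
    // In-memory stand-in for the response queue (assumption).
    private readonly ConcurrentDictionary<Guid, Task<string>> _pending = new();

    public async Task<Outcome> RunAsync(Func<Task<string>> work, TimeSpan patience)
    {
        Task<string> task = work(); // started once; the timeout never cancels it
        if (await Task.WhenAny(task, Task.Delay(patience)) == task)
            return new Outcome(200, await task, null);

        var ticket = Guid.NewGuid();
        _pending[ticket] = task; // keep the running task so the result is not lost
        return new Outcome(202, null, ticket);
    }

    public async Task<Outcome> PollAsync(Guid ticket)
    {
        if (!_pending.TryGetValue(ticket, out var task)) return new Outcome(404, null, null);
        if (!task.IsCompleted) return new Outcome(202, null, ticket);
        _pending.TryRemove(ticket, out _);
        return new Outcome(200, await task, null);
    }
}
```

The point of the design is that the waiting period belongs to the caller, not to the work: giving up on waiting and abandoning the request are separate decisions.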
Triage: We need to take @davidfowl's prototype, list the open design questions, and write a design doc.
Design doc in PR #1766
I would be very interested in this feature, very nice! Do you think an option to use gRPC bidirectional streaming instead of web socket would be feasible?
Yes, it would be, but why does it matter?
Having the flexibility to choose between the two would be a great feature for YARP, I think. I'm also wondering whether gRPC would be faster than WebSocket as the transport. I see gRPC being promoted as a "high performance" RPC option; however, I'm not an expert on this, and the difference might not be as big as I think.
Do you think the performance difference between WebSocket and gRPC with this approach would be negligible?
WebSockets and gRPC won't have significantly different performance characteristics in this streaming scenario.
There would likely be negligible performance benefits to gRPC - the benefits normally talked about are with respect to its binary serialization of message contents rather than JSON.
As the tunnel is persistent and would be using an inner protocol for multiplexing - there would not be much benefit to using HTTP/2 as the tunnel transport - HTTP/1.1 is simpler and likely to be supported by more firewalls and other devices along the communications path.
Any updates on this? Version 2.0.0 was released (https://github.com/microsoft/reverse-proxy/releases/tag/v2.0.0), but sadly tunneling isn't mentioned anywhere.
Just wanted to say that I would love to see this feature in YARP :). If it worked with a SignalR hub, that would be even better.
I tried to do something in this direction: https://github.com/FlorianGrimm/reverse-proxy/tree/reverse-proxy-tunneling. It's not done, but could I get a comment on whether this is the right direction?