Can We Simplify VPN Client Development?
Developing a production-grade, multi-platform VPN client is tough work (source: I’ve been developing one for the past two years). Not only do you have to deal with the usual headaches of implementing the same UI and backend logic on several different platforms, but each client must also interact closely with its platform’s unique networking APIs, which are usually privileged, poorly documented, and hard to test.
However, the underlying architecture of all VPN clients is essentially identical. Furthermore, this architecture is also applicable to most other circumvention tools. I believe we could greatly simplify the development of new tools by creating an open-source framework for VPN clients. Researchers and hobbyists could plug their proof-of-concept protocol code into this framework and create basic, functioning tools for any platform with minimal development overhead. Tool designers could use it as a base for creating more advanced, feature-rich products.
The framework is separated into cross-platform frontend and backend components, and a native bridge component.
Cross-Platform Frontend
The frontend UI code is implemented using a cross-platform solution like React Native or Flutter. The framework provides a basic UI template with the bare essentials for a VPN client (i.e. elements to select a server and start/stop the VPN). Developers can extend this template with the additional UI elements their tool needs.
Cross-Platform Backend
The backend protocol code is also implemented in a cross-platform language like C/C++. The framework defines an API for the essential functionality all custom protocol code must implement (i.e. open/close the communication tunnel, send/receive network packets to/from the client). Developers can plug their protocol code into the framework by implementing this small set of API functions.
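To make the shape of that API concrete, here is a rough sketch of what the backend contract could look like. It is written in Python only for brevity (the real thing would presumably be C/C++ headers), and every name in it (BackendProtocol, open_tunnel, LoopbackProtocol, and so on) is hypothetical, not part of any existing framework.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class BackendProtocol(ABC):
    """Hypothetical interface that custom protocol code would implement."""

    @abstractmethod
    def open_tunnel(self, config: dict) -> None:
        """Set up the communication tunnel using tool-specific settings."""

    @abstractmethod
    def close_tunnel(self) -> None:
        """Tear down the tunnel and release its resources."""

    @abstractmethod
    def send_packet(self, packet: bytes) -> None:
        """Accept one packet captured from the client for transmission."""

    @abstractmethod
    def receive_packet(self) -> Optional[bytes]:
        """Return the next packet for the client, or None if none is waiting."""


class LoopbackProtocol(BackendProtocol):
    """Toy implementation that echoes every packet back (illustration only)."""

    def __init__(self) -> None:
        self._open = False
        self._queue = deque()

    def open_tunnel(self, config: dict) -> None:
        self._open = True

    def close_tunnel(self) -> None:
        self._open = False
        self._queue.clear()

    def send_packet(self, packet: bytes) -> None:
        if not self._open:
            raise RuntimeError("tunnel is not open")
        self._queue.append(packet)

    def receive_packet(self) -> Optional[bytes]:
        return self._queue.popleft() if self._queue else None
```

A real protocol would replace the loopback queue with its own sockets and encoding; the point is only that the framework would need to see nothing beyond these four calls.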
Native Bridge
A native bridge is provided for each platform the framework supports (e.g. iOS, Windows). Native bridges have three primary features:
- Initialize a platform-specific app development project (e.g. an Android Studio project). Mount the frontend component for use as the app UI. Compile and load the backend component code. Developers don't have to touch this part at all except for advanced customizations.
- Invoke the platform’s networking APIs (e.g. NEPacketTunnelProvider on Apple devices) to capture the client’s network traffic and route it through the backend code. Developers don’t have to touch this part at all unless their tool requires a custom networking configuration.
- Facilitate two-way JSON-based communication between the frontend and backend components. A core implementation corresponding to the basic UI template is provided, carrying messages such as "startTunnel" (frontend -> backend) or "tunnelStatus" (backend -> frontend). Developers can add message types for their tool's requirements.
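As a rough illustration of the messaging feature, the sketch below routes JSON messages by type. The "startTunnel" and "tunnelStatus" names come from the description above; the MessageChannel class and the {"type": ..., "payload": ...} envelope are assumptions made up for this example.

```python
import json


class MessageChannel:
    """Hypothetical one-way router for JSON messages crossing the bridge."""

    def __init__(self) -> None:
        self._handlers = {}

    def on(self, msg_type, handler) -> None:
        """Register a handler for one message type."""
        self._handlers[msg_type] = handler

    def deliver(self, raw: str) -> bool:
        """Decode a JSON message and dispatch it; return False if unhandled."""
        message = json.loads(raw)
        handler = self._handlers.get(message.get("type"))
        if handler is None:
            return False
        handler(message.get("payload", {}))
        return True


to_backend = MessageChannel()   # frontend -> backend
to_frontend = MessageChannel()  # backend -> frontend
statuses = []

# Frontend side: record status updates (a real UI would redraw itself).
to_frontend.on("tunnelStatus", lambda payload: statuses.append(payload["state"]))


def handle_start(payload):
    # A real backend would open its tunnel here before reporting back.
    reply = {"type": "tunnelStatus", "payload": {"state": "connected"}}
    to_frontend.deliver(json.dumps(reply))


to_backend.on("startTunnel", handle_start)
to_backend.deliver(json.dumps({"type": "startTunnel", "payload": {}}))
```

Adding a tool-specific message would then just mean registering another handler on each side.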
For convenience, native bridges also wrap common code elements that must be used in a platform-specific manner and provide a standard API for use by the backend component. For example:
- On Android, after a VPN interface has been established using the VpnService API, the protect() method must be called on all new sockets to prevent them from being looped back into the VPN. Native bridges expose a createSocket() method that can be called by cross-platform code to obtain a new, usable socket without worrying about these platform-specific requirements.
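The wrapper idea could look something like the sketch below. It is in Python purely for illustration; NativeBridge and its protect hook are hypothetical stand-ins for the platform-specific piece (on Android, the hook would forward to VpnService.protect(), which takes a socket descriptor and returns a boolean).

```python
import socket
from typing import Callable, Optional


class NativeBridge:
    """Hypothetical bridge exposing a platform-neutral socket factory."""

    def __init__(self, protect: Optional[Callable[[int], bool]] = None) -> None:
        # `protect` is the platform hook; None means the platform needs none.
        self._protect = protect

    def create_socket(self, family=socket.AF_INET,
                      kind=socket.SOCK_STREAM) -> socket.socket:
        """Create a socket and exempt it from the VPN before handing it out."""
        sock = socket.socket(family, kind)
        if self._protect is not None and not self._protect(sock.fileno()):
            sock.close()
            raise OSError("platform refused to protect the socket")
        return sock


# Usage with a fake hook that only records which descriptors it was given.
protected_fds = []
bridge = NativeBridge(protect=lambda fd: protected_fds.append(fd) or True)
sock = bridge.create_socket()
sock_fd = sock.fileno()
sock.close()
```

Cross-platform backend code would only ever call create_socket() and never see the hook.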
So that’s the idea. It would be a major undertaking but in the end I think it would be well worth the effort. Looking forward to some feedback, specifically:
- Tool designers: Does this pain point resonate with you? Would you like to have a framework like this you could work with?
- Circumvention researchers/hobbyists: If there were a way to easily turn your proof of concept code into a functional multi-platform tool, would you use it? Does this seem like a feasible approach?
- Are there existing projects working toward this goal? I know of several open-source, multi-platform VPN clients (e.g. Google One VPN, Outline VPN), but their focus is on creating a single fully-functional tool, and as a result their code structure does not make it easy for developers to swap in a completely different frontend UI or backend protocol.
singbox?
AFAIK it is the first project that made attempts at a clean internal architecture while satisfying your described requirements.
singbox?
It's an interesting project, but it seems more like Google One than what I am proposing. For example, the UI seems to be implemented separately in native code for each of the platforms singbox supports. So trying to fork the project and create your own custom UI would still be quite a lot of work.
Researchers and hobbyists could plug their proof-of-concept protocol code into this framework and create basic, functioning tools for any platform with minimal development overhead.
This was kind of the ambition behind pluggable transports. "This specification describes a way to decouple protocol-level obfuscation from an application's client/server code, in a manner that promotes rapid development of obfuscation/circumvention tools and promotes reuse beyond the scope of the Tor Project's efforts in that area."
The original version of the pluggable transports specification (which is now retroactively named PT 1.0) was based on separate OS processes communicating through environment variables, stdin/stdout, and localhost sockets. Desktop versions of Tor Browser, as well as the bridge nodes, still use it today. It was never really used by anything other than Tor. The main exception I know of is ptadapter, which lets you run pluggable transport clients and servers with applications other than Tor.
The pluggable transports concept and specification were further developed by https://www.pluggabletransports.info/, which released a series of 2.x and 3.0 specifications. Perhaps the biggest addition—which is something people had complained about in the Tor pluggable transports spec—is an "API" mode of operation, where the transport code is linked into the application, rather than being a separate process. There's an API defined for various languages: Go, Java, Swift. Much like you proposed: "Developers can plug their protocol code into the framework by implementing this small set of API functions."
My impression is that pluggable transports, in both the original version and the 2.x and 3.0 updates, never really realized the goal of reusable/pluggable obfuscation modules and a preferred interface for all sorts of circumvention programs. But if you're thinking of defining a programming interface, the API definitions can serve as something to compare to.
The idea of modular circumvention transports has come up many times. So far, nobody has found a model that developers love to use, and cross-project sharing of transports still mostly happens via code forking rather than with a standardized module interface.
A related undertaking that comes to mind is WATER from FOCI 2024, based on WebAssembly. It has the advantage that you don't need to define an API for every language you might use; you just need whatever language you use to compile to WASM.
https://censorbib.nymity.ch/#Chi2024a https://archive.org/details/foci2024-1/video-03-chi.mkv
We introduce WATER (WebAssembly Transport Executables Runtime), a novel design that enables applications to use a WebAssembly-based application-layer (e.g., TLS) to wrap network connections and provide network transports. Deploying a new circumvention technique with WATER only requires distributing the WebAssembly Transport Module (WATM) binary and any transport-specific configuration, allowing dynamic transport updates without any change to the application itself. WATMs are also designed to be generic such that different applications using WATER can use the same WATM to rapidly deploy successful circumvention techniques to their own users, facilitating rapid interoperability between independent circumvention tools.
WATER has a distinct advantage over prior approaches that provide similar flexibility, such as Proteus [33] or Pluggable Transports [23]. Where WATER leverages WebAssembly, new techniques can be written in one of several (and growing [35]) high-level languages. In contrast, Proteus requires that techniques be written in a bespoke domain-specific language (DSL) which is incompatible with the import of other code or libraries, and thus must be entirely self-contained. Meanwhile, Pluggable Transports must maintain separate, language-specific APIs for each supported programming language (currently Go, Java, and Swift), which are incompatible with each other. In contrast, WATMs can run in any WATER runtime, currently featuring off-the-shelf implementation in Go and Rust with the potential of being implemented with any WebAssembly runtime with WASI support.
Perhaps somewhat related is the RACE/RACECAR project. I wasn't involved with it, so I can't speak from first-hand experience. I don't know that they had a specific requirement of modular/composed transports, but I get the impression that, due to the nature of the multi-institution collaboration, it kind of developed that way.
https://github.com/tst-race/race-quickstart
Raceboat (paper, presentation video), for example, is a framework for circumvention signaling channels.
I'm sure there are other examples.
One thing to consider is what kind of networking interface you want to support (at what network layer the system operates). Saying "VPN", you are probably aiming at something layer 3, working at the level of IP packets. Pluggable transports are different: they are at the level of application streams, more like a SOCKS proxy or HTTP proxy. There is probably less prior work on VPN-like modular transports.
There's also a delicate question of how much to abstract. It sounds like you're envisioning a model where the Backend Protocol makes its own sockets, which is probably a necessary degree of flexibility. When you say "the bare essentials for a VPN client (i.e. elements to select a server)", that presupposes a certain deployment model, where a client uses a server which is identified with an IP address or IP:port or similar. That may not be flexible enough: I know it's been something of a struggle finding uniform configuration interfaces for Tor pluggable transports, some of which (meek and Snowflake) don't have a single IP:port they connect to, but rather a set of configuration parameters like url=, front=, ice=.
This was kind of the ambition behind pluggable transports.
Although they share the same broad goal of modularizing circumvention technology, there is a key difference between the prior work you've mentioned and what I'm proposing. Pluggable Transports, WATER, RACE etc. all focus on helping developers of new circumvention systems simplify and reuse their backend protocol/transport code. My focus is on helping developers produce actual, usable client apps that allow non-technical end users to connect to these systems.
As a specific example, here is the high-level component architecture discussed in the Raceboat paper:
We see that "Raceboat sends/recvs messages through the encoding". But where do the messages come from? How are they connected to the Raceboat implementation?
The most broadly applicable design is: messages are captured from a client device's network stack and sent to Raceboat. Raceboat processes the messages to circumvent some network censor. Raceboat injects the received response traffic back into the client device's network stack.
In the framework I'm proposing, the entire (client-side) Raceboat implementation would be the "Backend Protocol", like so:
Currently, the propagation of new circumvention technology proceeds like this:
- A new idea is proposed and a proof of concept implementation is tested
- A bare-bones, command-line implementation makes the technology available for early adopters on a single platform (usually Linux)
- GUI-based clients make the technology readily accessible for users on all platforms
There is a huge dropoff between Steps 2 and 3 because an enormous amount of developer effort is required for each platform. But we could make Step 3 as easy as (or even easier than) Step 2 by creating this type of cross-platform client template. Once you're done with Step 1, you plug the client half of your code into the "Backend Protocol" part of the template, tweak the "Frontend UI" part if needed, and you've got GUI-based clients for every platform.
There's also a delicate question of how much to abstract.
Agreed. The success of the framework relies on the "Backend Protocol" and "Frontend UI" elements being abstract enough to encompass a wide range of specific systems, while still being concrete enough to provide substantial work reduction benefits. In fact, I believe it is possible for the framework to cover not only VPN clients, but almost all circumvention system clients. That is because from a high-level perspective these clients all operate the same way:
- Capture network traffic from the client device
- Process that traffic to achieve circumvention
- Inject reply traffic back into the client's network stack
The processing step is where you see all the variation between different systems. So the framework abstracts out the capturing and injecting steps (as well as more banal development tasks like loading the GUI, packaging the app bundle for installation). Only the core circumvention processing step remains to be implemented.
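The capture / process / inject split can be sketched as a single loop. This is a deliberately simplified, synchronous Python illustration; run_pipeline and its parameters are invented names, and a real client would run the two directions concurrently rather than one packet at a time.

```python
from typing import Callable, Iterable, List


def run_pipeline(captured: Iterable[bytes],
                 process: Callable[[bytes], List[bytes]],
                 inject: Callable[[bytes], None]) -> int:
    """Push captured packets through `process` and inject every reply.

    Returns the number of reply packets injected.
    """
    injected = 0
    for packet in captured:            # 1. capture (framework)
        for reply in process(packet):  # 2. process (custom backend code)
            inject(reply)              # 3. inject (framework)
            injected += 1
    return injected


# Stand-in processing step: "circumvention" here is just reversing the bytes.
replies = []
count = run_pipeline([b"ab", b"cd"], lambda p: [p[::-1]], replies.append)
```

Only the function passed as `process` varies from tool to tool; the other two arguments are what the native bridge would supply.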
When you say "the bare essentials for a VPN client (i.e. elements to select a server)", that presupposes a certain deployment model, where a client uses a server which is identified with an IP address or IP:port or similar.
You're right, I should have used a different phrase than "select a server". What I meant was that every client GUI must include two parts:
- An element to start/stop capturing client network traffic ("start/stop the VPN")
- Elements to select what to do with the captured traffic ("select a server")
For a traditional VPN, the "what to do" selection is "send the captured traffic to this IP:port". But the framework allows any sort of selection, including "process the captured traffic using these Snowflake params". The framework doesn't need to know or care about the specifics of the frontend selection as long as the custom frontend code packages it in JSON format so it can be passed through the bridge (via the "Messaging Channel") to the custom backend.
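A small sketch of that opacity, assuming a JSON envelope with "type" and "payload" keys (my own invention for this example). The url/front/ice keys are the Snowflake-style parameters mentioned earlier in the thread; all values are placeholders.

```python
import json


def forward_selection(selection: dict) -> str:
    """Bridge side: wrap the frontend's selection without inspecting it."""
    return json.dumps({"type": "startTunnel", "payload": selection})


def backend_receive(raw: str) -> dict:
    """Backend side: unwrap the selection; only custom code interprets it."""
    return json.loads(raw)["payload"]


# A traditional VPN selection: one server address.
vpn_selection = {"host": "203.0.113.10", "port": 443}

# A selection with no single IP:port at all (placeholder values).
snowflake_selection = {
    "url": "https://broker.example/",
    "front": "cdn.example",
    "ice": "stun:stun.example:3478",
}

vpn_roundtrip = backend_receive(forward_selection(vpn_selection))
snowflake_roundtrip = backend_receive(forward_selection(snowflake_selection))
```

Both selections take exactly the same path; the bridge code never branches on what is inside the payload.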
Saying "VPN", you are probably aiming at something layer 3, working at the level of IP packets. Pluggable transports are different: they are at the level of application streams, more like a SOCKS proxy or HTTP proxy.
Yes, I think L3 is best for a general framework. The API for L3 interfaces is much simpler, and backends designed to work at L4 can always be refitted to work with an L3 interface by adding a user-space network stack. Alternatively, the framework could provide APIs for both L3 and L4.
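To show what an L3 interface actually hands to the backend, here is a small Python sketch that reads the fixed fields of an IPv4 header from a raw packet. The function and type names are made up for illustration; a backend built around L4 streams would need a user-space network stack to go from packets like this to TCP connections.

```python
import socket
import struct
from typing import NamedTuple


class IPv4Header(NamedTuple):
    header_len: int  # bytes
    protocol: int    # e.g. 6 = TCP, 17 = UDP
    src: str
    dst: str


def parse_ipv4_header(packet: bytes) -> IPv4Header:
    """Extract a few fixed IPv4 header fields from a raw L3 packet."""
    if len(packet) < 20:
        raise ValueError("packet shorter than the minimum IPv4 header")
    version_ihl = packet[0]
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (version_ihl & 0x0F) * 4
    protocol = packet[9]
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    return IPv4Header(header_len, protocol, src, dst)


# Build a bare 20-byte header (checksum left at zero) for a TCP packet.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 64, 6, 0,
    socket.inet_aton("10.0.0.2"),
    socket.inet_aton("192.0.2.1"),
)
header = parse_ipv4_header(sample)
```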
I don't have personal experience with it, but it seems worthwhile to look into LEAP:
https://leap.se/ https://docs.leap.se/
Our primary product is LEAP VPN, an open source white label VPN designed for ease of use and utility within censored environments. LEAP VPN is the shared code base for RiseupVPN, CalyxVPN, and Bitmask.
As of today, LEAP offers building blocks for provisioning a secure VPN service. As an example, RiseupVPN is deployed using the LEAP VPN Platform, and the clients are built from the leap-android and leap-desktop codebase.
Developing a VPN that works seamlessly across multiple platforms is definitely a pain. However, while app frameworks can offer a solution, they often restrict developers who may prefer different implementation strategies. A more flexible and powerful approach is a cross-platform VPN API. Such an API should provide a uniform interface across all platforms and be extensible enough to incorporate new strategies without requiring changes to the core API itself.
This is precisely the goal of the Outline SDK. By extracting and modularizing the core logic from the Outline VPN, we've created a reusable and composable software development kit. Our development has been bottom-up, focusing on creating APIs that allow developers to combine and reuse various strategies.
Moving up towards a VPN API, we've also created a library for network-level functionality. This library aids in the implementation of TUN devices and tunneling using a transport-to-IP ("tun2socks") approach.
Having a cross-platform API has been on our wishlist. The way we intend to approach that is by cleaning up our own internal API enough that it can be extracted and reused. But this takes a lot of work that takes away from other features (for example, our successful Mobile Proxy), and we would like to see a future where apps can embed circumvention so that VPNs are not needed as much.
A higher-level VPN API will dramatically simplify the development process, allowing tool developers to build their applications in any way they choose.
I don't have personal experience with it, but it seems worthwhile to look into LEAP:
https://leap.se/ https://docs.leap.se/
Do you know if any of the LEAP contributors are active here? I was unable to connect to them on Matrix - it seems to require an email address at systemli.org which is invite only.
…now you have two problems.
Wow, I hadn't taken a close look at Outline before. What an exciting project!
I guess in an ideal world we would have both a "VPN API" (like Outline) for developers who want more flexibility, and an "app framework" for those who just want a simple "plug-and-play" solution. But you're right, it would require a lot of time and effort that might be better spent on other things.
I'm not very optimistic "a future without VPNs" is achievable. Countries like China can unilaterally "veto" a new protocol by simply blocking its use in their market, as happened with ESNI. And VPNs will always be a better tool against strict censorship, since they make it possible to implement "system-wide" circumvention strategies, while a per-connection SDK will only be able to implement strategies designed around a single connection.
In many countries you can simply use encrypted DNS, perhaps combined with TCP split or TLS record fragmentation, to bypass interference. That's a much better experience for the user in terms of performance and costs, and it frees up capacity for the more challenging situations. So people in places like China, Iran or Russia still benefit from "proxyless" being deployed.
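For readers unfamiliar with TLS record fragmentation, the general idea is sketched below in Python: one TLS record (5-byte header of content type, version, and length, followed by the payload) is re-framed as two records carrying the same bytes. This is a generic illustration and not taken from any particular tool; fragment_tls_record is a made-up name, and in practice the split point would be chosen so that it falls inside the part of the ClientHello a censor is matching on.

```python
import struct


def fragment_tls_record(record: bytes, split_at: int) -> bytes:
    """Re-frame a single TLS record as two records with the same payload."""
    if len(record) < 5:
        raise ValueError("record shorter than the 5-byte TLS record header")
    content_type, version, length = struct.unpack("!BHH", record[:5])
    payload = record[5:]
    if length != len(payload):
        raise ValueError("length field does not match the payload size")
    if not 0 < split_at < length:
        raise ValueError("split_at must fall strictly inside the payload")
    first, second = payload[:split_at], payload[split_at:]
    # Each fragment gets its own header with the original type and version.
    return (struct.pack("!BHH", content_type, version, len(first)) + first
            + struct.pack("!BHH", content_type, version, len(second)) + second)


# A dummy handshake record (type 22, version 0x0301) with a 4-byte payload.
original = bytes([22, 3, 1, 0, 4]) + b"abcd"
fragmented = fragment_tls_record(original, 1)
```

The receiving TLS stack reassembles the handshake message from both records, while a middlebox that inspects only the first record sees an incomplete message.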
You can capture all the system traffic using the VPN APIs, even if you don't use a VPN server. We do that in the Intra app.
I didn't look too closely but it seems Intra is using the standard Android VpnService framework to capture all system traffic? (https://github.com/Jigsaw-Code/Intra/blob/master/Android/app/src/main/java/app/intra/sys/IntraVpnService.java)
My point is that implementing a system-wide circumvention strategy (like Intra) ultimately requires the user to install a dedicated VPN client app (like Intra). It's not something that can be done by individual apps/connections.
@wallpunch it absolutely can be implemented by any app. It will effectively turn them into a VPN app, which in turn requires user permission to install the VPN profile.
@fortuna
In many countries you can simply use encrypted DNS, perhaps combined with TCP split or TLS record fragmentation, to bypass interference. That's a much better experience for the user in terms of performance and costs, and it frees up capacity for the more challenging situations. So people in places like China, Iran or Russia still benefit from "proxyless" being deployed.
<script src='https://www.google.com/recaptcha/api.js' async defer nonce="x"></script> efficiently blocks users from China.
we would like to see a future where apps can embed circumvention so that VPNs are not needed as much.
There's a video recording of this talk from SplinterCon 2025 here: The Future of VPN is no VPN.
Do you know if any of the LEAP contributors are active here? I was unable to connect to them on Matrix - it seems to require an email address at systemli.org which is invite only.
I think you would only need a systemli.org email address if you want to create an account at the systemli matrix homeserver. It should be possible to join the room with an account at a different homeserver.
Otherwise, info at leap se should be a reachable email address.
I am building a VPN that is easy for everyone to build and share with people around them. The future goal is to decentralize it: https://github.com/projectshofar
@HaradaKashiwa how does that compare to Outline?
Outline runs in Windows and only supports ss.
Outline runs on all platforms and it has a more resilient Shadowsocks implementation that continues to work reliably in many places (implementation matters).
You can also use Websockets. This covers both bases: don't look like something blocked, and look like something allowed.
I'm talking about Outline Manager. The purpose of my project is to make it easier to build a server, not to connect to one.