Teach Board Server to work with Access Proxies
One benefit of running boards in the browser is that it fits neatly into the Web fetch model when using Access Proxies: the credentials are passed along with the fetch, and the Access Proxy returns the right result. When running the same board on the server, though, the situation becomes murky: the server does not run with the user's credentials, so accessing documents behind an Access Proxy fails.
This issue tracks the fix. Somehow, we need to allow the board server to successfully and safely access boards behind Access Proxies.
So there are two scenarios here, and we may want to address them separately.
In both cases, there is a board that contains another board. The embedded board is referenced in the BGL by URL. At runtime, the board's contents are fetched via HTTP. In the Visual Editor, this request includes the current user's credentials, so the runtime has access to any endpoint that the user has access to (mostly).
Scenario 1: Headless execution on a board server. When a board is executed on a board server, calls for embedded boards go out with no credentials. If the board server call passes through any layer that requires them, the call will fail. Note that this can happen even if the embedded board is on the same server. The Board Server makes an HTTP call to itself. This call can fail if it passes through a proxy layer that rejects it.
Scenario 2: Fetching a board for offline execution. This is the case when the Breadboard runtime is embedded in an application, and it stores the BGL locally, either embedded into the code, or in a data storage layer like a DB. If that content is fetched from a board server using a GET /boards call, the BGL returned is shallow. If there are any embedded boards, they will be referenced by URL in the BGL.
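For the same-server case in Scenario 1, one possible way to avoid the self-directed HTTP call is to resolve board URLs that point back at the server directly from storage, and only go over the network for everything else. The following is a minimal sketch under stated assumptions: `Graph`, `BoardStore`, `BoardLoader`, and `createLoader` are hypothetical names, not the actual Breadboard or board server API, and the store is assumed to be keyed by URL path.

```typescript
// Hypothetical, simplified types; not the actual Breadboard or board server API.
type Graph = { title?: string; nodes: unknown[]; edges: unknown[] };

interface BoardStore {
  // Assumed to return the stored BGL for a URL path, or undefined if absent.
  get(path: string): Promise<Graph | undefined>;
}

type BoardLoader = (url: string) => Promise<Graph>;

// Builds a loader that serves same-origin board URLs from the store and
// delegates every other URL to `remoteFetch`.
function createLoader(
  serverOrigin: string,
  store: BoardStore,
  remoteFetch: BoardLoader
): BoardLoader {
  const ownOrigin = new URL(serverOrigin).origin;
  return async (url) => {
    // Relative references are resolved against the server's own origin.
    const parsed = new URL(url, serverOrigin);
    if (parsed.origin === ownOrigin) {
      // The board lives on this server: skip HTTP (and any proxy layer).
      const local = await store.get(parsed.pathname);
      if (!local) {
        throw new Error(`Board not found in local store: ${parsed.pathname}`);
      }
      return local;
    }
    // Remote boards still go over the network and may still be rejected.
    return remoteFetch(parsed.href);
  };
}
```

Because the loader sits on top of the store interface, a sketch like this would not need to know which storage implementation is behind it. It does nothing for remote boards behind an Access Proxy.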
One solution that may work for both "Scenario 1" with a local board, and "Scenario 2":
Return a deep copy of the board. For every embedded board in the BGL file, fetch it and embed it into the returned BGL file, so that the returned board has no remote dependencies.
This approach can still fail if there is a remote board that is inaccessible to the board server, but it's a first step that solves a number of current problems with headless execution of boards.
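The deep-copy idea could be sketched roughly as follows. This assumes a simplified BGL shape in which an embedded board appears as a URL string under a `$board` key in a node's configuration; real BGL may reference boards in other ways. `Bgl`, `BglNode`, `Loader`, and `inlineBoards` are hypothetical names, and `load` stands in for whatever fetch mechanism the board server uses.

```typescript
// Simplified, hypothetical BGL shape (an assumption for illustration only).
type BglNode = {
  id: string;
  type: string;
  configuration?: Record<string, unknown>;
};
type Bgl = { title?: string; nodes: BglNode[]; edges: unknown[] };
type Loader = (url: string) => Promise<Bgl>;

// Returns a copy of `board` in which every `$board` URL reference has been
// replaced by the fetched board itself, recursively. `baseUrl` must be an
// absolute URL; relative references are resolved against it.
async function inlineBoards(
  board: Bgl,
  baseUrl: string,
  load: Loader,
  ancestors: ReadonlySet<string> = new Set([new URL(baseUrl).href])
): Promise<Bgl> {
  const nodes = await Promise.all(
    board.nodes.map(async (node): Promise<BglNode> => {
      const ref = node.configuration?.["$board"];
      if (typeof ref !== "string") return node; // nothing to inline
      const url = new URL(ref, baseUrl).href;
      if (ancestors.has(url)) {
        throw new Error(`Circular board reference: ${url}`);
      }
      // `load` rejects if the board is inaccessible; the error propagates.
      const fetched = await load(url);
      const child = await inlineBoards(
        fetched,
        url,
        load,
        new Set(ancestors).add(url)
      );
      return { ...node, configuration: { ...node.configuration, $board: child } };
    })
  );
  // The input board is left untouched; only the returned copy is inlined.
  return { ...board, nodes };
}
```

In this sketch nothing is persisted: the inlined copy is built on demand for each request, so a board referenced by a moving URL is re-fetched every time. An inaccessible remote board makes the whole call fail, which matches the limitation noted above.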
@wfaithfull FYI because this might affect the SQLite provider.
It shouldn't, though. Ideally anything we build for this should be built on top of the storage layer, so that the individual storage implementations don't need to be affected. We'll just need to make sure we've got our abstractions properly in order.
Graph inspector docs: https://breadboard-ai.github.io/breadboard/docs/inspector/graph/
This is like "inlining" in compiler parlance. I assume there wouldn't be any persistence associated with this operation; it would happen on demand. I'm imagining some folks referencing board URLs from the head of a git branch rather than by commit, expecting to always pull in the latest. Dynamic dependencies are definitely not to my personal taste, but they are something people might expect to be supported, or might already rely on.
Yes. Ooh, I love a nice storage abstraction 😌. Based on my reading so far, I don't see why it would touch the storage implementations. Can you elaborate on what you were thinking might affect the SQLite provider?
Only because our abstractions might not be fully in order just yet. But the idea is that this logic would never touch the individual storage layers.
Also, to follow up on some internal discussions that we've been having, we've decided that we're not going to do this right away, so if you're doing something else (e.g. the ExpressJS migration), don't worry about conflicting with this for now. I'll contact you before I start changing anything.
With this commit can we call this issue closed?
I don't think there's much to do with "Scenario 2" as described above, since we don't directly support such usage (for now). I think the primary motivation for this issue was to address the problem of using boards that are resident on the local board server.
Are we also interested in addressing the issue of remote board servers behind access proxies?
SGTM!