"Unable to establish sidebar-host communication channel"
We are seeing a large number of Sentry reports of errors setting up the sidebar-host channel using the new PortProvider/PortFinder infrastructure.
The error here is a timeout in the sidebar while it waits for PortProvider to fulfil a request for a host <-> sidebar MessageChannel.
Possible causes that we have identified:
- The user is visiting a site through a proxy which rewrites iframe URLs, making the origin of the sidebar app different from what PortProvider expects when trying to send messages to it. This can be identified in reports by the report's source URL being different than https://hypothes.is/app.html
- There is a mismatch between the version of the client used in the annotator and the version used in the sidebar. If the version used on the annotator side is from before we shipped PortProvider/PortFinder, then the sidebar-side client will fail to connect.
- To help debug this we added a version field to the config metadata that the annotator passes to the sidebar. As a result we saw some reports where this may have been the problem, but others where it definitely wasn't.
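As a rough illustration of the first cause, a report could be classified by comparing its source URL with the expected sidebar URL. This is only a sketch: `isProxiedReport` is a hypothetical helper, not something in the client or in Sentry, and it assumes that only the origin and path need to match.

```javascript
// Expected location of the sidebar app, as mentioned in the list above.
const EXPECTED_SIDEBAR_URL = 'https://hypothes.is/app.html';

/**
 * Return true if a report's source URL suggests the sidebar was served
 * through something that rewrote the iframe URL (hypothetical helper).
 */
function isProxiedReport(sourceUrl) {
  let actual;
  try {
    actual = new URL(sourceUrl);
  } catch {
    // An unparseable URL certainly did not come from the real app.
    return true;
  }
  const expected = new URL(EXPECTED_SIDEBAR_URL);
  // The query string and fragment are ignored; the sidebar URL carries
  // its config in the fragment.
  return (
    actual.origin !== expected.origin || actual.pathname !== expected.pathname
  );
}

console.log(isProxiedReport('https://hypothes.is/app.html#config=abc'));
console.log(isProxiedReport('https://proxy.example.com/https://hypothes.is/app.html'));
```

The second URL is a made-up example of what a rewriting proxy might produce.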
Regarding the mismatch of versions in annotator and sidebar, I would suggest to remove any caching from cloudflare for https://cdn.hypothes.is/hypothesis. This is a very minimal script (8K minimised), but very critical to load the latest version of the client in the host frame.
curl -I https://cdn.hypothes.is/hypothesis
HTTP/2 200
date: Thu, 02 Dec 2021 10:55:54 GMT
content-type: application/javascript; charset=UTF-8
content-length: 5753
x-amz-id-2: +7hmqcEcFoBNmVu2QLu63ymRSl/O94pE6Xm4wXZOHHx9HKAbLrmMXtY0JIxDmrKhCX4ifK+hGG0=
x-amz-request-id: X1VPE901AGQG5F1D
last-modified: Wed, 01 Dec 2021 13:30:34 GMT
etag: "e5ff6ba19940a34b515700aab155b90a"
cache-control: public, max-age=1800, must-revalidate
cf-cache-status: HIT
age: 616
accept-ranges: bytes
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
strict-transport-security: max-age=15552000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 6b74068f6986c26d-FRA
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400
Currently, there are several max-age values in the response. I don't know which one the browser and Cloudflare use.
> Regarding the mismatch of versions in annotator and sidebar, I would suggest to remove any caching from cloudflare for https://cdn.hypothes.is/hypothesis. This is a very minimal script (8K minimised), but very critical to load the latest version of the client in the host frame.
The max-age value from cache-control is the one that matters. This is currently 1800s, or 30 minutes. The other max-age references relate to other, security-related aspects of the transfer.
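To make the arithmetic concrete, here is a small sketch that pulls the max-age directive out of a cache-control value like the one in the curl output above. `maxAgeSeconds` is a hypothetical helper written for illustration, and it ignores edge cases such as quoted values.

```javascript
/**
 * Return the max-age of a Cache-Control header value in seconds,
 * or null if the directive is absent or malformed.
 */
function maxAgeSeconds(cacheControl) {
  for (const directive of cacheControl.split(',')) {
    const [name, value] = directive.trim().split('=');
    if (name.toLowerCase() === 'max-age') {
      const parsed = Number.parseInt(value, 10);
      return Number.isNaN(parsed) ? null : parsed;
    }
  }
  return null;
}

// Header value copied from the response for https://cdn.hypothes.is/hypothesis.
const bootScriptMaxAge = maxAgeSeconds('public, max-age=1800, must-revalidate');
console.log(bootScriptMaxAge / 60); // 30 (minutes)
```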
This script is used by both the annotator and the sidebar. One of the original reasons for caching it was so that the very-recently-fetched resource used when initializing the annotator would also be used when loading the sidebar, avoiding a network roundtrip. However this was before browsers introduced partitioning of the HTTP cache, so we'd have to re-check whether this is ever effective or not. 30 minutes is much longer than needed for this though. I think it might be a Cloudflare default.
Testing in Chrome 98 with an empty cache, loading the bookmarklet on https://example.com, it seems like the copy of https://cdn.hypothes.is/hypothesis fetched in the host page is not reused when the sidebar app loads. Chrome reports the latency of the fetch as 50ms.
In Firefox 94, the copy of the boot script fetched in the host page is used when the sidebar app loads. In Safari 15 it seems the cache is partitioned by top domain.
The plots in Cloudflare seem to confirm that the cache is working correctly:
I have a question about https://hypothes.is/embed.js, which caches the temporary redirect for 4 hours (max-age=14400):
% curl -I https://hypothes.is/embed.js
HTTP/2 302
date: Thu, 02 Dec 2021 13:03:49 GMT
content-type: text/html; charset=UTF-8
content-length: 202
location: https://cdn.hypothes.is/hypothesis
content-security-policy: font-src 'self' fonts.gstatic.com cdn.hypothes.is; script-src 'self' cdn.hypothes.is www.google-analytics.com; style-src 'self' fonts.googleapis.com cdn.hypothes.is 'unsafe-inline'
expires: Thu, 02 Dec 2021 13:04:58 GMT
cache-control: public, max-age=14400
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
x-xss-protection: 1; mode=block
vary: Cookie
cf-cache-status: HIT
age: 231
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
strict-transport-security: max-age=15552000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 6b74c1ebcdbe5c9e-FRA
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400
I could see that on sites that embed Hypothesis (and use https://hypothes.is/embed.js), the browser could get https://cdn.hypothes.is/hypothesis once but not try again until the cached redirect expires.
Am I right, here?
I have never seen a mismatch of these two values:
I have only observed this type of mismatch:
To me this indicates that the only mismatches are between a very old version of the client in the host and a new version of the sidebar.
One annoying problem with this issue is poor grouping of issues in Sentry:
1. Issues are not grouped across releases. The absolute URL of the bundles contains the client version (eg. https://cdn.hypothes.is/hypothesis/<version>/build/scripts/sidebar.bundle.js) and Sentry resolves the stack frames to modules with paths like hypothesis/<version>/src/path/to/module.js. Since <version> is different across releases, the issues don't group.
2. Stack frames differ across browsers. As an example this Safari issue and this Chrome issue should be grouped, but the context line is different. In Chrome the context line points to the right place. In Safari it does not. This Firefox issue should also be grouped together, but it also has a stack frame that points to the wrong place.
Regarding (2), Sentry runs a sourcemap validation service which is reporting a ton of errors: https://sourcemaps.io/report/1638532141226_https%3A%2F%2Fcdn.hypothes.is%2Fhypothesis%2F1.929.0%2Fbuild%2Fscripts%2Fsidebar.bundle.js.
The errors start with references to the @hypothesis/frontend-shared package, so the investigation should probably start there.
Regarding (1), it may help if we upload the sourcemaps ourselves as part of the release process using the Sentry CLI.
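To illustrate why (1) breaks grouping, here is a sketch of what a version-independent module path would look like. This is not Sentry's API or anything in our release process. `normalizeModulePath` and the example module path are hypothetical, and the sketch assumes versions always look like 1.929.0.

```javascript
/**
 * Replace the client version segment in a resolved module path with a
 * fixed placeholder, so the same frame compares equal across releases.
 */
function normalizeModulePath(path) {
  // Matches e.g. "hypothesis/1.929.0/" and rewrites only the version segment.
  return path.replace(/hypothesis\/\d+\.\d+\.\d+\//, 'hypothesis/<version>/');
}

const frameInOldRelease = 'hypothesis/1.929.0/src/sidebar/index.js';
const frameInNewRelease = 'hypothesis/1.930.0/src/sidebar/index.js';

// The raw paths differ, which is why the issues do not group.
console.log(frameInOldRelease === frameInNewRelease);
// After normalization the two frames are identical.
console.log(
  normalizeModulePath(frameInOldRelease) === normalizeModulePath(frameInNewRelease)
);
```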
Looking at Sentry bug reports this morning I saw an issue in Firefox when annotating a file:// URL.
Testing locally I was not able to reproduce with Firefox's default settings, but I was able to reproduce after turning off the privacy.file_unique_origin setting in about:config:
- Go to about:config and disable the privacy.file_unique_origin option (re-enable this afterwards)
- Go to https://example.com and activate the client
- Save the page using File => Save Page As... => Select "Web archive, complete" in the options => Save. This will save an HTML file and associated folder to disk.
- Open the saved HTML file in Firefox.
This results in the following errors:
[Image: Firefox file URL errors]
I tried saving a web page locally in Safari. Using the "Web Archive" mode, the client loaded successfully when opening the archived page. When using the "Page source" mode, the client did not appear in the saved page. I think it just saved the downloaded HTML from the origin.
I tried visiting https://chem.libretexts.org/Bookshelves/General_Chemistry/Map%3A_Chemistry_-The_Central_Science(Brown_et_al.)/07._Periodic_Properties_of_the_Elements/7.3%3A_Sizes_of_Atoms_and_Ions and saving the page using the "Page source" mode. When opening the saved HTML I saw a "The string did not match the expected pattern" error from us, along with an error about guest-sidebar communication channel setup. I didn't see an error about host-sidebar communication though.
I also tried a similar test in Chrome using its three different page-saving modes.
- With HTML-only, the client is not present on the page and there are no errors
- With "Web page, complete", the client fails to load due to the same CORS errors that Firefox reports when privacy.file_unique_origin is enabled. Bypassing that error by launching Chrome with the --allow-file-access-from-files command-line flag got a little further, but it then ran into a "Strict MIME type checking is enforced for module scripts per HTML spec" error and failed to load the client. I haven't yet found a way to turn this off.
I tried a variation on the above where I saved the page as a file and then loaded it by running an HTTP server locally. In this case the client actually loaded successfully. Chrome does not preserve shadow DOM when saving the file, so the sidebar iframe did not appear in the file.
In summary, I was not yet able to reproduce an error in Safari or Chrome when saving HTML to a file and then opening a file.
I encountered this error sporadically on https://bjoc-nl.github.io in Safari 15.1:
[Image: Sidebar-host error]
An unusual thing I found about the above page is the location of the <hypothesis-sidebar> element. Instead of being a child of the <body> element it is a child of another element.
[Image: hypothesis-sidebar-element pos]
In testing locally I found that moving the <hypothesis-sidebar> element around in the DOM by dragging and dropping in Chrome's devtools produces a similar error. I haven't yet found what code on the page is doing this.
This page which appeared in the Sentry logs has a crap ton of JavaScript on it, freezing my browser for a couple of seconds when the page loads normally (with no throttling): https://www.accessengineeringlibrary.com/content/book/9780071835091/toc-chapter/chapter1/section/section5
If I load the page in Chrome with 6x CPU throttling enabled in the Performance tab, I can trigger PortFinder's timeout for connecting to the host page.
To expand on the DOM tree change issue above: if an iframe gets moved in the DOM tree, either directly or as a result of an ancestor being moved, then it gets reloaded.
If the <hypothesis-sidebar> or <hypothesis-notebook> elements get moved in the DOM, the corresponding apps get reloaded. In the case of the sidebar, we don't support it connecting to the host multiple times, so its requests to establish a sidebar <-> host connection fail.
Additionally, when an iframe is moved in the DOM, the identity of its contentWindow property changes. This means that when PortProvider receives a second port request from the reloaded sidebar and checks for it in its _sentChannels map, it doesn't find it and attempts to re-send the port. This fails since a port can only be transferred once.
To demonstrate this run the following in the browser console on a page with Hypothesis embedded:
var ifr = document.querySelector('hypothesis-sidebar').shadowRoot.querySelector('iframe');
var map = new Map();
map.set(ifr.contentWindow, 'test');
map.get(ifr.contentWindow); // Logs `"test"`
document.body.append(ifr); // Triggers reload of sidebar
map.get(ifr.contentWindow); // Logs `undefined`
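Building on the demonstration above, here is a sketch of how a second request from a reloaded frame could be told apart from a plain duplicate. It is not PortProvider's actual code. `PortRequestTracker` is hypothetical, it keys on the channel name rather than on the window, and plain objects stand in for contentWindow so it runs outside a browser.

```javascript
/**
 * Remembers which window each channel's port was sent to, so that a later
 * request for the same channel can be classified.
 */
class PortRequestTracker {
  constructor() {
    // Channel name -> window object that received the port.
    this._sentTo = new Map();
  }

  /**
   * Returns 'send' for a first request, 'duplicate' for a repeat from the
   * same window, and 'reloaded' for a repeat from a different window.
   */
  classify(channel, sourceWindow) {
    const previous = this._sentTo.get(channel);
    if (previous === undefined) {
      this._sentTo.set(channel, sourceWindow);
      return 'send';
    }
    return previous === sourceWindow ? 'duplicate' : 'reloaded';
  }
}

// Stand-ins for the iframe's contentWindow before and after it is moved.
const windowBeforeMove = {};
const windowAfterMove = {};

const tracker = new PortRequestTracker();
console.log(tracker.classify('sidebar-host', windowBeforeMove));
console.log(tracker.classify('sidebar-host', windowBeforeMove));
console.log(tracker.classify('sidebar-host', windowAfterMove));
```

In the 'reloaded' case the host could log a warning for the page author instead of trying to transfer the already-transferred port.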
This is a summary of what we've learned so far about the reasons sidebar-host communication setup can fail.
1. The sidebar times out waiting for the host page to process the request. Update 2022-01-04: The timeout was doubled, but we are still seeing this error.
If the host page has a very large amount of JavaScript on it and the device is slow, it is possible for the host's response to the sidebar's postMessage request to take longer than the current 10 second threshold.
Fix: I think we're just going to need a bigger timeout here, and/or to downgrade the current timeout to a warning.
2. The sidebar app iframe is moved within the DOM. Update 2022-01-04: This issue is still not handled. The dumbest fix here would be to detect when this happened (this can be easily done in PortProvider) and log a warning for the benefit of the web page author, but don't report an error to Sentry. Issue extracted into https://github.com/hypothesis/client/issues/4095.
If the sidebar app iframe or any of its ancestors is moved within the DOM, the iframe will be reloaded and the identity of the iframe's contentWindow will change. When the reloaded sidebar requests its port, the transfer will fail because PortProvider attempts to re-send a port that was already transferred to the previous instance of the app.
Fix: Making a reload of the sidebar work would involve changes to many parts of the app. All of the connections from the sidebar to other parts of the app would need to be recreated for example. I propose that in this case we just fail in a more obvious way and don't send reports to Sentry.
3. The sidebar app iframe is proxied through a service which changes the iframe's origin. Update 2022-01-04: We fixed this by configuring Sentry to only accept client error reports from https://hypothes.is. See "Allowed domains" setting in https://sentry.io/settings/hypothesis/projects/client/.
If the iframe is proxied in a way that changes its origin, an error will occur when PortProvider attempts to transfer the port because the iframe's origin doesn't match the expected origin, which comes from the boot script.
Fix: We don't intend to try and make the sidebar app work with these proxies, which often make many changes to the JS environment. The current plan is to just try and detect when this has happened and not send reports to Sentry.
4. There is a mismatch between the version of the annotator loaded in the host page and the sidebar. Update 2022-01-04: We check for this and stop sending Sentry reports if detected. See https://github.com/hypothesis/client/pull/4022. We also changed the caching configuration for the boot script in https://github.com/hypothesis/client/pull/4050.
This can happen if an old version of https://cdn.hypothes.is/hypothesis has been cached by the browser or some other service. It can also happen if the host page has been saved after the client has loaded and later re-loaded. When reloaded, the iframe is loaded afresh from https://hypothes.is/app.html but the host page will embed references to the specific version of the client that was originally used, either as references to eg. https://cdn.hypothes.is/hypothesis/{version}/{path} or as resources that were saved with the HTML page.
Fix: We can't fix all of the causes of this mismatch, but we can change cache settings on https://cdn.hypothes.is/hypothesis, which would help in the case that a mismatch occurs shortly after a new client release. The plan currently is:
- Change the caching headers on https://cdn.hypothes.is/hypothesis to prevent or shorten browser caching of this script while allowing longer caching by Cloudflare.
- Check for a version mismatch between the annotator and sidebar when the sidebar launches, and disable Sentry issue reporting in that case. We might also want to display a notice to the user that tells them about the potential issue.
5. Client is loaded inside a sandboxed iframe without the 'allow-same-origin' permission. Update 2022-01-04: We check for this and stop sending Sentry reports if detected. See https://github.com/hypothesis/client/pull/4022.
If the client is loaded inside a sandboxed iframe with the 'allow-scripts' permission but without 'allow-same-origin', then any iframes it creates will have the origin "null" rather than the expected origin. This currently results in "Ignored invalid port request for channel sidebar-host from null" errors being sent to Sentry.
diff --git a/dev-server/documents/html/multi-frames.mustache b/dev-server/documents/html/multi-frames.mustache
index 03e70ae8b..7f0434613 100644
--- a/dev-server/documents/html/multi-frames.mustache
+++ b/dev-server/documents/html/multi-frames.mustache
@@ -22,8 +22,6 @@
Selecting text, adding annotations, opening and closing the sidebar here
should not have any impact in the other frames.
</p>
- <iframe src="/document/burns"></iframe>
- <iframe src="/document/doyle"></iframe>
- {{{hypothesisScript}}}
+ <iframe sandbox="allow-scripts" src="/document/burns"></iframe>
</body>
</html>
Aside from this issue, the client currently doesn't work in a sandboxed iframe without the 'allow-same-origin' flag because the login popup window relies on passing an access token back to the opener using window.postMessage. In this call it sets the targetOrigin argument to the registered origin to ensure that the access token is only returned to a trusted party.
Summary of fixes made for issues identified in https://github.com/hypothesis/client/issues/3986#issuecomment-989685057:
1. The sidebar times out waiting for the host page to process the request.
The timeout was extended from 10 to 20 seconds. This appeared to reduce the number of reports significantly, though we still see 10-20 per hour. The longer timeout can still be exceeded for reasons outside our control, so we might need to ultimately downgrade this to a warning that is not sent to Sentry.
2. The sidebar app iframe is moved within the DOM.
This is not yet handled. I think we will need a solution similar to what we did for issues 3, 4 and 5.
3. The sidebar app iframe is proxied through a service which changes the iframe's origin.
4. There is a mismatch between the version of the annotator loaded in the host page and the sidebar.
5. Client is loaded inside a sandboxed iframe without the 'allow-same-origin' permission.
These situations are now checked for by the sidebar on startup in a checkEnvironment call. If detected, a warning is logged and Sentry error reporting is disabled.
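For readers who haven't looked at the PR, the kind of check described above could look roughly like this. It is a sketch only and not the real checkEnvironment. The function name, the parameter names and the problem labels are all made up for illustration.

```javascript
/**
 * Compare what the sidebar expects with what it actually observes and
 * return a list of detected problems (empty if everything matches).
 */
function checkEnvironmentSketch({
  expectedOrigin,
  actualOrigin,
  annotatorVersion,
  sidebarVersion,
}) {
  const problems = [];
  if (actualOrigin === 'null') {
    // Typical of a sandboxed iframe without 'allow-same-origin'.
    problems.push('opaque-origin');
  } else if (actualOrigin !== expectedOrigin) {
    // Typical of a proxy that rewrites the sidebar iframe URL.
    problems.push('origin-mismatch');
  }
  if (annotatorVersion !== sidebarVersion) {
    problems.push('version-mismatch');
  }
  return problems;
}

const problems = checkEnvironmentSketch({
  expectedOrigin: 'https://hypothes.is',
  actualOrigin: 'https://proxy.example.com',
  annotatorVersion: '1.929.0',
  sidebarVersion: '1.981.0',
});
// Error reporting would only stay enabled when no problems are found.
const reportErrors = problems.length === 0;
console.log(problems, reportErrors);
```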
Slack discussion about closing out this issue by detecting the two not-yet-fully-fixed situations from https://github.com/hypothesis/client/issues/3986#issuecomment-989685057 and downgrading them from errors to warnings which don't get reported to Sentry.
Another possibility that I came across which might explain this issue is the browser's back-forwards cache (bfcache). If the following can happen, it would cause this issue as well:
1. Hypothesis is loaded in a page
2. User navigates back/forwards. Page is cached in the bfcache.
3. User returns to original page from step 1.
4. Main frame is loaded from the bfcache, but sidebar/notebook iframes are loaded afresh
What I don't know, and haven't been able to confirm yet, is whether the bfcache treats a whole page, including all of its frames as a single unit, or whether it can selectively discard data for certain frames or decide not to cache them in the first place.
If this issue can happen, and if we fix https://github.com/hypothesis/client/issues/4095, that would solve this issue.
A useful debugging step in the interim would be to confirm if all the DataCloneError errors we are seeing are caused by the sidebar reloading, or if they are caused by a different issue. An interesting feature of these reports is that although we do get a similar error from Chrome, we get far more from Safari.
> What I don't know, and haven't been able to confirm yet, is whether the bfcache treats a whole page, including all of its frames as a single unit, or whether it can selectively discard data for certain frames or decide not to cache them in the first place.
I spent a bunch of time looking at the WebKit source this morning to try and figure this out. Everything I read appeared to indicate that a page and its frame tree are always cached, not cached, or removed from the cache ("pruned") as a unit (see BackForwardCache.cpp). I did find a couple of references to child frames becoming "detached" while in the cache, but I'm not sure what can cause that to happen.
If it turns out that any browser can choose to discard the sidebar iframe from the back forwards cache, or not cache it in the first place, yet still cache the parent frame, then I think we'll just have to add some way to handle such reloads. This would be covered by https://github.com/hypothesis/client/issues/4095.
@robertknight can we close this?
I'm having a strange problem here: I get
sidebar.bundle.js?b9d5d8:652 The Hypothesis sidebar is running in a different origin (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek) than expected (null). It may not work.
and then
Uncaught (in promise) Error: Unable to establish sidebar-host communication channel at sidebar.bundle.js?b9d5d8:599
on zhihu.com. Example page.
I'm aware that I run plenty of scripts, extensions and proxies, but after disabling them, the problem persists.
Then I remembered that I used to make Hypothesis work with the Disable Content-Security-Policy extension, but it's not working now.
Hope there's a workaround one day!
I wasn't able to reproduce the problem on this page, so it might be an issue related to a proxy or extension.
> I'm aware that I run plenty of scripts, extensions and proxies, but after disabling them, the problem persists.
Does the problem happen even if you install Hypothesis as the only extension in a blank new Chrome profile?
> sidebar.bundle.js?b9d5d8:652 The Hypothesis sidebar is running in a different origin (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek) than expected (null). It may not work.
The unexpected part of this message is the null. It should be the same as the "chrome-extension://" URL before that. Can you try the following to help me understand where this incorrect value is coming from:
- Activate the Hypothesis extension on your example page
- Open the browser's developer tools and switch to the Console tab
- Enter the commands below (colored lines only) and press enter after each. The output should look the same as the grey lines.
var config = JSON.parse(document.querySelector('.js-hypothesis-config').textContent)
// Should display "undefined"
config.sidebarAppUrl
// Should display 'chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html'
new URL(config.sidebarAppUrl).origin
// Should display 'chrome-extension://bjfhmglciegochdpefhhlphglcehbmek'
document.querySelector('hypothesis-sidebar').shadowRoot.querySelector('iframe').src
// Should display a long URL which, when copied and pasted, looks like 'chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html#config=%7B%22appType%22%3A%22bookmarklet%22%2C%22openSidebar%22%3Afalse%2C%22showHighlights%22%3A%22always%22%2C%22origin%22%3A%22chrome-extension%3A%2F%2Fbjfhmglciegochdpefhhlphglcehbmek%22%2C%22version%22%3A%221.981.0%22%2C%22hostURL%22%3A%22https%3A%2F%2Fwww.zhihu.com%2Fquestion%2F421664666%2Fanswer%2F2107850670%22%7D'
Hello, sorry for the late reply. I typed the commands and it showed the following:
var config = JSON.parse(document.querySelector('.js-hypothesis-config').textContent)
//undefined
config.sidebarAppUrl
//"chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html"
new URL(config.sidebarAppUrl).origin
//"null"
document.querySelector('hypothesis-sidebar').shadowRoot.querySelector('iframe').src
//"chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html#config=%7B%22openSidebar%22%3Afalse%2C%22showHighlights%22%3A%22always%22%2C%22origin%22%3A%22null%22%2C%22version%22%3A%221.981.0%22%2C%22hostURL%22%3A%22https%3A%2F%2Fwww.zhihu.com%2Fquestion%2F421664666%22%7D"
when I clicked the bookmarklet with a new user profile in Microsoft Edge, the console shows the following error:
//Refused to load the script 'https://hypothes.is/embed.js' because it violates the following Content Security Policy directive: "script-src 'self' blob: *.zhihu.com g.alicdn.com qzonestyle.gtimg.cn res.wx.qq.com open.mobile.qq.com 'unsafe-eval' unpkg.zhimg.com unicom.zhimg.com resource: captcha.gtimg.com captcha.guard.qcloud.com pagead2.googlesyndication.com cpro.baidustatic.com pos.baidu.com dup.baidustatic.com i.hao61.net jsapi.qq.com 'nonce-19ad9f4b-2c0a-492c-94ce-a1389308a18b' hm.baidu.com zz.bdstatic.com b.bdstatic.com imgcache.qq.com vs-cdn.tencent-cloud.com www.mangren.com www.yunmd.net zhihu.govwza.cn gw.alipayobjects.com ssl.captcha.qq.com t.captcha.qq.com cstaticdun.126.net c.dun.163.com ac.dun.163.com/ acstatic-dun.126.net local.adguard.org 'nonce-28c5426481a54c9e9d86cd681cd'". Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
It seems this could be worked around with the Disable CSP extension...
> new URL(config.sidebarAppUrl).origin
//"null"
Hmm. This is different than what I see in Chrome and Safari (using "chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html" as the value of config.sidebarAppUrl), although it does match the specification.
There might be some JavaScript code coming from an extension, or a difference in the browser, that is causing the issue.
What browser and version of it are you using?
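For context on the "null" result: under the URL specification, schemes outside the small set of "special" ones (http, https and so on) get an opaque origin, which serializes as "null". The sketch below shows a fallback that rebuilds a usable origin string from the scheme and host in that case. `originOf` is a hypothetical helper, not what the client does, and the rebuilt string should not be treated as a security boundary.

```javascript
/**
 * Return the origin of `url`. When the URL implementation reports an
 * opaque origin ("null") but the URL has a host, fall back to
 * "<scheme>//<host>".
 */
function originOf(url) {
  const parsed = new URL(url);
  if (parsed.origin !== 'null') {
    return parsed.origin;
  }
  // `protocol` includes the trailing colon, e.g. "chrome-extension:".
  return parsed.host ? `${parsed.protocol}//${parsed.host}` : 'null';
}

console.log(originOf('https://hypothes.is/app.html'));
console.log(
  originOf('chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/client/app.html')
);
```

With this fallback the result is the same whether or not the browser special-cases the chrome-extension: scheme.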
> when I clicked the bookmarklet with a new user profile in Microsoft Edge, the console shows the following error:
That is expected. The bookmarklet doesn't work on sites that use CSP. The browser extension is the main solution for this, although an extension that disables CSP could work for advanced users.
A trivial way to reproduce this issue in Chrome is to right-click in the sidebar and select "Reload Frame" from the context menu. Firefox has a similar option (This Frame => Reload Frame). Safari has a "Reload Page" option that reloads the whole tab, but not an action to reload one frame.
I found a scenario where this problem occurs: window.URL is overwritten.
For example, on this site: https://juejin.cn. When we override window.URL with window.webkitURL (window.URL = window.webkitURL), the plugin's sidebar opens properly.
That sounds like it is related to https://github.com/hypothesis/client/issues/4294.
> That sounds like it is related to #4294.
Oh, yes, you're right.
This issue has been "resolved" for the moment by detecting when it happens and showing the user a notice to reload the page if they want to annotate. See https://github.com/hypothesis/client/issues/4095#issuecomment-1731451602.