"Failed to load Nextcloud Office" error when opening any file with Collabora Online.

Open antlarr opened this issue 3 years ago • 17 comments

Describe the bug When opening an .ods or .odt file, the error message:

Document loading failed
Failed to load Nextcloud Office - please try again later

appears with a Close button.

To Reproduce Steps to reproduce the behavior:

  1. Click on any .ods or .odt file (existing or just created)
  2. The error message above appears

Expected behavior The file is opened and Collabora Online - Built-in CODE Server (ARM64) is started.

Client details:

  • OS: openSUSE Tumbleweed
  • Browser: Firefox and Chrome
  • Version: any
  • Device: desktop

Server details

Operating system: Raspbian 64 bit system (bullseye)

Web server: Apache 2.4.53

Database: Mariadb 10.5.15

PHP version: PHP 8.1.5

Nextcloud version: 24.0.0.12

Version of the richdocuments app 6.0.0

Version of Collabora Online Collabora Online - Built-in CODE Server (ARM64) 21.11.402

Note that I had this problem after upgrading to 22.2.7 and then to 23.0.4 (with php 8.0). After seeing it didn't work I tried upgrading to 24.0.0.12 and php 8.1 to see if that made any difference (it didn't).

Logs

Nextcloud log (data/nextcloud.log)

nextcloud.log

By the way, I tried replacing in richdocuments/lib/TokenManager.php (line 142):

$editGroups = array_filter(explode('|', $this->appConfig->getAppValue('edit_groups')));

with

$editGroups = array_filter(explode('|', $this->appConfig->getAppValue('edit_groups') or ''));

and the first error in the nextcloud.log file above seems to be fixed, but the second error is still logged (and the same error is shown in the browser)

antlarr avatar May 05 '22 19:05 antlarr

Same here! Any hint on how to solve it?

EricMaGo avatar May 30 '22 09:05 EricMaGo

Same here! Any hint on how to solve it?

I'm sorry but I don't have any hint. I haven't been able to use Collabora Online for the last month. During this time I've upgraded to Nextcloud 24.0.1 and richdocuments 6.1.0 but the problem is still there.

antlarr avatar May 30 '22 09:05 antlarr

Same error

  • richdocuments: 6.1.0
  • richdocumentscode: 21.11.402

ilsawa avatar Jun 20 '22 14:06 ilsawa

I'm having the same issue. Maybe related with https://github.com/CollaboraOnline/online/issues/4828

Today I tried three different official demo servers. But none of them worked.

  • Nextcloud 24.0.1
  • Nextcloud Office 6.1.0
  • collabora/code:21.11.5.0.1

Igortorrente avatar Jun 21 '22 21:06 Igortorrente

Same issue here as well. • richdocumentscode 22.5.401 • richdocuments 5.0.6 • Nextcloud 23.0.4 on a x86_64 VM

Chrome console has logs that indicate it may have to do with some https settings:

Refused to connect to 'https://nextcloud.xobotun.com:80/apps/richdocumentscode/proxy.php?status' because it violates the following Content Security Policy directive: "connect-src 'self'".

Refused to send form data to 'https://nextcloud.xobotun.com:80/apps/richdocumentscode/proxy.php?req=/browser/5f985be/cool.html?WOPISrc=https%3A%2F%2Fnextcloud.xobotun.com%2Findex.php%2Fapps%2Frichdocuments%2Fwopi%2Ffiles%2F186384_ocru8ytkzh2p&title=test.odt&lang=en&closebutton=1&revisionhistory=1' because it violates the following Content Security Policy directive: "form-action 'self' https://nextcloud.xobotun.com".

Refused to frame 'https://nextcloud.xobotun.com:80/' because it violates the following Content Security Policy directive: "frame-src 'self' https://nextcloud.xobotun.com".

On my setup it tries to connect to https://nextcloud.xobotun.com:80 rather than to https://nextcloud.xobotun.com, which runs on port 443. Port 80 offers a redirect, but the browser won't even allow the app to go there and receive it.
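
The three console errors above come down to one comparison: a browser treats scheme, host and port together as the origin, so https://nextcloud.xobotun.com:80 is not the same origin as https://nextcloud.xobotun.com (implicit port 443). A minimal Python sketch of that comparison follows. The page URL is a made-up example path, and the default-port table only covers http and https.

```python
from urllib.parse import urlsplit

# Default ports a browser assumes when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that is compared as the origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a, url_b):
    """True only if scheme, host and port all match."""
    return origin(url_a) == origin(url_b)


# Hypothetical page URL; the proxy URL is the one from the console error.
page = "https://nextcloud.xobotun.com/index.php/apps/files"
proxy = "https://nextcloud.xobotun.com:80/apps/richdocumentscode/proxy.php?status"

print(origin(page))
print(origin(proxy))
print(same_origin(page, proxy))
```

If the last line prints False for a given setup, a policy of "connect-src 'self'" will refuse the request regardless of what the server on that port would have answered.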

Also https://nextcloud.xobotun.com/index.php/settings/admin/logging has interesting lines like (IP is of the VM I run NextCloud on)

{"reqId":"93sw0T3HlNn7W0bcccoa","level":2,"time":"2022-07-26T20:21:03+00:00","remoteAddr":"192.168.1.10","user":"xobotun","app":"no app in context","method":"GET","url":"/index.php/core/preview?fileId=186384&c=0f08fc0a2770eb4865b3a1b7672b4852&x=250&y=250&forceIcon=0&a=0","message":"Host 192.168.1.33 was not connected to because it violates local access rules","userAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.167 Safari/537.36","version":"23.0.4.1","id":"62e04db4a8c2b"}

Will try to dig in further and locate some setting that may also help somebody else.

xobotun avatar Jul 26 '22 20:07 xobotun

With NC24, Nextcloud Office 6.2.0 and BuiltIn CODE 22.5.502, same "failed to load" error

Is there any setup working with built-in CODE ?

grosjo avatar Aug 28 '22 09:08 grosjo

Same issue with 24.04 docker. Unsure how to look at the office version or the builtin CODE version but they are what I am using.

I then went into settings for office, changed nothing, went back and the issue disappeared. Is it intermittent for everyone like me? Or are some of you having this as a continuous issue?

mokahless avatar Sep 11 '22 01:09 mokahless

I'm seeing the same today, but only on Chromium based browsers, Firefox works fine. Odd.

Moreover I see nothing in the Nextcloud logs on failed efforts. So I checked collabora logs, running:

journalctl -f -u coolwsd

then reloading in Chrome. Chrome shows me "Failed to load Nextcloud Office - please try again later" but nothing shows in the log being followed. Meanwhile Firefox can load the file, and Edge (also a Chromium-based browser) can't.

I venture to guess this is a recent evolution. Tricky to even diagnose.

I'm guessing it's a webserver config issue. We shall see.

bernd-wechner avatar Sep 21 '22 02:09 bernd-wechner

And our issue became clear, it was a webserver conf. Chrome reports:

Refused to display in a frame because it set multiple 'X-Frame-Options' headers with conflicting values ('ALLOW, SAMEORIGIN'). Falling back to 'deny'.

but as no webserver conf changed, I am guessing the latest Chrome updates are to blame for the changed behaviour. Either way, adjusting the webserver's config fixed it. For now I have just removed the SAMEORIGIN directive, but I have yet to work out where ALLOW comes from. Probably Collabora adds it so that it can open inside a Nextcloud frame when it's running on another host, and probably (as we run Nextcloud and Collabora on the same host) we don't need it, but hey.

bernd-wechner avatar Sep 21 '22 03:09 bernd-wechner

In my case it doesn't work in librewolf, chrome or firefox.

In any case, would you mind telling me where are you making those changes?

EricMaGo avatar Sep 21 '22 08:09 EricMaGo

I don't see any browser dependency. I tried several browsers on my computer and get the same error. I just tried to open a document on android via the Nextcloud app. I see a message: Nextcloud Office Connection... 0% Periodically, the percentage disappears and the opening attempt begins anew. If I restart NC, the document starts to open, but the problem repeats after a day or two of work.

  • Nextcloud 24.0.5
  • Nextcloud Office 6.2.0

ilsawa avatar Sep 21 '22 10:09 ilsawa

and seems not supported in NC25

grosjo avatar Sep 21 '22 10:09 grosjo

In any case, would you mind telling me where are you making those changes?

Not sure it matters, does it? The cause must be different given that the symptom is different. But the clue, as I said, was on the browser console:

Refused to display in a frame because it set multiple 'X-Frame-Options' headers with conflicting values ('ALLOW, SAMEORIGIN'). Falling back to 'deny'.

Conclusions:

  • Collabora is displayed inside a frame on the Nextcloud page.
  • My webserver is setting X-Frame-Options to ('ALLOW, SAMEORIGIN')
  • Chromium falls back to DENY
  • Mozilla presumably falls back to SAMEORIGIN or ALLOW (conservative or lax interpretation)
  • In our webserver we are setting SAMEORIGIN
  • I find no evidence of us setting ALLOW anywhere
  • Conclusions:
    • Collabora or Nextcloud is setting ALLOW
    • Something changed recently to cause this problem and it could have been:
      • A Chromium update to a very conservative fallback on such conflicts
      • A Nextcloud or Collabora update that added an ALLOW header setting (which we don't need as we serve Collabora from the same origin)
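
The check behind these conclusions can be sketched in a few lines of Python. This is an illustration only, under the assumption that DENY and SAMEORIGIN are the only valid values (ALLOW is not one of them, and ALLOW-FROM is treated as obsolete here). The function name is hypothetical.

```python
# Values accepted by this sketch; anything else is reported as invalid.
VALID_VALUES = {"DENY", "SAMEORIGIN"}


def audit_x_frame_options(header_values):
    """Report problems in the X-Frame-Options values seen on one response.

    header_values is a list of raw header values, one entry per header line.
    A single line may itself hold several comma-separated values.
    """
    tokens = []
    for raw in header_values:
        tokens.extend(t.strip().upper() for t in raw.split(",") if t.strip())

    problems = []
    invalid = sorted(set(tokens) - VALID_VALUES)
    if invalid:
        problems.append("invalid value(s): " + ", ".join(invalid))
    if len(set(tokens)) > 1:
        problems.append("conflicting values: " + ", ".join(sorted(set(tokens))))
    return problems


# The combination reported by Chrome in this thread.
print(audit_x_frame_options(["ALLOW", "SAMEORIGIN"]))
# A response carrying only the webserver's own header.
print(audit_x_frame_options(["SAMEORIGIN"]))
```

On a live server the input list could be taken from `response.headers.get_all("X-Frame-Options")` after a `urllib.request.urlopen()` call, which returns every occurrence of the header rather than just the first.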

The setting concerned at our end is a lighttpd configuration and appeared in this block:

    # Default security across the board
    setenv.add-response-header  = (
        "Strict-Transport-Security" => "max-age=63072000; includeSubdomains; ",
        "X-Frame-Options" => "SAMEORIGIN"
    )

which, to prove the point and test it now reads:

    # Default security across the board
    setenv.add-response-header  = (
        "Strict-Transport-Security" => "max-age=63072000; includeSubdomains; ",
        #"X-Frame-Options" => "SAMEORIGIN"
    )

Though technically I want to fix that to replace the "X-Frame-Options" header, not augment it, if the order of events permits that, i.e. if this is applied after Nextcloud or Collabora adds it (which I presume is the order of events, the webserver having the last say as it rests higher in the stack).

For now it's functional, and a low priority to tighten frame serving security again (though for a top security rating a website should restrict frame serving to SAMEORIGIN I believe - to prevent inadvertent hijacking). Though it looks like ALLOW is not even legitimate:

https://html.spec.whatwg.org/multipage/browsing-the-web.html#the-x-frame-options-header

As I said, low priority. Other things call. But Chromium-based browsers started behaving badly recently and this seems to have been the cause. The symptom was that on all Chromium-based browsers (including Chrome, Chromium, Edge, Opera ...) a page was shown with the OP message:

Document loading failed
Failed to load Nextcloud Office - please try again later

Commenting that line in my webserver conf, fixed it and it now opens in Chromium based browsers fine again.

bernd-wechner avatar Sep 21 '22 10:09 bernd-wechner

Same problem for me for several weeks/months now. Where can I find that 'webserver conf' file to try the patch?

SwedishChef avatar Sep 28 '22 12:09 SwedishChef

That depends on your web server and the OS it is running on. Both determine the config file's location and its internal syntax.

In my case, an Ubuntu server running lighttpd, the config file is '/etc/lighttpd/lighttpd.conf'.

But you are very probably running under an Apache or Nginx server (most of the world seems to) .

bernd-wechner avatar Sep 28 '22 12:09 bernd-wechner

In my case nextcloud is running in a docker container.

Using grep I found in etc/apache2/:

conf-available/security.conf:#Header set X-Frame-Options: "sameorigin"
conf-enabled/security.conf:#Header set X-Frame-Options: "sameorigin"

...so it is already commented!?

SwedishChef avatar Sep 28 '22 15:09 SwedishChef

It does indeed look commented. You need to identify the cause of your failure. See the start of my comment earlier - it's easiest to check the browser console. So if in the browser when opening an Office document in Nextcloud you get:

Document loading failed
Failed to load Nextcloud Office - please try again later

displayed instead of the document, then display the browser console (press F12 typically, and click the Console tab, then clear it and reload) and look in the trace printed there for error messages. If you're lucky, you will find one that indicates the cause.

It is only because I found this error message:

Refused to display in a frame because it set multiple 'X-Frame-Options' headers with conflicting values ('ALLOW, SAMEORIGIN'). Falling back to 'deny'.

that I fixed it the way I did. If you see a different message, you have a different problem. If you see this same one, then somehow these conflicting response headers are being generated by your NextCloud stack (Webserver, Nextcloud, Collabora and any reverse proxies on the way).

bernd-wechner avatar Sep 29 '22 00:09 bernd-wechner

And this issue has resurfaced today. Now, the only error I see on a Firefox console is:

Invalid X-Frame-Options header was found when loading “https://cloud.hogs.org.au/index.php/apps/richdocuments/index?fileId=x&requesttoken=y&path=z”: “ALLOW” is not a valid directive.

No conflict, just an illegal header set. ALLOW is not legal:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Frame-Options

(the Firefox error links to that page).

It loads fine in Chrome, though. So Firefox is suddenly being picky about an illegal X-Frame-Options header.

So I looked in /var and found:

/var/www/html/nextcloud/apps/calendar/lib/Controller/PublicViewController.php:		$response->addHeader('X-Frame-Options', 'ALLOW');
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/DocumentController.php:				$response->addHeader('X-Frame-Options', 'ALLOW');
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/DocumentController.php:				$response->addHeader('X-Frame-Options', 'ALLOW');
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/FederationController.php:		$response->setHeaders(['X-Frame-Options' => 'ALLOW']);
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/DirectViewController.php:					$response->addHeader('X-Frame-Options', 'ALLOW');
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/DirectViewController.php:				$response->addHeader('X-Frame-Options', 'ALLOW');
/var/www/html/nextcloud/apps/richdocuments/lib/Controller/DocumentTrait.php:		$response->addHeader('X-Frame-Options', 'ALLOW');

Sure enough, the richdocuments app appears to be setting this header (my web server sure as heck doesn't) and it is illegal, to wit a clear bug in richdocuments IMHO!

bernd-wechner avatar Oct 17 '22 11:10 bernd-wechner

I stand corrected. Not about it setting an illegal X-Frame-Options value, but about that causing a problem: the header is in fact silently ignored. Instead, it turns out I serve the same cloud under two domain names; on one it works now and on the other it does not. And the mode of failure is a timeout.

22:49:41.732 FAILED Office.vue:198
    loadingTimeout Office.vue:198

on the console. This was never the case, and now is. Aaargh. How flakey are these systems ... sorry for my exasperation. But years of flawless operations (under two domain names) and now in a mere month or two successive issues.

That error is preceded by this one:

22:49:29.127 could not load recommendation preview 
error { target: img, isTrusted: true, srcElement: img, eventPhase: 0, bubbles: false, cancelable: false, returnValue: true, defaultPrevented: false, composed: false, timeStamp: 16902, … }
  bubbles: false
  cancelBubble: false
  cancelable: false
  composed: false
  currentTarget: null
  defaultPrevented: false
  eventPhase: 0
  explicitOriginalTarget: <img src="/index.php/core/preview?fileId=609249&x=32&y=32">
  isTrusted: true
  originalTarget: <img src="/index.php/core/preview?fileId=609249&x=32&y=32">
  returnValue: true
  srcElement: <img src="/index.php/core/preview?fileId=609249&x=32&y=32">
  target: <img src="/index.php/core/preview?fileId=609249&x=32&y=32">
  timeStamp: 16902
  type: "error"
  <get isTrusted()>: function isTrusted()
  <prototype>: EventPrototype { composedPath: composedPath(), stopPropagation: stopPropagation(), stopImmediatePropagation: stopImmediatePropagation(), … }
RecommendedFile.vue:126
    onerror RecommendedFile.vue:126

which is probably indicative. And yet a puzzle as the target /index.php/core/preview?fileId=609249&x=32&y=32 loads on both domains just fine (displaying a thumbnail - albeit oddly while it appears to be exactly the same thumbnail, on the working domain it displays in the middle of the browser window, and on the non-working domain it loads at top-centred, but clicking in the window immediately it jumps to centre-centre - i.e. oddly different behaviour but no timeout or failure).

So I repeat the loads on each site comparing consoles, and it's not reproducible, the reproducible pattern after several tries is that this line precedes the FAIL:

23:04:30.743 PostMessageService.sendPostMessage loolframe {"MessageId":"Host_PostmessageReady","SendTime":1666008270743,"Values":{}} postMessage.tsx:57:10

and on the domain that works, it is followed by a response:

23:04:12.338 [document] editorInitListener: Received post message  
Object { msgId: "App_LoadingStatus", args: {…}, deprecated: false }
  args: Object { Status: "Frame_Ready", Features: {…} }
  deprecated: false
  msgId: "App_LoadingStatus"
  <prototype>: Object { … }
document.js:259:13

The timestamps suggest the response comes in under 1s (23:04:12.338 - 23:04:11.887) and the timeout happens in 10s (23:04:41.072 - 23:04:30.743)
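
The two intervals can be worked out from the console timestamps directly. A small Python sketch of that arithmetic follows, assuming both timestamps of each pair fall on the same day (the console prints no date):

```python
from datetime import datetime

# Browser console timestamps carry hours, minutes, seconds and milliseconds.
CONSOLE_FORMAT = "%H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Seconds between two console timestamps taken on the same day."""
    delta = datetime.strptime(end, CONSOLE_FORMAT) - datetime.strptime(start, CONSOLE_FORMAT)
    return delta.total_seconds()


# Working domain: post message sent, then answered.
print(elapsed_seconds("23:04:11.887", "23:04:12.338"))  # 0.451
# Failing domain: post message sent, then the loading timeout fires.
print(elapsed_seconds("23:04:30.743", "23:04:41.072"))  # 10.329
```

So the working domain answers in about 0.45 s, while the failing one gives up after roughly 10.3 s without ever getting a reply.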

So for some reason one domain, it seems, is not responding to this sendPostMessage, the target of which is loolframe.

I maintain: Aaaargh ... why oh why is this so complicated.

Nothing is in the Collabora Logs (when I load the Collabora Admin page).

bernd-wechner avatar Oct 17 '22 11:10 bernd-wechner

In case you have two different domains under which the Nextcloud server is reachable, did you configure those properly in the coolwsd.xml config as one alias group?

https://sdk.collaboraonline.com/docs/installation/Configuration.html#multihost-configuration

juliusknorr avatar Oct 17 '22 13:10 juliusknorr

Or is it with the built-in code server?

juliusknorr avatar Oct 17 '22 13:10 juliusknorr

Not using the built-in server, no, an install of Collabora is in use.

It's been working under both host names for years and only recently stopped. I tend to use it under one name (the working one); the other name is used by a club, under their domain name. I know it was fully functional only 20 days ago (see my note above): I had it running fine under the club domain name, tested it, and had club members report that it worked.

Nothing has intentionally changed in configs since then, but as I don't use the club URL much, it could have broken any time between those tests about 20 days ago and now. (I should point out that 20 days ago it did not work on either domain name, and removing a conflicting X-Frame-Options setter in my web server relieved that, as described in comments above.)

I just checked the coolwsd config and it contains something divergent from what is documented in your link and internally (anonymised):

<wopi allow="true" desc="Allow/deny wopi storage.">
	<max_file_size desc="Maximum document size in bytes to load. 0 for unlimited." type="uint">0</max_file_size>

	<locking desc="Locking settings">
		<refresh default="900" desc="How frequently we should re-acquire a lock with the storage server, in seconds (default 15 mins) or 0 for no refresh" type="int">900</refresh>
	</locking>

	<alias_groups desc="default mode is 'first' it allows only the first host when groups are not defined. set mode to 'groups' and define group to allow multiple host and its aliases" mode="first">
	    <!-- If you need to use multiple wopi hosts, please change the mode to "groups" and
            add the hosts below.  If one host is accessible under multiple ip addresses
            or names, add them as aliases. -->
	    <!--<group>
                <host desc="hostname to allow or deny." allow="true">scheme://hostname:port</host>
                <alias desc="regex pattern of aliasname">scheme://aliasname1:port</alias>
                <alias desc="regex pattern of aliasname">scheme://aliasname2:port</alias>
            </group>-->
           <!-- More "group"s possible here -->
       </alias_groups>

       <host allow="true" desc="Regex pattern of hostname to allow or deny.">(subdomain1|subdomain2.domain1.place</host>
       <host allow="true" desc="Regex pattern of hostname to allow or deny.">domain2.org.au</host>
       <host allow="true" desc="Regex pattern of hostname to allow or deny.">subdomain3.domain2.org.au</host>
</wopi>

While not as documented, it's worked like this for years. Interestingly it is the .org.au domain that is failing, not the .place domain. I have tested both from my LAN and from a remote site to confirm (I do some hairpin avoidance on my gateway for local servers, which is nice to take out of the equation, so I also test from a site I RDP to).

I will try using aliases, but am not hopeful as this is old config working ~20 days ago!

bernd-wechner avatar Oct 17 '22 20:10 bernd-wechner

So I tried this quickly:

<wopi allow="true" desc="Allow/deny wopi storage.">
    <max_file_size desc="Maximum document size in bytes to load. 0 for unlimited." type="uint">0</max_file_size>
    <locking desc="Locking settings">
        <refresh default="900" desc="How frequently we should re-acquire a lock with the storage server, in seconds (default 15 mins) or 0 for no refresh" type="int">900</refresh>
    </locking>
    <alias_groups desc="default mode is 'first' it allows only the first host when groups are not defined. set mode to 'groups' and define group to allow multiple host and its aliases" mode="group">
        <group>
          <host allow="true" desc="Regex pattern of hostname to allow or deny.">domain1.place</host>
          <alias desc="regex pattern of aliasname">domain2.org.au</alias>
        </group>
    </alias_groups>
</wopi>

in /etc/coolwsd/coolwsd.xml then sudo service coolwsd restart and alas the symptoms persist. domain1.place works and domain2.org.au does not.

bernd-wechner avatar Oct 17 '22 20:10 bernd-wechner

Not sure if this will fix it, but there is a typo, should be mode="groups" plural.

Raudius avatar Oct 17 '22 20:10 Raudius

Not sure if this will fix it, but there is a typo, should be mode="groups" plural.

Well spotted. Tried that, no luck. Also tried swapping the two domains, no change, and finally tried removing the working domain, again no change. This is evidence that what I'm doing is not impacting my service ... aaargh.

I am editing /etc/coolwsd/coolwsd.xml on the collabora server (double checked) and restarting the coolwsd service.

Wondering how coolwsd and loolwsd relate. And how my coolwsd is being configured if changes to /etc/coolwsd/coolwsd.xml are having no impact.

bernd-wechner avatar Oct 17 '22 21:10 bernd-wechner

Is there anything on your coolwsd logs?

sudo journalctl -u coolwsd.service

I would expect to see Requesting address is denied or something along those lines. If there is no such message then your .org.au requests aren't even reaching the Collabora server.

Raudius avatar Oct 17 '22 21:10 Raudius

I had this problem until today. Now it is fixed. The problem for me was that Collabora Online was trying to connect to the CODE Built-in Server app by its fully qualified domain name, which in my case resolved to the external IP address of my NAT router, which was unable to direct the traffic back whence it came.

I added the FQDN to the end of the 127.0.0.1 localhost line of /etc/hosts and the problem was resolved.
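
Whether a server is affected by this can be checked by asking the local resolver what the FQDN maps to on the Nextcloud host itself. A hedged Python sketch follows; the function names are made up for illustration, and `cloud.example.com` is a placeholder for the real FQDN.

```python
import ipaddress
import socket


def is_loopback(address):
    """True for 127.0.0.0/8 and ::1 (any IPv6 zone suffix is stripped first)."""
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


def resolves_to_loopback(hostname):
    """True if every address the local resolver returns for hostname is loopback.

    Uses the same lookup path as other local programs, so an /etc/hosts
    entry is honoured. Raises socket.gaierror if the name does not resolve.
    """
    infos = socket.getaddrinfo(hostname, None)
    addresses = {info[4][0] for info in infos}
    return all(is_loopback(address) for address in addresses)


if __name__ == "__main__":
    # Placeholder name: replace with the FQDN Nextcloud is served under.
    name = "cloud.example.com"
    try:
        print(name, "resolves to loopback:", resolves_to_loopback(name))
    except socket.gaierror as error:
        print(name, "did not resolve:", error)
```

If this reports False on the server and the router cannot hairpin traffic back inside, requests from the server to its own public name may never arrive, which would fit the fix described above.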

Brianetta avatar Oct 17 '22 21:10 Brianetta

I would expect to see Requesting address is denied or something along those lines. If there is no such message then your .org.au requests aren't even reaching the Collabora server.

This is a hot tip. And I see clues in there: there are messages about "Only allowed host is:" and "No acceptable WOPI hosts matching ...", and it doesn't match the .org.au domain.

No acceptable WOPI hosts found matching the target host [subdomain.domain.org.au] in config.| wsd/Storage.cpp:263

I'm out of time this morning and have to run, but as a next step I shall have to look at how to clear the log (so I know I'm looking at results of the latest test) or take time to run tests while noting activity times to compare against log times. Then I need to work out whether my coolwsd.xml file is being used or not (i.e. whether my experimental configs have any impact) and then see if I can get them right.
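
Instead of clearing the log, the entries written after a test started can be filtered out by timestamp. The following Python sketch does that for lines in the coolwsd log format; the function name is hypothetical and the sample lines are copied from the service output quoted in this thread.

```python
import re
from datetime import datetime

# coolwsd embeds its own timestamp in each message, e.g. 2022-10-18 09:28:43.135688
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}")


def lines_since(lines, since):
    """Keep only lines whose embedded timestamp is at or after `since`.

    `since` is written as 'YYYY-MM-DD HH:MM:SS'. Lines without an embedded
    timestamp are dropped, since they cannot be placed in time.
    """
    cutoff = datetime.strptime(since, "%Y-%m-%d %H:%M:%S")
    kept = []
    for line in lines:
        match = STAMP.search(line)
        if match and datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S.%f") >= cutoff:
            kept.append(line)
    return kept


sample = [
    "Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153566 2022-10-18 09:28:43.135688 +1100 [ coolwsd ] INF  WSD initialization complete: setting log-level to [warning] as configured.| wsd/COOLWSD.cpp:5286",
    "Oct 18 09:28:43 nephele coolwsd[153566]: Ready to accept connections on port 9980.",
    "Oct 18 09:29:43 nephele coolwsd[153566]: wsd-153566-153566 2022-10-18 09:29:43.981706 +1100 [ coolwsd ] WRN  Waking up dead poll thread [update], started: false, finished: false| net/Socket.hpp:725",
]

for line in lines_since(sample, "2022-10-18 09:29:00"):
    print(line)
```

journalctl also accepts a --since option, which gives a similar result without any scripting.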

As ever, puzzled as this was all working fine 20 days ago and am not aware of any config changes in the interim. Nevertheless need to diagnose and fix.

Up front, the puzzle is that I have:

<wopi allow="true" desc="Allow/deny wopi storage.">
    <max_file_size desc="Maximum document size in bytes to load. 0 for unlimited." type="uint">0</max_file_size>
    <locking desc="Locking settings">
        <refresh default="900" desc="How frequently we should re-acquire a lock with the storage server, in seconds (default 15 mins) or 0 for no refresh" type="int">900</refresh>
    </locking>
    <alias_groups desc="default mode is 'first' it allows only the first host when groups are not defined. set mode to 'groups' and define group to allow multiple host and its aliases" mode="group">
        <group>
          <host allow="true" desc="Regex pattern of hostname to allow or deny.">domain.org.au</host>
        </group>
    </alias_groups>
</wopi>

in coolwsd.xml and it still loads in my .place domain but not my .org.au domain, so I can safely conclude that the config did not register with coolwsd. That is the next thing to resolve (this evening).

bernd-wechner avatar Oct 17 '22 22:10 bernd-wechner

Back on the job briefly, here's coolwsd running:

● coolwsd.service - Collabora Online WebSocket Daemon
     Loaded: loaded (/lib/systemd/system/coolwsd.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-10-18 09:28:41 AEDT; 8h ago
   Main PID: 153566 (coolwsd)
      Tasks: 8 (limit: 18980)
     Memory: 58.1M
     CGroup: /system.slice/coolwsd.service
             ├─153566 /usr/bin/coolwsd --version --o:sys_template_path=/opt/cool/systemplate --o:child_root_path=/opt/cool/child-roots --o:file_server_root_path=/usr/share/coolwsd
             ├─153586 /usr/bin/coolforkit --systemplate=/opt/cool/systemplate --lotemplate=/opt/collaboraoffice --childroot=/opt/cool/child-roots/ --clientport=9980 --masterport=coolwsd-Sr4D8nO8 --rlimits=limit_virt_mem_mb:0;limit_stack_mem_kb:8000;limit_file_size_mb:0;limit_num_open_files:0 --version --ui=default
             └─153591 /usr/bin/coolforkit --systemplate=/opt/cool/systemplate --lotemplate=/opt/collaboraoffice --childroot=/opt/cool/child-roots/ --clientport=9980 --masterport=coolwsd-Sr4D8nO8 --rlimits=limit_virt_mem_mb:0;limit_stack_mem_kb:8000;limit_file_size_mb:0;limit_num_open_files:0 --version --ui=default

Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135379 +1100 [ prisoner_poll ] INF  Have 1 spare child after adding [153591]. Notifying.| wsd/COOLWSD.cpp:545
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135454 +1100 [ prisoner_poll ] TRC  #18: Revents: 0x0| net/Socket.hpp:1296
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135473 +1100 [ prisoner_poll ] TRC  #19: Removing socket (at 2 of 3) from prisoner_poll| net/Socket.cpp:468
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135490 +1100 [ prisoner_poll ] TRC  #17: setupPollFds getPollEvents: 0x1| net/Socket.hpp:858
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135601 +1100 [ prisoner_poll ] TRC  #18: setupPollFds getPollEvents: 0x1| net/Socket.hpp:858
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153585 2022-10-18 09:28:43.135619 +1100 [ prisoner_poll ] TRC  ppoll start, timeoutMicroS: 5000000 size 2| net/Socket.cpp:337
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153566 2022-10-18 09:28:43.135556 +1100 [ coolwsd ] TRC  Have 1 new children.| wsd/COOLWSD.cpp:5270
Oct 18 09:28:43 nephele coolwsd[153566]: wsd-153566-153566 2022-10-18 09:28:43.135688 +1100 [ coolwsd ] INF  WSD initialization complete: setting log-level to [warning] as configured.| wsd/COOLWSD.cpp:5286
Oct 18 09:28:43 nephele coolwsd[153566]: Ready to accept connections on port 9980.
Oct 18 09:29:43 nephele coolwsd[153566]: wsd-153566-153566 2022-10-18 09:29:43.981706 +1100 [ coolwsd ] WRN  Waking up dead poll thread [update], started: false, finished: false| net/Socket.hpp:725

providing no immediate clue as to where it reads its config from. The unit file provides no real clues either:

$ more /lib/systemd/system/coolwsd.service
[Unit]
Description=Collabora Online WebSocket Daemon
After=network.target

[Service]
EnvironmentFile=-/etc/sysconfig/coolwsd
ExecStart=/usr/bin/coolwsd --version --o:sys_template_path=/opt/cool/systemplate --o:child_root_path=/opt/cool/child-roots --o:file_server_root_path=/usr/share/coolwsd
KillSignal=SIGINT
TimeoutStopSec=120
User=cool
KillMode=control-group
Restart=always
LimitNOFILE=infinity:infinity

ProtectSystem=strict
ReadWritePaths=/opt/cool /var/log

ProtectHome=yes
PrivateTmp=yes
ProtectControlGroups=yes
CapabilityBoundingSet=CAP_FOWNER CAP_CHOWN CAP_MKNOD CAP_SYS_CHROOT CAP_SYS_ADMIN

[Install]
WantedBy=multi-user.target

The binary provides scant few clues too:

$ /usr/bin/coolwsd --help
usage: coolwsd OPTIONS
Collabora Online WebSocket server.

--daemon                       Run application as a daemon.
--umask=mask                   Set the daemon's umask (octal, e.g. 027).
--pidfile=path                 Write the process ID of the application to 
                               given file.
--help                         Display help information on command line 
                               arguments.
--version-hash                 Display product version-hash information and 
                               exit.
--version                      Display version and hash information.
--cleanup                      Cleanup jails and other temporary data and 
                               exit.
--port=port_number             Port number to listen to (default: 9980),
--disable-ssl                  Disable SSL security layer.
--disable-cool-user-checking   Don't check whether coolwsd is running under 
                               the user 'cool'.  NOTE: This is insecure, use 
                               only when you know what you are doing!
-oxmlpath, --override=xmlpath  Override any setting by providing full 
                               xmlpath=value.
--config-file=path             Override configuration file path.
--config-dir=path              Override extra configuration directory path.
--lo-template-path=path        Override the LOK core installation directory 
                               path.
--unattended                   Unattended run, won't wait for a debugger on 
                               faulting.
--signal                       Send signal SIGUSR2 to parent process when 
                               server is ready to accept connections
Forced Exit with code: 0
-167735 2022-10-18 17:35:15.188495 +1100 [ coolwsd ] FTL  Forced Exit with code: 0| common/Util.cpp:1097

and man coolwsd mainly helps by pointing me to coolconfig, but alas it doesn't help me confirm where coolwsd is reading configs from or what config it read (the one running as a service). This is a neat clue though:

$ coolwsd --version
Failed to initialize COOLWSD: Access to file denied: /etc/coolwsd/coolwsd.xml
-167990 2022-10-18 17:40:28.819029 +1100 [ coolwsd ] FTL  Failed to initialize COOLWSD: Access to file denied: /etc/coolwsd/coolwsd.xml| wsd/COOLWSD.hpp:488
Access to file denied: /etc/coolwsd/coolwsd.xml
$ ll /etc/coolwsd/coolwsd.xml
-rw-r----- 1 cool cool 24013 Oct 18 08:05 /etc/coolwsd/coolwsd.xml
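
The listing explains the denial: mode -rw-r----- lets only the owner (cool) and the group (cool) read the file, so any other user is refused. A short Python sketch that decodes such a mode follows; the helper name is made up for illustration.

```python
import stat


def who_can_read(mode):
    """List which permission classes have the read bit set in a file mode."""
    readers = []
    if mode & stat.S_IRUSR:
        readers.append("owner")
    if mode & stat.S_IRGRP:
        readers.append("group")
    if mode & stat.S_IROTH:
        readers.append("others")
    return readers


# Regular file with permissions 640, as shown for /etc/coolwsd/coolwsd.xml.
mode = stat.S_IFREG | 0o640

print(stat.filemode(mode))  # -rw-r-----
print(who_can_read(mode))   # ['owner', 'group']
```

For a real file, the mode can be read with `os.stat(path).st_mode` and passed to the same helper.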

Making progress slowly:

$ sudo -u cool coolwsd --version
[sudo] password for cirrus: 
wsd-168037-168037 2022-10-18 17:42:04.669635 +1100 [ coolwsd ] INF  Initializing wsd. Local time: Tue 2022-10-18 17:42:04 +1100. Log level is [8].| common/Log.cpp:328
wsd-168037-168037 2022-10-18 17:42:04.669702 +1100 [ coolwsd ] INF  Setting log-level to [trace] and delaying setting to configured [warning] until after WSD initialization.| wsd/COOLWSD.cpp:2100
wsd-168037-168037 2022-10-18 17:42:04.669748 +1100 [ coolwsd ] INF  Initializing coolwsd server []. Experimental features are disabled.| wsd/COOLWSD.cpp:2131
wsd-168037-168037 2022-10-18 17:42:04.669781 +1100 [ coolwsd ] INF  Anonymization of user-data is configurable.| wsd/COOLWSD.cpp:2139
wsd-168037-168037 2022-10-18 17:42:04.669833 +1100 [ coolwsd ] WRN  NOTE: both logging.anonymize.usernames and logging.anonymize.filenames are deprecated and superseded by logging.anonymize.anonymize_user_data. Please remove username and filename entries from the config and use only anonymize_user_data.| wsd/COOLWSD.cpp:2149
wsd-168037-168037 2022-10-18 17:42:04.669865 +1100 [ coolwsd ] WRN  Since logging.anonymize.anonymize_user_data is provided (false) in the config, it will be used.| wsd/COOLWSD.cpp:2153
wsd-168037-168037 2022-10-18 17:42:04.669897 +1100 [ coolwsd ] INF  Anonymization of user-data is disabled.| wsd/COOLWSD.cpp:2186
wsd-168037-168037 2022-10-18 17:42:04.670033 +1100 [ coolwsd ] INF  SSL support: SSL is disabled.| wsd/COOLWSD.cpp:2230
wsd-168037-168037 2022-10-18 17:42:04.670068 +1100 [ coolwsd ] INF  SSL support: termination is enabled.| wsd/COOLWSD.cpp:2231
wsd-168037-168037 2022-10-18 17:42:04.670105 +1100 [ coolwsd ] DBG  Setting envar PDFIMPORT_RESOLUTION_DPI=96 per config per_document.pdf_resolution_dpi| wsd/COOLWSD.cpp:2256
wsd-168037-168037 2022-10-18 17:42:04.670170 +1100 [ coolwsd ] INF  Creating childroot: /usr/bin/jails/| wsd/COOLWSD.cpp:2298
wsd-168037-168037 2022-10-18 17:42:04.670835 +1100 [ coolwsd ] INF  Cleaning up childroot directory [/usr/bin/jails/].| common/JailUtil.cpp:161
wsd-168037-168037 2022-10-18 17:42:04.670897 +1100 [ coolwsd ] TRC  Directory [/usr/bin/jails/] is not a directory or doesn't exist.| common/JailUtil.cpp:166
Failed to initialize COOLWSD: Access to file denied: /usr/bin/jails
wsd-168037-168037 2022-10-18 17:42:04.671038 +1100 [ coolwsd ] FTL  Failed to initialize COOLWSD: Access to file denied: /usr/bin/jails| wsd/COOLWSD.hpp:488
Access to file denied: /usr/bin/jails

Decidedly odd, though, that coolwsd wants to load its config and do all this initialisation work (and fail) just to display its version number! Go figure, we live in a strange world after all.

But of course the system service sets a childroot that exists, so:

$ sudo service coolwsd stop
$ sudo -u cool /usr/bin/coolwsd --version --o:sys_template_path=/opt/cool/systemplate --o:child_root_path=/opt/cool/child-roots --o:file_server_root_path=/usr/share/coolwsd

does in the mass of logged output eventually reveal:

Server: COOLWSD HTTP Server 22.05.6.1

So we know the version at last ;-). Alas, nowhere in that whole output does it reveal which config file it read, nor what configuration it ended up with.

So the best I could do was to raise the log level in /etc/coolwsd/coolwsd.xml from warning to debug. Restarting coolwsd and checking the journal, I see that debug is honoured, so finally I have evidence that this config file is being read.
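A minimal sketch of how to double-check which log level a given config file actually asks for before restarting the service. It assumes the level lives at `logging/level` under the root element, as it does in the upstream coolwsd.xml.in. The `SAMPLE` string and the function name `configured_log_level` are hypothetical, made up for illustration. It only reads the XML; it says nothing about which file the running service loaded.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical, trimmed-down stand-in for /etc/coolwsd/coolwsd.xml.
SAMPLE = """
<config>
    <logging>
        <level type="string" default="warning">debug</level>
    </logging>
</config>
"""


def configured_log_level(xml_text: str) -> Optional[str]:
    """Return the text of <logging><level>, or None if it is absent or empty."""
    root = ET.fromstring(xml_text)
    node = root.find("logging/level")  # path assumed from coolwsd.xml.in
    if node is None or not node.text:
        return None
    return node.text.strip()


if __name__ == "__main__":
    print(configured_log_level(SAMPLE))
```

To use it against a real file, read the file contents (as a user allowed to read it, e.g. cool) and pass the text to `configured_log_level`.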

So back to finding working examples. The best I could find was:

https://raw.githubusercontent.com/CollaboraOnline/online/master/coolwsd.xml.in

Looking at that, in spite of the regex claims, it uses full URLs, or at least includes the protocol part, so I experimented with:

	<storage desc="Backend storage">
		<filesystem allow="false"/>
		<wopi allow="true" desc="Allow/deny wopi storage.">
			<max_file_size desc="Maximum document size in bytes to load. 0 for unlimited." type="uint">0</max_file_size>
			<locking desc="Locking settings">
				<refresh default="900" desc="How frequently we should re-acquire a lock with the storage server, in seconds (default 15 mins) or 0 for no refresh" type="int">900</refresh>
			</locking>
			<alias_groups desc="default mode is 'first' it allows only the first host when groups are not defined. set mode to 'groups' and define group to allow multiple host and its aliases" mode="groups">
				<group>
				  <host allow="true" desc="Regex pattern of hostname to allow or deny.">https://domain1.place</host>
				  <alias desc="regex pattern of aliasname">https://domain2.org.au</alias>				  
				</group>
			</alias_groups>
		</wopi>

and restart the coolwsd service and test. Voila. Both domains now work!

Utterly nuts. I mean on two fronts:

  1. The config.xml I shared earlier (https://github.com/nextcloud/richdocuments/issues/2188#issuecomment-1281472701) had been in place for a year or more, in all likelihood working with both domains; it was tested no more than 20 days ago and worked.
  2. Now I find that unless I use an alias group and specify the full URL, I simply cannot get them both served, in spite of the internal docs claiming a regex is expected.

Could there have been a coolwsd upgrade in the past 20 days that implemented this changed interpretation of the configs?

Either way, a long slog but aaargh, I have it operating on two domains again.
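For anyone wanting to sanity-check their own alias_groups block, here is a minimal sketch that lists the mode and the host/alias entries per group. The `FRAGMENT` string is a trimmed copy of the config above (desc attributes dropped), and the function name `list_alias_groups` is hypothetical. It only inspects the XML structure and makes no attempt to reproduce how coolwsd itself matches hosts.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the alias_groups block from the config above.
FRAGMENT = """
<alias_groups mode="groups">
    <group>
        <host allow="true">https://domain1.place</host>
        <alias>https://domain2.org.au</alias>
    </group>
</alias_groups>
"""


def list_alias_groups(xml_text):
    """Return (mode, [(host, [aliases...]), ...]) for an <alias_groups> element."""
    root = ET.fromstring(xml_text)
    groups = []
    for group in root.iter("group"):
        # findtext returns the default when <host> is missing.
        host = (group.findtext("host", default="") or "").strip()
        aliases = [a.text.strip() for a in group.findall("alias") if a.text]
        groups.append((host, aliases))
    return root.get("mode"), groups


if __name__ == "__main__":
    mode, groups = list_alias_groups(FRAGMENT)
    print("mode:", mode)
    for host, aliases in groups:
        print(host, "->", aliases)
```

A config that splits each domain into its own group would show up here as two entries, each with an empty alias list.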

bernd-wechner avatar Oct 18 '22 08:10 bernd-wechner

Can you also try creating different groups for the two domains?


<alias_groups desc="default mode is 'first' it allows only the first host when groups are not defined. set mode to 'groups' and define group to allow multiple host and its aliases" mode="groups">
    <group>
        <host allow="true" desc="Regex pattern of hostname to allow or deny.">https://domain1.place</host>
    </group>

    <group>
        <host allow="true" desc="Regex pattern of hostname to allow or deny.">https://domain2.org.au</host>  
    </group>
</alias_groups>

Raudius avatar Oct 18 '22 09:10 Raudius