pier: serf unexpectedly shut down

Open Quodss opened this issue 2 years ago • 10 comments

Description: After the last OTA, the pier crashes with "pier: serf unexpectedly shut down". When I start the pier after the crash, it crashes again after a few seconds.

In those couple of seconds I had time to print +vats. Here are the logs:

...
---------------- playback complete ----------------
vere: checking version compatibility
ames: live on 52142
conn: listening on \\.\pipe\urbit-conn-dozreg-toplud
http: web interface live on https://localhost:443
http: web interface live on http://localhost:80
http: loopback live on http://localhost:12321
pier (5773968): live
> +vats
%base
  /sys/kelvin:      [%zuse 418]
  base hash:        0v1v.p3a22.iv754.lt3ot.hpbee.0elcg.orl5e.fakce.r1pb5.sg0kt.46vqk
  %cz hash:         0v10.qk6vo.bpj1l.ckup2.mq161.g50a2.echpp.dh0q9.bmuvb.gvipe.c51sm
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  ~
  updates:          tracking
  source ship:      ~zod
  source desk:      %kids
  source aeon:      120
  pending updates:  ~
::
%inet2022
  /sys/kelvin:      [%zuse 418]
  base hash:        0v1h.1odif.kbkos.7cef9.crs82.qdmhf.adl96.26c3n.gs27j.ta6t1.slsdk
  %cz hash:         0v1h.1odif.kbkos.7cef9.crs82.qdmhf.adl96.26c3n.gs27j.ta6t1.slsdk
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  ~
  updates:          tracking
  source ship:      ~tocrex-holpen
  source desk:      %inet2022
  source aeon:      13
  pending updates:  ~
::
%studio
  /sys/kelvin:      [%zuse 418]
  base hash:        0vq.c6i3u.2lo4u.ucc6d.vhaia.gu9ic.1htet.gssjp.sdnaa.03e4m.ct81g
  %cz hash:         0vq.c6i3u.2lo4u.ucc6d.vhaia.gu9ic.1htet.gssjp.sdnaa.03e4m.ct81g
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  [~ ~tirrel]
  updates:          tracking
  source ship:      ~tirrel
  source desk:      %studio
  source aeon:      19
  pending updates:  ~
::
%landscape
  /sys/kelvin:      [%zuse 418]
  base hash:        0vt.1ivo6.0pj6r.qeavd.e30q1.crqu7.5qc5l.gvs0j.aulvu.joe2j.pa66j
  %cz hash:         0vt.1ivo6.0pj6r.qeavd.e30q1.crqu7.5qc5l.gvs0j.aulvu.joe2j.pa66j
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  [~ ~lander-dister-dozzod-dozzod]
  updates:          tracking
  source ship:      ~lander-dister-dozzod-dozzod
  source desk:      %landscape
  source aeon:      26
  pending updates:  ~
::
%webterm
  /sys/kelvin:      [%zuse 418]
  base hash:        0vb.81mgj.s695n.fuiqg.ddsd4.1oec7.kai2j.aqh8c.2h3u6.ndam6.mndil
  %cz hash:         0vb.81mgj.s695n.fuiqg.ddsd4.1oec7.kai2j.aqh8c.2h3u6.ndam6.mndil
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  [~ ~mister-dister-dozzod-dozzod]
  updates:          tracking
  source ship:      ~mister-dister-dozzod-dozzod
  source desk:      %webterm
  source aeon:      6
  pending updates:  ~
::
%garden
  /sys/kelvin:      [%zuse 418]
  base hash:        0vk.ltcfc.a5mn5.6r5st.n5373.veo4f.knnsj.rjaeu.bamr6.pbuhq.qfvch
  %cz hash:         0vk.ltcfc.a5mn5.6r5st.n5373.veo4f.knnsj.rjaeu.bamr6.pbuhq.qfvch
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  [~ ~mister-dister-dozzod-dozzod]
  updates:          tracking
  source ship:      ~mister-dister-dozzod-dozzod
  source desk:      %garden
  source aeon:      20
  pending updates:  ~
::
%docs
  /sys/kelvin:      [%zuse 418]
  base hash:        0v1.avb2u.b8pui.8g9a5.bcicv.nv8hp.s82m0.tlt39.sluhs.q74vm.shim0
  %cz hash:         0v1.avb2u.b8pui.8g9a5.bcicv.nv8hp.s82m0.tlt39.sluhs.q74vm.shim0
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  ~
  updates:          tracking
  source ship:      ~pocwet
  source desk:      %docs
  source aeon:      28
  pending updates:  ~
::
%pals
  /sys/kelvin:      [%zuse 418]
  base hash:        0v1i.ctage.29fp1.uksc4.sfd6k.mchgh.bq2qe.e8tk8.hd84f.cs4mi.nukp0
  %cz hash:         0v1i.ctage.29fp1.uksc4.sfd6k.mchgh.bq2qe.e8tk8.hd84f.cs4mi.nukp0
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  ~
  updates:          tracking
  source ship:      ~paldev
  source desk:      %pals
  source aeon:      17
  pending updates:  ~
::
%escape
  /sys/kelvin:      [%zuse 418]
  base hash:        0v11.dffq9.shga9.rcdt8.kp7he.mhv6f.4gnlq.ugic3.fnjfa.1ps1a.fld50
  %cz hash:         0v11.dffq9.shga9.rcdt8.kp7he.mhv6f.4gnlq.ugic3.fnjfa.1ps1a.fld50
  app status:       running
  force on:         ~
  force off:        ~
  publishing ship:  [~ ~dister-fabnev-hinmur]
  updates:          tracking
  source ship:      ~dister-fabnev-hinmur
  source desk:      %escape
  source aeon:      68
  pending updates:  ~[[%zuse 418]]
::
%kids %cz hash:     0vf.1ngkn.b1pi8.o2n82.k3das.t1fph.3hv0q.jrhb7.t371o.7gk3k.k7ej3
ames: czar del.urbit.org: ip .142.93.228.23
ames: czar ten.urbit.org: ip .104.196.239.18
ames: czar wet.urbit.org: ip .34.121.77.1
ames: czar bus.urbit.org: ip .35.247.126.229
ames: czar feb.urbit.org: ip .34.82.25.47
ames: czar dev.urbit.org: ip .35.227.173.38
ames: czar def.urbit.org: ip .35.230.109.40
ames: czar pub.urbit.org: ip .35.230.48.78
ames: czar lur.urbit.org: ip .35.233.250.88
ames: czar zod.urbit.org: ip .35.247.119.159
ames: czar nus.urbit.org: ip .34.83.26.147
ames: czar tug.urbit.org: ip .64.225.41.162
ames: czar rel.urbit.org: ip .34.83.230.207
ames: czar rys.urbit.org: ip .23.239.12.212
ames: czar deg.urbit.org: ip .13.59.219.247
>
pier: serf unexpectedly shut down

To Reproduce: Launch urbit from the command line.

System (please supply the following information, if relevant):

  • OS: Windows 10
  • Vere and Urbit OS versions: 1.9.0
  • Your ship's %base hash (use +trouble to check): 0v1v.p3a22.iv754.lt3ot.hpbee.0elcg.orl5e.fakce.r1pb5.sg0kt.46vqk

Additional context: I changed the OTA source from my sponsor to ~zod prior to the 1st June OTA.

I tried launching a comet and I see that %base hashes are different. Will I have to breach my planet to get proper OTA? Comet's %base hash is 0v2.r1lbp.i9jr2.hosbi.rvg16.pqe7u.i3hnp.j7k27.9jsgv.8k7rp.oi98q

Quodss avatar Jun 02 '22 19:06 Quodss
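
The %base hash comparison between the planet and the comet can also be done from saved +vats output. The following is a minimal sketch, assuming the output has the same layout as the log above (an unindented `%desk` header followed by indented `field: value` lines). The function names `parse_vats` and `base_hash` are hypothetical helpers, not part of Urbit or vere.

```python
import re

def parse_vats(text: str) -> dict:
    """Map each desk name to its "field: value" pairs from +vats output."""
    desks = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        # A desk header is an unindented line such as "%base".
        header = re.fullmatch(r"%([a-z][a-z0-9-]*)", line)
        if header:
            current = desks.setdefault(header.group(1), {})
            continue
        # Fields are indented, e.g. "  base hash:        0v1v.p3a22...".
        field = re.fullmatch(r"\s+([^:]+):\s+(.*)", line)
        if field and current is not None:
            current[field.group(1)] = field.group(2).strip()
    return desks

def base_hash(vats_text: str) -> str:
    """Return the base hash reported for the %base desk."""
    return parse_vats(vats_text)["base"]["base hash"]

if __name__ == "__main__":
    # Hashes copied from the report above.
    planet = (
        "%base\n"
        "  /sys/kelvin:      [%zuse 418]\n"
        "  base hash:        0v1v.p3a22.iv754.lt3ot.hpbee.0elcg.orl5e.fakce.r1pb5.sg0kt.46vqk\n"
    )
    comet_hash = "0v2.r1lbp.i9jr2.hosbi.rvg16.pqe7u.i3hnp.j7k27.9jsgv.8k7rp.oi98q"
    print("same base hash:", base_hash(planet) == comet_hash)
```

Lines that are neither a desk header nor an indented field (the `::` separators, the `%kids %cz hash` line, and the ames output) are ignored.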

I breached the planet and booted it again with a new key; base hash is oi98q. It was running for a few minutes, then the same error reappeared.

Quodss avatar Jun 04 '22 18:06 Quodss

After an update the base hash became 46vqk. Anyway, I tried running urbit with the verbose flag; this is where the crash happens:

[ "|||"
  %give
  %gall
  [%unto %fact]
  i=/gall/use/azimuth/0w3.~Q7eU/out/~dozreg-toplud/eth-watcher/eth-watcher
  t=~[/dill //term/1]
]
pier: serf unexpectedly shut down

Quodss avatar Jun 04 '22 18:06 Quodss

I reproduced the same error with a fresh comet:

  1. Using the CLI in Windows 10 cmd, spawn a comet;
  2. Wait for some time; the crash happens and repeats each time I reboot the ship.

Sometimes enough time passes to update to 46vqk, but it is not necessary. Same last message with -v flag as for the planet case.

Quodss avatar Jun 04 '22 21:06 Quodss

I have had essentially the same error message since the update with several of my planets (all of the ones I have tried, for others I am waiting to see if there is some easy fix before doing anything too drastic). I also had a similar issue with a planet before the update, though this may well be unrelated. All of this happening on boot or during playback (with the most common error message shown during playback below):

newt: write failed end of file
pier: serf unexpectedly shut down

Edit:

For further specificity if it is useful, I'm having these issues with planets that are running on port

Further edit:

I have partially fixed this on my end; it looks like it was just an issue of me trying to force a playback. I doubt it fixes the actual issue raised in this thread.

W-Glenton avatar Jun 06 '22 23:06 W-Glenton

Same bug here:

[ "||"
  %give
  %gall
  [%unto %fact]
  i=/gall/use/eth-watcher/0w2.PVfAJ/out/~fidwed-sipwyn/spider/running/azimuth
  t=~[/dill //term/1]
]
["|||" %give %gall [%unto %fact] i=/gall/use/azimuth/0w2.PVfAJ/out/~fidwed-sipwyn/eth-watcher/eth-watcher t=~[/dill //term/1]]
pier: serf unexpectedly shut down

marcusmiguel avatar Jun 07 '22 00:06 marcusmiguel
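
Several reports above end with the same kind of verbose (-v) event just before the crash. A minimal sketch of how the last event wire before the crash line could be pulled out of a saved log, assuming the trace format shown above (an `i=/...` path inside each printed event). The helper name `last_wire_before_crash` is hypothetical.

```python
import re
from typing import Optional

CRASH = "pier: serf unexpectedly shut down"

def last_wire_before_crash(log: str) -> Optional[str]:
    """Return the last `i=/...` wire printed before the crash line, if any."""
    head, found, _ = log.partition(CRASH)
    if not found:
        return None  # the log does not contain the crash message
    wires = re.findall(r"\bi=(/\S+)", head)
    return wires[-1] if wires else None

if __name__ == "__main__":
    # Excerpt copied from the comment above.
    log = (
        '["|||" %give %gall [%unto %fact] '
        "i=/gall/use/azimuth/0w2.PVfAJ/out/~fidwed-sipwyn/eth-watcher/eth-watcher "
        "t=~[/dill //term/1]]\n"
        "pier: serf unexpectedly shut down\n"
    )
    print(last_wire_before_crash(log))
```

Comparing this value across reporters makes it easier to see whether they all stop at the same agent, independent of ship name.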

Same here; it happens on a fake ~zod as well as live planets.

ericfode avatar Jun 08 '22 19:06 ericfode

If you run |mass does your %gall section look like this?

  %gall:
      %foreign: KB/11.748
      %blocked:
        %azimuth-tracker: KB/5.616
        %face: KB/1.072
        %file-server: KB/8.208
        %goad: B/896
        'inet2022': B/424
        %orca: MB/1.164.264
        %pipe: KB/192.696
      --MB/1.373.176
      %active:

Then further in %active you see

        'inet2022': KB/273.204

Shouldn't it be %inet2022 and do you see the same thing?

benjaminkwilliams avatar Jun 30 '22 02:06 benjaminkwilliams

Hi @benjaminkwilliams

Here is %gall section from |mass:

%gall:
      %foreign: KB/10.568
      %blocked:
        %face: B/568
        %file-server: B/576
        %rumors: B/564
      --KB/1.708
      %active:

and I do not see inet2022 in %active

Quodss avatar Jul 01 '22 18:07 Quodss

@Quodss I did figure out that to "unblock" what was listed in %gall, I had to install Orca, Face, and Studio. I didn't uninstall Rumor, but as yours shows as %blocked, I'm going to guess you did at some point in the past.

benjaminkwilliams avatar Jul 13 '22 00:07 benjaminkwilliams
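
The %blocked entries compared in the comments above can be listed from saved |mass output. This is a minimal sketch, assuming the %gall section has the layout pasted above (a `%blocked:` line, one `name: size` line per agent, then a `--` subtotal line). The helper name `blocked_agents` is hypothetical, and the real |mass layout may differ between versions.

```python
import re

def blocked_agents(mass_text: str) -> list:
    """Return agent names listed under %blocked in |mass output."""
    agents = []
    in_blocked = False
    for line in mass_text.splitlines():
        stripped = line.strip()
        if stripped == "%blocked:":
            in_blocked = True
            continue
        if in_blocked:
            # Entries look like "%face: KB/1.072" or "'inet2022': B/424".
            entry = re.fullmatch(r"(%[a-z0-9-]+|'[^']+'):\s+\S+", stripped)
            if entry is None:
                break  # subtotal line ("--KB/1.708") or the next section
            agents.append(entry.group(1).lstrip("%").strip("'"))
    return agents

if __name__ == "__main__":
    # Section copied from the reply above.
    sample = """%gall:
      %foreign: KB/10.568
      %blocked:
        %face: B/568
        %file-server: B/576
        %rumors: B/564
      --KB/1.708
      %active:
"""
    print(blocked_agents(sample))
```

Both the `%name` and the quoted `'name'` spellings are accepted, since both appear in the output pasted above.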

At the risk of being redundant I'm adding to this thread as opposed to opening a new ticket.

Also experiencing this issue even after updating to base hash 0vu.fptbs.6f05p.c9ghb.qfh7e.sbhum.vfnnr.osfs7.vv1i1.qveva.dfvli

Running -v these are the messages I'm getting before bail:

[ "|"
  %give
  %iris
  %http-response
    i
  / gall
    use
    spider
    0wJM111
    ~sonseg-dolful
    thread
    eth-watcher--0v18r.pqi6b.pc5id.qu4oo.kh89b.3af82.8s47o.st0rj.2gkoo.9fnsp.mdv51.rhc8s.defds.vr098.4r8bs.tgec9.b7dru.7cs5r.a89tv.6itl6.cnav9
    request
  t=~[/dill //term/1]
]
[ "||"
  %give
  %gall
  [%unto %fact]
  i=/gall/use/eth-watcher/0wJM111/out/~sonseg-dolful/spider/running/azimuth
  t=~[/dill //term/1]
]
[ "|||"
  %give
  %gall
  [%unto %fact]
  i=/gall/use/azimuth/0wJM111/out/~sonseg-dolful/eth-watcher/eth-watcher
  t=~[/dill //term/1]
]
pier: serf unexpectedly shut down

@benjaminkwilliams I tried your suggestion as I had gall blocking a couple of desks including rumors. I reinstalled the blocking desks but I still get:

%gall:
      %foreign: KB/9.768
      %blocked:
        %file-server: B/576
      --B/576

I've breached several times already on the assumption that the issue could be due to corrupt installs etc., but my current install is about as clean as I can imagine, starting from a factory reset performed today, and I'm still getting bailed with the pier: serf unexpectedly shut down error. Please help!

@MarcusMiguel did you resolve? seems we have the same problem with eth-watcher/spider

knoidy avatar Aug 22 '22 02:08 knoidy

Sorry for the late response, I'm still facing the same issue. Running the commands mentioned here seems to allow my ship to go longer without crashing, but eventually it does crash again.

marcusmiguel avatar Oct 03 '22 19:10 marcusmiguel

Sorry for the (much more egregiously) late response.

The original report in this thread was a ship running on Windows. Generally, vere is pretty good about printing an error message before a fatal error -- except on Windows, where something is buffering (and not flushing) stderr. Is everyone else in this thread also running Windows?

joemfb avatar Oct 04 '22 05:10 joemfb
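
The buffering effect described above can be reproduced in isolation. The sketch below is not vere's code. It uses a hypothetical child program that writes a diagnostic through a block-buffered wrapper around stderr and then exits abruptly, which is one way an error message can vanish before a fatal exit. The message text and the "flush"/"noflush" modes are assumptions made for illustration.

```python
import subprocess
import sys

# Child program: writes a diagnostic to a block-buffered wrapper around
# file descriptor 2 (stderr), then exits abruptly with os._exit, which
# skips the normal flush-on-exit step.
CHILD = """
import os, sys
err = open(2, "w", buffering=8192, closefd=False)
err.write("fatal: something went wrong\\n")
if sys.argv[1] == "flush":
    err.flush()
os._exit(1)
"""

def run_child(mode: str) -> str:
    """Run the child in the given mode and return what reached its stderr."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
    )
    return result.stderr

if __name__ == "__main__":
    print("without flush:", repr(run_child("noflush")))
    print("with flush:   ", repr(run_child("flush")))
```

Without the explicit flush the parent sees nothing on stderr, only the non-zero exit, which matches the symptom in this thread where the only visible line is the pier noticing that the serf is gone.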

Running Windows here.

marcusmiguel avatar Oct 04 '22 13:10 marcusmiguel

Ditto, Windows

knoidy avatar Oct 04 '22 16:10 knoidy

Same on the Windows Server 2016

GeneralGDA avatar Oct 14 '22 04:10 GeneralGDA

I am experiencing the same exact issue with the same exact error messages (i.e. there seems to be some issue with "eth watcher") and have a Windows 10 computer.

I reset my network keys and installed from scratch without any success.

lecram2022 avatar Oct 17 '22 01:10 lecram2022

Running Port [app-1.9.1] on Windows 10 [Version 10.0.19044.2130] this is what I'm getting, shutdown happens almost immediately sometimes, other times it runs for 10 minutes or so:

Microsoft Windows [Version 10.0.19044.2130]
(c) Microsoft Corporation. All rights reserved.

C:\Users\REDACTED>C:\Users\REDACTED\AppData\Local\port\app-1.9.1\resources\resources\win\urbit C:\Users\REDACTED\AppData\Roaming\Port\piers\liquidation-station
~
urbit 1.10
boot: home is C:\Users\REDACTED\AppData\Roaming\Port\piers\liquidation-station
loom: mapped 2048MB
lite: arvo formula 11a9e7fe
lite: core 38d4ad4d
lite: final state 38d4ad4d
loom: mapped 2048MB
boot: protected loom
live: loaded: MB/353.566.720
boot: installed 351 jets
---------------- playback starting ----------------
pier: replaying events 157132-157362
eyre: canceling ~[//http-server/0vp.ujgkv/59/3]
eyre: canceling ~[//http-server/0vp.ujgkv/18/9]
eyre: canceling ~[//http-server/0vp.ujgkv/74/22]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
[%e %authenticated-without-cookie]
pier: (157362): play: done
---------------- playback complete ----------------
vere: checking version compatibility
ames: live on 52897
conn: listening on \\.\pipe\urbit-conn-ritdeg-havful-hansen-miptud--widdel-lorrym-dollur-litzod
eyre: canceling ~[//http-server/0v4.fa6pg/20/11]
eyre: canceling ~[//http-server/0v4.fa6pg/28/2]
http: web interface live on http://localhost:80
http: loopback live on http://localhost:12321
pier (157370): live
ames: czar zod.urbit.org: ip .35.247.119.159
ames: czar ten.urbit.org: ip .104.196.239.18
ames: czar dys.urbit.org: ip .157.90.16.237
ames: czar pub.urbit.org: ip .35.230.48.78
; ~haddef-sigwen is ok
; ~niblyx-malnus is ok
ames: czar at ned.urbit.org: not found (b)
ames: czar dev.urbit.org: ip .35.227.173.38
ames: czar feb.urbit.org: ip .34.82.25.47
ames: czar bus.urbit.org: ip .35.247.126.229
ames: czar del.urbit.org: ip .142.93.228.23
ames: czar wet.urbit.org: ip .34.121.77.1
ames: czar deg.urbit.org: ip .13.59.219.247
ames: czar rys.urbit.org: ip .23.239.12.212
ames: czar rep.urbit.org: ip .198.199.121.116
ames: czar lur.urbit.org: ip .35.233.250.88
ames: czar ref.urbit.org: ip .143.198.51.180
ames: czar nus.urbit.org: ip .34.83.26.147
ames: czar nem.urbit.org: ip .66.228.53.179
ames: czar bel.urbit.org: ip .34.69.242.152
ames: czar dem.urbit.org: ip .34.69.220.110
pier: serf unexpectedly shut down

tapset avatar Oct 18 '22 14:10 tapset

Adding myself to the list of people experiencing this. Windows 10, using Port, but if I Start in Terminal under Manage, it looks basically the same as what tapset posted. Also the same timing variance; sometimes it crashes immediately, sometimes it takes a few minutes.

lowgradepanic avatar Oct 21 '22 23:10 lowgradepanic

I also have been chatting with someone on windows who is reporting this error, they just booted a new planet.

nodreb-borrus avatar Oct 24 '22 19:10 nodreb-borrus

Seconded. I'm seeing the same behavior on Windows 10 with a recently migrated pier under the following three circumstances:

  1. Using the latest Windows urbit binary in command prompt and WSL
  2. Using the latest Linux urbit binary in command prompt and WSL
  3. Booting from Port

However, my fakezod on urbit 1.8 seems stable under WSL (Ubuntu 16.04).

telestew avatar Oct 27 '22 05:10 telestew

I'm getting the same behavior. Running my ship on a Windows laptop through Port. If I go to manage and boot it through the terminal I get the pier: serf unexpectedly shut down after a few seconds; sometimes even minutes but still totally unusable. Any ideas?

Stahlblau4 avatar Nov 02 '22 22:11 Stahlblau4

Running Windows 10, cannot keep ship running for more than 2 seconds. Same issues here.

trosel avatar Nov 19 '22 19:11 trosel

Same problem here on Windows 10 running through Port.

dietofworms avatar Nov 28 '22 00:11 dietofworms

Same problem, windows 10, CLI, tried with my planet and a fresh comet. Runs for like 1 min and then says this pier: serf unexpectedly shut down and exits.

saunic avatar Dec 05 '22 03:12 saunic

Same problem, windows 10, with both Port and CLI. Is there any update on this or a stable version to roll back to?

yyuyulm avatar Dec 20 '22 22:12 yyuyulm

I've given planets to two people using Windows and they both have this problem nonstop.

philipquirk avatar Jan 05 '23 03:01 philipquirk

Based on the recent commits, it looks like Urbit just won't work on Windows anymore

trosel avatar Jan 30 '23 19:01 trosel

Correct - we're dropping official support for native windows binaries. The linux ones likely work via WSL (Windows Subsystem for Linux) which is now generally available from Microsoft, but I haven't personally tested this: https://learn.microsoft.com/en-us/windows/wsl/install

zalberico avatar Jan 30 '23 19:01 zalberico