
Fast reboot timed out after enabling PR_INSANE for console log driver.

Open pridhiviraj opened this issue 7 years ago • 6 comments

  1. Boot the system to OS
  2. Enable fast reboot
  3. Enable PR_INSANE (9) for the console log driver using: nvram -p ibm,skiboot --update-config log-level-driver=9
  4. Do a reboot. Here fast reboot timed out and failed; after that, a full reboot was triggered:
[  556.906567] mpt3sas_cm0: sending message unit reset !!
[  556.908123] mpt3sas_cm0: message unit reset: SUCCESS
[  567.251350] xhci_hcd 0[  712.258794988,7] PHB#0000[0:0]: Purging all IODA tables...
005:01:00.0: Host[  712.262393357,7] PHB#0001[0:1]: Purging all IODA tables...
 halt failed, -11[  712.265837842,7] PHB#0002[0:2]: Purging all IODA tables...
0
[  567.252179][  712.269386852,7] PHB#0003[0:3]: Purging all IODA tables...
 reboot: Restarti[  712.273034609,7] PHB#0004[0:4]: Purging all IODA tables...
ng system
[  712.276483591,7] PHB#0005[0:5]: Purging all IODA tables...
[  712.285210537,5] OPAL: Reboot request...
[  712.286890499,7] NVRAM: 'fast-reset' not found
[  712.288995673,7] NVRAM: Searched for 'experimental-fast-reset' found 'feeling-lucky'
[  712.293254299,5] RESET: Initiating fast reboot 1...
[  712.295387113,7] RESET: Resetting from cpu: 0x0 (core 0x0)
[  712.299069612,7] RESET: Stopping the world...
[  712.301105370,7] RESET: Resetting all threads but self...
[  712.303565604,7] RESET: CPU 0x0001 reset in
[  712.303566685,7] RESET: CPU 0x0003 reset in
[  712.303567189,7] RESET: CPU 0x0004 reset in
[  712.303572540,7] RESET: CPU 0x000a reset in
[  712.303572038,7] RESET: CPU 0x0009 reset in
[  712.303570660,7] RESET: CPU 0x0007 reset in
[  712.303566180,7] RESET: CPU 0x0002 reset in
[  712.303569177,7] RESET: CPU 0x0006 reset in
[  712.303593150,7] RESET: CPU 0x0047 reset in
[  712.303585439,7] RESET: CPU 0x0031 reset in
[  712.303586041,7] RESET: CPU 0x0032 reset in
[  712.303582224,7] RESET: CPU 0x002b reset in
[  712.303595025,7] RESET: CPU 0x0052 reset in
[  712.303567700,7] RESET: CPU 0x0005 reset in
[  712.303584459,7] RESET: CPU 0x002f reset in
[  712.303580535,7] RESET: CPU 0x0028 reset in
[  712.303596598,7] RESET: CPU 0x0055 reset in
[  712.303576003,7] RESET: CPU 0x0010 reset in
[  712.303571440,7] RESET: CPU 0x0008 reset in
[  712.303573035,7] RESET: CPU 0x000b reset in
[  712.303588281,7] RESET: CPU 0x0036 reset in
[  712.303587068,7] RESET: CPU 0x0034 reset in
[  712.303574265,7] RESET: CPU 0x000d reset in
[  712.303598820,7] RESET: CPU 0x0059 reset in
[  712.303591125,7] RESET: CPU 0x0043 reset in
[  712.303592138,7] RESET: CPU 0x0045 reset in
[  712.303589825,7] RESET: CPU 0x0041 reset in
[  712.303602087,7] RESET: CPU 0x005f reset in
[  712.303599896,7] RESET: CPU 0x005b reset in
[  712.303598464,7] RESET: CPU 0x0058 reset in
[  712.303592633,7] RESET: CPU 0x0046 reset in
[  712.303583242,7] RESET: CPU 0x002d reset in
[  712.303601091,7] RESET: CPU 0x005d reset in
[  712.303599420,7] RESET: CPU 0x005a reset in
[  712.303600390,7] RESET: CPU 0x005c reset in
[  712.303587778,7] RESET: CPU 0x0035 reset in
[  712.303586566,7] RESET: CPU 0x0033 reset in
[  712.303588776,7] RESET: CPU 0x0037 reset in
[  712.303601592,7] RESET: CPU 0x005e reset in
[  712.303594438,7] RESET: CPU 0x0051 reset in
[  712.303589503,7] RESET: CPU 0x0040 reset in
[  712.303583726,7] RESET: CPU 0x002e reset in
[  712.303581112,7] RESET: CPU 0x0029 reset in
[  712.303591639,7] RESET: CPU 0x0044 reset in
[  712.303581661,7] RESET: CPU 0x002a reset in
[  712.303582737,7] RESET: CPU 0x002c reset in
[  712.303597845,7] RESET: CPU 0x0057 reset in
[  712.303576969,7] RESET: CPU 0x0012 reset in
[  712.303576406,7] RESET: CPU 0x0011 reset in
[  712.303577691,7] RESET: CPU 0x0013 reset in
[  712.303579253,7] RESET: CPU 0x0016 reset in
[  712.303579799,7] RESET: CPU 0x0017 reset in
[  712.303578222,7] RESET: CPU 0x0014 reset in
[  712.303575305,7] RESET: CPU 0x000f reset in
[  712.303597124,7] RESET: CPU 0x0056 reset in
[  712.303585128,7] RESET: CPU 0x0030 reset in
[  712.303593815,7] RESET: CPU 0x0050 reset in
[  712.303595531,7] RESET: CPU 0x0053 reset in
[  712.303574793,7] RESET: CPU 0x000e reset in
[  712.303578730,7] RESET: CPU 0x0015 reset in
[  712.303590374,7] RESET: CPU 0x0042 reset in
[  712.304123865,6] RESET: Boot CPU waiting for everybody...
[  712.303573554,7] RESET: CPU 0x000c reset in
[  712.303596080,7] RESET: CPU 0x0054 reset in
[  712.354802195,5] RESET: Fast reboot timed out waiting for secondaries to call in
[  712.354805856,6] IPMI: sending chassis control request 0x03
[  712.355152116,6] BT: seq 0x22 netfn 0x00 cmd 0x02: Message sent to host
[  712.360808499,6] BT: seq 0x22 netfn 0x00 cmd 0x02: IPMI MSG done


--== Welcome to Hostboot  ==--

  2.72476|secure|SecureROM valid - enabling functionality
  2.72481|secure|Booting in non-secure mode.
 14.55595|Ignoring boot flags, incorrect version 0x0
 14.56673|Booting from SBE side 0 on master proc=00050000
 14.72340|ISTEP  6. 5 - host_init_fsi
 14.79995|ISTEP  6. 6 - host_set_ipl_parms
 14.89154|ISTEP  6. 7 - host_discover_targets
 15.31058|HWAS|PRESENT> DIMM[03]=F0F0000000000000
 15.31059|HWAS|PRESENT> Proc[05]=8000000000000000
 15.31060|HWAS|PRESENT> Core[07]=FC3CCF0000000000
 15.32540|ISTEP  6. 8 - host_update_master_tpm
 23.02490|SECURE|Security Access Bit> 0x0000000000000000
 23.02491|SECURE|Secure Mode Disable (via Jumper)> 0x8000000000000000
 23.02532|ISTEP  6. 9 - host_gard
 23.04865|HWAS|FUNCTIONAL> DIMM[03]=F0F0000000000000
 23.04866|HWAS|FUNCTIONAL> Proc[05]=8000000000000000
 23.04867|HWAS|FUNCTIONAL> Core[07]=FC3CCF0000000000
 23.05196|ISTEP  6.10 - host_revert_sbe_mcs_setup
 23.06765|ISTEP  6.11 - host_start_occ_xstop_handler
 23.66422|ISTEP  6.12 - host_voltage_config
 23.70796|ISTEP  7. 1 - mss_attr_cleanup
 23.84531|ISTEP  7. 2 - mss_volt
 23.91415|ISTEP  7. 3 - mss_freq
 23.97337|ISTEP  7. 4 - mss_eff_config
 24.80155|ISTEP  7. 5 - mss_attr_update
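
How long the boot CPU waited before giving up can be read off the skiboot timestamps in the log above. The following is a minimal sketch, assuming the bracketed prefix is `seconds.nanoseconds,loglevel`; the helper name `find_stamp_ns` is made up for illustration and is not part of skiboot. It uses `search` rather than `match` so that skiboot lines interleaved with Linux console output are still picked up.

```python
import re

# Matches a skiboot console prefix such as "[  712.304123865,6] ".
# Assumption: the two numeric fields are seconds and nanoseconds.
# Linux kernel stamps like "[  567.251350]" do not match (no ",level").
STAMP = re.compile(r"\[\s*(\d+)\.(\d{9}),(\d)\]\s*(.*)")

def find_stamp_ns(lines, needle):
    """Return the timestamp in ns of the first skiboot line containing needle."""
    for line in lines:
        m = STAMP.search(line)
        if m and needle in m.group(4):
            return int(m.group(1)) * 10**9 + int(m.group(2))
    return None

# Three lines copied from the log above.
log = [
    "[  712.304123865,6] RESET: Boot CPU waiting for everybody...",
    "[  712.303573554,7] RESET: CPU 0x000c reset in",
    "[  712.354802195,5] RESET: Fast reboot timed out waiting for secondaries to call in",
]

start = find_stamp_ns(log, "Boot CPU waiting for everybody")
end = find_stamp_ns(log, "Fast reboot timed out")
elapsed_ms = (end - start) / 1e6
print(f"waited {elapsed_ms:.3f} ms before giving up")
```

With these timestamps the wait comes out at roughly 50.7 ms, under the stated assumption about the timestamp format.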

pridhiviraj avatar Jan 22 '18 13:01 pridhiviraj

It seems I can reproduce with fast reboot. I had to update my skiboot since it didn't understand the loglevel nvram option.

All I did was:

  1. Enable both fast reboot and PR_INSANE in nvram
  2. Power off from Linux (to get into the new skiboot)
  3. Power on (via IPMI)

@pridhiviraj What version of skiboot did you use?

  3.45350|Ignoring boot flags, incorrect version 0x0
  3.55001|ISTEP  6. 3
  4.00590|ISTEP  6. 4
  4.00631|ISTEP  6. 5
 31.05729|HWAS|PRESENT> DIMM[03]=00000000AAAAAAAA
 31.05729|HWAS|PRESENT> Membuf[04]=0C0C000000000000
 31.05730|HWAS|PRESENT> Proc[05]=C000000000000000
 31.15404|ISTEP  6. 6
 31.22166|ISTEP  6. 7
 32.97166|ISTEP  6. 8
 33.00962|ISTEP  6. 9
 37.29127|ISTEP  6.10
 37.32007|ISTEP  6.11
 39.05660|ISTEP  6.12
 39.05809|ISTEP  6.13
 39.05874|ISTEP  7. 1
 39.12506|ISTEP  7. 2
 39.23654|ISTEP  7. 3
 39.26639|ISTEP  7. 4
 39.30574|ISTEP  7. 5
 39.52173|ISTEP  7. 6
 39.55034|ISTEP  7. 7
 39.77152|ISTEP  7. 8
 39.92390|ISTEP  7. 9
 39.92430|ISTEP  8. 1
 40.09502|ISTEP  8. 2
 40.90943|ISTEP  8. 3
 40.91926|ISTEP  8. 4
 41.16501|ISTEP  8. 5
 41.16567|ISTEP  8. 6
 41.75181|ISTEP  8. 7
 41.75254|ISTEP  8. 8
 41.76372|ISTEP  9. 1
 42.12401|ISTEP  9. 2
 42.73362|ISTEP 10. 1
 42.82156|ISTEP 10. 2
 43.61876|ISTEP 10. 3
 43.62004|ISTEP 10. 4
 43.62120|ISTEP 10. 5
 43.62206|ISTEP 10. 6
 43.62306|ISTEP 10. 7
 43.62368|ISTEP 10. 8
 43.62438|ISTEP 10. 9
 43.62559|ISTEP 10.10
 43.62616|ISTEP 10.11
 43.62669|ISTEP 10.12
 43.62753|ISTEP 10.13
 43.63007|ISTEP 10.14
 43.63069|ISTEP 11. 1
 43.72191|ISTEP 11. 2
 43.72271|ISTEP 11. 3
 43.80793|ISTEP 11. 4
 43.92405|ISTEP 11. 5
 43.96464|ISTEP 11. 6
 44.98020|ISTEP 11. 7
 44.98131|ISTEP 11. 8
 45.26953|ISTEP 11. 9
 45.27056|ISTEP 11.10
 45.35470|ISTEP 11.11
 45.35546|ISTEP 11.12
 45.35628|ISTEP 11.13
 45.36158|ISTEP 12. 1
 45.47253|ISTEP 12. 2
 45.54306|ISTEP 12. 3
 45.57723|ISTEP 12. 4
 45.98220|ISTEP 12. 5
 45.98283|ISTEP 13. 1
 46.07796|ISTEP 13. 2
 46.11929|ISTEP 13. 3
 46.12032|ISTEP 13. 4
 46.14130|ISTEP 13. 5
 46.14605|ISTEP 13. 6
 47.10765|ISTEP 13. 7
 47.38531|ISTEP 13. 8
 47.56170|ISTEP 13. 9
 49.92943|ISTEP 13.10
 49.96411|ISTEP 13.11
 50.09561|ISTEP 13.12
 50.09790|ISTEP 14. 1
 50.13291|ISTEP 14. 2
 50.15393|ISTEP 14. 3
 96.16603|ISTEP 14. 4
 96.21271|ISTEP 14. 5
 96.24916|ISTEP 14. 6
 96.26258|ISTEP 14. 7
 96.37570|ISTEP 14. 8
 96.37773|ISTEP 15. 1
 96.91962|ISTEP 15. 2
 96.93193|ISTEP 15. 3
 97.01609|ISTEP 16. 1
 98.07250|ISTEP 16. 2
 98.54983|ISTEP 16. 3
 98.56529|ISTEP 16. 4
 98.56939|ISTEP 18.13
 98.71641|ISTEP 18.14
 98.73357|ISTEP 20. 1
 99.67329|ISTEP 21. 1
112.90290|htmgt|OCCs are now running in ACTIVE state
123.34051|ISTEP 21. 2
123.32308|ISTEP 21. 3
[  123.333813361,5] OPAL 4e23b42 starting...
[  123.333816669,7] initial console log level: memory 7, driver 5
[  123.333819991,6] CPU: P8 generation processor (max 8 threads/core)
[  123.333822972,7] CPU: Boot CPU PIR is 0x0448 PVR is 0x004d0200
[  123.333826119,7] CPU: Initial max PIR set to 0x1fff
[  123.334187838,7] OPAL table: 0x300f4430 .. 0x300f4980, branch table: 0x30002000
[  123.334194137,7] Assigning physical memory map table for unused
[  123.334198811,7] FDT: Parsing fdt @0xff00000
[  123.340309929,6] CHIP: Initialised chip 0 from xscom@3fc0000000000
[  123.340323259,6] CHIP: Initialised chip 8 from xscom@3fc4000000000
[  123.340445494,5] CHIP: Chip ID 0000 type: P8 DD2.0
[  123.340449193,7] XSCOM: Base address: 0x3fc0000000000
[  123.340459212,5] CHIP: Chip ID 0008 type: P8 DD2.0
[  123.340462302,7] XSCOM: Base address: 0x3fc4000000000
[  123.340469899,7] XSTOP: XSCOM addr = 0x2010c82, FIR bit = 31
[  123.340473671,6] MFSI 0:0: Initialized
[  123.340476101,6] MFSI 0:2: Initialized
[  123.340478463,6] MFSI 0:1: Initialized
[  123.340481148,6] MFSI 8:0: Initialized
[  123.340483498,6] MFSI 8:2: Initialized
[  123.340485901,6] MFSI 8:1: Initialized
[  123.340998025,5] LPC: LPC[000]: Initialized, access via XSCOM @0xb0020
[  123.341012246,7] LPC: Default bus on chip 0x0
[  123.341192602,6] MEM: parsing reserved memory from node /ibm,hostboot/reserved-memory
[  123.341211867,7] HOMER: Init chip 0
[  123.341215227,7]   PBA BAR0 : 0x0000003ffd800000
[  123.341218180,7]   PBA MASK0: 0x0000000000300000
[  123.341221103,7]   HOMER Image at 0x3ffd800000 size 4MB
[  123.341225319,7]   PBA BAR2 : 0x4000003ffda00000
[  123.341228341,7]   PBA MASK2: 0x0000000000000000
[  123.341231133,7]   SLW Image at 0x3ffda00000 size 1MB
[  123.341234951,7]   PBA BAR3 : 0x0000003fff800000
[  123.341237782,7]   PBA MASK3: 0x0000000000700000
[  123.341240621,7]   OCC Common Area at 0x3fff800000 size 8MB
[  123.341243829,7] HOMER: Init chip 8
[  123.341246761,7]   PBA BAR0 : 0x0000003ffdc00000
[  123.341249576,7]   PBA MASK0: 0x0000000000300000
[  123.341252364,7]   HOMER Image at 0x3ffdc00000 size 4MB
[  123.341256262,7]   PBA BAR2 : 0x4000003ffde00000
[  123.341259216,7]   PBA MASK2: 0x0000000000000000
[  123.341262016,7]   SLW Image at 0x3ffde00000 size 1MB
[  123.341265790,7]   PBA BAR3 : 0x0000003fff800000
[  123.341268651,7]   PBA MASK3: 0x0000000000700000
[  123.341271478,7]   OCC Common Area at 0x3fff800000 size 8MB
[  123.341294450,7] CPU idle state device tree init
[  123.341299033,5] SLW: HB-provided idle states property found
[  123.341303563,5] SLW: Enabling: nap
[  123.341305926,5] SLW: Enabling: fastsleep_
[  123.341308786,5] SLW: Enabling: winkle
[  123.341638173,7] AST: PNOR LPC offset: 0x0c000000
[  123.341723096,5] PLAT: Using virtual UART
[  123.342103075,7] UART: Using LPC IRQ 4
[  123.369545658,5] PLAT: Detected Firestone platform
[  123.369629479,5] PLAT: Detected BMC platform AMI
[  123.373805124,5] CENTAUR: Found centaur for chip 0x0 channel 4
[  123.373916396,5] CENTAUR:   FSI host: 0x0 cMFSI0 port 7
[  123.375679402,5] CENTAUR: Found centaur for chip 0x0 channel 5
[  123.404690802,5] CENTAUR:   FSI host: 0x0 cMFSI0 port 6
[  123.404872568,5] CENTAUR: Found centaur for chip 0x8 channel 4
[  123.405012244,5] CENTAUR:   FSI host: 0x8 cMFSI0 port 7
[  123.405173119,5] CENTAUR: Found centaur for chip 0x8 channel 5
[  123.405311334,5] CENTAUR:   FSI host: 0x8 cMFSI0 port 6
[  123.408377380,5] CPU: All 160 processors called in...
[  127.133893068,5] FLASH: Found system flash: Macronix MXxxL51235F id:0
[  127.134108658,5] BT: Interface initialized, IO 0x00e4
[  128.083130118,5] NVRAM: Size is 576 KB
[  128.305588959,5] console: Setting driver log level to 9
[  128.305707740,7] NVRAM: 'log-level-memory' not found
[  128.305820729,5] STB: Found ibm,secureboot-v1
[  128.305991431,7] NVRAM: 'force-secure-mode' not found
[  128.306246912,5] STB: secure mode off
[  128.307558592,6] STB: Found CVC @ 3031ca28-30320a27
[  128.307651958,6] STB: Found CVC-sha512 @ 3031ca48, version=1
[  128.307812063,6] STB: Found CVC-verify @ 3031ca58, version=1
[  128.307979519,7] NVRAM: 'force-trusted-mode' not found
[  128.308124521,5] STB: trusted mode off
[  128.308244326,5] OPAL: Using OPAL UART console
[  128.308366418,7] NVRAM: 'uart-con-policy' not found
[  128.308453401,7] SLW: Init chip 0x0
[  128.308546943,7] SLW: Image size from image: 0x100000
[  128.308786309,7] SLW: Init chip 0x8
[  128.308871405,7] SLW: Image size from image: 0x100000
[  128.311057452,6] SLW: Timer facility on chip 0, resolution 10us
[  128.311168739,7] NVRAM: 'pcie-max-link-speed' not found
[  128.311349496,6] CAPI: Preloading ucode 200ea
[  128.311469804,7] FLASH: Queueing preload of 2/200ea
[  128.311599989,7] FLASH: Queueing preload of 0/0
[  128.311601530,7] blocklevel_read: 0x0	0x31c43b40	0x30
[  128.311604015,7]  3.91963|Ignoring boot flags, incorrect version 0x0
  4.01139|ISTEP  6. 3
  4.46563|ISTEP  6. 4
  4.46618|ISTEP  6. 5

cyrilbur-ibm avatar Jan 29 '18 05:01 cyrilbur-ibm

@cyrilbur-ibm Latest master skiboot is fine for reproducing this issue. You are hitting a known OPAL checkstop issue, #145, on P8 platforms. It would be better to try this on a P9 platform.

pridhiviraj avatar Jan 29 '18 05:01 pridhiviraj

More information after updating skiboot to: 08188269016392f2d42835ef4b8548c0f312f941

Looks like 5.8 release worked. I'll try to bisect between the two.

[   36.374257] reboot: Restarting system
[  181.483836558,7] PHB0: Purging all IODA tables...
[  181.484253906,7] PHB1: Purging all IODA tables...
[  181.484674592,7] PHB20: Purging all IODA tables...
[  181.485818462,7] PHB21: Purging all IODA tables...
[  181.486247933,7] PHB22: Purging all IODA tables...
[  181.492254187,5] OPAL: Reboot request...
[  181.492603698,7] NVRAM: 'fast-reset' not found
[  181.492670147,5] RESET: Initiating fast reboot 1...
[  181.492836965,7] RESET: Resetting from cpu: 0x10 (core 0x2)
[  181.493021685,7] RESET Waking up core 0x2
[  181.493144496,7] RESET Waking up core 0x3
[  181.493204171,7] RESET Waking up core 0x4
[  181.493320635,7] RESET Waking up core 0x5
[  181.493425535,7] RESET Waking up core 0x6
[  181.493480147,7] RESET Waking up core 0x9
[  181.493591124,7] RESET Waking up core 0xa
[  181.493704379,7] RESET Waking up core 0xb
[  181.493820684,7] RESET Waking up core 0xc
[  181.493882782,7] RESET Waking up core 0xd
[  181.494016361,7] RESET Waking up core 0x1
[  181.494116288,7] RESET Waking up core 0x2
[  181.494176823,7] RESET Waking up core 0x4
[  181.494276607,7] RESET Waking up core 0x5
[  181.494382673,7] RESET Waking up core 0x6
[  181.494447613,7] RESET Waking up core 0x9
[  181.494552980,7] RESET Waking up core 0xb
[  181.494663646,7] RESET Waking up core 0xc
[  181.494774396,7] RESET Waking up core 0xd
[  181.494828892,7] RESET Waking up core 0xe
[  181.494941092,7] RESET: Stopping the world...
[  181.495169262,7] RESET: Resetting all threads but self...
[  181.495262719,7] RESET: CPU 0x0011 reset in
[  181.495263599,7] RESET: CPU 0x0012 reset in
[  181.495264574,7] RESET: CPU 0x0013 reset in
[  181.495265782,7] RESET: CPU 0x0014 reset in
[  181.495267039,7] RESET: CPU 0x0015 reset in
[  181.495268193,7] RESET: CPU 0x0016 reset in
[  181.495269491,7] RESET: CPU 0x0017 reset in
[  181.495270714,7] RESET: CPU 0x0018 reset in
[  181.495271546,7] RESET: CPU 0x0019 reset in
[  181.495272773,7] RESET: CPU 0x001a reset in
[  181.495273912,7] RESET: CPU 0x001b reset in
[  181.495275169,7] RESET: CPU 0x001c reset in
[  181.495276231,7] RESET: CPU 0x001d reset in
[  181.495277398,7] RESET: CPU 0x001e reset in
[  181.495278346,7] RESET: CPU 0x001f reset in
[  181.495279724,7] RESET: CPU 0x0020 reset in
[  181.495280844,7] RESET: CPU 0x0021 reset in
[  181.495281686,7] RESET: CPU 0x0022 reset in
[  181.495282639,7] RESET: CPU 0x0023 reset in
[  181.495283542,7] RESET: CPU 0x0024 reset in
[  181.495284710,7] RESET: CPU 0x0025 reset in
[  181.495285854,7] RESET: CPU 0x0026 reset in
[  181.495287092,7] RESET: CPU 0x0027 reset in
[  181.495288299,7] RESET: CPU 0x0028 reset in
[  181.495289423,7] RESET: CPU 0x0029 reset in
[  181.495290586,7] RESET: CPU 0x002a reset in
[  181.495291482,7] RESET: CPU 0x002b reset in
[  181.495292780,7] RESET: CPU 0x002c reset in
[  181.495293575,7] RESET: CPU 0x002d reset in
[  181.495294778,7] RESET: CPU 0x002e reset in
[  181.495295934,7] RESET: CPU 0x002f reset in
[  178.123936886,7] RESET: CPU 0x0030 reset in
[  178.123937949,7] RESET: CPU 0x0031 reset in
[  178.123939071,7] RESET: CPU 0x0032 reset in
[  178.123939988,7] RESET: CPU 0x0033 reset in
[  178.123941200,7] RESET: CPU 0x0034 reset in
[  178.123942294,7] RESET: CPU 0x0035 reset in
[  178.123943450,7] RESET: CPU 0x0036 reset in
[  178.123944598,7] RESET: CPU 0x0037 reset in
[  181.495306323,7] RESET: CPU 0x0048 reset in
[  181.495307396,7] RESET: CPU 0x0049 reset in
[  181.495308575,7] RESET: CPU 0x004a reset in
[  181.495309660,7] RESET: CPU 0x004b reset in
[  181.495310974,7] RESET: CPU 0x004c reset in
[  181.495312088,7] RESET: CPU 0x004d reset in
[  181.495313003,7] RESET: CPU 0x004e reset in
[  181.495313935,7] RESET: CPU 0x004f reset in
[  181.495315194,7] RESET: CPU 0x0050 reset in
[  181.495316351,7] RESET: CPU 0x0051 reset in
[  181.495317464,7] RESET: CPU 0x0052 reset in
[  181.495318351,7] RESET: CPU 0x0053 reset in
[  181.495319560,7] RESET: CPU 0x0054 reset in
[  181.495320678,7] RESET: CPU 0x0055 reset in
[  181.495321855,7] RESET: CPU 0x0056 reset in
[  181.495322783,7] RESET: CPU 0x0057 reset in
[  181.495323766,7] RESET: CPU 0x0058 reset in
[  181.495324979,7] RESET: CPU 0x0059 reset in
[  181.495326102,7] RESET: CPU 0x005a reset in
[  181.495327001,7] RESET: CPU 0x005b reset in
[  181.495328186,7] RESET: CPU 0x005c reset in
[  181.495329057,7] RESET: CPU 0x005d reset in
[  181.495330207,7] RESET: CPU 0x005e reset in
[  181.495331403,7] RESET: CPU 0x005f reset in
[  178.164904075,7] RESET: CPU 0x0060 reset in
[  178.164905124,7] RESET: CPU 0x0061 reset in
[  178.164906362,7] RESET: CPU 0x0062 reset in
[  178.164907415,7] RESET: CPU 0x0063 reset in
[  178.164908511,7] RESET: CPU 0x0064 reset in
[  178.164909560,7] RESET: CPU 0x0065 reset in
[  178.164910703,7] RESET: CPU 0x0066 reset in
[  178.164911615,7] RESET: CPU 0x0067 reset in
[  178.175146031,7] RESET: CPU 0x0068 reset in
[  178.175147119,7] RESET: CPU 0x0069 reset in
[  178.175148253,7] RESET: CPU 0x006a reset in
[  178.175149390,7] RESET: CPU 0x006b reset in
[  178.175150324,7] RESET: CPU 0x006c reset in
[  178.175151217,7] RESET: CPU 0x006d reset in
[  178.175152443,7] RESET: CPU 0x006e reset in
[  178.175153585,7] RESET: CPU 0x006f reset in
[  181.495367427,7] RESET: CPU 0x0409 reset in
[  181.495368643,7] RESET: CPU 0x040a reset in
[  181.495366488,7] RESET: CPU 0x0408 reset in
[  181.495369672,7] RESET: CPU 0x040b reset in
[  181.495370846,7] RESET: CPU 0x040c reset in
[  181.495371749,7] RESET: CPU 0x040d reset in
[  181.495373046,7] RESET: CPU 0x040e reset in
[  181.495374256,7] RESET: CPU 0x040f reset in
[  178.205862743,7] RESET: CPU 0x0410 reset in
[  178.205864231,7] RESET: CPU 0x0411 reset in
[  178.205865457,7] RESET: CPU 0x0412 reset in
[  178.205866831,7] RESET: CPU 0x0413 reset in
[  178.205868272,7] RESET: CPU 0x0414 reset in
[  178.205869637,7] RESET: CPU 0x0415 reset in
[  178.205870614,7] RESET: CPU 0x0416 reset in
[  178.205871837,7] RESET: CPU 0x0417 reset in
[  178.221222677,7] RESET: CPU 0x0420 reset in
[  178.221223867,7] RESET: CPU 0x0421 reset in
[  178.221225093,7] RESET: CPU 0x0422 reset in
[  178.221226271,7] RESET: CPU 0x0423 reset in
[  178.221227272,7] RESET: CPU 0x0424 reset in
[  178.221228233,7] RESET: CPU 0x0425 reset in
[  178.221229471,7] RESET: CPU 0x0426 reset in
[  178.221230783,7] RESET: CPU 0x0427 reset in
[  180.277561407,7] RESET: CPU 0x0428 reset in
[  180.277562671,7] RESET: CPU 0x0429 reset in
[  180.277563647,7] RESET: CPU 0x042a reset in
[  180.277564855,7] RESET: CPU 0x042b reset in
[  180.277566143,7] RESET: CPU 0x042c reset in
[  180.277567265,7] RESET: CPU 0x042d reset in
[  180.277568240,7] RESET: CPU 0x042e reset in
[  180.277569473,7] RESET: CPU 0x042f reset in
[  181.495404765,7] RESET: CPU 0x0430 reset in
[  181.495405889,7] RESET: CPU 0x0431 reset in
[  181.495407118,7] RESET: CPU 0x0432 reset in
[  181.495408285,7] RESET: CPU 0x0433 reset in
[  181.495409374,7] RESET: CPU 0x0434 reset in
[  181.495410511,7] RESET: CPU 0x0435 reset in
[  181.495411496,7] RESET: CPU 0x0436 reset in
[  181.495412719,7] RESET: CPU 0x0437 reset in
[  178.262185825,7] RESET: CPU 0x0448 reset in
[  178.262186968,7] RESET: CPU 0x0449 reset in
[  178.262188194,7] RESET: CPU 0x044a reset in
[  178.262189322,7] RESET: CPU 0x044b reset in
[  178.262190574,7] RESET: CPU 0x044c reset in
[  178.262191759,7] RESET: CPU 0x044d reset in
[  178.262192983,7] RESET: CPU 0x044e reset in
[  178.262194210,7] RESET: CPU 0x044f reset in
[  181.495424021,7] RESET: CPU 0x0458 reset in
[  181.495425089,7] RESET: CPU 0x0459 reset in
[  181.495425917,6] RESET: Boot CPU waiting for everybody...
[  181.495426308,7] RESET: CPU 0x045a reset in
[  181.495427273,7] RESET: CPU 0x045b reset in
[  181.495428600,7] RESET: CPU 0x045c reset in
[  181.495429730,7] RESET: CPU 0x045d reset in
[  181.495431869,7] RESET: CPU 0x045e reset in
[  181.495433333,7] RESET: CPU 0x045f reset in
[  181.495434777,7] RESET: CPU 0x0460 reset in
[  181.495436002,7] RESET: CPU 0x0461 reset in
[  181.495437494,7] RESET: CPU 0x0462 reset in
[  181.495439312,7] RESET: CPU 0x0463 reset in
[  181.495440861,7] RESET: CPU 0x0464 reset in
[  181.495442655,7] RESET: CPU 0x0465 reset in
[  181.495443939,7] RESET: CPU 0x0466 reset in
[  181.495445485,7] RESET: CPU 0x0467 reset in
[  181.495446879,7] RESET: CPU 0x0468 reset in
[  181.495448392,7] RESET: CPU 0x0469 reset in
[  181.495449599,7] RESET: CPU 0x046a reset in
[  181.495451097,7] RESET: CPU 0x046b reset in
[  181.495452978,7] RESET: CPU 0x046c reset in
[  181.495454774,7] RESET: CPU 0x046d reset in
[  181.495456067,7] RESET: CPU 0x046e reset in
[  181.495457583,7] RESET: CPU 0x046f reset in
[  178.328743966,7] RESET: CPU 0x0470 reset in
[  178.328745215,7] RESET: CPU 0x0471 reset in
[  178.328746738,7] RESET: CPU 0x0472 reset in
[  178.328748189,7] RESET: CPU 0x0473 reset in
[  178.328750114,7] RESET: CPU 0x0474 reset in
[  178.328751919,7] RESET: CPU 0x0475 reset in
[  178.328753221,7] RESET: CPU 0x0476 reset in
[  178.328754724,7] RESET: CPU 0x0477 reset in
[  182.034670367,5] RESET: Fast reboot timed out waiting for secondaries to call in
[  182.034674622,6] IPMI: sending chassis control request 0x03
[  182.034693835,6] BT: seq 0x20 netfn 0x00 cmd 0x02: Message sent to host
c[  182.162528314,6] BT: seq 0x20 netfn 0x00 cmd 0x02: IPMI MSG done
pu 0x0: Vector: e40 (Emulation Assist) at [c000003f3b60b950]
    pc: c000000000001da0: masked_interrupt+0x0/0x64
    lr: c000000000038e9c: opal_return+0xc/0x48
    sp: c000003f3b60bbd0
   msr: 9000000000081031
  current = 0xc000003f3cfa1b40
  paca    = 0xc00000000fe80000	 softe: 0	 irq_happened: 0x01
    pid   = 3094, comm = init
WARNING: exception is not recoverable, can't continue
  3.45373|Ignoring boot flags, incorrect version 0x0
  3.55121|ISTEP  6. 3
  4.00867|ISTEP  6. 4
  4.00921|ISTEP  6. 5
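
To make the `RESET: CPU 0x.... reset in` lines easier to read, the CPU numbers (PIRs) can be split into chip, core and thread. This is only a sketch: it assumes the P8 layout in which the low 3 bits are the thread, the next 4 bits the core, and the remaining bits the chip. That assumption is consistent with `Resetting from cpu: 0x10 (core 0x2)` and with chips 0 and 8 in the log above, but it does not apply to P9. The function name `decode_p8_pir` is hypothetical.

```python
def decode_p8_pir(pir):
    """Split a P8 PIR into (chip, core, thread).

    Assumed layout (not taken from this thread): thread in bits 0-2,
    core in bits 3-6, chip in the bits above that.
    """
    thread = pir & 0x7
    core = (pir >> 3) & 0xF
    chip = pir >> 7
    return chip, core, thread

# PIRs taken from the log above.
for pir in (0x0010, 0x0448, 0x0477):
    chip, core, thread = decode_p8_pir(pir)
    print(f"PIR 0x{pir:04x}: chip {chip}, core 0x{core:x}, thread {thread}")
```

Grouping the `reset in` lines by the decoded core makes it easier to see which cores reported in and to compare them against the `Waking up core` lines.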

cyrilbur-ibm avatar Jan 29 '18 06:01 cyrilbur-ibm

@pridhiviraj Ok well I don't have access to P9s to test this kind of stuff, I suppose we'll have to leave it to you?

cyrilbur-ibm avatar Jan 29 '18 06:01 cyrilbur-ibm

@cyrilbur-ibm Either @stewart-ibm or I will provide a P9 system when one is free. I will ping you the system details.

pridhiviraj avatar Jan 29 '18 06:01 pridhiviraj

If I get access to a p9 I'd rather use it for something else.

For what it's worth, I've bisected to:

commit 068de7bc7688cb6b47e04edf76ce0461a8f6708b (refs/bisect/bad)
Author: Nicholas Piggin [email protected]
Date: Wed Nov 29 15:36:46 2017 +1000

    fast-reboot: add sreset timeout detection and handling

I've increased the timeout significantly and 08188269 (current master) fast-reboots with PR_INSANE on just fine. Could a firestone just be a bit too slow?

cyrilbur-ibm avatar Jan 29 '18 06:01 cyrilbur-ibm