ASUS Chromebox 3 (CN65) Crashes under high CPU load

Open lanrat opened this issue 5 years ago • 38 comments

I'm not sure if this is the right place to file this issue, or even if it's firmware related, so feel free to close it if this is not the correct place.

Device: ASUS Chromebox 3 (CN65) Fw Ver: MrChromebox-4.12 (06/04/2020)

I'm running Debian Linux, and whenever I run any process with high CPU load the CN65 instantly locks up and is entirely unresponsive, needing a hard reboot to come back to life. I've had this happen with multiple different processes, all of which push the CPU. Unfortunately, since it happens the instant the CPU load gets too high, I'm unable to see anything in the system logs, and I'm currently at a loss on how to debug it.

I'm using the stock 90W power supply.

lanrat avatar Nov 30 '20 03:11 lanrat
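
Since the lock-up leaves nothing in the system logs, one way to capture the last state before a freeze is to write each sample to disk synchronously. The following is a minimal sketch, not a tested diagnostic for this board: the helper names `append_durable` and `log_samples` are hypothetical, and `sample` is whatever callable you choose to supply (for example, a temperature or frequency reader).

```python
import os
import time


def append_durable(path, line):
    """Append one line and ask the OS to flush it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line.rstrip("\n") + "\n")
        f.flush()
        # fsync makes it likely the line survives a hard lock-up right after.
        os.fsync(f.fileno())


def log_samples(path, sample, count, interval_s=1.0):
    """Call sample() `count` times and log each result with a timestamp."""
    for i in range(count):
        append_durable(path, f"{time.time():.3f} {sample()}")
        if i + 1 < count:
            time.sleep(interval_s)
```

After a hard reboot, the last line of the file shows roughly what the system looked like just before it froze.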

which CPU? any way to reliably reproduce? If so, I'd try booting a live USB and replicating there to rule out an issue with your install/kernel etc

MrChromebox avatar Nov 30 '20 05:11 MrChromebox
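
For a repeatable test on a live USB, a small load generator that needs nothing beyond a stock Python install can help. This is only a sketch: the function names are hypothetical, and a pure-Python busy loop probably draws less power than hashcat or aircrack-ng, so a system that survives it may still crash under the real workloads.

```python
import multiprocessing as mp
import os
import time


def burn(seconds):
    """Spin one core with integer math until the deadline; return iterations."""
    deadline = time.monotonic() + seconds
    iterations = 0
    x = 1
    while time.monotonic() < deadline:
        x = (x * 1103515245 + 12345) % 2147483648
        iterations += 1
    return iterations


def burn_all_cores(seconds, workers=None):
    """Run burn() in one process per logical CPU (or `workers` processes)."""
    workers = workers or os.cpu_count() or 1
    with mp.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Call `burn_all_cores(60)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module cleanly.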

The CPU is an Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz.

In the past this has happened whenever I run aircrack-ng, hashcat, or gunzip. I just tried running the programs again and it appears to not crash. I'll run some more tests and report back.

I did notice that when doing these tests the CPU gets throttled because the CPU temperature is above the threshold, which I guess is normal for these types of programs.

lanrat avatar Dec 01 '20 04:12 lanrat
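
To quantify the thermal throttling mentioned above, Linux exposes temperatures and per-core throttle counters through sysfs. The sketch below assumes the usual `/sys/class/thermal` and `/sys/devices/system/cpu/cpu*/thermal_throttle` layout found on Intel systems; the function names are hypothetical, and files that are missing on a given kernel are skipped.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"thermal_zoneN:type": degrees_celsius} for each zone found."""
    zones = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            milli_c = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone without a readable temperature
        zones[f"{zone.name}:{kind}"] = milli_c / 1000.0
    return zones


def read_throttle_counts(base="/sys/devices/system/cpu"):
    """Return {"cpuN": count} of thermal throttle events per logical CPU."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/core_throttle_count"
    for f in sorted(Path(base).glob(pattern)):
        try:
            counts[f.parent.parent.name] = int(f.read_text().strip())
        except (OSError, ValueError):
            continue
    return counts
```

Comparing the throttle counts before and after a test run shows whether the CPU was throttling during that run.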

I can provide a test firmware which doesn't set the TDP values as high (ie, back to stock power levels)

MrChromebox avatar Dec 01 '20 06:12 MrChromebox

How much lower are they set in your firmware vs. stock?

I'm using the OEM cooling which I'm assuming is not the best.

lanrat avatar Dec 01 '20 16:12 lanrat

stock is 15W/28W for PL1/2, current UEFI is 28W/51W

MrChromebox avatar Dec 01 '20 16:12 MrChromebox
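
On Linux, the limits currently in effect can usually be checked through the powercap (intel-rapl) sysfs interface, where the `long_term` constraint corresponds to PL1 and `short_term` to PL2. A minimal sketch, assuming the package domain is at `/sys/class/powercap/intel-rapl:0` (the path can differ between systems, and the function name is hypothetical):

```python
from pathlib import Path


def read_rapl_limits(domain="/sys/class/powercap/intel-rapl:0"):
    """Return {constraint_name: watts} for one powercap RAPL domain."""
    limits = {}
    d = Path(domain)
    for name_file in sorted(d.glob("constraint_*_name")):
        idx = name_file.name.split("_")[1]  # "constraint_0_name" -> "0"
        try:
            name = name_file.read_text().strip()
            limit_file = d / f"constraint_{idx}_power_limit_uw"
            microwatts = int(limit_file.read_text().strip())
        except (OSError, ValueError):
            continue
        limits[name] = microwatts / 1_000_000
    return limits
```

With the 28W/51W firmware described above, the expectation would be values near 28 for `long_term` and 51 for `short_term`, though that has not been verified here.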

Would the TDP values cause the entire system to freeze?

I would assume they would just down-clock the CPU and not cause a lock-up.

lanrat avatar Dec 03 '20 21:12 lanrat

hard to say, but I'm not able to reproduce here on a Celeron box

MrChromebox avatar Dec 03 '20 21:12 MrChromebox

Hi - I have a similar issue with my ASUS Chromebox 3 (CN65) - any chance of the test firmware with lower TDP values to see if that helps?

dixonalistair85 avatar Dec 12 '20 04:12 dixonalistair85

I second this, my CN65 has similar issues when running it with full load. Any way of getting a modified Firmware with 15 Watts of TDP? (or a short guide on how to build it myself?)

egrath avatar Oct 16 '21 08:10 egrath

right now it's set to 28W/51W, I'll reduce to 20/40 for the next release.

Are people seeing issues using a 95W power brick, or something smaller?

MrChromebox avatar Oct 16 '21 12:10 MrChromebox

Mine's one with a i7-8550U CPU and a 90 W rated PSU.

Interestingly, according to the Spec, the maximum configurable TDP for this CPU should not exceed 25 W. https://ark.intel.com/content/www/de/de/ark/products/122589/intel-core-i7-8550u-processor-8m-cache-up-to-4-00-ghz.html

egrath avatar Oct 16 '21 15:10 egrath

Hello everyone! Can someone email me a BIOS DUMP from CN65 i7-8550u? My BIOS is damaged. I want to bring him back to life. Thanks! [email protected]

CageOff avatar Dec 23 '21 21:12 CageOff

@CageOff please do not hijack this issue. Also, see https://wiki.mrchromebox.tech/Unbrcking for important info on directly flashing Chromeboxes.

MrChromebox avatar Dec 24 '21 16:12 MrChromebox

Has this been solved? I have 3 Chromebox3/i7-8550U CPU machines that are having the same stability issues when under load. They are powered by the 90W Asus power supplies that came with them.

dgranz avatar Apr 05 '22 16:04 dgranz

I could never get my initial CN65 (Chromebox3) working under high load without crashing. However I've since had good luck on some other units that have been stable for a few months. So it might be hardware related? Maybe a slightly different revision that causes problems with this firmware?

lanrat avatar Apr 05 '22 18:04 lanrat

Do we even know it's firmware related? Could we try to reproduce it with original fw on ChromeOS?

bam80 avatar Apr 05 '22 18:04 bam80

I never had any issues with my units when I was running ChromeOS. But I also never used them under as high of a load, or for very long.

lanrat avatar Apr 05 '22 18:04 lanrat

> I could never get my initial CN65 (Chromebox3) working under high load without crashing. However I've since had good luck on some other units that have been stable for a few months. So it might be hardware related? Maybe a slightly different revision that causes problems with this firmware?

The original Firmware sets the CPU TDP Limit to 15 W so it's always inside a very safe and conservative margin compared to the 28 W set by MrChromebox's Firmware - Intel recommends a maximum TDP of 25 W to be set by system integrators. IMHO the CN65 cooling system simply can't handle the thermal exhaust when running at 28 W.

egrath avatar Apr 05 '22 18:04 egrath

On ChromeOS, we could try to run the same power-hungry utils in the stock Linux container.

bam80 avatar Apr 05 '22 18:04 bam80

> right now it's set to 28W/51W, I'll reduce to 20/40 for the next release.

Has the TDP limit been decreased? If so, could you build test fw with 28W/51W please?

I've started using my Chromebox for real work and hit the opposite problem: under high load (compiling), the CPU clock is capped at 1.8GHz despite the "Frequency Boost" setting with the Ondemand governor. CPU temp stays around 60-64C the whole time, versus 36C when idle. When the system is relatively idle, short spikes in load can raise the clock to 3GHz and higher, as intended.

So I would really love to have the clock rise more under high load, even at the cost of a few degrees C.

bam80 avatar Jun 01 '22 14:06 bam80
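
When the clock appears stuck at base frequency, it can help to record the governor, the turbo/boost switches, and the per-core frequencies at the same moment. The sketch below assumes the standard cpufreq sysfs layout; `intel_pstate/no_turbo` and `cpufreq/boost` only exist with certain drivers, so missing files are reported as `None`. The function name is hypothetical.

```python
from pathlib import Path


def _read(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cpufreq_snapshot(base="/sys/devices/system/cpu"):
    """Collect governor, turbo/boost flags, and current MHz per logical CPU."""
    b = Path(base)
    mhz = {}
    for f in sorted(b.glob("cpu[0-9]*/cpufreq/scaling_cur_freq")):
        try:
            mhz[f.parent.parent.name] = int(f.read_text().strip()) / 1000
        except (OSError, ValueError):
            continue
    return {
        "governor": _read(b / "cpu0" / "cpufreq" / "scaling_governor"),
        "no_turbo": _read(b / "intel_pstate" / "no_turbo"),  # "1" = turbo off
        "boost": _read(b / "cpufreq" / "boost"),  # "0" = boost off
        "mhz": mhz,
    }
```

Taking one snapshot while idle and another during a compile makes it easier to tell a disabled boost setting apart from a power or thermal cap.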

> Has the TDP limit been decreased? If so, could you build test fw with 28W/51W please?

it's 20/40 as I mentioned above. I never had any issues with 28/51 so that's what I use myself

MrChromebox avatar Jun 01 '22 17:06 MrChromebox

@MrChromebox nice, could you share 28/51 fw then so I could check if it helps with my issue above?

bam80 avatar Jun 01 '22 17:06 bam80

there's no way that 20W PL1 is causing throttling when idle.

MrChromebox avatar Jun 02 '22 17:06 MrChromebox

> I've started using my Chromebox for real work and hit the opposite problem: under high load (compiling), the CPU clock is capped at 1.8GHz despite the "Frequency Boost" setting with the Ondemand governor. CPU temp stays around 60-64C the whole time, versus 36C when idle.

I tried a cold boot (from an unpowered state) and it fixed the problem for me. A normal reboot didn't fix it. I didn't try shutting down and powering on while still connected to power. So I think it still might be something with the EC. I'll report it separately when I reproduce it again.

bam80 avatar Jun 02 '22 22:06 bam80

I know this is an old post, but I'm still having this exact issue in 2024. I'm using an Asus Chromebox 3 i7-8550U to run a Minecraft server and it crashes when power usage goes above ~25W. Has anyone found a solution to this issue? Cold booting didn't change anything for me.

I'm creating a Github account just to comment on this, so let me know if there are any additional logs I need to upload to help resolve this issue.

klam2003 avatar Mar 16 '24 18:03 klam2003
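
To put a number on the package power at the moment of a crash, the RAPL energy counter can be sampled on Linux and converted to average watts. This is a sketch under assumptions: the package domain is taken to be `/sys/class/powercap/intel-rapl:0`, reading `energy_uj` typically requires root on recent kernels, and the function names are hypothetical.

```python
import time
from pathlib import Path


def energy_delta_uj(before, after, max_range):
    """Microjoules used between two counter reads, allowing one wraparound."""
    if after >= before:
        return after - before
    # The counter wrapped past max_range and restarted near zero.
    return after + (max_range - before)


def average_watts(domain="/sys/class/powercap/intel-rapl:0", interval_s=1.0):
    """Average package power over interval_s, from the RAPL energy counter."""
    d = Path(domain)
    max_range = int((d / "max_energy_range_uj").read_text())
    e0 = int((d / "energy_uj").read_text())
    t0 = time.monotonic()
    time.sleep(interval_s)
    e1 = int((d / "energy_uj").read_text())
    t1 = time.monotonic()
    joules = energy_delta_uj(e0, e1, max_range) / 1_000_000
    return joules / (t1 - t0)
```

Logging this value once per second while the load ramps up would show whether crashes cluster around a particular wattage, such as the ~25W reported above.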

@MrChromebox would it be helpful to investigate the problem with a self-made Suzy-Q cable? If so, I could share my experience of making one from just a pair of resistors and a USB-C breakout board.

bam80 avatar Mar 16 '24 18:03 bam80

My CN65 on Windows 11 often shuts itself down due to a "thermal event". It gets really hot when unzipping a 5GB archive full of small files: the CPU hits 95 C and the NVMe disk hits 80 C. Unfortunately, the unexpected shutdown also happens when I'm not using the disk, so it's just the CPU load. I found a couple of Reddit posts referring to the same problem.

i7-8550u with 90w power supply

aiac avatar May 13 '24 12:05 aiac

> I found a couple of Reddit posts referring to the same problem.

@aiac Could you post the links here for the reference?

bam80 avatar May 13 '24 12:05 bam80

> > I found a couple of Reddit posts referring to the same problem.
>
> @aiac Could you post the links here for the reference?

I can't find every comment where someone has complained about this issue, but here are two examples:
https://www.reddit.com/r/chrultrabook/comments/spl990/issues_when_running_windows_10_on_cn65
https://www.reddit.com/r/chrultrabook/comments/hwc5ng/experiences_from_an_asus_chromebox_3_cn65

I would be grateful for a new firmware with the original PLs, since I probably won't be able to compile it myself.

The ability to configure these limits in software would be best because we could test the settings on specific devices.

aiac avatar Jun 12 '24 20:06 aiac

> The ability to configure these limits in software would be best because we could test the settings on specific devices.

once the necessary framework exists under coreboot and edk2 I'll be happy to do that, but for now recompilation is the only way. The PLs in the latest release (4.22.5) are very close to stock and should not be problematic

MrChromebox avatar Jun 13 '24 03:06 MrChromebox
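
Separately from the firmware defaults discussed above, Linux lets root write the running long-term limit through the same powercap interface, which may be enough for per-device experiments. This is a hedged sketch, not a confirmed fix: whether a write takes effect on this firmware is an assumption (firmware can lock the limits), the domain path may differ, the setting does not persist across reboots, and the function name is hypothetical. It defaults to a dry run.

```python
from pathlib import Path


def set_long_term_limit(watts, domain="/sys/class/powercap/intel-rapl:0",
                        dry_run=True):
    """Write a new long_term (PL1) limit in watts; return (file, value)."""
    d = Path(domain)
    name = (d / "constraint_0_name").read_text().strip()
    if name != "long_term":
        raise RuntimeError(f"constraint_0 is {name!r}, not 'long_term'")
    target = d / "constraint_0_power_limit_uw"
    value = str(int(watts * 1_000_000))  # sysfs expects microwatts
    if not dry_run:
        target.write_text(value)  # needs root on a real system
    return target, value
```

For example, `set_long_term_limit(15, dry_run=False)` would request the stock 15W PL1 mentioned earlier in the thread; reading the file back afterwards shows whether the kernel accepted the value.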