
game installs are corrupting BTRFS on Fedora 40

Open jorp opened this issue 1 year ago • 12 comments

Your system information

  • Steam client version:
      Steam Beta Branch:  Stable Client
      Steam Version:  1715891371
      Steam Client Build Date:  Thu, May 16 4:11 PM UTC -08:00
      Steam Web Build Date:  Thu, May 16 3:36 PM UTC -08:00
      Steam API Version:  SteamClient021
  • Distribution: Fedora 40
  • Opted into Steam client beta?: No
  • Have you checked for system updates?: Yes
  • Steam Logs: steam-logs.tar.gz
  • GPU: AMD Radeon RX 6800 XT

Please describe your issue in as much detail as possible:

Whenever installing a new game, BTRFS becomes corrupted and is not correctable. The game will start installing as normal, and sometime into downloading, it will stop and state it is corrupted and move on to the next game.

I have even done a fresh install of Fedora 40 on a brand new SSD and received the same result.

Steps for reproducing this issue:

  1. Install Fedora 40
  2. Make no changes to the default filesystems or partitioning; choose to encrypt the installation with LUKS
  3. Install steam from rpmfusion repository
  4. Begin installing a game
  5. Watch for errors in the steam client or in dmesg (below)
[ 2479.397510] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0x355bd35c expected csum 0x36d69a3a mirror 1
[ 2479.397537] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 2479.438946] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
[ 2479.438968] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[ 2479.453308] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
[ 2479.453333] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
[ 2488.515036] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
[ 2488.515052] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[ 2488.515308] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
[ 2488.515323] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[ 2488.515436] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
[ 2488.515446] BTRFS error (device dm-0): bdev /dev/mapper/luks-ffffffffffffff errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
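
For anyone comparing logs, here is a minimal Python sketch that groups the `csum failed` warnings by root, inode and offset, so it is easy to see whether the failures hit one block or many. The regular expression only assumes the line format shown in the excerpt above; the function name `summarize_csum_failures` is hypothetical and not part of any Steam or btrfs tooling.

```python
import re
from collections import defaultdict

# Matches "csum failed" warnings in the format shown in the dmesg excerpt above.
CSUM_RE = re.compile(
    r"csum failed root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<got>0x[0-9a-f]+) expected csum (?P<want>0x[0-9a-f]+)"
)


def summarize_csum_failures(dmesg_text):
    """Group failures by (root, inode, offset) and list the computed checksums seen."""
    seen = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = CSUM_RE.search(line)
        if m:
            key = (int(m["root"]), int(m["ino"]), int(m["off"]))
            seen[key].append(m["got"])
    return dict(seen)


# Two lines copied from the log above; in practice, feed in the full dmesg output.
sample = """\
[ 2479.397510] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0x355bd35c expected csum 0x36d69a3a mirror 1
[ 2479.438946] BTRFS warning (device dm-0): csum failed root 256 ino 188739 off 147456 csum 0xb9577995 expected csum 0x36d69a3a mirror 1
"""

for (root, ino, off), sums in summarize_csum_failures(sample).items():
    print(f"root={root} ino={ino} off={off} failures={len(sums)} computed={sorted(set(sums))}")
```

In the excerpt above, all six warnings point at the same inode and offset, and the computed checksum differs between the first read (0x355bd35c) and the later ones (0xb9577995).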

jorp avatar May 18 '24 01:05 jorp

So I switched to beta, and the problem held off for a lot longer than it did last time. Now, where the games were nearly done, I received 'corrupt update files' for one game, and 'disk read error' for the other. dmesg is full of BTRFS errors again.

Here is a particular path included this time:

[ 5224.858741] BTRFS warning (device dm-0): checksum error at logical 152508235776 on dev /dev/mapper/luks-87b10272-c9ac-4365-8e90-a82ab75f77f3, physical 153623920640, root 256, inode 214119, offset 150450176, length 4096, links 1 (path: jorp/.local/share/Steam/steamapps/downloading/2328760/PinballFX/Content/Paks/pakchunk2166-WindowsNoEditor.pak)

Here are the games in question: [screenshot: Screenshot from 2024-05-17 22-05-54]

jorp avatar May 18 '24 02:05 jorp

Also made a post on the Fedora forums here

jorp avatar May 18 '24 03:05 jorp

That's a filesystem problem, it just so happens that steam is provoking it. I would recommend you use a different file system, ext4 is old as shit and pretty stable.

g572staem avatar May 19 '24 08:05 g572staem

check your ssd health and lifespan.

hifron avatar May 20 '24 20:05 hifron

> check your ssd health and lifespan.

Thanks, I've already done this and haven't seen issues. I mentioned in the OP that I've also been able to reproduce this with a brand new drive.

As an update, I'm looking into RAM issues and it's possible that may be the culprit. Still researching and troubleshooting at the moment though.

jorp avatar May 20 '24 20:05 jorp

Hi all, I ran memtest86 and got 600+ errors. I've since RMA'd my RAM and am using a fresh set while I await its return. I haven't run into any issues since. I am going to keep this open for another week or so to see if this comes back.

jorp avatar May 23 '24 22:05 jorp

Closing as a hardware issue.

kisak-valve avatar Jun 28 '24 18:06 kisak-valve

I'm also having this issue in the last couple days on a brand new machine. Seems odd for this to be popping up for multiple people. Also on Fedora using the rpmfusion packaged steam on BTRFS+LUKS.

The files at the inode from the error are:
.local/share/Steam/steamapps/common/SteamLinuxRuntime_sniper/sniper_platform_0.20240618.92328/files/lib/i386-linux-gnu/libicudata.so.67.1
.local/share/Steam/steamapps/common/SteamLinuxRuntime_sniper/var/tmp-XG8DQ2/usr/lib/i386-linux-gnu/libicudata.so.67.1
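
For others who want to map an inode from a `csum failed` line back to file paths, here is a small Python sketch that builds the `btrfs inspect-internal inode-resolve` command from btrfs-progs. The helper name `inode_resolve_command` is hypothetical, and `/home` is only an assumed mount point for the subvolume that holds the inode; substitute whatever path matches your layout.

```python
import shlex


def inode_resolve_command(inode, subvolume_path):
    """Build the btrfs-progs command that lists the path(s) for an inode number."""
    return ["btrfs", "inspect-internal", "inode-resolve", str(inode), subvolume_path]


# 188739 is the inode from the first dmesg excerpt in this thread.
# "/home" is an assumption: a mount point of the subvolume (root 256) containing it.
cmd = inode_resolve_command(188739, "/home")

# Print the command rather than running it, since it needs root and a btrfs mount.
print("sudo " + shlex.join(cmd))
```

The inode number is only meaningful within one subvolume, so the path argument has to point inside the subvolume named by `root` in the log line.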

I'll run memtest soon just to make sure.

trgeiger avatar Jul 01 '24 15:07 trgeiger

Update: memtest86+ passed. So no memory or disk hardware issues on my end. I deleted those sniper runtime files and had steam redownload them and I haven't had the issue pop up again, yet.

trgeiger avatar Jul 02 '24 14:07 trgeiger

I have exactly the same problem. With the Flatpak version everything is fine, and it is also fine if I move a game over from Windows to Linux and only redownload the last parts.

Destinyg133 avatar Sep 18 '24 23:09 Destinyg133

Sorry to bump an old issue, but looks like this problem has returned. I tested my RAM and found a bunch of errors again.. I was surprised because I just replaced it a few months ago.

I did notice that disabling XMP in my BIOS resulted in no more memtest86 errors. However, it looks like this issue with BTRFS is still persisting.

It did go away for some time after RMA-ing my RAM, and I guess it is possible that this pair somehow became faulty too. I'm not really sure where to direct tickets and information. My issue here was (understandably) closed, and I am still seeing others that are loosely related here.

jorp avatar Oct 12 '24 02:10 jorp

@kisak-valve could you reopen this?

jorp avatar Oct 12 '24 02:10 jorp

Same thing happens for me (Fedora 41 Kinoite). Memtest is OK, Samsung 990 PRO 2TB. Edit: I suspect my SSD may have failed. I tried Ubuntu 20.04 with ext4 and encountered similar problems.

spikhoff avatar Dec 25 '24 19:12 spikhoff