Create a troubleshooting guide for common issues

Open ghost opened this issue 4 years ago • 4 comments

In reference to https://github.com/MichaIng/DietPi/issues/3835#issuecomment-727208523, a troubleshooting guide with common issues and fixes should be added to the docs.

ghost, Nov 14 '20 17:11

Yep, this makes sense. Quite a few issues come up very regularly, leading to duplicated reports and duplicated answers on our side. A section on how to troubleshoot specific issues, or where to start looking in case of unspecific or unknown issues (htop, dmesg, journalctl, systemctl, ...), is a great idea.


Meta note @fpetru @StephanStS: The label "documentation" in a pure documentation repo is somewhat pointless 😄. Let's add "new content" (new pages/sections), "extension" (smaller additions to existing pages/sections), "enhancement" (clarifications, more details, better screenshots, better solutions/instructions), "correction" (fixes for wrong/outdated info, spelling(?)) and similar labels.

MichaIng, Nov 17 '20 12:11

I propose adding selected issues from here: https://dietpi.com/phpbb/viewforum.php?f=11.

We also need to think about a logical substructure instead of a flat list of issues. We can introduce this substructure step by step as the TS guide gains more content.

The TS guide could be its own main topic, like "Home", "Optimized Software", "Community".

StephanStS, Nov 18 '20 09:11

Old GitHub issues will serve as examples as well. I agree not to make it a flat list, which could grow without limit and with various overlaps, but rather to start with some generic debugging hints, something like:

  • General "health" check:
    • dmesg -l emerg,alert,crit,err/dmesg | tail -10 # Kernel errors/last kernel logs
    • free -m/htop # Memory and resource usage, kill hanging processes
    • systemctl # service status overview
    • G_OBTAIN_CPU_TEMP/cpu # CPU temperature
  • A service is not accessible/crashes:
    • systemctl status <service_name>
    • journalctl -u <service_name>
    • /var/log/
  • Access/permission issues:
    • whoami/htop/systemctl cat <service_name> # Which user wants to access
    • groups <username> # Which groups is this user member of
    • ls -al /mnt/dietpi_userdata/<something> # Which owner/mode do the dirs/files have
  • Drive/mount issues:
    • df -T/lsblk/mount # mounted/available drives + mount options (e.g. ro?)
    • umount/fsck/dietpi-drive_manager # direct unmount + fsck
    • > /forcefsck/dietpi-drive_manager # fsck on next boot
    • journalctl -t systemd-fsck/cat /run/initramfs/fsck.log # check fsck logs from last boot
  • ... just as a rough starting idea.
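
A few of the "health" check commands above could be bundled into one small script. The following is only a rough sketch under some assumptions: `run_check` and `health_check` are hypothetical helper names and not part of DietPi, and `systemctl --failed --no-pager` is used here as a narrower, non-interactive variant of the plain `systemctl` overview from the list. Commands that are not installed are skipped instead of aborting the script.

```shell
#!/bin/sh
# Sketch only: run_check and health_check are hypothetical helpers,
# not DietPi functions.

# Print a header, then run the given command if it is installed.
# A missing or failing command is reported and never aborts the script.
run_check() {
    label=$1
    shift
    printf '== %s ==\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(command exited with status %s)\n' "$?"
    else
        printf '(skipped: %s not found)\n' "$1"
    fi
}

# Generic checks taken from the list above.
health_check() {
    run_check 'Kernel errors' dmesg -l emerg,alert,crit,err
    run_check 'Memory usage' free -m
    # Assumption: only failed units are listed, not the full overview
    run_check 'Failed services' systemctl --failed --no-pager
    run_check 'Mounted drives' df -T
}

# Run the checks only when explicitly requested: sh health.sh --run
if [ "${1:-}" = "--run" ]; then
    health_check
fi
```

Reading the kernel log with `dmesg` may require root on some systems, so the script would probably be run as root or via `sudo`. Further checks, such as `journalctl -u <service_name>`, could be added as additional `run_check` lines.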

Everything that is a known software-specific issue (e.g. Home Assistant first service start, Transmission not freeing memory cache, ...) should instead be added to the docs of the related software title, and likewise for known SBC-specific issues.

MichaIng, Nov 18 '20 14:11

Recovering broken ext4 superblocks: https://linuxexpresso.wordpress.com/2010/03/31/repair-a-broken-ext4-superblock-in-ubuntu/

MichaIng, Mar 17 '21 09:03