rpi-power-monitor
Grafana password frequency and SD Card fail
Hi David!
Grafana (v7.1.3) is bugging me much more with password requests than I remember when first installing my system last fall.
Today, in the middle of a session of examining some data, it interrupted me and demanded a password again!
Now, I use chrome to store the password, so it is a few clicks, but I would like to reduce the frequency, or possibly go password free on Grafana.
Since the pi is behind a router NAT, it is not like the world has direct access to the pi. "ShieldsUp" shows my ports are in stealth, but that is the extent of my network security knowledge.
In stumbling about on Grafana, I found nothing in "preferences", but some mysterious variables in Server Admin->settings->auth. The comment is "These system settings are defined in grafana.ini or custom.ini (or overridden in ENV variables). To change these you currently need to restart grafana."
Have you accepted the defaults or can you suggest changes that might be less annoying?
Thanks.
Hi bobstanl, I suspect that your Grafana container might be restarting. Can you share the output of docker ps -a and also uptime? These two commands will tell me if docker has restarted any of the containers since the last boot.
This project uses the default Grafana settings, and those .ini files are used to tweak them. See this question on stack overflow about not requiring a login to view dashboards: https://stackoverflow.com/a/51173858/6711085
If the docker resetting is causing more frequent password requests, you have nailed it! I knew I had a problem with the pi mysteriously quitting. Here is a 30 day screen shot. It shows a PG&E shutdown on about 08/31, but then I had a mysterious shutdown on 09/06. No power outages I am aware of, but could have been a glitch.
Now, here are the commands you asked for:
@.***:~ $ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5b0a94d033be influxdb "/entrypoint.sh infl…" 13 months ago Up 9 days 0.0.0.0:8086->8086/tcp influx
a4961cc24745 grafana/grafana "/run.sh" 13 months ago Up 9 days 0.0.0.0:3000->3000/tcp
@.:~ $ uptime
12:55:24 up 1 day, 12:50, 1 user, load average: 7.05, 6.65,
@.:~ $ uptime
12:59:17 up 1 day, 12:54, 1 user, load average: 5.60, 6.09,
@.***:~ $ date
Thu 16 Sep 13:08:19 PDT 2021
This is confusing. I power rebooted the pi, after the mysterious shutdown, on 09/14 after about 22:00.
- Why does the docker think it was up 9 days instead of 2? I guess the uptime indicates I had a reset and boot 24 + 12 hours ago, i.e. my reboot on 09/14. I found a linux utility, "tuptime", that might help since uptime only goes to the last boot. May have to install that on the pi.
- Do you know of any other way to debug mysterious reboots?
Thanks for help!
That's strange - I was expecting to see the containers have a status with a smaller uptime than the actual Pi's uptime. I don't think I've ever seen that before.
The screenshot didn't come through because the response gets sent to GitHub's issue tracker. You can attach a screenshot (and reply) directly to the issue here:
https://github.com/David00/rpi-power-monitor/issues/36
I'm not sure of any specific ways to debug reboots but a quick Google provided this link, which appears to be helpful: https://geekflare.com/check-linux-reboot-reason/
Here is the screenshot mentioned in previous comment
This issue is drifting from Grafana password to debugging pi shutdowns. I will need to do some further research to see if there is a better tool than tuptime in case this happens again. You did give a good lead for avoiding password. If you have any comments on debugging shutdowns, I would appreciate them. Otherwise, this issue is probably complete.
This is just a guess, but if it was an unclean shutdown I could envision a bug with docker getting confused about the start time of the container. This Server Fault question might help to shed some more insight on the actual docker events that happened - https://serverfault.com/questions/909265/how-to-check-the-history-of-docker-container-restarts
And regarding shutdowns, I'd look at the log files first. journalctl and/or /var/log/syslog would be the places I'd start with.
Thanks for the two suggestions.
First, the "docker events" did not work for me. Here is one of my attempts. I appeared to get a hang, after each attempt, and had to ctl-c out. They seem to follow the examples but may not be correct:
pi@raspberrypi:~ $ docker events --filter event=restart --since='2021-08-28'
^Z
[1]+ Stopped docker events --filter event=restart --since='2021-08-28'
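A possible reason for the apparent hang, added here as an editorial note: docker events keeps streaming new events until it is interrupted unless an --until bound is supplied. The Python sketch below only builds a bounded command line; the function name docker_events_cmd is hypothetical, and whether any restart events are still retained by the Docker daemon is not something this sketch can tell.

```python
import time


def docker_events_cmd(since, until=None):
    """Build a 'docker events' command that exits instead of streaming forever.

    'since' is a date string such as '2021-08-28'. 'until' defaults to the
    current Unix timestamp, so Docker prints the matching stored events and
    then returns to the prompt.
    """
    if until is None:
        until = str(int(time.time()))
    return [
        "docker", "events",
        "--filter", "event=restart",
        "--since", since,
        "--until", until,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to execute it.
    print(" ".join(docker_events_cmd("2021-08-28")))
```

Note that ^Z only suspends the command in the background; Ctrl-C ends it.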
As for the syslog, I had never looked at one before so it took some time to learn they are one day long and a number of previous days are retained:
-rw-r----- 1 root adm 103869 Sep 16 16:55 syslog
-rw-r----- 1 root adm 120642 Sep 16 00:00 syslog.1
-rw-r----- 1 root adm 29791 Sep 15 00:05 syslog.2.gz
-rw-r----- 1 root adm 17941 Sep 6 00:00 syslog.3.gz
-rw-r----- 1 root adm 12085 Sep 5 00:00 syslog.4.gz
-rw-r----- 1 root adm 13232 Sep 4 00:00 syslog.5.gz
-rw-r----- 1 root adm 11525 Sep 3 00:00 syslog.6.gz
-rw-r----- 1 root adm 12127 Sep 2 00:00 syslog.7.gz
The syslog.2, dated Sep 15, actually seems to have my mysterious shutdown on Sep 6. The file date indicates the previous day's capture.
Here are two snips from syslog.2 and the way I interpret it.
First, I am getting a lot of "Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'" events throughout all the logs. It does not appear to affect the data presented by Grafana, but my RPi3 is definitely overworked.
So, the first part of the snip is several lines of this error, followed by what I believe is my power reset on Sep 14, around midnight before the 15th. I believe rpi linux grabs a stored time at boot until it can get an NTP update. So, the start of my power reboot is "Sep 6 14:09:42", where the last reading before the mysterious shutdown was at "Sep 6 14:15:31", a later time! I then skip many lines of the reboot process and go to the end of the syslog file where it gets the timesyncd, decides it is after midnight and proceeds to create a new syslog file.
Snips from syslog.2.gz:
Sep 6 14:06:17 raspberrypi python3.7[1022]: 2021-09-06 14:06:17 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:06:44 raspberrypi python3.7[1022]: 2021-09-06 14:06:44 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:07:08 raspberrypi python3.7[1022]: 2021-09-06 14:07:08 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:07:34 raspberrypi python3.7[1022]: 2021-09-06 14:07:34 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:10:27 raspberrypi python3.7[1022]: 2021-09-06 14:10:27 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:15:31 raspberrypi python3.7[1022]: 2021-09-06 14:15:31 : Failed to write data to Influx. Reason: b'{"error":"timeout"}\n'
Sep 6 14:09:42 raspberrypi systemd-modules-load[107]: Inserted module 'i2c_dev'
Sep 6 14:09:42 raspberrypi fake-hwclock[109]: Mon 6 Sep 20:17:02 UTC 2021
Sep 6 14:09:42 raspberrypi systemd-fsck[131]: e2fsck 1.44.5 (15-Dec-2018)
Sep 6 14:09:42 raspberrypi systemd[1]: Started udev Coldplug all Devices.
Sep 6 14:09:42 raspberrypi systemd[1]: Starting Helper to synchronize boot up for ifupdown...
Sep 6 14:09:42 raspberrypi systemd[1]: Started Helper to synchronize boot up for ifupdown.
Sep 6 14:09:42 raspberrypi systemd-fsck[131]: rootfs: clean, 69561/1899328 files, 1970600/7725184 blocks
Sep 6 14:09:42 raspberrypi systemd[1]: Started File System Check on Root Device.
Sep 6 14:09:42 raspberrypi systemd[1]: Starting Remount Root and Kernel File Systems...
Sep 6 14:09:42 raspberrypi systemd[1]: Started Set the console keyboard layout.
...SKIP MANY BOOTUP LINES...
Sep 6 14:10:05 raspberrypi systemd[1]: Started Update UTMP about System Runlevel Changes.
Sep 6 14:10:05 raspberrypi systemd[1]: Startup finished in 6.391s (kernel) + 29.124s (userspace) = 35.516s.
Sep 6 14:10:09 raspberrypi dhcpcd[557]: vethbbc77be: probing for an IPv4LL address
Sep 6 14:10:09 raspberrypi dhcpcd[557]: vethdfb0bbf: probing for an IPv4LL address
Sep 6 14:10:10 raspberrypi systemd[1]: systemd-fsckd.service: Succeeded.
Sep 15 00:05:28 raspberrypi systemd-timesyncd[330]: Synchronized to time server for the first time 195.85.215.215:123 (2.debian.pool.ntp.org).
Sep 15 00:05:28 raspberrypi systemd[1]: Starting Rotate log files...
Sep 15 00:05:28 raspberrypi systemd[1]: Starting Daily man-db regeneration...
Sep 15 00:05:28 raspberrypi systemd[1]: Starting Daily apt download activities...
The next file in series, syslog.1, continues booting and starting power-monitor.
If anyone is interested, I could post the two files, syslog.2.gz and syslog.1.gz.
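Since the reboot shows up in the log as a timestamp that steps backwards, that pattern can be searched for mechanically. The Python sketch below is a minimal illustration, assuming every line starts with a classic "Mon D HH:MM:SS" stamp, an English locale for month names, and a known year (syslog stamps do not carry one). The function names are hypothetical and the sample lines are abbreviated from the snips above.

```python
from datetime import datetime


def parse_stamp(line, year=2021):
    """Parse the leading 'Mon D HH:MM:SS' stamp of a syslog line.

    Classic syslog stamps carry no year, so one is assumed.
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def backward_jumps(lines, year=2021):
    """Return (previous, current) stamp pairs where time goes backwards."""
    jumps = []
    previous = None
    for line in lines:
        current = parse_stamp(line, year)
        if previous is not None and current < previous:
            jumps.append((previous, current))
        previous = current
    return jumps


if __name__ == "__main__":
    sample = [
        "Sep 6 14:10:27 raspberrypi python3.7[1022]: Failed to write data to Influx.",
        "Sep 6 14:15:31 raspberrypi python3.7[1022]: Failed to write data to Influx.",
        "Sep 6 14:09:42 raspberrypi systemd-modules-load[107]: Inserted module 'i2c_dev'",
    ]
    for before, after in backward_jumps(sample):
        print(f"clock stepped back from {before} to {after}")
```

A backward step only marks a likely boot on a board without a real-time clock; it does not say why the previous session ended.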
So, I still have no idea why the pi shut down on Sep 6. It hasn't done that before. I did buy an RPi 4 with intention of replacing this RPi3 but am using it on a different project. If I get another mysterious halt, I think replacing the pi is my next step.
Interesting about the docker events issue. Can't say I have any ideas on that one.
You are correct about how syslog works, although the specifics on log rotation can vary depending upon system configuration, Linux distribution, etc. Also correct about timestamps jumping backwards, this can frequently happen with NTP. I imagine the kernel will default to pulling the time from the BIOS during boot until updated otherwise, which might have not been kept up to date - last I recall the hardware clock is completely separate from the internal OS clock at least on x86 systems.
And regarding the log content - to confirm, there's no log entries between the last entry on the 6th and the shutdown and reboot on the 15th? If so, this is rather unusual to have such a large gap, although not impossible. What is more likely is that the Pi hung just after 14:15 on the 6th, per the last error on failure to write to influx and large gap of missing data on the grafana dashboard. There are no logged errors obviously, but I've seen lots of systems with hardware issues exhibit this same exact behavior.
Trying on a new Pi I think is a great troubleshooting step, it's good to have one available even if temporarily. The RPi 4 is a fairly decent upgrade from a 3 as well. I run rpi-power-monitor on a v4 quite effectively, although I leverage existing deployments of influx and grafana on my kubernetes cluster.
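One possible explanation for the earlier "Up 9 days" status, offered as an assumption rather than a confirmed diagnosis: if Docker derives that status from the wall-clock time recorded when the container started, and the containers started while the clock still read Sep 6 (before the NTP sync seen in syslog.2), the arithmetic comes out at nine days. The two timestamps below are taken from the log snips and command output posted in this thread; how Docker actually computes the status is not verified here.

```python
from datetime import datetime

# Clock value at the end of boot, before NTP corrected it (from syslog.2).
container_start = datetime(2021, 9, 6, 14, 10, 5)
# Time the first 'uptime' was run, close to the 'docker ps -a' output.
observed_at = datetime(2021, 9, 16, 12, 55, 24)

apparent_uptime = observed_at - container_start
print(apparent_uptime.days)  # prints 9
```

If that assumption holds, the containers did restart with the Pi, and the status was simply measured against a stale clock.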
Hi David,
Somewhat related to this. I'd like to modify the default grafana settings to remove the auth login screen as I use a reverse proxy with authelia in front for authentication. Could you tell me where the docker config files would be stored in the powermon os raspbian image?
Thanks
Hey @jmadden91, there are a couple ways to go about this, but to answer your specific question about the config files:
The custom Pi OS image has a docker-compose.yml file in /boot/docker-compose/mydockerfolder.
However, the Grafana service definition does not map any local volumes to the Grafana container, and this is where a couple different ways to disable the Grafana authentication come into play.
The best way (IMO) would be to spawn a new image from your existing Grafana container, then recreate your Grafana container (this time with a volume mapped). But, this is definitely the more complicated way. See this SO question/answer if you want to try this.
The easier way to get it done as quick as possible would be to connect to the container, edit the file from inside the container, and restart Grafana.
I am working on addressing the better way to do this in some future software updates (which will include pre-provisioning Grafana), so for the time being, here's the easier way to do this:
(@bobstanl - you may want to follow these steps too to disable your Grafana authentication)
- Get a root shell inside the Grafana container (assuming grafana is the name of your existing container):
  docker exec -it --user 0 grafana sh
- Use vi to edit the config file at /usr/share/grafana/conf/defaults.ini:
  vi conf/defaults.ini
- Press / in Vi to search and type in the following text: auth.anon
  Set enabled = true in the [auth.anonymous] config section. (If you aren't familiar with Vi, press i to switch to edit mode, and then you can navigate/type normally.)
- Press Esc to leave edit mode, then press : followed by wq! to save and close the file.
- Finally, exit the container shell and restart your Grafana container:
  docker restart grafana
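As an alternative to editing defaults.ini inside the container, Grafana also reads settings from environment variables (the Server Admin note quoted at the top of this thread mentions ENV overrides). The documented naming pattern is GF_<SECTION>_<KEY> in upper case, with dots and dashes in the section name replaced by underscores. The helper below is a hypothetical sketch that derives the variable name; how the variable is then passed to the container (for example with docker run -e or in docker-compose.yml) depends on your setup and is not shown.

```python
def grafana_env_name(section, key):
    """Map an ini section/key pair to Grafana's GF_<SECTION>_<KEY> variable name."""
    normalized = section.replace(".", "_").replace("-", "_")
    return f"GF_{normalized}_{key}".upper()


if __name__ == "__main__":
    # Equivalent of setting 'enabled = true' under [auth.anonymous].
    print(f"{grafana_env_name('auth.anonymous', 'enabled')}=true")
```

An environment variable survives recreating the container, whereas an edit made inside the container is lost when the container is removed.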
Latest report: SD Card totally dead, lost a year's worth of power data!
This thread started with a problem of puzzling resets that caused me to log in to grafana more often than usual.
The earlier resets were likely caused by power outages, which we in the PG&E forest suffer a lot. But now, it could also have been a deteriorating sd card.
The SD card will not boot, nor will it respond to every attempt to read it. Below, I will describe the symptoms and how I tried to read the card. Then I will describe some changes I am considering for my next version.
On Thursday, I noticed a gap in the grafana plot data from earlier in the day. I logged in and then did a sudo reboot. It reset that time, but in checking things, it appeared influx was not responding. When I did the "docker ps -a" test, grafana was running but influxdb had Exited (2) shortly after the reboot.
So, I attempted another sudo reboot, and got : Failed to open initctl fifo: No such device or address Failed to talk to init daemon. Ehhh!!!??? Can't reboot? So, pulled the plug to do a poweroff but now the pi would not boot at all. Later, I did try the pi with another sd card and it will boot, so the fault is with the card.
Next day, I removed the sd card and attempted to examine it. I used two different USB sd card readers, checked them with other sd cards to be sure the readers worked, and tried both ubuntu and windows. The sd card would not respond. On Linux, lsusb showed the usb reader, but not the card. It is toast and I never backed up the power data from over a year's worth of operation! Woe is me!
???- Any other ideas on getting the data from a non-responding sd card? (It was a Sandisk 32GB, HC-I C4.)
FUTURE REVISIONS:
-As I indicated earlier, I will change out the RPi3B with a RPi4B-2GB.
POWER OUTAGE PROTECTION
Since this is a big problem where I live, I want a battery backed UPS that will keep the pi on just long enough to detect the outage and then do a controlled shutdown. No use having a power-monitor system running when the power is off! It then needs to automatically boot up when power is restored.
Omzlo Pivoyager: On a different project, I have been using an Omzlo Pivoyager on a RPi4B. It has had many problems and I do not recommend it on a RPi4. It has a wimpy MCP73871 Battery Charge IC. This would not allow the pi to boot with only USB 5V: it powered up for a few seconds, then did a momentary cutout that rebooted the pi. With a battery connected, the system would operate normally. The RTC ran fast, gaining about 12 seconds a day. Finally, the voyager is now continuously reporting a fault on the MCP73871 after only a few weeks of operation. On the other hand, Omzlo has excellent documentation, schematic, and firmware available, which is why I picked it in the first place.
Raspberry Pi UPS HAT I just purchased a Raspberry Pi UPS HAT, but have not received it yet: https://www.pishop.us/product/raspberry-pi-ups-hat/ It appears to have what I need to gracefully shutdown the RPi4 after a power outage. Mechanically, it will take some "futzing" to mount with David's PCA, but with some 40pin extender sockets, I think it will go.
???- Does anyone know where I could find a schematic for the Raspberry Pi UPS Hat by Buyapi.ca?
Other Pi battery backups that I looked at and rejected for various reasons, including cost, are: MakerHawk, Pisugar S pro, PiJuice Hat and Kuman.
If the Pi UPS HAT is a problem, my next choice will be: Geekworm Raspberry Pi X728 V2.1 It costs a lot more plus you have to buy 2 18650 batteries.
Software changes for UPS: The UPS will require changes to the main "while True" loop in David's run_main code.
- A check each loop for AC power out. Looks like the Pi UPS HAT has a GPIO pin to check. Also, possibly could check David's "voltage".
- If power is out, shut down gracefully before taking much battery current. On the Omzlo I did the following (in the main loop, I detected AC out and called this function):

    import subprocess

    def RPi_PowerShutdown():
        # 'logger' is the logger object already set up in the monitor code
        try:
            # Perform PiVoyager shutdown sequence
            logger.info("LCL Monitor performing PiVoyager shutdown sequence")
            # Tell Voyager to power up Pi when USB back on
            subprocess.run(['pivoyager', 'enable', 'power-wakeup'])
            # Tell Voyager to sleep after 25 sec
            subprocess.run(['pivoyager', 'watchdog', '25'])
            # Kill Pi power
            subprocess.run(['shutdown', '--poweroff', 'now'])
        except Exception as e:
            logger.critical(f"Failed to perform power off shutdown sequence. Reason: {e}")

On the Pi UPS HAT, I probably will only need:

    # Kill Pi power
    subprocess.run(['shutdown', '--poweroff', 'now'])
???- Does anyone see a problem with this power down technique? I am assuming the docker and so on will shutdown without trashing files.
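One detail worth considering for the per-loop AC check is debouncing, so that a single bad reading or a sub-second glitch does not power the Pi off. The Python sketch below is only an illustration: the function name, the threshold of three readings, and the idea that the UPS exposes a simple present/absent signal are all assumptions, not details of any specific HAT.

```python
def make_outage_detector(required_consecutive=3):
    """Return a function that reports True once AC has been out N checks in a row.

    Requiring several consecutive readings avoids shutting down on a brief glitch.
    """
    misses = 0

    def update(ac_present):
        nonlocal misses
        misses = 0 if ac_present else misses + 1
        return misses >= required_consecutive

    return update


if __name__ == "__main__":
    detector = make_outage_detector(required_consecutive=3)
    # Simulated readings; in the real loop these would come from the UPS GPIO pin.
    for reading in [True, False, True, False, False, False]:
        if detector(reading):
            print("AC out: start graceful shutdown")
```

In the main loop, the shutdown function would be called at the point where the detector first returns True.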
POWER-MONITOR DATA BACKUP
In the past, if I wanted a snapshot of the pi, I would just use Win32 Disk Imager to clone the whole sd card to a hard drive. Nice, simple gui, good feedback, but it takes forever. Then I zipped it. Linux "dd" scares me, feedback poor and you can really screw things up if you misunderstand it.
Now, since I hate losing all that data, I need some way to backup at least the influx on a more regular basis.
My research has found, that if influxdb is operating, there is a backup and restore function that might be useful. The overview documentation is here: https://docs.influxdata.com/influxdb/v1.8/administration/backup_and_restore/
This thread has a nice discussion and even includes some docker exec examples of the influxd commands: https://stackoverflow.com/questions/56596533/influxdb-move-only-one-database-of-many-from-one-server-instance-to-another/56652014
It would be good to do this from a remote machine onto its hard drive.
???- Does anyone know of a script to do an influx backup remotely?
???- Any other ideas to prevent such a loss of collected data?
I actually had a similar problem with an SD card being corrupted in the power-monitor due to unclean shutdowns. I used ddrescue successfully, mainly just to recover the calibration values since it took some time to set up initially. This was also why I moved my influxdb deployment to a different system. But if your card is so shot that it won't even be detected, I don't think you will have much luck there.
This is a known issue with rPi's in general though - even without any power interruptions or unclean shutdowns, SD cards are not rated for nearly the amount of write cycles as a typical SSD, even moreso with cheap SD cards. I have been working to eliminate as many idle writes as possible on my PiHole and other Pi's for this reason.
Regarding the UPS, I also struggled to find a good option. Hit similar issues with insufficient power delivery on a lot of devices out there (not all of them are rated for the increased rPi v4 power usage). I don't have a good solution in place currently, but I did find a 10000mAh battery with multiple voltages that allows for simultaneous charging while providing power. Soldered in a 15W buck converter and it works pretty well, although it does not do auto-shutdown. The capacity of the battery though is sufficient to run the Pi for quite a long time from my calculations, so I just manually shutdown if I have an extended outage. I want to have a better solution in place though, thanks for pointing those projects out. On the surface the Geekworm one looks promising.
With your shutdown procedure, it looks like it should work just fine. As long as the code runs as root I don't see any obvious issue, but I would test it of course :) Docker and all applications should shutdown cleanly without issue.
And with influx backups, the functionality seems good; specifically the fact you can natively send it to a remote host. I don't know of a backup script offhand, but it looks like influx does most of the heavy lifting. You'd only need some effort to tie into your docker and ideally also a way to get notified of a backup failure. Also testing restoring from backup from time to time is always smart!
@David00 Perfect mate, thanks for the detailed reply
@bobstanl I'm really sorry to hear you lost your monitoring data! Those rolling grid outages must be a pain to deal with.
The Pi-specific UPS solutions out there all look really neat, but unfortunately I have no experience with any of them. I am interested in trying one out though, because my power monitor Pi is not power-protected. Of the two you linked, I'd probably go with the Geekworm board. However, if you have the space around your Pi, I'd also recommend looking at the APC BE425M as a general battery backup solution.
As for the storage reliability - the version 5 Linux kernels support USB boot on Raspberry Pi's. However, all the ones I've tested have a bug with the underlying SPI driver that cuts the sample rate in half, which drastically impacts my project. So, we've been stuck on v4 kernels which don't support USB boot. It's been awhile since I have tested any of the latest v5 kernels, so perhaps the issue with SPI has been fixed, which would allow you to run your Pi from a USB flash drive or a NVMe disk with a USB to NVMe adapter.
You can also send the data to a remote InfluxDB server without the need for running InfluxDB on the Pi itself. Just change the host to the IP address of your remote InfluxDB server in this line of config.py.
I am working on some major changes to the software to make it easier to setup and use, so I will look into incorporating InfluxDB backups with the new changes. (If you've ever ordered hardware from me via my shop, you'll receive an announcement email about these changes once I'm ready to release it).
Change directory for influxdb?
@David001. Since there are issues in booting from USB, can we do an end-run and redirect the influxdb storage location to an external USB drive? Then, we continue to boot linux from the sd card.
??? Other than initial loading, would influx still make a lot of writes to the sd card if the data location were an external USB directory? In other words, would the USB drive now take all the "wear and tear" instead of the sd card?
??? How to redirect data storage on influx? Is it just changing "/opt/influxdb" in the following:
docker run -d --restart always --name influx -p 8086:8086 -v /opt/influxdb:/var/lib/influxdb influxdb:1.8.3
??? Would backing it up simply be copying that directory? (i.e. replacement for "/opt/influxdb") Or are there good reasons to use the "influxd backup" commands?
Thanks for all your help!
Yes, you can change the directory of the local InfluxDB data on the Pi by changing the directory in the -v /opt/influxdb part. Just make sure to keep the :/var/lib/influxdb part since this specifies the directory inside the container itself.
This could add a layer of complexity to the InfluxDB container starting up... if your USB flash drive happens to not be mounted (like on boot, for example), the container may fail to start.
This should alleviate some wear and tear, but I don't think it will be as beneficial as protecting the Pi from unclean shut downs.
As for backups - I'd suggest sticking to the method in the documentation. I'm sure there are valid reasons for the method they recommend.
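To guard against the unmounted-drive case mentioned above, the container could be started only after confirming that the USB drive is really mounted. The Python sketch below is a rough illustration: the mount point /mnt/usbdrive and both function names are hypothetical, and the docker run arguments are copied from the command quoted in the question with only the host directory changed.

```python
import os


def influx_run_cmd(data_dir):
    """Build the docker run command from the thread with a different host data directory."""
    return [
        "docker", "run", "-d", "--restart", "always",
        "--name", "influx", "-p", "8086:8086",
        "-v", f"{data_dir}:/var/lib/influxdb",
        "influxdb:1.8.3",
    ]


def ready_to_start(mount_point):
    """True only if mount_point is an actual mounted filesystem."""
    return os.path.ismount(mount_point)


if __name__ == "__main__":
    mount_point = "/mnt/usbdrive"  # hypothetical mount point for the USB drive
    if ready_to_start(mount_point):
        print(" ".join(influx_run_cmd(f"{mount_point}/influxdb")))
    else:
        print(f"{mount_point} is not mounted; not starting InfluxDB")
```

Without such a check, Docker would bind the plain directory on the SD card at that path, and new data would be written to the card instead of the USB drive.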
Ironically enough, I came back from a weekend holiday getaway to find an unresponsive power monitor Pi. After some troubleshooting and failed fsck's, I determined that my microSD card has failed. It's stuck in a perma-read-only mode, so at least I should be able to recover the data. This card (a Samsung Evo 32GB) lasted a bit over a year.
When I get my Pi back up and running, I'll move the storage location for Influx to an external USB drive and provide some guidance on that process.
If I can just put my little contribution into the discussion, I believe the best type of cards to buy are the ones labeled High Endurance, which are designed for high I/O uses like dashcams... or the Raspberry Pi :)
I've checked the Samsung Evo 32GB and it doesn't mention that.
Regards.
Thanks for the suggestion @richie256. I've ordered one of the SanDisk endurance cards so I'll give that a go. I should have automated InfluxDB backups added to the project long before the new card expires!
Hi David
You are probably already ahead of me, but here are my experiments at Influx backup
~Basic command
backup -portable -database power_monitor /tmp/powermonsnapshot
~Execute in docker
pi@powermon:~ $ docker exec -it influx influxd backup -portable -database power_monitor /tmp/powermonsnapshot
2021/11/24 06:22:50 backing up metastore to /tmp/powermonsnapshot/meta.00
2021/11/24 06:22:50 backing up db=power_monitor
2021/11/24 06:22:50 backing up db=power_monitor rp=autogen shard=3 to /tmp/powermonsnapshot/power_monitor.autogen.00003.00 since 0001-01-01T00:00:00Z
2021/11/24 06:22:57 backing up db=power_monitor rp=autogen shard=8 to /tmp/powermonsnapshot/power_monitor.autogen.00008.00 since 0001-01-01T00:00:00Z
2021/11/24 06:23:02 backup complete:
2021/11/24 06:23:02 /tmp/powermonsnapshot/20211124T062250Z.meta
2021/11/24 06:23:02 /tmp/powermonsnapshot/20211124T062250Z.s3.tar.gz
2021/11/24 06:23:02 /tmp/powermonsnapshot/20211124T062250Z.s8.tar.gz
2021/11/24 06:23:02 /tmp/powermonsnapshot/20211124T062250Z.manifest
pi@powermon:~ $
~Oops it is invisible to Linux OS
pi@powermon:~ $ sudo du -sh /tmp/powermonsnapshot
du: cannot access '/tmp/powermonsnapshot': No such file or directory
~So copy from docker to OS
pi@powermon:~ $ docker cp influx:/tmp/powermonsnapshot /tmp/powermonsnapshot
~ I had just started my new sd card
pi@powermon:~ $ sudo du -sh /tmp/powermonsnapshot
71M /tmp/powermonsnapshot
pi@powermon:~ $
~Here is what the directory contains
pi@powermon:~ $ cd /tmp/powermonsnapshot
pi@powermon:/tmp/powermonsnapshot $ ll
total 72412
drwx------ 2 pi pi 4096 Nov 24 06:23 ./
drwxrwxrwt 9 root root 4096 Nov 24 06:33 ../
-rw------- 1 pi pi 495 Nov 24 06:23 20211124T062250Z.manifest
-rw-r--r-- 1 pi pi 429 Nov 24 06:22 20211124T062250Z.meta
-rw------- 1 pi pi 47167916 Nov 24 06:22 20211124T062250Z.s3.tar.gz
-rw------- 1 pi pi 26962181 Nov 24 06:23 20211124T062250Z.s8.tar.gz
pi@powermon:/tmp/powermonsnapshot $
~Now need to move to external computer or drive from Pi
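The two manual steps above (back up inside the container, then copy out with docker cp) could be wrapped in a small script for scheduled runs. The Python sketch below only builds the command lines; the function name, the /home/pi/backups destination, and the timestamped directory names are assumptions for illustration. The -it flags are left out on purpose, since docker exec -it needs a terminal and would fail when run from cron.

```python
from datetime import datetime


def backup_commands(container="influx", database="power_monitor",
                    dest_root="/home/pi/backups", now=None):
    """Return the two commands used above: back up in the container, then copy out.

    dest_root must already exist on the host; docker cp creates the
    timestamped directory underneath it.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%dT%H%M%S")
    inside = f"/tmp/powermonsnapshot-{stamp}"
    backup = ["docker", "exec", container, "influxd", "backup",
              "-portable", "-database", database, inside]
    copy_out = ["docker", "cp", f"{container}:{inside}", f"{dest_root}/{stamp}"]
    return [backup, copy_out]


if __name__ == "__main__":
    # Print the commands; run each with subprocess.run(cmd, check=True) to execute.
    for cmd in backup_commands():
        print(" ".join(cmd))
```

Copying the resulting directory to another machine (for example with scp) and deleting the snapshot left inside the container would still be separate steps.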
Funnily enough, I also seem to be having card issues with a different rPi running HASS. During debugging I did find an interesting product though, while definitely an additional cost over the Pi itself, it eliminates this issue by leveraging an M.2 SSD instead of an SD card.
From what I can tell you would need the case + expansion board (linked below) plus a small SSD. These types of rPi devices are new to me, so I'm going to give it a shot - figured I'd mention in case someone else was interested.
https://www.amazon.com/Argon-Raspberry-Support-B-Key-Compatible/dp/B08MJ3CSW7 https://www.amazon.com/dp/B08MHYWJCP/
@taintedkernel - You can do the same with a USB flash drive, but I don't know how much more reliable that might be over the microSD card. The SSD is definitely the better option.
However, if you're looking to run the entire operating system from the drive, you'd have to use USB boot, which I recall reading somewhere is only supported in v5 linux kernels. Unfortunately, there's an issue in the v5 kernels that essentially cuts the sample rate for my project in half, which reduces the accuracy of the power calculations. My custom OS image currently uses a late v4 kernel, which doesn't support USB boot, but gives us the better sample rates.
I have tested v5 kernels from 5.10.1 up through 5.10.31, and the latest kernel is 5.10.82. So, the problem might have been fixed in the later versions, and if so, that would allow us to use USB boot.
If anyone would like to try testing the speed with the latest v5 kernel, just issue the command sudo rpi-update c827259e4adb63d1dd36e21d51dcd4243d0c1255 followed by sudo reboot 0. Then, generate a debug plot with the power monitor software and the sample rate will be displayed near the bottom of the plot. The correct sample rate should be around 30 kSPS, and about half that if the kernel has the sample-rate problem.
@David00 Ah yes, you did mention that issue above and it seems this device does rely upon USB booting. I can try updating, but what would be the revision I would use to revert back to in case the issue still persists?
The known working version is 4.19.118, which you can revert to with this command:
sudo rpi-update e1050e94821a70b2e4c72b318d6c6c968552e9a2
Unfortunately no luck on that kernel, I was getting ~14 KSPS. The revert process worked fine and now back to around 31. I'm pretty unfamiliar with the rPi internals, but just taking a wild guess - maybe something with the default kernel config on the newer versions is causing the issue? I would (hopefully) think a notable regression in performance on a new kernel series would have been caught in testing?
Thanks for trying! There's an open issue about it on the Raspberry Pi linux repository, and there hasn't been any action on it since I suggested using v4.19.118 as a workaround back in April: https://github.com/raspberrypi/linux/issues/3381
Anytime! Thanks for pointing out the issue, following it myself now as well.
My pi runs on PoE, and my network switch etc. is on a 1000VA UPS. I am running NUT on the pi as well for monitoring the UPS, and it will gracefully shut down the pi when battery life is under 20% with no AC. My SD card is only for boot as well, and I am running an M.2 drive as the root partition in a USB3.1 to NVMe adapter.
Hi David! I am in trouble again and it looks suspiciously like another SD failure. My grafana stopped displaying data, and I found out influx had exited mysteriously. In investigating that, during reboots, docker will no longer start! (But, unlike last sd failure, it still boots into OS...) Can forward my notes on investigation, if you are interested, but I think I will attempt to implement your latest since I am two years behind on my install. So, NO DOCKER anymore! Wow, that should make it lots easier to back up my data! (BTW, I have no idea of how to recover the data from this new failed SD card if I can't start docker, maybe that's another thread...)
Main question now is: Did someone fix the need for an old RPi OS version? i.e. 4.19.118, no 5.0 or later?
- I am confused that you don't stress need for 4.19.118 in latest Software 0. Installation page.
If fixed, can I USB3 boot off SSD instead of damn SD cards? Life would be much better!
Hey @bobstanl! That's unfortunate!! What microSD card were you using this time? It might be helpful to start tracking cards that seem to be no good for this project. On another note, I'm running the SanDisk High Endurance cards on two different systems for over a year now and so far, so good.
That's good that you can boot the card still - and also you won't need to start Docker to get the data off the card. The data should be in /opt/influxdb, so if you create an archive of this entire directory, it will be easy to get the data off and into your next instance.
Try:
sudo tar -cvf /home/pi/influx_backup.tar /opt/influxdb/
This will put everything from /opt/influxdb into a single tar file at /home/pi/ called influx_backup.tar. Then, you can copy this file out with SCP or by mounting a flash drive.
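For anyone who would rather script this step, the same kind of archive can be produced with Python's standard tarfile module. This is a minimal sketch under stated assumptions: the function name is hypothetical, the demo runs on a throwaway directory rather than /opt/influxdb (which would need root to read), and the archive stores paths relative to the directory name instead of the opt/influxdb/ prefix that the tar command above produces.

```python
import os
import tarfile
import tempfile


def archive_directory(source_dir, archive_path):
    """Write source_dir into an uncompressed tar file, similar to 'tar -cf'."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source_dir, arcname=os.path.basename(os.path.normpath(source_dir)))
    return archive_path


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of /opt/influxdb.
    with tempfile.TemporaryDirectory() as work:
        source = os.path.join(work, "influxdb")
        os.makedirs(source)
        with open(os.path.join(source, "example.txt"), "w") as handle:
            handle.write("sample data\n")
        archive = archive_directory(source, os.path.join(work, "influx_backup.tar"))
        with tarfile.open(archive) as tar:
            print(tar.getnames())
```

Listing the archive contents after writing it, as the demo does, is a cheap check that the backup is readable before the failing card is retired.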
Regarding the updates to the project:
- Yes, no more docker! I am almost done working on v0.3.0 and should be releasing it sometime next week if all continues to go well. There will be an included backup script that you can use to take periodic backups and save them to a USB flash drive.
- Yes, the old RPI OS version requirement is removed. We can now use 5.x kernels, and the current v0.2.0 OS image I've built for this project uses the latest Raspberry Pi OS Lite image from Sept. 2022. The new v0.3.0 release will use the same image.
- You should be able to use USB boot with the current v0.2.0 image and the upcoming v0.3.0 image.
- Also, v0.3.0 is going to bring a lot of improvements to the setup of the project. I've done a bit of refactoring that removed the need for the extensive calibration procedure, among a bunch of other smaller things.
- I'm also building a new site for documentation to clean up the docs. I'll leave the Wiki as is for the time being, as a reference for anyone still running the older v0.1.0 or current (as of this comment) v0.2.0 stuff, but future stuff should be referencing the new site.
If you're right on the threshold of starting from scratch, would you be interested in testing out the upcoming v0.3.0 release and going through the new documentation? I can provide both of them if so.