PellMon icon indicating copy to clipboard operation
PellMon copied to clipboard

give up and return fail

Open japlab opened this issue 6 years ago • 37 comments

I just get my new usb2serial adapter installed but have the same fail as on the other two i try out. I don’t think it could be the adapter any more that cource the fail. Anyway, pellmon is working perfect, only this anoying fail in the log. Anything to try out.

I'm running, NBE 4.99F

pellmon11

japlab avatar Mar 07 '18 15:03 japlab

Probably not the adapter then. Maybe the burner controller hardware or the particular firmware version 4.99F or possible the cable picks up electrical interference from something. When I was running a V6 controller I had a timeout maybe once of twice in a week at most, so the controller doesn't always respond. I have never seen or heard about that much timeouts so something is different in your setup, but I have no real clue about what it might be, unfortunately.

motoz avatar Mar 07 '18 16:03 motoz

Hi Anders. Thanks for the suggestions. I now tried to use shielded cable (5meter) and connect shield to real ground in one end. No improvement. There is still at least 6 timeouts every 24h. I think the controller with firmware 4.99F is way overloaded. I noticed the 7 segments digits on front flicker when connected to serial interface. I'll try to dig into the python code and look at the NBE protocol manegement. I soon report back. BR Jan

japlab avatar Mar 08 '18 18:03 japlab

Yes it might be just overloaded, I never had stokercontrol so I don't know how often they poll 'officially'.

The code is in the 'scotteprocol' module. Maybe you could try to add a small delay before the retry about here: https://github.com/motoz/PellMon/blob/3e2511ada9e2cd5715c5c9ad67f45a41ef8d82f3/src/Scotteprotocol/protocol.py#L299 just to see if that helps.

(And also note that that code was more or less my firt python code ever while trying to learn the language so I'm not very proud of it..)

The reads are triggered by polling the internal database from several different places, this one is quite fast (2s delay between polls): https://github.com/motoz/PellMon/blob/3e2511ada9e2cd5715c5c9ad67f45a41ef8d82f3/src/Pellmonsrv/pellmonsrv.py#L106

motoz avatar Mar 09 '18 07:03 motoz

Hi, I have had the same issue for some time now. I tried replacing the serial to usb device, but the issue continues. Previously it only occurred once in a while, where I solved it by restarting my raspberry. That's not the case anymore :-( I have chip version 6.87. Being overloaded? Well maybe.

Can something be reconfigured in pellmon to lower the poll intervalls, or any other ideas?

Thomsus avatar Mar 11 '18 08:03 Thomsus

Thomsus, do you mean that your controller stops responding completely?

motoz avatar Mar 11 '18 18:03 motoz

Hi Motoz. Pellmon works, as it shows the graphs, updates the vector image, sends mails etc. But I get a lot of timeouts. image

Thomsus avatar Mar 11 '18 20:03 Thomsus

Can you give a little more details on this: "where I solved it by restarting my raspberry". Did restarting the raspberry eliminate the timeouts for a while? For how long? Did you restart only the raspberry and not the controller? And now the timeouts occur more frequently and restarting the raspberry does not make a difference any more?

Anything else? Like upgraded raspbian, upgraded pellmon, installed something else on the raspberry?

motoz avatar Mar 12 '18 06:03 motoz

I tried to make the story short, but understand why you ask for more :-) I had an old raspberry pi with Pellmon running for a few years, that suffered this occasional timeout issue. It was so seldom that I simply rebooted the raspberry for this to get fixed. Simply restarting the two Pellmon services didn't do it. One day the SD card suffered a crash. Yeah, should have made a backup, I know :-) So I brought a new raspberry pi 3, a new micro sd card, and made a clean install, not using any of the old stuff (DB etc). I don't remember if it had the timeout issue from the beginning, or if it started shortly after. But it's been there ever since, for a couple of months now.

I just upgraded and rebooted it. At first it detects some parameter changes that shouldn't have changed: Parameter 'feeder_capacity' changed from '70' to '870' (hasn't been changed for a looong time) Parameter 'boiler_temp_set' changed from '799' to '70' (799?? It should have exploded then :-) ) 'mode' changed from '651' to 'Running' 'alarm' state went from '18' to 'ok' (I have no alarm called 18)

Having been running for 11 minutes, the timeout issues started again.

Thomsus avatar Mar 12 '18 08:03 Thomsus

I still don't understand what difference the reboot made, if you only had an occasional timeout every now and then, why reboot?

motoz avatar Mar 12 '18 08:03 motoz

My two cent info: Last year I was installing pellmon on a RPI v3 and changed to a older RPI v2 and boot without GUI. Just by swapping the SD card. Easy. This setup has worked for several month except for the timeout issue. After trying tree different USB2Serial adapter, I decided to try changing back to RPI v3. I was forced to do a total reinstall of Rasperian and pellmon but with backup conf files. I noticed a longer run without timeout complains, but It began after a while again.... Please see the log screen photo. Im In doubt about the clock is 12 or 24h ! I can't find any other logs to look into. I thought of something with the SD card RW. timeout

japlab avatar Mar 12 '18 08:03 japlab

@Motoz: The reboot made Pellmon work fine for months (no timeouts).

Thomsus avatar Mar 12 '18 09:03 Thomsus

Was running for several day without any problem. The log was not giving any warnings, only the 7 lines showing above dated "2018-03-18 04-10-10.53.634 ...... " Probably because I use the Raspberry v3, the fastest one. !

Today the server was stopping and I can't restart it. I get this messages: Are the SD card weared out !

The server encountered an unexpected condition which prevented it from fulfilling the request.

Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond response.body = self.handler() File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in call self.body = self.oldhandler(*args, **kwargs) File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in call return self.callable(*self.args, **self.kwargs) File "/usr/lib/python2.7/dist-packages/Pellmonweb/pellmonweb.py", line 661, in index parameterlist = dbus.getFullDB(['','','','','']) File "/usr/lib/python2.7/dist-packages/Pellmonweb/pellmonweb.py", line 189, in getFullDB raise DbusNotConnected("server not running") DbusNotConnected: server not running

japlab avatar Mar 16 '18 12:03 japlab

I have always used a B+ model (not even a raspberry 2 that is), so a raspberry 3 shouldn't make a difference for pellmon itself, but maybe there is some other software running that causes trouble. Do you run a full raspbian or raspbian lite? I have always used raspbian lite since I have no use for the extra software that comes with the full version (which is gigantic these days...). And I'm still on rasbian jessie btw, are you on the latest (stretch I beleive it's called)?

Regarding the server (pellmonsrv) fail, if you start it in the forground according to https://github.com/motoz/PellMon/wiki/Contributing-to-PellMon you should at least see some crash info if it crashes.

motoz avatar Mar 16 '18 12:03 motoz

I dug out my old V6.33 controller box yesterday and set it up (without a burner connected) with an old raspberry B with 512M ram running raspbian (lite) jessie. So far not a single timeout. Which raspbian version are you running (jessie or strech), full or lite?

Have you installed anything else on top of a default raspbian install apart from pellmon?

motoz avatar Mar 19 '18 07:03 motoz

Anders, would you like me to start my own thread with my issue? I know I'm the one who interfered with jablab's issue, as it seemed similar to mine.

Thomsus avatar Mar 19 '18 10:03 Thomsus

I struggle with installing Stretch Light, so I ones again, installed Stretch Full on a new SD card. Pellmon v0.7.1-alpha1. The time stamp show a god run for 18 hour, then it begin to do timeout. Btw, it dont't crash, so I live fine with it. I'm still very happy with the performance. To Thomsus, you are welcome to interfere here in this thread. More input mean better chance to solve the trouble. NOte: I have do a clean install of Rasperian full and done a "apt-get upgrade". No VNC, no SSH, just enabling Serial and OneWire.

2018-03-19 07:33:06,514 - INFO - Timeout again, give up and return fail 2018-03-19 07:11:02,053 - INFO - Timeout again, give up and return fail 2018-03-19 07:00:00,031 - INFO - Timeout again, give up and return fail 2018-03-19 06:37:55,728 - INFO - Timeout again, give up and return fail 2018-03-19 06:18:35,355 - INFO - Timeout again, give up and return fail 2018-03-18 12:44:40,308 - INFO - copied /var/lib/pellmon/nvrrd.db to /mnt/ramdisk/rrd.db 2018-03-18 12:44:40,211 - INFO - Activated plugins: ScotteCom, SiloLevel, Consumption, Cleaning, Onewire 2018-03-18 12:44:40,191 - INFO - chip version detected as: 4.99F 2018-03-18 12:44:40,116 - INFO - serial port ok 2018-03-18 12:44:39,894 - INFO - Looking for plugins in /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins 2018-03-18 12:44:39,893 - INFO - starting PellMon

japlab avatar Mar 19 '18 11:03 japlab

Japlab, what exactly do you mean by 'enabling Serial'?

And Thomsus, are you also on full raspbian stretch or something else?

motoz avatar Mar 19 '18 12:03 motoz

In "raspi-config" I chose OneWire and Serial. I now see that the Serial should't be needed since it run over USB, or !

japlab avatar Mar 19 '18 13:03 japlab

Ok, the serial console used to be enabled by default, and having it enabled should not cause any problems anyway.

I suppose I have to download and install the full raspbian stretch at some point and check out if that makes any difference. I think the full raspbian nowadays starts the graphical desktop by default, that of course uses a lot of ram and probably also cpu resources, even if it's not used. I would recommend doing the pellmon installation and setup over ssh, then you never need the graphical desktop and the 'lite' version works just the same as the full.

motoz avatar Mar 19 '18 14:03 motoz

Now it has been running for 16 hour on the full raspbian stretch without a single timeout. Too early to say anything certain but it doesn't look like it's related to the raspbian version.

Maybe you could both list exactly which usb-serial adapters you have tried to see if there is any semblance. Output of lsusb, any identification printed on the case, and such.

Mine is listed as ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port

Another possibility is too weak power supply. It's recommended to use a 2.5A power for a raspberry3.

motoz avatar Mar 20 '18 11:03 motoz

Hi Motoz. My Raspberry pi is running "Raspbian GNU/Linux 9 (stretch)" USB says: "Bus 001 Device 004: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port"

Power supply you say? Well, it was using an official 2,5 A raspberry PSU. Swapped it with another, but no difference. When I restarted the raspberry, before the log was filled with error messages, I noticed it said: INFO - invalid setting for plugin_dirs

So I enabled debugging, and checked the log. It seems something is missing: 2018-03-21 10:43:54,853 - DEBUG - error in [pollvalues]: d11 missing, written as undefi$ 2018-03-21 10:43:54,854 - DEBUG - error in [pollvalues]: d12 missing, written as undefi$ 2018-03-21 10:43:54,854 - DEBUG - error in [pollvalues]: d13 missing, written as undefi$ 2018-03-21 10:43:54,854 - DEBUG - error in [pollvalues]: d14 missing, written as undefi$ 2018-03-21 10:43:54,855 - DEBUG - error in [pollvalues]: d15 missing, written as undefi$ 2018-03-21 10:43:54,855 - DEBUG - Polling is on 2018-03-21 10:43:54,857 - INFO - invalid setting for plugin_dirs 2018-03-21 10:43:54,868 - INFO - starting PellMon 2018-03-21 10:43:54,869 - INFO - Looking for plugins in /usr/lib/python2.7/dist-package$ 2018-03-21 10:43:54,875 - DEBUG - PluginManager walks into directory: /usr/lib/python2.$ 2018-03-21 10:43:54,877 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/cleaning.pellmon-plugin 2018-03-21 10:43:54,879 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/scottecom.pellmon-plugin 2018-03-21 10:43:54,881 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/onewire.pellmon-plugin 2018-03-21 10:43:54,883 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/openweathermap.pellmon-plug$ 2018-03-21 10:43:54,885 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/raspberrygpio.pellmon-plugin 2018-03-21 10:43:54,888 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/owfs.pellmon-plugin 2018-03-21 10:43:54,890 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/calculate.pellmon-plugin 2018-03-21 10:43:54,892 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/consumption.pellmon-plugin 2018-03-21 10:43:54,895 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/customalarms.pellmon-plugin 2018-03-21 10:43:54,898 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/nbecom.pellmon-plugin 2018-03-21 10:43:54,900 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/pelletcalc.pellmon-plugin 2018-03-21 10:43:54,903 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/exec.pellmon-plugin 2018-03-21 10:43:54,905 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/silolevel.pellmon-plugin 2018-03-21 10:43:54,907 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/heatingcircuit.pellmon-plug$ 2018-03-21 10:43:54,910 - DEBUG - PluginManager found a candidate: /usr/lib/python2.7/dist-packages/Pellmonsrv/plugins/testplugin.pellmon-plugin

Thomsus avatar Mar 21 '18 10:03 Thomsus

OK, so your adapter is either identical to mine that's working fine, or you have one of those chinese fakes... Your log is completely normal.

I still don't have a single timeout so I can't really do anything about this.

motoz avatar Mar 21 '18 10:03 motoz

Normal log? I wouldn't have guessed so. Anyway, I found this in the log: 2018-03-20 21:49:34,191 - INFO - Timeout again, give up and return fail 2018-03-20 21:49:44,207 - INFO - error in retry [Errno 0] GetItem failed 2018-03-20 22:07:34,769 - INFO - Timeout again, give up and return fail

So what are my options? I can e-mail the log, if that's worth anything.

Thomsus avatar Mar 21 '18 12:03 Thomsus

Yeah, the complaining about plugin_dirs in the log is my mistake, not having plugin_dirs in the conf at all shouldn't generate a warning as it's the normal case. The errors in debug mode for missing stuff in [pollvalues] is likewise not really an error, and since a couple of years the normal case as I added a couple of unused columns to the standard rrd definition to make it easier to add data later. DEBUG mode is anyway only useful when you are debugging, ie. following up what happenes by reading the code.

The log wouldn't give anything, I have tried to replicate your systems as far as I can without success, so there is nothing I can do. Have any of you tried to run pellmon in debug mode as described in the 'Contributing to pellmon' page? If you are able to make changes to the code and test them then I could make suggestions what to try.

motoz avatar Mar 21 '18 12:03 motoz

Hi Thomsus, Have you tried with another com-cable and checked for a good connection on the terminals on the print of the controller? I used to have timeouts from time to time but it is gone after I improved the connection. The wires in my COM-cable it extremely thin and to begin with I had only put the bare wires into the terminals on the controller. I then tinned the exposed parts of the wires which made it a bit thicker and stronger. I think that this helped to get a better connection.

Best regards Richard

rcapion avatar Mar 21 '18 22:03 rcapion

I have checked the cable connections. All wires are tightly fitted in the terminals, on the print. My issue really exploded after re-installing the raspbian lite with pellmon. So I don't suspect the cable fitting. What a shame the old card suffered a data loss, otherwise it would probably still be running.

with no actions for me to try out, I will leave it running as is. It's a bit annoying seeing the many error messages, but I can live with it.

Thomsus avatar Mar 22 '18 20:03 Thomsus

I clone the project, what to do next. I'm lost, and dumb.

pi@raspberrypi:~/PellMon $ ./autogen.sh ./autogen.sh: line 1: autoreconf: command not found

This is content of autogen.sh file:

autoreconf --install || exit 1

update 10 min. later ;-)

pi@raspberrypi:~/PellMon $ ./autogen.sh ./autogen.sh: line 1: autoreconf: command not found pi@raspberrypi:~/PellMon $ cd src pi@raspberrypi:~/PellMon/src $ ./pellmonsrv start error reading config file 2018-03-22 21:35:42,625 - INFO - invalid database definition, data polling not possible 2018-03-22 21:35:42,626 - INFO - Invalid poll_interval setting, using 10s 2018-03-22 21:35:42,627 - DEBUG - Polling is off noconf No section: 'plugin_settings'

japlab avatar Mar 22 '18 21:03 japlab

I see the the 'contributing to' guide didn't mention installing the dependencies. As you already installed PellMon using the .deb package, your system package manager already installed them for you. However one is missing, namely the one listed in the Readme under 'Additional dependencies for building'. Install that with sudo apt-get install autoconf and you should get further. Don't forget to shut down the installed pellmon with sudo service pellmonsrv stop and sudo service pellmonweb stop before running from the source code.

I updated the 'contributing' guide. It's good to get feedback about where you get stuck so I can improve the guide.

motoz avatar Mar 23 '18 06:03 motoz

And when you get that far, you should probably use pellmonsrv debug instead of pellmonsrv start to start it. That way it won't become a background process so you can see anything it 'prints' or if it crashes you can see the error information.

Since you are not debugging pellmonweb, you can start that in the background with ./pellmonweb -D before starting pellmonsrv, if you need it running.

motoz avatar Mar 23 '18 06:03 motoz

Thanks. I now try and get this:

pi@raspberrypi:~/PellMon/src $ ./pellmonsrv debug
No option 'plugin_dirs' in section: 'plugin_settings'
/home/pi/PellMon/src/Pellmonsrv/plugins/onewire/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/nbecom/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/pelletcalc/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/scottecom/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/customalarms/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/exec/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/cleaning/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/silolevel/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/consumption/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/raspberrygpio/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/owfs/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/calculate/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/heatingcircuit/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/testplugin/__init__
/home/pi/PellMon/src/Pellmonsrv/plugins/openweathermap/__init__
^Z
[5]+  Stopped                 ./pellmonsrv debug

japlab avatar Mar 23 '18 12:03 japlab