matomo-log-analytics icon indicating copy to clipboard operation
matomo-log-analytics copied to clipboard

Logfile import seems to fall into an infinite loop

Open pennyarcade opened this issue 6 years ago • 5 comments

Command line:

python /opt/piwig_log/matomo-log-analytics/import_logs.py --url=https://monitoring.xxx.de --login= xxx--password=xxx --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --idsite=1 --enable-bots /var/log/apache2/*_access.log

On Ubuntu server, latest updates, cloned master branch of this project. There are no errors. I tried running the script over night but at some point no further log lines are processed and only the "records/sec" value goes down slowly. At first log lines seem to be processed normally. Plese tell me what further information you need to debug the problem. The goal is running the script automatically with logrotate wich is not possible with this problem.

143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)

This is the output after running it for more than 12 hours straight.

pennyarcade avatar Jan 16 '18 07:01 pennyarcade

Bump... Any ideas on this? Any more information I can add to get some help on this?

pennyarcade avatar Jan 19 '18 11:01 pennyarcade

I captured on production this situation: at a given time Matomo was unreachable. import_logs.py somehow noticed the situation, however it didn't either recovered nor terminated. It kept running (accepting events on stdin) without being able to resume a valid session with Matomo.

2018-06-01 21:21:39,814: [INFO] Error when connecting to Matomo: HTTP Error 504: Gateway Time-out
2018-06-01 21:21:39,816: [INFO] Retrying request, attempt number 2
2018-06-01 21:21:40,267: [INFO] Error when connecting to Matomo: HTTP Error 504: Gateway Time-out
2018-06-01 21:21:40,269: [INFO] Retrying request, attempt number 2
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1858, in _run_bulk
    self._record_hits(hits)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 2016, in _record_hits
    on_failure=self._on_tracking_failure
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1595, in call
    data=data)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1549, in _call_wrapper
    response = func(*args, **kwargs)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1485, in _call
    response = opener.open(request, timeout = timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/ssl.py", line 759, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.7/ssl.py", line 653, in read
    v = self._sslobj.read(len or 1024)
SSLError: ('The read operation timed out',)

74713537 lines parsed, 70014081 lines recorded, 317 records/sec (avg), 0 records/sec (current)
74713537 lines parsed, 70014081 lines recorded, 316 records/sec (avg), 0 records/sec (current)
74713537 lines parsed, 70014081 lines recorded, 316 records/sec (avg), 0 records/sec (current)

(note line-numbers might not corresponds to current master because I have patched version of the script).

kaplun avatar Jun 04 '18 07:06 kaplun

I am experiencing the same issue with Matomo 3.9.1 I wrote a Powershell script to loop through my daily log files but it is stalling when it calls import_logs.py My daily log files are up to 60MB The script import_logs.py seems to stall or fall into an infinite loop as mentionned by pennyarcade. Only the records/sec seems to go down while other progress indicators do not change matomo-stalls

tkodjo avatar Apr 12 '19 11:04 tkodjo

Same behavior using matomo 4.9.0

viba1 avatar Apr 24 '22 09:04 viba1

Same issue on matomo docker: Matomo: 4.9.1 MySQL: 8.0.29 PHP: 8.0.19

5619 lines parsed, 4800 lines recorded, 2 records/sec (avg), 0 records/sec (current) 2022-05-13 08:53:57,140: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 08:53:57,142: [INFO] Retrying request, attempt number 2 ... 5619 lines parsed, 4800 lines recorded, 1 records/sec (avg), 0 records/sec (current) 2022-05-13 09:04:17,404: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 09:04:17,405: [INFO] Max number of attempts reached, server is unreachable! Fatal error: The read operation timed out You can restart the import of "/var/log/matomo-logs/access.log" from the point it failed by specifying --skip=4811 on the command line.

Calling with --skip=4811leads to the same behavior: log/matomo-logs/access.log 0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current) Parsing log /var/log/matomo-logs/access.log...

4611 lines parsed, 3800 lines recorded, 3 records/sec (avg), 0 records/sec (current) 2022-05-13 09:22:26,448: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 09:22:26,449: [INFO] Max number of attempts reached, server is unreachable! Fatal error: The read operation timed out You can restart the import of "/var/log/matomo-logs/access.log" from the point it failed by specifying --skip=3809 on the command line.

viba1 avatar May 13 '22 17:05 viba1