
Dumpgenerator halts without finishing

Open · TimSC opened this issue on Feb 2, 2018 · 4 comments

It seems to halt at a random point. It should time out and retry automatically.

	Downloaded 10 images
	Downloaded 20 images
	Downloaded 30 images
	Downloaded 40 images
	Downloaded 50 images
	Downloaded 60 images
	Downloaded 70 images
	Downloaded 80 images
	Downloaded 90 images
	Downloaded 100 images
	Downloaded 110 images

^CTraceback (most recent call last):
  File "dumpgenerator.py", line 2084, in <module>
	main()
  File "dumpgenerator.py", line 2076, in main
	createNewDump(config=config, other=other)
  File "dumpgenerator.py", line 1663, in createNewDump
	session=other['session'])
  File "dumpgenerator.py", line 1109, in generateImageDump
	r = requests.get(url=url)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 67, in get
	return request('get', url, params=params, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 53, in request
	return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 468, in request
	resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 608, in send
	r.content
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 737, in content
	self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 660, in generate
	for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/lib/python2.7/dist-packages/urllib3/response.py", line 344, in stream
	data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/lib/python2.7/dist-packages/urllib3/response.py", line 301, in read
	data = self._fp.read(amt)
  File "/usr/lib/python2.7/httplib.py", line 612, in read
	s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 384, in read
	data = self._sock.recv(left)
KeyboardInterrupt

TimSC avatar Feb 02 '18 12:02 TimSC
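
The traceback shows the process blocked in `socket.recv` because `requests.get(url=url)` was called with no timeout, so a stalled server hangs the download forever. A minimal sketch of a timeout-plus-retry wrapper (a hypothetical helper, not part of dumpgenerator; the `fetch` callable is passed in so the sketch stays testable without network access):

```python
import time

def get_with_retries(fetch, url, attempts=5, timeout=30,
                     initial_delay=1.0, backoff=2.0):
    """Call fetch(url, timeout=...) with retries and exponential backoff.

    `fetch` would be e.g. requests.get in real use. The timeout= argument
    makes a stalled connection raise an exception instead of blocking
    indefinitely in socket recv, as seen in the traceback above.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, timeout=timeout)
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)
            delay *= backoff
```

Usage would look like `r = get_with_retries(requests.get, url)` in place of the bare `requests.get(url=url)`.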

@TimSC Can't you resume? Also, you can delete the already-downloaded image entries from the -images.txt file.

emijrp avatar Feb 11 '18 13:02 emijrp

Resuming image downloads doesn't work for me. Possibly related to #250?

If I repeatedly resume with a modified -images.txt, that works as a workaround.

TimSC avatar Feb 12 '18 14:02 TimSC
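
The manual workaround above can be automated: drop entries for images already on disk so a restarted run skips them. A hedged sketch, which assumes each line of the -images.txt list begins with the image filename followed by a tab (an assumption about the file format, not verified here):

```python
import os

def prune_downloaded(images_list_path, images_dir):
    """Remove list entries whose image file already exists on disk.

    Assumption: each line is tab-separated and starts with the filename.
    This is an illustrative helper, not part of dumpgenerator itself.
    """
    with open(images_list_path, encoding="utf-8") as f:
        lines = f.readlines()
    kept = [
        line for line in lines
        if not os.path.exists(
            os.path.join(images_dir, line.split("\t", 1)[0]))
    ]
    with open(images_list_path, "w", encoding="utf-8") as f:
        f.writelines(kept)
```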

We cannot do much about transient errors apart from replacing that requests.get(url=url) with our now-usual session.get(url=url), which uses some rather insistent retrying. If a particular wiki fails constantly across multiple attempts, we can look into how to solve that; otherwise, just retry.

nemobis avatar Feb 10 '20 22:02 nemobis
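
For context, a session with "rather insistent retrying" might be built with the standard requests/urllib3 machinery along these lines (a sketch; the actual counts and backoff used by dumpgenerator may differ):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_retrying_session(total=5, backoff_factor=1.0):
    """Return a requests.Session that retries failed requests.

    Retries connection errors and common transient HTTP status codes,
    with exponential backoff between attempts.
    """
    session = requests.Session()
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=(500, 502, 503, 504),
    )
    adapter = HTTPAdapter(max_retries=retry)
    # Mount the retrying adapter for both plain and TLS URLs.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

With such a session, `session.get(url, timeout=30)` both retries transient failures and avoids the indefinite hang shown in the traceback.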

Note that, when using the API, part of the retrying is now done (or not) by mwclient.

nemobis avatar Mar 07 '20 21:03 nemobis