Mining-the-Social-Web
memory error on example 3-5
Hi,
I would really appreciate your help on the following issues:
In Example 3-6: after adding the absolute path to couchpy ("C:\Python27\Scripts\couchpy.exe") to the CouchDB configuration and restarting the service, I executed the following code:
```
>>> import sys
>>> import couchdb
>>> from couchdb.design import ViewDefinition
>>> try:
...     import jsonlib2 as json
... except ImportError:
...     import json
...
>>> DB = 'enronami'
>>> START_DATE = '1900-01-01'  # YYYY-MM-DD
>>> END_DATE = '2100-01-01'    # YYYY-MM-DD
>>> def dateTimeToDocMapper(doc):
...     from dateutil.parser import parse
...     from datetime import datetime as dt
...     if doc.get('Date'):
...         _date = list(dt.timetuple(parse(doc['Date']))[:-3])
...         yield (_date, doc)
...
>>> view = ViewDefinition('index', 'by_date_time', dateTimeToDocMapper,
...                       language='python')
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\design.py", line 93, in __init__
    map_fun = _strip_decorators(getsource(map_fun).rstrip())
  File "C:\Python27\lib\inspect.py", line 699, in getsource
    lines, lnum = getsourcelines(object)
  File "C:\Python27\lib\inspect.py", line 688, in getsourcelines
    lines, lnum = findsource(object)
  File "C:\Python27\lib\inspect.py", line 529, in findsource
    raise IOError('source code not available')
IOError: source code not available
```
Also in example 3-5 I got the following error:
```
>>> db.update(docs, all_or_nothing=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\client.py", line 733, in update
    _, _, data = self.resource.post_json('_bulk_docs', body=content)
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\http.py", line 399, in post_json
    status, headers, data = self.post(*a, **k)
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\http.py", line 381, in post
    **params)
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\http.py", line 419, in _request
    credentials=self.credentials)
  File "C:\Python27\lib\site-packages\couchdb-0.8-py2.7.egg\couchdb\http.py", line 176, in request
    body = json.encode(body).encode('utf-8')
MemoryError
```
I was able to work around this error, at least temporarily, by trimming enron.mbox.json down to 2,000 objects instead of the full file, which contains 41,000 JSON objects.
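If the full 41,000-document list is the trigger, the MemoryError is presumably raised while json-encoding one enormous `_bulk_docs` request body. A rough sketch of a workaround (the helper name and the 2,000-document batch size are arbitrary choices for illustration, not from the book) is to feed `db.update()` smaller slices so each request stays small; note that `all_or_nothing` then only applies within each batch, not across the whole load:

```python
# rough sketch: push the docs in smaller _bulk_docs batches instead of one
# giant db.update() call over all 41,000 documents
def update_in_batches(db, docs, batch_size=2000):
    for i in range(0, len(docs), batch_size):
        # each call issues one _bulk_docs POST, so only batch_size documents
        # are json-encoded in memory at a time
        db.update(docs[i:i + batch_size], all_or_nothing=True)

update_in_batches(db, docs)
```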
With Regards, Amitabh
I'm having this problem too. I'm just increasing my memory on the VM and hoping this will solve it. Unfortunately I'm now up to 3 GB of RAM and it still hasn't fixed it.
Edited to add: 4 GB doesn't work either. Maybe it needs more virtual memory. I'm watching the System Monitor and it has about 50% RAM free when it crashes.
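One guess about a MemoryError while System Monitor still shows roughly half the RAM free: a 32-bit Python process can exhaust its own address space while building the huge JSON body long before the machine runs out of memory. A quick check of which interpreter the VM is actually running:

```python
# prints 32 or 64: the pointer size (in bits) of the running interpreter;
# a 32-bit process hits MemoryError well before physical RAM is exhausted
import struct
print(struct.calcsize('P') * 8)
```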
Discopatrick - Are you also using Windows in your VM? Wondering if that is the common thread.
Actually the VM is Ubuntu 12.04. Running on VirtualBox 4.2.6.
Sorry this is taking a while for me to help you with, but I am hoping that we can pin down the issue soon. Can you give me the other pertinent details of your situation so I can better reproduce this? Version of CouchDB and Python are two that come to mind. Version of the couchdb package is another one.
Thanks for your help Russell. I've skipped ahead to other parts of the book, but if you'd still like the details, here they are:
On starting python in the terminal I see:
Python 2.7.3 (default, Aug 1 2012, 05:16:07) [GCC 4.6.3] on linux2
CouchDB: Apache CouchDB 1.0.1
Looking in the Ubuntu Software Center at python-couchdb I see: python-couchdb 0.8-0ubuntu2
Any more info you need, just let me know.
Almost all of my software was installed using the Ubuntu Software Center, I think there were one or two exceptions, one of which was Redis IIRC.
I'm getting the same error on Mountain Lion:

```
$ python mbox-dateload.py new-enron 1900-01-01 2012-01-01
Finding docs dated from 1900-1-1 to 2012-1-1
Traceback (most recent call last):
  File "mbox-dateload.py", line 34, in <module>
```

The error is localised to this line:

```python
view = ViewDefinition('index', 'by_date_time', dateTimeToDocMapper, language='python')
```
Python details: Python 2.7.2 (default, Jun 16 2012, 12:38:40) [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Any ideas?
Thanks
Carving out some time this evening to try and work on this. In previous attempts I haven't been able to reproduce it, and I think it may have been because I wasn't using the same version of CouchDB as the one producing the problem, under the faulty assumption that my older version would probably have exhibited the same issue. What version are you using? Also, what version of the couchdb package are you using? (What does `couchdb.__version__` return?)
It's 1.21
Edit - sorry 0.8.
My attempts to reproduce this problem were not fruitful. We could try to further isolate the problem in your environment, but it might be just as easy to reach out to #couchdb on IRC or ask the mailing list for help, since it appears this could be a CouchDB-specific issue involving a memory setting.
It's not as cool, but you can get around the memory issue with:

```python
for doc in docs:
    db.save(doc)
```
Oh great, thanks for helping.
D
@laksdjhfads and @Pragueham - Did this workaround do ok for you? If so, would either of you like to submit a pull request so I can credit you with the fix?