combine
Memory error when running with enrichment?
While testing combine (pristine) in a Vagrant box:
$ ./combine.py -e
[...]
2015-05-16 22:48:05,051 - combine.winnower - ERROR - Could not determine address type for ckaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa listed as None
2015-05-16 22:48:05,065 - combine.winnower - ERROR - Could not determine address type for vendor.almsyar.com:8080 listed as FQDN
2015-05-16 22:48:05,080 - combine.winnower - INFO - Dumping results
Traceback (most recent call last):
  File "./combine.py", line 44, in <module>
    winnow('crop.json', 'crop.json', 'enrich.json')
  File "/home/vagrant/combine/winnower.py", line 203, in winnow
    e_data = json.dumps(enriched, indent=2, ensure_ascii=False).encode('utf8')
  File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 210, in encode
    return ''.join(chunks)
MemoryError
$ free -m
             total       used       free     shared    buffers     cached
Mem:          2001        217       1784          1          1         77
-/+ buffers/cache:        139       1862
Swap:            0          0          0
Hmm, that is a first. Do you mind sharing some details on the vagrant box you stood up (or even the config) so we could try replicating this?
No problem, it's a pretty basic one:
# -*- mode: ruby -*-
# vi: set ft=ruby :
VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.provider "virtualbox" do |v|
    v.memory = 2048
  end
  config.vm.network "public_network", bridge: 'eth0'
  config.vm.network "forwarded_port", guest: 80, host: 9880
  config.vm.synced_folder "/path/1", "/p"
end
After setting up a virtualenv for the requirements, running combine without enrichment is fine. With enrichment, it works once I switch the box to 4 GB of memory.
Maybe add a warning?
I have noticed this as well, and it's an artifact of how we store everything in memory before writing to disk at the end of the job.
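One possible way to lower the peak, as a rough sketch rather than what winnower.py currently does: the traceback shows the failure inside json.dumps, which joins every chunk into a single string before it is encoded and written. Passing an open file handle to json.dump lets the encoder write the output chunk by chunk instead. The sketch assumes Python 3 (for the `encoding` argument to `open`); `dump_streaming`, the sample record, and the temporary output path are hypothetical names for illustration only.

```python
import json
import os
import tempfile


def dump_streaming(records, path):
    """Serialize records to path chunk by chunk.

    json.dump writes each encoded chunk straight to the file handle,
    so the complete JSON document is never held as one big string.
    """
    with open(path, "w", encoding="utf8") as fh:
        json.dump(records, fh, indent=2, ensure_ascii=False)


# Hypothetical sample data standing in for the enriched results.
sample = [{"indicator": "example.com", "type": "FQDN"}]

out_path = os.path.join(tempfile.mkdtemp(), "enrich.json")
dump_streaming(sample, out_path)

# Read the file back to confirm the output is valid JSON.
with open(out_path, encoding="utf8") as fh:
    round_trip = json.load(fh)
```

This only removes the extra serialized copy. The enriched data itself would still be held in memory until the end of the job, so a larger change would be needed to avoid that.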