Leaking memory
Hi,
Great project bro, very well done, but in my setup it leaks memory. Each time I run it, it creates a new Chrome process. A simple sudo pkill chrome gave some memory back.
@sbOogway can you please give me some more information?
- OS
- How do you start the program?
I am on Debian and running it through Docker. Every time I run it, it spawns several (from 6 to 18) Chrome processes in the root directory. Shouldn't they be closed when the scraping job is finished?
This is the form data I am passing (I am calling the API from Python). Even when I run it through the browser it spawns processes, just like with the Python call.
payload = "name=test&keywords=bar%20in%20calabria&lang=en&zoom=0&latitude=0&longitude=0&depth=3&maxtime=10m"
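For reference, here is the same request as a self-contained sketch in Go (the project's language). The form fields are taken from the payload above; the "/scrape" path is only a placeholder, not necessarily the scraper's actual API route:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	// Form fields copied from the payload above.
	form := url.Values{
		"name":      {"test"},
		"keywords":  {"bar in calabria"},
		"lang":      {"en"},
		"zoom":      {"0"},
		"latitude":  {"0"},
		"longitude": {"0"},
		"depth":     {"3"},
		"maxtime":   {"10m"},
	}
	// NOTE: "/scrape" is a placeholder path; check the web handler for the real route.
	resp, err := http.PostForm("http://localhost:8080/scrape", form)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```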
Now there are more of them.
How often do you call it? It requires around 3 minutes to clean up.
I assume that you have a web server or similar and you make a request. Do you wait for the program to exit, or do you send multiple requests?
When you start the program the way it is intended to be started, is the behavior the same?
@gosom
I restarted the Docker container, ran it through the web interface, and I get the same problem. I waited for each job to finish.
Can you confirm the following:
1. You start the app using the provided Docker container in the latest version.
2. The web interface is accessible at localhost:8080.
3. You add a keyword in the form and click the start button.
4. You wait until the job finishes.
5. You click download CSV and it is not empty.
6. After these steps you can still see open Chromium browsers?
Yes, but I don't click the download button; I only start the job. Why do the Chromium browsers run in the root directory?
This is inside the Docker container, not on your host. Docker runs as root.
Let me check if the same happens for me (Fedora) and let you know.
Ok, thanks bro, appreciate it.
docker pull gosom/google-maps-scraper
mkdir -p gmapsdata && docker run -v $PWD/gmapsdata:/gmapsdata -p 8080:8080 gosom/google-maps-scraper -web -data-folder /gmapsdata
I ran the above.
The browsers are cleaned up.
Ok, thanks for your time. I still get the problem, but I can work around it with pkill. Thanks a lot.
@sbOogway you are right regarding the above. I have reproduced it.
It has something to do with the way the browsers are cleaned up.
@gosom if you want, I can try to fix it. Which files should I look at, roughly?
@sbOogway can you try the latest release (v1.5.1)? I believe this is fixed, but I was not able to reproduce it consistently in the first place.
FYI: the issue was most likely that, in the scrapemate repo, the Playwright browsers and the Playwright instance were not closed properly. Here is the commit with the fix:
https://github.com/gosom/scrapemate/commit/153bd5a946e1e24ae0e15f2123617f64b8856600
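For context, the fix amounts to making sure both the browser and the Playwright driver are shut down once a job finishes. A minimal sketch using playwright-go; the fetcher type and its methods below are illustrative, not the actual scrapemate API:

```go
package main

import (
	"log"

	"github.com/playwright-community/playwright-go"
)

// fetcher is an illustrative wrapper; the real scrapemate types differ,
// but the cleanup requirement is the same.
type fetcher struct {
	pw      *playwright.Playwright
	browser playwright.Browser
}

func newFetcher() (*fetcher, error) {
	pw, err := playwright.Run()
	if err != nil {
		return nil, err
	}
	browser, err := pw.Chromium.Launch()
	if err != nil {
		_ = pw.Stop()
		return nil, err
	}
	return &fetcher{pw: pw, browser: browser}, nil
}

// Close shuts down the Chromium process and then the Playwright driver.
// Skipping either step leaves headless Chromium processes behind after
// every job, which is the leak reported in this issue.
func (f *fetcher) Close() error {
	if err := f.browser.Close(); err != nil {
		log.Printf("closing browser: %v", err)
	}
	return f.pw.Stop()
}

func main() {
	f, err := newFetcher()
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	// ... run the scraping job here ...
}
```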
@gosom I still get it bro.
Have you tried compiling and running it on your machine? Also, please double-check that you pulled the latest image, just in case.
This is weird; trying to reproduce it again.
yeah bro
I have to go for the day. I'll reply to you tomorrow. Have a nice one.
This still seems to be an issue? Large datasets keep consuming memory until the server dies.