Leaking memory
Hi,
Great project bro, very well done, but in my setup it leaks memory. Each time I run it, it creates a new Chrome process. A simple sudo pkill chrome gave some memory back.
@sbOogway can you please give me some more information?
- OS
- How do you start the program?
I am on Debian and running it through Docker. Every time I run it, it spawns several (from 6 to 18) Chrome processes in the root directory. Shouldn't they be closed when the scraping job is finished?
This is the form data I am passing (I am calling the API from Python). Even when I run it through the browser it spawns processes, just like with the Python call.
payload = "name=test&keywords=bar%20in%20calabria&lang=en&zoom=0&latitude=0&longitude=0&depth=3&maxtime=10m"
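For reference, here is the same request as a self-contained sketch in Go (the project's language). The form fields are taken from the payload above; the "/scrape" path is only a placeholder, not necessarily the scraper's actual API route:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	// Form fields copied from the payload above.
	form := url.Values{
		"name":      {"test"},
		"keywords":  {"bar in calabria"},
		"lang":      {"en"},
		"zoom":      {"0"},
		"latitude":  {"0"},
		"longitude": {"0"},
		"depth":     {"3"},
		"maxtime":   {"10m"},
	}
	// NOTE: "/scrape" is a placeholder path; check the web handler for the real route.
	resp, err := http.PostForm("http://localhost:8080/scrape", form)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```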
Now there are more of them.
How often do you call it? It requires around 3 minutes to clean up.
I assume that you have a web server or similar and you make a request. Do you wait for the program to exit, or do you send multiple requests?
When you start the program the way it is intended to be started, is the behavior the same?
@gosom
I restarted the Docker container, ran it through the web interface, and I get the same problem. I waited for each job to finish.
Can you confirm the following:
1. You start the app using the provided Docker container in the latest version.
2. The web interface is accessible at localhost:8080.
3. You add a keyword in the form and click the start button.
4. You wait until the job finishes.
5. You click download CSV and it is not empty.
6. After these steps you can still see open Chromium browsers?
Yes, but I don't click the download button; I only start the job. Why do the Chromium browsers run in the root directory?
This is inside the Docker container, not on your host. Docker runs as root.
Let me check if the same happens for me (Fedora) and let you know.
Ok, thanks bro, appreciate it.
docker pull gosom/google-maps-scraper
mkdir -p gmapsdata && docker run -v $PWD/gmapsdata:/gmapsdata -p 8080:8080 gosom/google-maps-scraper -web -data-folder /gmapsdata
I ran the above.
The browsers are cleaned up.
Ok, thanks for your time. I still get the problem, but I can work around it with pkill. Thanks a lot.
@sbOogway you are right regarding the above. I have reproduced it.
It has something to do with the way the browsers are cleaned up.
@gosom if you want, I can try to fix it. Which files should I look at, roughly?
@sbOogway can you try the latest release (v1.5.1)? I believe this is fixed, but I was not able to reproduce it consistently in the first place.
FYI: the issue was most likely that, in the scrapemate repo, the Playwright browsers and the Playwright instance were not closed properly. Here is the commit with the fix:
https://github.com/gosom/scrapemate/commit/153bd5a946e1e24ae0e15f2123617f64b8856600
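For context, the fix amounts to making sure both the browser and the Playwright driver are shut down once a job finishes. A minimal sketch using playwright-go; the fetcher type and its methods below are illustrative, not the actual scrapemate API:

```go
package main

import (
	"log"

	"github.com/playwright-community/playwright-go"
)

// fetcher is an illustrative wrapper; the real scrapemate types differ,
// but the cleanup requirement is the same.
type fetcher struct {
	pw      *playwright.Playwright
	browser playwright.Browser
}

func newFetcher() (*fetcher, error) {
	pw, err := playwright.Run()
	if err != nil {
		return nil, err
	}
	browser, err := pw.Chromium.Launch()
	if err != nil {
		_ = pw.Stop()
		return nil, err
	}
	return &fetcher{pw: pw, browser: browser}, nil
}

// Close shuts down the Chromium process and then the Playwright driver.
// Skipping either step leaves headless Chromium processes behind after
// every job, which is the leak reported in this issue.
func (f *fetcher) Close() error {
	if err := f.browser.Close(); err != nil {
		log.Printf("closing browser: %v", err)
	}
	return f.pw.Stop()
}

func main() {
	f, err := newFetcher()
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	// ... run the scraping job here ...
}
```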
@gosom I still get it bro.
Have you tried compiling and running it on your machine? Also, please double-check that you pulled the latest image, just in case.
This is weird; trying to reproduce it again.
yeah bro
I have to go for the day. I'll reply to you tomorrow. Have a nice one.
This still seems to be an issue? Large datasets keep consuming memory until the server dies.