[Question] A library to search for facebook groups with given keywords

Open sla-te opened this issue 3 years ago • 9 comments

Two years ago this was possible with the Facebook Graph API, which is sadly deprecated now. Is there a library you know of that is capable of doing this?

sla-te avatar Aug 09 '21 15:08 sla-te

Why not use Facebook's web interface to do this search? For example:

https://www.facebook.com/search/groups/?q=games or https://m.facebook.com/search/groups/?q=games&source=filter&isTrending=0&tsid=0.44263932103561143
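
For more than a handful of keywords, the search URL could be built programmatically. This is a minimal sketch that assumes only the `q` parameter from the first URL above is needed; the helper name `build_group_search_url` is made up for illustration, and the keyword is percent-encoded so that spaces and `&` do not break the query string.

```python
from urllib.parse import urlencode

# Base search URL taken from the comment above.
SEARCH_URL = "https://www.facebook.com/search/groups/"


def build_group_search_url(keyword: str) -> str:
    """Return the group-search URL for one keyword (hypothetical helper)."""
    # urlencode percent-encodes the value, e.g. "&" becomes "%26"
    return f"{SEARCH_URL}?{urlencode({'q': keyword})}"


for kw in ["games", "latte art", "coffee & cake"]:
    print(build_group_search_url(kw))
```

Opening these URLs still requires a logged-in browser or session.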

neon-ninja avatar Aug 09 '21 23:08 neon-ninja

Yeah, sounds good. Your lib would be a good base to start from, as we need to be logged in to search for groups, afaik. For starters, it would be sufficient to provide keywords and get the group URLs back. If you create the request that returns the HTML which includes the URLs, I could write the part that scrapes them, with beautifulsoup for instance.
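
The scraping part could look roughly like the sketch below. It uses the standard library's `html.parser` instead of beautifulsoup, only so the example has no third-party dependency. The sample HTML is invented for illustration and the assumption that results are plain `<a href="/groups/...">` links is unverified; real Facebook markup may look quite different.

```python
import re
from html.parser import HTMLParser

# Matches the group id or slug that follows "/groups/" in a link target.
GROUP_HREF = re.compile(r"/groups/([A-Za-z0-9.\-_]+)")


class GroupLinkParser(HTMLParser):
    """Collects unique group URLs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.group_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = GROUP_HREF.search(href)
        if match:
            url = f"https://www.facebook.com/groups/{match.group(1)}"
            if url not in self.group_urls:  # keep first occurrence only
                self.group_urls.append(url)


def extract_group_urls(html: str) -> list:
    parser = GroupLinkParser()
    parser.feed(html)
    return parser.group_urls


# Hypothetical markup, NOT copied from a real Facebook response.
SAMPLE_HTML = """
<div><a href="/groups/1996185023800606/?ref=search">Coffee lovers</a></div>
<div><a href="https://m.facebook.com/groups/2204925119/">COFFEE COFFEE COFFEE!!!</a></div>
<div><a href="/groups/1996185023800606/?ref=search">Coffee lovers again</a></div>
<div><a href="/help/">Help</a></div>
"""

print(extract_group_urls(SAMPLE_HTML))
```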

sla-te avatar Aug 10 '21 12:08 sla-te

I meant, why not do it manually in your browser? Do you have hundreds of search terms or something?

neon-ninja avatar Aug 10 '21 21:08 neon-ninja

Yeah, I have 400 keywords that I need to find the Facebook groups for. :)
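
With that many keywords, a small driver loop helps: pause between searches, drop groups that show up under several keywords, and write the result to CSV. The sketch below is only an outline under assumptions. `search` stands for whatever function ends up returning groups for a keyword, `fake_search` is a stand-in with invented data, and the `id`/`name` keys are assumed.

```python
import csv
import io
import time


def collect_groups(keywords, search, delay_seconds=0.0):
    """Run search(keyword) for every keyword and de-duplicate groups by id."""
    seen = {}
    for i, kw in enumerate(keywords):
        if i and delay_seconds:
            time.sleep(delay_seconds)  # be gentle between requests
        for group in search(kw):
            gid = str(group["id"])
            if gid not in seen:  # remember the first keyword that found it
                seen[gid] = {"keyword": kw, "id": gid, "name": group.get("name", "")}
    return list(seen.values())


def to_csv(rows):
    """Serialize the collected rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["keyword", "id", "name"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def fake_search(keyword):
    """Stand-in for a real search function; the data is invented."""
    data = {
        "coffee": [{"id": "1", "name": "Coffee lovers"}, {"id": "2", "name": "LATTE ART"}],
        "latte": [{"id": "2", "name": "LATTE ART"}],
    }
    return data.get(keyword, [])


rows = collect_groups(["coffee", "latte", "tea"], fake_search)
print(to_csv(rows))
```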

sla-te avatar Aug 10 '21 21:08 sla-te

Related issue: https://github.com/kevinzg/facebook-scraper/issues/419

neon-ninja avatar Aug 10 '21 23:08 neon-ninja

I believe this does what is requested. It adds a method get_groups_by_search which searches for groups, finds their id, and yields the result of get_group_info with that group_id.

from urllib.parse import quote_plus

from facebook_scraper import FacebookScraper, utils, get_group_info
from facebook_scraper.constants import FB_MOBILE_BASE_URL


class FacebookScraper(FacebookScraper):
    def get_groups_by_search(self, word: str, **kwargs):

        """Searches Facebook groups and yields the group info for each
        result on the first page"""

        # Percent-encode the keyword so spaces and "&" survive in the query string
        query = quote_plus(word)
        group_search_url = utils.urljoin(FB_MOBILE_BASE_URL, f"search/groups/?q={query}")
        r = self.get(group_search_url)
        for group_element in r.html.find('div[role="button"]'):
            button_id = group_element.attrs.get("id")
            if not button_id:
                # Skip buttons without an id; they cannot be matched to a result
                continue
            group_id = find_group_id(button_id, r.text)
            yield get_group_info(group_id)


def find_group_id(button_id, raw_html):

    """Each group button has an id, which appears later in the script
    tag followed by the group id."""

    s = raw_html[raw_html.rfind(button_id) :]
    group_id = s[s.find("result_id:") :].split(",")[0].split(":")[1]
    return int(group_id)


# EMAIL and PWD are placeholders for your own Facebook credentials
scraper = FacebookScraper()
scraper.login(email=EMAIL, password=PWD)

for group_info in scraper.get_groups_by_search("coffee"):
    print(group_info)

Result:

{'id': '1996185023800606', 'name': 'Coffee lovers', 'type': 'Public group', 'members': 14299}
{'id': '2204925119', 'name': 'COFFEE COFFEE COFFEE!!!', 'type': 'Public group', 'members': 340455}
{'id': '755007758392142', 'name': 'LATTE ART', 'type': 'Public group', 'members': 46079}
{'id': '534483107108037', 'name': 'BARISTA COMMUNITY', 'type': 'Public group', 'members': 169960}
{'id': '721633338172381', 'name': 'Funny Coffee Memes', 'type': 'Public group', 'members': 219281}
{'id': '587751572609633', 'name': 'Coffee ☕ & Rain 🌧', 'type': 'Public group', 'members': 116986}
{'id': '823558245059998', 'name': '林芊妤 Coffee 粉絲群組', 'type': 'Public group', 'members': 7932}
{'id': '1574636316089193', 'name': 'I Love Coffee', 'type': 'Public group', 'members': 208646}
{'id': '120661273275592', 'name': 'Coffee & Cake Lovers 💏 ☕🍰', 'type': 'Public group', 'members': 40836}
{'id': '359032028835121', 'name': 'Coffee ☕❤', 'type': 'Public group', 'members': 21074}
{'id': '364701647546998', 'name': 'COFFEE BEANS MARKET', 'type': 'Public group', 'members': 56691}
{'id': '746157059433578', 'name': 'Coffee Everyday', 'type': 'Public group', 'members': 6113}
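
One thing to watch in find_group_id: if the button id or the "result_id:" marker is missing from the page, the string slicing fails with an IndexError or ValueError. A more defensive variant might look like the sketch below. The name `find_group_id_safe` and the sample HTML are made up for illustration; the sample only mimics the "button id followed by result_id" layout described in the docstring and is not a real Facebook response.

```python
import re
from typing import Optional


def find_group_id_safe(button_id: str, raw_html: str) -> Optional[int]:
    """Return the numeric id after the last occurrence of button_id, or None."""
    start = raw_html.rfind(button_id)
    if start == -1:
        return None  # button id does not appear in the page
    match = re.search(r"result_id:(\d+)", raw_html[start:])
    return int(match.group(1)) if match else None


# Hypothetical page fragment, invented for this example.
SAMPLE = (
    '<div role="button" id="u_0_a">Coffee lovers</div>'
    '<script>handle({button:"u_0_a",result_id:1996185023800606,type:"group"})</script>'
)

print(find_group_id_safe("u_0_a", SAMPLE))
print(find_group_id_safe("u_0_missing", SAMPLE))
```

Callers would then skip results where the function returns None instead of letting one bad button abort the whole search.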

bipsen avatar Jul 01 '22 11:07 bipsen

Great - could you please submit a pull request?

neon-ninja avatar Jul 01 '22 18:07 neon-ninja

Hey, please help me, I can't run this code.

TranHuuHieu15 avatar Apr 03 '24 08:04 TranHuuHieu15

This is the error:

D:\MyJob\Python\PyCharm\ai-report.venv\lib\site-packages\facebook_scraper\facebook_scraper.py:855: UserWarning: Facebook language detected as vi_VN - for best results, set to en_US
  warnings.warn(
Traceback (most recent call last):
  File "D:\MyJob\Python\PyCharm\ai-report\demo.py", line 32, in <module>
    scraper.login(email=EMAIL, password=PWD)
  File "D:\MyJob\Python\PyCharm\ai-report.venv\lib\site-packages\facebook_scraper\facebook_scraper.py", line 998, in login
    f.write(response.text)
  File "D:\Important\IT\python2\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1ead' in position 50: character maps to <undefined>
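
The last frames of the traceback point at a file write that goes through the cp1252 codec, which has no mapping for the Vietnamese character U+1EAD. The sketch below reproduces that failure in isolation and shows that the same text writes fine when the file is opened with an explicit UTF-8 encoding. The sample string and the file name are made up for the demonstration. Since the failing write is inside the library, a workaround that may help without editing it is to run Python in UTF-8 mode (set the environment variable PYTHONUTF8=1 or start with `python -X utf8`, Python 3.7 or later).

```python
import os
import tempfile

# Sample text containing U+1EAD, the character named in the traceback.
text = "Nh\u1eadp"

# cp1252 cannot represent U+1EAD, so encoding fails just like in the traceback.
try:
    text.encode("cp1252")
    cp1252_failed = False
except UnicodeEncodeError as exc:
    cp1252_failed = True
    print("cp1252 failed:", exc.reason)

# Writing with an explicit UTF-8 encoding works regardless of the system locale.
path = os.path.join(tempfile.mkdtemp(), "login_error.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print("utf-8 round trip ok:", round_trip == text)
```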

TranHuuHieu15 avatar Apr 03 '24 08:04 TranHuuHieu15