
reactors no longer being returned

Open curiousier-george opened this issue 3 years ago • 15 comments

Just today, reactors stopped being returned for me. The following program exhibits the problem.

from facebook_scraper import get_posts, set_user_agent
from pprint import pprint
import sys

cookie_file = 'facebook_cookies.txt'

set_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")

post_ids = sys.argv[1:]

for post_id in post_ids:
    post = next(get_posts(post_urls=[post_id], cookies=cookie_file,
                          options={'allow_extra_requests': False, 'reactors': True}))
    pprint(post)

When I invoke this as

python reactors.py 10158741881073601

the reactors field returned is None, although there are actually reactors to the post. It was working until today.

curiousier-george, Feb 26 '22 04:02

I merged https://github.com/kevinzg/facebook-scraper/pull/707 into the master branch, and that fixed reactor extraction for me for this post. Give the latest master branch a try and see how you go.

neon-ninja, Mar 29 '22 22:03

Hmm, no, that didn't work for me. What's really weird, though, is that I had made the changes in #707 in my own copy of facebook_scraper, and that did work for me - at least for links and names, although not types.

curiousier-george, Mar 30 '22 02:03

Maybe you still had an old version of the library. Try pip uninstall facebook-scraper twice before running pip install git+https://github.com/kevinzg/facebook-scraper.git

neon-ninja, Mar 30 '22 03:03

Sorry, I've encountered the same problem.

I ran pip uninstall facebook-scraper twice and then pip install git+https://github.com/kevinzg/facebook-scraper.git. Then I copied the code that George (who raised this issue) posted and made a few revisions:

from facebook_scraper import get_posts, set_user_agent
from pprint import pprint


cookies=MY_COOKIES

set_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")

post_ids = ["10158741881073601"]

for post_id in post_ids:
    post = next(
        get_posts(
            post_urls=[post_id],
            cookies=cookies,
            options={
                'allow_extra_requests': False,
                'reactors': True
            }
        )
    )
    pprint(post)

and got the following output (only part of it is shown):

'page_id': None,
 'post_id': '10158741881073601',
 'post_text': 'Penne Smith Sandbeck',
 'post_url': 'https://m.facebook.com/10158741881073601',
 'reaction_count': 14,
 'reactions': {'care': 3, 'like': 8, 'love': 1, 'sad': 2},
 'reactors': [],
 'shared_post_id': None,
 'shared_post_url': None,
 'shared_text': None,

The reactors field is an empty list, even though the post does have reactors.

ben31406, Mar 30 '22 15:03

I'm not certain, but I may have found the problem. An exception seems to be raised on this line: k = str(demjson.decode(sigil.attrs.get("data-store"))["reactionType"]). I added a breakpoint() before this line and inspected demjson.decode(sigil.attrs.get("data-store")). It returned {'reactionID': 478547315650144}, which does not contain the key reactionType.
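To illustrate the failure, here is a minimal sketch of a more tolerant lookup. It is not the library's code: reaction_key is a hypothetical helper, and the standard json module stands in for demjson on the assumption that the data-store attribute holds valid JSON.

```python
import json


def reaction_key(data_store):
    """Return a reaction label from a data-store string, or None.

    Hypothetical helper: prefers "reactionType" and falls back to
    "reactionID", so a missing key no longer raises KeyError.
    """
    try:
        # Treat a missing attribute (None) as an empty object.
        data = json.loads(data_store or "{}")
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key in ("reactionType", "reactionID"):
        if key in data:
            return str(data[key])
    return None


if __name__ == "__main__":
    # The value I saw at the breakpoint, written as JSON.
    print(reaction_key('{"reactionID": 478547315650144}'))
```

Whether falling back to reactionID is the right behaviour depends on how the rest of the scraper maps that value to a reaction name, which this sketch does not attempt.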

ben31406, Mar 30 '22 16:03

I see - try https://github.com/kevinzg/facebook-scraper/commit/5539ec467223286d952c5af5c19d89a8cffedb17

With this commit I get:

'reaction_count': 14,
 'reactions': {'care': 3, 'like': 8, 'love': 1, 'sad': 2},
 'reactors': [{'link': 'https://facebook.com/lynnswisher.spears?fref=pb',
               'name': 'Lynn Swisher Spears',
               'type': 'care'},
              {'link': 'https://facebook.com/profile.php?id=100001509791215&fref=pb',
               'name': 'D.j. Bost',
               'type': 'like'},
              {'link': 'https://facebook.com/audra.halemaddox?fref=pb',
               'name': 'Audra Hale-Maddox',
               'type': 'care'},
              {'link': 'https://facebook.com/lin.stogner?fref=pb',
               'name': 'Lin Stogner',
               'type': 'like'},
              {'link': 'https://facebook.com/pam.morris.73?fref=pb',
               'name': 'Pam Morris',
               'type': 'sad'},
              {'link': 'https://facebook.com/shane.petersen.507?fref=pb',
               'name': 'Shane Petersen',
               'type': 'like'},
              {'link': 'https://facebook.com/jeroen.vandenhurk?fref=pb',
               'name': 'Jeroen van den Hurk',
               'type': 'sad'},
              {'link': 'https://facebook.com/judy.e.woodall?fref=pb',
               'name': 'Judy Edwards Woodall',
               'type': 'like'},
              {'link': 'https://facebook.com/kari.tgeorge?fref=pb',
               'name': 'Kari Turcogeorge',
               'type': 'love'},
              {'link': 'https://facebook.com/susan.r.briley?fref=pb',
               'name': 'Susan Reesman Briley',
               'type': 'like'},
              {'link': 'https://facebook.com/hutson.nick?fref=pb',
               'name': 'Nick Hutson',
               'type': 'like'},
              {'link': 'https://facebook.com/jeffrey.harris.1441?fref=pb',
               'name': 'Jeffrey Harris',
               'type': 'care'},
              {'link': 'https://facebook.com/holden.richards?fref=pb',
               'name': 'Holden Richards',
               'type': 'like'},
              {'link': 'https://facebook.com/darrell.e.cook?fref=pb',
               'name': 'Darrell E. Cook',
               'type': 'like'}],

neon-ninja, Mar 30 '22 21:03

I apologize for my ignorance, but will pip install git+https://github.com/kevinzg/facebook-scraper.git install that commit?

curiousier-george, Mar 30 '22 21:03

It should do, yes

neon-ninja, Mar 30 '22 21:03

I've also just pushed a new version (0.2.55) to PyPI, so pip install -U facebook-scraper would now do it too
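If it is unclear which version pip actually left in the environment, a check along these lines may help. It is only a sketch: installed_version, parse_version and is_at_least are hypothetical helper names, and the version comparison is a simplified numeric one that ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="facebook-scraper"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def parse_version(text):
    """Turn '0.2.55' into (0, 2, 55), keeping only leading digits per part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def is_at_least(installed, required="0.2.55"):
    """True if the installed version is the required one or newer."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("facebook-scraper is not installed")
    else:
        print(found, "ok" if is_at_least(found) else "older than 0.2.55")
```

Note that an install from the git master branch may report a version number that does not reflect the latest commits, so this check is mainly useful for PyPI installs.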

neon-ninja, Mar 30 '22 21:03

Yes, works great! Thank you!

curiousier-george, Mar 30 '22 22:03

Thank you so much! It works great for that post, but not for some others, such as 5561190327250419. In some cases it returns only one or two reactors, or none at all.
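One way to spot affected posts is to compare the reactors list against the reactions totals in the same result. The sketch below assumes the post dict has the reaction_count, reactions and reactors fields shown in the outputs in this thread; reactor_summary is a hypothetical helper, not part of facebook_scraper.

```python
from collections import Counter


def reactor_summary(post):
    """Compare listed reactors with the per-type totals in 'reactions'."""
    reactors = post.get("reactors") or []
    expected = post.get("reactions") or {}
    # Count how many listed reactors there are for each reaction type.
    by_type = Counter(r.get("type") for r in reactors)
    # Record only the types where fewer reactors are listed than expected.
    missing = {
        kind: total - by_type.get(kind, 0)
        for kind, total in expected.items()
        if total > by_type.get(kind, 0)
    }
    return {
        "listed": len(reactors),
        "reaction_count": post.get("reaction_count"),
        "missing_by_type": missing,
    }


if __name__ == "__main__":
    # Shape of the result I reported earlier: totals present, no reactors.
    post = {
        "reaction_count": 14,
        "reactions": {"care": 3, "like": 8, "love": 1, "sad": 2},
        "reactors": [],
    }
    print(reactor_summary(post))
```

An empty missing_by_type means every reaction in the totals has a matching entry in reactors.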

ben31406, Mar 31 '22 02:03

I see - try https://github.com/kevinzg/facebook-scraper/commit/c41e14e1c8271ae82d2e981d64bf8cd21db08a85

neon-ninja, Mar 31 '22 02:03

Yes! It works great now. Thank you so much!

ben31406, Mar 31 '22 02:03

Sorry, I've run into another problem with reactors. For posts on some specific Facebook fan pages, I can't get the correct post_id and reactors, although it works for other pages. Below is my test code:

from facebook_scraper import get_posts
from pprint import pprint

cookies = {
    "wd": "XXX",
    "datr": "XXX",
    "sb": "XXX",
    "c_user": "XXX",
    "xs": "XXX",
    "fr": "XXX",
}

posts = get_posts(
    post_urls=["https://facebook.com/story.php?story_fbid=524560755699898&id=100044379341462"],
    options={
        "allow_extra_requests": False,
        "comments": "generator",
        "reactors": True,
        "reactions": True,
        "comment_reactors": False,
    },
    cookies=cookies,
)
post = next(posts)
pprint(post)

Here is part of the output. The post_id seems to be taken from the first comment instead of the post itself, and the reactions and reactors fields show the same problem.

'page_id': '1536864699976440',
 'post_id': '524560755699898_524611979028109',
 'post_text': '同學、學長、學妹傳來的照片\n'
              '阿金的書在吉隆坡 IPC, 新山 Mid Valley, 新加坡 Popular 目前都有展示\n'
              '\n'
              'IPC 還是「海景第一排」呢!\n'
              '和新馬的朋友分享~',
 'post_url': 'https://facebook.com/story.php?story_fbid=524560755699898&id=100044379341462',
 'reaction_count': 1,
 'reactions': {'like': 1},
 'reactors': [{'link': 'https://facebook.com/icudoctor?fref=pb',
               'name': 'Icu醫生陳志金',
               'type': 'like'}],

I found some discussions about cookies, and here is the output after I added "noscript": "1" to my cookies:

'page_id': '1536864699976440',
 'post_id': '524560755699898',
 'post_text': '同學、學長、學妹傳來的照片\n'
              '阿金的書在吉隆坡 IPC, 新山 Mid Valley, 新加坡 Popular 目前都有展示\n'
              '\n'
              'IPC 還是「海景第一排」呢!\n'
              '和新馬的朋友分享~',
 'post_url': 'https://facebook.com/story.php?story_fbid=524560755699898&id=100044379341462',
 'reaction_count': None,
 'reactions': None,
 'reactors': None,

The post_id is correct now, but the reactors and reactions fields have become None.
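To detect the post_id mismatch automatically, the returned post_id can be compared with the story_fbid in the requested URL. This is a rough sketch: expected_story_id and post_id_matches are hypothetical helpers, and it only applies to URLs that carry a story_fbid query parameter like the one in my test code.

```python
from urllib.parse import parse_qs, urlparse


def expected_story_id(post_url):
    """Return the story_fbid query parameter of a post URL, or None."""
    query = parse_qs(urlparse(post_url).query)
    values = query.get("story_fbid")
    return values[0] if values else None


def post_id_matches(post_url, post_id):
    """True/False if the IDs can be compared, None if either is missing."""
    expected = expected_story_id(post_url)
    if expected is None or post_id is None:
        return None
    # post_id may come back as an int or a str, so compare as strings.
    return str(post_id) == expected


if __name__ == "__main__":
    url = ("https://facebook.com/story.php"
           "?story_fbid=524560755699898&id=100044379341462")
    print(post_id_matches(url, "524560755699898_524611979028109"))
    print(post_id_matches(url, "524560755699898"))
```

With the two post_id values from my outputs above, the first call reports a mismatch and the second a match.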

ben31406, Apr 05 '22 15:04

Here's the output I get with your test code:

'page_id': '1536864699976440',
 'post_id': 524560755699898,
 'post_text': '同學、學長、學妹傳來的照片\n'
              '阿金的書在吉隆坡 IPC, 新山 Mid Valley, 新加坡 Popular 目前都有展示\n'
              '\n'
              'IPC 還是「海景第一排」呢!\n'
              '和新馬的朋友分享~',
 'post_url': 'https://facebook.com/story.php?story_fbid=524560755699898&id=100044379341462',
 'reaction_count': 568,
 'reactions': {'care': 1, 'like': 563, 'love': 2, 'wow': 2},
 'reactors': [{'link': 'https://facebook.com/profile.php?id=100080093550026&fref=pb',
               'name': 'Leo Hsu',
               'type': 'like'},
              {'link': 'https://facebook.com/profile.php?id=100077404213696&fref=pb',
               'name': '林幸君',
               'type': 'like'},

Try updating to the latest master branch, and try setting your Facebook language to English. Also try:

from facebook_scraper import _scraper
with open("524560755699898.html", "w", encoding="utf-8") as f:
    f.write(_scraper.get("524560755699898").html.html)

and upload the resulting HTML file

neon-ninja, Apr 12 '22 05:04