
Thread content URL containing Permalink instead of URL to external site

Open cschwem2er opened this issue 1 year ago • 0 comments

Describe the bug

get_thread_content() returns a list containing two data frames, one of which holds metadata about the thread. This data frame includes a variable named url. However, that variable does not contain the URL of the external website the thread links to (if there is one). It contains the permalink of the thread instead.

As an example, please check this thread: https://www.reddit.com/r/worldnews/comments/ebamnt/venezuelas_civilian_militia_surpasses_target.json

The correct url would be "https://venezuelanalysis.com/news/14742", but instead the permalink "https://www.reddit.com/r/worldnews/comments/ebamnt/venezuelas_civilian_militia_surpasses_target" is listed.

I suggest adding the external url as an additional variable in the "thread" dataframe, either via a name such as "external_url" or by adjusting the naming conventions in line with the API results ("url", "permalink").
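To illustrate the distinction, here is a minimal Python sketch of how both values could be read from a thread's JSON. It is not RedditExtractor code. The helper name thread_links and the sample payload are hypothetical: the payload is hand-written to mimic the shape I would expect from the .json endpoint above (a list of two listings, with the post under data.children[0].data), and it assumes the API returns "permalink" as a path relative to https://www.reddit.com and marks text-only posts with "is_self".

```python
from urllib.parse import urljoin

REDDIT_BASE = "https://www.reddit.com"


def thread_links(thread_json):
    """Return the external URL and the permalink of a thread.

    Hypothetical helper: assumes `thread_json` is the parsed thread
    JSON, i.e. a list whose first element is the listing that holds
    the post itself.
    """
    post = thread_json[0]["data"]["children"][0]["data"]

    # Assumption: "permalink" is a relative path, so it is joined
    # with the site base to get a full URL.
    permalink = urljoin(REDDIT_BASE, post["permalink"])

    # Assumption: for self (text-only) posts "url" just points back
    # to the thread, so no external URL is reported in that case.
    external_url = None if post.get("is_self") else post.get("url")

    return {"external_url": external_url, "permalink": permalink}


# Hand-written sample mimicking the example thread (not fetched data).
sample = [
    {"data": {"children": [{"data": {
        "url": "https://venezuelanalysis.com/news/14742",
        "permalink": "/r/worldnews/comments/ebamnt/"
                     "venezuelas_civilian_militia_surpasses_target/",
        "is_self": False,
    }}]}},
    {"data": {"children": []}},  # second listing: the comments
]

links = thread_links(sample)
print(links["external_url"])
print(links["permalink"])
```

Keeping the two values under separate names, as in the returned dictionary, matches the "external_url" / "permalink" naming suggested above.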

cschwem2er, Jun 12 '23 10:06