feedzy-rss-feeds

Full Content Error inside the Preview

Open AndreeaCristinaRadacina opened this issue 1 year ago • 4 comments

Description

A user reported that Feedzy doesn't parse the full content and that an error also appears inside the preview.

I tested and it appears that his source has several HTML elements. After using define( 'FEEDZY_ALLOW_UNSAFE_HTML', true ); the content was imported, but the preview still displays an error.

ref: https://secure.helpscout.net/conversation/2511140666/404715?viewId=212385

Step-by-step reproduction instructions

  1. Go to Feedzy > Imports
  2. Create a new import and use this source - https://rss.beehiiv.com/feeds/hThEYw1fsM.xml
  3. Use the [#item_full_content] tag
  4. Click on Preview import
  5. Save the changes, then run the import
  6. Add this line define( 'FEEDZY_ALLOW_UNSAFE_HTML', true ); in the wp-config.php file
  7. Save the changes, reset the import and check the preview again

Screenshots, screen recording, code snippet or Help Scout ticket

image image

Also, even if those two posts appear to have retrieved the full content successfully, they actually have not: image

Environment info

No response

Is the issue you are reporting a regression

No

AndreeaCristinaRadacina avatar Feb 20 '24 08:02 AndreeaCristinaRadacina

Using the trick with define( 'FEEDZY_ALLOW_UNSAFE_HTML', true ); works for me

Image

But one thing to remember is that some styling might be site-specific; such styles are not imported because they do not belong to the feed.


@vytisbulkevicius should we implement a toggle to allow users to import with unsafe elements?

Soare-Robert-Daniel avatar Mar 08 '24 13:03 Soare-Robert-Daniel

@Soare-Robert-Daniel,

It would be better to have that constant as a toggle, for sure. We can open a separate issue for that, but in this case the problem was also that the Preview is not accurate (even with the constant applied): image

In your case, does the preview show the correct information with the constant set to allow HTML?

vytisbulkevicius avatar Mar 11 '24 21:03 vytisbulkevicius

@vytisbulkevicius for me, it is getting the full content

Image

Also, one thing to be aware of is that the full content is fetched by a third-party service that grabs the main content of the page. Since the page has some elements, like style, alongside the content, you can get this result.

Image

Soare-Robert-Daniel avatar Mar 12 '24 11:03 Soare-Robert-Daniel

@Soare-Robert-Daniel,

So using the same feed above - https://rss.beehiiv.com/feeds/hThEYw1fsM.xml

In the preview we see that full content is extracted for some items and not for others. Let's focus on 2 items so it's easier: image

This preview means that the first item should be imported with FULL Content and the 2nd item with content from the XML RSS feed only, as its full content can't be retrieved.

However, if you run the import, BOTH of them are imported with content that is available in the RSS XML Feed, not from the website.

1st item: https://astonished-iguana-2195f1.vertisite.cloud/overwhelmed-by-the-daily-grind-of-15-linkedin-comments-for-growth/
2nd item: https://astonished-iguana-2195f1.vertisite.cloud/urgent-easter-special-2-spots-only/

That is because the 1st item in the XML RSS Feed is:

<item>
<title>Overwhelmed By The Daily Grind Of 15 LinkedIn Comments For Growth? </title>
<description>Try this...</description>
<link>https://matthews-newsletter-e7b060.beehiiv.com/p/overwhelmed-daily-grind-15-linkedin-comments-growth</link>
<guid isPermaLink="true">https://matthews-newsletter-e7b060.beehiiv.com/p/overwhelmed-daily-grind-15-linkedin-comments-growth</guid>
<pubDate>Sun, 07 Apr 2024 13:27:00 +0000</pubDate>
<atom:published>2024-04-07T13:27:00Z</atom:published>
<dc:creator>Matthew Baltzell</dc:creator>
<content:encoded>
<![CDATA[ <div class='beehiiv'><style> .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; } .bh__table_cell { padding: 5px; background-color: #FFFFFF; } .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; } .bh__table_header { padding: 5px; background-color:#F1F1F1; } .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; } </style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;">LinkedIn can feel like a hamster wheel I get it:</p><p class="paragraph" style="text-align:left;">Log on<br>Read post<br>Hit the like button<br>Thoughtful comment</p><p class="paragraph" style="text-align:left;">Repeat <br>Repeat <br>Brain Dead</p><p class="paragraph" style="text-align:left;">Congratulations, you’ve hit your daily quota for LinkedIn commenting for growth. <br><br>But…<br><br>It comes at a cost - your creative energy. </p><p class="paragraph" style="text-align:left;">You realize this and want to stop the brain drain so close your MacBook Air while swearing under your breath to the Social Media Gods you will never put yourself through that torture again.<br><br>(While typing this out in my Arctic Blue office I felt like I had a 20 pound kettle bell resting on my chest causing me massive anxiety just thinking about this. )<br><br>I don’t want that for you. <br><br>The way I see it…<br><br>You’ve got 3 options:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">Repeat the same process</p></li><li><p class="paragraph" style="text-align:left;">Delegate/Outsource</p></li><li><p class="paragraph" style="text-align:left;">Quit</p></li></ol><p class="paragraph" style="text-align:left;">Options 1 and 3 are off the table. 
</p><p class="paragraph" style="text-align:left;">That leaves you with one option.</p><p class="paragraph" style="text-align:left;"><span style="text-decoration:underline;">You need to delegate this task in order to maintain stable business growth on LinkedIn and your sanity. </span><br><br>I present to you The Hands Free Commenting Strategy</p><p class="paragraph" style="text-align:left;">Gone are the days of endless hours on LinkedIn typing comments.<br><br>My team of LinkedIn Assasins will help you grow your account by 200 followers every month through organic commenting or your money back + $25. <br><br>We’ve already helped one client gain 100 followers in 10 days and they don’t even post content (Imagine the possibilities)</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/844860b2-1c02-4339-a2d0-3a0e979f2ba5/Screenshot_2567-04-07_at_11.34.25.png?t=1712464481"/></div><p class="paragraph" style="text-align:left;">Interested?<br><br><a class="link" href="https://matthew876527.typeform.com/to/cm88ef1o?utm_source=matthews-newsletter-e7b060.beehiiv.com&utm_medium=newsletter&utm_campaign=overwhelmed-by-the-daily-grind-of-15-linkedin-comments-for-growth" target="_blank" rel="noopener noreferrer nofollow"><span style="text-decoration:underline;">Fill out this quick form</span></a><br><br>Cheers, </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e2420533-0e0e-4dcf-b6c1-852d0d930429/Circle_Profile.jpg?t=1711264245"/></div><p class="paragraph" style="text-align:left;"><span style="color:rgba(255, 255, 255, 0.9);font-family:-apple-system, system-ui, system-ui, Segoe UI, Roboto, Helvetica Neue, Fira Sans, Ubuntu, Oxygen, Oxygen Sans, Cantarell, Droid Sans, Apple Color Emoji, Segoe UI Emoji, Segoe UI Emoji, Segoe 
UI Symbol, Lucida Grande, Helvetica, Arial, sans-serif;font-size:14px;">Tired of writing 20 LinkedIn comments daily for growth? Try this</span></p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=168bb1a7-4b88-4ecd-ae3d-fa43f845aecc&utm_medium=post_rss&utm_source=the_curious_creator">Powered by beehiiv</a></div></div> ]]>
</content:encoded>
</item>

The content from inside the content:encoded tag is what gets imported, while based on the preview we should get the content from this link - https://matthews-newsletter-e7b060.beehiiv.com/p/overwhelmed-daily-grind-15-linkedin-comments-growth (the content is similar, but the full content has more text, which is missing from the XML RSS Feed).

And to confirm, it's expected that sometimes we can't reach the source website to crawl the content from there, but the issue here is to make sure our preview works correctly.

vytisbulkevicius avatar Apr 26 '24 07:04 vytisbulkevicius

@vytisbulkevicius, it seems that our full content feature does not always return the full content of the page.

Look at this response from our server when you use [#item_full_content]:

<entry>
    <guid>https://matthews-newsletter-e7b060.beehiiv.com/p/overwhelmed-daily-grind-15-linkedin-comments-growth</guid>
    <link href="https://matthews-newsletter-e7b060.beehiiv.com/p/overwhelmed-daily-grind-15-linkedin-comments-growth">
    <title>&lt;![CDATA[Overwhelmed By The Daily Grind Of 15 LinkedIn Comments For Growth?]]&gt;</title>
    <summary type="html"><!--[CDATA[Try this...]]--></summary>
    <content type="html"><!--[CDATA[<div class="beehiiv"--><div class="beehiiv__body"><p class="paragraph">LinkedIn can feel like a hamster wheel I get it:</p><p class="paragraph">Log on<br>Read post<br>Hit the like button<br>Thoughtful comment</p><p class="paragraph">Repeat <br>Repeat <br>Brain Dead</p><p class="paragraph">Congratulations, you’ve hit your daily quota for LinkedIn commenting for growth. <br><br>But…<br><br>It comes at a cost - your creative energy. </p><p class="paragraph">You realize this and want to stop the brain drain so close your MacBook Air while swearing under your breath to the Social Media Gods you will never put yourself through that torture again.<br><br>(While typing this out in my Arctic Blue office I felt like I had a 20 pound kettle bell resting on my chest causing me massive anxiety just thinking about this. )<br><br>I don’t want that for you. <br><br>The way I see it…<br><br>You’ve got 3 options:</p><ol start="1"><li><p class="paragraph">Repeat the same process</p></li><li><p class="paragraph">Delegate/Outsource</p></li><li><p class="paragraph">Quit</p></li></ol><p class="paragraph">Options 1 and 3 are off the table. </p><p class="paragraph">That leaves you with one option.</p><p class="paragraph"><span>You need to delegate this task in order to maintain stable business growth on LinkedIn and your sanity. </span><br><br>I present to you The Hands Free Commenting Strategy</p><p class="paragraph">Gone are the days of endless hours on LinkedIn typing comments.<br><br>My team of LinkedIn Assasins will help you grow your account by 200 followers every month through organic commenting or your money back + $25. 
<br><br>We’ve already helped one client gain 100 followers in 10 days and they don’t even post content (Imagine the possibilities)</p><div class="image"><img alt="" class="image__image" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/844860b2-1c02-4339-a2d0-3a0e979f2ba5/Screenshot_2567-04-07_at_11.34.25.png?t=1712464481"></div><p class="paragraph">Interested?<br><br><a class="link" href="https://matthew876527.typeform.com/to/cm88ef1o?utm_source=matthews-newsletter-e7b060.beehiiv.com&amp;utm_medium=newsletter&amp;utm_campaign=overwhelmed-by-the-daily-grind-of-15-linkedin-comments-for-growth" target="_blank" rel="noopener noreferrer nofollow"><span>Fill out this quick form</span></a><br><br>Cheers, </p><div class="image"><img alt="" class="image__image" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e2420533-0e0e-4dcf-b6c1-852d0d930429/Circle_Profile.jpg?t=1711264245"></div><p class="paragraph"><span>Tired of writing 20 LinkedIn comments daily for growth? Try this</span></p></div><div class="beehiiv__footer"><br class="beehiiv__footer__break"><hr class="beehiiv__footer__line"><a target="_blank" class="beehiiv__footer_link" href="https://www.beehiiv.com/?utm_campaign=168bb1a7-4b88-4ecd-ae3d-fa43f845aecc&amp;utm_medium=post_rss&amp;utm_source=the_curious_creator">Powered by beehiiv</a></div>]]&gt;</content>
    <dc:creator><!--[CDATA[]]--></dc:creator>
    <published>Sun, 07 Apr 2024 13:27:00 +0000</published>
    <link rel="enclosure" href="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e2420533-0e0e-4dcf-b6c1-852d0d930429/Circle_Profile.jpg?t=1711264245" type="">
    <full-content type="html"><!--[CDATA[<div--><p>But…</p><p>It comes at a cost - your creative energy. </p><div><p>(While typing this out in my Arctic Blue office I felt like I had a 20 pound kettle bell resting on my chest causing me massive anxiety just thinking about this. )</p><p>I don’t want that for you. </p><p>The way I see it…</p><p>You’ve got 3 options: </p></div><div><ol start="1"><li><p> Repeat the same process </p></li><li><p> Delegate/Outsource </p></li><li><p> Quit </p></li></ol></div><div><p>I present to you The Hands Free Commenting Strategy </p></div><div><p>My team of LinkedIn Assasins will help you grow your account by 200 followers every month through organic commenting or your money back + $25. </p><p>We’ve already helped one client gain 100 followers in 10 days and they don’t even post content (Imagine the possibilities) </p></div><div><img alt="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/844860b2-1c02-4339-a2d0-3a0e979f2ba5/Screenshot_2567-04-07_at_11.34.25.png?t=1712464481"></div><div><img alt="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e2420533-0e0e-4dcf-b6c1-852d0d930429/Circle_Profile.jpg?t=1711264245"></div>]]&gt;</full-content>
    <feedzy:parent-source>The Curious Creator</feedzy:parent-source>
    <error></error>
</entry>

You can see that <content> from the original feed has more elements (like Log on<br>Read post<br>) than <full-content> generated by Graby. This makes it look as if we used content and not full-content.
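To make the difference easier to see, here is a minimal Python sketch (not part of Feedzy or Graby) that lists the text fragments present in the feed's content but absent from the extracted full-content. The helper names `text_fragments` and `missing_fragments` are hypothetical, and the two input strings are shortened, simplified excerpts of the response quoted above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the visible text fragments of an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.fragments = []

    def handle_data(self, data):
        # Collapse whitespace so " Quit " and "Quit" compare as equal.
        text = " ".join(data.split())
        if text:
            self.fragments.append(text)


def text_fragments(html):
    """Return the non-empty text fragments of `html`, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.fragments


def missing_fragments(feed_html, extracted_html):
    """Return fragments found in the feed content but not in the extracted content."""
    extracted = set(text_fragments(extracted_html))
    return [t for t in text_fragments(feed_html) if t not in extracted]


# Shortened excerpts of the <content> and <full-content> values (assumed inputs).
feed_content = (
    '<p class="paragraph">Log on<br>Read post<br>Hit the like button'
    "<br>Thoughtful comment</p>"
    '<p class="paragraph">You’ve got 3 options:</p>'
    '<ol start="1"><li><p class="paragraph">Repeat the same process</p></li></ol>'
)
full_content = (
    "<div><p>You’ve got 3 options: </p></div>"
    '<div><ol start="1"><li><p> Repeat the same process </p></li></ol></div>'
)

missing = missing_fragments(feed_content, full_content)
print(missing)
```

For these excerpts the list contains the "Log on", "Read post", "Hit the like button" and "Thoughtful comment" lines, which are in the feed but not in the extracted text.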

In my testing, the error display of the preview is correct. If Graby cannot get the content, it returns an error. If it can, it returns the page content, but it does not always scrape all of the page's content.
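The three outcomes described above can be sketched as a small Python check. This is only a hypothetical heuristic, not how Feedzy or Graby decide anything: the function names are made up, and treating "less visible text than the feed content" as a partial extraction is an assumption for illustration.

```python
import re


def visible_text_length(html: str) -> int:
    """Rough length of the text left after dropping tags and collapsing whitespace."""
    text = re.sub(r"<[^>]*>", " ", html)
    return len(" ".join(text.split()))


def classify_extraction(feed_html: str, full_html: str, error: str) -> str:
    """Classify an entry as 'error', 'partial' or 'full' (hypothetical labels)."""
    # An error message or an empty extraction means nothing usable came back.
    if error.strip() or not full_html.strip():
        return "error"
    # Assumption: less visible text than the feed content means a partial scrape.
    if visible_text_length(full_html) < visible_text_length(feed_html):
        return "partial"
    return "full"


status = classify_extraction(
    "<p>Log on<br>Read post</p><p>But…</p>",  # feed content (shortened)
    "<p>But…</p>",                            # extracted full content (shortened)
    "",                                       # empty <error>, as in the response above
)
print(status)
```

With the shortened inputs above the result is "partial", which matches the case in this issue: no error is reported, yet the extracted text is shorter than what the feed already provides.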

Soare-Robert-Daniel avatar May 22 '24 12:05 Soare-Robert-Daniel

Got it. Thanks for the investigation, I will close this issue in that case.

vytisbulkevicius avatar Jul 02 '24 08:07 vytisbulkevicius