webwhiz
Content being ignored by Crawlee
While crawling, the crawler ignores text inside certain tags. For example, everything between <aside> and </aside> in the snippet below is completely ignored.
<aside class="content tip astro-duqfclob" aria-label="Tip">
<p class="title astro-duqfclob" aria-hidden="true">
<span class="icon astro-duqfclob">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 18 18" width="16" height="16" class="astro-duqfclob">
<path fill-rule="evenodd" d="M14 0a8.8 8.8 0 0 0-6 2.6l-.5.4-.9 1H3.3a1.8 1.8 0 0 0-1.5.8L.1 7.6a.8.8 0 0 0 .4 1.1l3.1 1 .2.1 2.4 2.4.1.2 1 3a.8.8 0 0 0 1 .5l2.9-1.7a1.8 1.8 0 0 0 .8-1.5V9.5l1-1 .4-.4A8.8 8.8 0 0 0 16 2v-.1A1.8 1.8 0 0 0 14.2 0h-.1zm-3.5 10.6-.3.2L8 12.3l.5 1.8 2-1.2a.3.3 0 0 0 .1-.2v-2zM3.7 8.1l1.5-2.3.2-.3h-2a.3.3 0 0 0-.3.1l-1.2 2 1.8.5zm5.2-4.5a7.3 7.3 0 0 1 5.2-2.1h.1a.3.3 0 0 1 .3.3v.1a7.3 7.3 0 0 1-2.1 5.2l-.5.4a15.2 15.2 0 0 1-2.5 2L7.1 11 5 9l1.5-2.3a15.3 15.3 0 0 1 2-2.5l.4-.5zM12 5a1 1 0 1 1-2 0 1 1 0 0 1 2 0zm-8.4 9.6a1.5 1.5 0 1 0-2.2-2.2 7 7 0 0 0-1.1 3 .2.2 0 0 0 .3.3c.6 0 2.2-.4 3-1.1z" class="astro-duqfclob"></path>
</svg>
</span>
Tip
</p>
<section class="astro-duqfclob">
<p>A common pattern in Astro is to import global CSS inside a <a href="/en/core-concepts/layouts/">Layout component</a>. Be sure to import the Layout component before other imports so that it has the lowest precedence.</p>
</section>
</aside>
The code above produces the output shown in the screenshot below, and can also be seen in action at this link:
All text inside <aside></aside>
is ignored. Please advise.
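
To help narrow this down, here is a minimal sketch of the behaviour I suspect, not the actual webwhiz or Crawlee code. It assumes the crawler removes a list of "non-content" tags before extracting text; the names `TextExtractor`, `extract_text` and `skip_tags` are made up for illustration, and the sample HTML is a shortened version of the snippet above. With `aside` in the skip list the tip text disappears, which matches what I am seeing.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping everything nested inside `skip_tags`.

    Hypothetical helper: `skip_tags` should only contain container tags
    (ones that have a closing tag), such as "aside" or "nav".
    """

    def __init__(self, skip_tags=()):
        super().__init__()
        self.skip_tags = set(skip_tags)
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html, skip_tags=()):
    """Return the text of `html`, joined with single spaces."""
    parser = TextExtractor(skip_tags)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Shortened stand-in for the <aside> snippet in this report.
SAMPLE = (
    "<p>Before the tip.</p>"
    '<aside class="content tip" aria-label="Tip">'
    '<p class="title" aria-hidden="true">Tip</p>'
    "<section><p>A common pattern in Astro is to import global CSS inside a "
    '<a href="/en/core-concepts/layouts/">Layout component</a>.</p></section>'
    "</aside>"
)

with_aside = extract_text(SAMPLE)                          # nothing skipped
without_aside = extract_text(SAMPLE, skip_tags={"aside"})  # suspected behaviour

print(with_aside)
print(without_aside)
```

If something like this skip list exists in the crawler, making it configurable (or dropping `aside` from it) would presumably keep the tip text in the crawled content.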