web_scraper

Insufficient documentation of tests and usage

Open theRealBitcoinClub opened this issue 3 years ago • 5 comments

Hey bro, look, your package seems to be awesome, but I can hardly figure out how to use it.

For example, you provide this code as a test, but which HTML element is that command matching?

var names = webScraper.getElementAttribute('div.thumbnail > div.caption > h4 > a.title', 'title');

It would be very easy for you to provide an example HTML element showing exactly what the command matches. Could you please invest a few minutes to help out everyone else in this beautiful universe?

Thanks, enjoy the holiday on 03.01.

Actually, as I am writing this, I realize that the test code makes a test request, so maybe I can figure it out by checking the page source and then send you a PR to make life easier for everyone. Let me try to solve this issue myself; I'll get back here ASAP.

theRealBitcoinClub, Dec 24 '21 13:12

Thanks, @theRealBitcoinClub.

I would really appreciate a PR.

tusharojha, Dec 24 '21 14:12

OK, so I found out that the statement above matches the following HTML:

<div class="thumbnail">
  <img class="img-responsive" alt="item" src="/images/test-sites/e-commerce/items/cart2.png">
  <div class="caption">
    <h4>
      <a href="/test-sites/e-commerce/allinone/product/502" class="title" title="IdeaTab A3500-H">IdeaTab A3500-H</a>
    </h4>
  </div>
</div>

And most probably it targets this attribute: title="IdeaTab A3500-H". Is that correct?

It is unclear because the text between the <a> tags is identical to the value of the title attribute. While you are answering this question, what would be the command to get the text between the tags?

theRealBitcoinClub, Dec 24 '21 14:12
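
To make the question above concrete, here is a minimal sketch that runs the same selector against the HTML snippet as a string and reads the title attribute and the text between the tags separately. It uses the `html` package (`package:html`) directly rather than web_scraper, so it says nothing authoritative about how web_scraper behaves internally; the helper names `attributeValues` and `textContents` are made up for illustration, and `html` is assumed to be listed as a dependency in pubspec.yaml.

```dart
// Assumption: the `html` package is declared under dependencies in pubspec.yaml.
import 'package:html/parser.dart' as html_parser;

// The product snippet from the comment above, as a plain string.
const productHtml = '''
<div class="thumbnail">
  <img class="img-responsive" alt="item" src="/images/test-sites/e-commerce/items/cart2.png">
  <div class="caption">
    <h4>
      <a href="/test-sites/e-commerce/allinone/product/502" class="title" title="IdeaTab A3500-H">IdeaTab A3500-H</a>
    </h4>
  </div>
</div>
''';

// The selector used in the package's test code.
const selector = 'div.thumbnail > div.caption > h4 > a.title';

/// Hypothetical helper: the value of [attribute] on every element
/// that matches [selector] (null where the attribute is missing).
List<String?> attributeValues(String html, String selector, String attribute) {
  final document = html_parser.parse(html);
  return document
      .querySelectorAll(selector)
      .map((element) => element.attributes[attribute])
      .toList();
}

/// Hypothetical helper: the trimmed text between the opening and
/// closing tags of every element that matches [selector].
List<String> textContents(String html, String selector) {
  final document = html_parser.parse(html);
  return document
      .querySelectorAll(selector)
      .map((element) => element.text.trim())
      .toList();
}

void main() {
  // Reads title="..." on the matched <a> element.
  print(attributeValues(productHtml, selector, 'title'));
  // Reads the text between <a ...> and </a>.
  print(textContents(productHtml, selector));
}
```

In this snippet both calls should yield the same string, because the link text and the title attribute happen to be identical; changing one of them in `productHtml` shows which call reads which part.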

Does the scraper still work in the same manner if the class attribute holds several classes? E.g. class="DvzRrc ab_button"

theRealBitcoinClub, Dec 24 '21 14:12

<a jsaction="cGXGTb" href="javascript:void(0);" id="wrkpb" role="button" style="color:#e8eaed;text-decoration:none" class="DvzRrc ab_button" data-aspect-feedback-mode="1" data-attribution="lu-desktop-write-review" data-enable-add-photo="true" data-fid="0x8e803ca4b69dc375:0x4bd279fcc5aff74c" data-language-code="es-VE" data-maps-rw-api-key="AIzaSyBcv0QfUNUfBwo8pIGJ3teNCkaluSGUWus" data-pid="ChIJdcOdtqQ8gI4RTPevxfx50ks" data-session-index="0" data-edit-label-id="Editar opinión" data-ved="2ahUKEwjFsLq8o_z0AhWiQTABHRH_AsgQgCl6BAgUEAc">

How would you extract the data from the attribute "data-pid"?

theRealBitcoinClub, Dec 24 '21 15:12

A working example using an HTML string, rather than a website where you have to dig through the page source, would be awesome!

In fact, more than one example would be best, like 3 or 4. Thank you.

solsticedhiver, Jan 06 '23 20:01
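
As one possible HTML-string example for the multi-class and data-pid questions earlier in the thread, the sketch below parses a shortened copy of that `<a>` element and reads `data-pid` through several selectors. Again this uses `package:html` directly (assumed to be in pubspec.yaml), not web_scraper itself; `firstAttribute` is a hypothetical helper, and the element has been trimmed to a few of its original attributes with an empty body added so that it closes.

```dart
// Assumption: the `html` package is declared under dependencies in pubspec.yaml.
import 'package:html/parser.dart' as html_parser;

// Shortened copy of the element from the thread; most attributes are omitted.
const reviewHtml = '''
<a href="javascript:void(0);" id="wrkpb" role="button"
   class="DvzRrc ab_button"
   data-language-code="es-VE"
   data-pid="ChIJdcOdtqQ8gI4RTPevxfx50ks"></a>
''';

/// Hypothetical helper: the value of [attribute] on the first element
/// matching [selector], or null if nothing matches or it is absent.
String? firstAttribute(String html, String selector, String attribute) {
  final element = html_parser.parse(html).querySelector(selector);
  return element?.attributes[attribute];
}

void main() {
  // A class selector matches when the named class is one of the
  // whitespace-separated classes, so each of these finds the element.
  const selectors = [
    'a.ab_button',        // one of the two classes
    'a.DvzRrc',           // the other class
    'a.DvzRrc.ab_button', // both classes required
    '#wrkpb',             // the id
  ];
  for (final selector in selectors) {
    print('$selector -> ${firstAttribute(reviewHtml, selector, 'data-pid')}');
  }
}
```

By analogy with the call quoted at the top of the issue, passing one of these selectors together with 'data-pid' to `getElementAttribute` would presumably behave the same way, but that has not been verified here.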