booking_scraper

It does not work but...

Open jumpjack opened this issue 3 years ago • 1 comments

The scraper does not work (anymore?), but this script does extract data from the HTML search results:

// The child-index paths match the results markup as of November 2022;
// they break whenever the page layout changes.
var resultsCount = document.querySelector("#right").children[0].children[0].children[0].children[1].children[0].innerHTML;
var mainResult = document.querySelector("#search_results_table");
var actualResults = mainResult.children[1].children[0].children[0].children[0].children[2];
var res = [];
[...actualResults.children].forEach((result) => {
	// Only elements tagged as property cards hold a listing.
	if (result.getAttribute("data-testid") !== "property-card") return;
	let name, price, type, rank;

	try {
		name = result.children[0].children[1].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].innerHTML;
	} catch (e) {
		name = "n/a";
	}

	// The price sits at a varying depth: try the deepest node first, then its ancestors.
	try {
		const priceBase = result.children[0].children[1].children[0].children[1].children[1].children[0].children[0].children[0].children[0].children[1].children[0];
		const priceNode = (priceBase.children[0] && priceBase.children[0].children[0]) || priceBase.children[0] || priceBase;
		// Remove every space (including &nbsp;), not just the first one.
		price = priceNode.innerHTML.replace(/&nbsp;|\s/g, "");
	} catch (e) {
		price = "n/a";
	}

	try {
		type = result.children[0].children[1].children[0].children[1].children[0].children[0].children[1].children[0].children[0].children[0].innerHTML;
	} catch (e) {
		type = "n/a";
	}

	try {
		rank = result.children[0].children[1].children[0].children[0].children[1].children[0].children[0].children[0].children[0].children[0].innerHTML;
	} catch (e) {
		rank = "n/a";
	}

	res.push({name: name, price: price, type: type, rank: rank});
});
console.log(res);
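
The long `.children[...]` chains in the script above could also be written as lists of child indexes walked by a small helper that returns a fallback instead of throwing. This is only a sketch: `childAt` and `textAt` are hypothetical names, not part of the original script, and the index paths themselves still depend on the page layout.

```javascript
// Hypothetical helper: follow a list of child indexes from a starting
// element, returning null as soon as a step is missing.
function childAt(node, path) {
  let current = node;
  for (const index of path) {
    if (!current || !current.children || !current.children[index]) return null;
    current = current.children[index];
  }
  return current;
}

// Read innerHTML at the end of a path, or fall back to "n/a".
function textAt(node, path) {
  const target = childAt(node, path);
  return target ? target.innerHTML : "n/a";
}
```

With these, the name lookup would read `textAt(result, [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0])`, with no try/catch needed.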



Example search URL:

https://www.booking.com/searchresults.it.html?checkin_month=11&checkin_monthday=9&checkin_year=2022&checkout_month=11&checkout_monthday=11&checkout_year=2022&group_adults=3&group_children=0&order=price&ss=Rome%2C%20Italy&offset=26&nflt=mealplan%3D1%3Boos%3D1%3Breview_score%3D80

Broken down into its parameters:

  • https://www.booking.com/searchresults.it.html
    • checkin_month=11
    • checkin_monthday=9
    • checkin_year=2022
    • checkout_month=11
    • checkout_monthday=11
    • checkout_year=2022
    • group_adults=3
    • group_children=0
    • order=price
    • ss=Rome, Italy // location
    • offset=26
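
A search URL like the one above could be assembled from a plain object with the standard `URL` API. This is a minimal sketch: `buildSearchUrl` is a hypothetical name, and the parameter names are simply those listed above.

```javascript
// Hypothetical helper: build a search URL from a parameter object.
function buildSearchUrl(params) {
  const url = new URL("https://www.booking.com/searchresults.it.html");
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const searchUrl = buildSearchUrl({
  checkin_month: 11,
  checkin_monthday: 9,
  checkin_year: 2022,
  checkout_month: 11,
  checkout_monthday: 11,
  checkout_year: 2022,
  group_adults: 3,
  group_children: 0,
  order: "price",
  ss: "Rome, Italy", // location
  offset: 26,
});
console.log(searchUrl);
```

Note that `URLSearchParams` encodes the space in "Rome, Italy" as `+` rather than `%20`; that the site treats both forms the same way is an assumption.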

Additional filters: use &nflt= followed by a list of "parameter%3Dvalue" pairs (%3D is the URL-encoded "="), separated by %3B (the URL-encoded ";"). Example: &nflt=mealplan%3D1%3Boos%3D1%3Breview_score%3D80

  • mealplan%3D1 // "mealplan=1", i.e. breakfast included; 999 = kitchen; 9 = breakfast + dinner
  • oos%3D1 // "oos=1", i.e. only available properties
  • review_score%3D80 // "review_score=80", i.e. rating >= 8
  • ht_id%3D204 // "ht_id=204", i.e. hotels only; 208 = B&B
  • pri%3D2 // "pri=2", i.e. price range 2 (available ranges: 1 = 0-75; 2 = 75-150; 3 = ...)
  • class%3D3 // "class=3", i.e. 3 stars
  • fc%3D2 // "fc=2", i.e. free cancellation
  • roomfacility%3D38 // "roomfacility=38", i.e. private bathroom; 11 = air conditioning; 5 = bathtub; 75 = flat-screen TV
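
The encoding described above can be sketched with the standard `encodeURIComponent` and `decodeURIComponent` functions. `encodeFilters` and `decodeFilters` are hypothetical names, and the filter keys and values are the ones from the list above.

```javascript
// Hypothetical helper: turn { mealplan: 1, oos: 1 } into "mealplan%3D1%3Boos%3D1".
function encodeFilters(filters) {
  const raw = Object.entries(filters)
    .map(([key, value]) => `${key}=${value}`)
    .join(";");
  // encodeURIComponent escapes "=" as %3D and ";" as %3B.
  return encodeURIComponent(raw);
}

// Hypothetical helper: the reverse operation, returning string values.
function decodeFilters(encoded) {
  const filters = {};
  for (const pair of decodeURIComponent(encoded).split(";")) {
    if (!pair) continue; // skip empty segments
    const [key, value] = pair.split("=");
    filters[key] = value;
  }
  return filters;
}

const nflt = encodeFilters({ mealplan: 1, oos: 1, review_score: 80 });
console.log("&nflt=" + nflt);
```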

jumpjack, Nov 03 '22 15:11

The results shown on the map are available as JSON: https://www.booking.com/markers_on_map?aid=304142&aid=304142&dest_id=-130358&dest_type=&sr_id=&ref=searchresults&limit=100&stype=1&lang=it&ssm=1&checkin=2022-12-29&checkout=2023-01-01&sech=1&ngp=1&room1=A%2CA%2CA&ugr=1&maps_opened=1&nsopf=1&nsobf=1&esf=1&nflt=mealplan%3D1%3Breview_score%3D80%3Bfc%3D2&sr_countrycode=it&sr_lat=&sr_long=&sgh=1&dba=1&dbc=1&spr=1&currency=EUR&&shws=1%20&huks=1&somp=1&mdimb=1%20&tp=1%20&img_size=270x200%20&avl=1%20&nor=1%20&spc=1%20&rmd=1%20&slpnd=1%20&sbr=1&at=1%20&sat=1%20&ssu=1&srocc=1&order=price;BBOX=13.673105411340503,41.75777869190572,14.244394473840503,42.36949079906152&_=1667554522637

Assign the parsed JSON to a variable "h" to print a simplified list (centerlat and centerlon are the bounding-box center computed further down):

h.b_hotels.forEach((hotel) => {
	// Approximate distance from the bounding-box center in km (1 degree is taken as 111 km).
	const dist = Math.sqrt(Math.pow(centerlat - hotel.b_latitude, 2) + Math.pow(centerlon - hotel.b_longitude, 2)) * 111;
	console.log(hotel.b_hotel_title, hotel.b_accommodation_type, hotel.b_review_score + "(" + hotel.b_review_nr + ")", hotel.b_marker_type, hotel.b_u_total_price, "dist=" + dist.toFixed(0));
});

hotel.b_marker_type tells you whether the property is available; if it is, its price (hotel.b_u_total_price) is present as well.

Given this value for the bounding box (BBOX):

bbox="13.673105411340503,41.75777869190572,14.244394473840503,42.36949079906152"

you can compute its center with:

coords=bbox.split(",");
lat1=coords[1]*1;
lon1=coords[0]*1;
lat2=coords[3]*1;
lon2=coords[2]*1;
centerlat = lat1 + (lat2-lat1)/2;
centerlon = lon1 + (lon2-lon1)/2;

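And the approximate distance of a hotel from the center, in km (taking 1 degree as 111 km, which overestimates east-west distances away from the equator), with: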

    Math.sqrt(Math.pow(centerlat*1-h.b_hotels[HOTEL_NUM].b_latitude,2) + Math.pow(centerlon*1-h.b_hotels[HOTEL_NUM].b_longitude,2))*111
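
The formula above treats one degree of longitude as 111 km, which only holds at the equator; around Rome (latitude about 42) a degree of longitude is noticeably shorter. As an alternative, here is a sketch of the haversine great-circle distance. It is a swapped-in technique, not the method used above; `distanceKm` is a hypothetical name and 6371 km is the mean Earth radius.

```javascript
// Hypothetical helper: great-circle distance in km between two points
// given in decimal degrees, using the haversine formula.
function distanceKm(lat1, lon1, lat2, lon2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371; // mean Earth radius in km
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}
```

With the variables defined earlier, the call would be `distanceKm(centerlat, centerlon, hotel.b_latitude, hotel.b_longitude)`.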

jumpjack, Nov 04 '22 09:11