
Received invalid response

Deadshot-SSJB opened this issue 3 years ago • 11 comments • Status: Open

When I try this book: https://www.google.co.in/books/edition/Xamidea_Social_Science_for_Class_9_CBSE/94s2EAAAQBAJ?hl=en&gbpv=0

it says "Received invalid response".

I use the ID: 94s2EAAAQBAJ

It detects the name of the book but does not do anything else.

Please fix this.

Thank you.

Deadshot-SSJB avatar Sep 10 '21 18:09 Deadshot-SSJB

Try replacing these lines:

        try:
            stringResponse = ("["+scripts[6].text.split("_OC_Run")[1][1:-2]+"]")
        except:
            stringResponse = ("["+scripts[-4].text.split("_OC_Run")[1][1:-2]+"]")

with:

        target = "_OC_Run"           
        index = [i for i, content in enumerate(scripts) if '_OC_Run' in str(content)]
        index = index[0]
        stringResponse = f"[{str(scripts[index]).split('_OC_Run')[1][1:].strip(');</script>')}]"

It worked for me.
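
One caveat about the replacement: `str.strip(');</script>')` removes any of those characters from both ends of the string, not the literal suffix. That is harmless as long as the payload ends in `}` or `]`, but a slightly more defensive variant could look like the sketch below. It assumes the script tags have already been converted to plain strings (for example with `str(tag)`), and `extract_oc_run_args` is a hypothetical helper name, not something that exists in GoBooDo.

```python
import json


def extract_oc_run_args(script_texts, marker="_OC_Run"):
    """Return the arguments of the first `_OC_Run(...)` call as a Python list.

    Hypothetical helper: `script_texts` is a list of strings, one per
    <script> tag, e.g. [str(tag) for tag in soup.findAll('script')].
    """
    for text in script_texts:
        start = text.find(marker + "(")
        if start == -1:
            continue  # this script does not contain the call
        # Everything after the opening parenthesis of the call.
        body = text[start + len(marker) + 1:]
        # Assumption: the last ")" in the script closes the _OC_Run call.
        end = body.rfind(")")
        if end == -1:
            continue
        return json.loads("[" + body[:end] + "]")
    raise ValueError(f"no script containing {marker}( was found")
```

Searching every script for the marker, instead of relying on a fixed position such as `scripts[6]`, is the same idea as the fix above; the only difference is that the closing `);` is located explicitly and a missing marker raises a readable error.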

mrelg avatar Sep 13 '21 20:09 mrelg

Thanks! But now I'm getting:

------------------- Creating PDF -------------------
Traceback (most recent call last):
  File "GoBooDo2.py", line 217, in <module>
    book.start()
  File "GoBooDo2.py", line 197, in start
    self.processBook()
  File "GoBooDo2.py", line 150, in processBook
    service.makePdf()
  File "F:\GoBooDo-master\makePDF.py", line 15, in makePdf
    firstPath = self.imageNameList[0]
IndexError: list index out of range

Is it related?

Book ID: PtkMiNeajNMC

QUAKEULUS avatar Oct 16 '21 16:10 QUAKEULUS

I don't know for certain, but it doesn't look related to me. I tried your book and got links for all but one page (PT174). Unfortunately, it looks like my install of tesseract isn't detecting missing pages correctly, and only 40 out of 178 were actually fetched. In your case, it looks like no pages were fetched at all. Check whether you have any images in "book"\images, and maybe try adding proxies to proxies.txt. I put over 140 on my list; just search the web for sites that list lots of free proxy IPs.
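
The `IndexError` in the traceback is what indexing `[0]` on an empty list produces, so a quick way to confirm the "no pages fetched" theory is to list the images folder before building the PDF. This is only a sketch: `list_page_images`, `first_image_or_error`, and the set of file extensions are assumptions for illustration, not GoBooDo code.

```python
from pathlib import Path

# Assumption: page images are stored with one of these extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_page_images(images_dir):
    """Return the sorted image file names found in `images_dir` (empty if none)."""
    folder = Path(images_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.name for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )


def first_image_or_error(images_dir):
    """Return the first image name, or fail with a message that explains why."""
    names = list_page_images(images_dir)
    if not names:
        raise RuntimeError(
            f"No page images found in {images_dir}; nothing to put in the PDF"
        )
    return names[0]
```

If the list comes back empty, the problem is in fetching pages (proxies, blocked requests), not in the PDF step.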

mrelg avatar Oct 16 '21 20:10 mrelg

Also, you should open a separate Issue.

mrelg avatar Oct 16 '21 20:10 mrelg

@mrelg For the tesseract problem, you might want to try:

diff --git a/storeImages.py b/storeImages.py
--- a/storeImages.py	(revision 94bd40aa323abc30d88bcda81afc9cd28b0e94c4)
+++ b/storeImages.py	(date 1634742102818)
@@ -65,7 +65,7 @@
         except:
             pytesseract.pytesseract.tesseract_cmd = self.tesserPath
             text = pytesseract.image_to_string(bw)
-        return text.replace('\n', " ") == 'image not available'
+        return text.strip().replace('\n', " ") == 'image not available'
 
 
     def getImages(self,retries):
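
To see why the added `.strip()` can matter: OCR output may carry trailing whitespace (a newline or a form feed, for instance), and then the exact comparison against 'image not available' fails. The sketch below isolates the two comparisons as standalone functions; the function names are made up for illustration, and the sample string is an assumed example of OCR output, not captured from a real run.

```python
def is_placeholder_old(ocr_text):
    """Comparison before the patch: trailing whitespace breaks the match."""
    return ocr_text.replace("\n", " ") == "image not available"


def is_placeholder_new(ocr_text):
    """Comparison after the patch: surrounding whitespace is removed first."""
    return ocr_text.strip().replace("\n", " ") == "image not available"


# Assumed sample: the placeholder text followed by a newline and a form feed.
sample = "image not\navailable\n\x0c"
print(is_placeholder_old(sample))
print(is_placeholder_new(sample))
```

With the assumed sample, only the patched comparison recognizes the placeholder, so the page would be treated as missing and retried.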

pcdi avatar Oct 20 '21 15:10 pcdi

To use proxies, I put "proxy_links": 1, in settings. Is this correct?

mmmx10 avatar Nov 23 '21 17:11 mmmx10

The replacement for the stringResponse lines that @mrelg posted above works for me too.

DmytroSytnyk avatar Dec 01 '21 20:12 DmytroSytnyk

@mrelg Could you please make a pull request so the code gets into the codebase?

DmytroSytnyk avatar Dec 01 '21 20:12 DmytroSytnyk

Hello everyone,

I tried the solution, replacing the lines of code as described (first screenshot):

[screenshot: replaced lines]

but then the error in the second screenshot appears:

[screenshot: GoBooDo error]

Could somebody help me or explain what kind of error this is?

blue2908 avatar Apr 27 '22 22:04 blue2908

I'm a Python rookie myself, so believe me when I say it is quite a rookie mistake. Open your code in an editor that can show whitespace characters, such as Notepad++ (there is an option for that under View/Show Symbols), and make sure the number and type of indentation characters match the surrounding code. https://stackoverflow.com/questions/1016814/what-to-do-with-unexpected-indent-in-python

BTW, can someone please turn my fix into a pull request? I don't have the time to learn how to do it properly on GitHub myself. It's a little tiring to explain to everyone how to type it in themselves, especially when the bug isn't necessarily the same one.
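
If no such editor is at hand, the tab-versus-space case can also be checked with a few lines of Python. This is a minimal sketch that only flags lines whose indentation contains a tab; it does not catch a line that is simply indented too far with spaces. `tab_indented_lines` is a hypothetical helper name. The standard library also ships a checker for ambiguous indentation that can be run as `python -m tabnanny GoBooDo.py`.

```python
def tab_indented_lines(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading run of spaces and tabs, i.e. the indentation of the line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


# Example: the third line is indented with a tab instead of spaces.
example = "def f():\n    a = 1\n\tb = 2\n"
print(tab_indented_lines(example))
```

To check a real file, read it first, for example `tab_indented_lines(open("GoBooDo.py").read())`.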

mrelg avatar Apr 28 '22 09:04 mrelg

The fix offered by @mrelg is still needed to make this project work.

Here is the same fix in patch form (from https://github.com/vaibhavk97/GoBooDo/issues/60#issuecomment-918563558):

diff --git a/GoBooDo.py b/GoBooDo.py
index 0971a7d..2dab8f3 100644
--- a/GoBooDo.py
+++ b/GoBooDo.py
@@ -82,10 +82,11 @@ class  GoBooDo:
         print(f'Downloading {self.name[:-15]}')
         if self.found == False:
             scripts = (soup.findAll('script'))
-            try:
-                stringResponse = ("["+scripts[6].text.split("_OC_Run")[1][1:-2]+"]")
-            except:
-                stringResponse = ("["+scripts[-4].text.split("_OC_Run")[1][1:-2]+"]")
+            target = "_OC_Run"
+            index = [i for i, content in enumerate(scripts) if '_OC_Run' in str(content)]
+            index = index[0]
+            stringResponse = f"[{str(scripts[index]).split('_OC_Run')[1][1:].strip(');</script>')}]"
+
             jsonResponse = json.loads(stringResponse)
             self.createPageDict(jsonResponse)
             print(f'Pages to be fetched in the current iteration are : {len(self.pageList)}')

I also attached it as a file in case copy and paste fails:

patch.txt

Save the file and then run:

git clone https://github.com/vaibhavk97/GoBooDo
cd GoBooDo
# now copy patch.txt to here and finally
git apply patch.txt
python GoBooDo.py --id=YOURID

stas00 avatar Feb 16 '23 23:02 stas00