Menextract2pdf converturl2abspath does not work in Windows and the temporary workaround

Open yishilin14 opened this issue 5 years ago • 9 comments

The bat files do not work in Windows as indicated in https://github.com/cycomanic/Menextract2pdf/issues/13.

The Python script itself does not work either. The function converturl2abspath does not return the correct file name on Windows. When url is file:///C:/Users/xxxxxx/xxxxxx.pdf, it returns C:/C:/Users/xxxxxx/xxxxxx.pdf.

Here is my temporary fix, in case someone needs it in the future.

In menextract2pdf.py

Add import urllib, urllib.request at the beginning.
In converturl2abspath, change the return line to return urllib.request.url2pathname(pth).
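
For anyone curious where the extra drive letter comes from, here is a minimal, platform-independent sketch. It is not the script's code: file_url_to_windows_path is a hypothetical helper name, and the hand-rolled slash stripping only approximates what the suggested urllib.request.url2pathname call achieves on Windows. The point is that urlparse leaves a leading slash in front of the drive letter, which is presumably what makes os.path.abspath prepend the current drive.

```python
import re
from urllib.parse import unquote, urlparse


def file_url_to_windows_path(url):
    """Hypothetical helper: turn a file:///C:/... URL into a Windows-style path."""
    # urlparse keeps the slash in front of the drive letter:
    # 'file:///C:/dir/f.pdf' gives the path '/C:/dir/f.pdf'
    pth = unquote(urlparse(url).path)
    # Drop that slash when it is followed by a drive letter and a colon
    if re.match(r"^/[A-Za-z]:", pth):
        pth = pth[1:]
    # Use backslashes, as Windows paths do
    return pth.replace("/", "\\")


print(file_url_to_windows_path("file:///C:/Users/xxxxxx/xxxxxx.pdf"))
```

On a real Windows box the one-line change to the return statement described above is simpler; this sketch is only meant to show the intermediate value.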

Then run the following command.

python menextract2pdf.py "C:\Users\xxxxx\AppData\Local\Mendeley Ltd.\Mendeley Desktop\[email protected]@www.mendeley.com.sqlite" "C:\Users\xxxxx\AppData\Local\Mendeley Ltd.\Mendeley Desktop\Downloaded_new"

Now it works like magic!

Mar 25 '19 14:03 yishilin14

I can confirm this issue on Windows 10. Also, my sincere thanks to @yishilin14 for the suggested fix and to @cycomanic for this fantastic script. I had reconciled myself to losing all my Mendeley annotations upon moving to Zotero (Mendeley is evil) but no more!

I almost never work on Windows, but since Mendeley encrypted the database I've had a v1.18 installation on a Windows 10 box. That meant all the PDF paths in the Mendeley database were Windows paths, which is why I had to run this extraction on Windows. In the hope that it might benefit someone else, here are the steps I took.

Windows 10 with Python (pip is apparently built-in these days):

> python -V
Python 3.7.2
> pip list
Package         Version
--------------- -------
pip             19.0.3
PyPDF2          1.26.0
python-dateutil 2.8.0
setuptools      40.6.2
six             1.12.0

I cloned this repo, and made only the following edits (note the slightly tweaked import compared to above):

$ git diff
diff --git a/src/menextract2pdf.py b/src/menextract2pdf.py
index c91a811..9b6ab5a 100644
--- a/src/menextract2pdf.py
+++ b/src/menextract2pdf.py
@@ -4,6 +4,7 @@
 # GPLv3 licence. See the COPYING file for details
 
 from __future__ import print_function
+from six.moves import urllib
 import sqlite3
 try:
     from urllib.parse import unquote, urlparse
@@ -28,7 +29,8 @@ def converturl2abspath(url):
         pth = unquote(urlparse(url).path) #this is necessary for filenames with unicode strings
     except:
         pth = unquote(str(urlparse(url).path)).decode("utf8") #this is necessary for filenames with unicode strings
-    return os.path.abspath(pth)
+    return urllib.request.url2pathname(pth)
 
 def get_highlights_from_db(db, results={}):
     """Extract the locations of highlights from the Mendeley database

I did try import urllib, urllib.request but got syntax errors, and using just import urllib gave AttributeError: module 'urllib' has no attribute 'request'. I'm not very familiar with Python, but this SO answer made me try from six.moves import urllib instead and that worked with the return line as-is.
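
For what it's worth, in Python 3 urllib.request is a submodule that has to be imported explicitly, which is why a bare import urllib raised the AttributeError. If you would rather not depend on six, a minimal sketch of an import that should work on both Python 2 and Python 3 looks like this (an alternative I am suggesting, not what the repo does):

```python
# Sketch only: import url2pathname without six, on Python 2 or Python 3.
try:
    from urllib.request import url2pathname  # Python 3 location
except ImportError:
    from urllib import url2pathname  # Python 2 location

# With this import, the return line in converturl2abspath would become:
#     return url2pathname(pth)
print(url2pathname)
```

Note that url2pathname only does the drive-letter conversion when Python itself is running on Windows; elsewhere it just unquotes the path.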

> python .\menextract2pdf.py 'C:\Users\chepec\AppData\Local\Mendeley Ltd\Mendeley Desktop\[email protected]' Z:\chepec\literature\menextract\

Success, and over a thousand highlights will live to see another day :-)

Mar 29 '19 17:03 solarchemist

I am not a programmer. I am trying to import my Mendeley database, with thousands of highlights and notes, into Zotero. I've tried using this script as described in https://www.zotero.org/support/kb/mendeley_import, as well as the various fixes here. None of them works. Any help?

Jun 19 '19 04:06 brendonyoder

@chepec I initially wrote the script only for Python 2, so the error you encountered was an incompatibility with Python 3. I don't have a Mendeley installation anymore, which makes keeping this updated a bit challenging. Maybe I will have to create an installation just to check things.

Jun 28 '19 07:06 cycomanic

@brendonyoder which OS are you on, and what are the errors?

Jun 28 '19 07:06 cycomanic

I'm running Windows 10 and have Python 2.7 installed. I navigate to the extracted Menextract2pdf-master folder and enter

menextract2pdf_overwrite.bat "%LOCALAPPDATA%\Mendeley Ltd.\Mendeley Desktop\" "%LOCALAPPDATA%\Mendeley Ltd.\Mendeley Desktop\Downloaded\"

and get

REM Helps to find the right mendeley.sqlite-DB 1\*www.mendeley.com.sqlite"') do @set mendeleydb=a was unexpected at this time.

Jun 29 '19 05:06 brendonyoder

same here

Nov 12 '19 00:11 daledali

Quoting @yishilin14's temporary fix from the original post (add the urllib import and change the return line of converturl2abspath to return urllib.request.url2pathname(pth)):

this worked!

Nov 12 '19 00:11 daledali

After following the directions above, I receive the following error:

Traceback (most recent call last):
  File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 200, in <module>
    mendeley2pdf(fn, dir_pdf)
  File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 175, in mendeley2pdf
    highlights = get_highlights_from_db(db)
  File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 60, in get_highlights_from_db
    ret = db.execute(query)
sqlite3.OperationalError: no such table: Files

Help with resolving this error would be much appreciated as it may preserve countless annotations.
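
One possible cause worth ruling out (an assumption on my part, not something confirmed in this thread): sqlite3.connect silently creates a new, empty database when the given path does not exist, and querying that empty file then fails with this kind of "no such table" error. A minimal sketch for checking that the path is real and seeing which tables the file actually contains, where list_tables is a hypothetical helper name:

```python
import os
import sqlite3


def list_tables(db_path):
    """Hypothetical helper: return the table names found in an SQLite file."""
    # Refuse to connect to a missing file, because sqlite3.connect would
    # otherwise create an empty database at that path.
    if not os.path.isfile(db_path):
        raise IOError("no such file: " + db_path)
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table'"
        ).fetchall()
    finally:
        con.close()
    return sorted(name for (name,) in rows)
```

Calling list_tables with the same database path you pass to menextract2pdf.py should show a Files table if the script is going to work; an empty list or an IOError would point to a wrong path rather than a problem in the script.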

Oct 29 '20 20:10 jorishijmans

I am experiencing the same issue as @J-FSS. My Mendeley version is currently 1.19.4, and I am not sure whether I need to downgrade to 1.18 for @yishilin14's temporary fix to work. Can anybody help, please?

Below is the output I get in my terminal, when I run the code from an anaconda environment with Python 2.7.

(pyenv_27) C:\Users\Valerio\Downloads\Menextract2pdf-master\src>python menextract2pdf.py "C:\Users\Valerio\AppData\Local\Mendeley Ltd.\Mendeley Desktop\valerio***@***@www.mendeley.com.sqlite" "C:\Users\Valerio\AppData\Local\Mendeley Ltd.\Mendeley Desktop\Downloaded_new"
Traceback (most recent call last):
  File "menextract2pdf.py", line 199, in <module>
    mendeley2pdf(fn, dir_pdf)
  File "menextract2pdf.py", line 174, in mendeley2pdf
    highlights = get_highlights_from_db(db)
  File "menextract2pdf.py", line 59, in get_highlights_from_db
    ret = db.execute(query)
sqlite3.OperationalError: no such table: Files

Thanks a lot!

P.S.: I really would like to migrate to Zotero, but must keep my notes and highlights from Mendeley... please, help us leave Mendeley! Thanks again.

Nov 21 '20 18:11 PHJT003
