Menextract2pdf
converturl2abspath does not work on Windows, and a temporary workaround
The bat files do not work on Windows, as indicated in https://github.com/cycomanic/Menextract2pdf/issues/13.
The Python script itself does not work either: the function converturl2abspath does not return the correct file name on Windows. When url is file:///C:/Users/xxxxxx/xxxxxx.pdf, it returns C:/C:/Users/xxxxxx/xxxxxx.pdf.
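To make the failure concrete, here is a minimal sketch of what the URL's path component looks like and what a Windows-aware conversion has to do with it. The helper name file_url_to_windows_path is hypothetical and not part of Menextract2pdf. It uses ntpath so the result is the same on any OS; on Windows itself, urllib.request.url2pathname should do an equivalent job. Presumably os.path.abspath sees the leading slash in front of the drive letter as a rooted path without a drive and prepends the current drive, which would explain the doubled C:.

```python
import ntpath
import re
from urllib.parse import unquote, urlparse


def file_url_to_windows_path(url):
    """Hypothetical helper: turn a file:// URL into a Windows-style path."""
    # urlparse keeps the slash in front of the drive letter: '/C:/Users/...'
    raw = unquote(urlparse(url).path)
    # Drop that slash only when it is followed by a drive letter and a colon.
    if re.match(r"^/[A-Za-z]:", raw):
        raw = raw[1:]
    # ntpath applies Windows path rules regardless of the host OS.
    return ntpath.normpath(raw)


url = "file:///C:/Users/xxxxxx/xxxxxx.pdf"
print(urlparse(url).path)             # /C:/Users/xxxxxx/xxxxxx.pdf
print(file_url_to_windows_path(url))  # C:\Users\xxxxxx\xxxxxx.pdf
```

The leading slash before the drive letter is the part that has to be removed before the path is usable on Windows.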
Here is my temporary fix, in case someone needs it in the future.
In menextract2pdf.py:
- Add import urllib, urllib.request at the beginning.
- In converturl2abspath, change the return line to: return urllib.request.url2pathname(pth)
Then run the following command.
python menextract2pdf.py "C:\Users\xxxxx\AppData\Local\Mendeley Ltd.\Mendeley Desktop\[email protected]@www.mendeley.com.sqlite" "C:\Users\xxxxx\AppData\Local\Mendeley Ltd.\Mendeley Desktop\Downloaded_new"
Now it works like magic!
I can confirm this issue on Windows 10. Also, my sincere thanks to @yishilin14 for the suggested fix and to @cycomanic for this fantastic script. I had reconciled myself with losing all my Mendeley annotations upon moving to Zotero (Mendeley is evil) but no more!
I almost never work on Windows, but since Mendeley encrypted the database I've had a v1.18 installation on a Windows 10 box. That meant all the PDF paths in the Mendeley database were Windows paths, so that's why I had to run this extraction on Windows. In the hopes it might benefit someone else, here's the steps I took.
Windows 10 with Python (pip is apparently built-in these days):
> python -V
Python 3.7.2
> pip list
Package         Version
--------------- -------
pip             19.0.3
PyPDF2          1.26.0
python-dateutil 2.8.0
setuptools      40.6.2
six             1.12.0
I cloned this repo, and made only the following edits (note the slightly tweaked import compared to above):
$ git diff
diff --git a/src/menextract2pdf.py b/src/menextract2pdf.py
index c91a811..9b6ab5a 100644
--- a/src/menextract2pdf.py
+++ b/src/menextract2pdf.py
@@ -4,6 +4,7 @@
 # GPLv3 licence. See the COPYING file for details
 from __future__ import print_function
+from six.moves import urllib
 import sqlite3
 try:
     from urllib.parse import unquote, urlparse
@@ -28,7 +29,8 @@ def converturl2abspath(url):
         pth = unquote(urlparse(url).path) #this is necessary for filenames with unicode strings
     except:
         pth = unquote(str(urlparse(url).path)).decode("utf8") #this is necessary for filenames with unicode strings
-    return os.path.abspath(pth)
+    return urllib.request.url2pathname(pth)
 def get_highlights_from_db(db, results={}):
     """Extract the locations of highlights from the Mendeley database
I did try import urllib, urllib.request but got syntax errors, and using just import urllib gave AttributeError: module 'urllib' has no attribute 'request'. I'm not very familiar with Python, but this SO answer made me try from six.moves import urllib instead, and that worked with the return line as-is.
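For anyone who would rather not depend on six, a standard-library-only import that runs on both Python 2 and Python 3 is another option. This is only a sketch, not what the script ships with; it follows the same try/except pattern the script already uses for unquote and urlparse. With it, the return line would call url2pathname(pth) directly.

```python
# Import url2pathname from wherever the running Python version keeps it.
try:
    from urllib.request import url2pathname  # Python 3
except ImportError:
    from urllib import url2pathname  # Python 2

# The result is platform-dependent: on Windows this yields a backslash path,
# elsewhere it mostly just decodes the percent-escapes.
print(url2pathname("/some%20folder/file.pdf"))
```

The AttributeError above comes from Python 3 not loading the urllib.request submodule when only import urllib is given.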
> python .\menextract2pdf.py 'C:\Users\chepec\AppData\Local\Mendeley Ltd\Mendeley Desktop\[email protected]' Z:\chepec\literature\menextract\
Success, and over a thousand highlights will live to see another day :-)
I am not a programmer. I am trying to import my Mendeley database, with thousands of highlights and notes, into Zotero. I've tried using this script as described in https://www.zotero.org/support/kb/mendeley_import, as well as the various fixes here. None of them works. Any help?
@chepec I originally wrote the script only for Python 2, so the error you encountered was a Python 3 incompatibility. I no longer have a Mendeley installation, which makes keeping this updated a bit challenging. Maybe I will have to create an installation just to check things.
@brendonyoder what OS and what are the errors?
I'm running Windows 10 and have Python 2.7 installed. I navigate to the extracted Menextract2pdf-master folder and enter
menextract2pdf_overwrite.bat "%LOCALAPPDATA%\Mendeley Ltd.\Mendeley Desktop\" "%LOCALAPPDATA%\Mendeley Ltd.\Mendeley Desktop\Downloaded\"
and get
REM Helps to find the right mendeley.sqlite-DB 1\*www.mendeley.com.sqlite"') do @set mendeleydb=a was unexpected at this time.
same here
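Since the failing part of the bat file appears to be the step that searches for the *www.mendeley.com.sqlite database, one way around it might be to locate the file from Python and then call menextract2pdf.py directly, as in the first post. The sketch below is an assumption about what that search is meant to do, and the function name find_mendeley_db is hypothetical.

```python
import glob
import os


def find_mendeley_db(mendeley_dir):
    """Hypothetical helper: return the single *www.mendeley.com.sqlite file in a folder."""
    pattern = os.path.join(mendeley_dir, "*www.mendeley.com.sqlite")
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        # Zero or several candidates: better to stop than to guess.
        raise RuntimeError("expected exactly one database, found %d" % len(matches))
    return matches[0]
```

The returned path can then be passed as the first argument to python menextract2pdf.py, followed by the output folder.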
The temporary fix from the first post worked!
After following the directions above, I received the following error:
Traceback (most recent call last):
File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 200, in <module>
mendeley2pdf(fn, dir_pdf)
File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 175, in mendeley2pdf
highlights = get_highlights_from_db(db)
File "C:\Users\XXXX\Downloads\Menextract2pdf-master\menextract2pdf.py", line 60, in get_highlights_from_db
ret = db.execute(query)
sqlite3.OperationalError: no such table: Files
Help with resolving this error would be much appreciated as it may preserve countless annotations.
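One thing that may be worth checking for this error, although nothing in this thread confirms it as the cause: sqlite3.connect silently creates a new, empty database when the given path does not exist, and querying that empty file would produce exactly "no such table". The sketch below (list_tables is a hypothetical helper name) refuses to open a missing file and lists the tables that are actually present.

```python
import os
import sqlite3


def list_tables(db_path):
    """Hypothetical helper: list table names in an existing SQLite file."""
    # Check first, because sqlite3.connect would create an empty file otherwise.
    if not os.path.isfile(db_path):
        raise IOError("no such file: %s" % db_path)
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]
```

If the file exists and the list does not include Files, the database is probably not in the layout the script expects; if the file does not exist, the path passed on the command line is the first thing to fix.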
I am experiencing the same issue as @J-FSS. At the moment my Mendeley version is 1.19.4, and I am not sure whether I need to downgrade to 1.18 for @yishilin14's temporary fix to work. Can anybody help, please?
Below is the output I get in my terminal, when I run the code from an anaconda environment with Python 2.7.
(pyenv_27) C:\Users\Valerio\Downloads\Menextract2pdf-master\src>python menextract2pdf.py "C:\Users\Valerio\AppData\Local\Mendeley Ltd.\Mendeley Desktop\valerio***@***@www.mendeley.com.sqlite" "C:\Users\Valerio\AppData\Local\Mendeley Ltd.\Mendeley Desktop\Downloaded_new"
Traceback (most recent call last):
File "menextract2pdf.py", line 199, in <module>
mendeley2pdf(fn, dir_pdf)
File "menextract2pdf.py", line 174, in mendeley2pdf
highlights = get_highlights_from_db(db)
File "menextract2pdf.py", line 59, in get_highlights_from_db
ret = db.execute(query)
sqlite3.OperationalError: no such table: Files
Thanks a lot!
P.S.: I really would like to migrate to Zotero, but must keep my notes and highlights from Mendeley... please, help us leave Mendeley! Thanks again.