docling
feature: add support for Google Docs / Google Drive file URLs
Requested feature
let googleDocId = url.match(/google\.com\/(file|document)\/d\/([\w-]+)/);
if (googleDocId)
  url = googleDocId[1] === 'file'
    ? `https://drive.google.com/uc?export=download&id=${googleDocId[2]}`
    : `https://docs.google.com/document/d/${googleDocId[2]}/export?format=pdf`;
...
import re

def convert_google_url_to_download(url):
    # Match Google Drive file links and Google Docs document links.
    google_doc_id = re.search(r'google\.com/(file|document)/d/([\w-]+)', url)
    if google_doc_id:
        if google_doc_id.group(1) == 'file':
            return f'https://drive.google.com/uc?export=download&id={google_doc_id.group(2)}'
        else:
            return f'https://docs.google.com/document/d/{google_doc_id.group(2)}/export?format=pdf'
    # Not a recognized Google URL.
    return None
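Since the helper returns None for anything that is not a Google link, a caller would probably want a fallback to the original URL. Below is a minimal sketch of that, assuming the same regex as above; `resolve_source` and the sample document IDs are hypothetical names made up for illustration and are not part of docling.

```python
import re

# Same pattern as proposed in this issue; compiled once for reuse.
_GOOGLE_URL = re.compile(r"google\.com/(file|document)/d/([\w-]+)")


def convert_google_url_to_download(url):
    """Return a direct-download URL for a Google link, or None if it is not one."""
    match = _GOOGLE_URL.search(url)
    if not match:
        return None
    kind, doc_id = match.groups()
    if kind == "file":
        return f"https://drive.google.com/uc?export=download&id={doc_id}"
    return f"https://docs.google.com/document/d/{doc_id}/export?format=pdf"


def resolve_source(url):
    """Hypothetical wrapper: rewrite Google links, pass every other URL through unchanged."""
    return convert_google_url_to_download(url) or url


# Sample IDs below are invented for illustration.
print(resolve_source("https://docs.google.com/document/d/abc-123_XYZ/edit"))
print(resolve_source("https://drive.google.com/file/d/FILE_ID/view?usp=sharing"))
print(resolve_source("https://example.com/paper.pdf"))
```

The value returned by `resolve_source` could then be handed to the converter as an ordinary URL source. Note that the rewritten links only download without authentication when the document is shared publicly.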
Alternatives
https://airesearch.js.org/functions/extractor/pdf-to-html/