otter-browser icon indicating copy to clipboard operation
otter-browser copied to clipboard

Implement HTML bookmark import for QtWebEngine

Open fnkkio opened this issue 4 years ago • 14 comments

Fixes #1613

Worked fine with my test cases, but this may require further testing to make sure everything is OK.

Video demo: https://youtu.be/CtPK3Qr7zi0

fnkkio avatar Nov 18 '19 21:11 fnkkio

@fnkkio, I was working on backend independent import, but I guess that we could go this path as well (especially since there wasn't much progress with the shared importer). But with this approach I think that it would be better to move the logic to backend itself, to avoid ifdefs etc. (also that would be required for the plan to move backends to independent modules, without linking the core to their dependencies directly). I'll prepare new API and move there code used by QtWebKit, then we can revisit this patch.

Emdek avatar Nov 22 '19 11:11 Emdek

@Emdek Ok, I will wait for your API.

fnkkio avatar Nov 22 '19 12:11 fnkkio

@fnkkio, it's ready, QtWebKitBookmarksImportJob.

Emdek avatar Nov 25 '19 19:11 Emdek

@Emdek Thanks! Will get it done tomorrow.

fnkkio avatar Nov 25 '19 19:11 fnkkio

@Emdek Updated.

fnkkio avatar Dec 02 '19 05:12 fnkkio

@fnkkio, do we really need QEventLoop there? You can send first signal after receiving data too, it's not a big deal. Also in this case it might be a good idea to try QWebChannel, we don't use it directly but it's an indirect dependency anyway. Longer JS scripts should be moved to resources, to make them more readable. Also you are missing QLatin1String for strings.

Emdek avatar Dec 02 '19 20:12 Emdek

And I'll think about moving that date extracting method as protected in base class.

Emdek avatar Dec 02 '19 21:12 Emdek

And done.

Emdek avatar Dec 02 '19 21:12 Emdek

@fnkkio, looks better now, but still needs some formatting fixes etc., I guess that the faster route will be to let me to do that cleanup, I'll post results later this week.

Emdek avatar Dec 03 '19 20:12 Emdek

@Emdek Used QtWebChannel in latest commit, like you suggested. I don't know if this is an improvement, it's up to you to decide :-)

fnkkio avatar Dec 04 '19 15:12 fnkkio

@fnkkio, sorry for being late, but here it is:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index bfcedaa26..9d80b8ae1 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -529,7 +529,7 @@ add_executable(otter-browser WIN32 MACOSX_BUNDLE
 )
 
 if (Qt5WebEngineWidgets_FOUND AND ENABLE_QTWEBENGINE)
-	target_link_libraries(otter-browser Qt5::WebEngineCore Qt5::WebEngineWidgets)
+        target_link_libraries(otter-browser Qt5::WebEngineCore Qt5::WebEngineWidgets Qt5::WebChannel)
 endif ()
 
 if (Qt5WebKitWidgets_FOUND AND ENABLE_QTWEBKIT)
diff --git a/src/modules/backends/web/qtwebengine/QtWebEngineResources.qrc b/src/modules/backends/web/qtwebengine/QtWebEngineResources.qrc
index cc31d302c..bb5f022f7 100644
--- a/src/modules/backends/web/qtwebengine/QtWebEngineResources.qrc
+++ b/src/modules/backends/web/qtwebengine/QtWebEngineResources.qrc
@@ -8,5 +8,6 @@
         <file>resources/hideBlockedRequests.js</file>
         <file>resources/hitTest.js</file>
         <file>resources/imageViewer.js</file>
+        <file>resources/importBookmarks.js</file>
     </qresource>
 </RCC>
diff --git a/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.cpp b/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.cpp
index b1eb2857a..fedaf8e4d 100644
--- a/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.cpp
+++ b/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.cpp
@@ -23,6 +23,7 @@
 #include "QtWebEngineTransfer.h"
 #include "QtWebEngineUrlRequestInterceptor.h"
 #include "QtWebEngineWebWidget.h"
+#include "../../../../core/BookmarksManager.h"
 #include "../../../../core/ContentFiltersManager.h"
 #include "../../../../core/HandlersManager.h"
 #include "../../../../core/NetworkManagerFactory.h"
@@ -37,6 +38,7 @@
 #include <QtCore/QCoreApplication>
 #include <QtCore/QDir>
 #include <QtCore/QRegularExpression>
+#include <QtWebChannel/QtWebChannel>
 #include <QtWebEngineWidgets/QWebEngineProfile>
 #include <QtWebEngineWidgets/QWebEngineSettings>
 
@@ -276,6 +278,11 @@ WebWidget* QtWebEngineWebBackend::createWidget(const QVariantMap &parameters, Co
 	return new QtWebEngineWebWidget(parameters, this, parent);
 }
 
+BookmarksImportJob* QtWebEngineWebBackend::createBookmarksImportJob(BookmarksModel::Bookmark *folder, const QString &path, bool areDuplicatesAllowed)
+{
+	return new QtWebEngineBookmarksImportJob(folder, path, areDuplicatesAllowed, this);
+}
+
 QString QtWebEngineWebBackend::getName() const
 {
 	return QLatin1String("qtwebengine");
@@ -349,4 +356,178 @@ bool QtWebEngineWebBackend::hasSslSupport() const
 	return true;
 }
 
+QtWebEngineBookmarksImportJob::QtWebEngineBookmarksImportJob(BookmarksModel::Bookmark *folder, const QString &path, bool areDuplicatesAllowed, QObject *parent) : BookmarksImportJob(folder, areDuplicatesAllowed, parent),
+	m_path(path),
+	m_currentAmount(0),
+	m_totalAmount(-1),
+	m_isRunning(false)
+{
+}
+
+void QtWebEngineBookmarksImportJob::start()
+{
+	QFile bookmarksFile(m_path);
+
+	if (!bookmarksFile.open(QIODevice::ReadOnly))
+	{
+		markAsFinished(false);
+
+		return;
+	}
+
+	m_isRunning = true;
+
+	QWebChannel *webChannel(new QWebChannel(this));
+	QWebEnginePage *page(new QWebEnginePage(this));
+	page->setWebChannel(webChannel);
+
+	webChannel->registerObject(QLatin1String("bookmarkImporter"), this);
+
+	connect(page, &QWebEnginePage::loadFinished, [&]()
+	{
+		QFile webchannelFile(QLatin1String(":/qtwebchannel/qwebchannel.js"));
+
+		if (!webchannelFile.open(QIODevice::ReadOnly))
+		{
+			markAsFinished(false);
+
+			return;
+		}
+
+		page->runJavaScript(QString::fromLatin1(webchannelFile.readAll()), [&](const QVariant &result)
+		{
+			Q_UNUSED(result)
+
+			QFile importBookmarksFile(QLatin1String(":/modules/backends/web/qtwebengine/resources/importBookmarks.js"));
+
+			if (importBookmarksFile.open(QIODevice::ReadOnly))
+			{
+				page->runJavaScript(QString::fromLatin1(importBookmarksFile.readAll()));
+
+				importBookmarksFile.close();
+			}
+			else
+			{
+				markAsFinished(false);
+			}
+
+		});
+
+		webchannelFile.close();
+	});
+
+	page->setHtml(QString::fromLatin1(bookmarksFile.readAll()));
+
+	bookmarksFile.close();
+}
+
+void QtWebEngineBookmarksImportJob::cancel()
+{
+}
+
+void QtWebEngineBookmarksImportJob::processBookmarks(const QVariantList &items)
+{
+	processBookmarks(items, getImportFolder(), -1);
+
+	if (m_currentAmount == m_totalAmount)
+	{
+		markAsFinished(true);
+	}
+}
+
+void QtWebEngineBookmarksImportJob::processBookmarks(const QVariantList &items, BookmarksModel::Bookmark *importFolder, int numChildren)
+{
+	while (numChildren != 0 && m_currentAmount < m_totalAmount)
+	{
+		const QVariant &item(items.at(m_currentAmount++));
+
+		--numChildren;
+
+		if (item.isNull())
+		{
+			BookmarksManager::addBookmark(BookmarksModel::SeparatorBookmark, {}, importFolder);
+
+			emit importProgress(Importer::BookmarksImport, m_totalAmount, m_currentAmount);
+
+			continue;
+		}
+
+		const QVariantMap itemMap(item.toMap());
+		QMap<int, QVariant> metaData({{BookmarksModel::TitleRole, itemMap[QLatin1String("title")].toString()}});
+		const QDateTime timeAdded(getDateTime(itemMap[QLatin1String("dateAdded")].toString()));
+		const QDateTime timeModified(getDateTime(itemMap[QLatin1String("dateModified")].toString()));
+
+		if (timeAdded.isValid())
+		{
+			metaData[BookmarksModel::TimeAddedRole] = timeAdded;
+			metaData[BookmarksModel::TimeModifiedRole] = timeAdded;
+		}
+
+		if (timeModified.isValid())
+		{
+			metaData[BookmarksModel::TimeAddedRole] = timeModified;
+		}
+
+		if (itemMap[QLatin1String("type")].toString() == QLatin1String("folder"))
+		{
+			processBookmarks(items, BookmarksManager::addBookmark(BookmarksModel::FolderBookmark, metaData, importFolder), itemMap[QLatin1String("numChildren")].toInt());
+		}
+		else
+		{
+			const QString url(itemMap[QLatin1String("url")].toString());
+
+			if (!areDuplicatesAllowed() && BookmarksManager::hasBookmark(url))
+			{
+				continue;
+			}
+
+			metaData[BookmarksModel::UrlRole] = url;
+
+			const QString keyword(itemMap[QLatin1String("keyword")].toString());
+
+			if (!keyword.isEmpty())
+			{
+				metaData[BookmarksModel::KeywordRole] = keyword;
+			}
+
+			BookmarksManager::addBookmark(BookmarksModel::UrlBookmark, metaData, importFolder);
+		}
+
+		emit importProgress(Importer::BookmarksImport, m_totalAmount, m_currentAmount);
+	}
+}
+
+void QtWebEngineBookmarksImportJob::markAsFinished(bool isSuccess)
+{
+	m_isRunning = false;
+
+	BookmarksManager::getModel()->endImport();
+
+	emit importFinished(Importer::BookmarksImport, (isSuccess ? Importer::SuccessfullImport : Importer::FailedImport), m_totalAmount);
+	emit jobFinished(isSuccess);
+
+	deleteLater();
+}
+
+void QtWebEngineBookmarksImportJob::setAmounts(int totalAmount, int estimatedUrlsAmount, int estimatedKeywordsAmount)
+{
+	m_totalAmount = totalAmount;
+
+	emit importStarted(Importer::BookmarksImport, m_totalAmount);
+
+	if (m_totalAmount == 0)
+	{
+		markAsFinished(true);
+	}
+	else
+	{
+		BookmarksManager::getModel()->beginImport(getImportFolder(), estimatedUrlsAmount, estimatedKeywordsAmount);
+	}
+}
+
+bool QtWebEngineBookmarksImportJob::isRunning() const
+{
+	return m_isRunning;
+}
+
 }
diff --git a/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.h b/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.h
index 5e8386552..ce850b77e 100644
--- a/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.h
+++ b/src/modules/backends/web/qtwebengine/QtWebEngineWebBackend.h
@@ -45,6 +45,7 @@ public:
 	explicit QtWebEngineWebBackend(QObject *parent = nullptr);
 
 	WebWidget* createWidget(const QVariantMap &parameters, ContentsWidget *parent = nullptr) override;
+	BookmarksImportJob* createBookmarksImportJob(BookmarksModel::Bookmark *folder, const QString &path, bool areDuplicatesAllowed) override;
 	QString getName() const override;
 	QString getTitle() const override;
 	QString getDescription() const override;
@@ -81,6 +82,32 @@ private:
 friend class QtWebEnginePage;
 };
 
+class QtWebEngineBookmarksImportJob final : public BookmarksImportJob
+{
+	Q_OBJECT
+
+public:
+	explicit QtWebEngineBookmarksImportJob(BookmarksModel::Bookmark *folder, const QString &path, bool areDuplicatesAllowed, QObject *parent = nullptr);
+
+	bool isRunning() const override;
+	Q_INVOKABLE void processBookmarks(const QVariantList &items);
+	Q_INVOKABLE void setAmounts(int totalAmount, int estimatedUrlsAmount, int estimatedKeywordsAmount);
+
+public slots:
+	void start() override;
+	void cancel() override;
+
+protected:
+	void processBookmarks(const QVariantList &items, BookmarksModel::Bookmark *importFolder, int numChildren);
+	void markAsFinished(bool isSuccess);
+
+private:
+	QString m_path;
+	int m_currentAmount;
+	int m_totalAmount;
+	bool m_isRunning;
+};
+
 }
 
 #endif
diff --git a/src/modules/backends/web/qtwebengine/resources/importBookmarks.js b/src/modules/backends/web/qtwebengine/resources/importBookmarks.js
new file mode 100644
index 000000000..4d82de851
--- /dev/null
+++ b/src/modules/backends/web/qtwebengine/resources/importBookmarks.js
@@ -0,0 +1,42 @@
+new QWebChannel(qt.webChannelTransport, function(webChannel)
+{
+	function getAmount(selector)
+	{
+		return document.querySelectorAll(selector).length;
+	}
+
+	webChannel.objects.bookmarksImporter.setAmounts(getAmount('DT, HR'), getAmount('A[href]'), getAmount('A[shortcuturl]'));
+
+	var nodes = [];
+
+	document.querySelectorAll('H3, A, HR').forEach(function(node)
+	{
+		if (node.tagName.toUpperCase() === 'H3')
+		{
+			nodes.push({
+				type: 'folder',
+				title: node.innerHTML,
+				numChildren: node.parentNode.querySelector('dl').querySelectorAll(':scope > dt').length,
+				dateAdded: node.getAttribute('add_date'),
+				dateModified: node.getAttribute('last_modified')
+			});
+		}
+		else if (node.tagName.toUpperCase() === 'A')
+		{
+			nodes.push({
+				type: 'url',
+				title: node.innerHTML,
+				url: node.getAttribute('href'),
+				dateAdded: node.getAttribute('add_date'),
+				dateModified: node.getAttribute('last_modified'),
+				keyword: node.getAttribute('shortcuturl')
+			});
+		}
+		else if (node.tagName.toUpperCase() === 'HR')
+		{
+			nodes.push(null);
+		}
+	});
+
+	webChannel.objects.bookmarksImporter.processBookmarks(nodes);
+});

I haven't tested it but it needs some more work anyway.

The main idea behind going for web channel was to send bookmark data one by one, so it won't choke on huge imports. Basically, the logic in the script should be closer to what QtWebKit backend is doing, including checking for FEEDURL attribute. Perhpas going for API similar to this:

void beginImport(); // renamed setAmounts()
void endImport(); // calls markAsFinished()
void beginFolder(); // adds folder and enters it
void endFolder(); // calls goToParent() (I don't think that we should add Q_INVOKABLE on parent class)
void addBookmark(); // adds bookmark, passing type (url, feed, separator), dates, keyword, title and URL itself

Also, page should disable JavaScript (QWebEngineSettings::JavascriptEnabled), maybe loading images too (QWebEngineSettings::AutoLoadImages).

Emdek avatar Dec 09 '19 20:12 Emdek

@Emdek ok, will update this PR tomorrow.

fnkkio avatar Dec 19 '19 14:12 fnkkio

@Emdek Updated.

fnkkio avatar Dec 21 '19 23:12 fnkkio

@fnkkio, sorry for being late, looks good in general, but needs some minor fixes (mostly style) before merging, I'll post review soon.

Emdek avatar Dec 27 '19 20:12 Emdek