readium-js
Text search over entire book
We use jQuery to search through a book. In previous Readium versions, with eager loading enabled, we were able to search through the entire book. But eager loading has been removed.
We would prefer a Readium native way to search through the entire book. This could use the same approach as Total Page Numbers Issue #16 with a second hidden readium and iterating over that.
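To make the idea concrete, here is a minimal sketch of what "iterating over the whole book" could look like once the text of every spine item is available, for example from a hidden second reader instance. It is only an illustration: `searchBook`, the `{ href, text }` shape and the `contextChars` parameter are hypothetical names and not part of any Readium API, and extracting the text from each content document is assumed to happen elsewhere.

```javascript
// Hypothetical helper, not a Readium API.
// Assumption: `spineItems` is an array of { href, text } objects, where `text`
// is the plain text already extracted from one content document.
function searchBook(spineItems, query, contextChars = 30) {
  const hits = [];
  const needle = query.toLowerCase();
  if (needle.length === 0) {
    return hits; // an empty query matches nothing
  }
  for (const item of spineItems) {
    // Case-insensitive match. Assumes lowercasing does not change the string
    // length, which holds for most text but not for every Unicode character.
    const haystack = item.text.toLowerCase();
    let from = 0;
    let index;
    while ((index = haystack.indexOf(needle, from)) !== -1) {
      const start = Math.max(0, index - contextChars);
      const end = Math.min(item.text.length, index + needle.length + contextChars);
      hits.push({
        href: item.href,                      // which content document the hit is in
        offset: index,                        // character offset inside that document's text
        snippet: item.text.slice(start, end), // surrounding context for a result list
      });
      from = index + needle.length;           // continue after this hit
    }
  }
  return hits;
}
```

Each hit carries the document it came from, so a viewer could open that spine item and then locate the match inside it. Mapping the character offset back to a DOM range or CFI is a separate problem that this sketch does not attempt.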
How can you achieve that?
Don't know if this will make m1.1, but will leave it under that milestone for now.
I hear good things about Forage (previously known as Norch): https://github.com/fergiemcdowall/forage
Good work being done here: https://github.com/GermanCentralLibraryForTheBlind/readium-js-viewer/commits/feature/full_text_search
https://github.com/fergiemcdowall/forage
How can the above be used to read an EPUB book, and how can it be integrated with https://github.com/GermanCentralLibraryForTheBlind/readium-js-viewer/commits/feature/full_text_search?
How can text search over the entire book be implemented?
@naveen1941 you can experiment with this Node-based search engine:
https://github.com/larsvoigt/epub-full-text-search
I will post when the mentioned search engine can be run and tested. I think this will be in the near future :-).
Thank you, team. ☺☺
Regards, Naveen Kumar
@larsvoigt : Thanks! Much appreciated
Hi all,
I think I have reached a pre-alpha state of the full-text-search-feature.
Status information follows:
Note:
- The full-text search feature works without limitations only in WebKit-based browsers
- The feature currently supports only the cloud-reader version (support for the browser version may follow in the future)
Running demo
A running demo can be found here: http://fulltextsearch-readium.rhcloud.com/. It combines epub-full-text-search and readium-js-viewer.
Source of demo
The source code of the demo is in the repository cloned in the first step below. To test it, run:
- git clone https://github.com/larsvoigt/readium-full-text-search-demo.git
- npm install
- npm run start
Note: The indexing process starts automatically, and it takes a few seconds until the search service is actually available. Availability is indicated by the CLI status message "all is indexed".
epub-full-text-search usage
If you want to see how this feature can be implemented, check out the branch. To get it running:
- Prerequisites are the same as for the original Readium repo
- npm run build
- npm run dist:cloudReaderWithFullTextSearch
Feature main components:
- Client-Side-Implementation: FullTextSearch.js
- Server-Side-Service: Server.js
Thank you for your feedback :-)
@larsvoigt great thanks! I tried searching for "reilly" in the "accessible EPUB3" ebook hosted at http://fulltextsearch-readium.rhcloud.com => when pressing the next / previous buttons, the highlighted fragments of text (search hits) are sometimes incorrect. Can you reproduce this bug?
Speaking of highlights... I assume you are using the built-in "annotations" plugin, which utilises a div overlay method to render the selection? This plugin is now deactivated in the develop branch, and it is likely to be removed entirely due to memory bugs / obsolescence (CC @JCCR):
https://github.com/readium/readium-shared-js/issues/201
Once the existing CFI bugs have been ironed out, and the architecture improved, it will be a good time to reinstate a more robust version of the highlighting mechanism.
Thanks, @danielweck, for your feedback. I will fix this bug as soon as possible. Indeed, I use the built-in "annotations" plugin. Oops, I am not up to date. Do you mean that highlighting is impossible with the latest develop branch?
As of right now, the develop branch still contains the "annotations" plugin, it is just deactivated. See https://github.com/readium/readium-shared-js/blob/develop/plugins/plugins.cson
You can re-activate it in your custom readium-shared-js/plugins/plugins-override.cson, but the plan is indeed to extract the plugin into a feature branch, so that it does not interfere with future core CFI bug fixes / architecture improvements.
Thank you for sharing this information. For the moment I will use the re-activation option to support highlighting. This way I will stay synchronized with the latest development.
@larsvoigt , note that Juan Corona @JCCR has started work on refactoring the old "annotations" plugin (which was really more about "highlighting" document ranges than a full-blown annotations engine).
See: https://github.com/readium/readium-js-viewer/pull/403 https://github.com/readium/readium-shared-js/pull/212
This is an architecturally-improved version of the old annotations plugin, with some fixes too.
Thanks, @danielweck, for this hint. Great job, @JCCR! Unfortunately I can't test it today, but I will have a look at it next week.
What is the status of this? Search would be highly useful :)
Have you tried @larsvoigt 's open-source server-side indexer / search service? https://github.com/larsvoigt/epub-full-text-search http://protected-dusk-3051.herokuapp.com/?searchbox=popu
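For anyone who wants to call the hosted service from their own code, here is a small sketch of building the query URL. The only thing taken from the example link above is that the search term goes into a `searchbox` query parameter; `buildSearchUrl` is a hypothetical helper name, and nothing is assumed here about the format of the response.

```javascript
// Hypothetical helper. Assumption: the service reads the search term from a
// `searchbox` query parameter, as in the example URL above.
function buildSearchUrl(serviceBase, term) {
  const url = new URL(serviceBase);          // WHATWG URL, available in browsers and Node.js
  url.searchParams.set("searchbox", term);   // percent-encodes the term as needed
  return url.toString();
}
```

For example, `buildSearchUrl("http://protected-dusk-3051.herokuapp.com/", "popu")` reproduces the link above, and terms containing spaces or special characters are encoded automatically.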