
Text search over entire book

Open pivotal-versapub opened this issue 11 years ago • 19 comments

We use jQuery to search through a book. In previous Readium versions, with eager loading enabled, we were able to search through the entire book. But eager loading has been removed.

We would prefer a Readium-native way to search through the entire book. This could use the same approach as Total Page Numbers (Issue #16): a second, hidden Readium instance, and iterating over that.
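For illustration, here is a minimal sketch of the search step itself, assuming the text of every spine item has already been loaded somehow (for example by the hidden instance mentioned above). The function name `searchBook`, the `{ href, text }` shape, and the `contextChars` parameter are hypothetical and are not part of any Readium API.

```javascript
// Hypothetical helper, not a Readium API.
// `documents` is assumed to be an array of { href, text } objects, one per
// spine item, where `text` is the plain text of that content document.
function searchBook(documents, query, contextChars = 30) {
  const hits = [];
  const needle = query.toLowerCase();
  if (needle.length === 0) return hits; // an empty query matches nothing

  for (const doc of documents) {
    // Case-insensitive match. Offsets assume lowercasing keeps the string
    // length unchanged, which holds for most (not all) Unicode text.
    const haystack = doc.text.toLowerCase();
    let index = haystack.indexOf(needle);
    while (index !== -1) {
      const start = Math.max(0, index - contextChars);
      const end = Math.min(doc.text.length, index + needle.length + contextChars);
      hits.push({
        href: doc.href,                       // which spine item the hit is in
        offset: index,                        // character offset within that item
        snippet: doc.text.slice(start, end),  // surrounding text for display
      });
      // Continue after this hit so matches do not overlap.
      index = haystack.indexOf(needle, index + needle.length);
    }
  }
  return hits;
}

// Usage with made-up content:
const book = [
  { href: "ch1.xhtml", text: "Call me Ishmael." },
  { href: "ch2.xhtml", text: "Ishmael went to sea. ISHMAEL returned." },
];
const results = searchBook(book, "ishmael");
console.log(results.map((hit) => `${hit.href}@${hit.offset}`));
```

This only covers matching over already-extracted text. Mapping a character offset back to a location in the rendered page (for example a CFI) is a separate problem that this sketch does not address.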

pivotal-versapub avatar Jan 17 '14 01:01 pivotal-versapub

How can you achieve that?

jozol avatar Mar 10 '14 04:03 jozol

Don't know if this will make m1.1, but will leave it under that milestone for now.

rkwright avatar Sep 19 '14 17:09 rkwright

I hear good things about Forage (previously-known as Norch): https://github.com/fergiemcdowall/forage

danielweck avatar Sep 19 '14 18:09 danielweck

Good work being done here: https://github.com/GermanCentralLibraryForTheBlind/readium-js-viewer/commits/feature/full_text_search

danielweck avatar May 10 '15 06:05 danielweck

https://github.com/fergiemcdowall/forage

How can I use the above to read an EPUB book? And how can I integrate it with https://github.com/GermanCentralLibraryForTheBlind/readium-js-viewer/commits/feature/full_text_search

naveen1941 avatar Jun 09 '15 10:06 naveen1941

How can text search over the entire book be implemented?

naveen1941 avatar Jun 09 '15 10:06 naveen1941

@naveen1941 you can experiment with this Node-based search engine:

https://github.com/larsvoigt/epub-full-text-search

danielweck avatar Jul 13 '15 10:07 danielweck

I will post when the mentioned search engine can be run and tested. I think this will be in the near future :-).

larsvoigt avatar Jul 13 '15 12:07 larsvoigt

Thank you, team. ☺☺


Regards, Naveen Kumar

naveen1941 avatar Jul 13 '15 14:07 naveen1941

@larsvoigt : Thanks! Much appreciated

rkwright avatar Jul 13 '15 15:07 rkwright

Hi all,

I think I have reached a pre-alpha state of the full-text search feature.

Status information follows:

Note:

  • The full-text search feature works without limitations only in WebKit-based browsers
  • The feature currently supports only the cloud-reader version (support for the browser version may come in the future)

Running demo

A running demo can be found here: http://fulltextsearch-readium.rhcloud.com/. It combines epub-full-text-search and readium-js-viewer.

Source of demo

The source code of the demo can be found here: Source Code. For testing, run:

  • git clone https://github.com/larsvoigt/readium-full-text-search-demo.git
  • npm install
  • npm run start

Note: The indexing process starts automatically, and it takes a few seconds until the search service is actually available. Availability is indicated by the CLI status "all is indexed".

epub-full-text-search usage

If you want to see how this feature can be implemented, check out the branch. To get it running:

  • Prerequisites are the same as for the original Readium repo
  • npm run build
  • npm run dist:cloudReaderWithFullTextSearch

Feature main components:

Thank you for your feedback :-)

larsvoigt avatar Aug 18 '15 13:08 larsvoigt

@larsvoigt great thanks! I tried searching for "reilly" in the "accessible EPUB3" ebook hosted at http://fulltextsearch-readium.rhcloud.com => when pressing the next / previous buttons, the highlighted fragments of text (search hits) are sometimes incorrect. Can you reproduce this bug?

Speaking of highlights...I assume you are using the built-in "annotations" plugin, which utilises a div overlay method to render the selection? This plugin is now deactivated in the develop branch, and it is likely to be removed entirely due to memory bugs / obsolescence (CC @JCCR): https://github.com/readium/readium-shared-js/issues/201 Once the existing CFI bugs have been ironed out, and the architecture improved, it will be a good time to reinstate a more robust version of the highlighting mechanism.

danielweck avatar Aug 18 '15 14:08 danielweck

Thanks @danielweck for your feedback. I will fix this bug as soon as possible. Indeed, I use the built-in "annotations" plugin. Oops, I am not up to date. Do you mean that highlighting is impossible when using the latest develop branch?

larsvoigt avatar Aug 18 '15 14:08 larsvoigt

As of right now, the develop branch still contains the "annotations" plugin, it is just deactivated. See https://github.com/readium/readium-shared-js/blob/develop/plugins/plugins.cson You can re-activate it in your custom readium-shared-js/plugins/plugins-override.cson, but the plan is indeed to extract the plugin into a feature branch, so that it does not interfere with future core CFI bug fixes / architecture improvements.

danielweck avatar Aug 18 '15 14:08 danielweck

Thank you for sharing this information. For now I will use the re-activate option to support highlighting. This way I will stay synchronized with the latest development.

larsvoigt avatar Aug 18 '15 14:08 larsvoigt

@larsvoigt , note that Juan Corona @JCCR has started work on refactoring the old "annotations" plugin (which was really more about "highlighting" document ranges than a full-blown annotations engine).

See: https://github.com/readium/readium-js-viewer/pull/403 https://github.com/readium/readium-shared-js/pull/212

This is an architecturally-improved version of the old annotations plugin, with some fixes too.

danielweck avatar Sep 18 '15 09:09 danielweck

Thanks @danielweck for this hint. Great job @JCCR! Unfortunately I can't test it today, but next week I will have a look at it.

larsvoigt avatar Sep 18 '15 09:09 larsvoigt

What is the status of this? Search would be highly useful :)

brezal avatar Jun 19 '16 19:06 brezal

Have you tried @larsvoigt 's open-source server-side indexer / search service? https://github.com/larsvoigt/epub-full-text-search http://protected-dusk-3051.herokuapp.com/?searchbox=popu

danielweck avatar Jun 30 '16 19:06 danielweck