openlibrary
Adding Search / Filter capability to Reading Log
Describe the problem that you'd like solved
(What it would take to) add search capabilities to the Reading Log page.
Proposal & Constraints
Please note that this proposal will not currently work, because book titles, authors, and the other data we'd like to search are not kept in our ReadingLog db table; only the OL identifiers are. We may be able to achieve this with solr in the future. But assuming we did have the desired info in our database (as a thought exercise):
NOTE: First read "Working with Reading Log" here: https://github.com/internetarchive/openlibrary/issues/4267#issuecomment-824151094
- First, we'd need to update the Reading Log html template (https://github.com/internetarchive/openlibrary/blob/master/openlibrary/templates/account/books.html) to include a search box (design task). For a first version, we'd probably use an html form which submits a GET search query, similar to what we have on the author's page: https://openlibrary.org/authors/OL7283091A. In the future, we might want to use javascript (similar to how the real-time Search box works at the top of the website).
- Next, we'd need to update the `public_my_books` controller method in https://github.com/internetarchive/openlibrary/blob/master/openlibrary/plugins/upstream/account.py#L733-L760 to accept a GET parameter. The function already expects a `page` variable to be sent as a GET parameter (https://github.com/internetarchive/openlibrary/blob/master/openlibrary/plugins/upstream/account.py#L738), so accomplishing this should be as straightforward as adding another parameter, e.g. `i = web.input(page=1, search=None)`.
- Where we fetch the patron's books (https://github.com/internetarchive/openlibrary/blob/master/openlibrary/plugins/upstream/account.py#L754), we need to alter the logic to check whether an `i.search` query is present (e.g. `if i.search`). If the `i.search` value is present, we'll need to change the `readlog.get_works` call so that this optional `search` parameter is passed along with our request for matching books.
- `readlog` is an instance of `plugins.upstream.account.ReadingLog` (class defined here: https://github.com/internetarchive/openlibrary/blob/1f57759886b65430d805270830677120c1dc067d/openlibrary/plugins/upstream/account.py#L645). Its `get_works` method (https://github.com/internetarchive/openlibrary/blob/1f57759886b65430d805270830677120c1dc067d/openlibrary/plugins/upstream/account.py#L716) will need to be updated to accept an optional `search` parameter (e.g. `(key, page=1, limit=RESULTS_PER_PAGE, search=None)`). This `ReadingLog.get_works` function essentially uses a `KEYS` dictionary (defined here: https://github.com/internetarchive/openlibrary/blob/1f57759886b65430d805270830677120c1dc067d/openlibrary/plugins/upstream/account.py#L654-L660) to look up and then invoke the proper book-fetching function.
- Each of the corresponding `ReadingLog` methods referenced by the `KEYS` dictionary (namely: `get_waitlisted_editions`, `get_loans`, `get_want_to_read`, `get_currently_reading`, `get_already_read`) must thus also be updated to take an optional `search` parameter. Each of these functions ultimately makes an API call to the same function within our `Bookshelves` API model: `Bookshelves.get_users_logged_books` (https://github.com/internetarchive/openlibrary/blob/master/openlibrary/core/bookshelves.py#L118-L149).
- After a search box form has been added to the `template`, the `public_my_books` view/controller has been edited to expect a `search` parameter, this `search` parameter is forwarded to our `readlog.get_works` call, and the `readlog` object (i.e. the `ReadingLog` class) has been updated to accept an optional `search` parameter, we'll then need to do the hard work of modifying the actual API, `Bookshelves.get_users_logged_books` (the thing which calls the database), to consider an optional search parameter when requesting data from the database: https://github.com/internetarchive/openlibrary/blob/master/openlibrary/core/bookshelves.py#L118-L149.
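The chain of changes described in the list above can be sketched end to end as follows. This is a minimal, self-contained illustration and not the actual Open Library code: `FakeBookshelves`, the in-memory `SHELVES` data, the `RESULTS_PER_PAGE` value, and the `query_params` dict (standing in for `web.input()`) are all assumptions made for the example, and titles are stored alongside ids only as part of the thought exercise. Only the `search=None` signature changes mirror the proposal.

```python
RESULTS_PER_PAGE = 25  # assumed value, for illustration only

# Hypothetical in-memory stand-in for the bookshelves table. The real table
# stores only OL identifiers; titles are included here as a thought exercise.
SHELVES = {
    "want-to-read": [
        {"work_id": "OL1W", "title": "Dune"},
        {"work_id": "OL2W", "title": "Emma"},
    ],
    "already-read": [{"work_id": "OL3W", "title": "Dune Messiah"}],
}


class FakeBookshelves:
    """Stand-in for the Bookshelves model (not the real API)."""

    @classmethod
    def get_users_logged_books(cls, shelf, page=1, limit=RESULTS_PER_PAGE, search=None):
        books = SHELVES.get(shelf, [])
        if search:
            # The "hard work" step: filter only when a query is given.
            needle = search.lower()
            books = [b for b in books if needle in b["title"].lower()]
        start = (page - 1) * limit
        return books[start:start + limit]


class ReadingLog:
    """Simplified stand-in showing how `search` is threaded through."""

    def __init__(self):
        # Maps the shelf key to the method that fetches that shelf.
        self.KEYS = {
            "want-to-read": self.get_want_to_read,
            "already-read": self.get_already_read,
        }

    def get_want_to_read(self, page=1, limit=RESULTS_PER_PAGE, search=None):
        return FakeBookshelves.get_users_logged_books("want-to-read", page, limit, search)

    def get_already_read(self, page=1, limit=RESULTS_PER_PAGE, search=None):
        return FakeBookshelves.get_users_logged_books("already-read", page, limit, search)

    def get_works(self, key, page=1, limit=RESULTS_PER_PAGE, search=None):
        # Look up the fetching method and forward the optional search term.
        return self.KEYS[key](page=page, limit=limit, search=search)


def public_my_books(key, query_params):
    """Stand-in for the controller; `query_params` plays the role of web.input()."""
    page = int(query_params.get("page", 1))
    search = query_params.get("search") or None
    return ReadingLog().get_works(key, page=page, search=search)
```

Because `search` defaults to `None` at every layer, existing callers that pass no search term would keep their current behaviour.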
Related to
#4262, #4267
Stakeholders
@cdrini
Implementation
One possible way to do this is to:
- take the search query...
- get a list of all books on a patron's want to read list (careful, this could be 10k+ books!)
- Fetch the work ids from infobase in bulk for the search query (perhaps only if < 1k titles -- this would serve most people; if there are more than e.g. 1k titles, don't show the search form for now :( -- I know... the people w/ most titles most need search)
- Do a simple check of whether the search query `==` or is `in` the book title.
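The steps above could look roughly like the sketch below. It assumes a hypothetical `fetch_titles` callable that takes a list of work ids and returns a `{work_id: title}` dict, standing in for a bulk infobase lookup; the function name `search_shelf` and the exact cap are also illustrative. A case-insensitive `in` check is used, since it covers the `==` case as well.

```python
MAX_SEARCHABLE = 1000  # cap suggested above; larger shelves get no search form


def search_shelf(work_ids, query, fetch_titles, max_searchable=MAX_SEARCHABLE):
    """Return the work ids whose title contains `query`.

    Returns None when the shelf is too large to search this way, so the
    caller can hide the search form. `fetch_titles` is a hypothetical
    callable: list of work ids -> {work_id: title}.
    """
    if len(work_ids) > max_searchable:
        return None
    needle = query.strip().lower()
    if not needle:
        # An empty query matches everything on the shelf.
        return list(work_ids)
    titles = fetch_titles(work_ids)  # a single bulk request
    return [w for w in work_ids if needle in (titles.get(w) or "").lower()]
```

The cost here is dominated by the bulk fetch, which is why this approach is only plausible for smaller shelves.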
For now, because this is expensive, we probably can't do real-time search (like we do on the topnav).
Perhaps the actual solution is to (a) use solr, (b) have this information mirrored in their archive.org items (privately), or (c) include the book title in the bookshelves db (which may affect performance).
In short we'd probably want a solr query like:
    {
        'fq': 'key:(/works/OL1W OR /works/OL234W)',  # You know, but dynamic
        'q.op': 'AND',
        'q': q,  # User query
        'start': offset,
        'rows': limit,
        'fl': ','.join(DEFAULT_SEARCH_FIELDS),  # From worksearch/code.py
        'qt': 'standard',
        'sort': 'work_count desc',
        'wt': 'json',
        'defType': 'edismax',
        'qf': 'text title^20 author_name^20',
    }
We should DRY this up more because it duplicates some of the search page logic, but this is fine for now. Pass it through `execute_solr_query` to get the books!
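To make the "dynamic" part of the `fq` concrete, here is a small sketch that builds the parameters from a list of work keys. The function name `reading_log_solr_params`, the `page`-to-`offset` conversion, and the placeholder `DEFAULT_SEARCH_FIELDS` list are assumptions for illustration; the real field list lives in worksearch/code.py, and the call to `execute_solr_query` is left out because its signature is not shown here.

```python
# Placeholder only; the real list is defined in worksearch/code.py.
DEFAULT_SEARCH_FIELDS = ["key", "title", "author_name"]


def reading_log_solr_params(work_keys, q, page=1, limit=20):
    """Build Solr params that restrict the user query `q` to `work_keys`."""
    if not work_keys:
        raise ValueError("no work keys to search within")
    offset = (page - 1) * limit
    return {
        # Restrict results to the works on the patron's shelf.
        "fq": "key:(%s)" % " OR ".join(work_keys),
        "q.op": "AND",
        "q": q,  # user query
        "start": offset,
        "rows": limit,
        "fl": ",".join(DEFAULT_SEARCH_FIELDS),
        "qt": "standard",
        "sort": "work_count desc",
        "wt": "json",
        "defType": "edismax",
        "qf": "text title^20 author_name^20",
    }
```

For example, `reading_log_solr_params(["/works/OL1W", "/works/OL234W"], "rowling")` yields the same `fq` string as in the hand-written query above.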
Yes, this would be a big win! e.g. getting the list of all work ids from a patron's reading log shelf and then limiting a solr search to these IDs!
I'd like to work on this, @mekarpeles.
This should do the trick:
    do_search(
        {'q': 'rowling' + ' key:(/works/OL1W OR /works/OL2W)'},
        sort=None,
        page=1,
        rows=20,
    )
Sample mockup just copying authors page
I am looking for an issue for my first contribution. Can you help me here?