Jimmy Lin

Results 211 issues of Jimmy Lin

Capturing discussion with @AileenLin - For `cw12b13`, for whatever reason, even with `-storeDocvector` during indexing, there is at least one document that doesn't have a doc vector. This means that...

As a result, on the Pyserini end, the `LuceneSearcher` and `IndexReader` are completely disconnected.

@justram I believe this was introduced by #1828

```
TypeError: init_query_encoder() missing 1 required positional argument: 'multimodal'
Traceback (most recent call last):
  File "/home/jimmylin/.conda/envs/pyserini-dev3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code,...
```

Here: https://castorini.github.io/pyserini/2cr/msmarco-v2-doc.html We're missing uniCOIL (noexp) and uniCOIL (w/ doc2query-T5) for TREC 2023. @MXueguang can you please add this?

We want to add the latest SPLADE++ ED BEIR regressions here: https://castorini.github.io/pyserini/2cr/beir.html

From @sahel-sh - we can improve the onboarding docs by more accurately characterizing how long things take, on what hardware, RAM/CPU requirements, etc.

@yilinjz @UShivani3 et al. recently had issues getting Pyserini installed... I think we should refactor the installation instructions?

+ I had a start here: https://github.com/castorini/pyserini/pull/1609 but I don't think it's...

It'd be great to have a version of this: https://github.com/castorini/pyserini/blob/master/docs/experiments-nfcorpus.md but using OpenAI embeddings.

#1572 needs a test case.

I'd like to have documentation that has a complete end-to-end example worked out from indexing to retrieval:

+ indexing BM25
+ retrieval using BM25
+ indexing using dense vectors...
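To make the BM25 half of that request concrete, here is a minimal, library-free sketch of the two steps (build an inverted index, then score a query against it). It does not use Pyserini or Anserini: the class `TinyBM25Index`, its methods, the whitespace tokenizer, and the `k1`/`b` values are all assumptions chosen for illustration, and the dense-vector steps are not covered.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


class TinyBM25Index:
    """Toy in-memory inverted index with BM25 scoring (illustration only)."""

    def __init__(self, k1=0.9, b=0.4):
        # k1 and b are illustrative parameter choices, not tuned values.
        self.k1 = k1
        self.b = b
        self.postings = defaultdict(dict)  # term -> {docid: term frequency}
        self.doc_len = {}                  # docid -> number of tokens

    def add(self, docid, text):
        # Indexing step: record the document length and per-term frequencies.
        tokens = tokenize(text)
        self.doc_len[docid] = len(tokens)
        for term, tf in Counter(tokens).items():
            self.postings[term][docid] = tf

    def search(self, query, k=10):
        # Retrieval step: sum BM25 contributions of each query term.
        n_docs = len(self.doc_len)
        if n_docs == 0:
            return []
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            df = len(docs)
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            for docid, tf in docs.items():
                norm = 1 - self.b + self.b * self.doc_len[docid] / avg_len
                scores[docid] += idf * tf * (self.k1 + 1) / (tf + self.k1 * norm)
        # Highest score first; ties broken by docid for determinism.
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


index = TinyBM25Index()
index.add("d1", "bm25 ranks documents by term frequency")
index.add("d2", "dense vectors encode documents")
index.add("d3", "cats sleep")
hits = index.search("bm25 documents", k=2)
```

Here `hits` is a list of `(docid, score)` pairs, best match first. A real end-to-end guide would replace both steps with the toolkit's own indexing and search commands.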