tantivy icon indicating copy to clipboard operation
tantivy copied to clipboard

About range query performance questions.

Open MochiXu opened this issue 1 year ago • 12 comments

Recently, I'm interpreting tantivy in ClickHouse. Here is the schema in tantivy:

let mut schema_builder = Schema::builder();
    schema_builder.add_u64_field("row_id", INDEXED | STORED | FAST);
    schema_builder.add_text_field("text", TEXT);
    let schema = schema_builder.build();

    let mut index = match Index::create_in_dir(dir_str, schema) {
        Ok(index) => index,
        Err(_) => return std::ptr::null_mut(),
    };

And this is my FFI function in Rust, it will consume 0.99s.

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    let query_c_str = unsafe {
        assert!(!query_ptr.is_null());
        CStr::from_ptr(query_ptr)
    };

    let schema = unsafe { (*ir).index.schema() };
    let row_id = schema.get_field("row_id").expect("missing field row_id");
    let text = schema.get_field("text").expect("missing field text");

    let searcher = unsafe { (*ir).reader.searcher() };

    let range_query = RangeQuery::new_u64("row_id".to_string(), lrange..rrange);

    let top_docs = searcher.search(&range_query, &Docs::with_limit(std::u64::MAX as usize)).expect("failed to search");

    false
}

If only text search is applied, it will only consume 0.2s.

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    let query_c_str = unsafe {
        assert!(!query_ptr.is_null());
        CStr::from_ptr(query_ptr)
    };
    let query_str = query_c_str.to_str().expect("failed to get &str from cstr");
    let query_string = query_str.to_string();

    let schema = unsafe { (*ir).index.schema() };
    let row_id = schema.get_field("row_id").expect("missing field row_id");
    let text = schema.get_field("text").expect("missing field text");

    let searcher = unsafe { (*ir).reader.searcher() };

    let query_parser = QueryParser::for_index(unsafe { &(*ir).index }, vec![text]);
    let text_query = query_parser.parse_query(query_str).expect("tantivy failed to parse query");
    
    let top_docs = searcher.search(&text_query, &Docs::with_limit(std::u64::MAX as usize)).expect("failed to search");

    false
}

In fact, I want to filter the row_id using lrange and rrange first, and then do a text search based on the filtered results. I thought this would speed up the query results, but now it looks like range queries are too slow. The data set I used was wiki abstract 5 million, with each text containing about 200 words.

Would you please give me some advice on what I can do to get a more efficient query😨

MochiXu avatar Nov 20 '23 07:11 MochiXu

Can you retest without setting the field fast? If the field is fast we use the columnar storage for range queries. But if only few terms hit the range, a query on the inverted index would be better.

PSeitz avatar Nov 20 '23 08:11 PSeitz

@PSeitz Thank you very much for your reply. I use INDEXED retest it. Here I give a minimum test case, the time consume is lower.

You can download the wiki_5m dataset from this link, the datasets is about 1GB. The variable recreate_index in main function can decide whether recreate tantivy index or not. If you changed field_options for test, please modify recreate_index together.

Here is some information about test case:

  • range query: I will use RangeQuery::new_u64("row_id".to_string(), 80000..90000) to search from index.
  • text query: I will use term volunteer to search.
  • boolean query: combine range query and text query.
BooleanQuery::new(vec![
      (Occur::Must, Box::new(range_query)),
      (Occur::Must, Box::new(text_query)),
  ]);

Here is the test result:

  • row_id field_options is FAST:
range query answer size:10000, range_duration:280ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:333ms
  • row_id field_options is INDEXED:
range query answer size:10000, range_duration:31ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:30ms

One more thing I want to figure out: In boolean query, I want to know the order in which range query and text query are executed 😺.

Here is my minimum test case:

extern crate tantivy;

use std::fs;
use std::path::Path;
use std::time::Instant;

use tantivy::collector::Count;
use tantivy::query::BooleanQuery;
use tantivy::query::QueryParser;
use tantivy::query::RangeQuery;
use tantivy::query_grammar::Occur;
use tantivy::schema::*;
use tantivy::Index;
use tantivy::ReloadPolicy;
use serde::{Deserialize, Serialize};


#[derive(Deserialize, Serialize)]
struct Item {
    id: u64,
    title: String,
    body: String,
}


fn recreate_or_load_index(index_path: &Path, recreate: bool, schema: &Schema) -> tantivy::Result<Index> {
    if recreate {
        if index_path.exists() {
            let _ = fs::remove_dir_all(index_path);
        }
        let _ = fs::create_dir_all(index_path);
        let mut index = Index::create_in_dir(index_path, schema.clone()).expect("create index error");
        let _ = index.set_default_multithread_executor();
        Ok(index)
    } else {
        let mut index = Index::open_in_dir(index_path).expect("msg");
        let _ = index.set_default_multithread_executor();
        Ok(index)
    }
}


fn index_from_json_file(index:&Index, json_file_path: &Path, schema: &Schema) -> tantivy::Result<()> {
    // Create index_writer and set merge policy.
    let mut index_writer = index.writer_with_num_threads(8, 1024 * 1024 * 1024 * 8).unwrap();
    let mut merge_policy = tantivy::merge_policy::LogMergePolicy::default();
    merge_policy.set_max_docs_before_merge(100_000);
    merge_policy.set_min_num_segments(4);
    index_writer.set_merge_policy(Box::new(merge_policy));

    let row_id = schema.get_field("row_id").unwrap();
    let title = schema.get_field("title").unwrap();
    let body = schema.get_field("body").unwrap();

    // Read datasets.
    let json_content = fs::read_to_string(json_file_path).expect("file should be read");
    let documents: Vec<Item> = serde_json::from_str(&json_content)?;

    // Index data from json.
    for doc in documents {
        let mut temp = Document::default();
        temp.add_u64(row_id, doc.id);
        temp.add_text(title, doc.title);
        temp.add_text(body, doc.body);
        let _ = index_writer.add_document(temp);
    }
    index_writer.commit()?;

    // Release merging threads.
    let segment_ids = index
    .searchable_segment_ids()
    .expect("Searchable segments failed.");
    if segment_ids.len() > 1 {
        index_writer.merge(&segment_ids).wait()?;
        index_writer.wait_merging_threads()?;
    }
    Ok(())

}

fn main() -> tantivy::Result<()> {
    // Tantivy index files local path.
    let index_path = Path::new("/home/mochix/workspace_github/tantivy_demo/index_path");
    // Datasets, you can download it from AWS S3: wget https://myscale-example-datasets.s3.amazonaws.com/wiki_560w.json
    let json_file_path = Path::new("/home/mochix/workspace_github/tantivy_demo/wiki_560w.json");
    // If you want to recreate tantivy index, make it be true.
    let recreate_index = true;

    let mut schema_builder = Schema::builder();
    schema_builder.add_u64_field("row_id", FAST);
    schema_builder.add_text_field("title", TEXT);
    schema_builder.add_text_field("body", TEXT);
    let schema = schema_builder.build();
    
    let index = recreate_or_load_index(index_path, recreate_index, &schema)?;
    if recreate_index {
        let _ = index_from_json_file(&index, json_file_path, &schema);
    }

    let reader = index
    .reader_builder()
    .reload_policy(ReloadPolicy::OnCommit)
    .try_into()?;
    let searcher = reader.searcher();

    let range_query_start = Instant::now();
    let range_query = RangeQuery::new_u64("row_id".to_string(), 80000..90000);
    let range_size = searcher.search(&range_query, &Count).expect("range query error");
    let range_query_duration = range_query_start.elapsed().as_millis();
    println!("range query answer size:{:?}, range_duration:{:?}ms", range_size, range_query_duration);

    let text_query_start = Instant::now();
    let query_parser = QueryParser::for_index(&index, vec![schema.get_field("body").unwrap()]);
    let text_query = query_parser.parse_query("volunteer")?;
    let text_ans_size = searcher.search(&text_query, &Count)?;
    let text_query_duration = text_query_start.elapsed().as_millis();
    println!("text query answer size:{:?}, range_duration:{:?}ms", text_ans_size, text_query_duration);

    let boolean_query_start = Instant::now();
    let boolean_query = BooleanQuery::new(vec![
        (Occur::Must, Box::new(range_query)),
        (Occur::Must, Box::new(text_query)),
    ]);
    let boolean_ans_size = searcher.search(&boolean_query, &Count)?;
    let boolean_query_duration = boolean_query_start.elapsed().as_millis();
    println!("boolean query answer size:{:?}, range_duration:{:?}ms", boolean_ans_size, boolean_query_duration);


    Ok(())
}

MochiXu avatar Nov 20 '23 09:11 MochiXu

These numbers seem off. Are you running in release mode? What CPU are you on?

In boolean query, I want to know the order in which range query and text query are executed 😺.

They run at the same time, but with INDEXED, a datastructure is created beforehand for efficient intersection.

PSeitz avatar Nov 20 '23 09:11 PSeitz

Where can I find out whether this test case is running in release or debug mode? My test case Cargo.toml is:

[package]
name = "tantivy_demo"
version = "0.1.0"
edition = "2018"

[dependencies]
tantivy = "0.21.1"
tempfile = "3.2"
serde = { version = "1.0.127", features = ["derive"] }
serde_json = "1.0.66"

[[bin]]
name = "tantivy_demo"
path = "src/main.rs"

Here is my cpu info:

❯ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 6900HX with Radeon Graphics
    CPU family:          25
    Model:               68
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            1
    Frequency boost:     enabled
    CPU max MHz:         4933.8862
    CPU min MHz:         1600.0000
    BogoMIPS:            6587.56
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmpe
                         rf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topo
                         ext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveo
                         pt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter
                          pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    4 MiB (8 instances)
  L3:                    16 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Mitigation; safe RET, no microcode
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

MochiXu avatar Nov 20 '23 09:11 MochiXu

You just need to add --release to you cargo run or cargo test command

fulmicoton avatar Nov 20 '23 10:11 fulmicoton

Here is test result in release mode:

  • row_id field_options is INDEXED:
❯ ./target/release/tantivy_demo 
range query answer size:10000, range_duration:5ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:5ms

MochiXu avatar Nov 20 '23 11:11 MochiXu

What about with FAST?

PSeitz avatar Nov 20 '23 11:11 PSeitz

@PSeitz Here is FAST in release mode

❯ ./target/release/tantivy_demo
range query answer size:10000, range_duration:29ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:26ms

Based on the results from the release mode of INDEXED, a search through 5.6 million lines of text takes 0ms. However, the speed decreases when combined with a range query(5ms). My requirement is to determine if any row_id within the specified range is present in the results of a text search.

Could you offer some suggestions? The integration of Tantivy into ClickHouse demands stringent requirements regarding search latency, which is quite critical in our process.

MochiXu avatar Nov 20 '23 13:11 MochiXu

I’ve realized that by using a custom Collector, I can gather the results for row_id. My current schema is as follows:

schema_builder.add_u64_field("row_id", FAST | INDEXED);
schema_builder.add_text_field("title", TEXT);
schema_builder.add_text_field("body", TEXT);

And the code for the custom Collector is provided below (code file: row_id_collector.rs):

use tantivy::columnar::Column;
// ---
// Importing tantivy...
use tantivy::collector::{Collector, SegmentCollector};
use tantivy::query::QueryParser;
use tantivy::schema::{Schema, FAST, INDEXED, TEXT, Field};
use tantivy::{doc, Index, Score, SegmentReader};


pub struct RowIdCollector {
    pub row_id_field: String,
}

impl RowIdCollector {
    pub fn with_field(row_id_field: String) -> RowIdCollector {
        RowIdCollector { row_id_field }
    }
}

impl Collector for RowIdCollector {
    // That's the type of our result.
    // Our standard deviation will be a float.
    type Fruit = Vec<u64>;

    type Child = RowIdSegmentCollector;

    fn for_segment(
        &self,
        _segment_local_id: u32,
        segment_reader: &SegmentReader,
    ) -> tantivy::Result<Self::Child> {
        let row_id_reader = segment_reader.fast_fields().u64(&self.row_id_field)?;
        Ok(RowIdSegmentCollector {
            row_id_reader: row_id_reader,
            row_ids: Vec::new(),
        })
    }

    fn requires_scoring(&self) -> bool {
        // this collector does not care about score.
        false
    }

    fn merge_fruits(&self, segment_row_ids: Vec<Self::Fruit>) -> tantivy::Result<Self::Fruit> {
        Ok(segment_row_ids.into_iter().flatten().collect())
    }
}

pub struct RowIdSegmentCollector {
    row_id_reader: Column,
    row_ids: Vec<u64>,
}

impl SegmentCollector for RowIdSegmentCollector {
    type Fruit = Vec<u64>;

    fn collect(&mut self, doc: u32, _score: Score) {
        // Since we know the values are single value, we could call `first_or_default_col` on the
        // column and fetch single values.
        for value in self.row_id_reader.values_for_doc(doc) {
            self.row_ids.push(value);
        }
    }

    fn harvest(self) -> <Self as SegmentCollector>::Fruit {
        self.row_ids
    }
}

And here is the test case result in release mode:

❯ ./target/release/tantivy_demo
range query answer size:10000, range_query_duration: 24.703 ms
text query answer size:2574, text_query_duration: 0.118 ms
boolean query answer size:5, boolean_query_duration: 11.758 ms
row_id answer size:2574, row_id_collector_query_duration: 0.777 ms

I am interested in knowing how to modify this custom Collector to accelerate the query if the indexing method for the row_id column is INDEXED rather than FAST | INDEXED.

Based on yesterday's experience, using INDEXED yielded better results than FAST. Therefore, I am considering changing the indexing method of row_id to INDEXED. This means I need to modify the for_segment method of my custom Collector. I am curious about how to find similar example code in the Tantivy source code. Could you guide me on this? 🥺

MochiXu avatar Nov 21 '23 03:11 MochiXu

Based on the results from the release mode of INDEXED, a search through 5.6 million lines of text takes 0ms.

That's not how it works, it's 1 lookup in the inverted index and return the docs.

However, the speed decreases when combined with a range query(5ms).

Range Query with INDEXED does 10_000 lookups. Range Query with FAST does a fullscan over the 5.6million docs. This is close to a pathological case how the intersection algorithm works in this case. The distribution of the docs from the search is quite uniform. The skipping is relatively short and the fast field range query always scans chunks of docs.

The issue here is that seek, doesn't just move the cursor foward, but also loads the next doc in this DocSet.

E.g. if DocSet1 (Term DocSet) has docids [1_000_000, 2_000_000] and DocSet2 (FastField DocSet) is empty. DocSet1 would call seek(1_000_000) on DocSet2. There's no next hit on DocSet2 and would try to load the next docid, doing a fullscan.

Show DocIds For TermQuery

hit:2693 hit:5232 hit:8717 hit:11396 hit:11551 hit:15126 hit:18079 hit:19384 hit:26570 hit:29538 hit:33228 hit:34408 hit:43163 hit:45752 hit:48777 hit:49029 hit:52346 hit:60052 hit:65910 hit:70748 hit:71233 hit:78036 hit:79163 hit:81731 hit:82433 hit:84981 hit:85891 hit:91327 hit:97348 hit:97378 hit:97758 hit:102461 hit:113383 hit:116688 hit:118032 hit:118098 hit:119591 hit:122237 hit:126246 hit:128203 hit:128808 hit:130609 hit:131208 hit:136276 hit:140294 hit:140756 hit:140788 hit:144749 hit:146657 hit:146886 hit:150969 hit:151223 hit:153866 hit:154855 hit:161728 hit:162534 hit:163081 hit:169103 hit:169606 hit:171902 hit:172141 hit:172989 hit:173039 hit:173055 hit:173120 hit:173523 hit:174259 hit:175109 hit:178578 hit:182622 hit:183159 hit:184821 hit:188031 hit:190642 hit:193326 hit:195727 hit:198021 hit:200637 hit:201507 hit:201692 hit:205145 hit:205309 hit:205464 hit:214898 hit:224189 hit:225944 hit:235564 hit:250388 hit:253567 hit:253594 hit:256686 hit:257085 hit:260438 hit:260441 hit:260443 hit:260471 hit:260556 hit:260695 hit:260853 hit:260856 hit:262210 hit:265224 hit:265929 hit:266464 hit:269289 hit:269304 hit:269492 hit:269552 hit:274301 hit:275523 hit:279053 hit:279288 hit:279680 hit:279681 hit:287601 hit:287757 hit:289911 hit:294220 hit:294605 hit:295453 hit:296057 hit:317717 hit:321075 hit:321104 hit:321271 hit:321547 hit:321548 hit:324347 hit:324852 hit:327414 hit:328677 hit:329098 hit:329325 hit:333065 hit:334275 hit:336724 hit:337091 hit:337226 hit:338980 hit:341889 hit:343365 hit:346549 hit:347006 hit:350734 hit:351281 hit:352102 hit:352677 hit:353489 hit:353612 hit:353637 hit:353675 hit:354015 hit:355071 hit:355654 hit:356922 hit:358636 hit:359561 hit:361215 hit:363456 hit:365885 hit:366302 hit:366392 hit:366739 hit:366858 hit:369073 hit:371240 hit:371273 hit:371278 hit:371396 hit:376732 hit:376741 hit:376765 hit:376865 hit:376867 hit:378830 hit:378838 hit:381111 hit:382131 hit:382326 hit:383138 hit:390957 hit:390962 hit:391452 hit:401437 hit:410925 hit:410961 hit:416546 hit:416693 hit:417824 hit:421726 hit:424114 hit:424477 hit:426936 hit:428321 hit:428324 hit:430319 hit:430577 hit:432384 hit:434909 hit:435970 hit:436055 hit:438227 hit:438229 hit:438240 hit:438380 hit:441515 hit:441670 hit:444199 hit:444397 hit:446093 hit:449578 hit:453278 hit:455825 hit:457231 hit:458012 hit:461202 hit:461672 hit:463472 hit:463894 hit:466791 hit:474577 hit:475090 hit:475274 hit:475561 hit:478483 hit:480062 hit:485510 hit:485797 hit:489664 hit:501235 hit:502758 hit:504514 hit:514297 hit:515840 hit:516208 hit:516646 hit:519344 hit:520160 hit:520887 hit:528502 hit:530476 hit:532617 hit:534463 hit:534675 hit:536355 hit:539558 hit:540091 hit:542569 hit:542610 hit:543991 hit:544015 hit:549092 hit:551747 hit:557294 hit:558018 hit:558538 hit:560895 hit:564717 hit:566862 hit:568494 hit:569793 hit:580969 hit:581346 hit:582921 hit:585367 hit:586647 hit:587988 hit:588773 hit:589440 hit:593060 hit:594596 hit:594854 hit:598487 hit:601092 hit:609049 hit:610482 hit:610641 hit:612068 hit:613822 hit:613852 hit:617220 hit:624665 hit:626328 hit:626867 hit:627869 hit:627922 hit:638118 hit:638809 hit:640929 hit:641326 hit:647145 hit:652021 hit:654964 hit:657166 hit:657984 hit:659303 hit:659958 hit:661623 hit:665022 hit:667303 hit:669093 hit:674252 hit:675250 hit:678847 hit:683480 hit:683893 hit:687706 hit:694781 hit:695060 hit:702240 hit:704129 hit:704552 hit:706300 hit:709321 hit:711058 hit:711625 hit:717560 hit:717901 hit:725322 hit:731287 hit:731905 hit:735227 hit:737216 hit:739476 hit:739668 hit:740040 hit:741701 hit:741934 hit:744268 hit:745075 hit:745499 hit:747539 hit:750061 hit:763469 hit:764636 hit:765217 hit:766904 hit:769449 hit:769844 hit:772304 hit:774323 hit:774977 hit:780940 hit:784451 hit:793998 hit:797961 hit:800547 hit:803977 hit:804360 hit:809740 hit:811075 hit:811785 hit:812053 hit:815216 hit:816234 hit:818049 hit:819325 hit:819648 hit:824601 hit:827316 hit:828184 hit:839416 hit:841115 hit:841980 hit:844209 hit:844933 hit:845144 hit:856010 hit:857727 hit:857855 hit:861044 hit:862292 hit:871727 hit:872951 hit:877247 hit:878443 hit:879749 hit:882472 hit:885263 hit:888391 hit:888893 hit:890169 hit:891148 hit:891547 hit:891838 hit:891884 hit:892508 hit:895794 hit:896147 hit:898598 hit:900531 hit:900656 hit:901316 hit:910982 hit:912664 hit:913125 hit:913290 hit:916834 hit:919642 hit:928322 hit:932800 hit:932933 hit:934272 hit:937384 hit:937667 hit:940496 hit:940563 hit:940565 hit:941818 hit:941950 hit:944511 hit:960576 hit:960584 hit:963738 hit:971520 hit:978576 hit:978980 hit:980912 hit:980914 hit:981285 hit:982384 hit:982417 hit:984976 hit:985842 hit:986517 hit:986937 hit:994232 hit:996030 hit:996450 hit:996852 hit:997630 hit:1001034 hit:1002189 hit:1002221 hit:1002241 hit:1002357 hit:1003919 hit:1005699 hit:1006224 hit:1007212 hit:1012150 hit:1016877 hit:1017032 hit:1017036 hit:1022833 hit:1023606 hit:1027128 hit:1033266 hit:1034528 hit:1035041 hit:1035390 hit:1038109 hit:1041024 hit:1041388 hit:1041570 hit:1042684 hit:1042947 hit:1044098 hit:1044099 hit:1044765 hit:1044839 hit:1048065 hit:1050599 hit:1051445 hit:1051736 hit:1054004 hit:1056466 hit:1060403 hit:1061396 hit:1062053 hit:1063426 hit:1064080 hit:1064242 hit:1064243 hit:1064246 hit:1070149 hit:1071236 hit:1072025 hit:1072027 hit:1074228 hit:1074877 hit:1076583 hit:1076830 hit:1081607 hit:1084697 hit:1087104 hit:1089016 hit:1089057 hit:1093966 hit:1098872 hit:1099505 hit:1099559 hit:1100360 hit:1102428 hit:1104284 hit:1104385 hit:1104434 hit:1105129 hit:1105138 hit:1105241 hit:1105922 hit:1108173 hit:1109881 hit:1112343 hit:1113713 hit:1115922 hit:1115929 hit:1116939 hit:1119995 hit:1124259 hit:1126037 hit:1126414 hit:1129001 hit:1130226 hit:1132469 hit:1134273 hit:1135042 hit:1135498 hit:1136562 hit:1137991 hit:1140744 hit:1146185 hit:1147361 hit:1148405 hit:1149042 hit:1149237 hit:1150540 hit:1152934 hit:1153705 hit:1156184 hit:1160394 hit:1169210 hit:1169674 hit:1174884 hit:1175371 hit:1176909 hit:1186355 hit:1187785 hit:1187881 hit:1188067 hit:1188900 hit:1189220 hit:1192271 hit:1193136 hit:1194846 hit:1195563 hit:1197135 hit:1197466 hit:1199007 hit:1200612 hit:1201145 hit:1207944 hit:1210758 hit:1211996 hit:1212361 hit:1217431 hit:1217902 hit:1218421 hit:1221992 hit:1222195 hit:1224113 hit:1226465 hit:1226984 hit:1226986 hit:1227311 hit:1229740 hit:1232712 hit:1233181 hit:1233245 hit:1235299 hit:1236974 hit:1237924 hit:1242540 hit:1242576 hit:1244804 hit:1245317 hit:1246104 hit:1251734 hit:1255405 hit:1255646 hit:1255748 hit:1256521 hit:1258974 hit:1261177 hit:1262938 hit:1263750 hit:1263969 hit:1264009 hit:1267095 hit:1268987 hit:1269597 hit:1276201 hit:1277197 hit:1277491 hit:1278496 hit:1281239 hit:1287945 hit:1288341 hit:1289178 hit:1292083 hit:1294713 hit:1296431 hit:1301959 hit:1305313 hit:1306556 hit:1306675 hit:1309687 hit:1309932 hit:1312640 hit:1314073 hit:1321869 hit:1323679 hit:1329393 hit:1332643 hit:1332761 hit:1335138 hit:1339367 hit:1343039 hit:1343088 hit:1344162 hit:1350806 hit:1353831 hit:1354197 hit:1354342 hit:1354543 hit:1356532 hit:1359120 hit:1359289 hit:1362555 hit:1362880 hit:1370038 hit:1376831 hit:1381568 hit:1384438 hit:1384665 hit:1384774 hit:1385479 hit:1390037 hit:1390557 hit:1391224 hit:1393797 hit:1394647 hit:1397043 hit:1397457 hit:1399244 hit:1399406 hit:1400804 hit:1406066 hit:1410709 hit:1413025 hit:1417189 hit:1417808 hit:1418784 hit:1419368 hit:1421408 hit:1422273 hit:1430091 hit:1430500 hit:1433113 hit:1435051 hit:1439721 hit:1443985 hit:1450178 hit:1454518 hit:1454535 hit:1456531 hit:1459519 hit:1461856 hit:1468509 hit:1468595 hit:1471102 hit:1482465 hit:1486867 hit:1488989 hit:1489194 hit:1495516 hit:1495722 hit:1496979 hit:1498910 hit:1504289 hit:1507975 hit:1508103 hit:1508986 hit:1519451 hit:1519761 hit:1519840 hit:1524130 hit:1532692 hit:1533295 hit:1534825 hit:1535786 hit:1537014 hit:1540407 hit:1543894 hit:1556102 hit:1557507 hit:1558471 hit:1559466 hit:1561221 hit:1568243 hit:1568604 hit:1569532 hit:1571790 hit:1579705 hit:1580404 hit:1582094 hit:1583937 hit:1585428 hit:1586491 hit:1590317 hit:1590673 hit:1590692 hit:1590699 hit:1599071 hit:1599668 hit:1604517 hit:1608169 hit:1611045 hit:1611162 hit:1611469 hit:1611881 hit:1612840 hit:1615711 hit:1619409 hit:1619953 hit:1620145 hit:1620725 hit:1622263 hit:1627081 hit:1628994 hit:1630163 hit:1632178 hit:1638630 hit:1644968 hit:1645293 hit:1647985 hit:1650270 hit:1652071 hit:1656505 hit:1656730 hit:1660015 hit:1660021 hit:1660753 hit:1661257 hit:1663279 hit:1663753 hit:1664057 hit:1664612 hit:1673182 hit:1675252 hit:1676676 hit:1681023 hit:1683600 hit:1685861 hit:1692669 hit:1696556 hit:1697005 hit:1697171 hit:1698179 hit:1698240 hit:1702071 hit:1702913 hit:1706196 hit:1709325 hit:1709860 hit:1710641 hit:1717272 hit:1718448 hit:1718456 hit:1718686 hit:1718708 hit:1719063 hit:1722493 hit:1727765 hit:1728117 hit:1728190 hit:1728778 hit:1739573 hit:1744500 hit:1744610 hit:1752051 hit:1753369 hit:1762048 hit:1762551 hit:1763772 hit:1764624 hit:1765331 hit:1769819 hit:1770930 hit:1771053 hit:1771071 hit:1771252 hit:1771596 hit:1771599 hit:1774215 hit:1778464 hit:1778599 hit:1780857 hit:1785792 hit:1786462 hit:1786552 hit:1787475 hit:1787901 hit:1788498 hit:1788501 hit:1788523 hit:1790216 hit:1792891 hit:1795321 hit:1796201 hit:1800649 hit:1801156 hit:1801581 hit:1805202 hit:1805416 hit:1810574 hit:1810711 hit:1810726 hit:1810935 hit:1812640 hit:1813528 hit:1814010 hit:1815292 hit:1816717 hit:1819599 hit:1819706 hit:1822909 hit:1823702 hit:1827828 hit:1828236 hit:1828360 hit:1828365 hit:1828366 hit:1829162 hit:1831161 hit:1833450 hit:1834144 hit:1834334 hit:1835004 hit:1835263 hit:1835820 hit:1839850 hit:1841742 hit:1841809 hit:1842288 hit:1845411 hit:1845826 hit:1846196 hit:1846256 hit:1846626 hit:1851582 hit:1855589 hit:1857988 hit:1858357 hit:1860962 hit:1870795 hit:1872256 hit:1873099 hit:1875646 hit:1878332 hit:1878785 hit:1879204 hit:1883244 hit:1886663 hit:1889316 hit:1889379 hit:1889785 hit:1890768 hit:1891458 hit:1892895 hit:1893116 hit:1894914 hit:1895232 hit:1895270 hit:1895893 hit:1896729 hit:1896992 hit:1897990 hit:1902567 hit:1907645 hit:1908571 hit:1914453 hit:1915109 hit:1919956 hit:1919959 hit:1920486 hit:1925406 hit:1925456 hit:1925551 hit:1927982 hit:1928608 hit:1931412 hit:1936685 hit:1937153 hit:1940482 hit:1941824 hit:1943199 hit:1943821 hit:1946248 hit:1946566 hit:1946659 hit:1950082 hit:1952205 hit:1952451 hit:1953724 hit:1957159 hit:1959844 hit:1963287 hit:1963791 hit:1964854 hit:1968171 hit:1969101 hit:1971388 hit:1971824 hit:1974397 hit:1974407 hit:1974749 hit:1975242 hit:1977048 hit:1982394 hit:1987825 hit:1992315 hit:1996256 hit:2002373 hit:2005547 hit:2008083 hit:2008101 hit:2008242 hit:2008358 hit:2008570 hit:2011258 hit:2014495 hit:2015461 hit:2017213 hit:2017599 hit:2019162 hit:2028529 hit:2029264 hit:2029361 hit:2034952 hit:2036878 hit:2037042 hit:2037713 hit:2040161 hit:2040817 hit:2042139 hit:2042242 hit:2052803 hit:2053987 hit:2055755 hit:2056081 hit:2069935 hit:2070147 hit:2070251 hit:2081238 hit:2082669 hit:2083891 hit:2085956 hit:2088449 hit:2091338 hit:2100133 hit:2100448 hit:2101420 hit:2106610 hit:2115043 hit:2116549 hit:2123359 hit:2125214 hit:2133229 hit:2138015 hit:2140311 hit:2145344 hit:2152951 hit:2153568 hit:2156724 hit:2158327 hit:2162496 hit:2164986 hit:2167231 hit:2168562 hit:2173624 hit:2174076 hit:2174227 hit:2176011 hit:2178396 hit:2192146 hit:2196682 hit:2208346 hit:2209052 hit:2209071 hit:2214325 hit:2219185 hit:2221619 hit:2223179 hit:2223752 hit:2223987 hit:2225748 hit:2229648 hit:2229811 hit:2236059 hit:2238264 hit:2240638 hit:2243182 hit:2247854 hit:2249814 hit:2252835 hit:2254338 hit:2259737 hit:2262195 hit:2264389 hit:2264485 hit:2265891 hit:2266450 hit:2269351 hit:2284010 hit:2286839 hit:2289093 hit:2290730 hit:2291159 hit:2292768 hit:2295035 hit:2297463 hit:2297632 hit:2299783 hit:2301135 hit:2303915 hit:2306189 hit:2312908 hit:2312913 hit:2317641 hit:2319272 hit:2330251 hit:2330268 hit:2331164 hit:2331431 hit:2331726 hit:2331884 hit:2333104 hit:2335146 hit:2339176 hit:2339254 hit:2339672 hit:2339677 hit:2339685 hit:2339688 hit:2340962 hit:2341474 hit:2349380 hit:2349502 hit:2349519 hit:2350744 hit:2351033 hit:2367460 hit:2368320 hit:2369253 hit:2369449 hit:2372032 hit:2374170 hit:2374198 hit:2374353 hit:2375339 hit:2376565 hit:2376907 hit:2382539 hit:2385812 hit:2386504 hit:2386986 hit:2393581 hit:2395443 hit:2397437 hit:2403783 hit:2404755 hit:2406114 hit:2408196 hit:2408322 hit:2408605 hit:2409857 hit:2412128 hit:2412453 hit:2414303 hit:2424618 hit:2424788 hit:2424829 hit:2426894 hit:2427869 hit:2431680 hit:2432295 hit:2432545 hit:2432581 hit:2440824 hit:2440859 hit:2443540 hit:2444996 hit:2447623 hit:2448963 hit:2450149 hit:2450265 hit:2450268 hit:2450908 hit:2459347 hit:2459383 hit:2460419 hit:2460567 hit:2463752 hit:2465238 hit:2465668 hit:2467021 hit:2467028 hit:2472387 hit:2475166 hit:2476820 hit:2483813 hit:2484755 hit:2485114 hit:2485946 hit:2487591 hit:2491784 hit:2493883 hit:2493887 hit:2495683 hit:2497340 hit:2502164 hit:2503344 hit:2504310 hit:2506328 hit:2508283 hit:2510893 hit:2511008 hit:2511149 hit:2512305 hit:2512332 hit:2512380 hit:2512550 hit:2513004 hit:2513809 hit:2516268 hit:2516293 hit:2518181 hit:2518540 hit:2521149 hit:2521620 hit:2525247 hit:2525612 hit:2528427 hit:2528742 hit:2529046 hit:2529065 hit:2532957 hit:2533075 hit:2539173 hit:2539211 hit:2540430 hit:2541394 hit:2542756 hit:2542894 hit:2542973 hit:2542975 hit:2546691 hit:2546733 hit:2547086 hit:2547314 hit:2547320 hit:2547323 hit:2547338 hit:2547343 hit:2547350 hit:2547353 hit:2550402 hit:2550403 hit:2553689 hit:2555351 hit:2557660 hit:2558038 hit:2558484 hit:2562731 hit:2567757 hit:2570033 hit:2574368 hit:2574520 hit:2574771 hit:2580666 hit:2583047 hit:2583482 hit:2585907 hit:2589010 hit:2589122 hit:2590363 hit:2590380 hit:2590982 hit:2593201 hit:2593260 hit:2593266 hit:2593846 hit:2598369 hit:2598953 hit:2599511 hit:2599719 hit:2603295 hit:2604633 hit:2604644 hit:2605081 hit:2607738 hit:2608725 hit:2609954 hit:2621211 hit:2621556 hit:2624111 hit:2625619 hit:2626361 hit:2628357 hit:2630207 hit:2640961 hit:2644752 hit:2646940 hit:2647404 hit:2654940 hit:2655016 hit:2655675 hit:2655827 hit:2661344 hit:2661531 hit:2661610 hit:2662403 hit:2667405 hit:2668719 hit:2669308 hit:2670575 hit:2671926 hit:2674703 hit:2674947 hit:2677287 hit:2679722 hit:2684980 hit:2686252 hit:2686617 hit:2688206 hit:2692549 hit:2700168 hit:2701296 hit:2706210 hit:2707857 hit:2709264 hit:2710397 hit:2712442 hit:2714920 hit:2715662 hit:2719144 hit:2723005 hit:2726645 hit:2727519 hit:2727700 hit:2727815 hit:2729238 hit:2731013 hit:2738321 hit:2741902 hit:2742041 hit:2746296 hit:2747820 hit:2749986 hit:2751246 hit:2755396 hit:2756143 hit:2757357 hit:2759303 hit:2762093 hit:2764232 hit:2770067 hit:2770298 hit:2770966 hit:2771694 hit:2774275 hit:2775941 hit:2779655 hit:2779928 hit:2783137 hit:2785045 hit:2788434 hit:2791243 hit:2794281 hit:2796227 hit:2800094 hit:2801591 hit:2801953 hit:2804032 hit:2804766 hit:2806727 hit:2810194 hit:2823101 hit:2826085 hit:2830673 hit:2831071 hit:2834270 hit:2837482 hit:2837591 hit:2838018 hit:2841751 hit:2841973 hit:2842847 hit:2843530 hit:2843624 hit:2847580 hit:2853878 hit:2854128 hit:2854404 hit:2855064 hit:2860732 hit:2863414 hit:2865244 hit:2869580 hit:2869916 hit:2876554 hit:2880365 hit:2882814 hit:2884776 hit:2893565 hit:2902334 hit:2905384 hit:2905713 hit:2906175 hit:2906498 hit:2906937 hit:2915486 hit:2921149 hit:2921256 hit:2921817 hit:2922352 hit:2923136 hit:2927486 hit:2928978 hit:2929288 hit:2930018 hit:2931813 hit:2940458 hit:2940621 hit:2941072 hit:2944607 hit:2945008 hit:2945598 hit:2951335 hit:2953588 hit:2954616 hit:2955326 hit:2960370 hit:2962196 hit:2964540 hit:2965549 hit:2967470 hit:2967863 hit:2971769 hit:2978425 hit:2985318 hit:2986910 hit:2988429 hit:2989233 hit:2989805 hit:2989809 hit:2994336 hit:2995973 hit:3001155 hit:3003161 hit:3003444 hit:3005480 hit:3006431 hit:3007412 hit:3011774 hit:3013912 hit:3016751 hit:3018978 hit:3022824 hit:3031921 hit:3032474 hit:3032505 hit:3032520 hit:3034820 hit:3035414 hit:3036725 hit:3037467 hit:3049805 hit:3050555 hit:3055119 hit:3055829 hit:3061377 hit:3061414 hit:3063866 hit:3068963 hit:3074661 hit:3076317 hit:3079036 hit:3079126 hit:3082422 hit:3084137 hit:3084315 hit:3087078 hit:3088235 hit:3088606 hit:3088974 hit:3089130 hit:3092364 hit:3095813 hit:3099050 hit:3100265 hit:3102035 hit:3113970 hit:3115546 hit:3116987 hit:3118199 hit:3119706 hit:3121798 hit:3122405 hit:3123611 hit:3124220 hit:3127395 hit:3130137 hit:3134318 hit:3134710 hit:3135876 hit:3137196 hit:3138627 hit:3138692 hit:3139190 hit:3139258 hit:3140222 hit:3145909 hit:3146036 hit:3148419 hit:3148576 hit:3149998 hit:3152732 hit:3152982 hit:3154177 hit:3157566 hit:3161005 hit:3161126 hit:3161922 hit:3165357 hit:3166399 hit:3167310 hit:3167529 hit:3170105 hit:3171242 hit:3174055 hit:3174149 hit:3174513 hit:3176980 hit:3183184 hit:3183209 hit:3183239 hit:3184185 hit:3192817 hit:3192825 hit:3192963 hit:3193551 hit:3195667 hit:3198568 hit:3199478 hit:3200608 hit:3202189 hit:3202234 hit:3204678 hit:3206166 hit:3210784 hit:3210973 hit:3212276 hit:3212339 hit:3212492 hit:3212495 hit:3212962 hit:3212985 hit:3213161 hit:3216065 hit:3217454 hit:3217878 hit:3218453 hit:3220660 hit:3222620 hit:3223839 hit:3225018 hit:3225275 hit:3225412 hit:3226933 hit:3229209 hit:3229323 hit:3229932 hit:3229936 hit:3229940 hit:3230301 hit:3234534 hit:3234823 hit:3234909 hit:3235038 hit:3235227 hit:3235254 hit:3235273 hit:3235279 hit:3235799 hit:3240075 hit:3242538 hit:3242540 hit:3243024 hit:3245673 hit:3246702 hit:3247345 hit:3247348 hit:3247367 hit:3247388 hit:3247403 hit:3247411 hit:3247464 hit:3247522 hit:3250393 hit:3252152 hit:3252384 hit:3252872 hit:3259812 hit:3264704 hit:3266311 hit:3270818 hit:3271893 hit:3281896 hit:3284453 hit:3294348 hit:3296799 hit:3297393 hit:3298315 hit:3299633 hit:3304264 hit:3306372 hit:3306584 hit:3308095 hit:3314409 hit:3315726 hit:3315800 hit:3315816 hit:3320601 hit:3320619 hit:3320623 hit:3325849 hit:3326258 hit:3331749 hit:3332207 hit:3334164 hit:3337538 hit:3338559 hit:3339383 hit:3343551 hit:3345979 hit:3353701 hit:3358945 hit:3359429 hit:3360050 hit:3361255 hit:3365473 hit:3366530 hit:3368704 hit:3373833 hit:3375734 hit:3378754 hit:3381529 hit:3382583 hit:3384840 hit:3384917 hit:3388511 hit:3389143 hit:3394448 hit:3396745 hit:3396884 hit:3398449 hit:3401997 hit:3405187 hit:3409043 hit:3409728 hit:3411704 hit:3413031 hit:3416887 hit:3418014 hit:3420073 hit:3421262 hit:3430284 hit:3435894 hit:3445512 hit:3450016 hit:3451771 hit:3452911 hit:3453489 hit:3453538 hit:3453646 hit:3454873 hit:3459726 hit:3461068 hit:3461150 hit:3461299 hit:3468092 hit:3471862 hit:3472661 hit:3478892 hit:3480284 hit:3481032 hit:3485207 hit:3491996 hit:3502974 hit:3506619 hit:3508650 hit:3510128 hit:3513243 hit:3513926 hit:3516831 hit:3519976 hit:3519997 hit:3524293 hit:3532072 hit:3533017 hit:3533767 hit:3536204 hit:3538990 hit:3540477 hit:3541398 hit:3542576 hit:3552368 hit:3554374 hit:3557056 hit:3566925 hit:3568139 hit:3568572 hit:3569369 hit:3570076 hit:3571917 hit:3574006 hit:3576554 hit:3593029 hit:3596696 hit:3599101 hit:3600250 hit:3602941 hit:3603052 hit:3603283 hit:3604315 hit:3606181 hit:3606322 hit:3611188 hit:3611261 hit:3616043 hit:3621981 hit:3626265 hit:3627233 hit:3634599 hit:3636635 hit:3639719 hit:3649266 hit:3650629 hit:3652364 hit:3656406 hit:3661894 hit:3663902 hit:3667488 hit:3672720 hit:3675135 hit:3678833 hit:3678916 hit:3678947 hit:3682752 hit:3685306 hit:3689515 hit:3690643 hit:3692056 hit:3697805 hit:3701126 hit:3703035 hit:3703513 hit:3703629 hit:3707454 hit:3712733 hit:3717913 hit:3725501 hit:3725508 hit:3727213 hit:3728570 hit:3728587 hit:3733111 hit:3734134 hit:3735076 hit:3736714 hit:3738253 hit:3743044 hit:3746531 hit:3746537 hit:3747089 hit:3766265 hit:3768218 hit:3768809 hit:3771390 hit:3774179 hit:3774954 hit:3784261 hit:3789210 hit:3791508 hit:3792137 hit:3792149 hit:3792202 hit:3793386 hit:3793783 hit:3794093 hit:3796676 hit:3796974 hit:3801241 hit:3802886 hit:3803670 hit:3803675 hit:3804331 hit:3805281 hit:3818233 hit:3818835 hit:3820337 hit:3820423 hit:3823114 hit:3825799 hit:3826788 hit:3827860 hit:3828866 hit:3832307 hit:3832319 hit:3836629 hit:3841544 hit:3842271 hit:3842274 hit:3842315 hit:3842501 hit:3843712 hit:3843841 hit:3848656 hit:3849811 hit:3849815 hit:3849854 hit:3849871 hit:3850767 hit:3857806 hit:3857873 hit:3857874 hit:3859692 hit:3861242 hit:3863999 hit:3865025 hit:3865041 hit:3865067 hit:3866220 hit:3866513 hit:3867449 hit:3869886 hit:3870900 hit:3871531 hit:3871544 hit:3874433 hit:3874712 hit:3876445 hit:3878585 hit:3882831 hit:3884166 hit:3887070 hit:3887157 hit:3888291 hit:3890218 hit:3892457 hit:3894325 hit:3895211 hit:3895362 hit:3895386 hit:3895698 hit:3895746 hit:3898916 hit:3901317 hit:3901517 hit:3907286 hit:3910366 hit:3910375 hit:3910378 hit:3910379 hit:3910559 hit:3910917 hit:3911049 hit:3911051 hit:3914416 hit:3915350 hit:3915710 hit:3916539 hit:3917093 hit:3919309 hit:3920596 hit:3920687 hit:3920767 hit:3922415 hit:3922711 hit:3924504 hit:3924771 hit:3924777 hit:3926165 hit:3927401 hit:3928097 hit:3928349 hit:3928403 hit:3928538 hit:3931710 hit:3933178 hit:3934536 hit:3937349 hit:3939740 hit:3939755 hit:3939760 hit:3939782 hit:3939900 hit:3944070 hit:3945173 hit:3945624 hit:3945658 hit:3945764 hit:3945769 hit:3946579 hit:3946968 hit:3947598 hit:3948389 hit:3958039 hit:3958101 hit:3958987 hit:3962071 hit:3968099 hit:3968211 hit:3968214 hit:3968296 hit:3969697 hit:3976566 hit:3979434 hit:3982145 hit:3986370 hit:3991221 hit:3991257 hit:3993198 hit:3994816 hit:3996881 hit:4003136 hit:4004791 hit:4005117 hit:4005263 hit:4008416 hit:4009448 hit:4009866 hit:4010552 hit:4011351 hit:4012075 hit:4012765 hit:4014735 hit:4018444 hit:4022771 hit:4023711 hit:4025844 hit:4028377 hit:4030651 hit:4030787 hit:4031844 hit:4034918 hit:4035890 hit:4036496 hit:4036630 hit:4037368 hit:4038492 hit:4042771 hit:4049498 hit:4051438 hit:4056690 hit:4056692 hit:4057386 hit:4060459 hit:4062692 hit:4063536 hit:4065697 hit:4067351 hit:4070317 hit:4077459 hit:4086314 hit:4086669 hit:4087066 hit:4089593 hit:4093434 hit:4096093 hit:4099503 hit:4099746 hit:4101801 hit:4102180 hit:4104709 hit:4108602 hit:4111489 hit:4114428 hit:4114476 hit:4115564 hit:4115779 hit:4115877 hit:4115973 hit:4118138 hit:4123139 hit:4125453 hit:4125613 hit:4126233 hit:4126605 hit:4127282 hit:4127511 hit:4127701 hit:4128013 hit:4129085 hit:4133274 hit:4145601 hit:4147048 hit:4152140 hit:4153450 hit:4154665 hit:4156638 hit:4156980 hit:4157133 hit:4158116 hit:4158852 hit:4161931 hit:4163641 hit:4169310 hit:4174099 hit:4174867 hit:4174978 hit:4179535 hit:4194251 hit:4197296 hit:4208094 hit:4208727 hit:4212166 hit:4213497 hit:4221854 hit:4224140 hit:4229173 hit:4233893 hit:4234465 hit:4237620 hit:4238977 hit:4239099 hit:4241623 hit:4241652 hit:4247651 hit:4247894 hit:4251670 hit:4252204 hit:4252255 hit:4261435 hit:4263388 hit:4269233 hit:4277070 hit:4279601 hit:4280039 hit:4289949 hit:4292490 hit:4292621 hit:4298406 hit:4298650 hit:4302896 hit:4303650 hit:4304490 hit:4316638 hit:4321512 hit:4321527 hit:4321905 hit:4326033 hit:4326117 hit:4326237 hit:4327774 hit:4330065 hit:4331216 hit:4334333 hit:4334392 hit:4335136 hit:4335289 hit:4335482 hit:4336469 hit:4339019 hit:4339358 hit:4349893 hit:4352256 hit:4353690 hit:4354476 hit:4356810 hit:4359710 hit:4366664 hit:4367699 hit:4368141 hit:4368721 hit:4369987 hit:4373079 hit:4373290 hit:4376775 hit:4380648 hit:4381517 hit:4388960 hit:4390014 hit:4390209 hit:4395688 hit:4396539 hit:4397843 hit:4397845 hit:4405427 hit:4415994 hit:4419725 hit:4422297 hit:4424018 hit:4426458 hit:4430704 hit:4433807 hit:4434514 hit:4435128 hit:4436944 hit:4440108 hit:4450040 hit:4451272 hit:4452071 hit:4452564 hit:4452807 hit:4455858 hit:4459724 hit:4462245 hit:4462294 hit:4464760 hit:4465915 hit:4467009 hit:4467128 hit:4468344 hit:4477360 hit:4479253 hit:4480541 hit:4480850 hit:4481186 hit:4482340 hit:4483443 hit:4483481 hit:4486583 hit:4486655 hit:4489888 hit:4490202 hit:4497378 hit:4497775 hit:4499948 hit:4503266 hit:4503496 hit:4508112 hit:4509521 hit:4509573 hit:4510162 hit:4511571 hit:4518198 hit:4518206 hit:4518266 hit:4523068 hit:4528087 hit:4528146 hit:4532143 hit:4534742 hit:4536489 hit:4536889 hit:4538687 hit:4542634 hit:4543194 hit:4544252 hit:4544910 hit:4547930 hit:4549471 hit:4555599 hit:4556364 hit:4556809 hit:4557028 hit:4558664 hit:4561199 hit:4562682 hit:4563921 hit:4564104 hit:4566977 hit:4571749 hit:4571840 hit:4574357 hit:4574445 hit:4578809 hit:4581074 hit:4581603 hit:4583070 hit:4583804 hit:4585269 hit:4585365 hit:4587048 hit:4587432 hit:4587650 hit:4589633 hit:4594951 hit:4598008 hit:4598065 hit:4598115 hit:4598353 hit:4598357 hit:4598367 hit:4600870 hit:4601865 hit:4602293 hit:4602426 hit:4607724 hit:4608597 hit:4608638 hit:4609110 hit:4609134 hit:4610954 hit:4611061 hit:4611846 hit:4611847 hit:4612369 hit:4613402 hit:4616265 hit:4616559 hit:4616927 hit:4616932 hit:4617001 hit:4617007 hit:4617013 hit:4618296 hit:4618685 hit:4619752 hit:4623180 hit:4623460 hit:4627318 hit:4627368 hit:4633846 hit:4635131 hit:4635822 hit:4635981 hit:4636454 hit:4636470 hit:4636982 hit:4638770 hit:4642477 hit:4644335 hit:4644617 hit:4644722 hit:4646031 hit:4649134 hit:4653315 hit:4656225 hit:4658676 hit:4658714 hit:4659213 hit:4662296 hit:4662315 hit:4662426 hit:4663980 hit:4667392 hit:4667654 hit:4670605 hit:4670738 hit:4672333 hit:4673774 hit:4673924 hit:4673931 hit:4678837 hit:4687226 hit:4690162 hit:4693014 hit:4694896 hit:4695565 hit:4696062 hit:4703658 hit:4704060 hit:4710253 hit:4710395 hit:4711925 hit:4717591 hit:4717905 hit:4718070 hit:4719790 hit:4729136 hit:4729728 hit:4736320 hit:4741117 hit:4744005 hit:4744500 hit:4745444 hit:4745683 hit:4747610 hit:4748348 hit:4752686 hit:4756006 hit:4756457 hit:4756564 hit:4757428 hit:4758232 hit:4758547 hit:4772566 hit:4772660 hit:4773000 hit:4774066 hit:4774444 hit:4775330 hit:4778120 hit:4778260 hit:4781641 hit:4789957 hit:4790815 hit:4791610 hit:4795474 hit:4801259 hit:4801717 hit:4804093 hit:4805754 hit:4805916 hit:4808785 hit:4810246 hit:4811236 hit:4813981 hit:4821388 hit:4823603 hit:4824216 hit:4826395 hit:4835313 hit:4836867 hit:4840417 hit:4843347 hit:4845311 hit:4845803 hit:4846525 hit:4850822 hit:4852331 hit:4852345 hit:4857788 hit:4859768 hit:4862035 hit:4866107 hit:4870070 hit:4871542 hit:4875811 hit:4877092 hit:4877410 hit:4882825 hit:4883886 hit:4884021 hit:4893333 hit:4903744 hit:4903964 hit:4904710 hit:4906093 hit:4906259 hit:4910241 hit:4912149 hit:4913795 hit:4919328 hit:4919365 hit:4924145 hit:4924632 hit:4925928 hit:4926271 hit:4928637 hit:4933750 hit:4934275 hit:4942181 hit:4942771 hit:4943812 hit:4949065 hit:4963428 hit:4965341 hit:4973243 hit:4973350 hit:4976968 hit:4979300 hit:4980819 hit:4986947 hit:4989160 hit:4991836 hit:4992379 hit:5001674 hit:5003541 hit:5006749 hit:5008874 hit:5009488 hit:5010656 hit:5012705 hit:5014266 hit:5020284 hit:5023591 hit:5028318 hit:5028751 hit:5032935 hit:5039863 hit:5042063 hit:5044227 hit:5047652 hit:5048153 hit:5048702 hit:5052553 hit:5054381 hit:5057026 hit:5059001 hit:5062355 hit:5065283 hit:5074063 hit:5074982 hit:5075614 hit:5076344 hit:5077182 hit:5081233 hit:5082819 hit:5084416 hit:5085415 hit:5085426 hit:5088261 hit:5091056 hit:5097289 hit:5099888 hit:5101019 hit:5101344 hit:5102899 hit:5105658 hit:5105934 hit:5105935 hit:5106042 hit:5107642 hit:5109596 hit:5110126 hit:5113107 hit:5113522 hit:5115212 hit:5119392 hit:5119498 hit:5119504 hit:5122947 hit:5123017 hit:5127396 hit:5127494 hit:5127642 hit:5128017 hit:5128082 hit:5129401 hit:5131138 hit:5134216 hit:5137005 hit:5139698 hit:5141886 hit:5142999 hit:5150071 hit:5150635 hit:5150642 hit:5150812 hit:5150817 hit:5151170 hit:5151563 hit:5155277 hit:5155674 hit:5157681 hit:5161576 hit:5163605 hit:5166690 hit:5166736 hit:5166738 hit:5166854 hit:5166877 hit:5167974 hit:5168477 hit:5168910 hit:5169217 hit:5169523 hit:5169988 hit:5172002 hit:5176666 hit:5180037 hit:5185396 hit:5188339 hit:5188566 hit:5188748 hit:5196599 hit:5197125 hit:5199958 hit:5201800 hit:5203826 hit:5204024 hit:5206292 hit:5206403 hit:5208863 hit:5213194 hit:5214186 hit:5214550 hit:5214912 hit:5214917 hit:5223306 hit:5233487 hit:5233776 hit:5237249 hit:5241026 hit:5241191 hit:5244196 hit:5244579 hit:5248736 hit:5248883 hit:5248887 hit:5248899 hit:5251855 hit:5254275 hit:5255343 hit:5255345 hit:5255424 hit:5255559 hit:5260356 hit:5261859 hit:5262962 hit:5263400 hit:5263744 hit:5265793 hit:5267725 hit:5269507 hit:5273412 hit:5273571 hit:5273572 hit:5273864 hit:5276274 hit:5280773 hit:5280828 hit:5284753 hit:5288490 hit:5288677 hit:5289982 hit:5292629 hit:5296047 hit:5300856 hit:5302518 hit:5302837 hit:5304178 hit:5307785 hit:5307787 hit:5309026 hit:5309065 hit:5311492 hit:5312805 hit:5312905 hit:5315699 hit:5315869 hit:5318733 hit:5320685 hit:5321041 hit:5322004 hit:5325280 hit:5330160 hit:5330180 hit:5330182 hit:5330189 hit:5330193 hit:5330262 hit:5330265 hit:5332198 hit:5333129 hit:5333725 hit:5337556 hit:5338889 hit:5338928 hit:5339900 hit:5344390 hit:5344866 hit:5347201 hit:5347411 hit:5350292 hit:5352796 hit:5357091 hit:5359983 hit:5360050 hit:5362470 hit:5364789 hit:5368404 hit:5369361 hit:5370044 hit:5370606 hit:5372416 hit:5375024 hit:5379350 hit:5382809 hit:5391962 hit:5392748 hit:5393846 hit:5394641 hit:5396163 hit:5399581 hit:5400600 hit:5400797 hit:5400996 hit:5402993 hit:5403362 hit:5407228 hit:5407302 hit:5411939 hit:5412407 hit:5412976 hit:5418367 hit:5420173 hit:5424684 hit:5426245 hit:5428657 hit:5434305 hit:5434486 hit:5441513 hit:5442562 hit:5442770 hit:5445885 hit:5446363 hit:5446513 hit:5447622 hit:5453347 hit:5456866 hit:5457166 hit:5461448 hit:5462110 hit:5464127 hit:5464794 hit:5465778 hit:5466911 hit:5468117 hit:5469307 hit:5471382 hit:5473198 hit:5473919 hit:5476851 hit:5476934 hit:5477533 hit:5477963 hit:5478070 hit:5479906 hit:5481040 hit:5484836 hit:5485759 hit:5488706 hit:5490768 hit:5495465 hit:5497395 hit:5503764 hit:5504476 hit:5511547 hit:5515569 hit:5521754 hit:5527364 hit:5529979 hit:5530092 hit:5534006 hit:5538303 hit:5541905 hit:5549914 hit:5550026 hit:5550526 hit:5554826 hit:5558235 hit:5562122 hit:5564068 hit:5569276 hit:5569693 hit:5572724 hit:5572971 hit:5577321 hit:5577580 hit:5579384 hit:5579464 hit:5579780 hit:5581770 hit:5582751 hit:5586361 hit:5588369 hit:5589645 hit:5589650 hit:5590343 hit:5591309 hit:5594968 hit:5599340 hit:5602824 hit:5603198 hit:5606095 hit:5608529 hit:5614911 hit:5618517

size:2574

Could you offer some suggestions? The integration of Tantivy into ClickHouse demands stringent requirements regarding search latency, which is quite critical in our process.

What are the requirements?

I am interested in knowing how to modify this custom Collector to accelerate the query if the indexing method for the row_id column is INDEXED rather than FAST | INDEXED.

The code for range queries on the fast fields FAST or inverted index INDEXED is here: https://github.com/quickwit-oss/tantivy/tree/main/src/query/range_query.

The biggest gain would probably be on this type of query to sort the index by row_id and then exploit the sorting in the RangeDocSet, by finding the relevant docid range via a binary search.

PSeitz avatar Nov 21 '23 03:11 PSeitz

Thank you for your patient explanation, now I understand the scanning method of the FAST index. My specific requirement is to write an FFI search function for C++ usage.

The tantivy index schema structure looks like:

let mut schema_builder = Schema::builder();
schema_builder.add_u64_field("row_id", FAST | INDEXED);
schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();

This FFI search function takes three important parameters:

  • query_ptr the string used for searching
  • lrange the left interval of row_id_range
  • rrange the right interval of row_id_range.

It looks like:

pub struct IndexR {
    pub path: String,
    pub index: Index,
    pub reader: IndexReader,
}

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    // ...
}

The function returns a boolean value. This function will search for the string in the tantivy index and collect a set of corresponding row_ids from the search results. Then, it will determine if there is an overlap between this row_ids set and the row_id_range provided by the function parameters [lrange, rrange]. If there is an overlap, it returns true, otherwise false.

🎯 Everything became clear after that.

1️⃣ ➡️ Initially, I tried using the following more traditional logic to determine if the row_ids overlap with the row_id_range:

let docs_results = searcher.search(&query, &Docs::with_limit(limit as usize)).expect("failed to search");
for (_score, doc_address) in docs_results {
    let row_id_column = &row_id_columns[doc_address.segment_ord as usize];
    let row_id_value = row_id_column.values_for_doc(doc_address.doc_id).next().unwrap_or(0);
    if lrange <= row_id_value && row_id_value <= rrange {
        return true;
    }
}
false

2️⃣ ➡️ Later, I planned to optimize this FFI interface. The goal of the optimization is to reduce latency. So, I thought of combining a range query. First, I would limit the row_id according to the function's parameters [lrange, rrange], and then perform a text search within this specified range [lrange, rrange]. If a result is found, this FFI function would immediately return true; otherwise, it would return false. The code would look something like this:

let range_query = RangeQuery::new_u64("row_id".to_string(), lrange..rrange);
let query_parser = QueryParser::for_index(&index, vec![schema.get_field("text").unwrap()]);
let text_query = query_parser.parse_query("volunteer")?;
let boolean_query = BooleanQuery::new(vec![
    (Occur::Must, Box::new(range_query)),
    (Occur::Must, Box::new(text_query)),
]);
let boolean_ans_size = searcher.search(&boolean_query, &Count)?;
if boolean_ans_size !=0 {
    return true;
}
false

However, the latency when using a range query was still relatively high, as shown in the previous results I posted. Executing a text search alone takes 0.118 ms, but adding a range query increases the latency to 5ms (with the row_id field using an INDEXED only index).

3️⃣ ➡️ Therefore, I started to consider whether it would be possible to work on the custom Collector used in the text search. By directly collecting the row_ids corresponding to the search results in the Collector, it's feasible to determine during the collector process whether the [lrange, rrange] range has been hit. Code looks like:

let row_id_hit_size = searcher.search(&text_query, &RowIdCollector::with_field("row_id".to_string(), lrange, rrange))?;
if row_id_hit_size.unwrap().get_hit_count() !=0 {
    return true;
}
false

The custom Collector code is below:

use tantivy::columnar::Column;
use tantivy::collector::{Collector, SegmentCollector};
use tantivy::{ Score, SegmentReader};

#[derive(Default)]
pub struct RowIdRangeHit {
    hit: bool,
    hit_count: u64,
}

impl RowIdRangeHit {
    pub fn new() -> RowIdRangeHit {
        RowIdRangeHit {
            hit: false,
            hit_count: 0,
        }
    }

    pub fn get_hit_count(&self) -> u64 {
        self.hit_count
    }

    pub fn mark_hit(&mut self) {
        self.hit_count += 1;
        self.hit = true;
    }

    pub fn is_hit(&self) -> bool {
        self.hit
    }
}

pub struct RowIdCollector {
    pub row_id_field: String,
    pub lrange: u64,
    pub rrange: u64,
}

impl RowIdCollector {
    pub fn with_field(row_id_field: String, lrange: u64, rrange: u64) -> RowIdCollector {
        RowIdCollector { row_id_field, lrange, rrange }
    }
}

impl Collector for RowIdCollector {
    type Fruit = Option<RowIdRangeHit>;
    type Child = RowIdSegmentCollector;

    fn for_segment(
        &self,
        _segment_local_id: u32,
        segment_reader: &SegmentReader,
    ) -> tantivy::Result<Self::Child> {
        let row_id_reader_ = segment_reader.fast_fields().u64(&self.row_id_field)?;
        Ok(RowIdSegmentCollector {
            row_id_reader: row_id_reader_,
            row_id_range_hit: RowIdRangeHit::new(),
            lrange: self.lrange,
            rrange: self.rrange,
        })
    }

    fn requires_scoring(&self) -> bool {
        // this collector does not care about score.
        false
    }

    fn merge_fruits(&self, segment_row_ids: Vec<Self::Fruit>) -> tantivy::Result<Self::Fruit> {
        let mut row_id_range_hit = RowIdRangeHit::new();
        for segment_row_id_range_hit in segment_row_ids.into_iter().flatten() {
            if segment_row_id_range_hit.is_hit() {
                row_id_range_hit.hit = true;
                row_id_range_hit.hit_count += segment_row_id_range_hit.get_hit_count();
            }
        }
        Ok(Some(row_id_range_hit))
    }
}

pub struct RowIdSegmentCollector {
    row_id_reader: Column,
    row_id_range_hit: RowIdRangeHit,
    pub lrange: u64,
    pub rrange: u64,
}

impl SegmentCollector for RowIdSegmentCollector {
    type Fruit = Option<RowIdRangeHit>;

    fn collect(&mut self, doc: u32, _score: Score) {
        for row_id_ in self.row_id_reader.values_for_doc(doc) {
            if row_id_ >= self.lrange && row_id_ <= self.rrange {
                self.row_id_range_hit.mark_hit();
            }
        }
    }

    fn harvest(self) -> <Self as SegmentCollector>::Fruit {
        Some(self.row_id_range_hit)
    }
}

This approach is currently the most effective method I have found for minimizing latency. I tested this custom Collector in the test code mentioned earlier (using the row_id with a FAST | INDEXED indexing method). The search latency was reduced to 0.717 ms, which is very exciting.

range query answer size:10000, range_query_duration: 15.182 ms
text query answer size:2574, text_query_duration: 0.126 ms
boolean query answer size:5, boolean_query_duration: 12.160 ms
row_id hit count size:5, row_id_collector_query_duration: 0.717 ms

I'm wondering if changing the indexing method of row_id to INDEXED only could further accelerate the entire query process. I suspect that in cases where there are a large number of search results, the efficiency of this custom Collector might also be affected by the row_id FAST index. For instance, if I search for the word "a", I might get results like the following:

❯ ./target/release/tantivy_demo                   
range query answer size:10000, range_query_duration: 53.900 ms
text query answer size:1796537, text_query_duration: **3.695 ms**
boolean query answer size:3109, boolean_query_duration: 11.944 ms
row_id hit count size:3109, row_id_collector_query_duration: **11.278 ms**

So, if I change the indexing method of row_id to INDEXED only, how should I modify the for_segment function in my custom Collector? The fast_fields used inside this function would require the indexing method of row_id to be FAST, right?

Thank you for your patient reading.💗

MochiXu avatar Nov 21 '23 06:11 MochiXu

My specific requirement is to write an FFI search function for C++ usage.

I meant the "stringent requirements regarding search latency"

So, if I change the indexing method of row_id to INDEXED only, how should I modify the for_segment function in my custom Collector? The fast_fields used inside this function would require the indexing method of row_id to be FAST, right?

Similar to the range_query. It's slower for large ranges.

DocSets And Intersection

The problem can be simplified two DocSets, each set hits a list of DocIds.

DocSet1  DocSet2
1
2
3
..
100_000  100_000
100_001
...
200_000  200_000

The intersection would result in two DocIds [100_000, 200_000].

On its own, DocSet1 is bad for the INDEXED variant. It also largely depends on the number of terms though, since we need to load all the docs per term and build the BitSet from it. On its own, DocSet2 is bad for FAST, because there is no index so we need to scan the whole index, although there are just 2 docs. With intersection it gets more complicated, on which DocSet is going to drive the seek. Currently it seems to alternate, but that doesn't seem the best choice since DocSets may have quite different performance characteristics regarding seek.

Currently tantivy always uses the columnar storage with FAST, but that should be conditional.

You collector manually implements the intersection. It alleviates the problem that seek also forwards to the next DocId. It also generates new issues, since it is the final destination it can't cause seek on another DocSet, therefore there is no seek on the Term Query.

Early Exit

Your use case is also special, that you would want to early exit the query after the first hit. There is no such mechanism currently, but it could be added by changing how the query is driven with some parameter. There several methods, but one of them is: https://github.com/quickwit-oss/tantivy/blob/main/src/query/weight.rs#L26

PSeitz avatar Nov 21 '23 09:11 PSeitz