Document with facet cannot be deleted
I encountered a strange issue. The document has three fields: fields a and b are of type u64, and field c is a facet field. I found that records with a facet value added cannot be deleted, whereas records without the facet can be deleted normally using a term query. Why is this happening?
I am using the latest version.
To clarify, it seems that after modifying a document, it cannot be deleted. The modification is done by first calling delete_query, then add_document, and finally commit.
Can you provide some code to reproduce?
#[tokio::test]
async fn test2() {
    use tantivy::schema::{FAST, INDEXED, STORED, STRING};
    let mut builder = Schema::builder();
    let a = builder.add_u64_field("a", INDEXED | FAST);
    let b = builder.add_text_field("b", STRING | STORED | FAST);
    let schema = builder.build();
    let index = Index::create_in_ram(schema);
    let mut index_writer: IndexWriter = index.writer(50_000_000).unwrap();
    let delete_term1 = Term::from_field_u64(a, 1u64);
    let delete_term2 = Term::from_field_u64(a, 2u64);
    let delete_term3 = Term::from_field_u64(a, 3u64);
    let operations = vec![
        //UserOperation::Delete(delete_term1),
        UserOperation::Add(doc!(
            a => 1u64,
            b => "test1"
        )),
        //UserOperation::Delete(delete_term2),
        UserOperation::Add(doc!(
            a => 2u64,
            b => "test1"
        )),
        //UserOperation::Delete(delete_term3),
        UserOperation::Add(doc!(
            a => 3u64,
            b => "test1"
        )),
    ];
    index_writer.run(operations).unwrap();
    index_writer.commit().unwrap();
    let reader = index.reader().unwrap();
    let searcher = reader.searcher();
    let tq = TermQuery::new(Term::from_field_u64(a, 3), IndexRecordOption::Basic);
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    if let Some((_, doc_address)) = docs.first() {
        let old_doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        let mut new_doc = TantivyDocument::new();
        for (field, value) in old_doc.field_values() {
            if field == a {
                new_doc.add_field_value(a, value);
            }
        }
        new_doc.add_text(b, "test2");
        let delete_term = Term::from_field_u64(a, 3);
        index_writer.delete_term(delete_term);
        index_writer.commit().unwrap();
        index_writer.add_document(new_doc).unwrap();
        index_writer.commit().unwrap();
    }
    reader.reload().unwrap();
    let searcher = reader.searcher();
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    if let Some((_, doc_address)) = docs.first() {
        let doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        for (field, value) in doc.field_values() {
            if field == b {
                let value = value.as_str();
                println!("{:#?}", value);
            }
        }
    } else {
        println!("not found")
    }
    let delete_term = Term::from_field_u64(a, 3);
    index_writer.delete_term(delete_term);
    index_writer.commit().unwrap();
    reader.reload().unwrap();
    let searcher = reader.searcher();
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    if let Some((_, doc_address)) = docs.first() {
        let doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        for (field, value) in doc.field_values() {
            if field == b {
                let value = value.as_str();
                println!("{:#?}", value);
            }
        }
    } else {
        println!("not found")
    }
}
A code flow similar to this one currently produces results that do not match my expectations. However, I found that when I added STORED to the field defined in this line: let a = builder.add_u64_field("a", INDEXED | FAST); things started working correctly. Why is that?
I don't see any facets in your example. Can you provide a minimal example with an assertion?
@PSeitz Sorry, I sent an example with a text field because I later realized that text and facet fields behave the same way here. This example is about these two fields:
let a = builder.add_u64_field("a", INDEXED | FAST);
let b = builder.add_text_field("b", STRING | STORED | FAST);
If the a field is not set as STORED, then when I first delete a record using a as the delete_term, then modify the b field of that record, and try to delete it again, it doesn’t take effect. But if a is set as STORED, then everything works fine. I want to understand why this happens.
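One factor that may be relevant here, offered as an assumption rather than something confirmed in this thread: a document read back through a searcher contains only the fields marked STORED. If a is not STORED, the loop over old_doc.field_values() never yields a, so the rebuilt document carries no a value and a later delete by a term on a cannot match it. The toy model below uses only the Rust standard library to illustrate that effect; ToyField, ToyDoc, retrieve, and rebuilt_has_key are hypothetical names, not tantivy APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema entry; not a tantivy type.
struct ToyField {
    name: &'static str,
    stored: bool,
}

/// Hypothetical stand-in for a document: field name -> value.
type ToyDoc = BTreeMap<&'static str, String>;

/// Models reading a document back: only fields flagged `stored` come back.
fn retrieve(schema: &[ToyField], written: &ToyDoc) -> ToyDoc {
    schema
        .iter()
        .filter(|f| f.stored)
        .filter_map(|f| written.get(f.name).map(|v| (f.name, v.clone())))
        .collect()
}

/// Rebuilds a document from the retrieved copy and overwrites `b`,
/// loosely mirroring the update flow in the test above.
/// Returns whether the key field `a` survived the round trip.
fn rebuilt_has_key(a_stored: bool) -> bool {
    let schema = [
        ToyField { name: "a", stored: a_stored },
        ToyField { name: "b", stored: true },
    ];
    let mut written = ToyDoc::new();
    written.insert("a", "3".to_string());
    written.insert("b", "v1".to_string());

    let mut rebuilt = retrieve(&schema, &written);
    rebuilt.insert("b", "v2".to_string());
    rebuilt.contains_key("a")
}
```

In this model, rebuilt_has_key(false) reports that a is lost when it is not stored, while rebuilt_has_key(true) keeps it. In the real test, a way to sidestep the question would be to set a on the new document from the known value (3) instead of copying it from the retrieved document.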
Can you provide a minimal example with an assertion?
#[tokio::test]
async fn test2() {
    use tantivy::schema::{FAST, INDEXED, STORED, STRING};
    let mut builder = Schema::builder();
    let a = builder.add_u64_field("a", INDEXED | FAST);
    let b = builder.add_text_field("b", STRING | STORED | FAST);
    let schema = builder.build();
    let index = Index::create_in_ram(schema);
    let mut index_writer: IndexWriter = index.writer(50_000_000).unwrap();
    let delete_term1 = Term::from_field_u64(a, 1u64);
    let delete_term2 = Term::from_field_u64(a, 2u64);
    let delete_term3 = Term::from_field_u64(a, 3u64);
    let operations = vec![
        UserOperation::Delete(delete_term1),
        UserOperation::Add(doc!(
            a => 1u64,
            b => "v1"
        )),
        UserOperation::Delete(delete_term2),
        UserOperation::Add(doc!(
            a => 2u64,
            b => "v1"
        )),
        UserOperation::Delete(delete_term3),
        UserOperation::Add(doc!(
            a => 3u64,
            b => "v1"
        )),
    ];
    index_writer.run(operations).unwrap();
    index_writer.commit().unwrap();
    let reader = index.reader().unwrap();
    let searcher = reader.searcher();
    let tq = TermQuery::new(Term::from_field_u64(a, 3), IndexRecordOption::Basic);
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    assert!(docs.first().is_some());
    if let Some((_, doc_address)) = docs.first() {
        let old_doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        let mut new_doc = TantivyDocument::new();
        for (field, value) in old_doc.field_values() {
            if field == a {
                new_doc.add_field_value(a, value);
            }
            if field == b {
                assert_eq!(Some("v1"), value.as_str())
            }
        }
        new_doc.add_text(b, "v2");
        let delete_term = Term::from_field_u64(a, 3);
        index_writer.delete_term(delete_term);
        index_writer.add_document(new_doc).unwrap();
        index_writer.commit().unwrap();
    }
    reader.reload().unwrap();
    let searcher = reader.searcher();
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    assert!(docs.first().is_some());
    if let Some((_, doc_address)) = docs.first() {
        let doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        for (field, value) in doc.field_values() {
            if field == b {
                assert_eq!(Some("v2"), value.as_str())
            }
        }
    }
    let delete_term = Term::from_field_u64(a, 3);
    index_writer.delete_term(delete_term);
    index_writer.commit().unwrap();
    reader.reload().unwrap();
    let searcher = reader.searcher();
    let docs = searcher.search(&tq, &TopDocs::with_limit(1)).unwrap();
    assert!(docs.first().is_none());
    if let Some((_, doc_address)) = docs.first() {
        let doc: TantivyDocument = searcher.doc_async(*doc_address).await.unwrap();
        for (field, value) in doc.field_values() {
            if field == b {
                let value = value.as_str();
                println!("{:#?}", value);
            }
        }
    }
}
It's unclear what the expectation is and which assert fails. Can you add a minimal example in which a document that should have been deleted is still there? You can replace doc_async with the simpler doc.
@rustmailer this is most likely not a bug, and it has nothing to do with facets.
I suspect you are reusing a searcher or forgot to reload the reader. You can think of a searcher as a handle over a snapshot view of your index.
As long as you keep using that same searcher, you will not see any changes made to the index.
To make sure you get an up-to-date searcher, you need to call reader.reload()?; and acquire a new searcher.
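The snapshot behavior described here can be illustrated with a toy model in plain Rust. This is only a sketch of the idea, not how tantivy is implemented: ToyIndex, ToyReader, ToySearcher, and demo are hypothetical names, and the toy reader never reloads on its own, so every refresh has to go through an explicit reload call.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an index: holds the last committed snapshot.
#[derive(Default)]
struct ToyIndex {
    committed: Mutex<Arc<Vec<u64>>>,
}

impl ToyIndex {
    /// Commit an add by publishing a new immutable snapshot.
    fn commit_add(&self, key: u64) {
        let mut guard = self.committed.lock().unwrap();
        let mut next = guard.to_vec();
        next.push(key);
        *guard = Arc::new(next);
    }

    /// Commit a delete by publishing a new snapshot without `key`.
    fn commit_delete(&self, key: u64) {
        let mut guard = self.committed.lock().unwrap();
        let next: Vec<u64> = guard.iter().copied().filter(|k| *k != key).collect();
        *guard = Arc::new(next);
    }
}

/// Hypothetical reader: caches one snapshot until `reload` is called.
struct ToyReader<'a> {
    index: &'a ToyIndex,
    current: Arc<Vec<u64>>,
}

impl<'a> ToyReader<'a> {
    fn new(index: &'a ToyIndex) -> Self {
        let current = index.committed.lock().unwrap().clone();
        ToyReader { index, current }
    }

    /// Pick up the latest committed snapshot.
    fn reload(&mut self) {
        self.current = self.index.committed.lock().unwrap().clone();
    }

    /// Hand out a searcher pinned to the reader's current snapshot.
    fn searcher(&self) -> ToySearcher {
        ToySearcher { snapshot: Arc::clone(&self.current) }
    }
}

/// Hypothetical searcher: a handle over one frozen snapshot.
struct ToySearcher {
    snapshot: Arc<Vec<u64>>,
}

impl ToySearcher {
    fn contains(&self, key: u64) -> bool {
        self.snapshot.contains(&key)
    }
}

/// Returns what (old searcher, new searcher before reload, new searcher
/// after reload) report for key 3 once the delete has been committed.
fn demo() -> (bool, bool, bool) {
    let index = ToyIndex::default();
    index.commit_add(3);
    let mut reader = ToyReader::new(&index);
    let old_searcher = reader.searcher();

    index.commit_delete(3);
    let stale = old_searcher.contains(3);
    let before_reload = reader.searcher().contains(3);

    reader.reload();
    let after_reload = reader.searcher().contains(3);
    (stale, before_reload, after_reload)
}
```

In this model the searcher acquired before the delete still reports the document, as does any searcher taken before the reload; only a searcher acquired after reload reflects the committed delete. The same ordering (commit, then reader.reload(), then a fresh reader.searcher()) is what the tests in this thread need to follow.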