
Build an Index Manually Without a Datastore (no initial seeding of index)

Open: syntaxseed opened this issue 7 years ago • 1 comment

From my comment on issue #118...

My data, as it exists in the DB, is not in a format ready to be indexed, so I was hoping to start with an empty index and then add records as the publishing process creates them. This means I need to create an index without connecting to a database, using only manual insertion ($index->insert(...)).

Doing the following:

$tnt = new TNTSearch;

$tnt->loadConfig([
    'storage'   => __DIR__.'/tntsearch/indexes/'
]);

$index = $tnt->createIndex('test.index');
$index->insert(['id' => '1', 'content' => 'some awesome searchable content']);  // I want to ONLY do this.

This gives an error because the db driver is not specified in the config. If I put some dummy MySQL settings in there, it complains because it then tries to connect to the db.

I have already created an empty test.index file in the storage path.

How can I use ONLY $index->insert(...) to add my data, and skip connecting to the db? Is there a 'none' option for driver?

syntaxseed, Feb 16 '18 13:02

Well, I may have come up with a workaround:

Use the filesystem as your driver, and have it look in an empty dir for the initial seeding of the index.

// TNT Search
$tnt = new TNTSearch;
$tnt->loadConfig([
    'driver'    => 'filesystem',
    'location'  => __DIR__.'/tntsearch/dummysource/',
    'extension' => 'txt',
    'storage'   => __DIR__.'/tntsearch/indexes/'
]);

$indexer = $tnt->createIndex('test.index');
$indexer->run();

$indexer->insert(['id' => '1', 'content' => 'new awesome article about php']);
$indexer->insert(['id' => '2', 'content' => 'another article about php']);
$indexer->insert(['id' => '3', 'content' => 'read this one because it is cool.']);
$indexer->insert(['id' => '4', 'content' => 'some stuff about interesting things']);

The index is first created from an empty dir, so it has zero records. Then I manually insert 4 records.

Then query the index as normal.

$tnt->selectIndex('test.index');
$results = $tnt->search('article', 12);
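
One thing the workaround quietly relies on: the dummy source directory has to exist and must not contain any files with the configured extension. Here is a minimal sketch of checking that before creating the index, using only PHP built-ins. The helper name ensureEmptySourceDir is hypothetical and not part of TNTSearch, and it only looks at the top level of the directory, which is an assumption about what matters here.

```php
<?php
// Hypothetical helper (not part of TNTSearch): make sure the dummy source
// directory exists and report whether it is free of files with the given
// extension. Only the top level of the directory is inspected.
function ensureEmptySourceDir(string $dir, string $extension = 'txt'): bool
{
    // Create the directory (and any missing parents) if it is not there yet.
    // The second is_dir() check covers a race where another process created it.
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Could not create directory: $dir");
    }

    // glob() returns an array of matching paths; on some systems it may
    // return false instead of an empty array, so empty() covers both cases.
    $matches = glob(rtrim($dir, '/') . '/*.' . $extension);

    // True only when no file with that extension was found.
    return empty($matches);
}
```

I would call it with the same path as the "location" setting before running the indexer, and skip or abort the seeding step if it returns false.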

So, this might work for now, but is there a better way to do this that skips the initial delay while the indexer attempts to read the empty source directory?

syntaxseed, Feb 16 '18 14:02