QG Index Creation and Build Failure
Hello,
I created a 113M-embedding index (768 dimensions) with the following prf:
AccuracyTable
BatchSizeForCreation 200
BuildTimeLimit 0
DatabaseType Memory
Dimension 768
DistanceType Cosine
DynamicEdgeSizeBase 30
DynamicEdgeSizeRate 20
EdgeSizeForCreation 10
EdgeSizeForSearch 40
EdgeSizeLimitForCreation 5
EpsilonForCreation 0.1
GraphType ANNG
IncomingEdge 80
IncrimentalEdgeSizeLimitForTruncation 0
IndexType GraphAndTree
ObjectAlignment False
ObjectType Float-4
OutgoingEdge 10
PathAdjustmentInterval 0
PrefetchOffset 1
PrefetchSize 3072
SeedSize 10
SeedType None
ThreadPoolSize 32
TruncationThreadPoolSize 8
The search works fine on this index.
However, when I attempt to create and build a QG index, I run into multiple issues.
- Creating a QG index for this, by executing the following command,
qbg create-qg -d 768 -D C -E 10 -S 40 -i t -o f -p 32 -N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v /path_to_index
Issue: The generated /path_to_index/qg/prf file contains values that do not match the arguments I passed.
Here is the generated qg/prf file:
BatchSize 1000
CentroidCreationMode 1
DataSize 0
DataType 1
Dimension 768
DistanceType 1
GenuineDataType 1
GenuineDimension 768
GlobalCentroidLimit 1
GlobalRange 0
LocalCentroidCreationMode 1
LocalCentroidLimit 16
LocalClusterDataType 2
LocalCodebookState 1
LocalDivisionNo 384
LocalIDByteSize 2
LocalRange 0
LocalSampleCoefficient 100
MaxMagnitude -1
QuantizerType 0
RefinementDataType 99
ScalarQuantizationClippingRate 0.01
ScalarQuantizationNoOfSamples 0
ScalarQuantizationOffset 0
ScalarQuantizationScale 0
SingleLocalCodebook 0
ThreadSize 24
Q. Why are the values in qg/prf different from the ones I passed? Is there an internal default overriding them?
- QG Index Build Failure
After creation, I ran the following command to build the QG index:
qbg build-qg -E 128 -v /path_to_index
The build fails (or gets stuck).
Observations:
- CPU usage reaches 100% while it is processing the index objects (machine used is n2-highmem-128)
- Intermediate logs:
append: Data loading time=2.255e-05 (sec) 0.02255 (msec)
# of objects=16
Index creation time=0.00132462 (sec) 1.32462 (msec)
qbg: loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
load() done
codebook index size=1
- Tail logs for this command:
# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G
# of processed objects=106000000, time=21.7245 (m), vm size=104.58 G/104.58 G
# of processed objects=107000000, time=21.9246 (m), vm size=104.58 G/104.58 G
# of processed objects=108000000, time=22.1269 (m), vm size=104.58 G/104.58 G
# of processed objects=109000000, time=22.3308 (m), vm size=104.58 G/104.58 G
# of processed objects=110000000, time=22.5344 (m), vm size=104.58 G/104.58 G
# of processed objects=111000000, time=22.7377 (m), vm size=104.58 G/104.58 G
# of processed objects=112000000, time=22.941 (m), vm size=104.58 G/104.58 G
# of processed objects=113000000, time=23.1424 (m), vm size=104.58 G/104.58 G
cp: cannot stat '/path_to_index/qg/ws/hkc_3c': No such file or directory
After the cp error is logged, CPU and memory usage both drop to 0 and the index build gets stuck (or fails).
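To check whether the object pass itself slowed down before the stall, the progress lines can be turned into a throughput figure. The following is only a rough sketch: the regular expression is written against the log lines quoted above, and the helper names parse_progress and objects_per_second are made up for illustration, not part of NGT.

```python
import re

# Matches progress lines in the format quoted in this thread, e.g.
# "# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G"
PROGRESS = re.compile(r"# of processed objects=(\d+), time=([\d.]+) \(m\)")


def parse_progress(log_text):
    """Return (object_count, elapsed_minutes) pairs found in the log text."""
    return [(int(n), float(t)) for n, t in PROGRESS.findall(log_text)]


def objects_per_second(points):
    """Average throughput between the first and the last progress line."""
    (n0, t0), (n1, t1) = points[0], points[-1]
    return (n1 - n0) / ((t1 - t0) * 60.0)


# First and last lines of the tail log quoted above.
log = """\
# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G
# of processed objects=113000000, time=23.1424 (m), vm size=104.58 G/104.58 G
"""

points = parse_progress(log)
print(f"{objects_per_second(points):,.0f} objects/sec")
```

A steady rate up to the last progress line would suggest that the stall happens in whatever step follows the object pass, not during it.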
Questions
- Why does the QG index creation (create-qg) generate a qg/prf file with mismatched values?
- Are there any internal defaults that override command-line arguments?
- Is there a way to verify which parameters were actually used?
- Why does the build-qg process fail after cp: cannot stat '/path_to_index/qg/ws/hkc_3c'?
- What is this missing file, and why is it required?
- Is it possible that a previous step failed, leading to missing files?
- Is there any known issue related to QG index builds for very large datasets?
- How can I resolve this issue and successfully build the QG index for 113M embeddings?
- Are there additional configuration settings I should check?
- Are there memory/CPU constraints I should be aware of?
Hello,
Why does the QG index creation (create-qg) generate a qg/prf file with mismatched values?
Unsupported arguments are ignored without any error. It seems that many of the parameters you specified have been ignored.
Why does the build-qg process fail after cp: cannot stat '/path_to_index/qg/ws/hkc_3c'?
I generated 768-dimensional data and tried the same way as you did, but the issue did not reproduce. QG consumes a large amount of memory, so there may be a memory shortage in your case. First, try a smaller dataset to see if the same issue occurs. I also recommend using the latest version.
How can I resolve this issue and successfully build the QG index for 113M embeddings?
Since QG consumes a large amount of memory, I recommend using QBG for 113M embeddings. Another good option might be Vald, the distributed index.
Thank you for your reply.
Unsupported arguments are ignored without any error. It seems that many of the parameters you specified have been ignored.
Regarding this, I checked the underlying code for create-qg. It calls the function getCreationParameters(), which handles all the parameters that I provided in the command:
qbg create-qg -d 768 -D C -E 10 -S 40 -i t -o f -p 32 -N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v /path_to_index
Still, the values are not updated, and I cannot understand why they are being ignored.
- The VM that I am using for this task, n2-highmem-128 (128 vCPUs, 864 GB memory), has enough memory to hold the QG index for 113M embeddings. However, during the build-qg command, CPU usage reached 100% while these logs were being printed:
# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G
# of processed objects=106000000, time=21.7245 (m), vm size=104.58 G/104.58 G
# of processed objects=107000000, time=21.9246 (m), vm size=104.58 G/104.58 G
# of processed objects=108000000, time=22.1269 (m), vm size=104.58 G/104.58 G
# of processed objects=109000000, time=22.3308 (m), vm size=104.58 G/104.58 G
# of processed objects=110000000, time=22.5344 (m), vm size=104.58 G/104.58 G
# of processed objects=111000000, time=22.7377 (m), vm size=104.58 G/104.58 G
# of processed objects=112000000, time=22.941 (m), vm size=104.58 G/104.58 G
# of processed objects=113000000, time=23.1424 (m), vm size=104.58 G/104.58 G
and as soon as the whole index was processed, the cp error occurred and CPU came down to 0%.
Also, disk writes stopped completely once CPU usage dropped to 0, but the process and logs remained stuck at:
cp: cannot stat '/path_to_index/qg/ws/hkc_3c': No such file or directory
For smaller indexes, this cp error also occurred, but only momentarily: the subsequent steps completed and the index was built. For 113M embeddings, the build stalls as described above.
- Regarding this,
I generated 768-dimensional data and tried the same way as you did, but the issue did not reproduce. QG consumes a large amount of memory, so there may be a memory shortage in your case. First, try a smaller dataset to see if the same issue occurs. I also recommend using the latest version.
Could you please provide the commands that you ran for this test? Did the prf file values match for both the NGT index and the QG index?
- In the meantime, I tried random embeddings of shape (100000, 768). Although it showed all the above observations, including the error log cp: cannot stat '/path_to_index/qg/ws/hkc_3c': No such file or directory, it did successfully create the QG index, and search-qg then worked as well.
What could be the possible reasons that the QG index fails to build for the 113M embeddings (given that the original NGT index builds and searches fine)?
getCreationParameters() is used in both QG and QBG. Therefore, not everything is reflected for QG. The arguments are mainly reflected in two files: one is [index]/qg/prf, and the other is [index]/qg/global/prf. Some of the variables in the two files inherit the value of [index]/prf as is.
I thought the cp error was not output in my execution, but it was actually output in the same way. After investigating the source code, I found that the file "hkc_3c" in the cp error is not used in QG, so this error can be ignored. Since your memory size is extremely large, if OOM has not occurred, the index might have been successfully created somehow. Could you provide the execution results of the following command?
ls -l [index]/qg
ls -l [index]/qg/global
ls -l [index]/qg/local-0
ngt info [index]/qg/global
ngt info [index]/qg/local-0
The index in memory is saved right after the cp error. I can determine where the process stopped based on the result of the above command.
The executed commands
ngt create -d 768 -D c idx
ngt append idx data.csv
qbg create-qg -d 768 -D C -E 10 -S 40 -i t -o f -p 32 -N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v idx
qbg build-qg -E 128 -v idx
qbg search-qg idx query.csv
> qbg search-qg idx q1.csv
expandedSizeByEpsilon=False
Query No.1
Rank ID Distance
1 1 0
2 32150 0.371839
3 1353 0.388081
4 39141 0.40095
5 41889 0.430618
6 43038 0.441377
7 97010 0.4695
8 48229 0.501483
9 4211 0.520085
10 25896 0.528741
11 92226 0.538304
12 33444 0.542158
13 27482 0.54434
14 81632 0.548103
15 56906 0.548995
16 27702 0.551018
17 46108 0.551958
18 5210 0.553028
19 46142 0.559637
20 29225 0.55997
Query Time= 0.000979127 (sec), 0.979127 (msec)
Average Query Time= 0.000979127 (sec), 0.979127 (msec), (0.000979127/1)
[index]/qg/prf
BatchSize 1000
CentroidCreationMode 1
DataSize 0
DataType 1
Dimension 768
DistanceType 1
GenuineDataType 1
GenuineDimension 768
GlobalCentroidLimit 1
GlobalRange 0
LocalCentroidCreationMode 1
LocalCentroidLimit 16
LocalClusterDataType 2
LocalCodebookState 1
LocalDivisionNo 384
LocalIDByteSize 2
LocalRange 0
LocalSampleCoefficient 100
MaxMagnitude -1
QuantizerType 0
RefinementDataType 99
ScalarQuantizationClippingRate 0.01
ScalarQuantizationNoOfSamples 0
ScalarQuantizationOffset -0.08249611
ScalarQuantizationScale 0.1680998
SingleLocalCodebook 0
ThreadSize 24
[index]/qg/global/prf
AccuracyTable
BatchSizeForCreation 200
BuildTimeLimit 0
DatabaseType Memory
Dimension 768
DistanceType L2
DynamicEdgeSizeBase 30
DynamicEdgeSizeRate 20
EdgeSizeForCreation 10
EdgeSizeForSearch 40
EdgeSizeLimitForCreation 5
EpsilonForCreation 0.1
EpsilonForInsertionOrder 0.1
EpsilonType None
GraphType ONNG
IncomingEdge 80
IncrimentalEdgeSizeLimitForTruncation 0
IndexType GraphAndTree
MaxMagnitude -1
NumberOfNeighborsForInsertionOrder 0
ObjectAlignment False
ObjectType Float-4
OutgoingEdge 10
PathAdjustmentInterval 0
PrefetchOffset 1
PrefetchSize 3072
QuantizationClippingRate 0
QuantizationOffset 0
QuantizationScale 0
RefinementObjectType Float-4
SeedSize 10
SeedType None
ThreadPoolSize 32
TruncationThreadPoolSize 8
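To see which values a derived prf file inherits and which ones change, two prf dumps can be compared key by key. This is a minimal sketch, assuming each prf line is a key followed by whitespace and an optional value, as in the dumps in this thread. parse_prf and diff_prf are hypothetical helper names, and the sample values are copied from the [index]/prf and [index]/qg/global/prf dumps posted here, which come from different indexes.

```python
def parse_prf(text):
    """Parse 'Key<whitespace>Value' lines into a dict; a key without a value maps to ''."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def diff_prf(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ; None marks a missing key."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# A few lines taken from the two dumps in this thread.
index_prf = parse_prf("AccuracyTable\nDimension 768\nDistanceType Cosine\nGraphType ANNG\n")
global_prf = parse_prf(
    "AccuracyTable\nDimension 768\nDistanceType L2\nEpsilonType None\nGraphType ONNG\n"
)

for key, (before, after) in diff_prf(index_prf, global_prf).items():
    print(f"{key}: {before} -> {after}")
```

For real files, the text passed to parse_prf would be read from disk with open(path).read().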
The only reason why the NGT graph works but QG does not should be memory shortage, assuming there are no bugs.
I even tried with a bigger machine with more CPU and memory than the previous one, but still got the same results. Here are the required outputs,
ls -l [index]/qg
total 428511616
drwxr-xr-x 2 root root 4096 Apr 1 14:58 global
-rw-r--r-- 1 root root 87758377680 Apr 1 15:39 ivt
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-0
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-1
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-10
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-100
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-101
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-102
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-103
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-104
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-105
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-106
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-107
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-108
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-109
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-11
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-110
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-111
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-112
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-113
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-114
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-115
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-116
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-117
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-118
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-119
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-12
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-120
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-121
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-122
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-123
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-124
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-125
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-126
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-127
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-128
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-129
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-13
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-130
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-131
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-132
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-133
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-134
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-135
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-136
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-137
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-138
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-139
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-14
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-140
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-141
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-142
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-143
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-144
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-145
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-146
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-147
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-148
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-149
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-15
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-150
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-151
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-152
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-153
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-154
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-155
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-156
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-157
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-158
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-159
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-16
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-160
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-161
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-162
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-163
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-164
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-165
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-166
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-167
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-168
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-169
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-17
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-170
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-171
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-172
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-173
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-174
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-175
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-176
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-177
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-178
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-179
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-18
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-180
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-181
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-182
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-183
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-184
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-185
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-186
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-187
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-188
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-189
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-19
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-190
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-191
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-192
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-193
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-194
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-195
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-196
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-197
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-198
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-199
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-2
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-20
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-200
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-201
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-202
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-203
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-204
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-205
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-206
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-207
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-208
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-209
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-21
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-210
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-211
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-212
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-213
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-214
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-215
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-216
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-217
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-218
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-219
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-22
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-220
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-221
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-222
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-223
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-224
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-225
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-226
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-227
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-228
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-229
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-23
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-230
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-231
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-232
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-233
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-234
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-235
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-236
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-237
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-238
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-239
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-24
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-240
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-241
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-242
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-243
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-244
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-245
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-246
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-247
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-248
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-249
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-25
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-250
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-251
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-252
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-253
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-254
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-255
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-256
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-257
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-258
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-259
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-26
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-260
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-261
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-262
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-263
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-264
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-265
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-266
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-267
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-268
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-269
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-27
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-270
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-271
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-272
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-273
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-274
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-275
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-276
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-277
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-278
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-279
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-28
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-280
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-281
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-282
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-283
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-284
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-285
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-286
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-287
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-288
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-289
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-29
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-290
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-291
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-292
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-293
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-294
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-295
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-296
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-297
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-298
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-299
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-3
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-30
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-300
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-301
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-302
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-303
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-304
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-305
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-306
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-307
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-308
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-309
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-31
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-310
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-311
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-312
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-313
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-314
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-315
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-316
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-317
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-318
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-319
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-32
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-320
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-321
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-322
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-323
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-324
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-325
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-326
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-327
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-328
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-329
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-33
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-330
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-331
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-332
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-333
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-334
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-335
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-336
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-337
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-338
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-339
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-34
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-340
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-341
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-342
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-343
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-344
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-345
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-346
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-347
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-348
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-349
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-35
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-350
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-351
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-352
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-353
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-354
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-355
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-356
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-357
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-358
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-359
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-36
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-360
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-361
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-362
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-363
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-364
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-365
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-366
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-367
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-368
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-369
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-37
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-370
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-371
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-372
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-373
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-374
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-375
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-376
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-377
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-378
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-379
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-38
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-380
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-381
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-382
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-383
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-39
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-4
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-40
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-41
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-42
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-43
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-44
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-45
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-46
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-47
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-48
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-49
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-5
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-50
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-51
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-52
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-53
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-54
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-55
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-56
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-57
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-58
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-59
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-6
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-60
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-61
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-62
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-63
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-64
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-65
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-66
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-67
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-68
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-69
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-7
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-70
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-71
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-72
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-73
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-74
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-75
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-76
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-77
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-78
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-79
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-8
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-80
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-81
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-82
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-83
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-84
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-85
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-86
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-87
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-88
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-89
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-9
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-90
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-91
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-92
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-93
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-94
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-95
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-96
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-97
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-98
drwxr-xr-x 2 root root 4096 Apr 1 13:16 local-99
-rw-r--r-- 1 root root 351033513744 Apr 1 14:53 obj
-rw-r--r-- 1 root root 547 Apr 1 15:39 prf
-rw-r--r-- 1 root root 2359300 Apr 1 15:39 qr
-rw-r--r-- 1 root root 3084 Apr 1 15:39 rqcb
ls -l [index]/qg/global
total 20
-rw-r--r-- 1 root root 22 Apr 1 15:37 grp
-rw-r--r-- 1 root root 3082 Apr 1 15:37 obj
-rw-r--r-- 1 root root 725 Apr 1 15:37 prf
-rw-r--r-- 1 root root 8 Apr 1 15:37 robj
-rw-r--r-- 1 root root 3108 Apr 1 15:37 tre
du -sh [index]/qg/global
du -sh [index]/qg/global/*
4.0K [index]/qg/global/grp
4.0K [index]/qg/global/obj
4.0K [index]/qg/global/prf
4.0K [index]/qg/global/robj
4.0K [index]/qg/global/tre
ls -l [index]/qg/local-0
total 20
-rw-r--r-- 1 root root 1807 Apr 1 15:37 grp
-rw-r--r-- 1 root root 153 Apr 1 15:37 obj
-rw-r--r-- 1 root root 720 Apr 1 15:37 prf
-rw-r--r-- 1 root root 8 Apr 1 15:37 robj
-rw-r--r-- 1 root root 164 Apr 1 15:37 tre
du -sh [index]/qg/local-0
du -sh [index]/qg/local-0/*
4.0K [index]/qg/local-0/grp
4.0K [index]/qg/local-0/obj
4.0K [index]/qg/local-0/prf
4.0K [index]/qg/local-0/robj
4.0K [index]/qg/local-0/tre
ngt info [index]/qg/global
NGT version: 2.3.11
CPU SIMD types: avx avx2 avx512f avx512vl avx512bw avx512dq avx512cd avx512er avx512pf avx512vbmi avx512ifma avx5124vnniw avx5124fmaps avx512vpopcntdq avx512vbmi2 avx512vnni
Warning! The node without incoming edges. 1
The number of the objects: 1
The number of the indexed objects: 1
The size of the object repository (not the number of the objects): 1
The size of the refinement object repository (not the number of the objects): 0
The number of the removed objects: 0/1
The number of the nodes: 1
The number of the edges: 0
The mean of the edge lengths: 0
The mean of the number of the edges per node: 0
The number of the nodes without edges: 1
The maximum of the outdegrees: 0
The minimum of the outdegrees: 0
The number of the nodes where indegree is 0: 1
The maximum of the indegrees: 0
The minimum of the indegrees: 0
The mean of the edge lengths for 10 edges: 0/0
#-nodes,#-edges,#-no-indegree,avg-edges,avg-dist,max-out,min-out,v-out,max-in,min-in,v-in,med-out,med-in,mode-out,mode-in,c95,c5,o-distance(10),o-skip,i-distance(10),i-skip:1:0:1:0:-nan:0:0:-nan:0:0:-nan:0:0:0:0:0:0:0:0:0:1:0:1
ngt info [index]/qg/local-0
NGT version: 2.3.11
CPU SIMD types: avx avx2 avx512f avx512vl avx512bw avx512dq avx512cd avx512er avx512pf avx512vbmi avx512ifma avx5124vnniw avx5124fmaps avx512vpopcntdq avx512vbmi2 avx512vnni
checkEdgeLengths: Warning! The indexed edge lengths are different. 44/210.
The number of the objects: 16
The number of the indexed objects: 16
The size of the object repository (not the number of the objects): 16
The size of the refinement object repository (not the number of the objects): 0
The number of the removed objects: 0/16
The number of the nodes: 16
The number of the edges: 210
The mean of the edge lengths: 0.05900290447
The mean of the number of the edges per node: 13.125
The number of the nodes without edges: 0
The maximum of the outdegrees: 15
The minimum of the outdegrees: 10
The number of the nodes where indegree is 0: 0
The maximum of the indegrees: 15
The minimum of the indegrees: 10
The mean of the edge lengths for 10 edges: 0.0501995215/160
#-nodes,#-edges,#-no-indegree,avg-edges,avg-dist,max-out,min-out,v-out,max-in,min-in,v-in,med-out,med-in,mode-out,mode-in,c95,c5,o-distance(10),o-skip,i-distance(10),i-skip:16:210:0:13.125:0.05900290447:15:10:0.2369047619:15:10:0.2369047619:13:13:15:15:15:10:15:10:0.0501995215:0:0.0501995215:0
ngt info [index]
The number of the objects: 113676655
The number of the indexed objects: 113676655
The size of the object repository (not the number of the objects): 113676655
The size of the refinement object repository (not the number of the objects): 0
The number of the removed objects: 0/113676655
The number of the nodes: 113676655
The number of the edges: 2273532990
The mean of the edge lengths: 0.06959541028
The mean of the number of the edges per node: 19.99999903
The number of the nodes without edges: 0
The maximum of the outdegrees: 20390
The minimum of the outdegrees: 10
The number of the nodes where indegree is 0: 0
The maximum of the indegrees: 20390
The minimum of the indegrees: 10
The mean of the edge lengths for 10 edges: 0.05203543551/1136766550
#-nodes,#-edges,#-no-indegree,avg-edges,avg-dist,max-out,min-out,v-out,max-in,min-in,v-in,med-out,med-in,mode-out,mode-in,c95,c5,o-distance(10),o-skip,i-distance(10),i-skip:113676655:2273532990:0:19.99999903:0.06959541028:20390:10:15.1989023:20390:10:15.1989023:15:15:10:10:65.34491862:10:101.3390394:10:0.05203543551:0:0.05203543551:0
qbg search-qg index float_values.tsv
expandedSizeByEpsilon=False
Not found "RefinementObjectType"
Not found "EpsilonType"
Not found "RefinementObjectType"
Not found "EpsilonType"
Warning. Cannot open the refinment objects. /NGT/lib/NGT/ObjectRepository.h:deserialize:58: NGT::ObjectSpace: Cannot open the specified file [index]/robj.
No quantized graph. Construct it temporarily.
inverted index object size=2408074
graph repository size=0
qbg: Error: /NGT/lib/NGT/NGTQ/QbgCli.cpp:searchQG:664: qbg: Error vector::_M_range_check: __n (which is 22456979) >= this->size() (which is 0)
Usage: ngtqg search-qg [-i index-type(g|t|s)] [-n result-size] [-e epsilon] [-E edge-size] [-o output-mode] [-p result-expansion] index(input) query.tsv(input)
Thank you for the information.
Please run the following command on the index where the cp error was output.
qbg build-qg -p 3 -E 128 -v [index]
If no error occurs, please try running a search. If this does not work properly, please try running the following commands instead of qbg build-qg -E 128 -v [index] after qbg create-qg ....
qbg build-qg -p 1 -v [index]
qbg build-qg -p 2 -v [index]
qbg build-qg -p 3 -E 128 -v [index]
Of the alternative commands you suggested, all those that specify -p give different errors.
- -p 3: gives the error at https://github.com/yahoojapan/NGT/blob/50083e6220610e74ad0e90d5f1d861ac63997382/lib/NGT/NGTQ/QuantizedGraph.h#L130
- All the rest give the error that [index]/qg/global/prf or [index]/qg/ws/hkc_2c is not found.
- The original command (-p 0) gives the same issue as previously mentioned.
From ngt info, the maximum number of edges appears to be 20390. Since having this many edges might cause the issue, please run the following command to prune them.
ngt prune -e 128 [index]
The order of command execution is as follows.
ngt create -d 768 -D c [index]
ngt append [index] data.csv
ngt prune -e 128 [index]
qbg create-qg -d 768 -D C -E 10 -S 40 -i t -o f -p 32 -N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v [index]
qbg build-qg -E 128 -v [index]
qbg search-qg [index] query.csv
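The ordering above can be sketched as a small Python helper that only assembles the command lines and prints them. The same edge count (128) appears in both ngt prune -e and qbg build-qg -E, so the sketch keeps it in one variable. The helper name build_commands is hypothetical; the flags are copied from the commands in this thread and nothing is executed.

```python
import shlex

# Flags copied verbatim from the create-qg command used in this thread.
CREATE_QG_FLAGS = ("-d 768 -D C -E 10 -S 40 -i t -o f -p 32 "
                   "-N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v")

def build_commands(index, data, query, edges=128):
    """Return the command lines in the order listed above (hypothetical helper)."""
    return [
        ["ngt", "create", "-d", "768", "-D", "c", index],
        ["ngt", "append", index, data],
        ["ngt", "prune", "-e", str(edges), index],
        ["qbg", "create-qg"] + shlex.split(CREATE_QG_FLAGS) + [index],
        ["qbg", "build-qg", "-E", str(edges), "-v", index],
        ["qbg", "search-qg", index, query],
    ]

# Dry run: print the commands instead of executing them.
for command in build_commands("[index]", "data.csv", "query.csv"):
    print(" ".join(command))
```

Passing a different edges value changes the prune and build-qg steps together, which avoids the two numbers drifting apart.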
Even after pruning, I got the same issue with qbg search-qg.
qbg search-qg index float_values.tsv
expandedSizeByEpsilon=False
No quantized graph. Construct it temporarily.
inverted index object size=2408074
graph repository size=0
qbg: Error: /NGT/lib/NGT/NGTQ/QbgCli.cpp:searchQG:663: qbg: Error vector::_M_range_check: __n (which is 22456979) >= this->size() (which is 0)
Usage: ngtqg search-qg [-i index-type(g|t|s)] [-n result-size] [-e epsilon] [-E edge-size] [-o output-mode] [-p result-expansion] index(input) query.tsv(input)
It seems that qbg build-qg didn't generate the quantized graph index. Did qbg build-qg complete without any issues other than the cp error?
So, once qbg build-qg starts, CPU usage remains below 100% until these logs complete:
append: Data loading time=2.255e-05 (sec) 0.02255 (msec)
# of objects=16
Index creation time=0.00132462 (sec) 1.32462 (msec)
qbg: loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
load() done
codebook index size=1
After that, CPU usage stays at 100% while the logs below are printed,
# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G
# of processed objects=106000000, time=21.7245 (m), vm size=104.58 G/104.58 G
# of processed objects=107000000, time=21.9246 (m), vm size=104.58 G/104.58 G
# of processed objects=108000000, time=22.1269 (m), vm size=104.58 G/104.58 G
# of processed objects=109000000, time=22.3308 (m), vm size=104.58 G/104.58 G
# of processed objects=110000000, time=22.5344 (m), vm size=104.58 G/104.58 G
# of processed objects=111000000, time=22.7377 (m), vm size=104.58 G/104.58 G
# of processed objects=112000000, time=22.941 (m), vm size=104.58 G/104.58 G
# of processed objects=113000000, time=23.1424 (m), vm size=104.58 G/104.58 G
And it drops to 0 at this log line,
cp: cannot stat '/path_to_index/qg/ws/hkc_3c': No such file or directory
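The progress lines above can be turned into a rough throughput figure, which helps judge whether a phase is still advancing or has stalled. The following is a minimal sketch that assumes the log format shown in this thread; the function names are illustrative and not part of NGT.

```python
import re

# Matches lines such as:
# "# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G"
PROGRESS = re.compile(r"# of processed objects=(\d+), time=([\d.]+) \((m|s)\)")

def parse_progress(lines):
    """Return a list of (objects, elapsed_seconds) for each matching line."""
    samples = []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            scale = 60.0 if match.group(3) == "m" else 1.0
            samples.append((int(match.group(1)), float(match.group(2)) * scale))
    return samples

def objects_per_second(samples):
    """Average rate between the first and last sample."""
    (n0, t0), (n1, t1) = samples[0], samples[-1]
    return (n1 - n0) / (t1 - t0)

log = [
    "# of processed objects=105000000, time=21.5226 (m), vm size=104.58 G/104.58 G",
    "# of processed objects=113000000, time=23.1424 (m), vm size=104.58 G/104.58 G",
]
print(f"{objects_per_second(parse_progress(log)):,.0f} objects/s")
```

On the two samples shown, this comes to roughly 82,000 objects per second, in line with the full 113M objects taking about 23 minutes.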
However, this time I let the process run for more than 24hrs, and got this in the logs,
cp: cannot stat '/path_to_index/qg/ws/hkc_3c': No such file or directory
NGTQ index is completed.
time=22.9784 (m)
vmsize=102.91 G
peak vmsize=194.64 G
saving...
NGTQ and NGTQBG indices are completed.
vmsize=18.64 G
peak vmsize=194.64 G
building the quantized graph...
inverted index object size=2408074
graph repository size=113676656
Earlier, I never saw the process get past the cp error log (if this is indeed the end of the process), because I used to kill it 3-4 hours after that log appeared.
But the search still fails:
qbg search-qg [index] float_values.tsv
expandedSizeByEpsilon=False
No quantized graph. Construct it temporarily.
inverted index object size=2408074
graph repository size=0
qbg: Error: /NGT/lib/NGT/NGTQ/QbgCli.cpp:searchQG:663: qbg: Error vector::_M_range_check: __n (which is 22456979) >= this->size() (which is 0)
Usage: ngtqg search-qg [-i index-type(g|t|s)] [-n result-size] [-e epsilon] [-E edge-size] [-o output-mode] [-p result-expansion] index(input) query.tsv(input)
The build-qg has three phases, and from the messages, it looks like the first two phases had completed and the third phase was still in progress. Since your data is extremely large, I expect the third phase will also take quite a long time. When the third phase is completed, the following message will be displayed:
Quantized graph is completed.
vmsize=XXX G
peak vmsize=XXX G
Also, for memory reduction and faster processing, you should always include ngt prune -e 128 [index]. The value 128 should match the one used in qbg build-qg -E 128 -v [index].
However, build-qg ended right after the logs below. I had kept the command running via nohup, but once these logs were output the process exited, and I can no longer see it running.
building the quantized graph...
inverted index object size=2408074
graph repository size=113676656
The third phase consumes a large amount of memory, which might be causing OOM. Reducing the value 128 in the following commands might help the process complete successfully:
...
ngt prune -e 128 [index]
...
qbg build-qg -E 128 -v [index]
...
At the same time, memory usage stays around 100-110GB until about 04/06 11:20, when I ran qbg search-qg; that tried to load the index into memory, hence the spike.
The metrics above show no evidence that phase 3 is running, and this was after pruning the ngt index to 128 edges.
What are the next steps that can be taken to either debug this and/or build the QG index successfully?
What I meant in my previous comment was to suggest reducing the number of edges further from 128, perhaps trying 64, for example.
Just to be safe, trying a smaller value like 16 might be helpful to confirm that it runs correctly.
I tried pruning to 128, 64, and 16 edges, but it still does not work: phase 3 does not complete, and there are no logs after these lines,
building the quantized graph...
inverted index object size=2408074
graph repository size=113676656
Also, CPU usage, memory, and disk read/write all drop to 0 after this, and the qbg build-qg process is no longer running, while qbg search-qg fails as before.
I estimated the memory usage.
16 edges -> 650 G
64 edges -> 2,400 G
128 edges -> 4,800 G
Therefore, it is possible that only the QG index for 16 edges can be built. Since I want to know the memory usage during processing, I added some lines to NGT. Could you use this? In addition, to reduce memory fragmentation, please execute each phase in separate processes as follows.
...
qbg build-qg -p 1 -v [index]
qbg build-qg -p 2 -v [index]
qbg build-qg -p 3 -E 16 -v [index]
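The three estimates above grow roughly linearly with the edge count, which a short sketch can make explicit. The figures are the ones quoted in this thread; the helper names and the 1000 GB budget are placeholders for illustration, not values taken from NGT or from the server in question.

```python
# Estimates quoted in this thread: edge count -> estimated memory in GB.
ESTIMATES_GB = {16: 650, 64: 2400, 128: 4800}

def gb_per_edge(estimates):
    """Memory per edge slot implied by each estimate."""
    return {edges: gb / edges for edges, gb in estimates.items()}

def largest_edge_count_within(budget_gb, estimates):
    """Largest listed edge count whose estimate fits in budget_gb, or None."""
    fitting = [edges for edges, gb in estimates.items() if gb <= budget_gb]
    return max(fitting) if fitting else None

for edges, per_edge in gb_per_edge(ESTIMATES_GB).items():
    print(f"-E {edges}: about {per_edge:.1f} GB per edge")

# 1000 GB is a placeholder budget, not a figure from this thread.
print(largest_edge_count_within(1000, ESTIMATES_GB))
```

Each estimate works out to roughly 37 to 41 GB per edge, so halving -E should roughly halve the phase-3 memory requirement.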
qbg build-qg -p 1 -v [index]
optimizing...
optimize: # of objects=1000
optimize: updated # of objects=1000
optimize: # of clusters=16:0
the codebook index file is missing. this index must be QG.
optimize: # of vectors=1000/1000, # of matrices=1
optimize: time=136.683 (ms)
qbg build-qg -p 2 -v [index]
starting few log lines...
building the inverted index...
GraphOptimizer::execute: vm size=6.11 G:10.02 G
delete all of objects
vm size=6.11 G:10.02 G
GraphOptimizer: adjusting outgoing and incoming edges...
Optimizer::execute: Extract the graph data.
GraphReconstructor::reconstructGraph:10:120:9223372036854775807
GraphReconstructor: Warning. The edges are too few. 0:10 for 1
# of the nodes edges of which are in short = 1
Reconstruction time=5.534e-06:1.54e-07:1.9e-07
Optimizer::execute: Graph reconstruction time=3.693e-05 (sec)
GraphOptimizer: redusing shortcut edges...
GraphReconstructor::adjustPaths: graph preparing time=0.00022 (ms)
GraphReconstructor::adjustPaths: extracting removed edge candidates time=4.53551 (ms)
removeCandidateCount=0
Optimizer::execute: Path adjustment time=0.0045632 (sec)
data/index/inhouse_v3_4_beit_2025_03_12/qg/ws/hkc_opt/sv-0->data/index/inhouse_v3_4_beit_2025_03_12/qg/local-0
append: Data loading time=3.4526e-05 (sec) 0.034526 (msec)
# of objects=16
last few log lines...
# of processed objects=113000000, time=26.5813 (m), vm size=96.7 G/96.7 G
cp: cannot stat 'data/index/inhouse_v3_4_beit_2025_03_12/qg/ws/hkc_3c': No such file or directory
NGTQ index is completed.
time=26.8049 (m)
vmsize=99.85 G
peak vmsize=191.58 G
saving...
NGTQ and NGTQBG indices are completed.
vmsize=15.57 G
peak vmsize=191.58 G
qbg build-qg -p 3 -v [index]
building the quantized graph...
construct
vmsize=102.63 G
peak vmsize=102.63 G
inverted index object size=2408074
vmsize=105.65 G
peak vmsize=105.65 G
erased
vmsize=23.92 G
peak vmsize=105.65 G
graph repository size=113676656
vmsize=23.92 G
peak vmsize=105.65 G
resized
vmsize=28.15 G
peak vmsize=105.65 G
# of processed objects=1/113676655(0%)
vmsize=28.15 G
peak vmsize=105.65 G
Segmentation fault (core dumped)
So the phase-3 command failed due to Segmentation fault, not sure why that happened.
I can see a small spike in disk writes, and context switching also, but the memory usage is very low.
What could be the reason for this segmentation fault (core dumped) error?
Thank you for the information. However, the following message seems strange.
inverted index object size=2408074
The expected value is 113676656, which comes from the file ivt. I've updated the source code to check the data. Sorry for the trouble, but could you please run it again?
https://github.com/yahoojapan/NGT/tree/massive_qg
Also, could you send me the messages from the 2nd and 3rd phases that start with DEBUG?
qbg build-qg -p 1 -v [index]
optimizing...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
DEBUG: The first cluster size: 0
DEBUG: The inverted index file size: -1 bytes
optimize: # of objects=1000
optimize: updated # of objects=1000
optimize: # of clusters=16:0
the codebook index file is missing. this index must be QG.
optimize: # of vectors=1000/1000, # of matrices=1
optimize: time=144.055 (ms)
qbg build-qg -p 2 -v [index]
building the inverted index...
GraphOptimizer::execute: vm size=6.36 G:10.33 G
delete all of objects
vm size=6.36 G:10.33 G
GraphOptimizer: adjusting outgoing and incoming edges...
Optimizer::execute: Extract the graph data.
GraphReconstructor::reconstructGraph:10:120:9223372036854775807
GraphReconstructor: Warning. The edges are too few. 0:10 for 1
# of the nodes edges of which are in short = 1
Reconstruction time=5.1e-06:2.48e-07:1.63e-07
Optimizer::execute: Graph reconstruction time=2.2173e-05 (sec)
GraphOptimizer: redusing shortcut edges...
GraphReconstructor::adjustPaths: graph preparing time=0.000171 (ms)
GraphReconstructor::adjustPaths: extracting removed edge candidates time=5.42781 (ms)
removeCandidateCount=0
Optimizer::execute: Path adjustment time=0.00545387 (sec)
data/index/inhouse_v3_4_beit_2025_03_12/qg/ws/hkc_opt/sv-0->[index]/qg/local-0
append: Data loading time=3.3083e-05 (sec) 0.033083 (msec)
# of objects=16
Index creation time=0.00546047 (sec) 5.46047 (msec)
data/index/inhouse_v3_4_beit_2025_03_12/qg/ws/hkc_opt/sv-1->[index]/qg/local-1
append: Data loading time=1.807e-05 (sec) 0.01807 (msec)
# of objects=16
Index creation time=0.00449889 (sec) 4.49889 (msec)
data/index/inhouse_v3_4_beit_2025_03_12/qg/ws/hkc_opt/sv-2->[index]/qg/local-2
append: Data loading time=2.1837e-05 (sec) 0.021837 (msec)
# of objects=16
Index creation time=0.00438631 (sec) 4.38631 (msec)
.
.
.
.
.
qbg: loading the rotation...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
DEBUG: The first cluster size: 0
DEBUG: The inverted index file size: -1 bytes
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
DEBUG: The first cluster size: 0
DEBUG: The inverted index file size: -1 bytes
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
codebook index size=1
# of processed objects=1000000, time=12.6164 (s), vm size=96.7 G/96.74 G
# of processed objects=2000000, time=25.1842 (s), vm size=96.7 G/96.74 G
# of processed objects=3000000, time=37.7162 (s), vm size=96.7 G/96.74 G
# of processed objects=4000000, time=50.2343 (s), vm size=96.7 G/96.74 G
.
.
.
.
.
.
# of processed objects=113000000, time=23.6545 (m), vm size=96.7 G/96.74 G
cp: cannot stat '[index]/qg/ws/hkc_3c': No such file or directory
DEBUG: save: The Inverted index size=2
DEBUG: save: The entry size=113676655
DEBUG: Quantizer::save the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
NGTQ index is completed.
time=23.8782 (m)
vmsize=99.85 G
peak vmsize=191.58 G
saving...
NGTQ and NGTQBG indices are completed.
vmsize=15.57 G
peak vmsize=191.58 G
qbg build-qg -p 3 -v [index]
building the quantized graph...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
DEBUG: construct
DEBUG: vmsize=102.63 G
DEBUG: peak vmsize=102.63 G
DEBUG: extractInvertedIndexObject: The Inverted index size=2
DEBUG: extractInvertedIndexObject: The entry size=113676655
DEBUG: extractInvertedIndexObject: The last ID=2408073
DEBUG: inverted index object size=2408074
DEBUG: vmsize=105.65 G
DEBUG: peak vmsize=105.65 G
DEBUG: erased
DEBUG: vmsize=23.92 G
DEBUG: peak vmsize=105.65 G
DEBUG: graph repository size=113676656
DEBUG: vmsize=23.92 G
DEBUG: peak vmsize=105.65 G
DEBUG: resized
DEBUG: vmsize=28.15 G
DEBUG: peak vmsize=105.65 G
# of processed objects=1/113676655(0%)
vmsize=28.15 G
peak vmsize=105.65 G
Phase-3 command failed again due to Segmentation fault (core dumped), not sure why that happened.
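As a sanity check on the DEBUG output above, the ivt file size can be related to the entry count. The sketch below only does arithmetic on numbers printed in this thread (plus LocalDivisionNo and LocalIDByteSize from qg/prf). The per-entry layout in the comment is an assumption made for illustration, not something confirmed from the NGT source.

```python
IVT_FILE_BYTES = 87_758_377_680   # "The inverted index file size" in the DEBUG output
ENTRIES = 113_676_655             # "The first cluster size" in the DEBUG output
LOCAL_DIVISION_NO = 384           # LocalDivisionNo in qg/prf (-N 384)
LOCAL_ID_BYTES = 2                # LocalIDByteSize in qg/prf

bytes_per_entry, leftover = divmod(IVT_FILE_BYTES, ENTRIES)
print(bytes_per_entry, leftover)  # 772 20

# Assumption: each entry holds one local ID per subvector plus a 4-byte object ID.
assumed_entry_bytes = LOCAL_DIVISION_NO * LOCAL_ID_BYTES + 4
print(assumed_entry_bytes == bytes_per_entry)  # True
```

If that reading is right, the file on disk is large enough to hold all 113,676,655 entries, which would point at how the entries are read back in phase 3 rather than at a truncated ivt file.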
Thank you for the information. Since we couldn’t reproduce the issue on our side, it is difficult to fix it directly. However, we found a few things that looked suspicious and have made some changes. Please try running it again.
https://github.com/yahoojapan/NGT/tree/massive_qg
qbg build-qg -p 1 -v [index]
optimizing...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
optimize: # of objects=1000
optimize: updated # of objects=1000
optimize: # of clusters=16:0
the codebook index file is missing. this index must be QG.
optimize: # of vectors=1000/1000, # of matrices=1
optimize: time=139.365 (ms)
qbg build-qg -p 2 -v [index]
building the inverted index...
GraphOptimizer::execute: vm size=6.8 G:10.83 G
delete all of objects
vm size=6.8 G:10.83 G
GraphOptimizer: adjusting outgoing and incoming edges...
Optimizer::execute: Extract the graph data.
GraphReconstructor::reconstructGraph:10:120:9223372036854775807
GraphReconstructor: Warning. The edges are too few. 0:10 for 1
# of the nodes edges of which are in short = 1
Reconstruction time=5.287e-06:1.92e-07:1.81e-07
Optimizer::execute: Graph reconstruction time=2.7792e-05 (sec)
GraphOptimizer: redusing shortcut edges...
GraphReconstructor::adjustPaths: graph preparing time=0.000245 (ms)
GraphReconstructor::adjustPaths: extracting removed edge candidates time=4.20532 (ms)
removeCandidateCount=0
Optimizer::execute: Path adjustment time=0.00423766 (sec)
[index]/qg/ws/hkc_opt/sv-0->[index]/qg/local-0
append: Data loading time=3.3967e-05 (sec) 0.033967 (msec)
# of objects=16
Index creation time=0.00522924 (sec) 5.22924 (msec)
[index]/qg/ws/hkc_opt/sv-1->[index]/qg/local-1
append: Data loading time=1.7613e-05 (sec) 0.017613 (msec)
# of objects=16
Index creation time=0.00445954 (sec) 4.45954 (msec)
[index]/qg/ws/hkc_opt/sv-2->[index]/qg/local-2
append: Data loading time=1.5546e-05 (sec) 0.015546 (msec)
# of objects=16
.
.
.
.
.
qbg: loading the rotation...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 0
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
codebook index size=1
# of processed objects=1000000, time=12.1449 (s), vm size=96.7 G/96.74 G
# of processed objects=2000000, time=24.2939 (s), vm size=96.7 G/96.74 G
# of processed objects=3000000, time=36.4435 (s), vm size=96.7 G/96.74 G
# of processed objects=4000000, time=48.6213 (s), vm size=96.7 G/96.74 G
.
.
.
.
.
.
# of processed objects=113000000, time=22.898 (m), vm size=96.7 G/96.74 G
DEBUG: save: The Inverted index size=2
DEBUG: save: The entry size=113676655
DEBUG: Quantizer::save the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
DEBUG: localDivisionNo: 384
DEBUG: The last ID: 113676655
DEBUG: The entry size: 113676655
DEBUG: The last ID from memory: 113676655
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
DEBUG: localDivisionNo: 384
DEBUG: The last ID: 113676655
DEBUG: open. The entry size: 113676655
DEBUG: open. The last ID from memory: 0
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
NGTQ index is completed.
time=23.1253 (m)
vmsize=99.84 G
peak vmsize=191.58 G
saving...
NGTQ and NGTQBG indices are completed.
vmsize=15.57 G
peak vmsize=191.58 G
qbg build-qg -p 3 -v [index]
building the quantized graph...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
DEBUG: localDivisionNo: 384
DEBUG: The last ID: 113676655
DEBUG: open. The entry size: 113676655
DEBUG: open. The last ID from memory: 0
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
DEBUG: construct
DEBUG: vmsize=102.57 G
DEBUG: peak vmsize=102.57 G
DEBUG: extractInvertedIndexObject: The Inverted index size=2
DEBUG: extractInvertedIndexObject: The entry size=113676655
DEBUG: extractInvertedIndexObject: The last ID=2408073
DEBUG: inverted index object size=2408074
DEBUG: vmsize=105.59 G
DEBUG: peak vmsize=105.59 G
DEBUG: erased
DEBUG: vmsize=23.85 G
DEBUG: peak vmsize=105.59 G
DEBUG: graph repository size=113676656
DEBUG: vmsize=23.85 G
DEBUG: peak vmsize=105.59 G
DEBUG: resized
DEBUG: vmsize=28.09 G
DEBUG: peak vmsize=105.59 G
# of processed objects=1/113676655(0%)
vmsize=28.09 G
peak vmsize=105.59 G
Phase-3 command failed again due to Segmentation fault (core dumped), not sure why that happened.
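The mismatch visible in the phase-3 output can be detected mechanically before waiting for a crash. The sketch below assumes that the "inverted index object size" and "graph repository size" lines are expected to report the same value, as suggested earlier in this thread where 113676656 is named as the expected value; the function names are illustrative.

```python
import re

PATTERNS = {
    "inverted_index_objects": re.compile(r"inverted index object size=(\d+)"),
    "graph_repository": re.compile(r"graph repository size=(\d+)"),
}

def sizes_from_log(lines):
    """Collect the last reported value of each size found in the log lines."""
    found = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                found[name] = int(match.group(1))
    return found

def sizes_agree(lines):
    """True only if both sizes were reported and are equal (assumed invariant)."""
    sizes = sizes_from_log(lines)
    return len(sizes) == 2 and sizes["inverted_index_objects"] == sizes["graph_repository"]

# Lines copied from the failing phase-3 run above.
failing_run = [
    "DEBUG: inverted index object size=2408074",
    "DEBUG: graph repository size=113676656",
]
print(sizes_from_log(failing_run), sizes_agree(failing_run))
```

For the failing run, the two sizes differ (2408074 versus 113676656), so the check returns False.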
Could you try creating a random numpy array of shape (100000000, 768), inserting it into an NGT index, and building the quantized graph index for it (following the steps above), to check whether the same error occurs on your end? I am still getting the same error on my end.
Unfortunately, we can’t prepare a server with such a large memory size at this time. Since I assume this issue is not related to the number of dimensions, I tested it with 100M low-dimensional vectors, but I couldn't reproduce the problem. Anyway, we've narrowed the issue down to phase 3. I've fixed the part that seemed suspicious. Could you run phase 3 again?
- At long last, the quantized index was built successfully; phase 3 completed with the logs below,
building the quantized graph...
DEBUG: Quantizer::open the inverted index [index]/qg/ivt
DEBUG: The inveverted index data size: 2
DEBUG: The first cluster size: 113676655
DEBUG: The inverted index file size: 87758377680 bytes
DEBUG: localDivisionNo: 384
DEBUG: The last ID: 113676655
DEBUG: open. The entry size: 113676655
DEBUG: open. The last ID from memory: 113676655
loading the rotation...
QuantizationCodebook::buildIndex
QuantizationCodebook::buildIndex # of the centroids=1
DEBUG: construct
DEBUG: vmsize=102.63 G
DEBUG: peak vmsize=102.63 G
DEBUG: extractInvertedIndexObject: The Inverted index size=2
DEBUG: extractInvertedIndexObject: The entry size=113676655
DEBUG: extractInvertedIndexObject: The last ID=113676655
DEBUG: inverted index object size=113676656
DEBUG: vmsize=199.13 G
DEBUG: peak vmsize=199.13 G
DEBUG: erased
DEBUG: vmsize=117.4 G
DEBUG: peak vmsize=199.13 G
DEBUG: graph repository size=113676656
DEBUG: vmsize=117.4 G
DEBUG: peak vmsize=199.13 G
DEBUG: resized
DEBUG: vmsize=121.64 G
DEBUG: peak vmsize=199.13 G
# of processed objects=1/113676655(0%)
vmsize=121.64 G
peak vmsize=199.13 G
# of processed objects=2/113676655(0%)
vmsize=121.64 G
peak vmsize=199.13 G
# of processed objects=3/113676655(0%)
vmsize=121.64 G
peak vmsize=199.13 G
# of processed objects=4/113676655(0%)
vmsize=121.64 G
peak vmsize=199.13 G
.
.
.
# of processed objects=113500000/113676655(99%)
vmsize=455.51 G
peak vmsize=455.51 G
# of processed objects=113600000/113676655(99%)
vmsize=455.81 G
peak vmsize=455.81 G
Quantized graph is completed.
vmsize=358.47 G
peak vmsize=456.03 G
- The corresponding qbg search-qg also worked fine, but a few observations,
The disk-size for the complete index has reached 1.1TB (the ngt index was around 326GB)
13G [index]/grp
326G [index]/obj
4.0K [index]/prf
741G [index]/qg
4.0K [index]/robj
11G [index]/tre
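The 1.1TB figure follows directly from the du listing above. A quick sketch of the arithmetic, using only the sizes shown (the two 4.0K files are ignored as negligible):

```python
# Sizes in GB as reported by du for the index directory (4.0K files omitted).
sizes_gb = {"grp": 13, "obj": 326, "qg": 741, "tre": 11}

total_gb = sum(sizes_gb.values())
print(f"total: {total_gb} GB (~{total_gb / 1000:.1f} TB)")
print(f"qg / obj: {sizes_gb['qg'] / sizes_gb['obj']:.2f}")
```

The qg directory alone is a little over twice the size of the raw object file, which accounts for most of the growth.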
Also, from the above Grafana dashboards, it is evident that memory usage reaches around 94% (~800GB) while searching, and that the QG index takes a long time to load into memory. After one search query is executed, all the used memory goes into a cache, but subsequent searches still try to load cached memory into actual RAM.
Can anything be done about this, or is it expected?
Could you also explain what exactly caused the earlier observations, what code changes were made overall, and how they affected building the QG index for such a large embedding set?
Thank you for the help in figuring out the issue and debugging.
This time, the QG index involves an extremely large dataset, which seems to have caused read/write issues due to the large I/O size to the disk. It appears to depend on the server and file system, as the issue did not occur on our in-house server.
but subsequent searches still try to load cached memory into actual RAM.
How are you running the subsequent searches? Are you specifying multiple queries in a single execution of the ngt command? Also, could you let me know where in your Grafana dashboards the situation you mentioned can be observed?
Here is some additional information.
qbg create-qg -d 768 -D C -E 10 -S 40 -i t -o f -p 32 -N 384 -c 16 -C sqsu8 -B 2 -b 200 -M l -L s -e 0.1 -v [index]
Most of the arguments in the above command are not used, so the following arguments should be sufficient.
qbg create-qg -N 384 [index]
Instead of the above, the following settings can help reduce memory usage, and I don’t think it will affect search accuracy much.
qbg create-qg -N 256 [index]
Also, by adding the parameter as shown below, the graph size can be reduced.
ngt create -d 768 -D c -o h [index]
I tried re-running the search query (qbg search-qg index query.tsv), and did not come across that cached memory issue.
Initially both RAM (in-memory) and cached memory increase; then all of the cached memory moves into in-memory RAM, which reaches 94% (approx. 800GB) for a single search.
The memory usage is higher than expected, but QG uses significantly more memory than graph and QBG.
This fix has been released as v2.3.15.