Missing schema patching option when migrating Weaviate
How to reproduce this bug?
After migrating from Weaviate 1.19 to Weaviate 1.27 (together with the newer client), vectors created in Weaviate 1.19 no longer work with the newer clients, while vectors created in 1.27 work fine. It appears to us that Weaviate didn't handle the schema migration as expected, and we couldn't find an approach that allows users to patch the schema manually.
Here's our test case to reproduce the issue.
- Deploy with weaviate-helm for Weaviate 1.19 (more info in the Supporting information section)
- Import vectors (denoted as Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node)
- helm upgrade to Weaviate 1.27.27 (You may need to delete the existing weaviate StatefulSet as the pod label changes in later versions of weaviate-helm)
- Import vectors using the same data (denoted as Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node)
- Issue a query for each data source respectively.
Get the following error.
Query call with protocol GRPC search failed with message extract target vectors: class Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node does not have named vector default configured. Available named vectors map[].
We checked the schema and found that the legacy classes still differ from the new ones. We retrieved each schema via curl "http://localhost:8080/v1/schema/Vector_index_xxxxxxxx_Node" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY".
For example, schema for legacy vectors (i.e. created in Weaviate 1.19)
{
"class": "Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"properties": [
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "text",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Tue Oct 28 06:21:09 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "document_id",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Tue Oct 28 06:21:09 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "dataset_id",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Tue Oct 28 06:21:09 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "doc_id",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Tue Oct 28 06:21:09 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "doc_hash",
"tokenization": "word"
}
],
"replicationConfig": {
"asyncEnabled": false,
"factor": 1
},
"shardingConfig": {
"actualCount": 1,
"actualVirtualCount": 128,
"desiredCount": 1,
"desiredVirtualCount": 128,
"function": "murmur3",
"key": "_id",
"strategy": "hash",
"virtualPerPhysical": 128
},
"vectorIndexConfig": {
"bq": {
"enabled": false
},
"cleanupIntervalSeconds": 300,
"distance": "cosine",
"dynamicEfFactor": 8,
"dynamicEfMax": 500,
"dynamicEfMin": 100,
"ef": -1,
"efConstruction": 128,
"filterStrategy": "sweeping",
"flatSearchCutoff": 40000,
"maxConnections": 64,
"pq": {
"bitCompression": false,
"centroids": 256,
"enabled": false,
"encoder": {
"distribution": "log-normal",
"type": "kmeans"
},
"segments": 0,
"trainingLimit": 100000
},
"skip": false,
"sq": {
"enabled": false,
"rescoreLimit": 20,
"trainingLimit": 100000
},
"vectorCacheMaxObjects": 1000000000000
},
"vectorIndexType": "hnsw",
"vectorizer": "none"
}
Schema for vectors created (with the same data) in Weaviate 1.27.27
{
"class": "Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"multiTenancyConfig": {
"autoTenantActivation": false,
"autoTenantCreation": false,
"enabled": false
},
"properties": [
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "text",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "document_id",
"tokenization": "word"
},
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "doc_id",
"tokenization": "word"
},
{
"dataType": [
"int"
],
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": false,
"name": "chunk_index"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Mon Nov 3 06:12:39 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": true,
"name": "doc_hash",
"tokenization": "word"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Mon Nov 3 06:12:39 2025",
"indexFilterable": true,
"indexRangeFilters": false,
"indexSearchable": false,
"name": "dataset_id"
}
],
"replicationConfig": {
"asyncEnabled": false,
"deletionStrategy": "NoAutomatedResolution",
"factor": 1
},
"shardingConfig": {
"actualCount": 1,
"actualVirtualCount": 128,
"desiredCount": 1,
"desiredVirtualCount": 128,
"function": "murmur3",
"key": "_id",
"strategy": "hash",
"virtualPerPhysical": 128
},
"vectorConfig": {
"default": {
"vectorIndexConfig": {
"bq": {
"enabled": false
},
"cleanupIntervalSeconds": 300,
"distance": "cosine",
"dynamicEfFactor": 8,
"dynamicEfMax": 500,
"dynamicEfMin": 100,
"ef": -1,
"efConstruction": 128,
"filterStrategy": "sweeping",
"flatSearchCutoff": 40000,
"maxConnections": 32,
"pq": {
"bitCompression": false,
"centroids": 256,
"enabled": false,
"encoder": {
"distribution": "log-normal",
"type": "kmeans"
},
"segments": 0,
"trainingLimit": 100000
},
"skip": false,
"sq": {
"enabled": false,
"rescoreLimit": 20,
"trainingLimit": 100000
},
"vectorCacheMaxObjects": 1000000000000
},
"vectorIndexType": "hnsw",
"vectorizer": {
"none": {}
}
}
}
}
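The structural difference between the two schemas above can also be surfaced programmatically. The following is a minimal sketch, not part of our actual tooling: the helper names top_level_key_diff and named_vectors are hypothetical, and the two dictionaries are trimmed-down stand-ins for the full schemas returned by the /v1/schema endpoint.

```python
from __future__ import annotations

from typing import Any


def top_level_key_diff(legacy: dict[str, Any], current: dict[str, Any]) -> dict[str, list[str]]:
    """Return the top-level keys that appear in only one of the two class schemas."""
    legacy_keys, current_keys = set(legacy), set(current)
    return {
        "only_in_legacy": sorted(legacy_keys - current_keys),
        "only_in_current": sorted(current_keys - legacy_keys),
    }


def named_vectors(schema: dict[str, Any]) -> list[str]:
    """List the named vectors a class schema declares (empty for the legacy layout)."""
    return sorted(schema.get("vectorConfig") or {})


# Trimmed-down stand-ins for the two schemas shown above (most fields omitted).
legacy_schema = {
    "class": "Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node",
    "vectorIndexConfig": {"distance": "cosine"},
    "vectorIndexType": "hnsw",
    "vectorizer": "none",
}
new_schema = {
    "class": "Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node",
    "multiTenancyConfig": {"enabled": False},
    "vectorConfig": {
        "default": {
            "vectorIndexConfig": {"distance": "cosine"},
            "vectorIndexType": "hnsw",
            "vectorizer": {"none": {}},
        }
    },
}

print(top_level_key_diff(legacy_schema, new_schema))
print(named_vectors(legacy_schema), named_vectors(new_schema))
```

An empty result from named_vectors for the legacy class matches the "Available named vectors map[]" part of the error message.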
The notable changes are:
- .vectorIndexConfig -> .vectorConfig.default.vectorIndexConfig
- .vectorIndexType -> .vectorConfig.default.vectorIndexType
- .vectorizer -> .vectorConfig.default.vectorizer. Note that the data type changes as well: "vectorizer": "none" -> "vectorizer": { "none": {}}
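To make the mapping concrete, here is a minimal sketch of how a legacy class schema could be rewritten into the named-vector layout. The function name to_named_vector_layout is hypothetical, and the sketch only illustrates the shape change listed above. It is not a working migration: as reported below, Weaviate rejected our rewritten schema with "vector config is immutable".

```python
from __future__ import annotations

import copy
from typing import Any

# Top-level keys that move under vectorConfig.<name> in the newer layout.
LEGACY_KEYS = ("vectorIndexConfig", "vectorIndexType", "vectorizer")


def to_named_vector_layout(legacy: dict[str, Any], name: str = "default") -> dict[str, Any]:
    """Return a copy of a legacy class schema with its vector settings moved under vectorConfig[name]."""
    schema = copy.deepcopy(legacy)
    if "vectorConfig" in schema:
        return schema  # already uses the named-vector layout
    vector_settings = {key: schema.pop(key) for key in LEGACY_KEYS if key in schema}
    vectorizer = vector_settings.get("vectorizer")
    if isinstance(vectorizer, str):
        # "vectorizer": "none" becomes "vectorizer": {"none": {}}
        vector_settings["vectorizer"] = {vectorizer: {}}
    schema["vectorConfig"] = {name: vector_settings}
    return schema


# Trimmed-down stand-in for the legacy schema (most fields omitted).
legacy_schema = {
    "class": "Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node",
    "vectorIndexConfig": {"distance": "cosine", "maxConnections": 64},
    "vectorIndexType": "hnsw",
    "vectorizer": "none",
}
print(to_named_vector_layout(legacy_schema))
```

The input dictionary is deep-copied, so the original schema document stays untouched and can be compared against the rewritten one.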
We attempted to update the schema through the REST API, but it doesn't work.
Unfortunately, we haven't found a way to update the schema. We modified the existing schema for Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node and tried to update it through
curl "http://localhost:8080/v1/schema/Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY" -X PUT -T ./Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node.json
{"error":[{"message":"vector config is immutable"}]}
We also tried re-importing the data, but that doesn't work either.
Manual migration
curl -X POST "http://localhost:8080/v1/backups/filesystem/pre-migration-backup/restore" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY" -d '{"id": "pre-migration-backup"}'
{"backend":"filesystem","classes":["Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node","Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node"],"id":"pre-migration-backup","path":"/tmp/backups/pre-migration-backup","status":"STARTED"}
curl -X POST "http://localhost:8080/v1/backups/filesystem/pre-migration-backup/restore" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY" -d '{"id": "pre-migration-backup"}'
{"backend":"filesystem","classes":["Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node","Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node"],"id":"pre-migration-backup","path":"/tmp/backups/pre-migration-backup","status":"STARTED"}
Logs from weaviate
kubectl logs -n $NAMESPACE weaviate-0
{"action":"try_restore","backend":"filesystem","backup_id":"pre-migration-backup","build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","level":"info","msg":"","time":"2025-11-03T07:18:49Z","took":3970104}
{"action":"restore","backup_id":"pre-migration-backup","build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","class":"Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node","level":"info","msg":"successfully restored","time":"2025-11-03T07:18:49Z"}
{"action":"restore","backup_id":"pre-migration-backup","build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","class":"Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node","level":"info","msg":"successfully restored","time":"2025-11-03T07:18:49Z"}
{"action":"restore","backup_id":"pre-migration-backup","build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","level":"info","msg":"backup restored successfully","time":"2025-11-03T07:18:49Z"}
{"build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","class":"Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node","level":"info","msg":"class already restored","time":"2025-11-03T07:18:51Z"}
{"build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","class":"Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node","level":"info","msg":"class already restored","time":"2025-11-03T07:18:51Z"}
{"action":"restore","backup_id":"pre-migration-backup","build_git_commit":"75351e8","build_go_version":"go1.24.3","build_image_tag":"v1.27.27","build_wv_version":"1.27.27","level":"error","msg":"coordinator: could not restore classes: [\"Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node\": class name Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node already exists \"Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node\": class name Vector_index_c13319be_a677_4c1e_b5a6_1fc438cc9444_Node already exists]","time":"2025-11-03T07:18:51Z"}
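Since Weaviate emits one JSON object per log line, the error entries can be separated from the informational ones with a few lines of code. This is a minimal sketch with a hypothetical helper name (error_entries), and the sample lines are abbreviated versions of the log output above rather than verbatim copies.

```python
import json
from typing import Iterable


def error_entries(lines: Iterable[str]) -> list:
    """Return the parsed log entries whose "level" field is "error"."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(entry, dict) and entry.get("level") == "error":
            entries.append(entry)
    return entries


# Abbreviated sample lines modelled on the log output above.
sample_lines = [
    '{"action":"restore","backup_id":"pre-migration-backup","level":"info","msg":"backup restored successfully"}',
    '{"action":"restore","backup_id":"pre-migration-backup","level":"error","msg":"coordinator: could not restore classes: class name already exists"}',
]

for entry in error_entries(sample_lines):
    print(entry["msg"])
```

In practice the lines would come from the output of kubectl logs instead of a hard-coded list.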
Response from the client
"Query call with protocol GRPC search failed with message extract target vectors: class Vector_index_20fe729e_4b4b_4857_aecf_16fdebd41405_Node does not have named vector default configured. Available named vectors map[].",
What is the expected behavior?
Weaviate automatically migrates the legacy schema to the new format.
What is the actual behavior?
The legacy schema wasn't migrated automatically, and a manual update doesn't work either. This breaks backward compatibility for other apps.
https://github.com/langgenius/dify/issues/27291
Supporting information
values.yaml before the upgrade (deployed with weaviate-helm 16.1.0)
image:
# registry where weaviate image is stored
registry: docker.io
# Tag of weaviate image to deploy
# Note: We strongly recommend you overwrite this value in your own values.yaml.
# Otherwise a mere upgrade of the chart could lead to an unexpected upgrade
# of weaviate. In accordance with Infra-as-code, you should pin this value
# down and only change it if you explicitly want to upgrade the Weaviate
# version.
tag: 1.19.1
repo: semitechnologies/weaviate
# Image pull policy: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
pullPolicy: IfNotPresent
pullSecrets: []
# overwrite command and args if you want to run specific startup scripts, for
# example setting the nofile limit
command: ["/bin/weaviate"]
args:
- "--host"
- "0.0.0.0"
- "--port"
- "8080"
- "--scheme"
- "http"
- "--config-file"
- "/weaviate-config/conf.yaml"
- --read-timeout=60s
- --write-timeout=60s
# below is an example that can be used to set an arbitrary nofile limit at
# startup:
#
# command:
# - "/bin/sh"
# args:
# - "-c"
# - "ulimit -n 65535 && /bin/weaviate --host 0.0.0.0 --port 8080 --scheme http --config-file /weaviate-config/conf.yaml"
# Scale replicas of Weaviate. Note that as of v1.8.0 dynamic scaling is limited
# to cases where no data is imported yet. Scaling down after importing data may
# break usability. Full dynamic scalability will be added in a future release.
replicas: 1
resources:
{}
# requests:
# cpu: '500m'
# memory: '300Mi'
# limits:
# cpu: '1000m'
# memory: '1Gi'
# Add a service account ot the Weaviate pods if you need Weaviate to have permissions to
# access kubernetes resources or cloud provider resources. For example for it to have
# access to a backup up bucket, or if you want to restrict Weaviate pod in any way.
# By default, use the default ServiceAccount
serviceAccountName:
# The Persistent Volume Claim settings for Weaviate. If there's a
# storage.fullnameOverride field set, then the default pvc will not be
# created, instead the one defined in fullnameOverride will be used
storage:
size: 1Gi
storageClassName: ""
# The service controls how weaviate is exposed to the outside world. If you
# don't want a public load balancer, you can also choose 'ClusterIP' to make
# weaviate only accessible within your cluster.
service:
name: weaviate
# type: LoadBalancer
type: ClusterIP
loadBalancerSourceRanges: []
# optionally set cluster IP if you want to set a static IP
clusterIP:
annotations: {}
# Adjust liveness, readiness and startup probes configuration
startupProbe:
# For kubernetes versions prior to 1.18 startupProbe is not supported thus can be disabled.
enabled: false
initialDelaySeconds: 300
periodSeconds: 60
failureThreshold: 50
successThreshold: 1
timeoutSeconds: 3
livenessProbe:
initialDelaySeconds: 900
periodSeconds: 10
failureThreshold: 30
successThreshold: 1
timeoutSeconds: 3
readinessProbe:
initialDelaySeconds: 3
periodSeconds: 10
failureThreshold: 3
successThreshold: 1
timeoutSeconds: 3
terminationGracePeriodSeconds: 600
# Weaviate Config
#
# The following settings allow you to customize Weaviate to your needs, for
# example set authentication and authorization options. See weaviate docs
# (https://www.weaviate.io/developers/weaviate/) for all
# configuration.
authentication:
anonymous_access:
enabled: false
# This configuration allows to add API keys to Weaviate. This configuration allows only
# plain text API Keys, if you want to store the API Keys in a Kubernetes secret you can
# configure the same configuration with ENV Vars. Read the `env` section below on what
# needs to be configured. If using ENV Vars over this make sure to comment out the whole
# `apikey` section (as it is by default). ENV Vars has priority over this config.
apikey:
enabled: true
# Any number of allowed API Keys as plain text
allowed_keys:
- "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
# You can either set a single user for all the listed Allowed API keys OR
# one user per API Key, i.e. length(apikey.allowed_keys) == length(apikey.users) OR
# length(apikey.users) == 1
# Only the first user-key pair will be used by `dify.api` and `dify-worker`
# NOTE: Make sure the lister Users are added to the Authorization as well.
users:
- [email protected]
oidc:
enabled: false
# issuer: ''
# username_claim: ''
# groups_claim: ''
# client_id: ''
authorization:
admin_list:
enabled: true
users:
# Examples
# - admin_user1
# - admin_user2
# - api-key-user-admin
- [email protected]
read_only_users:
# Examples
# - readonly_user1
# - readonly_user2
# - api-key-user-readOnly
query_defaults:
limit: 100
debug: false
# Insert any custom environment variables or envSecrets by putting the exact name
# and desired value into the settings below. Any env name passed will be automatically
# set for the statefulSet.
env:
CLUSTER_GOSSIP_BIND_PORT: 7000
CLUSTER_DATA_BIND_PORT: 7001
# The aggressiveness of the Go Garbage Collector. 100 is the default value.
GOGC: 100
# Expose metrics on port 2112 for Prometheus to scrape
PROMETHEUS_MONITORING_ENABLED: false
# Set a MEM limit for the Weaviate Pod so it can help you both increase GC-related
# performance as well as avoid GC-related out-of-memory (“OOM”) situations
# GOMEMLIMIT: 6GiB
# Maximum results Weaviate can query with/without pagination
# NOTE: Affects performance, do NOT set to a very high value.
# The default is 100K
QUERY_MAXIMUM_RESULTS: 100000
# whether to enable vector dimensions tracking metric
TRACK_VECTOR_DIMENSIONS: false
# whether to re-index/-compute the vector dimensions metric (needed if upgrading from weaviate < v1.16.0)
REINDEX_VECTOR_DIMENSIONS_AT_STARTUP: false
##########################
# API Keys with ENV Vars #
##########################
# If using ENV Vars to set up API Keys make sure to have `authentication.apikey` block commented out
# to avoid any future changes. ENV Vars has priority over the config above `authentication.apikey`.
# If using `authentication.apikey `the below ENV Vars will be used because they have priority,
# so comment them out to avoid any future changes.
# Enables API key authentication. If it is set to 'false' the AUTHENTICATION_APIKEY_ALLOWED_KEYS
# and AUTHENTICATION_APIKEY_USERS will not have any effect.
# AUTHENTICATION_APIKEY_ENABLED: 'true'
# List one or more keys, separated by commas. Each key corresponds to a specific user identity below.
# If you want to use a kubernetes secret for the API Keys comment out this Variable and use the one in `envSecrets` below
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: 'jane-secret-key,ian-secret-key' (plain text)
# List one or more user identities, separated by commas. You can have only one User for all the keys or one user per key.
# The User/s can be a simple name or an email, no matter if it exists or not.
# NOTE: Make sure to add the users to the authorization above overwise they will not be allowed to interact with Weaviate.
# AUTHENTICATION_APIKEY_USERS: '[email protected],ian-smith'
AUTHENTICATION_APIKEY_ENABLED: "true"
AUTHENTICATION_APIKEY_ALLOWED_KEYS: "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
AUTHENTICATION_APIKEY_USERS: "[email protected]"
AUTHORIZATION_ADMINLIST_ENABLED: "true"
AUTHORIZATION_ADMINLIST_USERS: "[email protected]"
envSecrets:
# create a Kubernetes secret with AUTHENTICATION_APIKEY_ALLOWED_KEYS key and its respective value
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: name-of-the-k8s-secret-containing-the-comma-separated-api-keys
# Configure backup providers
backups:
# The backup-filesystem module enables creation of the DB backups in
# the local filesystem
filesystem:
enabled: false
envconfig:
# Configure folder where backups should be saved
BACKUP_FILESYSTEM_PATH: /tmp/backups
s3:
enabled: false
# If one is using AWS EKS and has already configured K8s Service Account
# that holds the AWS credentials one can pass a name of that service account
# here using this setting.
# NOTE: the root `serviceAccountName` config has priority over this one, and
# if the root one is set this one will NOT overwrite it. This one is here for
# backwards compatibility.
serviceAccountName:
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_S3_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_S3_PATH: path/inside/bucket
# Optional setting. Defaults to AWS S3 (s3.amazonaws.com).
# Set this option if you have a MinIO storage configured in your environment
# and want to use it instead of the AWS S3.
# BACKUP_S3_ENDPOINT: custom.minio.endpoint.address
# Optional setting. Defaults to true.
# Set this option if you don't want to use SSL.
# BACKUP_S3_USE_SSL: true
# You can pass environment AWS settings here:
# Define the region
# AWS_REGION: eu-west-1
# For Weaviate to be able to create bucket objects it needs a user credentials to authenticate to AWS.
# The User must have permissions to read/create/delete bucket objects.
# You can pass the User credentials (access-key id and access-secret-key) in 2 ways:
# 1. by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AWS_ACCESS_KEY_ID: access-key-id (plain text)
# AWS_SECRET_ACCESS_KEY: secret-access-key (plain text)
# If one has already defined secrets with AWS credentials one can pass them using
# this setting:
envSecrets: {}
# AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-the-key-id
# AWS_SECRET_ACCESS_KEY: name-of-the-k8s-secret-containing-the-key
gcs:
enabled: false
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_GCS_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_GCS_PATH: path/inside/bucket
# You can pass environment Google settings here:
# Define the project
# GOOGLE_CLOUD_PROJECT: project-id
# For Weaviate to be able to create bucket objects it needs a ServiceAccount credentials to authenticate to GCP.
# The ServiceAccount must have permissions to read/create/delete bucket objects.
# You can pass the ServiceAccount credentials (as JSON) in 2 ways:
# 1. by setting the GOOGLE_APPLICATION_CREDENTIALS json as plain text in the `secrets` section below
# this chart will create a kubernetes secret for you with this key-values pairs
# 2. create a Kubernetes secret with GOOGLE_APPLICATION_CREDENTIALS key and its respective value
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: credentials-json-string (plain text)
# If one has already defined a secret with GOOGLE_APPLICATION_CREDENTIALS one can pass them using
# this setting:
envSecrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: name-of-the-k8s-secret-containing-the-key
azure:
enabled: false
envconfig:
# Configure container where backups should be saved, this setting is mandatory
BACKUP_AZURE_CONTAINER: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the container
# BACKUP_AZURE_PATH: path/inside/container
# For Weaviate to be able to create container objects it needs a user credentials to authenticate to Azure Storage.
# The User must have permissions to read/create/delete container objects.
# You can pass the User credentials (account-name id and account-key or connection-string) in 2 ways:
# 1. by setting the AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AZURE_STORAGE_ACCOUNT: account-name (plain text)
# AZURE_STORAGE_KEY: account-key (plain text)
# AZURE_STORAGE_CONNECTION_STRING: connection-string (plain text)
# If one has already defined secrets with Azure Storage credentials one can pass them using
# this setting:
envSecrets: {}
# AZURE_STORAGE_ACCOUNT: name-of-the-k8s-secret-containing-the-account-name
# AZURE_STORAGE_KEY: name-of-the-k8s-secret-containing-account-key
# AZURE_STORAGE_CONNECTION_STRING: name-of-the-k8s-secret-containing-connection-string
# modules are extensions to Weaviate, they can be used to support various
# ML-models, but also other features unrelated to model inference.
# An inference/vectorizer module is not required, you can also run without any
# modules and import your own vectors.
modules:
# by choosing the default vectorizer module, you can tell Weaviate to always
# use this module as the vectorizer if nothing else is specified. Can be
# overwritten on a per-class basis.
# set to text2vec-transformers if running with transformers instead
default_vectorizer_module: none
# It is also possible to configure authentication and authorization through a
# custom configmap The authorization and authentication values defined in
# values.yaml will be ignored when defining a custom config map.
custom_config_map:
enabled: false
name: "custom-config"
# Pass any annotations to Weaviate pods
annotations:
nodeSelector:
tolerations:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- weaviate
values.yaml after helm upgrade (with weaviate-helm 17.3.3)
image:
# registry where weaviate image is stored
registry: cr.weaviate.io
# Tag of weaviate image to deploy
# Note: We strongly recommend you overwrite this value in your own values.yaml.
# Otherwise a mere upgrade of the chart could lead to an unexpected upgrade
# of weaviate. In accordance with Infra-as-code, you should pin this value
# down and only change it if you explicitly want to upgrade the Weaviate
# version.
tag: 1.27.27
repo: semitechnologies/weaviate
# Image pull policy: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
pullPolicy: IfNotPresent
pullSecrets: []
# overwrite command and args if you want to run specific startup scripts, for
# example setting the nofile limit
command: ["/bin/weaviate"]
args:
- '--host'
- '0.0.0.0'
- '--port'
- '8080'
- '--scheme'
- 'http'
- '--config-file'
- '/weaviate-config/conf.yaml'
- --read-timeout=60s
- --write-timeout=60s
# below is an example that can be used to set an arbitrary nofile limit at
# startup:
#
# command:
# - "/bin/sh"
# args:
# - "-c"
# - "ulimit -n 65535 && /bin/weaviate --host 0.0.0.0 --port 8080 --scheme http --config-file /weaviate-config/conf.yaml"
# it is possible to change the sysctl's 'vm.max_map_count' using initContainer for Weaviate,
# the init Container runs before Weaviate Container and sets the value for the WHOLE node
# to the one provided below.
# it is possible to run additional initContainer before Weaviate is up and running. You can specify the
# containers as a list in `extraInitContainers`, exactly how they are defined in a kubernetes manifest:
# https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
initContainers:
sysctlInitContainer:
enabled: false
sysctlVmMaxMapCount: 524288
image:
registry: docker.io
repo: alpine
tag: latest
pullPolicy: IfNotPresent
ensureFileOwnershipContainer:
# This init container sets the file ownerships of /var/lib/weaviate directory to the ones set in
# containerSecurityContext.runAsUser and containerSecurityContext.fsGroup settings to ensure that Weaviate is able
# to start in unprivileged configuration.
# Enable this init container only if Weaviate was configured previously without security context
# and now containerSecurityContext is provided to run Weaviate container with non-root user.
# Please be sure to set at least containerSecurityContext.runAsUser and containerSecurityContext.fsGroup.
enabled: false
extraInitContainers: {}
# - image: some-image
# name: some-name
# Scale replicas of Weaviate. Note that as of v1.8.0 dynamic scaling is limited
# to cases where no data is imported yet. Scaling down after importing data may
# break usability. Full dynamic scalability will be added in a future release.
replicas: 1
# Define how pods will be created. Possible values: OrderedReady | Parallel
# OrderedReady - pods will be created one after another
# Parallel - all pods will be created at once
podManagementPolicy: Parallel
updateStrategy:
type: RollingUpdate
# This setting is only available in K8s v1.24 and higher.
# Setting maxUnavailable to 100% results in removing all of the pods
# and re-creating them in parallel all at once.
# rollingUpdate:
# maxUnavailable: 100%
resources: {}
# requests:
# cpu: '500m'
# memory: '300Mi'
# limits:
# cpu: '1000m'
# memory: '1Gi'
# security Context for the Weaviate Pods. The configurations are the same as setting them
# as described here: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext: {}
# Security context for the Weaviate container. Override overlapping settings made at the Pod level.
containerSecurityContext: {}
# runAsUser: 1000
# runAsGroup: 1000
# fsGroup: 1000
# fsGroupChangePolicy: "OnRootMismatch"
# runAsNonRoot: true
# allowPrivilegeEscalation: false
# privileged: false
# readOnlyRootFilesystem: true
# Add a service account to the Weaviate pods if you need Weaviate to have permissions to
# access kubernetes resources or cloud provider resources. For example for it to have
# access to a backup up bucket, or if you want to restrict Weaviate pod in any way.
# By default, use the default ServiceAccount
serviceAccountName:
# Kubernetes Cluster domain name, used for resolving intra-cluster requests, i.e
# between instances of weaviate.
# Note: The final '.' on the end of the hostname makes it a FQDN, and is required for
# DNS to resolve in all kubernetes environments.
# See https://github.com/weaviate/weaviate-helm/issues/175 for details.
clusterDomain: cluster.local.
# The Persistent Volume Claim settings for Weaviate. If there's a
# storage.fullnameOverride field set, then the default pvc will not be
# created, instead the one defined in fullnameOverride will be used
storage:
size: 1Gi
storageClassName: ""
# The service controls how weaviate is exposed to the outside world. If you
# don't want a public load balancer, you can also choose 'ClusterIP' to make
# weaviate only accessible within your cluster.
service:
name: weaviate
type: ClusterIP
loadBalancerSourceRanges: []
# optionally set cluster IP if you want to set a static IP
clusterIP:
annotations: {}
# The service controls how weaviate gRPC endpoint is exposed to the outside world.
# If you don't want a public load balancer, you can also choose 'ClusterIP' or `NodePort`
# to make weaviate gRPC port be only accessible within your cluster.
# This service is by default enabled but if you don't want it to be deployed in your
# environment then it can be disabled by setting enabled: false option.
grpcService:
enabled: false
name: weaviate-grpc
ports:
- name: grpc
protocol: TCP
port: 50051
# Target port is going to be the same for every port
type: ClusterIP
loadBalancerSourceRanges: []
# optionally set cluster IP if you want to set a static IP
clusterIP:
annotations: {}
# Adjust liveness, readiness and startup probes configuration
startupProbe:
# For kubernetes versions prior to 1.18 startupProbe is not supported thus can be disabled.
enabled: false
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/ready
port: 8080
initialDelaySeconds: 300
periodSeconds: 60
failureThreshold: 50
successThreshold: 1
timeoutSeconds: 3
livenessProbe:
livenessProbe:
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/live
port: 8080
initialDelaySeconds: 900
periodSeconds: 10
failureThreshold: 30
successThreshold: 1
timeoutSeconds: 3
readinessProbe:
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/ready
port: 8080
initialDelaySeconds: 3
periodSeconds: 10
failureThreshold: 3
successThreshold: 1
timeoutSeconds: 3
terminationGracePeriodSeconds: 600
# Weaviate Config
#
# The following settings allow you to customize Weaviate to your needs, for
# example set authentication and authorization options. See weaviate docs
# (https://www.weaviate.io/developers/weaviate/) for all
# configuration.
authentication:
anonymous_access:
enabled: false
# This configuration allows to add API keys to Weaviate. This configuration allows only
# plain text API Keys, if you want to store the API Keys in a Kubernetes secret you can
# configure the same configuration with ENV Vars. Read the `env` section below on what
# needs to be configured. If using ENV Vars over this make sure to comment out the whole
# `apikey` section (as it is by default). ENV Vars has priority over this config.
apikey:
enabled: true
# Any number of allowed API Keys as plain text
allowed_keys:
- "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
# You can either set a single user for all the listed Allowed API keys OR
# one user per API Key, i.e. length(apikey.allowed_keys) == length(apikey.users) OR
# length(apikey.users) == 1
# Only the first user-key pair will be used by `dify.api` and `dify-worker`
# NOTE: Make sure the listed Users are added to the Authorization as well.
users:
- [email protected]
oidc:
enabled: false
# issuer: ''
# username_claim: ''
# groups_claim: ''
# client_id: ''
authorization:
admin_list:
enabled: true
users:
# Examples
# - admin_user1
# - admin_user2
# - api-key-user-admin
- [email protected]
read_only_users:
# Examples
# - readonly_user1
# - readonly_user2
# - api-key-user-readOnly
query_defaults:
limit: 100
debug: false
# Insert any custom environment variables or envSecrets by putting the exact name
# and desired value into the settings below. Any env name passed will be automatically
# set for the statefulSet.
env:
CLUSTER_GOSSIP_BIND_PORT: 7000
CLUSTER_DATA_BIND_PORT: 7001
# Set RAFT cluster expected number of voter nodes at bootstrap.
# By default helm automatically sets this value based on the cluster size.
# RAFT_BOOTSTRAP_EXPECT: 1
# Set RAFT cluster bootstrap timeout (in seconds), default is 600 (seconds)
# which should be sufficient for most of the deployments.
RAFT_BOOTSTRAP_TIMEOUT: 600
# Set manually RAFT voter nodes.
# RAFT_JOIN value is automatically generated by "raft_configuration"
# template, but if someone wants to set this value manually then it can be done
# by setting RAFT_JOIN environment variable, example: RAFT_JOIN: "weaviate-0,weaviate-1"
# Please notice that in this case RAFT_BOOTSTRAP_EXPECT setting needs to be also adjusted manually
# to match the number of RAFT voters, so if there are 2 nodes set using RAFT_JOIN variable
# then RAFT_BOOTSTRAP_EXPECT needs to be equal 2 also.
# RAFT_JOIN: "weaviate-0"
# Set to true if voters nodes should handle only schema. With this setting enabled
# voter nodes will not accept any data, one needs to resize the cluster using replicas
# setting so that replicas > voters.
# RAFT_METADATA_ONLY_VOTERS: false
# RAFT_ENABLE_FQDN_RESOLVER setting changes the node name to node ip resolution to use DNS lookups
# instead of memberlist lookup. That means that when weaviate raft component wants to contact `weaviate-0`
# it's going to lookup the dns name `weaviate-0` instead of looking for the node-id in memberlist.
# This is particularly useful if running in an environment where you're using services (for example k8s)
# where the IP of the services is different from the actual node IP, but it proxies the connection to the node.
# RAFT_ENABLE_FQDN_RESOLVER: false
# RAFT_FQDN_RESOLVER_TLD setting acts in combination with RAFT_ENABLE_FQDN_RESOLVER and is appended
# in the format "[node-id].[tld]" when resolving a node-id to an ip.
# RAFT_FQDN_RESOLVER_TLD: "weaviate-0."
# The aggressiveness of the Go Garbage Collector. 100 is the default value.
GOGC: 100
# Expose metrics on port 2112 for Prometheus to scrape
PROMETHEUS_MONITORING_ENABLED: false
# Set a MEM limit for the Weaviate Pod so it can help you both increase GC-related
# performance as well as avoid GC-related out-of-memory (“OOM”) situations
# GOMEMLIMIT: 6GiB
# Maximum results Weaviate can query with/without pagination
# NOTE: Affects performance, do NOT set to a very high value.
# The default is 100K
QUERY_MAXIMUM_RESULTS: 100000
# whether to enable vector dimensions tracking metric
TRACK_VECTOR_DIMENSIONS: false
# whether to re-index/-compute the vector dimensions metric (needed if upgrading from weaviate < v1.16.0)
REINDEX_VECTOR_DIMENSIONS_AT_STARTUP: false
##########################
# API Keys with ENV Vars #
##########################
# If using ENV Vars to set up API Keys make sure to have `authentication.apikey` block commented out
# to avoid any future changes. ENV Vars has priority over the config above `authentication.apikey`.
# If using `authentication.apikey` the below ENV Vars will be used because they have priority,
# so comment them out to avoid any future changes. The same applies for the RBAC configuration
# under the authorization block.
# Enables API key authentication. If it is set to 'false' the AUTHENTICATION_APIKEY_ALLOWED_KEYS
# and AUTHENTICATION_APIKEY_USERS will not have any effect.
# AUTHENTICATION_APIKEY_ENABLED: 'true'
# List one or more keys, separated by commas. Each key corresponds to a specific user identity below.
# If you want to use a kubernetes secret for the API Keys comment out this Variable and use the one in `envSecrets` below
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: 'jane-secret-key,ian-secret-key' (plain text)
# List one or more user identities, separated by commas. You can have only one User for all the keys or one user per key.
# The User/s can be a simple name or an email, no matter if it exists or not.
# NOTE: Make sure to add the users to the authorization above otherwise they will not be allowed to interact with Weaviate.
# AUTHENTICATION_APIKEY_USERS: '[email protected],ian-smith'
# Enabling RBAC authorization. It is mutually exclusive with the AUTHORIZATION_ADMIN_LISTS variable. Either RBAC or the
# admin lists mechanism can be used.
# AUTHORIZATION_ENABLE_RBAC: "true"
# Users with admin's RBAC role. List one or more user identities, separated by commas, which will
# have the admin role assigned to. This role provides all permissions to the user, but it's required at least
# in one of the user for managing the cluster.
# AUTHORIZATION_ADMIN_USERS: "admin-user"
# Users with viewer's RBAC role. List one or more user identities, separated by commas, which will
# have the viewer role assigned to. This role allows read permissions in all different areas. Once assigned via
# config, it can't be revoked via API AuthZ calls.
# AUTHORIZATION_VIEWER_USERS: "viewer-user"
AUTHENTICATION_APIKEY_ENABLED: "true"
AUTHENTICATION_APIKEY_ALLOWED_KEYS: "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
AUTHENTICATION_APIKEY_USERS: "[email protected]"
AUTHORIZATION_ADMINLIST_ENABLED: "true"
AUTHORIZATION_ADMINLIST_USERS: "[email protected]"
envSecrets:
# create a Kubernetes secret with AUTHENTICATION_APIKEY_ALLOWED_KEYS key and its respective value
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: name-of-the-k8s-secret-containing-the-comma-separated-api-keys
# Configure offload providers
offload:
s3:
enabled: false
# If one is using AWS EKS and has already configured K8s Service Account
# that holds the AWS credentials one can pass a name of that service account
# here using this setting.
# NOTE: the root `serviceAccountName` config has priority over this one, and
# if the root one is set this one will NOT overwrite it. This one is here for
# backwards compatibility.
serviceAccountName:
envconfig:
# Configure bucket where data should be saved, this setting is mandatory
OFFLOAD_S3_BUCKET: weaviate-offload
# Optional setting. Defaults to AWS S3 (s3.amazonaws.com).
# Set this option if you have a MinIO storage configured in your environment
# and want to use it instead of the AWS S3.
# OFFLOAD_S3_ENDPOINT: custom.minio.endpoint.address
# Optional setting. Defaults to true.
# Set this option if you don't want to use SSL.
# OFFLOAD_S3_USE_SSL: true
# Optional setting. Defaults to false.
# Set this option if you want Weaviate to create
# the bucket used for offloading tenants. Otherwise,
# if set to false Weaviate expects the bucket to be
# already created with the OFFLOAD_S3_BUCKET name
# OFFLOAD_S3_BUCKET_AUTO_CREATE: true
# You can pass environment AWS settings here:
# Define the region
# AWS_REGION: eu-west-1
# For Weaviate to be able to create bucket objects it needs a user credentials to authenticate to AWS.
# The User must have permissions to read/create/delete bucket objects.
# You can pass the User credentials (access-key id and access-secret-key) in 2 ways:
# 1. by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AWS_ACCESS_KEY_ID: access-key-id (plain text)
# AWS_SECRET_ACCESS_KEY: secret-access-key (plain text)
# If one has already defined secrets with AWS credentials one can pass them using
# this setting:
envSecrets: {}
# AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-the-key-id
# AWS_SECRET_ACCESS_KEY: name-of-the-k8s-secret-containing-the-key
# Configure backup providers
backups:
# The backup-filesystem module enables creation of the DB backups in
# the local filesystem
filesystem:
enabled: true
envconfig:
# Configure folder where backups should be saved
BACKUP_FILESYSTEM_PATH: /tmp/backups
s3:
enabled: false
# If one is using AWS EKS and has already configured K8s Service Account
# that holds the AWS credentials one can pass a name of that service account
# here using this setting.
# NOTE: the root `serviceAccountName` config has priority over this one, and
# if the root one is set this one will NOT overwrite it. This one is here for
# backwards compatibility.
serviceAccountName:
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_S3_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_S3_PATH: path/inside/bucket
# Optional setting. Defaults to AWS S3 (s3.amazonaws.com).
# Set this option if you have a MinIO storage configured in your environment
# and want to use it instead of the AWS S3.
# BACKUP_S3_ENDPOINT: custom.minio.endpoint.address
# Optional setting. Defaults to true.
# Set this option if you don't want to use SSL.
# BACKUP_S3_USE_SSL: true
# You can pass environment AWS settings here:
# Define the region
# AWS_REGION: eu-west-1
# For Weaviate to be able to create bucket objects it needs a user credentials to authenticate to AWS.
# The User must have permissions to read/create/delete bucket objects.
# You can pass the User credentials (access-key id and access-secret-key) in 2 ways:
# 1. by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AWS_ACCESS_KEY_ID: access-key-id (plain text)
# AWS_SECRET_ACCESS_KEY: secret-access-key (plain text)
# If one has already defined secrets with AWS credentials one can pass them using
# this setting:
envSecrets: {}
# AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-the-key-id
# AWS_SECRET_ACCESS_KEY: name-of-the-k8s-secret-containing-the-key
gcs:
enabled: false
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_GCS_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_GCS_PATH: path/inside/bucket
# You can pass environment Google settings here:
# Define the project
# GOOGLE_CLOUD_PROJECT: project-id
# For Weaviate to be able to create bucket objects it needs a ServiceAccount credentials to authenticate to GCP.
# The ServiceAccount must have permissions to read/create/delete bucket objects.
# You can pass the ServiceAccount credentials (as JSON) in 2 ways:
# 1. by setting the GOOGLE_APPLICATION_CREDENTIALS json as plain text in the `secrets` section below
# this chart will create a kubernetes secret for you with this key-values pairs
# 2. create a Kubernetes secret with GOOGLE_APPLICATION_CREDENTIALS key and its respective value
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: credentials-json-string (plain text)
# If one has already defined a secret with GOOGLE_APPLICATION_CREDENTIALS one can pass them using
# this setting:
envSecrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: name-of-the-k8s-secret-containing-the-key
azure:
enabled: false
envconfig:
# Configure container where backups should be saved, this setting is mandatory
BACKUP_AZURE_CONTAINER: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the container
# BACKUP_AZURE_PATH: path/inside/container
# For Weaviate to be able to create container objects it needs a user credentials to authenticate to Azure Storage.
# The User must have permissions to read/create/delete container objects.
# You can pass the User credentials (account-name id and account-key or connection-string) in 2 ways:
# 1. by setting the AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AZURE_STORAGE_ACCOUNT: account-name (plain text)
# AZURE_STORAGE_KEY: account-key (plain text)
# AZURE_STORAGE_CONNECTION_STRING: connection-string (plain text)
# If one has already defined secrets with Azure Storage credentials one can pass them using
# this setting:
envSecrets: {}
# AZURE_STORAGE_ACCOUNT: name-of-the-k8s-secret-containing-the-account-name
# AZURE_STORAGE_KEY: name-of-the-k8s-secret-containing-account-key
# AZURE_STORAGE_CONNECTION_STRING: name-of-the-k8s-secret-containing-connection-string
# modules are extensions to Weaviate, they can be used to support various
# ML-models, but also other features unrelated to model inference.
# An inference/vectorizer module is not required, you can also run without any
# modules and import your own vectors.
modules:
# by choosing the default vectorizer module, you can tell Weaviate to always
# use this module as the vectorizer if nothing else is specified. Can be
# overwritten on a per-class basis.
# set to text2vec-transformers if running with transformers instead
default_vectorizer_module: none
# It is also possible to configure authentication and authorization through a
# custom configmap. The authorization and authentication values defined in
# values.yaml will be ignored when defining a custom config map.
custom_config_map:
enabled: false
name: 'custom-config'
# Pass any annotations to Weaviate pods
annotations:
extraVolumeMounts:
extraVolumes:
nodeSelector:
tolerations:
hostAliases:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- weaviate
## Optionally specify priorityClass name for the pod
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#pod-priority
##
priorityClassName: ""
globalPriorityClassName: ""
Server Version
1.27.27
Weaviate Setup
Single Node
Nodes count
1
Code of Conduct
- [x] I have read and agree to the Weaviate's Contributor Guide and Code of Conduct
👋 Thanks for opening this issue!
I found a similar issue in our tracker that might address your problem:
This open issue reports the exact error message "class ... does not have named vector default configured" during search after migration, discussing issues with the named vectors in schema after upgrades. It matches the scenario of vectors created under an older version not being recognized with newer version clients.
Please check if this is the same issue. If so, consider adding your input there or closing this one. Thanks for helping us keep the issue tracker organized! 🚀
Powered by Weaviate
That doesn't look like a match: this issue is about the lack of a way to handle schema migration.
hi @BorisPolonsky !!
Thanks for reporting.
We do not recommend skipping versions when upgrading: https://docs.weaviate.io/deploy/migration#upgrades
Can you confirm you see this same issue if upgrading, for example, 1.19.latest -> 1.20.latest -> 1.21.latest and so on?
Also, for such a big jump, it is worth considering migrating your data: https://docs.weaviate.io/weaviate/manage-collections/migrate
Thanks!
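For reference, the stepwise path suggested above can be enumerated with a small helper. This is only a sketch: `upgrade_path` is a hypothetical name, it assumes plain `major.minor[.patch]` version strings within one major version, and it lists minor release lines only. The latest patch release of each line still has to be looked up separately.

```python
def upgrade_path(start: str, target: str) -> list[str]:
    """Return every minor release line after `start` up to and including `target`."""
    start_major, start_minor = (int(part) for part in start.split(".")[:2])
    target_major, target_minor = (int(part) for part in target.split(".")[:2])
    if start_major != target_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if target_minor < start_minor:
        raise ValueError("target must not be older than start")
    return [f"{start_major}.{minor}" for minor in range(start_minor + 1, target_minor + 1)]


if __name__ == "__main__":
    # The versions from this report: 1.19 -> 1.27.27
    print(" -> ".join(upgrade_path("1.19", "1.27.27")))
    # prints: 1.20 -> 1.21 -> 1.22 -> 1.23 -> 1.24 -> 1.25 -> 1.26 -> 1.27
```

Each entry would correspond to one `helm upgrade` with the image tag pinned to the latest patch of that minor line.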
We've verified that:
- The schema stays the same from 1.19 through 1.23, so stepping through each minor version between 1.19 and 1.23 makes no difference for this issue.
- No automatic schema update was observed when upgrading from Weaviate 1.23 to 1.24.
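A check like this can be made repeatable by diffing two schema documents, for example the JSON returned by `/v1/schema/<class>` before and after an upgrade. The following is a minimal sketch rather than the exact procedure we used: `diff_schema` is a hypothetical helper, the sample dictionaries are made up for illustration and are not real Weaviate output, and the `class` key is ignored because the class names differ by design.

```python
import json
from typing import Any


def diff_schema(
    legacy: Any,
    current: Any,
    path: str = "",
    ignore: frozenset = frozenset({"class"}),
) -> list[str]:
    """Return one line per dotted path at which two schema documents differ."""
    if isinstance(legacy, dict) and isinstance(current, dict):
        diffs: list[str] = []
        for key in sorted(set(legacy) | set(current)):
            child = f"{path}.{key}" if path else key
            if child in ignore:
                continue  # e.g. the class name, which is expected to differ
            if key not in legacy:
                diffs.append(f"{child}: only in current")
            elif key not in current:
                diffs.append(f"{child}: only in legacy")
            else:
                diffs.extend(diff_schema(legacy[key], current[key], child, ignore))
        return diffs
    if legacy != current:
        return [f"{path}: {json.dumps(legacy)} != {json.dumps(current)}"]
    return []


if __name__ == "__main__":
    # Made-up documents standing in for two saved schema responses.
    legacy = {"class": "A", "x": 1, "nested": {"a": 1}}
    current = {"class": "B", "x": 1, "nested": {"a": 2}, "extra": True}
    for line in diff_schema(legacy, current):
        print(line)
    # prints:
    # extra: only in current
    # nested.a: 1 != 2
```

An empty result for the same class before and after an upgrade would indicate that no schema migration took place.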
Test cases
Deploy weaviate 1.23.7 and create vectors
image:
# registry where weaviate image is stored
registry: cr.weaviate.io
# Tag of weaviate image to deploy
# Note: We strongly recommend you overwrite this value in your own values.yaml.
# Otherwise a mere upgrade of the chart could lead to an unexpected upgrade
# of weaviate. In accordance with Infra-as-code, you should pin this value
# down and only change it if you explicitly want to upgrade the Weaviate
# version.
tag: 1.23.7
# tag: 1.24.8
repo: semitechnologies/weaviate
# Image pull policy: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
pullPolicy: IfNotPresent
pullSecrets: []
# overwrite command and args if you want to run specific startup scripts, for
# example setting the nofile limit
command: ["/bin/weaviate"]
args:
- '--host'
- '0.0.0.0'
- '--port'
- '8080'
- '--scheme'
- 'http'
- '--config-file'
- '/weaviate-config/conf.yaml'
- --read-timeout=60s
- --write-timeout=60s
# below is an example that can be used to set an arbitrary nofile limit at
# startup:
#
# command:
# - "/bin/sh"
# args:
# - "-c"
# - "ulimit -n 65535 && /bin/weaviate --host 0.0.0.0 --port 8080 --scheme http --config-file /weaviate-config/conf.yaml"
# it is possible to change the sysctl's 'vm.max_map_count' using initContainer for Weaviate,
# the init Container runs before Weaviate Container and sets the value for the WHOLE node
# to the one provided below.
# it is possible to run additional initContainer before Weaviate is up and running. You can specify the
# containers as a list in `extraInitContainers`, exactly how they are defined in a kubernetes manifest:
# https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
initContainers:
sysctlInitContainer:
enabled: false
sysctlVmMaxMapCount: 524288
image:
registry: docker.io
repo: alpine
tag: latest
pullPolicy: IfNotPresent
ensureFileOwnershipContainer:
# This init container sets the file ownerships of /var/lib/weaviate directory to the ones set in
# containerSecurityContext.runAsUser and containerSecurityContext.fsGroup settings to ensure that Weaviate is able
# to start in unprivileged configuration.
# Enable this init container only if Weaviate was configured previously without security context
# and now containerSecurityContext is provided to run Weaviate container with non-root user.
# Please be sure to set at least containerSecurityContext.runAsUser and containerSecurityContext.fsGroup.
enabled: false
extraInitContainers: {}
# - image: some-image
# name: some-name
# Scale replicas of Weaviate. Note that as of v1.8.0 dynamic scaling is limited
# to cases where no data is imported yet. Scaling down after importing data may
# break usability. Full dynamic scalability will be added in a future release.
replicas: 1
# Define how pods will be created. Possible values: OrderedReady | Parallel
# OrderedReady - pods will be created one after another
# Parallel - all pods will be created at once
podManagementPolicy: Parallel
updateStrategy:
type: RollingUpdate
# This setting is only available in K8s v1.24 and higher.
# Setting maxUnavailable to 100% results in removing all of the pods
# and re-creating them in parallel all at once.
# rollingUpdate:
# maxUnavailable: 100%
resources: {}
# requests:
# cpu: '500m'
# memory: '300Mi'
# limits:
# cpu: '1000m'
# memory: '1Gi'
# security Context for the Weaviate Pods. The configurations are the same as setting them
# as described here: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext: {}
# Security context for the Weaviate container. Override overlapping settings made at the Pod level.
containerSecurityContext: {}
# runAsUser: 1000
# runAsGroup: 1000
# fsGroup: 1000
# fsGroupChangePolicy: "OnRootMismatch"
# runAsNonRoot: true
# allowPrivilegeEscalation: false
# privileged: false
# readOnlyRootFilesystem: true
# Add a service account to the Weaviate pods if you need Weaviate to have permissions to
# access kubernetes resources or cloud provider resources. For example for it to have
# access to a backup up bucket, or if you want to restrict Weaviate pod in any way.
# By default, use the default ServiceAccount
serviceAccountName:
# Kubernetes Cluster domain name, used for resolving intra-cluster requests, i.e
# between instances of weaviate.
# Note: The final '.' on the end of the hostname makes it a FQDN, and is required for
# DNS to resolve in all kubernetes environments.
# See https://github.com/weaviate/weaviate-helm/issues/175 for details.
clusterDomain: cluster.local.
# The Persistent Volume Claim settings for Weaviate. If there's a
# storage.fullnameOverride field set, then the default pvc will not be
# created, instead the one defined in fullnameOverride will be used
storage:
size: 1Gi
storageClassName: ""
# The service controls how weaviate is exposed to the outside world. If you
# don't want a public load balancer, you can also choose 'ClusterIP' to make
# weaviate only accessible within your cluster.
service:
name: weaviate
type: ClusterIP
loadBalancerSourceRanges: []
# optionally set cluster IP if you want to set a static IP
clusterIP:
annotations: {}
# The service controls how weaviate gRPC endpoint is exposed to the outside world.
# If you don't want a public load balancer, you can also choose 'ClusterIP' or `NodePort`
# to make weaviate gRPC port be only accessible within your cluster.
# This service is by default enabled but if you don't want it to be deployed in your
# environment then it can be disabled by setting enabled: false option.
grpcService:
enabled: false
name: weaviate-grpc
ports:
- name: grpc
protocol: TCP
port: 50051
# Target port is going to be the same for every port
type: ClusterIP
loadBalancerSourceRanges: []
# optionally set cluster IP if you want to set a static IP
clusterIP:
annotations: {}
# Adjust liveness, readiness and startup probes configuration
startupProbe:
# For kubernetes versions prior to 1.18 startupProbe is not supported thus can be disabled.
enabled: false
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/ready
port: 8080
initialDelaySeconds: 300
periodSeconds: 60
failureThreshold: 50
successThreshold: 1
timeoutSeconds: 3
livenessProbe:
livenessProbe:
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/live
port: 8080
initialDelaySeconds: 900
periodSeconds: 10
failureThreshold: 30
successThreshold: 1
timeoutSeconds: 3
readinessProbe:
probeType: httpGet
probe:
httpGet:
path: /v1/.well-known/ready
port: 8080
initialDelaySeconds: 3
periodSeconds: 10
failureThreshold: 3
successThreshold: 1
timeoutSeconds: 3
terminationGracePeriodSeconds: 600
# Weaviate Config
#
# The following settings allow you to customize Weaviate to your needs, for
# example set authentication and authorization options. See weaviate docs
# (https://www.weaviate.io/developers/weaviate/) for all
# configuration.
authentication:
anonymous_access:
enabled: false
# This configuration allows to add API keys to Weaviate. This configuration allows only
# plain text API Keys, if you want to store the API Keys in a Kubernetes secret you can
# configure the same configuration with ENV Vars. Read the `env` section below on what
# needs to be configured. If using ENV Vars over this make sure to comment out the whole
# `apikey` section (as it is by default). ENV Vars has priority over this config.
apikey:
enabled: true
# Any number of allowed API Keys as plain text
allowed_keys:
- "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
# You can either set a single user for all the listed Allowed API keys OR
# one user per API Key, i.e. length(apikey.allowed_keys) == length(apikey.users) OR
# length(apikey.users) == 1
# Only the first user-key pair will be used by `dify.api` and `dify-worker`
# NOTE: Make sure the listed Users are added to the Authorization as well.
users:
- [email protected]
oidc:
enabled: false
# issuer: ''
# username_claim: ''
# groups_claim: ''
# client_id: ''
authorization:
admin_list:
enabled: true
users:
# Examples
# - admin_user1
# - admin_user2
# - api-key-user-admin
- [email protected]
read_only_users:
# Examples
# - readonly_user1
# - readonly_user2
# - api-key-user-readOnly
query_defaults:
limit: 100
debug: false
# Insert any custom environment variables or envSecrets by putting the exact name
# and desired value into the settings below. Any env name passed will be automatically
# set for the statefulSet.
env:
CLUSTER_GOSSIP_BIND_PORT: 7000
CLUSTER_DATA_BIND_PORT: 7001
# Set RAFT cluster expected number of voter nodes at bootstrap.
# By default helm automatically sets this value based on the cluster size.
# RAFT_BOOTSTRAP_EXPECT: 1
# Set RAFT cluster bootstrap timeout (in seconds), default is 600 (seconds)
# which should be sufficient for most of the deployments.
RAFT_BOOTSTRAP_TIMEOUT: 600
# Set manually RAFT voter nodes.
# RAFT_JOIN value is automatically generated by "raft_configuration"
# template, but if someone wants to set this value manually then it can be done
# by setting RAFT_JOIN environment variable, example: RAFT_JOIN: "weaviate-0,weaviate-1"
# Please notice that in this case RAFT_BOOTSTRAP_EXPECT setting needs to be also adjusted manually
# to match the number of RAFT voters, so if there are 2 nodes set using RAFT_JOIN variable
# then RAFT_BOOTSTRAP_EXPECT needs to be equal 2 also.
# RAFT_JOIN: "weaviate-0"
# Set to true if voters nodes should handle only schema. With this setting enabled
# voter nodes will not accept any data, one needs to resize the cluster using replicas
# setting so that replicas > voters.
# RAFT_METADATA_ONLY_VOTERS: false
# RAFT_ENABLE_FQDN_RESOLVER setting changes the node name to node ip resolution to use DNS lookups
# instead of memberlist lookup. That means that when weaviate raft component wants to contact `weaviate-0`
# it's going to lookup the dns name `weaviate-0` instead of looking for the node-id in memberlist.
# This is particularly useful if running in an environment where you're using services (for example k8s)
# where the IP of the services is different from the actual node IP, but it proxies the connection to the node.
# RAFT_ENABLE_FQDN_RESOLVER: false
# RAFT_FQDN_RESOLVER_TLD setting acts in combination with RAFT_ENABLE_FQDN_RESOLVER and is appended
# in the format "[node-id].[tld]" when resolving a node-id to an ip.
# RAFT_FQDN_RESOLVER_TLD: "weaviate-0."
# The aggressiveness of the Go Garbage Collector. 100 is the default value.
GOGC: 100
# Expose metrics on port 2112 for Prometheus to scrape
PROMETHEUS_MONITORING_ENABLED: false
# Set a MEM limit for the Weaviate Pod so it can help you both increase GC-related
# performance as well as avoid GC-related out-of-memory (“OOM”) situations
# GOMEMLIMIT: 6GiB
# Maximum results Weaviate can query with/without pagination
# NOTE: Affects performance, do NOT set to a very high value.
# The default is 100K
QUERY_MAXIMUM_RESULTS: 100000
# whether to enable vector dimensions tracking metric
TRACK_VECTOR_DIMENSIONS: false
# whether to re-index/-compute the vector dimensions metric (needed if upgrading from weaviate < v1.16.0)
REINDEX_VECTOR_DIMENSIONS_AT_STARTUP: false
##########################
# API Keys with ENV Vars #
##########################
# If using ENV Vars to set up API Keys make sure to have `authentication.apikey` block commented out
# to avoid any future changes. ENV Vars has priority over the config above `authentication.apikey`.
# If using `authentication.apikey` the below ENV Vars will be used because they have priority,
# so comment them out to avoid any future changes. The same applies for the RBAC configuration
# under the authorization block.
# Enables API key authentication. If it is set to 'false' the AUTHENTICATION_APIKEY_ALLOWED_KEYS
# and AUTHENTICATION_APIKEY_USERS will not have any effect.
# AUTHENTICATION_APIKEY_ENABLED: 'true'
# List one or more keys, separated by commas. Each key corresponds to a specific user identity below.
# If you want to use a kubernetes secret for the API Keys comment out this Variable and use the one in `envSecrets` below
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: 'jane-secret-key,ian-secret-key' (plain text)
# List one or more user identities, separated by commas. You can have only one User for all the keys or one user per key.
# The User/s can be a simple name or an email, no matter if it exists or not.
# NOTE: Make sure to add the users to the authorization above otherwise they will not be allowed to interact with Weaviate.
# AUTHENTICATION_APIKEY_USERS: '[email protected],ian-smith'
# Enabling RBAC authorization. It is mutually exclusive with the AUTHORIZATION_ADMIN_LISTS variable. Either RBAC or the
# admin lists mechanism can be used.
# AUTHORIZATION_ENABLE_RBAC: "true"
# Users with admin's RBAC role. List one or more user identities, separated by commas, which will
# have the admin role assigned to. This role provides all permissions to the user, but it's required at least
# in one of the user for managing the cluster.
# AUTHORIZATION_ADMIN_USERS: "admin-user"
# Users with viewer's RBAC role. List one or more user identities, separated by commas, which will
# have the viewer role assigned to. This role allows read permissions in all different areas. Once assigned via
# config, it can't be revoked via API AuthZ calls.
# AUTHORIZATION_VIEWER_USERS: "viewer-user"
AUTHENTICATION_APIKEY_ENABLED: "true"
AUTHENTICATION_APIKEY_ALLOWED_KEYS: "WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih"
AUTHENTICATION_APIKEY_USERS: "[email protected]"
AUTHORIZATION_ADMINLIST_ENABLED: "true"
AUTHORIZATION_ADMINLIST_USERS: "[email protected]"
envSecrets:
# create a Kubernetes secret with AUTHENTICATION_APIKEY_ALLOWED_KEYS key and its respective value
# AUTHENTICATION_APIKEY_ALLOWED_KEYS: name-of-the-k8s-secret-containing-the-comma-separated-api-keys
# Configure offload providers
offload:
s3:
enabled: false
# If one is using AWS EKS and has already configured K8s Service Account
# that holds the AWS credentials one can pass a name of that service account
# here using this setting.
# NOTE: the root `serviceAccountName` config has priority over this one, and
# if the root one is set this one will NOT overwrite it. This one is here for
# backwards compatibility.
serviceAccountName:
envconfig:
# Configure bucket where data should be saved, this setting is mandatory
OFFLOAD_S3_BUCKET: weaviate-offload
# Optional setting. Defaults to AWS S3 (s3.amazonaws.com).
# Set this option if you have a MinIO storage configured in your environment
# and want to use it instead of the AWS S3.
# OFFLOAD_S3_ENDPOINT: custom.minio.endpoint.address
# Optional setting. Defaults to true.
# Set this option if you don't want to use SSL.
# OFFLOAD_S3_USE_SSL: true
# Optional setting. Defaults to false.
# Set this option if you want Weaviate to create
# the bucket used for offloading tenants. Otherwise,
# if set to false Weaviate expects the bucket to be
# already created with the OFFLOAD_S3_BUCKET name
# OFFLOAD_S3_BUCKET_AUTO_CREATE: true
# You can pass environment AWS settings here:
# Define the region
# AWS_REGION: eu-west-1
# For Weaviate to be able to create bucket objects it needs a user credentials to authenticate to AWS.
# The User must have permissions to read/create/delete bucket objects.
# You can pass the User credentials (access-key id and access-secret-key) in 2 ways:
# 1. by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AWS_ACCESS_KEY_ID: access-key-id (plain text)
# AWS_SECRET_ACCESS_KEY: secret-access-key (plain text)
# If one has already defined secrets with AWS credentials one can pass them using
# this setting:
envSecrets: {}
# AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-the-key-id
# AWS_SECRET_ACCESS_KEY: name-of-the-k8s-secret-containing-the-key
# Configure backup providers
backups:
# The backup-filesystem module enables creation of the DB backups in
# the local filesystem
filesystem:
enabled: true
envconfig:
# Configure folder where backups should be saved
BACKUP_FILESYSTEM_PATH: /tmp/backups
s3:
enabled: false
# If one is using AWS EKS and has already configured K8s Service Account
# that holds the AWS credentials one can pass a name of that service account
# here using this setting.
# NOTE: the root `serviceAccountName` config has priority over this one, and
# if the root one is set this one will NOT overwrite it. This one is here for
# backwards compatibility.
serviceAccountName:
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_S3_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_S3_PATH: path/inside/bucket
# Optional setting. Defaults to AWS S3 (s3.amazonaws.com).
# Set this option if you have a MinIO storage configured in your environment
# and want to use it instead of the AWS S3.
# BACKUP_S3_ENDPOINT: custom.minio.endpoint.address
# Optional setting. Defaults to true.
# Set this option if you don't want to use SSL.
# BACKUP_S3_USE_SSL: true
# You can pass environment AWS settings here:
# Define the region
# AWS_REGION: eu-west-1
# For Weaviate to be able to create bucket objects it needs a user credentials to authenticate to AWS.
# The User must have permissions to read/create/delete bucket objects.
# You can pass the User credentials (access-key id and access-secret-key) in 2 ways:
# 1. by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AWS_ACCESS_KEY_ID: access-key-id (plain text)
# AWS_SECRET_ACCESS_KEY: secret-access-key (plain text)
# If one has already defined secrets with AWS credentials one can pass them using
# this setting:
envSecrets: {}
# AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-the-key-id
# AWS_SECRET_ACCESS_KEY: name-of-the-k8s-secret-containing-the-key
gcs:
enabled: false
envconfig:
# Configure bucket where backups should be saved, this setting is mandatory
BACKUP_GCS_BUCKET: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the bucket
# BACKUP_GCS_PATH: path/inside/bucket
# You can pass environment Google settings here:
# Define the project
# GOOGLE_CLOUD_PROJECT: project-id
# For Weaviate to be able to create bucket objects it needs a ServiceAccount credentials to authenticate to GCP.
# The ServiceAccount must have permissions to read/create/delete bucket objects.
# You can pass the ServiceAccount credentials (as JSON) in 2 ways:
# 1. by setting the GOOGLE_APPLICATION_CREDENTIALS json as plain text in the `secrets` section below
# this chart will create a kubernetes secret for you with this key-values pairs
# 2. create a Kubernetes secret with GOOGLE_APPLICATION_CREDENTIALS key and its respective value
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: credentials-json-string (plain text)
# If one has already defined a secret with GOOGLE_APPLICATION_CREDENTIALS one can pass them using
# this setting:
envSecrets: {}
# GOOGLE_APPLICATION_CREDENTIALS: name-of-the-k8s-secret-containing-the-key
azure:
enabled: false
envconfig:
# Configure container where backups should be saved, this setting is mandatory
BACKUP_AZURE_CONTAINER: weaviate-backups
# Optional setting. Defaults to empty string.
# Set this option if you want to save backups to a given location
# inside the container
# BACKUP_AZURE_PATH: path/inside/container
# For Weaviate to be able to create container objects it needs a user credentials to authenticate to Azure Storage.
# The User must have permissions to read/create/delete container objects.
# You can pass the User credentials (account-name id and account-key or connection-string) in 2 ways:
# 1. by setting the AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING plain values in the `secrets` section below
# this chart will create a kubernetes secret for you with these key-values pairs
# 2. create Kubernetes secret/s with AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
# or AZURE_STORAGE_CONNECTION_STRING and their respective values
# Set the Key and the secret where it is set in `envSecrets` section below
secrets: {}
# AZURE_STORAGE_ACCOUNT: account-name (plain text)
# AZURE_STORAGE_KEY: account-key (plain text)
# AZURE_STORAGE_CONNECTION_STRING: connection-string (plain text)
# If one has already defined secrets with Azure Storage credentials one can pass them using
# this setting:
envSecrets: {}
# AZURE_STORAGE_ACCOUNT: name-of-the-k8s-secret-containing-the-account-name
# AZURE_STORAGE_KEY: name-of-the-k8s-secret-containing-account-key
# AZURE_STORAGE_CONNECTION_STRING: name-of-the-k8s-secret-containing-connection-string
# modules are extensions to Weaviate, they can be used to support various
# ML-models, but also other features unrelated to model inference.
# An inference/vectorizer module is not required, you can also run without any
# modules and import your own vectors.
modules:
# by choosing the default vectorizer module, you can tell Weaviate to always
# use this module as the vectorizer if nothing else is specified. Can be
# overwritten on a per-class basis.
# set to text2vec-transformers if running with transformers instead
default_vectorizer_module: none
# It is also possible to configure authentication and authorization through a
# custom configmap The authorization and authentication values defined in
# values.yaml will be ignored when defining a custom config map.
custom_config_map:
enabled: false
name: 'custom-config'
# Pass any annotations to Weaviate pods
annotations:
extraVolumeMounts:
extraVolumes:
nodeSelector:
tolerations:
hostAliases:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- weaviate
## Optionally specify priorityClass name for the pod
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#pod-priority
##
priorityClassName: ""
globalPriorityClassName: ""
Verify that the schema is set in the same way as in Weaviate 1.19:
curl "http://localhost:8080/v1/schema/Vector_index_77154e28_c1df_4c39_882d_1ea047c22f84_Node" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY"
{
"class": "Vector_index_77154e28_c1df_4c39_882d_1ea047c22f84_Node",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"multiTenancyConfig": {
"enabled": false
},
"properties": [
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexSearchable": true,
"name": "text",
"tokenization": "word"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "document_id"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "dataset_id"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "doc_id"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": true,
"name": "doc_hash",
"tokenization": "word"
}
],
"replicationConfig": {
"factor": 1
},
"shardingConfig": {
"virtualPerPhysical": 128,
"desiredCount": 1,
"actualCount": 1,
"desiredVirtualCount": 128,
"actualVirtualCount": 128,
"key": "_id",
"strategy": "hash",
"function": "murmur3"
},
"vectorIndexConfig": {
"skip": false,
"cleanupIntervalSeconds": 300,
"maxConnections": 64,
"efConstruction": 128,
"ef": -1,
"dynamicEfMin": 100,
"dynamicEfMax": 500,
"dynamicEfFactor": 8,
"vectorCacheMaxObjects": 1000000000000,
"flatSearchCutoff": 40000,
"distance": "cosine",
"pq": {
"enabled": false,
"bitCompression": false,
"segments": 0,
"centroids": 256,
"trainingLimit": 100000,
"encoder": {
"type": "kmeans",
"distribution": "log-normal"
}
}
},
"vectorIndexType": "hnsw",
"vectorizer": "none"
}
Verify that the schema was patched after upgrading to 1.24.8, yet it still doesn't match the form observed in 1.27:
curl "http://localhost:8080/v1/schema/Vector_index_77154e28_c1df_4c39_882d_1ea047c22f84_Node" -H "Content-Type: application/json" -H "Authorization: Bearer $WEAVIATE_API_KEY"
{
"class": "Vector_index_77154e28_c1df_4c39_882d_1ea047c22f84_Node",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"multiTenancyConfig": {
"enabled": false
},
"properties": [
{
"dataType": [
"text"
],
"indexFilterable": true,
"indexSearchable": true,
"name": "text",
"tokenization": "word"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "document_id"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "dataset_id"
},
{
"dataType": [
"uuid"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": false,
"name": "doc_id"
},
{
"dataType": [
"text"
],
"description": "This property was generated by Weaviate's auto-schema feature on Wed Nov 12 01:36:27 2025",
"indexFilterable": true,
"indexSearchable": true,
"name": "doc_hash",
"tokenization": "word"
}
],
"replicationConfig": {
"factor": 1
},
"shardingConfig": {
"virtualPerPhysical": 128,
"desiredCount": 1,
"actualCount": 1,
"desiredVirtualCount": 128,
"actualVirtualCount": 128,
"key": "_id",
"strategy": "hash",
"function": "murmur3"
},
"vectorIndexConfig": {
"skip": false,
"cleanupIntervalSeconds": 300,
"maxConnections": 64,
"efConstruction": 128,
"ef": -1,
"dynamicEfMin": 100,
"dynamicEfMax": 500,
"dynamicEfFactor": 8,
"vectorCacheMaxObjects": 1000000000000,
"flatSearchCutoff": 40000,
"distance": "cosine",
"pq": {
"enabled": false,
"bitCompression": false,
"segments": 0,
"centroids": 256,
"trainingLimit": 100000,
"encoder": {
"type": "kmeans",
"distribution": "log-normal"
}
},
"bq": {
"enabled": false
}
},
"vectorIndexType": "hnsw",
"vectorizer": "none"
}
The notable change is the added bq block under vectorIndexConfig:
},
"bq": {
"enabled": false
}
},
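The comparison above was done by eye. As a minimal sketch of how the two schema responses could be compared mechanically, the following Python snippet walks two dictionaries and reports keys that were added, removed, or changed. The helper name diff_keys and the trimmed-down sample dictionaries are hypothetical and only for illustration; they are not part of Weaviate or its clients.

```python
def diff_keys(old, new, path=""):
    """Yield (kind, dotted_path) for keys added, removed, or changed between two dicts."""
    for key in sorted(set(old) | set(new)):
        here = f"{path}.{key}" if path else key
        if key not in old:
            yield ("added", here)
        elif key not in new:
            yield ("removed", here)
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Recurse into nested objects such as vectorIndexConfig.pq
            yield from diff_keys(old[key], new[key], here)
        elif old[key] != new[key]:
            yield ("changed", here)


# Trimmed-down excerpts of the two responses shown above (illustrative only).
schema_1_19 = {
    "vectorIndexConfig": {"distance": "cosine", "pq": {"enabled": False}},
    "vectorizer": "none",
}
schema_1_24 = {
    "vectorIndexConfig": {
        "distance": "cosine",
        "pq": {"enabled": False},
        "bq": {"enabled": False},
    },
    "vectorizer": "none",
}

changes = list(diff_keys(schema_1_19, schema_1_24))
print(changes)
```

In practice the two dictionaries would be loaded with json.load from the saved output of the curl commands above, rather than written inline.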
Since it is already known that 1.27 doesn't take care of the schema migration either, the responsibility for the schema migration narrows down to Weaviate 1.24, 1.25 and 1.26. We didn't test those versions, because the incompatibility issues go beyond this thread (e.g. object data not being migrated in accordance with the schema migration, which is off topic) and we have already spent too much time and frustration struggling with this vector database.
To conclude, we don't think Weaviate handles data migrations properly across versions (even for minor-version upgrades), and we would like to raise awareness so that Weaviate treats compatibility issues as a priority.