m3db-operator
"unaggregated namespace is not yet initialized" error
I am following the instructions here, and after deploying the cluster and attempting to write some data (manually, exploring in Grafana, anything), I get the following error:
{
  "status": "error",
  "error": "unaggregated namespace is not yet initialized"
}
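For context, the error can be reproduced with a direct write to the coordinator. This is only a sketch: it assumes the coordinator service is named m3coordinator-m3db-cluster (it may differ) and has been port-forwarded to localhost:7201, and the metric name and tags are placeholders:
kubectl -n monitoring port-forward svc/m3coordinator-m3db-cluster 7201:7201
curl -X POST http://localhost:7201/api/v1/json/write -d '{
  "tags": { "__name__": "test_metric", "host": "test01" },
  "timestamp": "'"$(date +%s)"'",
  "value": 42.0
}'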
Here is my etcd cluster config:
apiVersion: v1
kind: Service
metadata:
  name: etcd
  labels:
    app: etcd
spec:
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  clusterIP: None
  selector:
    app: etcd
---
apiVersion: v1
kind: Service
metadata:
  name: etcd-cluster
  labels:
    app: etcd
spec:
  selector:
    app: etcd
  ports:
  - port: 2379
    protocol: TCP
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
  labels:
    app: etcd
spec:
  serviceName: "etcd"
  replicas: 3
  selector:
    matchLabels:
      app: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
      - name: etcd
        image: quay.io/coreos/etcd:v3.5.0
        imagePullPolicy: IfNotPresent
        command:
        - "etcd"
        - "--name"
        - "$(MY_POD_NAME)"
        - "--listen-peer-urls"
        - "http://$(MY_IP):2380"
        - "--listen-client-urls"
        - "http://$(MY_IP):2379,http://127.0.0.1:2379"
        - "--advertise-client-urls"
        - "http://$(MY_POD_NAME).etcd:2379"
        - "--initial-cluster-token"
        - "etcd-cluster-1"
        - "--initial-advertise-peer-urls"
        - "http://$(MY_POD_NAME).etcd:2380"
        - "--initial-cluster"
        - "etcd-0=http://etcd-0.etcd:2380,etcd-1=http://etcd-1.etcd:2380,etcd-2=http://etcd-2.etcd:2380"
        - "--initial-cluster-state"
        - "new"
        - "--data-dir"
        - "/var/lib/etcd"
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        volumeMounts:
        - name: etcd-data
          mountPath: /var/lib/etcd
        env:
        - name: MY_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ETCDCTL_API
          value: "3"
  volumeClaimTemplates:
  - metadata:
      name: etcd-data
    spec:
      storageClassName: encrypted-gp2
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
        limits:
          storage: 30Gi
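As a quick sanity check before looking at M3DB itself, each etcd member can be health-checked in place (a sketch using the pod names from the StatefulSet above; the container already sets ETCDCTL_API=3):
kubectl -n monitoring exec etcd-0 -- etcdctl --endpoints=http://127.0.0.1:2379 endpoint health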
Here is my M3DBCluster (operator) config:
apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
  name: m3db-cluster
spec:
  image: quay.io/m3db/m3dbnode:v1.2.0
  imagePullPolicy: IfNotPresent
  replicationFactor: 3
  numberOfShards: 256
  etcdEndpoints:
  - http://etcd-0.etcd:2379
  - http://etcd-1.etcd:2379
  - http://etcd-2.etcd:2379
  isolationGroups:
  - name: group1
    numInstances: 1
    nodeAffinityTerms:
    - key: failure-domain.beta.kubernetes.io/zone
      values:
      - us-west-2a
  - name: group2
    numInstances: 1
    nodeAffinityTerms:
    - key: failure-domain.beta.kubernetes.io/zone
      values:
      - us-west-2b
  - name: group3
    numInstances: 1
    nodeAffinityTerms:
    - key: failure-domain.beta.kubernetes.io/zone
      values:
      - us-west-2c
  podIdentityConfig:
    sources: []
  namespaces:
  - name: metrics-10s:2d
    preset: 10s:2d
  dataDirVolumeClaimTemplate:
    metadata:
      name: m3db-data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: encrypted-gp2
      resources:
        requests:
          storage: 350Gi
        limits:
          storage: 350Gi
And these are the commands I run to deploy the cluster:
helm install -n monitoring m3db-operator m3db/m3db-operator
kubectl -n monitoring apply -f ~/Dropbox/projects/m3db/conf/m3db-cluster.yaml
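After the apply, a quick way to confirm that the cluster resource and its pods came up (nothing here is specific to my setup beyond the monitoring namespace and the cluster name m3db-cluster):
kubectl -n monitoring get m3dbcluster m3db-cluster
kubectl -n monitoring get pods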
The auto-generated m3coordinator config that results from this is the following:
kind: ConfigMap
apiVersion: v1
metadata:
  name: m3db-config-map-m3db-cluster
  namespace: monitoring
  uid: 8beea769-2b68-488a-a3c3-606622013bcd
  resourceVersion: '163096500'
  creationTimestamp: '2021-09-22T23:45:20Z'
  ownerReferences:
  - apiVersion: operator.m3db.io/v1alpha1
    kind: m3dbcluster
    name: m3db-cluster
    uid: aa05a31c-4767-4674-8343-129e609a793d
    controller: true
    blockOwnerDeletion: true
  managedFields:
  - manager: m3db-operator
    operation: Update
    apiVersion: v1
    time: '2021-09-22T23:45:20Z'
    fieldsType: FieldsV1
    fieldsV1:
      'f:data':
        .: {}
        'f:m3.yml': {}
      'f:metadata':
        'f:ownerReferences':
          .: {}
          'k:{"uid":"aa05a31c-4767-4674-8343-129e609a793d"}':
            .: {}
            'f:apiVersion': {}
            'f:blockOwnerDeletion': {}
            'f:controller': {}
            'f:kind': {}
            'f:name': {}
            'f:uid': {}
data:
  m3.yml: |
    coordinator: {}
    db:
      hostID:
        resolver: file
        file:
          path: /etc/m3db/pod-identity/identity
          timeout: 5m
      client:
        writeConsistencyLevel: majority
        readConsistencyLevel: unstrict_majority
      discovery:
        config:
          service:
            env: "monitoring/m3db-cluster"
            zone: embedded
            service: m3db
            cacheDir: /var/lib/m3kv
            etcdClusters:
            - zone: embedded
              endpoints:
              - "http://etcd-0.etcd:2379"
              - "http://etcd-1.etcd:2379"
              - "http://etcd-2.etcd:2379"
The resulting namespace config, as returned by a GET against the following endpoint, is:
http://localhost:7201/api/v1/services/m3db/namespace
{
  "registry": {
    "namespaces": {
      "metrics-10s:2d": {
        "bootstrapEnabled": true,
        "flushEnabled": true,
        "writesToCommitLog": true,
        "cleanupEnabled": true,
        "repairEnabled": false,
        "retentionOptions": {
          "retentionPeriodNanos": "172800000000000",
          "blockSizeNanos": "7200000000000",
          "bufferFutureNanos": "600000000000",
          "bufferPastNanos": "600000000000",
          "blockDataExpiry": true,
          "blockDataExpiryAfterNotAccessPeriodNanos": "300000000000",
          "futureRetentionPeriodNanos": "0"
        },
        "snapshotEnabled": true,
        "indexOptions": {
          "enabled": true,
          "blockSizeNanos": "7200000000000"
        },
        "schemaOptions": null,
        "coldWritesEnabled": false,
        "runtimeOptions": null,
        "cacheBlocksOnRetrieve": false,
        "aggregationOptions": {
          "aggregations": [
            {
              "aggregated": true,
              "attributes": {
                "resolutionNanos": "10000000000",
                "downsampleOptions": {
                  "all": true
                }
              }
            }
          ]
        },
        "stagingState": {
          "status": "READY"
        },
        "extendedOptions": null
      }
    }
  }
}
(Note how the stagingState status is READY, but this namespace is being reported as not yet initialized.)
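With the same port-forward as above, the staging state alone can be checked with a quick jq filter (jq is assumed to be installed):
curl -s http://localhost:7201/api/v1/services/m3db/namespace | jq '.registry.namespaces["metrics-10s:2d"].stagingState'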
I believe that this issue relates to this change.
To fix the issue:
- I provide my own m3coordinator config by adding this line to my M3DBCluster (operator) config:
configMapName: m3db-cluster-config-map
- Then I provide my own m3coordinator config, which is a duplicate of the auto-generated config but with a coordinator section added to define the unaggregated namespace:
kind: ConfigMap
apiVersion: v1
metadata:
  name: m3db-cluster-config-map
data:
  m3.yml: |
    coordinator:
      local:
        namespaces:
        - namespace: metrics-10s:2d
          type: unaggregated
          retention: 48h
    db:
      hostID:
        resolver: file
        file:
          path: /etc/m3db/pod-identity/identity
          timeout: 5m
      client:
        writeConsistencyLevel: majority
        readConsistencyLevel: unstrict_majority
      discovery:
        config:
          service:
            env: "monitoring/m3db-cluster"
            zone: embedded
            service: m3db
            cacheDir: /var/lib/m3kv
            etcdClusters:
            - zone: embedded
              endpoints:
              - "http://etcd-0.etcd:2379"
              - "http://etcd-1.etcd:2379"
              - "http://etcd-2.etcd:2379"
- Then I apply these manifests as follows:
helm install -n monitoring m3db-operator m3db/m3db-operator
kubectl -n monitoring apply -f ~/Dropbox/projects/m3db/conf/m3db-cluster-config-map.yaml
kubectl -n monitoring apply -f ~/Dropbox/projects/m3db/conf/m3db-cluster.yaml
- The resulting namespace config is now the following:
{
  "registry": {
    "namespaces": {
      "metrics-10s:2d": {
        "bootstrapEnabled": true,
        "flushEnabled": true,
        "writesToCommitLog": true,
        "cleanupEnabled": true,
        "repairEnabled": false,
        "retentionOptions": {
          "retentionPeriodNanos": "172800000000000",
          "blockSizeNanos": "7200000000000",
          "bufferFutureNanos": "600000000000",
          "bufferPastNanos": "600000000000",
          "blockDataExpiry": true,
          "blockDataExpiryAfterNotAccessPeriodNanos": "300000000000",
          "futureRetentionPeriodNanos": "0"
        },
        "snapshotEnabled": true,
        "indexOptions": {
          "enabled": true,
          "blockSizeNanos": "7200000000000"
        },
        "schemaOptions": null,
        "coldWritesEnabled": false,
        "runtimeOptions": null,
        "cacheBlocksOnRetrieve": false,
        "aggregationOptions": {
          "aggregations": [
            {
              "aggregated": true,
              "attributes": {
                "resolutionNanos": "10000000000",
                "downsampleOptions": {
                  "all": true
                }
              }
            }
          ]
        },
        "stagingState": {
          "status": "UNKNOWN"
        },
        "extendedOptions": null
      }
    }
  }
}
(Note how the stagingState status is now UNKNOWN. This is the same issue as m3 issue #3649.)
- Then I need to force-ready this namespace, as per m3 issue #3649, by POSTing the following body to this endpoint (see the curl sketch below):
http://localhost:7201/api/v1/services/m3db/namespace/ready
{
  "name": "metrics-10s:2d",
  "force": true
}
And then I restart/delete the m3db-operator pod, the namespace's stagingState status becomes READY, and I can write and query data.
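For completeness, a sketch of that force-ready call (again assuming the coordinator has been port-forwarded to localhost:7201):
curl -X POST http://localhost:7201/api/v1/services/m3db/namespace/ready -d '{
  "name": "metrics-10s:2d",
  "force": true
}'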
@skupjoe where did you get the base coordinator config? We can look into updating that. Also pull requests are welcome if you'd like to contribute a change!