trident
Allow TridentBackendConfig to be incorporated into the Helm chart as a CRD
Describe the solution you'd like
The Helm chart should be able to install the CRD for TridentBackendConfig. This definition seems to be missing from the Helm chart, making it difficult to incorporate into CI/CD systems. On top of this, it looks like there is a means to install the existing CRDs from the Helm chart, but not this one.
Describe alternatives you've considered
Essentially there should be a field called extraDeploy, an array of extra objects to deploy with the release; an example can be taken from https://github.com/bitnami/charts/tree/master/bitnami/grafana/#installing-the-chart, which would allow the backend to be rendered. Alternatively, it could be included in the Helm chart as a template populated via the values.yaml file, as in https://github.com/bitnami/charts/tree/master/bitnami/cert-manager, where the key/value pair installCRDs: true is used. In terms of the CRDs themselves, whether they are installed should be toggled in values.yaml.
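To illustrate the request, a hypothetical values.yaml following the Bitnami-style extraDeploy convention might look like this. Note that extraDeploy is the feature being asked for, not something the Trident chart supports today, and the TridentBackendConfig fields shown are purely illustrative:

```yaml
# Hypothetical values.yaml -- extraDeploy follows the Bitnami convention;
# Trident's trident-operator chart does not currently support it.
extraDeploy:
  - apiVersion: trident.netapp.io/v1
    kind: TridentBackendConfig
    metadata:
      name: backend-anf          # illustrative name
      namespace: trident
    spec:
      version: 1
      storageDriverName: azure-netapp-files
      credentials:
        name: anf-secret         # illustrative secret name
```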
Additional context
In addition to the above, I feel that the Trident installer needs to be bundled with the Helm chart in the form of Kubernetes manifests. As it stands, there does not seem to be much documentation around getting the TridentBackendConfig CRD functioning, since it seems to be omitted from the Helm chart, making documentation like https://netapp-trident.readthedocs.io/en/latest/kubernetes/operations/tasks/managing-backends/tbc.html#step-2-create-the-tridentbackendconfig-cr effectively useless unless tridentctl is used as per https://docs.netapp.com/us-en/trident/pdfs/sidebar/Configure_backends.pdf.
To help with CI/CD systems, a Helm repo would also be beneficial, which can be set up via https://helm.sh/docs/topics/chart_repository/, and would avoid errors such as:
kubectl : error: unable to recognize "testbackend.yaml": no matches for kind "TridentBackendConfig" in version "trident.netapp.io/v1"
That way, instead of having to extract the Helm charts from the release on GitHub, changes are passed via CI/CD into the chart repo and downloaded from a URL such as https://charts.bitnami.com/bitnami.
Also, a tutorial on how to deploy TridentBackendConfig in YAML would be useful, in the form of a YouTube video; most tutorials seem to focus on tridentctl.
@dc232 thank you for posting this. Today, backends can be created by 2 means: tridentctl, or kubectl using TridentBackendConfigs. I understand that your ask is to create backends as part of a Helm install. Is that right? The best way to do it right now would be to:
a. Install Trident with Helm
b. Create backends with kubectl or tridentctl post-install.
- Have you taken a look at the instructions for creating backends using TridentBackendConfigs? I noticed you referenced our docs from their previous home. Did you see them linked somewhere?
- Trident's Helm chart is hosted on a repository. Instead of downloading the installer from our GitHub repo, you can pull the Helm chart and install.
Hey @balaramesh, so I'm trying to create the backend through Azure Pipelines and a Terraform template.
Example code:
data "template_file" "netapp_backend_config" {
  depends_on = [kubernetes_secret_v1.netapp]
  template   = file("${path.module}/BackendCRD/backend-anf.yaml")
  vars = {
    # "clientID"     = data.azurerm_client_config.current.client_id
    # "clientSecret" = var.netappclientsecret
    "metadataname"      = "nett-app-backend"
    "namespace"         = kubernetes_namespace.netapp.metadata[0].name
    "storagedrivername" = "azure-netapp-files"
    "subscriptionid"    = data.azurerm_client_config.current.subscription_id
    "tenantid"          = data.azurerm_client_config.current.tenant_id
    "location"          = var.location
    "servicelevel"      = var.servicelevel
    "secretcredentails" = kubernetes_secret_v1.netapp.metadata[0].name
    "subnet"            = var.subnet
    "virtualNetwork"    = var.virtualNetwork
    "nfsMountOptions"   = "nfsvers=4"
    "backendName"       = var.backendName
  }
}

resource "kubectl_manifest" "netapp_production_backend_config" {
  yaml_body  = data.template_file.netapp_backend_config.rendered
  depends_on = [kubernetes_secret_v1.netapp]
}
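As an aside (not something discussed in the thread): on recent Terraform versions the template_file data source is superseded by the built-in templatefile() function, which removes the dependency on the template provider. A minimal sketch, assuming the same template file and variables as above:

```hcl
resource "kubectl_manifest" "netapp_production_backend_config" {
  depends_on = [kubernetes_secret_v1.netapp]

  # templatefile() is built into Terraform, so no data source or
  # template provider is needed to render the manifest.
  yaml_body = templatefile("${path.module}/BackendCRD/backend-anf.yaml", {
    metadataname      = "nett-app-backend"
    namespace         = kubernetes_namespace.netapp.metadata[0].name
    storagedrivername = "azure-netapp-files"
    # ...remaining vars exactly as in the template_file block above
  })
}
```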
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: ${metadataname}
  namespace: ${namespace}
spec:
  version: 1
  storageDriverName: ${storagedrivername}
  subscriptionID: ${subscriptionid}
  tenantID: ${tenantid}
  location: ${location}
  serviceLevel: ${servicelevel}
  virtualNetwork: ${virtualNetwork}
  subnet: ${subnet}
  backendName: ${backendName}
  nfsMountOptions: ${nfsMountOptions}
  credentials:
    name: ${secretcredentails}
However, this approach results in errors such as
kubectl : error: unable to recognize "testbackend.yaml": no matches for kind "TridentBackendConfig" in version "trident.netapp.io/v1"
when the pipeline is run. For some reason, if the templated YAML is applied outside of the pipeline, the backend gets applied once the Helm chart is applied, as expected. This is slightly perplexing, as I have templated the cert-manager CRDs in the same manner and had them apply in the cluster without error.
The documents that I have found most useful in terms of the backend configuration have been https://docs.microsoft.com/en-us/azure/aks/azure-netapp-files and https://netapp-trident.readthedocs.io/en/latest/kubernetes/operations/tasks/backends/anf.html.
Thank you for the link to the backend instructions; I will be sure to take a look.
In terms of the Helm repo:
helm repo add netapp-trident https://netapp.github.io/trident-helm-chart
I have tried this with the following Terraform:
resource "helm_release" "netapphelm" {
  depends_on = [
    kubernetes_namespace.netapp,
    kubectl_manifest.deploy_trident_orchestrator_crd
  ]
  name       = "netapp-trident-operator"
  namespace  = kubernetes_namespace.netapp.metadata[0].name
  repository = "https://netapp.github.io/trident-helm-chart"
  chart      = "trident-operator"
}
However, this didn't seem to work. I will double-check tomorrow, but can you confirm whether the params are correct for pulling the chart?
Also, when specifying a backend, I noticed I got the error:
Warning Failed 64s trident-crd-controller Failed to create backend: problem initializing storage driver 'azure-netapp-files': error initializing azure-netapp-files SDK client. azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/providers/Microsoft.ResourceGraph/res
ources?api-version=2021-03-01: StatusCode=401 -- Original Error: adal: Refresh request failed. Status Code = '401'. Response body: {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to ap
p '{ommited}'.\r\nTrace ID: {ommited}\r\nCorrelation ID: {ommited}\r\nTimestamp: 2022-02-14 09:42:35Z","error_codes":[7000215],"timestamp":"2022-02-14 09:42:35Z","trace_id":"{ommited}","correlation_id":"{ommited}
","error_uri":"https://login.microsoftonline.com/error?code=7000215"} Endpoint https://login.microsoftonline.com/{ommited}/oauth2/token?api-version=1.0; Error(s) after resource discovery: no capacity pools found for storage pool netappbackendcsi_pool
Does the storage pool get created via the CSI driver, or does it need to exist prior? Would this be the name of the capacity pool in Azure NetApp Files?
The error that the TridentBackendConfig kind isn't recognized is most likely caused by the fact that the custom resource definition is created by the operator, which happens after a successful deploy using the Helm chart. Maybe you need to delay the creation of a backend by a few seconds/minutes?
The parameters of the chart look OK to me. Your second error is probably caused by interchanging the client ID and client secret from Azure, as it states.
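One way to make that delay deterministic (a sketch, not from the thread; it assumes kubectl is on the PATH of the machine running Terraform and pointed at the right cluster) is to wait for the CRD to be established rather than sleeping for a fixed duration:

```hcl
# Blocks until the TridentBackendConfig CRD reports the Established
# condition, instead of guessing at a sleep duration.
resource "null_resource" "wait_for_tbc_crd" {
  depends_on = [helm_release.netapphelm]

  provisioner "local-exec" {
    command = "kubectl wait --for=condition=established --timeout=600s crd/tridentbackendconfigs.trident.netapp.io"
  }
}
```

Note that kubectl wait errors out immediately if the CRD object does not exist at all yet, so in practice this may still need a small retry loop around it.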
@balaramesh yeah, that makes sense. I did try this technique via
resource "time_sleep" "wait_30_seconds" {
  depends_on      = [helm_release.netapphelm]
  create_duration = "240s"
}
Unfortunately I received the same error :(
I think you should give a larger wait time a try, or try to create the backend post-deploy (you could verify readiness by running a command to get the version of Trident).
@balaramesh I can try 10 minutes maybe; I tried 6 minutes but got the same error. Is the command for the version of Trident:
kubectl describe torc trident
Yes. You should look at the status of the torc object. When it reports itself as "Installed", you are ready to create a backend.
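If you want Terraform itself to observe that status rather than running a manual describe, one option (a sketch, assuming the orchestrator object is named trident as in the thread, and that the hashicorp/kubernetes provider is configured) is the provider's generic kubernetes_resource data source:

```hcl
# Reads the TridentOrchestrator object; the maintainer's advice is to
# wait until status.status reports "Installed" before creating backends.
data "kubernetes_resource" "torc" {
  api_version = "trident.netapp.io/v1"
  kind        = "TridentOrchestrator"

  metadata {
    name = "trident"
  }
}

output "trident_status" {
  value = data.kubernetes_resource.torc.object.status.status
}
```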
@balaramesh tried a 10-minute timer this time; sadly I got the same result :(
╷
│ Error: netapp/nett-app-backend failed to create kubernetes rest client for update of resource: resource [trident.netapp.io/v1/TridentBackendConfig] isn't valid for cluster, check the APIVersion and Kind fields are valid
│
│ with module.Netapp.kubectl_manifest.netapp_production_backend_config,
│ on Netapp_module\helm.tf line 92, in resource "kubectl_manifest" "netapp_production_backend_config":
│ 92: resource "kubectl_manifest" "netapp_production_backend_config" {
│
╵
The Helm chart link seems to work fine, thank you for that.
@dc232 were you ever able to find a solution for this?
Hi @ryangrush, I did in the end. It was a bit of a hack, as I needed to create a service account user. I also found that, because of a bug in Linux, NFSv4 doesn't mount natively, so I managed to get it working with NFSv3. I'll post the Terraform module for this on my GitHub for people to use in their CI/CD pipelines.
@ryangrush
have a look at
https://github.com/dc232/Terraform-Trident-Config
I haven't run terraform validate against it yet, as it's a little tricky, but it should get the job done with a few tweaks.
@dc232 thank you for uploading the repo! There are scarce few resources online for this.
I was able to manually install Trident last week, but I'm still having problems getting it to work with Terraform. This part doesn't seem to pass validation (at least not in TF v1.1.6); do you remember if that was needed?
Hi @ryangrush, so the idea on that line is to render out the YAML in memory; I was trying to get the raw output from the GET request, as that was the file that was needed. When you run validate, what's the error that you get? I have updated the repo slightly to remove the empty depends_on.
Hi @ryangrush, added a fix, as it was using the wrong resource type; try now.
@dc232 I was able to use data.http.crds.response_body now, thanks. I also noticed a small syntax error here, btw.
I saw you had run into something similar back in February, but I kept running into this error -
Error: tridentorchestrators.trident.netapp.io failed to create kubernetes rest client for update of resource: Get "http://localhost/api?timeout=32s": dial tcp [::1]:80: connect: connection refused
│
│ with module.netapp.kubectl_manifest.deploy_trident_orchestrator_crd,
│ on modules/netapp/main.tf line 72, in resource "kubectl_manifest" "deploy_trident_orchestrator_crd":
│ 72: resource "kubectl_manifest" "deploy_trident_orchestrator_crd" {
I did manage to get it working yesterday, but while adding the changes to a PR and applying it to our "sandbox" AKS cluster, it gets stuck on that error again. The key to getting it to work yesterday was adding a kubernetes_persistent_volume_claim resource to my proof-of-concept.
This is the TF code currently, with some of the other resources removed until I get past this problem. The Azure App for var.azure_client_id has the API permissions listed in the screenshot.
Do you remember how you resolved that error message? The only thing I can think of is a permissions issue, but I'm also quite mentally fatigued at this point too lol.
Hey @ryangrush, thanks for pointing out the error. I have updated it to make it compatible with the new Trident version via https://github.com/NetApp/trident/issues/716.
I don't remember exactly how I got it working in the past, however I suspect it was via the kubectl provider by Gavin Bunney; it has a bit of a funny config, so when I created this project I did it this way:
provider "kubectl" {
  host                   = coalesce(module.MianModule.kown_kube_host, module.MianModule.kube_host)
  client_certificate     = base64decode(coalesce(module.MianModule.client_certificate_file, module.MianModule.kube_client_certificate))
  client_key             = base64decode(coalesce(module.MianModule.client_key_file, module.MianModule.kube_client_key))
  cluster_ca_certificate = base64decode(coalesce(module.MianModule.cluster_ca_certificate_file, module.MianModule.kube_cluster_ca_cert))
  load_config_file       = false # this is true by default, see https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs
}
The docs for the provider also suggest you can load the kubeconfig directly, so something like:
provider "kubectl" {
  load_config_file = true
  config_path      = "~/.kube/config"
  config_context   = "my-context"
}
I think how I got round the error in the end was by setting
load_config_file = false
in my case.
These docs should be able to help with your use case: https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs
The kubernetes_persistent_volume_claim shouldn't be required if you're using the CSI driver; all that is required is to set the name of the storage class, and it should create the persistent volume and the claim automatically for you when using something like Helm charts.
It's only when you're creating a single standalone deployment that a kubernetes_persistent_volume_claim would be required.
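For reference, dynamic provisioning with a CSI driver still needs a claim object somewhere (a Helm chart typically creates it for you); what becomes unnecessary is pre-creating the PersistentVolume itself. A sketch of a standalone claim, with hypothetical names:

```hcl
# A claim that names a Trident-backed StorageClass; the CSI driver
# dynamically provisions the PersistentVolume to satisfy it.
# "azure-netapp-files" here is a hypothetical StorageClass name.
resource "kubernetes_persistent_volume_claim" "netapp" {
  metadata {
    name      = "storage-class-netapp-pr"
    namespace = "default"
  }

  spec {
    access_modes       = ["ReadWriteMany"]
    storage_class_name = "azure-netapp-files"

    resources {
      requests = {
        storage = "100Gi"
      }
    }
  }
}
```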
Hope this helps; I feel your pain. If you want any more assistance, let me know.
@dc232 ok thanks, I'll look into the kubectl provider angle.
One of the few differences between the PoC I stood up yesterday and integrating it into our main Terraform repo is that some providers are defined elsewhere, so that could be in line with the kubectl theory. Also, it's using a dedicated TF Azure App for its identity, whereas I was using my personal Azure user as the identity for the PoC.
Thanks for the code snippets and link.
@dc232 it looks like load_config_file = false and defining the provider "kubectl" block was key in getting it to play nice with our other Terraform code. I think I've managed to finally get it working; thanks again for everything!
It does look like it's still dependent on the kubernetes_persistent_volume_claim resource being defined, however. I tried removing that resource and just referencing the storage class, but it says 0/1 nodes are available: 1 persistentvolumeclaim "storage-class-netapp-pr" not found.
Here is my TF code, for what it's worth -
volume {
  name = "netapp-pr"

  persistent_volume_claim {
    claim_name = "storage-class-netapp-pr"
  }
}

volume_mount {
  name       = "netapp-pr"
  read_only  = false
  mount_path = "/data-netapp-pr/"
}