
lxd init produces config which fails validation

mmokrejs opened this issue 1 year ago

Required information

  • Distribution: Gentoo
  • Distribution version: current
  • The output of "snap list --all lxd core20 core22 core24 snapd": NA
  • The output of "lxc info" or if that fails:
# lxc info
config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- storage_api_remote_volume_snapshot_copy
- zfs_delegate
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- init_preseed_storage_volumes
- metrics_instances_count
- server_instance_type_info
- resources_disk_mounted
- server_version_lts
- oidc_groups_claim
- loki_config_instance
- storage_volatile_uuid
- import_instance_devices
- instances_uefi_vars
- instances_migration_stateful
- container_syscall_filtering_allow_deny_syntax
- access_management
- vm_disk_io_limits
- storage_volumes_all
- instances_files_modify_permissions
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: root
auth_user_method: unix
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIICDzCCAZWgAwIBAgIQEXV0oidWpjtZIBbRNDFAYjAKBggqhkjOPQQDAzA4MRww
    GgYDVQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRgwFgYDVQQDDA9yb290QGRlbGwt
    ZTU1ODAwHhcNMjMwOTEyMDkyNjQ2WhcNMzMwOTA5MDkyNjQ2WjA4MRwwGgYDVQQK
    ExNsaW51eGNvbnRhaW5lcnMub3JnMRgwFgYDVQQDDA9yb290QGRlbGwtZTU1ODAw
    djAQBgcqhkjOPQIBBgUrgQQAIgNiAAR+0So/ESK5qNqE0Pnf+6esB3a+sSB/k6gg
    zWPw3u5ibMsI6SzOnHk791PBxFj7XOczJKJiXkBOsy/yszYWgK9vL184mWAzCMZu
    BBAl5fPotnDKqodIA/Ekqa/gtXVkW1ijZDBiMA4GA1UdDwEB/wQEAwIFoDATBgNV
    HSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMC0GA1UdEQQmMCSCCmRlbGwt
    ZTU1ODCHBH8AAAGHEAAAAAAAAAAAAAAAAAAAAAEwCgYIKoZIzj0EAwMDaAAwZQIx
    AOtnEW/8f+MwmRs6mzVJWuh5fhf20TCcVMUB61JLu/EGCzKfB36EACVeKwqmnD6y
    ZwIwYBEu7Nzyb8nWL9Q3jcsa/lf9eeJjGkiUW67gs0n6qq6C1Biy6BAN7BZVo+me
    ywbj
    -----END CERTIFICATE-----
  certificate_fingerprint: 7fd61a6e356536f2b16e529a0edd944d4298e653c77e06dfca1a308a4c343ce8
  driver: lxc
  driver_version: 6.0.0
  instance_types:
  - container
  firewall: xtables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 6.7.10-gentoo-dist
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Gentoo
  os_version: "2.15"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: vss2
  server_pid: 3682640
  server_version: 5.21.1
  server_lts: true
  storage: btrfs
  storage_version: "6.8"
  storage_supported_drivers:
  - name: dir
    version: "1"
    remote: false
  - name: btrfs
    version: "6.8"
    remote: false
#
  • Kernel version: 6.8.8
  • LXC version: app-containers/lxc-6.0.0-r1:0/1.8::gentoo USE="caps pam seccomp ssl systemd tools -apparmor -examples -io-uring -lto -man (-selinux) -test -verify-sig"
  • LXD version: app-containers/lxd-5.21.1:0/stable::gentoo USE="nls -apparmor -verify-sig"
  • Storage backend in use: ext4

Issue description

# /usr/sbin/lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: no
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: no
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: lxdbr0
Would you like the LXD server to be available over the network? (yes/no) [default=no]: yes
Address to bind LXD to (not including port) [default=all]: 
Port to bind LXD to [default=8443]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]: 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
config:
  core.https_address: '[::]:8443'
networks: []
storage_pools: []
storage_volumes: []
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdbr0
      type: nic
  name: default
projects: []
cluster: null

Error: Failed to update profile "default": Device validation failed for "eth0": Cannot use "nictype" property in conjunction with "network" property
#
  1. The message is confusing.
  2. I assumed the config file would include my existing storage pools and volumes (sketched below).
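For context, I expected the printed preseed to list the pool that already exists, something along these lines (a sketch only; the source path is hypothetical and the names may differ):

storage_pools:
- config:
    source: /path/to/existing/btrfs/pool
  description: ""
  name: default
  driver: btrfs
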
# ifconfig  |  grep UP,BROADCAST,RUNNING,MULTICAST
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
lxdbr0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
veth99be41cf: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
vethbe338fac: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
vethc000f9ef: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
vethce33f53e: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
#

I think iproute or systemd or network-manager renamed eth0 to enp4s0. Is that the source of the problem?

https://discuss.linuxcontainers.org/t/replacing-network-bridge/16344

mmokrejs avatar May 10 '24 14:05 mmokrejs

To make sure we're on the same page, are you configuring a freshly installed lxd or reconfiguring an existing one? If you already have an existing config, could you provide the config of the existing default profile (lxc profile show default)? Also, are you able to provide the network config as shown by ip addr?

Thanks!

MggMuggins avatar May 15 '24 22:05 MggMuggins

Yeah, this is a bug, as lxd init shouldn't be using nictype to reference a managed network and should instead be using network.

tomponline avatar Jun 17 '24 09:06 tomponline

Could it be that the current configuration in the profile already defines a NIC named eth0 that uses the network property?

I assume the profile has the following NIC defined:

...
- devices:
    eth0:
      type: nic
      network: lxdbr0

When lxd init is run, it tries to set the nictype and parent properties on that NIC, which are also valid within LXD. However, the issue arises because those fields are merged with the existing device, and the profile update fails since the nictype and network properties are mutually exclusive.
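
Roughly, the merged device that fails validation would look like this (a sketch combining both sources):

eth0:
  type: nic
  network: lxdbr0   # from the existing profile
  name: eth0        # added by lxd init
  nictype: bridged  # added by lxd init
  parent: lxdbr0    # added by lxd init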

We can fix that by using network in lxd init (and in other places), but the issue would remain, just in the reverse scenario (see the mirrored sketch after the reproducer below).

Reproducer:

# Create managed network.
lxc network create lxdbr0

# Configure NIC in default profile.
lxc profile edit default << EOF
name: default
description: Default LXD profile
config: {}
devices:
  eth0:
    type: nic
    network: lxdbr0
EOF

# Try to use "lxd init" to update NIC in default profile.
lxd init --preseed << EOF
profiles:
- name: default
  devices:
    eth0:
      type: nic
      nictype: bridged
      parent: lxdbr0
EOF            
Error: Failed to update profile "default": Device validation failed for "eth0": Cannot use "nictype" property in conjunction with "network" property
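
The reverse scenario would presumably be the mirror of this (sketch, not run here):

# Configure NIC in default profile with nictype/parent.
lxc profile edit default << EOF
name: default
description: Default LXD profile
config: {}
devices:
  eth0:
    type: nic
    nictype: bridged
    parent: lxdbr0
EOF

# A preseed that switches the NIC to "network" would hit the same validation error.
lxd init --preseed << EOF
profiles:
- name: default
  devices:
    eth0:
      type: nic
      network: lxdbr0
EOF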

MusicDin avatar Sep 16 '24 15:09 MusicDin

Could it be that the current configuration in the profile already defines NIC with name eth0 that uses network property?

What do you get on a fresh system with the OP's reproducer steps?

tomponline avatar Sep 16 '24 16:09 tomponline

No error on the fresh LXD install.

Tested as follows on latest/edge and 5.21/edge:

lxc network create lxdbr0
lxd init # Followed the interactive mode as described by OP

MusicDin avatar Sep 16 '24 16:09 MusicDin

The error provided by the OP states that the network and nictype options are conflicting, but we always set only nictype and parent (without the network property).

MusicDin avatar Sep 16 '24 16:09 MusicDin

Ah yeah I'm happy to see that lxd init is using network option and not parent:

Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
config: {}
networks:
- config:
    ipv4.address: auto
    ipv6.address: auto
  description: ""
  name: lxdbr0
  type: ""
  project: default
storage_pools:
- config:
    size: 5GiB
  description: ""
  name: default
  driver: zfs
storage_volumes: []
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      network: lxdbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
projects: []
cluster: null

tomponline avatar Sep 17 '24 07:09 tomponline

@mmokrejs when you imported the config from lxd init, was it on a fresh system or did it have existing configuration?

tomponline avatar Sep 17 '24 07:09 tomponline

@tomponline Thank you for your efforts. I don't remember, sorry.

mmokrejs avatar Sep 17 '24 10:09 mmokrejs

OK, let's close this for now, but if you encounter this again, let us know.

As @MusicDin mentioned, it sounds like there was already conflicting historical config on your LXD install.
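
If that is the case, one way to clear the conflict manually (untested here, assuming lxdbr0 is the managed bridge) would be to re-add the device pointing at the network:

lxc profile device remove default eth0
lxc profile device add default eth0 nic network=lxdbr0 name=eth0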

tomponline avatar Sep 17 '24 10:09 tomponline

Yeah, but please add some more Info/Debug messages along the path. And I bet you can reproduce it after some attempts mimicking what a newbie would do.

mmokrejs avatar Sep 17 '24 11:09 mmokrejs

Ah yeah I'm happy to see that lxd init is using network option and not parent:

True, but only if creating a new bridged network: https://github.com/canonical/lxd/blob/72bd63a2e22e52f3c28873ce300d1a1ebd4a1c4b/lxd/main_init_interactive.go#L455-L460

Otherwise, if an existing network is being used, the following is added to the profile: https://github.com/canonical/lxd/blob/72bd63a2e22e52f3c28873ce300d1a1ebd4a1c4b/lxd/main_init_interactive.go#L363-L373

MusicDin avatar Sep 17 '24 12:09 MusicDin

Ah yeah I'm happy to see that lxd init is using network option and not parent:

True, but only if creating a new bridged network:

https://github.com/canonical/lxd/blob/72bd63a2e22e52f3c28873ce300d1a1ebd4a1c4b/lxd/main_init_interactive.go#L455-L460

Otherwise, if an existing network is being used, the following is added to the profile:

https://github.com/canonical/lxd/blob/72bd63a2e22e52f3c28873ce300d1a1ebd4a1c4b/lxd/main_init_interactive.go#L363-L373

Ah, that is correct if it's an existing bridge that isn't a managed network.

Does it differentiate between an existing managed network and existing unmanaged network?

tomponline avatar Sep 17 '24 13:09 tomponline

No, it defaults to macvlan and checks for the presence of /sys/class/net/%s/bridge to detect bridged networks. Correct me if I'm wrong, but we detect managed networks by checking the database for their presence?
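
In shell terms, the current check is roughly equivalent to this (a sketch of the logic, not the actual Go code):

if [ -d /sys/class/net/lxdbr0/bridge ]; then
    nictype="bridged"
else
    nictype="macvlan"
fi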

MusicDin avatar Sep 17 '24 13:09 MusicDin

Correct me if I'm wrong, but we detect managed networks by looking into database for its presence?

Yeah, feels like we should use "network" for an existing managed bridge network in the networks list.
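
From the CLI side the same distinction is visible, e.g. (a sketch; internally lxd init would presumably consult the database as mentioned above):

if lxc network show lxdbr0 | grep -q '^managed: true'; then
    echo 'use "network: lxdbr0"'              # managed bridge
else
    echo 'use "nictype: bridged" + "parent"'  # unmanaged bridge
fi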

tomponline avatar Sep 17 '24 13:09 tomponline