Inconsistency in unix-hotplug when multiple devices share product/vendor IDs

Open jonathan-conder opened this issue 1 year ago • 14 comments

Required information

  • Distribution: Ubuntu
  • Distribution version: 24.04.1
  • The output of "snap list --all lxd core20 core22 core24 snapd":
Name    Version         Rev    Tracking       Publisher   Notes
core20  20240705        2379   latest/stable  canonical✓  base
core22  20240904        1621   latest/stable  canonical✓  base
lxd     5.21.2-2f4ba6b  30131  5.21/stable    canonical✓  -
  • The output of "lxc info" or if that fails:
config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- storage_api_remote_volume_snapshot_copy
- zfs_delegate
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- init_preseed_storage_volumes
- metrics_instances_count
- server_instance_type_info
- resources_disk_mounted
- server_version_lts
- oidc_groups_claim
- loki_config_instance
- storage_volatile_uuid
- import_instance_devices
- instances_uefi_vars
- instances_migration_stateful
- container_syscall_filtering_allow_deny_syntax
- access_management
- vm_disk_io_limits
- storage_volumes_all
- instances_files_modify_permissions
- image_restriction_nesting
- container_syscall_intercept_finit_module
- device_usb_serial
- network_allocate_external_ips
- explicit_trust_token
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: ****
auth_user_method: unix
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    ****
    -----END CERTIFICATE-----
  certificate_fingerprint: ****
  driver: lxc | qemu
  driver_version: 6.0.0 | 8.2.1
  instance_types:
  - container
  - virtual-machine
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 6.8.0-45-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "24.04"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: ****
  server_pid: 3073
  server_version: 5.21.2
  server_lts: true
  storage: dir
  storage_version: "1"
  storage_supported_drivers:
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.48.0
    remote: false
  - name: powerflex
    version: 1.16 (nvme-cli)
    remote: true
  - name: zfs
    version: 2.2.2-0ubuntu9
    remote: false
  - name: btrfs
    version: 5.16.2
    remote: false
  - name: ceph
    version: 17.2.7
    remote: true
  - name: cephfs
    version: 17.2.7
    remote: true
  - name: cephobject
    version: 17.2.7
    remote: true

Issue description

I have a couple of USB cameras, which I tried adding to a container using unix-hotplug devices. On my physical machine, these create devices /dev/video[0-3], but only /dev/video[1-2] are mounted in the container. If I unplug the external camera and plug it back in, /dev/video3 also appears (as do /dev/bus/usb/003/014 and /dev/snd/controlC3). So it seems like hotplugging adds every device which matches the product/vendor ID, but starting the container only adds one.

Steps to reproduce

  1. Plug in external camera
  2. Create a container and add the cameras:
$ ls -l /dev/video*
crw-rw----+ 1 root video 81, 0 Oct 19 05:29 /dev/video0
crw-rw----+ 1 root video 81, 1 Oct 19 05:29 /dev/video1
crw-rw----+ 1 root video 81, 2 Oct 19 05:34 /dev/video2
crw-rw----+ 1 root video 81, 3 Oct 19 05:34 /dev/video3
$ lxc launch ubuntu:24.04 u1
$ lxc config device add u1 internal-camera unix-hotplug productid=**** vendorid=**** required=false
$ lxc config device add u1 external-camera unix-hotplug productid=**** vendorid=**** required=false
$ lxc exec u1 -- sh -c 'ls -l /dev/video*'
crw-rw---- 1 root root 81, 1 Oct 18 16:39 /dev/video1
crw-rw---- 1 root root 81, 2 Oct 18 16:39 /dev/video2
  3. Unplug external camera
  4. Check that the devices have gone:
$ ls -l /dev/video*
crw-rw----+ 1 root video 81, 0 Oct 19 05:29 /dev/video0
crw-rw----+ 1 root video 81, 1 Oct 19 05:29 /dev/video1
$ lxc exec u1 -- sh -c 'ls -l /dev/video*'
crw-rw---- 1 root root 81, 1 Oct 18 16:39 /dev/video1
  5. Plug in external camera:
$ ls -l /dev/video*
crw-rw----+ 1 root video 81, 0 Oct 19 05:29 /dev/video0
crw-rw----+ 1 root video 81, 1 Oct 19 05:29 /dev/video1
crw-rw----+ 1 root video 81, 2 Oct 19 05:41 /dev/video2
crw-rw----+ 1 root video 81, 3 Oct 19 05:41 /dev/video3
$ lxc exec u1 -- sh -c 'ls -l /dev/video*'
crw-rw---- 1 root root  81, 1 Oct 18 16:39 /dev/video1
crw-rw---- 1 root video 81, 2 Oct 18 16:41 /dev/video2
crw-rw---- 1 root video 81, 3 Oct 18 16:41 /dev/video3

The devices which initially show up seem to be /dev/video1 and /dev/video2 consistently. My preferred fix would be to add all matching devices right away, rather than restricting hotplug events to one device per product/vendor ID.

jonathan-conder avatar Oct 13 '24 21:10 jonathan-conder

Another (separate but related) issue: already plugged in devices have root:root ownership, whereas hotplugged devices have root:video ownership. Here is what I see after replugging my external camera:

$ lxc shell u1
root@u1:~# ls -l /dev/video*
crw-rw---- 1 root root  81, 1 Oct 14 01:46 /dev/video1
crw-rw---- 1 root video 81, 2 Oct 14 01:47 /dev/video2
crw-rw---- 1 root video 81, 3 Oct 14 01:47 /dev/video3
root@u1:~# udevadm info --name /dev/video1
P: /devices/pci0000:00/0000:00:08.1/0000:36:00.3/usb1/1-4/1-4:1.0/video4linux/video1
M: video1
R: 1
U: video4linux
D: c 81:1
N: video1
L: 0
E: DEVPATH=/devices/pci0000:00/0000:00:08.1/0000:36:00.3/usb1/1-4/1-4:1.0/video4linux/video1
E: DEVNAME=/dev/video1
E: MAJOR=81
E: MINOR=1
E: SUBSYSTEM=video4linux

root@u1:~# udevadm info --name /dev/video2
P: /devices/pci0000:00/0000:00:08.1/0000:36:00.4/usb3/3-1/3-1.4/3-1.4:1.0/video4linux/video2
M: video2
R: 2
U: video4linux
D: c 81:2
N: video2
L: 0
S: v4l/by-path/pci-0000:36:00.4-usbv2-0:1.4:1.0-video-index0
S: v4l/by-id/usb-****_****_Webcam_****-video-index0
S: v4l/by-path/pci-0000:36:00.4-usb-0:1.4:1.0-video-index0
E: DEVPATH=/devices/pci0000:00/0000:00:08.1/0000:36:00.4/usb3/3-1/3-1.4/3-1.4:1.0/video4linux/video2
E: DEVNAME=/dev/video2
E: MAJOR=81
E: MINOR=2
E: SUBSYSTEM=video4linux
E: USEC_INITIALIZED=20796201304
E: ID_V4L_VERSION=2
E: ID_V4L_PRODUCT=****Webcam****
E: ID_V4L_CAPABILITIES=:capture:
E: ID_USB_MODEL=****_Webcam_****
E: ID_USB_MODEL_ENC=****\x20Webcam\x20****
E: ID_USB_MODEL_ID=****
E: ID_USB_SERIAL=****_****_Webcam_****
E: ID_USB_SERIAL_SHORT=****
E: ID_USB_VENDOR=****
E: ID_USB_VENDOR_ENC=****
E: ID_USB_VENDOR_ID=****
E: ID_USB_REVISION=0019
E: ID_USB_TYPE=video
E: ID_USB_INTERFACES=:0e0100:0e0200:010100:010200:
E: ID_USB_INTERFACE_NUM=00
E: ID_USB_DRIVER=uvcvideo
E: ID_PATH_WITH_USB_REVISION=pci-0000:36:00.4-usbv2-0:1.4:1.0
E: ID_PATH=pci-0000:36:00.4-usb-0:1.4:1.0
E: ID_PATH_TAG=pci-0000_36_00_4-usb-0_1_4_1_0
E: DEVLINKS=/dev/v4l/by-path/pci-0000:36:00.4-usbv2-0:1.4:1.0-video-index0 /dev/v4l/by-id/usb-****_****_Webcam_****-video-index0>
E: TAGS=:uaccess:seat:
E: CURRENT_TAGS=:uaccess:seat:

jonathan-conder avatar Oct 14 '24 01:10 jonathan-conder

Hi @jonathan-conder,

Thank you for reporting these issues.

With regard to the first issue of hot plugging external USB video devices, can you please try lxc config device add <instance_name> <device_name> unix-char source=<path_on_host> path=<path_on_instance> required=false? Specifying a source path and setting required to false enables hot plugging for Unix character devices. The unix-hotplug device is a bit of a hybrid between unix-block and unix-char, and only captures the first device with a matching product and vendor ID.

kadinsayani avatar Oct 14 '24 20:10 kadinsayani

I just read through https://github.com/canonical/workshop/pull/193. Based on my understanding, the reason for using unix-hotplug is to avoid the limit of 10 devices and manually setting the user and group. unix-hotplug and unix-char use the same underlying functions to set up a device with slightly different logic for default file modes (unix-hotplug attempts to retrieve the host device mode and if there is no type field, the ownership defaults to root). In either case, it is probably better to manually specify a uid, gid and mode when hot plugging devices. As far as the limit, I am not aware of any hardcoded logic limiting the number of hot pluggable devices but I'll look into this further.

kadinsayani avatar Oct 14 '24 21:10 kadinsayani

Based on my understanding, the reason for using unix-hotplug is to avoid the limit of 10 devices and manually setting the user and group.

Yes, we can query the devices using the LXD API (e.g. lxc query /1.0/resources) which provides the productid and vendorid for unix-hotplug but not the device node needed for unix-char. Would prefer to use this API rather than relying on something hacky like ls /dev/video*. Actually I was hoping to configure unix-hotplug devices via subsystem rather than productid/vendorid, but that's a feature request so I'll raise it through another channel.

Manually setting the user and group is OK, but personally I'd prefer to have that handled automatically. Ideally I'd like it to match how things work on the host, i.e. my user isn't even in the video group but still has access to /dev/video0 via ACLs (for unix-hotplug devices this is possible by artificially creating a logind session on seat0, e.g. systemd-run -p PAMName=login -p User=ubuntu --setenv=XDG_SEAT=seat0 sleep infinity).

As far as the limit, I am not aware of any hardcoded logic limiting the number of hot pluggable devices but I'll look into this further.

Not sure what you mean, but I picked the limit of 10 devices for simplicity, not because LXD can't handle more than that. If you're talking about my cameras not showing up, there does seem to be a limit of 1 existing device per unix-hotplug item, see: https://github.com/canonical/lxd/blob/79e117f76e113f2611e874430f8945fafe612ede/lxd/device/unix_hotplug.go#L152

jonathan-conder avatar Oct 14 '24 22:10 jonathan-conder

Yes, we can query the devices using the LXD API (e.g. lxc query /1.0/resources) which provides the productid and vendorid for unix-hotplug but not the device node needed for unix-char. Would prefer to use this API rather than relying on something hacky like ls /dev/video*. Actually I was hoping to configure unix-hotplug devices via subsystem rather than productid/vendorid, but that's a feature request so I'll raise it through another channel.

Understood, thanks. Based on the current implementation of unix-hotplug, the resolution for the issue described is also a feature request/improvement. We can continue tracking it here.

Manually setting the user and group is OK, but personally I'd prefer to have that handled automatically. Ideally I'd like it to match how things work on the host, i.e. my user isn't even in the video group but still has access to /dev/video0 via ACLs (for unix-hotplug devices this is possible by artificially creating a logind session on seat0, e.g. systemd-run -p PAMName=login -p User=ubuntu --setenv=XDG_SEAT=seat0 sleep infinity).

It is handled automatically, the logic is just different for unix-hotplug and unix-char.

Not sure what you mean, but I picked the limit of 10 devices for simplicity, not because LXD can't handle more than that.

Your PR mentions the following:

For now we just expose /dev/video[0-9] as unix-char devices. This also supports hotplugging but is limited to 10 devices and requires manually setting the device user and group to 1000.

I now understand that the limit is not on our side, sorry for the confusion.

kadinsayani avatar Oct 14 '24 22:10 kadinsayani

It is handled automatically, the logic is just different for unix-hotplug and unix-char.

Sorry, I should rephrase - my preference is that the devices should be owned by root:video. Looks like unix-char is always root:root but for unix-hotplug it depends on when you insert the device.

I now understand that the limit is not on our side, sorry for the confusion.

Oh I see, yeah I could have worded that more clearly.

jonathan-conder avatar Oct 14 '24 22:10 jonathan-conder

Sorry, I should rephrase - my preference is that the devices should be owned by root:video. Looks like unix-char is always root:root but for unix-hotplug it depends on when you insert the device.

Thanks for clarifying!

kadinsayani avatar Oct 14 '24 22:10 kadinsayani

but for unix-hotplug it depends on when you insert the device.

@kadinsayani why is this?

tomponline avatar Oct 18 '24 14:10 tomponline

@jonathan-conder with both cameras plugged in please provide output of lxc query /1.0/resources.

tomponline avatar Oct 18 '24 14:10 tomponline

@jonathan-conder can we get reproducer steps for the statement:

but for unix-hotplug it depends on when you insert the device

tomponline avatar Oct 18 '24 14:10 tomponline

with both cameras plugged in please provide output of lxc query /1.0/resources.

Here is the usb section, let me know if you need more than that:

{
	"usb": {
		"devices": [
			{
				"bus_address": 1,
				"device_address": 2,
				"interfaces": [
					{
						"class": "Video",
						"class_id": 14,
						"driver": "uvcvideo",
						"driver_version": "1.1.1",
						"number": 0,
						"subclass": "Video Control",
						"subclass_id": 1
					},
					{
						"class": "Video",
						"class_id": 14,
						"driver": "uvcvideo",
						"driver_version": "1.1.1",
						"number": 1,
						"subclass": "Video Streaming",
						"subclass_id": 2
					}
				],
				"product": "USB2.0 HD UVC WebCam",
				"product_id": "b685",
				"serial": "",
				"speed": 480,
				"vendor": "",
				"vendor_id": "2b7e"
			},
			{
				"bus_address": 3,
				"device_address": 5,
				"interfaces": [
					{
						"class": "Video",
						"class_id": 14,
						"driver": "uvcvideo",
						"driver_version": "1.1.1",
						"number": 0,
						"subclass": "Video Control",
						"subclass_id": 1
					},
					{
						"class": "Video",
						"class_id": 14,
						"driver": "uvcvideo",
						"driver_version": "1.1.1",
						"number": 1,
						"subclass": "Video Streaming",
						"subclass_id": 2
					},
					{
						"class": "Audio",
						"class_id": 1,
						"driver": "snd-usb-audio",
						"driver_version": "6.8.0-47-generic",
						"number": 2,
						"subclass": "Control Device",
						"subclass_id": 1
					},
					{
						"class": "Audio",
						"class_id": 1,
						"driver": "snd-usb-audio",
						"driver_version": "6.8.0-47-generic",
						"number": 3,
						"subclass": "Streaming",
						"subclass_id": 2
					}
				],
				"product": "HD Pro Webcam C920",
				"product_id": "0892",
				"serial": "",
				"speed": 480,
				"vendor": "Logitech, Inc.",
				"vendor_id": "046d"
			},
			{
				"bus_address": 3,
				"device_address": 6,
				"interfaces": [
					{
						"class": "Human Interface Device",
						"class_id": 3,
						"driver": "usbhid",
						"driver_version": "6.8.0-47-generic",
						"number": 0,
						"subclass": "",
						"subclass_id": 0
					}
				],
				"product": "",
				"product_id": "82ff",
				"serial": "",
				"speed": 480,
				"vendor": "Texas Instruments, Inc.",
				"vendor_id": "0451"
			},
			{
				"bus_address": 3,
				"device_address": 3,
				"interfaces": [
					{
						"class": "Wireless",
						"class_id": 224,
						"driver": "btusb",
						"driver_version": "0.8",
						"number": 0,
						"subclass": "Radio Frequency",
						"subclass_id": 1
					},
					{
						"class": "Wireless",
						"class_id": 224,
						"driver": "btusb",
						"driver_version": "0.8",
						"number": 1,
						"subclass": "Radio Frequency",
						"subclass_id": 1
					}
				],
				"product": "Bluetooth Radio",
				"product_id": "3571",
				"serial": "",
				"speed": 12,
				"vendor": "IMC Networks",
				"vendor_id": "13d3"
			}
		],
		"total": 4
	}
}
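As a minimal sketch (not part of LXD), the usb section above could be reduced to the vendor/product pairs of the video-capable devices, which are the values unix-hotplug needs. The type names and the videoDevices helper are hypothetical; only the JSON field names are taken from the output above, and the sample is a trimmed copy of it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usbInterface and usbDevice mirror only the fields used here from the
// "usb" section of the /1.0/resources output.
type usbInterface struct {
	Class  string `json:"class"`
	Driver string `json:"driver"`
}

type usbDevice struct {
	Product    string         `json:"product"`
	ProductID  string         `json:"product_id"`
	VendorID   string         `json:"vendor_id"`
	Interfaces []usbInterface `json:"interfaces"`
}

type resources struct {
	USB struct {
		Devices []usbDevice `json:"devices"`
	} `json:"usb"`
}

// videoDevices (hypothetical helper) returns the devices that expose at
// least one interface of class "Video".
func videoDevices(raw []byte) ([]usbDevice, error) {
	var res resources

	err := json.Unmarshal(raw, &res)
	if err != nil {
		return nil, err
	}

	var out []usbDevice
	for _, dev := range res.USB.Devices {
		for _, iface := range dev.Interfaces {
			if iface.Class == "Video" {
				out = append(out, dev)
				break
			}
		}
	}

	return out, nil
}

// sample is a trimmed copy of the output shown in this comment.
const sample = `{"usb": {"devices": [
  {"product": "USB2.0 HD UVC WebCam", "product_id": "b685", "vendor_id": "2b7e",
   "interfaces": [{"class": "Video", "driver": "uvcvideo"}]},
  {"product": "HD Pro Webcam C920", "product_id": "0892", "vendor_id": "046d",
   "interfaces": [{"class": "Video", "driver": "uvcvideo"},
                  {"class": "Audio", "driver": "snd-usb-audio"}]},
  {"product": "Bluetooth Radio", "product_id": "3571", "vendor_id": "13d3",
   "interfaces": [{"class": "Wireless", "driver": "btusb"}]}
]}}`

func main() {
	devs, err := videoDevices([]byte(sample))
	if err != nil {
		panic(err)
	}

	for _, dev := range devs {
		fmt.Printf("%s vendorid=%s productid=%s\n", dev.Product, dev.VendorID, dev.ProductID)
	}
}
```

Note that this yields one entry per USB device, not per device node, which is why the node paths needed for unix-char are not available from this API.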

can we get reproducer steps

The steps are the same as for the original issue, but with ls -l instead of ls, to show permissions. I've edited the above steps to reflect that.

jonathan-conder avatar Oct 18 '24 16:10 jonathan-conder

but for unix-hotplug it depends on when you insert the device.

@kadinsayani why is this?

In the Start() function in unix_hotplug.go, we have:

	if device.Subsystem() == "block" {
		err = unixDeviceSetupBlockNum(d.state, d.inst.DevicesPath(), "unix", d.name, d.config, major, minor, device.Devnode(), false, &runConf)
	} else {
		err = unixDeviceSetupCharNum(d.state, d.inst.DevicesPath(), "unix", d.name, d.config, major, minor, device.Devnode(), false, &runConf)
	}

The false argument is a parameter named defaultMode. From the doc comment: "If defaultMode is true or mode is supplied in the device .config then the origin device does not need to be accessed for its file mode."

defaultMode is passed down to a chain of callee functions: unixDeviceSetup() -> UnixDeviceCreate(), where the device mode is sourced. If a mode exists in the device config, it is used; if it's not specified and the defaultMode is false, the source device's mode is used; and finally, it defaults to unixDefaultMode which is 0660.
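The precedence described above can be sketched roughly as follows. This is not the LXD code: resolveMode and fakeHostMode are hypothetical names, the statMode callback stands in for reading the mode of the source device on the host, and parsing the configured mode as an octal string is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// unixDefaultMode is the fallback value mentioned above (0660).
const unixDefaultMode os.FileMode = 0660

// resolveMode is a hypothetical helper that applies the three-step
// precedence: config mode, then source device mode, then the default.
func resolveMode(config map[string]string, defaultMode bool, statMode func() (os.FileMode, error)) (os.FileMode, error) {
	// 1. An explicit mode in the device config always wins.
	if config["mode"] != "" {
		// Assumption: the mode is written as an octal string such as "0660".
		v, err := strconv.ParseUint(config["mode"], 8, 32)
		if err != nil {
			return 0, fmt.Errorf("invalid mode %q: %w", config["mode"], err)
		}

		return os.FileMode(v), nil
	}

	// 2. With defaultMode false, use the mode of the source device.
	if !defaultMode {
		return statMode()
	}

	// 3. Otherwise fall back to the default.
	return unixDefaultMode, nil
}

// fakeHostMode pretends the source device has mode 0640.
func fakeHostMode() (os.FileMode, error) {
	return 0640, nil
}

func main() {
	fromConfig, _ := resolveMode(map[string]string{"mode": "0600"}, false, fakeHostMode)
	fromHost, _ := resolveMode(map[string]string{}, false, fakeHostMode)
	fallback, _ := resolveMode(map[string]string{}, true, fakeHostMode)

	fmt.Printf("%04o %04o %04o\n", fromConfig, fromHost, fallback)
}
```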

defaultMode is set to false when the caller function is in unix_hotplug.go and true when the caller function is in unix_common.go which handles unix-char and unix-block devices.

Not sure why we're observing different behaviour based on when the device is plugged in - I will continue investigating.

kadinsayani avatar Oct 18 '24 21:10 kadinsayani

My apologies, I went in the wrong direction with my previous comment. I don't think the file mode setup has anything to do with the GID bug.

Here is my current hypothesis:

The ownership defaults to root (0) when the device config does not contain a UID or GID. When a device is hot plugged (started) or the LXD instance is started, the config's ownership fields are filled. When a device is registered (added), the config must not be filled correctly so ownership is defaulting to root.
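To illustrate only the defaulting rule in the first sentence of this hypothesis (not the cause of the bug), here is a small sketch. ownerFromConfig is a hypothetical helper rather than an LXD function, and the gid value 44 is just an example number.

```go
package main

import (
	"fmt"
	"strconv"
)

// ownerFromConfig is a hypothetical helper: a uid or gid that is missing
// from the device config falls back to 0 (root).
func ownerFromConfig(config map[string]string) (uid int, gid int, err error) {
	if config["uid"] != "" {
		uid, err = strconv.Atoi(config["uid"])
		if err != nil {
			return 0, 0, fmt.Errorf("invalid uid %q: %w", config["uid"], err)
		}
	}

	if config["gid"] != "" {
		gid, err = strconv.Atoi(config["gid"])
		if err != nil {
			return 0, 0, fmt.Errorf("invalid gid %q: %w", config["gid"], err)
		}
	}

	return uid, gid, nil
}

func main() {
	// No ownership fields: both default to root.
	uid, gid, _ := ownerFromConfig(map[string]string{})
	fmt.Println(uid, gid)

	// Only gid supplied: uid still defaults to root.
	uid, gid, _ = ownerFromConfig(map[string]string{"gid": "44"})
	fmt.Println(uid, gid)
}
```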

I'm looking through the code for evidence to confirm this hypothesis; then we can work on a fix for the ownership issue.

kadinsayani avatar Oct 19 '24 20:10 kadinsayani

As for the first issue regarding hot plugging devices that share product/vendor IDs, I see that loadUnixDevice (used for registering hotplug devices) only returns the first matching device with the subsystem type char or block. Whenever a unix hotplug device event occurs, the handler function is executed for each device, which is why all devices show up after replugging the camera.

We can modify the behaviour of registering/adding unix hotplug devices to add all devices with a matching product/vendor ID, rather than just the first matching device.
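The difference between the two behaviours can be sketched as below. This is a simplified illustration under stated assumptions, not the LXD implementation: hostDevice, matches, firstMatch and allMatches are hypothetical names, treating an empty ID as a wildcard is an assumption, and the sample list only loosely mirrors the cameras from this issue.

```go
package main

import "fmt"

// hostDevice is a hypothetical, simplified stand-in for a udev device entry.
type hostDevice struct {
	Devnode   string
	VendorID  string
	ProductID string
}

// matches reports whether dev satisfies the configured IDs. An empty ID is
// treated as a wildcard (an assumption made for this sketch).
func matches(dev hostDevice, vendorID string, productID string) bool {
	if vendorID != "" && dev.VendorID != vendorID {
		return false
	}

	if productID != "" && dev.ProductID != productID {
		return false
	}

	return true
}

// firstMatch models the behaviour described above: stop at the first hit.
func firstMatch(devs []hostDevice, vendorID string, productID string) (hostDevice, bool) {
	for _, dev := range devs {
		if matches(dev, vendorID, productID) {
			return dev, true
		}
	}

	return hostDevice{}, false
}

// allMatches models the proposed behaviour: collect every hit.
func allMatches(devs []hostDevice, vendorID string, productID string) []hostDevice {
	var out []hostDevice
	for _, dev := range devs {
		if matches(dev, vendorID, productID) {
			out = append(out, dev)
		}
	}

	return out
}

// sampleDevices is illustrative data: two cameras with two nodes each.
var sampleDevices = []hostDevice{
	{Devnode: "/dev/video0", VendorID: "2b7e", ProductID: "b685"},
	{Devnode: "/dev/video1", VendorID: "2b7e", ProductID: "b685"},
	{Devnode: "/dev/video2", VendorID: "046d", ProductID: "0892"},
	{Devnode: "/dev/video3", VendorID: "046d", ProductID: "0892"},
}

func main() {
	first, _ := firstMatch(sampleDevices, "046d", "0892")
	fmt.Println("first match:", first.Devnode)

	for _, dev := range allMatches(sampleDevices, "046d", "0892") {
		fmt.Println("all matches:", dev.Devnode)
	}
}
```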

kadinsayani avatar Oct 19 '24 20:10 kadinsayani

When a device is hot plugged (started) or the LXD instance is started, the config's ownership fields are filled.

where is this happening?

tomponline avatar Nov 18 '24 12:11 tomponline

This was happening due to inconsistent matching for Register and Start events. https://github.com/canonical/lxd/pull/14375/commits/ae8a3e5600d9f962c09719f7f84e0e8e877710a3 fixes this issue.

kadinsayani avatar Nov 18 '24 16:11 kadinsayani

Why was this causing different uid/gids?

tomponline avatar Nov 18 '24 16:11 tomponline

This wasn't actually causing the different uid/gids - this was causing inconsistent device matches.

kadinsayani avatar Nov 18 '24 16:11 kadinsayani

So what was the issue that caused the different uid/gids when hotplugged vs start time?

tomponline avatar Nov 18 '24 16:11 tomponline

I thought I had a fix in https://github.com/canonical/lxd/pull/14417 but I just tested and I'm still seeing the error after a hotplug event when uid and gid aren't supplied. I'll have to take a closer look and get back to you. Sorry I missed this, it's a bit tricky to test all cases manually.

kadinsayani avatar Nov 18 '24 19:11 kadinsayani

I thought I had a fix in #14417 but I just tested and I'm still seeing the error after a hotplug event when uid and gid aren't supplied. I'll have to take a closer look and get back to you. Sorry I missed this, it's a bit tricky to test all cases manually.

https://github.com/canonical/lxd/issues/14426.

kadinsayani avatar Nov 22 '24 23:11 kadinsayani