Azure_Kinect_ROS_Driver
[Feature] Support the Microphone Array
Is your feature request related to a problem? Please describe. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
Describe the solution you'd like A clear and concise description of what you want to happen.
Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.
Additional context Add any other context or screenshots about the feature request here.
The focus of the ROS node was navigation and manipulation.
If you are interested in Microphone projections, let's have a conversation.
Yes, I am very interested in microphone projections, because the microphone array can detect the object making a sound while the robot is navigating. If we calibrate the position of the sound source, we can use the microphone array signal to localize it and then navigate toward it. This is why I bought the K4A, but after installing the ROS driver I found that it does not support the microphone array, even though the K4A SDK does. So this only requires work on the ROS driver software, and I hope Microsoft can update it. Thanks for replying.
Can you share whether this is for a personal project, research project, or commercial deployment?
Yes I can. It is a research project. We want to use the k4a to locate and navigate.
Windows or Linux?
Linux, more precisely ROS Melodic
@ooeygui, Thanks for taking up this conversation. I would like to strongly second this feature request!
If the Azure Kinect is to be used as a fairly complete 'robot head', access to the microphone array would be valuable (Ubuntu 18.04, ROS Melodic, telerobotics/telepresence research project).
@bryantaoli: I would be very interested if you want to continue the conversation on how to best capture audio.
I have confirmed these so far:
- The SDK viewer can read out video and audio at the same time
- Audacity can record all 7 channels using pulseaudio, the microphone array is registered on a system level
- Audacity can record all 7 channels while the Azure Kinect is connected via ROS
- Correct (human) spatialization using the recorded 7 channels is possible
If not officially supported, an audio capture driver (or second ROS node) via pulseaudio might be feasible, and it can probably run next to the Azure Kinect ROS node.
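For anyone who wants to try that route before official support lands, here is a minimal PulseAudio capture sketch. The source name is a placeholder, not the Kinect's actual device name; list yours with pactl first.

```shell
# List audio sources and note the Azure Kinect's source name
pactl list sources short

# Record 5 seconds of raw 7-channel, 48 kHz, 16-bit audio.
# The --device value is a placeholder -- substitute the name from above.
timeout 5 parec --device=<kinect-source-name> \
    --channels=7 --rate=48000 --format=s16le > kinect_7ch.raw
```

At those settings the raw file should grow by 48000 × 7 × 2 = 672000 bytes per second, which is a quick sanity check that all channels are being captured.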
Yes, I also think it would not be difficult to get the audio in the Azure Kinect ROS node, but I would like to have official support.
Just to confirm the ask: You'd like the Azure Kinect to output audio samples like the audio_common package? (https://github.com/ros-drivers/audio_common/blob/master/audio_capture/src/audio_capture.cpp)
Yes, I'd like the Azure Kinect to output audio samples like the audio_common package, a third-party audio development kit that implements audio drivers and the related ROS message mechanisms.
I ran into the same problem, since the Kinect ROS driver cannot provide access to the microphone array, which is necessary for my research on Linux.
Hi, I'm interested in how to use the audio capture driver (or second ROS node) via pulseaudio, since the Kinect doesn't have a ROS node for the mic. Can you share some kind of demo if possible?
Hello @star0w, here is a brief recording example: try the Python library sounddevice. Use sounddevice.query_devices()
to get the Kinect index, and sounddevice.query_devices(kinect_index)
for information about the Kinect (e.g. sample rate). Then use sounddevice.rec(int(5 * 48000), samplerate=48000, channels=7, device=kinect_index, blocking=True)
to get a 5-second recording as a numpy array.
If you want to stream the audio, have a look at the audio_common package.
Thank you all for the input on this. As our team owns ROS on Windows and many other ROS solutions, we will take this feedback, fold it into our workstream, and prioritize it appropriately. Based on that backlog, the earliest we will be able to start working on it is May 2020.
Hello there, any progress? :)
Hi @linhan94, No progress has been made on exposing the microphone directly from this ROS node. I have no ETA to share, but would happily accept a PR.
I'm told by the Azure Kinect audio team that the microphone array is accessible as a multichannel audio device using ALSA directly. In that case, it should be accessible using the audio_common package as @roelofvandijk mentioned, but I have not had an opportunity to verify or document this.
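If that holds, a quick test with audio_capture might look like the following. The hw id is a placeholder you'd read off `arecord -l`, and the parameter names are my reading of the audio_capture source linked above; they may differ between versions.

```shell
# Find the Kinect array's ALSA card/device numbers
arecord -l

# Hypothetical invocation -- hw:X,Y is a placeholder from the output above;
# parameter names taken from the audio_capture source, may vary by version
rosrun audio_capture audio_capture _device:="hw:X,Y" _channels:=7 \
    _sample_rate:=48000 _format:=wave
```

With those settings the node would be handling 48000 × 7 × 2 = 672000 bytes of PCM per second.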
Is it possible to stream each channel separately for spatialization using the recorded 7 channels? Is there another way to do spatialization?
Thank you.
Hello @youssef266, as far as I remember, sounddevice yields a numpy array containing the 7 separate channels, which you could feed into a spatialization algorithm. For live spatialization, you would have to access the audio using a system-level audio API. You could try using ODAS with the Azure Kinect as documented here or try NAudio.
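As a minimal illustration of the first option, here is how the per-channel split of such a recording looks. Dummy random data stands in for an actual sounddevice recording, so this runs without any hardware.

```python
# Split a (frames, 7) multichannel recording into mono channels, as you
# would before feeding them to a spatialization / DOA algorithm.
# Dummy random data stands in for an actual sounddevice recording.
import numpy as np

frames, channels = 48000, 7
recording = np.random.randn(frames, channels).astype(np.float32)

# Column i is microphone i; each entry is a mono signal of shape (frames,)
mono_channels = [recording[:, i] for i in range(channels)]
print(len(mono_channels), mono_channels[0].shape)
```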
Also see this issue: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/536
Have a look at https://github.com/busybeaver42/kv3. It comes with an example for ODAS and the Kinect Azure: it contains the right ODAS cfg file for the Kinect Azure microphone array, and the example shows how you can use it in parallel with the Kinect Azure frame stream and OpenCV rendering.