Performance bottleneck in audio publishing with LiveKit Python SDK at scale
Issue Description
Problem Description
I've discovered a significant performance bottleneck when publishing audio files to LiveKit rooms using the Python SDK. The SDK works well with a small number of users (5), but performance degrades dramatically at just 25 concurrent users, which makes it impractical for our target load of 1000 concurrent users.
Expected Behavior
Publishing time should scale linearly, or at least reasonably, with the number of concurrent users.
The SDK should be able to handle hundreds of concurrent audio publishes with appropriate resource management.
Code Example
`import logging
import time
import wave

import numpy as np
from livekit import rtc

logger = logging.getLogger(__name__)


async def publish_audio_file(self, file_path):
    async with self._audio_publish_semaphore:
        logger.info(f"Acquired audio publishing semaphore for {self.username}")
        try:
            start_time = time.time()
            # Get audio properties
            with wave.open(file_path, 'rb') as wav_reader:
                channels = wav_reader.getnchannels()
                sample_rate = wav_reader.getframerate()
            # Create audio source and track
            source = rtc.AudioSource(sample_rate, channels)
            track = rtc.LocalAudioTrack.create_audio_track("audio", source)
            options = rtc.TrackPublishOptions()
            options.source = rtc.TrackSource.SOURCE_MICROPHONE
            # Publish the track - THIS IS WHERE THE DELAY OCCURS
            publication = await self.room.local_participant.publish_track(track, options)
            logger.info(f"Published track {publication.sid}")
            publish_time = time.time()
            logger.info(f"Published track in {publish_time - start_time} seconds.")
            # Read and send audio data (assumes 16-bit PCM samples)
            with wave.open(file_path, 'rb') as wav_file:
                num_frames = wav_file.getnframes()
                all_frames = wav_file.readframes(num_frames)
            frame = rtc.AudioFrame.create(
                sample_rate=sample_rate,
                num_channels=channels,
                # getnframes() already counts samples per channel, so do not divide by channels
                samples_per_channel=num_frames
            )
            audio_data = np.frombuffer(frame.data, dtype=np.int16)
            frame_data = np.frombuffer(all_frames, dtype=np.int16)
            copy_length = min(len(audio_data), len(frame_data))
            np.copyto(audio_data[:copy_length], frame_data[:copy_length])
            await source.capture_frame(frame)
        finally:
            logger.info("Released audio publishing semaphore")`
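Note that the timer in the snippet starts only after the semaphore is acquired, so the logged duration excludes any time a user spends queued behind other publishers. To tell semaphore wait apart from time spent in the publish call itself, a harness along the following lines could be used. It is a minimal sketch, not a measurement of LiveKit: `timed_publish`, `fake_publish`, and `measure` are hypothetical names, and `fake_publish` only sleeps to simulate latency instead of calling the SDK.

```python
import asyncio
import time


async def timed_publish(sem, publish, user):
    """Return (wait_seconds, work_seconds) for one semaphore-bounded call."""
    queued_at = time.perf_counter()
    async with sem:
        acquired_at = time.perf_counter()
        await publish(user)  # hypothetical stand-in for the real publish call
        done_at = time.perf_counter()
    return acquired_at - queued_at, done_at - acquired_at


async def fake_publish(user):
    # Placeholder: simulates network latency instead of calling LiveKit.
    await asyncio.sleep(0.01)


async def measure(num_users, max_concurrent):
    """Run num_users fake publishes and return (worst wait, worst work)."""
    sem = asyncio.Semaphore(max_concurrent)
    results = await asyncio.gather(
        *(timed_publish(sem, fake_publish, f"user-{i}") for i in range(num_users))
    )
    waits = [wait for wait, _ in results]
    works = [work for _, work in results]
    return max(waits), max(works)


if __name__ == "__main__":
    worst_wait, worst_work = asyncio.run(measure(num_users=25, max_concurrent=5))
    print(f"worst semaphore wait: {worst_wait:.3f}s, worst publish: {worst_work:.3f}s")
```

If the worst wait grows with the number of users while the per-call time stays flat, the semaphore limit is the dominant cost; if the per-call time itself grows, the delay is inside the publish path.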
Please suggest any changes or improvements I can make here.
Python has a GIL, so you'd need to use multiprocessing to take advantage of multiple cores.
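As a rough illustration of that suggestion, the users could be split into shards, with each shard handled by its own worker process running its own asyncio event loop. The sketch below uses only the standard library; `shard_users`, `publish_for_user`, `run_shard`, and `main` are hypothetical names, and `publish_for_user` is a placeholder where the real code would connect to the room and call `publish_audio_file`. Whether the GIL is actually the limiting factor here has not been established in this thread, so this is something to measure rather than assume.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


def shard_users(usernames, num_workers):
    """Split usernames into round-robin shards, dropping empty ones."""
    shards = [usernames[i::num_workers] for i in range(num_workers)]
    return [shard for shard in shards if shard]


async def publish_for_user(username):
    # Hypothetical placeholder: the real version would connect to the room
    # and call publish_audio_file(...) for this user.
    await asyncio.sleep(0)
    return username


async def run_shard_async(usernames):
    # Inside one process, users still run concurrently on a single event loop.
    results = await asyncio.gather(*(publish_for_user(u) for u in usernames))
    return len(results)


def run_shard(usernames):
    # Entry point executed in each worker process, with its own event loop.
    return asyncio.run(run_shard_async(usernames))


def main(num_users=1000, num_workers=4):
    usernames = [f"user-{i}" for i in range(num_users)]
    shards = shard_users(usernames, num_workers)
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        # Each shard is pickled and sent to a separate process.
        return sum(pool.map(run_shard, shards))


if __name__ == "__main__":
    print(main())
```

The number of workers would normally be chosen close to the number of available cores, and the arguments passed to each worker need to be picklable, so room connections should be created inside the worker rather than passed in.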