WhisperLive
String/Bytes Type Mismatch in WebSocket Communication
There is a type mismatch between the server's expected data type and the client's sent data type when handling the end-of-audio signal.
Issue
The server's get_audio_from_websocket() method expects binary data and checks for b"END_OF_AUDIO":
def get_audio_from_websocket(self, websocket):
    frame_data = websocket.recv()
    if frame_data == b"END_OF_AUDIO":  # Expects bytes
        return False
    return np.frombuffer(frame_data, dtype=np.float32)
However, the ServeClientBase.disconnect() method sends a string message:
def disconnect(self):
    self.websocket.send(json.dumps({  # Sends string
        "uid": self.client_uid,
        "message": self.DISCONNECT
    }))
Because a str never compares equal to b"END_OF_AUDIO", the JSON disconnect message skips the end-of-audio check and reaches np.frombuffer(), which raises "a bytes-like object is required, not 'str'".
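The failure can be reproduced without a running server. The following is a minimal sketch, assuming only NumPy is installed; the uid "client-1" and the message value "DISCONNECT" are hypothetical placeholders for whatever disconnect() actually sends.

```python
import json

import numpy as np

# Hypothetical stand-in for the JSON text frame built by disconnect().
frame_data = json.dumps({"uid": "client-1", "message": "DISCONNECT"})

# A str is never equal to a bytes object, so the end-of-audio check is skipped.
is_end_of_audio = frame_data == b"END_OF_AUDIO"

# The str then reaches np.frombuffer(), which only accepts buffer objects.
try:
    np.frombuffer(frame_data, dtype=np.float32)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print("matched END_OF_AUDIO:", is_end_of_audio)
print("error:", error_message)
```

The printed error should resemble the one reported in the server logs, which suggests the string comparison is harmless and the buffer conversion is where the exception originates.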
Steps to Reproduce
- Start WhisperLive server
- Connect client
- Send audio data
- Try to disconnect client
- Observe error in server logs: "Unexpected error: a bytes-like object is required, not 'str'"
Expected Behavior
The server should either:
- Accept string messages for control signals like disconnect
- Or document that all WebSocket communication must be binary data
Suggested Fix
Either:
Modify disconnect() to send binary data:
def disconnect(self):
    self.websocket.send(b"END_OF_AUDIO")
Or modify get_audio_from_websocket() to handle both binary and string data:
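def get_audio_from_websocket(self, websocket):
    frame_data = websocket.recv()
    if isinstance(frame_data, str):
        # Text frames are control messages; never pass them to np.frombuffer().
        try:
            control_msg = json.loads(frame_data)
        except json.JSONDecodeError:
            control_msg = None
        if isinstance(control_msg, dict) and control_msg.get("message") == self.DISCONNECT:
            return False
        # Unrecognized text frame: skip it (assumes the caller tolerates an empty chunk).
        return np.array([], dtype=np.float32)
    if frame_data == b"END_OF_AUDIO":
        return False
    return np.frombuffer(frame_data, dtype=np.float32)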
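To check the second option without a live connection, the receive logic can be exercised against a fake socket. This is only a sketch: FakeWebSocket, the standalone get_audio() function, and the DISCONNECT value are hypothetical names introduced for illustration, and the choice to return an empty array for unrecognized text frames is an assumption rather than existing server behavior.

```python
import json

import numpy as np

DISCONNECT = "DISCONNECT"  # assumed value; the real constant lives on the server class


class FakeWebSocket:
    """Hypothetical test double that replays queued frames from recv()."""

    def __init__(self, frames):
        self._frames = list(frames)

    def recv(self):
        return self._frames.pop(0)


def get_audio(websocket):
    """Standalone sketch of receive logic that tolerates text and binary frames."""
    frame_data = websocket.recv()
    if isinstance(frame_data, str):
        # Text frames are treated as control messages and never decoded as audio.
        try:
            control_msg = json.loads(frame_data)
        except json.JSONDecodeError:
            control_msg = None
        if isinstance(control_msg, dict) and control_msg.get("message") == DISCONNECT:
            return False
        # Assumption: an unrecognized text frame yields an empty audio chunk.
        return np.array([], dtype=np.float32)
    if frame_data == b"END_OF_AUDIO":
        return False
    return np.frombuffer(frame_data, dtype=np.float32)


audio = np.array([0.1, -0.2, 0.3], dtype=np.float32)
ws = FakeWebSocket([
    audio.tobytes(),                                         # binary audio frame
    "not json",                                              # malformed text frame
    json.dumps({"uid": "client-1", "message": DISCONNECT}),  # JSON disconnect
    b"END_OF_AUDIO",                                         # binary end marker
])
results = [get_audio(ws) for _ in range(4)]
```

Here the audio frame round-trips unchanged, the malformed text frame produces an empty chunk instead of an exception, and both the JSON disconnect and the binary marker return False.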