Updated Speechmatics STT integration #4359
base: main
Conversation
longcw left a comment:
LGTM! Could you help resolve the conflicts with the main branch? A few nits:
```python
prefer_current_speaker=prefer_current_speaker,
focus_speakers=focus_speakers if is_given(focus_speakers) else [],
ignore_speakers=ignore_speakers if is_given(ignore_speakers) else [],
language=_set(language),
```
Suggested change:
```diff
- language=_set(language),
+ language=language,
```
```python
transcription_config.language = language

# Prepare the config
self._config = self._prepare_config(language)
```
stt.stream() will overwrite self._config; what happens if multiple streams are created for a single STT instance?
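One way to avoid the shared state, sketched below, is to build the config per call and hand it to the stream rather than storing it on the STT instance. The `SpeechStream` class, its `config` parameter, and the dict-based config here are illustrative placeholders, not the plugin's actual API.

```python
from __future__ import annotations


class SpeechStream:
    # Hypothetical stream type that receives its own config at construction.
    def __init__(self, stt: "STT", config: dict) -> None:
        self._stt = stt
        self._config = config


class STT:
    def _prepare_config(self, language: str | None) -> dict:
        # Placeholder for the plugin's real config-building logic.
        return {"language": language or "en"}

    def stream(self, *, language: str | None = None) -> SpeechStream:
        # Build the config per call and pass it to the stream instead of
        # assigning to self._config, so concurrent streams stay isolated.
        return SpeechStream(stt=self, config=self._prepare_config(language))
```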
```python
group: List of SpeechFragment objects.
logger.debug(f"{event} -> {message}")

async def _handle_partial_segment(self, message: dict[str, Any]) -> None:
```
It seems all these handlers, including _send_frames, are synchronous in practice; there is no need to declare them async.
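For illustration, a handler whose body never awaits can be a plain method; the snippet below is a minimal sketch with a placeholder body, not the plugin's actual implementation.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


class _Handlers:
    # Nothing in the body awaits, so `async def` would only force callers
    # to schedule or await the call for no benefit.
    def _handle_partial_segment(self, message: dict[str, Any]) -> None:
        logger.debug("partial segment -> %s", message)
```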
| """Endpoint and turn detection handling mode. | ||
| How the STT engine handles the endpointing of speech. If using Pipecat's built-in endpointing, | ||
| then use `TurnDetectionMode.EXTERNAL`. |
nit: Pipecat -> LiveKit
```python
language=_set(language),
output_locale=_set(output_locale),
domain=_set(domain),
turn_detection_mode=_set(turn_detection_mode),
```
nit: This and focus_mode do not allow None values.
```diff
  @property
  def model(self) -> str:
-     return "unknown"
+     return str(self._stt_options.turn_detection_mode) if self._stt_options else "UNKNOWN"
```
This is usually reserved for STT model names like Whisper, so we really don't have to change it here.
What's Changed?
Deprecation Warning
`end_of_utterance_mode` has been changed to `turn_detection_mode` to reduce ambiguity:
- `TurnDetectionMode.EXTERNAL` (default) for any VAD within LiveKit
- `TurnDetectionMode.ADAPTIVE` or `TurnDetectionMode.SMART_TURN` to use the plugin VAD / turn detection
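A rough usage sketch follows; the import path and constructor signature are assumptions based on the existing LiveKit Speechmatics plugin, and the `TurnDetectionMode` values are taken from the notes above.

```python
# Illustrative only: assumes the plugin exports TurnDetectionMode alongside STT.
from livekit.plugins import speechmatics
from livekit.plugins.speechmatics import TurnDetectionMode

# Default: endpointing is driven by LiveKit's own VAD / turn detection.
stt = speechmatics.STT(turn_detection_mode=TurnDetectionMode.EXTERNAL)

# Alternatively, let the Speechmatics plugin handle endpointing itself.
stt = speechmatics.STT(turn_detection_mode=TurnDetectionMode.ADAPTIVE)
```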