feat: text-to-speech for AI responses #1359

princeaden1 · 2025-09-23T20:14:49Z

feat: text-to-speech for AI responses

Add text-to-speech functionality to AI chat responses with e2e testing.

Changes

TTS Button: Play/stop button for AI messages with state management
Voice Support: Uses browser's Web Speech API with voice selection
UI States: Visual feedback with play/stop icons and tooltips
E2E Tests: Complete test coverage following existing patterns

audio.mov

Summary by cubic

Adds text-to-speech to assistant messages so users can listen to chat replies. Includes a play/stop control and end-to-end tests.

New Features
- Play/stop button on assistant messages with icons and tooltips
- Web Speech API integration with voice selection and support check
- Text cleaning strips markdown, HTML, and Dyad tags before reading
- E2E test mocks speechSynthesis and verifies state changes

graphite-app · 2025-09-23T20:16:50Z

src/hooks/useTextToSpeech.ts

+  const toggle = (
+    text?: string,
+    options?: { rate?: number; pitch?: number; volume?: number },
+  ) => {
+    speechSynthesis.cancel();
+    if (isPlaying) {
+      if (isPaused) {
+        resume();
+      } else {
+        pause();
+      }
+    } else {
+      if (text) {
+        speak(text, options);
+      }
+    }
+  };


There's a logical issue in the toggle function where speechSynthesis.cancel() is called unconditionally at the start, which stops any ongoing speech. This prevents the pause/resume functionality from working correctly, as there's no speech left to pause or resume after cancellation.

Consider restructuring the function to only cancel speech when necessary:

const toggle = (
  text?: string,
  options?: { rate?: number; pitch?: number; volume?: number },
) => {
  if (isPlaying) {
    if (isPaused) {
      resume();
    } else {
      pause();
    }
  } else {
    // Only cancel and start new speech when not currently playing
    speechSynthesis.cancel();
    if (text) {
      speak(text, options);
    }
  }
};

This way, pause/resume will work as expected while still ensuring clean state when starting new speech.

Suggested change

   const toggle = (
     text?: string,
     options?: { rate?: number; pitch?: number; volume?: number },
   ) => {
-    speechSynthesis.cancel();
     if (isPlaying) {
       if (isPaused) {
         resume();
       } else {
         pause();
       }
     } else {
+      speechSynthesis.cancel();
       if (text) {
         speak(text, options);
       }
     }
   };

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

graphite-app · 2025-09-23T20:16:52Z

e2e-tests/text_to_speech.spec.ts

+      speak: (utterance: MockSpeechSynthesisUtterance) => {
+        isPlaying = true;
+        setTimeout(() => utterance.onstart?.(), 10);
+      },


The mock implementation of speechSynthesis.speak() fires the onstart callback but never fires onend. This leaves the simulated TTS lifecycle incomplete: the test cannot verify that playback finishes and that the hook resets its state afterwards.

Consider adding another setTimeout to trigger utterance.onend() after a reasonable delay:

speak: (utterance: MockSpeechSynthesisUtterance) => {
  isPlaying = true;
  setTimeout(() => utterance.onstart?.(), 10);
  setTimeout(() => utterance.onend?.(), 100); // Add this to complete the lifecycle
},

This would allow the test to verify both the start and completion states of the TTS functionality.

Suggested change

   speak: (utterance: MockSpeechSynthesisUtterance) => {
     isPlaying = true;
     setTimeout(() => utterance.onstart?.(), 10);
+    setTimeout(() => utterance.onend?.(), 100);
   },

Spotted by Diamond


cubic-dev-ai

6 issues found across 3 files

Prompt for AI agents (all 6 issues)


Understand the root cause of the following 6 issues and fix them.


<file name="e2e-tests/text_to_speech.spec.ts">

<violation number="1" location="e2e-tests/text_to_speech.spec.ts:9">
Mock injection added after page load: addInitScript runs only on new document contexts, so the current page won't be mocked. Move addInitScript before navigation or reload after adding.</violation>
</file>

<file name="src/hooks/useTextToSpeech.ts">

<violation number="1" location="src/hooks/useTextToSpeech.ts:14">
Guard getVoices() to avoid ReferenceError when Web Speech API is unavailable.</violation>

<violation number="2" location="src/hooks/useTextToSpeech.ts:42">
Unconditional cancel in toggle() prevents proper pause/resume; remove the cancel here (speak() already cancels when starting).</violation>

<violation number="3" location="src/hooks/useTextToSpeech.ts:136">
Use ?? so a provided 0 pitch is not overridden.</violation>

<violation number="4" location="src/hooks/useTextToSpeech.ts:137">
Use ?? so volume 0 (mute) is honored.</violation>

<violation number="5" location="src/hooks/useTextToSpeech.ts:200">
Referencing window during render will crash in SSR. Guard with typeof window !== "undefined".</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

cubic-dev-ai · 2025-09-23T20:31:14Z

e2e-tests/text_to_speech.spec.ts

+  await po.importApp("minimal");
+
+  // Mock speechSynthesis API
+  await po.page.addInitScript(() => {


Mock injection added after page load: addInitScript runs only on new document contexts, so the current page won't be mocked. Move addInitScript before navigation or reload after adding.

Prompt for AI agents

Address the following comment on e2e-tests/text_to_speech.spec.ts at line 9: <comment>Mock injection added after page load: addInitScript runs only on new document contexts, so the current page won't be mocked. Move addInitScript before navigation or reload after adding.</comment> <file context> @@ -0,0 +1,55 @@ + await po.importApp("minimal"); + + // Mock speechSynthesis API + await po.page.addInitScript(() => { + let isPlaying = false; + </file context>

✅ Addressed in 298da44

cubic-dev-ai · 2025-09-23T20:31:14Z

src/hooks/useTextToSpeech.ts

+  useEffect(() => {
+    return () => {
+      if (utteranceRef.current) {
+        speechSynthesis.cancel();


Unconditional cancel in toggle() prevents proper pause/resume; remove the cancel here (speak() already cancels when starting).

Prompt for AI agents

Address the following comment on src/hooks/useTextToSpeech.ts at line 42: <comment>Unconditional cancel in toggle() prevents proper pause/resume; remove the cancel here (speak() already cancels when starting).</comment> <file context> @@ -0,0 +1,214 @@ + useEffect(() => { + return () => { + if (utteranceRef.current) { + speechSynthesis.cancel(); + } + }; </file context>

✅ Addressed in 298da44

cubic-dev-ai · 2025-09-23T20:31:14Z

src/hooks/useTextToSpeech.ts

+  };
+
+  // Check if TTS is supported
+  const isSupported = "speechSynthesis" in window;


Referencing window during render will crash in SSR. Guard with typeof window !== "undefined".
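A minimal sketch of the guard described above, written so it can also cover the getVoices() case. The names `SpeechScope`, `isSpeechSupported`, and `getVoicesSafe` are hypothetical and not part of this PR; the `scope` parameter exists only so the check can be exercised outside a browser.

```typescript
// Hypothetical, minimal view of the global object for this sketch.
interface SpeechScope {
  speechSynthesis?: { getVoices(): unknown[] };
}

// True only when a speechSynthesis object is present on the scope.
// globalThis exists in browsers, Node, and workers, so reading a
// property from it does not throw during server-side rendering.
function isSpeechSupported(
  scope: SpeechScope = globalThis as unknown as SpeechScope,
): boolean {
  return typeof scope.speechSynthesis !== "undefined";
}

// Returns the voice list, or an empty array when the API is missing.
function getVoicesSafe(
  scope: SpeechScope = globalThis as unknown as SpeechScope,
): unknown[] {
  return isSpeechSupported(scope) ? scope.speechSynthesis!.getVoices() : [];
}
```

In the hook this would probably replace the bare `"speechSynthesis" in window` check; whether the app is ever rendered on a server is not stated in this thread.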

Prompt for AI agents

Address the following comment on src/hooks/useTextToSpeech.ts at line 200: <comment>Referencing window during render will crash in SSR. Guard with typeof window !== "undefined".</comment> <file context> @@ -0,0 +1,214 @@ + }; + + // Check if TTS is supported + const isSupported = "speechSynthesis" in window; + return { + speak, </file context>

Suggested change

-  const isSupported = "speechSynthesis" in window;
+  const isSupported = typeof window !== "undefined" && "speechSynthesis" in window;

✅ Addressed in 298da44

cubic-dev-ai · 2025-09-23T20:31:15Z

src/hooks/useTextToSpeech.ts

+    }
+
+    utterance.rate = options?.rate || 1;
+    utterance.pitch = options?.pitch || 1;


Use ?? so a provided 0 pitch is not overridden.
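To illustrate the difference, here is a small sketch comparing the two operators. `SpeechOptions`, `resolveWithOr`, and `resolveWithNullish` are hypothetical names used only for this example; the option fields mirror the ones in the hook.

```typescript
// Same option shape as the hook's `options` parameter.
interface SpeechOptions {
  rate?: number;
  pitch?: number;
  volume?: number;
}

// `||` falls back on any falsy value, so an explicit 0 is replaced by 1.
function resolveWithOr(options?: SpeechOptions) {
  return {
    pitch: options?.pitch || 1,
    volume: options?.volume || 1,
  };
}

// `??` falls back only on null or undefined, so an explicit 0 survives.
function resolveWithNullish(options?: SpeechOptions) {
  return {
    pitch: options?.pitch ?? 1,
    volume: options?.volume ?? 1,
  };
}
```

With `{ volume: 0 }`, the first version yields a volume of 1 and the second keeps 0. As far as I know the Web Speech API allows 0 for pitch and volume but not for rate, which would explain why the suggestions target only those two fields.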

Prompt for AI agents

Address the following comment on src/hooks/useTextToSpeech.ts at line 136: <comment>Use ?? so a provided 0 pitch is not overridden.</comment> <file context> @@ -0,0 +1,214 @@ + } + + utterance.rate = options?.rate || 1; + utterance.pitch = options?.pitch || 1; + utterance.volume = options?.volume || 1; + </file context>

Suggested change

-  utterance.pitch = options?.pitch || 1;
+  utterance.pitch = options?.pitch ?? 1;

cubic-dev-ai · 2025-09-23T20:31:15Z

src/hooks/useTextToSpeech.ts

+  // Load available voices
+  useEffect(() => {
+    const loadVoices = () => {
+      const availableVoices = speechSynthesis.getVoices();


Guard getVoices() to avoid ReferenceError when Web Speech API is unavailable.

Prompt for AI agents

Address the following comment on src/hooks/useTextToSpeech.ts at line 14: <comment>Guard getVoices() to avoid ReferenceError when Web Speech API is unavailable.</comment> <file context> @@ -0,0 +1,214 @@ + // Load available voices + useEffect(() => { + const loadVoices = () => { + const availableVoices = speechSynthesis.getVoices(); + setVoices(availableVoices); + </file context>

Suggested change

-  const availableVoices = speechSynthesis.getVoices();
+  const availableVoices = typeof window !== "undefined" && "speechSynthesis" in window ? window.speechSynthesis.getVoices() : [];

✅ Addressed in 298da44

cubic-dev-ai · 2025-09-23T20:31:15Z

src/hooks/useTextToSpeech.ts

+
+    utterance.rate = options?.rate || 1;
+    utterance.pitch = options?.pitch || 1;
+    utterance.volume = options?.volume || 1;


Use ?? so volume 0 (mute) is honored.

Prompt for AI agents

Address the following comment on src/hooks/useTextToSpeech.ts at line 137: <comment>Use ?? so volume 0 (mute) is honored.</comment> <file context> @@ -0,0 +1,214 @@ + + utterance.rate = options?.rate || 1; + utterance.pitch = options?.pitch || 1; + utterance.volume = options?.volume || 1; + + // Event handlers </file context>

Suggested change

-  utterance.volume = options?.volume || 1;
+  utterance.volume = options?.volume ?? 1;

✅ Addressed in 298da44

graphite-app · 2025-09-23T22:31:01Z

src/components/chat/ChatMessage.tsx

+                                  <CircleStop className="h-4 w-4 text-green-500" />
+                                  <span className="sm:inline">Stop</span>


The "Stop" text appears to be inconsistently styled compared to the play state. When playing, the text shows Stop, but this will only display on small screens and up. For consistency and better UX, consider making the "Stop" text always visible when the button is in the stop state, since it provides important context to users. Either remove the sm:inline class or ensure the parent span doesn't have a hidden class that might be affecting visibility on mobile devices.

Suggested change

-  <CircleStop className="h-4 w-4 text-green-500" />
-  <span className="sm:inline">Stop</span>
+  <CircleStop className="h-4 w-4 text-green-500" />
+  <span>Stop</span>

Spotted by Diamond


graphite-app · 2025-09-23T23:01:01Z

src/hooks/useTextToSpeech.ts

+    utterance.onstart = () => {
+      setIsPlaying(true);
+      setIsPaused(false);
+    };


Race condition bug: The isPlaying state is set asynchronously in the onstart callback, but the ChatMessage component checks this state synchronously when deciding whether to start or stop playback. If a user clicks the TTS button rapidly before the onstart callback fires, the component will think speech isn't playing and call toggle() again instead of stop(), causing speech to restart rather than stop. This creates confusing UX where the button appears to be malfunctioning. Fix by either:

1) Setting isPlaying = true immediately in the speak() function before calling speechSynthesis.speak(), or
2) Using the speechSynthesis.speaking property to check current speech state more reliably, or
3) Adding a 'starting' state to prevent double-clicks during the async startup period.
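One way the 'starting' state from option 3 might look is sketched below. `PlaybackGuard` and its methods are hypothetical and not taken from the PR; the idea is only that the phase changes synchronously on click, so a second click before onstart fires is treated as a stop rather than a restart.

```typescript
type Phase = "idle" | "starting" | "playing";

// Tracks playback phase synchronously, independent of async speech events.
class PlaybackGuard {
  private phase: Phase = "idle";

  // Call from the click handler. Tells the caller whether to start or stop.
  click(): "start" | "stop" {
    if (this.phase === "idle") {
      this.phase = "starting"; // set before speechSynthesis.speak() would run
      return "start";
    }
    this.phase = "idle"; // covers both "starting" and "playing"
    return "stop";
  }

  // Call from utterance.onstart. Ignored if the user already stopped.
  onStart(): void {
    if (this.phase === "starting") this.phase = "playing";
  }

  // Call from utterance.onend and utterance.onerror.
  onEnd(): void {
    this.phase = "idle";
  }

  getPhase(): Phase {
    return this.phase;
  }
}
```

In a React hook the phase would likely live in a ref so that it updates immediately, with a mirrored state value for rendering the icon.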

Spotted by Diamond


wwwillchen · 2025-09-26T17:05:13Z

thanks @princeaden1. i pulled it down and it does work, but I'm not super sure about the use case - is it for accessibility or some other reason? in general - i'd recommend opening an issue for new feature requests to discuss the use case and see whether it's aligned with the project's direction. for example, right now it's reading the content in the think tags which can be quite verbose and somewhat confusing for users IMO.

princeaden1 · 2025-09-29T09:57:23Z

@wwwillchen Thanks for taking the time to review the PR 🙏. This PR is for both accessibility and to let devs listen to responses if they don't want to read through chats (similar to ChatGPT's feature). The think tags can also be removed from the spoken output. I'll also make sure to open an issue first so that future feature requests can be discussed beforehand.
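As a rough sketch of how the think content could be dropped before speaking: the helper below assumes the reasoning is wrapped in literal `<think>...</think>` tags, which this thread does not confirm. `stripThinkBlocks` is a hypothetical name, not a function in the PR.

```typescript
// Removes <think>...</think> blocks (assumed tag name) and tidies whitespace
// so the remaining text can be handed to the speech engine.
function stripThinkBlocks(text: string): string {
  return text
    // Non-greedy match so several separate blocks are removed one by one.
    .replace(/<think>[\s\S]*?<\/think>/gi, " ")
    // Collapse the whitespace left behind; line breaks do not matter for speech.
    .replace(/\s+/g, " ")
    .trim();
}
```

An unclosed `<think>` tag, for example while a response is still streaming, is left untouched by this version and would need separate handling.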

feat: text-to-speech for AI responses

fe0a21d

graphite-app bot reviewed Sep 23, 2025


cubic-dev-ai bot reviewed Sep 23, 2025


fix

298da44

graphite-app bot reviewed Sep 23, 2025


refactor: stop button

4105cb1

graphite-app bot reviewed Sep 23, 2025


fix warning

36f00d2

