Speech Input

A compact speech-to-text input component with a provider-agnostic transcription adapter. Ships with a Web Speech API adapter for demos; users supply their own for ElevenLabs Scribe, Deepgram, or other providers.

Installation

npx shadcn-svelte@latest add https://sv11.ui.twango.dev/r/speech-input.json

Usage

<script lang="ts">
	import { SpeechInput } from "$lib/registry/ui/speech-input";
</script>
 
<SpeechInput />

The snippet above shows only the import and the tag. <SpeechInput> also requires an adapter prop; see Providers for the interface and Adapters for provider recipes.

Examples

Basic Usage

Compose SpeechInput with the record button, preview, and cancel button. Pass any object that matches TranscriptionAdapter.

<script lang="ts">
	import * as SpeechInput from "$lib/registry/ui/speech-input";
	import type { TranscriptionAdapter } from "$lib/registry/ui/speech-input";
 
	// createMyAdapter is a placeholder for your own adapter factory.
	const adapter: TranscriptionAdapter = createMyAdapter(/* ... */);
</script>
 
<SpeechInput.Root
	{adapter}
	onChange={(data) => console.log(data.transcript)}
	onStop={(data) => console.log("Final:", data.transcript)}
>
	<SpeechInput.RecordButton />
	<SpeechInput.Preview placeholder="Start speaking..." />
	<SpeechInput.CancelButton />
</SpeechInput.Root>
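
To make the placeholder createMyAdapter above more concrete, here is a rough sketch of a scripted adapter that replays fixed phrases instead of calling a provider. The shape is an assumption: the names SketchAdapter, SketchCallbacks, onPartial, and onCommitted are hypothetical and chosen for illustration, and only start(), stop(), cancel(), and onConnect are mentioned elsewhere on this page. The real TranscriptionAdapter interface is documented under Providers and may differ.

```typescript
// Hypothetical callback set; the real TranscriptionAdapter may differ.
interface SketchCallbacks {
	onConnect: () => void;
	onPartial: (text: string) => void;
	onCommitted: (text: string) => void;
	onError: (error: Error) => void;
}

// Hypothetical adapter shape, named SketchAdapter so it is not mistaken
// for the library's own interface.
interface SketchAdapter {
	start(callbacks: SketchCallbacks): Promise<void>;
	stop(): void;
	cancel(): void;
}

// A scripted adapter that replays fixed phrases instead of calling a provider.
function createScriptedAdapter(phrases: string[]): SketchAdapter {
	let active = false;
	return {
		async start(callbacks) {
			active = true;
			// Report readiness first, then emit each phrase as partial and committed.
			callbacks.onConnect();
			for (const phrase of phrases) {
				if (!active) return;
				callbacks.onPartial(phrase);
				callbacks.onCommitted(phrase);
			}
		},
		stop() {
			active = false;
		},
		cancel() {
			active = false;
		},
	};
}
```

A scripted adapter like this can be handy for local demos and tests, since it needs no microphone permission or API key.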

With Form Input

Use onStop to append the committed transcript onto an external text field.

<script lang="ts">
	import * as SpeechInput from "$lib/registry/ui/speech-input";
	import type { TranscriptionAdapter } from "$lib/registry/ui/speech-input";
 
	// createMyAdapter is a placeholder for your own adapter factory.
	const adapter: TranscriptionAdapter = createMyAdapter(/* ... */);
	let value = $state("");
</script>
 
<div class="flex items-center gap-2">
	<input bind:value class="flex-1 rounded border px-3 py-2" />
	<SpeechInput.Root {adapter} onStop={(data) => (value = `${value} ${data.transcript}`.trim())}>
		<SpeechInput.RecordButton />
		<SpeechInput.Preview />
		<SpeechInput.CancelButton />
	</SpeechInput.Root>
</div>
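
The append expression in the onStop handler above can be pulled into a small helper, which makes its edge cases easier to see. This is a sketch, not part of the component: appendTranscript mirrors the inline expression exactly, and appendTranscriptNormalized is a hypothetical variant that also avoids a double space when the field already ends in whitespace.

```typescript
// Mirrors the inline handler: join with one space, then trim the ends.
function appendTranscript(current: string, transcript: string): string {
	return `${current} ${transcript}`.trim();
}

// Hypothetical variant: trim each side first and skip empty pieces,
// so "hello " + "world" does not produce two spaces in the middle.
function appendTranscriptNormalized(current: string, transcript: string): string {
	return [current.trim(), transcript.trim()].filter(Boolean).join(" ");
}
```

Either helper can be dropped into the handler, for example onStop={(data) => (value = appendTranscript(value, data.transcript))}.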

Reversed Layout

Child order is the layout order — put the cancel button first if you want it to lead.

<SpeechInput.Root {adapter}>
	<SpeechInput.CancelButton />
	<SpeechInput.Preview />
	<SpeechInput.RecordButton />
</SpeechInput.Root>

Minimal (Record Button Only)

Drop the preview and cancel slots for an icon-only recorder; the transcript is still delivered via onStop.

<SpeechInput.Root {adapter} onStop={(data) => console.log(data.transcript)}>
	<SpeechInput.RecordButton />
</SpeechInput.Root>

Custom Placeholder

SpeechInputPreview shows its placeholder text until the first partial transcript arrives.

<SpeechInput.Root {adapter}>
	<SpeechInput.RecordButton />
	<SpeechInput.Preview placeholder="Say something..." />
	<SpeechInput.CancelButton />
</SpeechInput.Root>

Using the Hook

useSpeechInput() reads the context set up by SpeechInput.Root, so child components can render their own UI against the shared state.

<script lang="ts">
	import { useSpeechInput } from "$lib/registry/ui/speech-input";
 
	const state = useSpeechInput();
</script>
 
<p>
	Status: {state.error
		? `Error: ${state.error}`
		: state.isConnecting
			? "Connecting"
			: state.isConnected
				? "Recording"
				: "Idle"}
</p>
<p>Transcript: {state.transcript}</p>
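
The nested ternary in the markup above can also be written as a plain function, which is easier to unit test. This is a sketch under assumptions: StatusFlags is a hypothetical type that lists only the fields the example reads, and the error field is assumed to be a string or null, which may not match the real state object.

```typescript
// Hypothetical subset of the hook's state; the error type is an assumption.
interface StatusFlags {
	error: string | null;
	isConnecting: boolean;
	isConnected: boolean;
}

// Same precedence as the ternary: error, then connecting, then connected, else idle.
function statusLabel(s: StatusFlags): string {
	if (s.error) return `Error: ${s.error}`;
	if (s.isConnecting) return "Connecting";
	if (s.isConnected) return "Recording";
	return "Idle";
}
```

Because the error check comes first, an error reported while connecting is shown instead of "Connecting".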

API Reference

Prop	Type	Default	Description
`adapter`	`TranscriptionAdapter`	—	STT backend bridge that owns the transcription session. Conforms to `TranscriptionAdapter`.
`size?`	`ButtonSize`	`"default"`	Shared size applied to `SpeechInputRecordButton` and `SpeechInputCancelButton` via context.
`onStart?`	`(data: SpeechInputData) => void`	—	Fired once the adapter reports the connection is ready for audio.
`onStop?`	`(data: SpeechInputData) => void`	—	Fired when the user stops recording. Receives a snapshot of the transcript — any in-flight partial is preserved.
`onCancel?`	`(data: SpeechInputData) => void`	—	Fired when the user cancels recording. Receives the snapshot taken before partial + committed state is cleared.
`onChange?`	`(data: SpeechInputData) => void`	—	Fired on every partial or committed transcript update.
`onError?`	`(error: Error) => void`	—	Fired when the adapter surfaces an error or `start()` rejects.
`children?`	`Snippet`	—	Compound children — typically `SpeechInputRecordButton`, `SpeechInputPreview`, and `SpeechInputCancelButton` in any order.
`ref?`	`HTMLDivElement | null`	`$bindable(null)`	Bindable ref to the root `<div>` element.
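
The difference between the onStop and onCancel rows can be illustrated with a small model. This is a sketch, not the component's code: TranscriptBuffer, commit, and setPartial are hypothetical names, and the only behavior taken from the table is that stop forwards a snapshot with the in-flight partial included, while cancel takes its snapshot before clearing partial and committed state.

```typescript
interface Snapshot {
	transcript: string;
}

// Hypothetical model of the committed + partial transcript state.
class TranscriptBuffer {
	private committed = "";
	private partial = "";

	// A committed segment is appended and replaces the current partial.
	commit(text: string): void {
		this.committed = `${this.committed} ${text}`.trim();
		this.partial = "";
	}

	setPartial(text: string): void {
		this.partial = text;
	}

	snapshot(): Snapshot {
		return { transcript: `${this.committed} ${this.partial}`.trim() };
	}

	// stop: forward the snapshot, in-flight partial included; state is kept.
	stop(onStop: (data: Snapshot) => void): void {
		onStop(this.snapshot());
	}

	// cancel: snapshot first, clear both fields, then notify.
	cancel(onCancel: (data: Snapshot) => void): void {
		const data = this.snapshot();
		this.committed = "";
		this.partial = "";
		onCancel(data);
	}
}
```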

Notes

The component is a compound primitive — SpeechInput.Root wires context and the adapter; SpeechInputRecordButton, SpeechInputPreview, and SpeechInputCancelButton read that context. Sub-components must be rendered inside a root or their useSpeechInput() call will throw.
The adapter reference is stored as a plain private field (not reactive). Swapping adapters mid-recording is unsupported: the active session keeps its current adapter, and the new one takes effect on the next start().
An internal request ID invalidates late callbacks, so a stop() or cancel() issued while connecting reliably wins even if the adapter's onConnect fires afterward.
stop() preserves the in-flight partial transcript and forwards it to onStop; cancel() clears partial + committed state before firing onCancel. Choose based on whether the user is committing or discarding the utterance.
Teardown uses onDestroy rather than $effect cleanup so dev-mode HMR and parent re-renders do not cancel an active recording.
The record button is disabled while state.isConnecting; the cancel button is inert unless state.isConnected.
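
The request-id note above describes a common guard pattern, which can be sketched as follows. This is an illustration under assumptions rather than the component's source: ConnectionGuard is a hypothetical class, and it models only the isConnecting and isConnected flags. Each start() captures a fresh id, and a callback that arrives after stop() has bumped the id is ignored.

```typescript
// Hypothetical model of request-id invalidation for late adapter callbacks.
class ConnectionGuard {
	private requestId = 0;
	isConnecting = false;
	isConnected = false;

	// Begin a connection attempt and return the callback the adapter
	// would invoke once it is ready for audio.
	start(): () => void {
		const id = ++this.requestId;
		this.isConnecting = true;
		return () => {
			// A stop() or newer start() changed the id, so this callback is stale.
			if (id !== this.requestId) return;
			this.isConnecting = false;
			this.isConnected = true;
		};
	}

	// Stopping (or cancelling) bumps the id so pending callbacks become no-ops.
	stop(): void {
		this.requestId++;
		this.isConnecting = false;
		this.isConnected = false;
	}
}
```

With this guard, calling stop() between start() and the adapter's ready callback leaves isConnected false, which matches the "stop wins" behavior the note describes.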
