Synchronized transcript with audio playback. Highlights each word as it's spoken, supports seeking via a scrub bar, and exposes a compound-component API driven by provider-agnostic character-alignment data.
Installation
npx shadcn-svelte@latest add https://sv11.ui.twango.dev/r/transcript-viewer.json
Usage
<script lang="ts">
import { TranscriptViewer } from "$lib/registry/ui/transcript-viewer";
</script>
<TranscriptViewer />

<script lang="ts">
import * as TranscriptViewer from "$lib/registry/ui/transcript-viewer";
import type { CharacterAlignment } from "$lib/registry/ui/transcript-viewer";
let {
audioSrc,
alignment,
}: {
audioSrc: string;
alignment: CharacterAlignment;
} = $props();
</script>
<TranscriptViewer.Root {audioSrc} {alignment}>
<TranscriptViewer.Audio />
<TranscriptViewer.Words />
<div class="flex items-center gap-3">
<TranscriptViewer.PlayPauseButton />
<TranscriptViewer.ScrubBar />
</div>
</TranscriptViewer.Root>
Examples
Custom Audio Type
Pass audioType when the source is not MP3 so the browser picks the right decoder.
<TranscriptViewer.Root {audioSrc} {alignment} audioType="audio/wav">
<TranscriptViewer.Audio />
<TranscriptViewer.Words />
<TranscriptViewer.ScrubBar />
</TranscriptViewer.Root>
Custom Word and Gap Rendering
TranscriptViewerWords accepts renderWord and renderGap snippets for per-segment overrides. Each receives the segment and its status — "spoken", "current", or "unspoken".
<script lang="ts">
import * as TranscriptViewer from "$lib/registry/ui/transcript-viewer";
import type { CharacterAlignment } from "$lib/registry/ui/transcript-viewer";
let {
audioSrc,
alignment,
}: {
audioSrc: string;
alignment: CharacterAlignment;
} = $props();
</script>
<TranscriptViewer.Root {audioSrc} {alignment}>
<TranscriptViewer.Audio />
<TranscriptViewer.Words>
{#snippet renderWord({ word, status })}
<span
class:font-semibold={status === "current"}
class:text-primary={status === "spoken"}
class:text-muted-foreground={status === "unspoken"}
>
{word.text}
</span>
{/snippet}
</TranscriptViewer.Words>
<TranscriptViewer.ScrubBar />
</TranscriptViewer.Root>
Playback Callbacks
The root forwards the underlying <audio> lifecycle via onPlay, onPause, onTimeUpdate, onEnded, and onDurationChange — useful for analytics or syncing external state.
<script lang="ts">
import * as TranscriptViewer from "$lib/registry/ui/transcript-viewer";
import type { CharacterAlignment } from "$lib/registry/ui/transcript-viewer";
let {
audioSrc,
alignment,
}: {
audioSrc: string;
alignment: CharacterAlignment;
} = $props();
let currentTime = $state(0);
</script>
<TranscriptViewer.Root
{audioSrc}
{alignment}
onPlay={() => console.log("Playing")}
onPause={() => console.log("Paused")}
onTimeUpdate={(t) => (currentTime = t)}
onEnded={() => console.log("Ended")}
>
<TranscriptViewer.Audio />
<TranscriptViewer.Words />
<TranscriptViewer.ScrubBar />
</TranscriptViewer.Root>
Custom Play/Pause Button
TranscriptViewerPlayPauseButton accepts a children snippet that receives { isPlaying }, so you can render your own label and icons while keeping the shared click behavior.
<script lang="ts">
import PauseIcon from "@lucide/svelte/icons/pause";
import PlayIcon from "@lucide/svelte/icons/play";
import * as TranscriptViewer from "$lib/registry/ui/transcript-viewer";
import type { CharacterAlignment } from "$lib/registry/ui/transcript-viewer";
let {
audioSrc,
alignment,
}: {
audioSrc: string;
alignment: CharacterAlignment;
} = $props();
</script>
<TranscriptViewer.Root {audioSrc} {alignment}>
<TranscriptViewer.Audio />
<TranscriptViewer.Words />
<TranscriptViewer.PlayPauseButton>
{#snippet children({ isPlaying })}
{#if isPlaying}
<PauseIcon class="size-4" /> Pause
{:else}
<PlayIcon class="size-4" /> Play
{/if}
{/snippet}
</TranscriptViewer.PlayPauseButton>
</TranscriptViewer.Root>
Accessing Viewer State
useTranscriptViewer() returns the shared reactive state inside any descendant — useful for custom transport UI that needs to observe currentWord, currentTime, isPlaying, or jump to a specific word via seekToWord.
<script lang="ts">
import { useTranscriptViewer } from "$lib/registry/ui/transcript-viewer";
const state = useTranscriptViewer();
</script>
<div>
Current word: {state.currentWord?.text ?? "—"}
<button onclick={() => state.seekToWord(0)}>Restart transcript</button>
</div>
API Reference
TranscriptViewer (root)
Each row below also accepts the standard HTMLAttributes for its host element (e.g. class, style, data-*, event handlers) unless otherwise noted.
TranscriptViewerAudio
Renders the underlying <audio> element driven by the root's state. Extends Omit<HTMLAudioAttributes, "src" | "children"> — src comes from the root's audioSrc.
TranscriptViewerWords
Renders the word/gap segments. Extends HTMLAttributes<HTMLDivElement>.
TranscriptViewerWord
The span rendered per word when no custom renderWord is supplied. Extends Omit<HTMLAttributes<HTMLSpanElement>, "children">.
TranscriptViewerPlayPauseButton
Play/pause toggle wired to the root's isPlaying state. Extends Omit<ButtonProps, "children">, so it inherits variant, size, etc. from the button primitive.
TranscriptViewerScrubBar
Time-aware scrub bar composed over the standalone ScrubBar primitive. Extends Omit<HTMLAttributes<HTMLDivElement>, "children">.
Notes
- alignment shape mirrors ElevenLabs' CharacterAlignmentResponseModel: three parallel arrays (characters, characterStartTimesSeconds, characterEndTimesSeconds) indexed per character. Reshape data from other providers (OpenAI, Deepgram, custom) to the same structure.
- The root composes character alignment into word and gap segments internally via composeSegments, or a custom segmentComposer when provided. Recomposition runs whenever alignment changes.
- When hideAudioTags is true (default), anything inside [...] brackets, e.g. ElevenLabs' [excited] style tags, is stripped from the rendered transcript.
- The audio element is owned by TranscriptViewerAudio. The root wires up play, pause, timeupdate, seeked, durationchange, and loadedmetadata listeners plus a requestAnimationFrame loop to drive currentTime and the active word index.
- The active-word walk is incremental: normal playback advances via a forward scan, and seeks fall back to a binary search over the word list.
- TranscriptViewerScrubBar composes the standalone ScrubBar primitive with time labels and suspends timeupdate-driven UI updates while the user is scrubbing so the thumb doesn't fight the pointer.
- Words take one of three statuses ("spoken", "current", "unspoken"), which you can target via the built-in Tailwind classes on TranscriptViewerWord or render yourself through renderWord / renderGap snippets.