stt

stt ¶

STT evaluation client built on httpx, httpx-ws, and httpx-sse.

AudioEncoding ¶

Bases: StrEnum

Wire encoding for WebSocket audio frames.

STTResult `dataclass` ¶

STTResult(
    hypothesis_text: str = "",
    text_metrics: TextMetrics | None = None,
    latency_ms: float = 0.0,
    chunks_received: int = 0,
    fragments: list[str] = list(),
)

STT evaluation result with optional metrics.

assert_quality ¶

assert_quality(
    *, max_wer: float = 0.2, max_cer: float = 0.15
) -> Self

Assert that the computed WER and CER stay within the max_wer and max_cer thresholds. Returns self, so calls can be chained.

compute_metrics ¶

compute_metrics(reference: str) -> Self

Compute the word error rate (WER) and character error rate (CER) of the hypothesis against reference. Returns self, so calls can be chained.
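The source does not say how the metrics are computed. As a minimal sketch, assuming the common definition (Levenshtein edit distance divided by reference length, over words for WER and over characters for CER), the chainable pattern could look like the following. `SketchResult`, `edit_distance`, and `error_rate` are hypothetical names for illustration, not part of the library, and the real `STTResult` stores its metrics in `text_metrics` rather than in separate fields.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance divided by reference length."""
    if not ref:
        # Assumption: an empty reference scores 0.0 only for an empty hypothesis.
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


@dataclass
class SketchResult:
    """Hypothetical stand-in for STTResult; not the library class."""

    hypothesis_text: str = ""
    wer: Optional[float] = None
    cer: Optional[float] = None

    def compute_metrics(self, reference: str) -> "SketchResult":
        self.wer = error_rate(reference.split(), self.hypothesis_text.split())
        self.cer = error_rate(list(reference), list(self.hypothesis_text))
        return self  # returning self is what makes the call chainable

    def assert_quality(
        self, *, max_wer: float = 0.2, max_cer: float = 0.15
    ) -> "SketchResult":
        assert self.wer is not None and self.cer is not None, "call compute_metrics first"
        assert self.wer <= max_wer, f"WER {self.wer:.3f} exceeds {max_wer}"
        assert self.cer <= max_cer, f"CER {self.cer:.3f} exceeds {max_cer}"
        return self
```

With this shape, a test can read as one expression: `SketchResult(hypothesis_text=text).compute_metrics(reference).assert_quality(max_wer=0.1)`.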

STTSession ¶

STTSession(
    *,
    session: AsyncWebSocketSession,
    sample: AudioSample | None,
)

Active WebSocket session for STT evaluation.

send_bytes `async` ¶

send_bytes(data: bytes) -> None

Send binary audio data.

send_text `async` ¶

send_text(data: str) -> None

Send text (JSON config, END_OF_AUDIO, etc.).

send_sample `async` ¶

send_sample(
    sample: AudioSample,
    *,
    chunk_ms: int = 200,
    encoding: AudioEncoding = AudioEncoding.FLOAT32,
) -> None

Stream sample in chunks with realistic pacing.

encoding controls the wire format:

- FLOAT32 → binary frame, raw float32 (default)
- PCM16 → binary frame, raw int16
- PCM16_BASE64 → text frame, base64-encoded int16
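To make the three wire formats concrete, here is a rough standard-library sketch of chunking and encoding. It is not the library's implementation: `iter_chunks` and `encode_chunk` are hypothetical helpers, the encoding is passed as a plain string, and the little-endian byte order, the 16 kHz sample rate in the usage note, and the scaling of floats by 32767 are all assumptions.

```python
import base64
import struct
from typing import Iterator, List, Union


def iter_chunks(
    samples: List[float], sample_rate: int, chunk_ms: int = 200
) -> Iterator[List[float]]:
    """Split samples into consecutive chunks of chunk_ms milliseconds."""
    step = sample_rate * chunk_ms // 1000  # samples per chunk
    if step <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


def encode_chunk(samples: List[float], encoding: str) -> Union[bytes, str]:
    """Encode float samples in [-1.0, 1.0] as one WebSocket frame payload."""
    if encoding == "FLOAT32":
        # Binary frame: raw float32 (little-endian is an assumption).
        return struct.pack(f"<{len(samples)}f", *samples)
    # Clip to [-1.0, 1.0], then scale to the signed 16-bit range (assumed scaling).
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    pcm = struct.pack(f"<{len(ints)}h", *ints)
    if encoding == "PCM16":
        return pcm  # binary frame: raw int16
    if encoding == "PCM16_BASE64":
        return base64.b64encode(pcm).decode("ascii")  # text frame
    raise ValueError(f"unknown encoding: {encoding}")
```

At 16 kHz, the default `chunk_ms=200` gives 3200 samples per chunk, which is 12800 bytes as float32 or 6400 bytes as int16. One plausible reading of "realistic pacing" is a sleep of `chunk_ms / 1000` seconds between frames, but the source does not confirm this.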

receive_text `async` ¶

receive_text(*, timeout: float | None = None) -> str

Receive text frame and accumulate as fragment.

receive_bytes `async` ¶

receive_bytes(*, timeout: float | None = None) -> bytes

Receive binary frame.

result ¶

result() -> STTResult

Build STTResult from accumulated fragments.

STTClient ¶

STTClient(*, url: str, timeout: float = 30.0)

STT evaluation client for HTTP batch requests and WebSocket streaming.

post `async` ¶

post(
    *, data: bytes | None = None, **kwargs: Any
) -> httpx.Response

Batch POST audio to STT endpoint (e.g. OpenAI Whisper API). Returns raw httpx.Response.

stream `async` ¶

stream(
    *, data: bytes | None = None, **kwargs: Any
) -> AsyncIterator[httpx.Response]

Chunked streaming POST. Yields httpx.Response for aiter_bytes/aiter_lines.

sse `async` ¶

sse(
    *, data: bytes | None = None, **kwargs: Any
) -> AsyncIterator[EventSource]

SSE streaming POST. Yields EventSource for aiter_sse().
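For readers unfamiliar with server-sent events, the sketch below parses a simplified subset of the SSE wire format with the standard library only. It illustrates the kind of events an `aiter_sse()` loop iterates over and is not how httpx-sse is implemented. `Event` and `parse_sse` are hypothetical names, and the `partial` event type in the usage note is made up.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Event:
    """Hypothetical, simplified stand-in for a server-sent event."""

    event: str = "message"
    data: str = ""


def parse_sse(lines: Iterable[str]) -> Iterator[Event]:
    """Parse `event:` and `data:` fields; a blank line dispatches the event."""
    event_type = "message"
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield Event(event=event_type, data="\n".join(data_lines))
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # Other fields (id, retry) are ignored in this sketch.
```

Feeding it `["event: partial", "data: hello", ""]` yields one `Event(event="partial", data="hello")`. Consecutive `data:` lines are joined with newlines into a single payload.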

ws `async` ¶

ws(
    *, sample: AudioSample | None = None, **kwargs: Any
) -> AsyncIterator[STTSession]

Open WebSocket session for STT streaming (e.g. WhisperLive).

aclose `async` ¶

aclose() -> None

No-op: clients are created per call, so there is nothing to close.
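The `AsyncIterator[...]` return annotations on `stream`, `sse`, and `ws` suggest these methods are meant to be used with `async with`, though the source does not say so explicitly. The sketch below shows, under that assumption, why a per-call design leaves `aclose` with nothing to do. `SketchClient` and `FakeConnection` are hypothetical stand-ins and the URL is a placeholder.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class FakeConnection:
    """Hypothetical stand-in for an HTTP or WebSocket connection."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class SketchClient:
    """Hypothetical client; not the library's STTClient."""

    def __init__(self, *, url: str, timeout: float = 30.0) -> None:
        self.url = url
        self.timeout = timeout

    @asynccontextmanager
    async def ws(self) -> AsyncIterator[FakeConnection]:
        conn = FakeConnection()  # created for this call only
        try:
            yield conn
        finally:
            await conn.aclose()  # released when the `async with` block exits

    async def aclose(self) -> None:
        """Nothing to release: no connection outlives its own call."""


async def main() -> FakeConnection:
    client = SketchClient(url="ws://localhost:9090")  # placeholder URL
    async with client.ws() as conn:
        assert not conn.closed  # open while inside the block
    await client.aclose()  # safe to call, does nothing
    return conn
```

Running `asyncio.run(main())` returns a connection that is already closed, because each call cleans up after itself.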
