Audio search, for your AI.
Trusted by teams building the next generation of audio-aware AI
The internet produces tens of millions of hours of audio every week — podcasts, radio broadcasts, congressional hearings, earnings calls, interviews, and court proceedings. Until now, that signal was invisible to AI agents.
Sonar indexes, transcribes, and semantically embeds public audio at scale. Your agent queries it the same way it would query a web search API — and gets back ranked, timestamped, speaker-attributed results in milliseconds.
Measured across transcript accuracy, semantic relevance, speaker diarization, and end-to-end retrieval latency against all available baselines.
View full benchmarks →
Four core capabilities, one API surface. No stitching together pipelines.
Query by meaning, not keywords. Sonar finds the clips where your concept is discussed, even when the exact words aren't used.
Results include identified speakers with confidence scores. Know who said it, not just what was said.
Every result links directly to the exact moment in the source audio. Agents can cite the precise clip, not just the episode.
Breaking broadcasts, live hearings, and real-time podcasts are indexed within minutes of airing. Agents stay current.
Real-world use cases, each powered by a handful of API calls. If your agent needs to hear the world, Sonar is the layer underneath.
Query across millions of indexed recordings in natural language. Sonar returns the most relevant clips with full transcripts, source metadata, and relevance scores — no keyword gymnastics required.
✓ Production-ready
"We are holding the policy rate steady while the committee evaluates whether monetary policy is restrictive enough to bring inflation down sustainably."
"The chair was clear that rate cuts depend on confidence in the inflation path, not just one softer month of economic data."
"Powell said the Fed can move carefully, but he pushed back on the idea that easier monetary policy is already guaranteed this year."
{
"query": "Powell monetary policy and rate cuts",
"utterance_text": "We are holding the policy rate steady while the committee evaluates whether monetary policy is restrictive enough.",
"audio_files": [
{ "id": "aud_9f42c1", "source": "federalreserve.gov", "duration": "58m" },
{ "id": "aud_61f3a8", "source": "brookings.edu", "duration": "74m" }
],
"speakers": ["Jerome Powell", "Lael Brainard", "Tom Keene"],
"clips": [
{ "file_id": "aud_9f42c1", "speaker": "Jerome Powell", "t": "14:22", "score": 0.97 }
]
}
Submit any public audio URL and get back a fully speaker-diarized, timestamped transcript in seconds. Ideal for agents that need to reason over specific recordings not yet in the Sonar index.
◎ In beta
{
  "type": "transcript.segment.created",
  "audio_url": "podcasts.apple.com/.../ai-a16z",
  "entry": {
    "speaker_name": "Martin",
    "timestamp": "00:04.120",
    "confidence": 0.97,
    "verbatim_utterance": "AI apps are becoming systems, not just prompts."
  },
  "format": "json",
  "word_timestamps": true
}
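Consuming a transcript event like this one is straightforward. The field names below follow the example payload; how events are delivered (webhook, polling) is an assumption left open here.

```python
# Sketch of handling a transcript.segment.created event. Field names follow
# the example payload above; delivery mechanism is not specified.
def timestamp_to_seconds(ts: str) -> float:
    """Convert an "MM:SS.mmm" (or "HH:MM:SS.mmm") timestamp to seconds."""
    seconds = 0.0
    for part in ts.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def format_segment(event: dict) -> str:
    """Render one segment as a speaker-attributed transcript line."""
    entry = event["entry"]
    t = timestamp_to_seconds(entry["timestamp"])
    return f'[{t:.3f}s] {entry["speaker_name"]}: {entry["verbatim_utterance"]}'
```

An agent could accumulate these lines into a running transcript, or reason over each segment as it arrives.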
Subscribe to live audio streams and receive transcript chunks as they're spoken. Agents can react to breaking news, live hearings, or earnings calls the moment words are said — not hours later.
⟡ Early access
{
"event": "senate hearing",
"time": "12:04:08.240",
"text": "Mr. President, I rise today because our nation stands at a critical crossroads. The choices we make in this chamber will echo for generations. Every day, working families are asking whether we can still solve problems together. We came here to deliver real, tangible progress."
}
{
"event": "policy_watch.triggered",
"matched_segment": "floor vote later today",
"speaker": "Sen. Cantwell",
"timestamp": "12:10",
"latency_ms": 840
}
Sonar runs a continuous pipeline that crawls, processes, and indexes public audio across the web — so by the time your agent makes a query, the work is already done.
Sonar continuously discovers public audio across podcast feeds, broadcast archives, government streams, and radio APIs.
Every recording is transcribed with word-level timestamps and speaker separation using a proprietary multi-model stack.
Transcript chunks are embedded into Sonar's audio-native semantic index, optimized for spoken-word retrieval patterns.
Queries return ranked clips with metadata in under 200ms. Your agent gets structured JSON — not raw audio it has to make sense of on its own.
Every result includes source URL, timestamp, speaker, and a clip playback link — so your agent's outputs are fully verifiable.
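Turning one of those results into a verifiable citation is a one-liner. The field names follow the metadata listed above; the clip-link value in the example is a hypothetical illustration, not Sonar's documented URL scheme.

```python
# Sketch of formatting a result as a human-checkable citation using the
# per-result metadata described above (speaker, timestamp, source, clip link).
def cite(result: dict) -> str:
    """Format a result as "Speaker at T, source: clip link"."""
    return (f'{result["speaker"]} at {result["t"]}, '
            f'{result["source"]}: {result["clip_url"]}')
```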
Spoken language is disfluent, non-linear, and speaker-dependent. We explain the architecture choices that make semantic audio retrieval work where BM25 breaks down.
The biggest release yet. Real-time broadcast monitoring, improved speaker diarization accuracy, and a new streaming endpoint that fires every three seconds.
We're releasing the evaluation suite we use internally. 4,200 queries across 18 audio domains. Every API provider can now be compared on the same standard.
Sonar is building new interfaces, infrastructure, and business models for AIs to work with the spoken web.