transcript
transcript · Not mentioned by Google
What is it?
transcript holds the full text transcription of an audio or video recording. It gives search engines access to spoken content that would otherwise be locked inside a media file. This property applies to both AudioObject and VideoObject.
Why this matters for AEO
AI answer engines work primarily from text, not from the audio itself. When a user asks "what did [guest] say about [topic] on [podcast]?", the AI needs text to work with. transcript provides that text directly in structured data, making spoken content searchable and quotable. Podcast episodes, webinars, and interviews with transcripts give those engines something to quote, so they are better positioned to surface in AI-generated answers than recordings without one.
What the specs say
Schema.org: Text. Applies to AudioObject and VideoObject. "If this MediaObject is an AudioObject or VideoObject, the transcript of that object." schema.org/transcript
Google: Not mentioned. No dedicated Google structured data documentation exists for AudioObject, and the transcript property is not referenced in any Google rich result type.
How to find your value
- Transcription services — Otter.ai, Rev, Descript, or Whisper output
- Podcast hosts — Buzzsprout, Transistor, and Captivate offer built-in transcription
- YouTube — Auto-generated or manually uploaded captions can be exported as text
- Manual transcription — Copy the spoken content and format as plain text
- AI tools — Feed the audio file to a speech-to-text model
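If your source is a caption export (for example, an SRT or WebVTT file downloaded from YouTube), the timing data has to be removed before the text can serve as a transcript value. The following is a minimal sketch, assuming standard SRT or WebVTT cue formatting. The function name captions_to_transcript is hypothetical, and the approach is a simple heuristic: a spoken line consisting only of digits would be dropped as if it were a cue number.

```python
import re

# Matches a cue timing line such as "00:00:01,000 --> 00:00:04,000" (SRT)
# or "00:01.000 --> 00:04.000" (WebVTT).
TIMING = re.compile(r"^\d{1,2}:\d{2}(:\d{2})?[.,]\d{3}\s+-->\s+")


def captions_to_transcript(caption_text: str) -> str:
    """Collapse SRT/WebVTT-style captions into one plain-text string."""
    spoken = []
    for line in caption_text.splitlines():
        line = line.strip()
        # Skip blank lines, numeric cue indexes, timing lines, and the WebVTT header.
        if not line or line.isdigit() or TIMING.match(line) or line == "WEBVTT":
            continue
        spoken.append(line)
    return " ".join(spoken)


sample = """1
00:00:00,000 --> 00:00:02,500
Welcome to the show.

2
00:00:02,500 --> 00:00:06,000
Today we are talking about
module bundling with webpack.
"""

print(captions_to_transcript(sample))
```

Auto-generated captions often lack punctuation and speaker labels, so the output usually benefits from a manual editing pass before it is published.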
Format and code
transcript accepts a plain text string containing the full transcription.
{
"@context": "https://schema.org",
"@type": "AudioObject",
"name": "Episode 41: Webpack Inside and Out",
"contentUrl": "https://example.com/episodes/ep41.mp3",
"transcript": "Welcome to the show. Today we are talking about module bundling with webpack. Our guest is Sean Larkin, one of the core maintainers..."
}
For long transcripts, the full text goes into the transcript value. There is no character limit in the schema.org spec, but keep in mind that very large JSON-LD blocks increase page weight.
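Real transcripts contain quotation marks and line breaks, which break hand-written JSON. One way to avoid that is to generate the JSON-LD with a JSON serializer instead of pasting text between quotes. The sketch below uses Python's standard json module. The function name build_audio_object is hypothetical, and the "</" replacement is an extra precaution for markup embedded in an HTML script tag, not something the schema.org spec requires.

```python
import json


def build_audio_object(name: str, content_url: str, transcript: str) -> str:
    """Return an AudioObject JSON-LD string with the transcript safely escaped."""
    data = {
        "@context": "https://schema.org",
        "@type": "AudioObject",
        "name": name,
        "contentUrl": content_url,
        "transcript": transcript,
    }
    # json.dumps escapes double quotes, backslashes, and line breaks.
    serialized = json.dumps(data, indent=2, ensure_ascii=False)
    # A literal "</script>" inside the transcript would close the surrounding
    # script tag early. "<\/" is a valid JSON escape that decodes back to "</".
    return serialized.replace("</", "<\\/")


print(build_audio_object(
    "Episode 41: Webpack Inside and Out",
    "https://example.com/episodes/ep41.mp3",
    'Welcome to the show.\nOur guest said: "bundling is hard."',
))
```

The returned string can be placed inside a script tag of type application/ld+json as-is.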
Valid values:
- "Welcome to episode one. Today we discuss..." (full transcript text)
- A complete multi-paragraph transcription as a single string
Invalid values:
- "https://example.com/transcript.txt" (URL; use a Text value, not a link)
- "[transcript available on request]" (placeholder, not actual content)
Webflow implementation
Static pages
Add the JSON-LD in Page Settings > Custom Code > Head Code:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "AudioObject",
"name": "Episode Title",
"contentUrl": "https://yourdomain.com/audio/episode.mp3",
"transcript": "Full transcript text goes here..."
}
</script>
CMS template pages
If your Webflow CMS collection has a plain text field for transcripts, bind it in the template's custom code:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "AudioObject",
"name": "{{wf {"path":"name","type":"PlainText"} }}",
"contentUrl": "{{wf {"path":"audio-file-url","type":"PlainText"} }}",
"transcript": "{{wf {"path":"transcript-text","type":"PlainText"} }}"
}
</script>
Note: Webflow rich text fields may include HTML tags. Use a plain text field for transcript to avoid injecting HTML into JSON-LD.
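If the transcript already lives in a rich text field, the HTML can be stripped before the text is copied into a plain text field. The following is a rough sketch using Python's standard html.parser module. It runs outside Webflow (for example, as a one-off script over exported content); the names rich_text_to_plain and BLOCK_TAGS are hypothetical, and the block-tag list is an assumption about which elements should be treated as paragraph breaks.

```python
from html.parser import HTMLParser

# Tags treated as paragraph-like breaks; an assumed, non-exhaustive list.
BLOCK_TAGS = {"p", "br", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and inserts a space at block-level tag boundaries."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def rich_text_to_plain(html: str) -> str:
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


print(rich_text_to_plain(
    "<p>Host: Welcome &amp; hello.</p><p>Guest: <strong>Thanks</strong>, glad to be here.</p>"
))
```

Inline tags such as strong and em are dropped without adding a space, so punctuation that follows them stays attached to the preceding word.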
In Schema HQ
Schema HQ offers an optional mapping from a CMS plain text field to transcript. The mapping handles JSON escaping for special characters, line breaks, and quotation marks in the spoken content.
Real examples
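No major brands were found using transcript in live JSON-LD during research. Adoption of this property in production markup appears to be very low, despite its value for accessibility and AI discoverability. The markup below is an illustrative example (note the example.com URLs), not markup taken from a live site.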
{
"@context": "https://schema.org",
"@type": "PodcastEpisode",
"name": "The Future of Search with AI",
"url": "https://example.com/podcast/future-of-search",
"associatedMedia": {
"@type": "AudioObject",
"contentUrl": "https://example.com/podcast/future-of-search.mp3",
"duration": "PT45M",
"transcript": "Host: Welcome to the show. Today we are joined by Dr. Sarah Chen to discuss how AI is changing search. Sarah, what has shifted in the past year? Guest: The biggest change is that search engines now generate answers instead of just listing links. This means structured data is more important than ever because AI models need clean, labeled information to produce accurate responses..."
}
}
FAQ
Should the transcript go in JSON-LD or on the page as visible text?
Both, ideally. Visible text on the page helps traditional SEO and accessibility. Putting it in transcript within JSON-LD makes it machine-readable for AI engines. If you must choose one, visible page text has broader impact.
Is there a character limit for the transcript field?
Schema.org does not define a character limit. However, embedding a full hour-long transcript in JSON-LD can add significant page weight. For very long recordings, consider placing the full transcript as visible page text and using transcript for a summary or key excerpt.
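To judge whether a transcript is too heavy to embed in full, it can help to measure its serialized size and, if needed, cut an excerpt at a sentence boundary. This is a minimal sketch under stated assumptions: the function names are hypothetical, the 5000-character default is an arbitrary illustration and not a limit from schema.org or any search engine, and sentence ends are detected naively by looking for ". ", "? ", or "! ".

```python
import json


def jsonld_size_bytes(transcript: str) -> int:
    """UTF-8 byte size of the transcript once serialized as a JSON string."""
    return len(json.dumps(transcript, ensure_ascii=False).encode("utf-8"))


def excerpt(transcript: str, max_chars: int = 5000) -> str:
    """Return the transcript cut at the last sentence end within max_chars."""
    if len(transcript) <= max_chars:
        return transcript
    window = transcript[:max_chars]
    # Position of the last sentence-ending punctuation followed by a space.
    cut = max(window.rfind(". "), window.rfind("? "), window.rfind("! "))
    # Fall back to a hard cut if the window contains no sentence boundary.
    return window[: cut + 1] if cut != -1 else window


text = "First sentence. Second sentence. Third sentence."
print(jsonld_size_bytes(text))
print(excerpt(text, max_chars=20))
```

If an excerpt is used for transcript, the full text should still appear as visible page content so that nothing spoken is lost to readers or crawlers.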
Does transcript replace captions or subtitles?
No. transcript is a full-text rendering of all spoken content. Captions (caption) are time-coded text segments synchronized with playback. They serve different purposes: transcripts are for reading and search, captions are for real-time viewing.