We wrote 110,000 words of schema documentation in one session. Then we threw all of it away.
Not because it was bad writing. The structure was solid, the templates were consistent, the JSON-LD examples looked right. We threw it away because 56% of the Google requirement claims were unverifiable. The AEO sections were generic filler. And the QA process we built to catch errors was itself making errors, because it never fetched a single source to verify its flags.
This is the story of how we built a 223-article schema markup glossary for Schema HQ, the production pipeline we designed after the first approach failed, and what we learned about coordinating AI agents for content at scale.
The final output: 58 schema type overviews, 223 field reference articles, every fact verified against schema.org and Google's documentation, with Webflow implementation guides and AEO context that doesn't exist anywhere else on the web.
Why This Content Needed to Exist
Go to schema.org/offers. You get a one-line definition ("An offer to provide this item"), a list of expected types, and a table of properties. No examples. No common mistakes. No explanation of why price must be a string in Google's implementation but schema.org accepts a Number. No mention of how AI answer engines use this field to answer "how much does this cost?" queries.
Go to Google's Product structured data docs. You get a list of required and recommended fields with brief descriptions. No Webflow implementation guide. No before/after code showing what breaks when you forget @type: "Offer" inside your offers object.
Go to any SEO blog post about Product schema. You get a 500-word overview that tells you schema markup is important and links to Google's docs.
Nobody has built the reference that a developer or marketer actually needs: what does this field do, what does Google require (verified, with the exact quote), what does AI care about, how do you implement it in Webflow specifically, and what are the real mistakes people make.
We gathered pieces of information from schema.org specs, Google documentation, live brand implementations, and industry guides. Verified each piece against the actual source. Linked every claim to its origin. And assembled them into articles that answer every implementation question for each field.
That was the goal. Getting there took three attempts.
Attempt 1: Brute Force (110,000 Words, One Session)
The setup: Claude Code running Opus 4.6 with 1M token context window on maximum effort. One agent session. A 12-section article template. A priority list of 57 schema types.
We started at Priority 1 (Product, LocalBusiness, Article, Event, JobPosting, Recipe, VideoObject, Review, FAQPage, BreadcrumbList) and wrote everything.
Product (10 fields): 11 files, 11,041 words. About 20 minutes.
LocalBusiness (11 fields): 11 files, 9,153 words.
On through all 11 Priority 1 types, then 10 Priority 2 types, then 35 Priority 3 and 4 types.
By the end: 204 files. 110,903 words. All 58 schema types covered. We pushed articles to Webflow CMS via the Webflow MCP, set up the CMS collections, and published 85 Organization articles live.
It felt like a win. Then we ran QA.
Attempt 2: QA That Was Wrong About Being Right
We wrote three QA agent prompts. One checked factual accuracy. One checked content quality and voice. One checked links.
The accuracy agent came back with 44 critical errors. Fields labeled Required that should be Recommended. Fields labeled Recommended that should be Optional. Invalid schema.org values. Missing fields in type overviews.
We started fixing. A Python script applied badge corrections across 152 files, updated tier classifications, removed invalid values like https://schema.org/SoldOut (not a real value) and @type: "State" (not a schema.org type).
Then we stopped. Because the QA agent had a fundamental problem.
It never fetched a single source page. It read our articles, compared claims against its training data, and flagged discrepancies. Some flags were obviously correct (a field labeled both "Recommended" and "Optional" in the same article). But the Google requirement tier claims? The agent was asserting that Google "requires" certain fields based on what it remembered from training, not what Google's documentation actually says today.
We were fixing errors using another layer of unverified claims. Every "correction" was potentially introducing a new error.
This was the turning point.
Opus 4.6 1M as Project Coordinator, Opus and Sonnet as Workers
Before explaining the pipeline, a note on how the AI coordination worked, because this produced one of our most interesting findings.
The coordination model:
Opus 4.6 with 1M context window on maximum effort ran as the project coordinator. It held the full picture: the database schema, all collection IDs, the pipeline status, the Webflow CMS structure, and the conversation history. It designed the pipeline, wrote the agent prompts, generated the SQL schemas, and tracked progress across all phases.
The actual work (research, writing, validation) was done by separate agent sessions. We tested three configurations:
- Opus 4.6 1M, max effort: Research agents, some writer agents
- Opus 4.6 (200K standard), max effort: Product writer agent
- Sonnet 4.6, high effort: Code validation, link audits, mechanical fixes, some writer agents
Then we compared the writing output with hard metrics.
Opus 4.6 standard (200K) vs Opus 4.6 1M for article writing:
| Metric | Opus 4.6 (200K) | Opus 4.6 1M |
|---|---|---|
| AEO section words (avg) | 62 | 57 |
| AEO user scenarios per article | 1.3 | 1.0 |
| Internal links per article | 6.5 | 5.7 |
| Schema.org references per article | 9.1 | 4.1 |
| Google doc references per article | 2.7 | 1.7 |
| Code blocks per article | 5.2 | 6.5 |
| Webflow section words (avg) | 110 | 122 |
Standard Opus outperformed the 1M version on every quality metric except code blocks and Webflow section length. Twice as many schema.org citations. 60% more Google references. Richer AEO sections with more specific user scenarios.
The likely explanation: each field article is a small, self-contained task. The 1M context window is optimized for holding massive context simultaneously. When the task only needs the research record for one field and the article template, the 200K model focuses its full reasoning budget on output quality instead of context management.
Sonnet 4.6 vs Opus 4.6:
Sonnet produced comparable structure but weaker AEO sections (51 words avg vs Opus's 62), fewer internal links (6.5 vs Opus's 7), and fewer source citations. For reference documentation following a fixed template with pre-researched facts, the gap between Sonnet and Opus was smaller than expected. The template and verified research data did the heavy lifting, not the model's reasoning depth.
The actual prose difference looks like this. Both models wrote AEO sections for equivalent fields. Here is Opus 4.6 writing the brand field's AEO section:
> When a shopper asks an AI assistant "What brand makes the best trail running shoes under $150?", the engine uses `brand` data from Product markup to group products by manufacturer and compare across brands. Without structured brand data, the AI cannot reliably attribute products to specific brands and may omit your products from brand-based comparisons.
>
> Brand data also helps AI engines disambiguate products. When a user asks about "Air Max 90," the `brand` field confirms this is a Nike product, not a generic item with a similar name.
Two specific user scenarios. Two concrete examples (trail running shoes, Air Max 90). 88 words.
Here is Sonnet 4.6 writing an equivalent field (hiringOrganization) at the same complexity level:
> When a candidate asks "what jobs are available at Stripe," the AI engine matches against `hiringOrganization.name` and `hiringOrganization.sameAs` to identify all Stripe job listings. A well-formed `hiringOrganization` with a `sameAs` URL pointing to the employer's official website enables accurate entity disambiguation. Without it, the AI cannot reliably connect the posting to the correct company.
One scenario. Technically accurate but single-angle. 54 words.
Both are correct. Both follow the template. Opus gives the reader two ways to understand why the field matters. Sonnet gives one. Across 223 articles, that compounds.
What we settled on: Opus 4.6 (standard or 1M) for all writing. Sonnet for mechanical tasks (code validation, link scanning, find-and-replace operations). The cost difference between Opus and Sonnet mattered less than the AEO depth and internal linking, because those are the signals that compound across 223 articles for SEO and AI citation.
The entire project ran on a Claude Max 20x subscription. No API billing. During the brute force writing session, we burned through the daily token allocation and had to wait 2 hours for it to reset before continuing. That pause turned out to be useful: it forced us to step back and question the approach instead of pushing more volume.
From 301 Markdown Files to One Database
The 110,000 words lived as 301 markdown files across 58 directories. One directory per schema type. One .md file per field. An _index.md overview in each directory.
schema-glossary/
├── product/
│ ├── _index.md
│ ├── name.md
│ ├── offers.md
│ ├── brand.md
│ └── ...
├── organization/
│ ├── _index.md
│ ├── 01-essential/
│ │ ├── name.md
│ │ ├── url.md
│ │ └── ...
│ ├── 02-recommended/
│ └── ...
└── ... (56 more directories)
This structure made writing easy: create a file, write the article, save. It made everything else hard.
"How many articles have an AEO section?" Grep across 301 files, count matches, hope the section heading is consistent. "Which fields belong to the Product type?" List the files in the product directory. But name also belongs to Organization, LocalBusiness, Event, and 40 other types. The directory structure couldn't represent that. "What's the average word count across all Required fields?" Not possible without a script that reads every file, parses the tier from the header, counts words, and aggregates.
We had built a content library that couldn't answer basic questions about itself.
So we migrated everything to SQLite. One database file, five tables:
schema_type (id, name, slug, body_md, body_html, meta_title, meta_description, webflow_id)
schema_field (id, name, slug, body_md, body_html, tier, required, meta_title, meta_description,
word_count, has_aeo_section, has_webflow_section, has_faq_section,
has_common_mistakes, has_json_examples, has_related_fields, webflow_id)
schema_type_field (type_id, field_id) -- many-to-many join
research (field_id, schema_org_url, schema_org_type, google_status, google_source_quote,
live_example_1_brand, live_example_1_json, ...)
validation (field_id, check_type, status, details)
The import script parsed every markdown file: extracted the tier from the header badge, checked for section headings (AEO, FAQ, Webflow, etc.), counted words, and stored everything as structured data. The type-field relationships came from which directory the file lived in, with shared fields (like name appearing in 45+ types) getting multiple entries in the join table.
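A minimal sketch of that parsing step, assuming a `**Tier:**` badge format and heading names matching our template (both assumptions; the flag columns mirror the `schema_field` table above):

```python
import re

# Heading patterns are assumptions; the dict keys mirror schema_field columns.
SECTION_FLAGS = {
    "has_aeo_section": r"^##+\s+How AI Engines Use",
    "has_faq_section": r"^##+\s+FAQ",
    "has_webflow_section": r"^##+\s+Webflow Implementation",
}

def parse_article(md: str) -> dict:
    """Turn one markdown article into a row of structured data."""
    row = {}
    # Tier badge assumed to look like: **Tier:** Recommended
    m = re.search(r"\*\*Tier:\*\*\s*(\w+)", md)
    row["tier"] = m.group(1) if m else None
    row["word_count"] = len(md.split())
    for col, pattern in SECTION_FLAGS.items():
        row[col] = 1 if re.search(pattern, md, re.MULTILINE) else 0
    row["has_json_examples"] = 1 if "```json" in md else 0
    return row
```

Each returned dict becomes one `INSERT` into `schema_field`; the directory the file lived in supplies the rows for the join table.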
We deleted all 301 files. The database was the single source of truth from that point forward.
What the database revealed immediately:
SELECT COUNT(*) FROM schema_field WHERE has_aeo_section = 0;
-- 132 missing AEO sections (59%)
SELECT COUNT(*) FROM schema_field WHERE has_webflow_section = 0;
-- 186 missing Webflow implementation (83%)
SELECT COUNT(*) FROM schema_field WHERE word_count > 1500;
-- 0 articles over 1,500 words
SELECT tier, COUNT(*) FROM schema_field GROUP BY tier;
-- Required: 20, Recommended: 66, Optional: 125, Advanced: 12
83% of articles lacked Webflow implementation sections. 59% had no AEO section. Zero articles exceeded 1,500 words. The brute force approach had produced volume. Not quality.
The database also enabled everything that came next. The research table gave each field a verified fact record separate from its article. The validation table collected QA results across multiple agent runs. The join table let us query "which types does author belong to" (Article, BlogPosting, NewsArticle, Recipe, Podcast, and 14 others) instead of checking 19 directories for a file called author.md.
When it came time to push content to Webflow, the database was the export source. SELECT slug, body_html, meta_title, meta_description FROM schema_field gave us everything for a CSV export or API push. When Webflow's slug issue forced us to change URL patterns, UPDATE schema_field SET body_md = REPLACE(body_md, '/schema-type/', '/schema-glossary-types/') migrated every article in one query instead of find-and-replace across 301 files.
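The CSV export fits in a dozen lines of Python. Table and column names are the ones above; the file paths are illustrative:

```python
import csv
import sqlite3

def export_terms_csv(db_path: str, out_path: str) -> int:
    # Dump the Webflow-bound columns straight from SQLite into a CSV
    # suitable for a CMS import. Returns the number of rows written.
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT slug, body_html, meta_title, meta_description FROM schema_field"
    ).fetchall()
    conn.close()
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["slug", "body_html", "meta_title", "meta_description"])
        writer.writerows(rows)
    return len(rows)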
The migration took about an hour. It saved days of downstream work.
Attempt 3: The Pipeline (Research First, Write Second)
With the database in place, we designed a five-phase production pipeline.
Phase 1: Research (17 Agents, 223 Fields)
Before rewriting anything, we verified every fact.
17 research agents, grouped by schema type. Organization (84 fields) got its own agent. Smaller types were bundled. Each agent had a specific job:
- Fetch `https://schema.org/[propertyName]` for every field. Record the actual property type, which types it belongs to, and schema.org's description.
- Fetch the Google structured data docs page for the type. Search for the field name. Record the exact requirement status (Required, Recommended, or Not mentioned) and the exact quote from Google's page.
- Find 2 live brand implementations using WebSearch. Record the brand name, URL, and actual JSON-LD.
- Note additional sources, deprecation warnings, and conflicts between schema.org and Google.
Each agent had duplicate protection (skip fields already researched), a pre-flight DB check, and a self-reporting final step showing progress.
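The duplicate-protection and pre-flight check reduce to one join against the tables above. A sketch; the exact query shape is an assumption:

```python
import sqlite3

def unresearched_fields(conn: sqlite3.Connection, type_slug: str) -> list[str]:
    # Pre-flight check: which fields for this type still lack a research
    # record? Agents skip anything already present in `research`.
    return [row[0] for row in conn.execute(
        """
        SELECT f.name
        FROM schema_field f
        JOIN schema_type_field tf ON tf.field_id = f.id
        JOIN schema_type t ON t.id = tf.type_id
        LEFT JOIN research r ON r.field_id = f.id
        WHERE t.slug = ? AND r.field_id IS NULL
        """,
        (type_slug,),
    )]
```

An agent that crashed mid-batch could be relaunched with the same prompt and pick up exactly where the table said it left off.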
We ran 5 agents at a time, all on Opus 4.6 1M at max effort (research requires web fetching, and the larger context window helps when processing fetched pages).
Result after 4 batches: 223/223 fields researched. 2 fetch failures. Zero empty data fields.
What Research Actually Looked Like
Each agent's report showed how far off the original articles were. Here are specific reversals that would have gone live as misinformation:
LocalBusiness:
- image: our article said "Google requires it." Research found Google doesn't mention image for LocalBusiness at all. Not Required, not Recommended. Not mentioned.
- openingHours: we said "Recommended." Google's docs don't reference the simple text openingHours property. They use openingHoursSpecification (the structured object) instead. Different property entirely.
- description, sameAs: both listed as "Recommended." Both are "Not mentioned" by Google for LocalBusiness.
Event:
- eventAttendanceMode: article said "Recommended." Google doesn't mention it. The field exists in schema.org but Google has never documented a requirement level for it.
Person:
- All 6 person-specific fields (givenName, familyName, jobTitle, worksFor, alumniOf, birthDate): we had assigned Google requirement levels to every one. Google's ProfilePage documentation mentions none of them. Every tier assignment was fabricated from training knowledge.
Recipe:
- prepTime, cookTime, totalTime: all listed as "Recommended." Google does not formally recommend any of these individually. They appear in Google's example code but are not in the recommended properties list.
JobPosting was the most interesting case. The original QA bot had flagged validThrough as "should be Required." The research found Google actually lists it as "Recommended." The QA bot's correction would have been wrong in the opposite direction.
Google status distribution:

- Required by Google: 17 fields (8%)
- Recommended by Google: 67 fields (30%)
- Not mentioned by Google: 124 fields (56%)
- No Google docs page for this type: 15 fields (7%)
56% of fields are not mentioned anywhere in Google's structured data documentation. Our original articles had been assigning Google requirement levels to fields Google has never acknowledged. Those articles would have gone live as authoritative reference content with fabricated Google endorsements.
Three deprecated features caught:

- HowTo rich results (deprecated September 2023). Our articles recommended implementing them.
- Sitelinks Search Box (deprecated November 2024). Our WebSite article referenced it as active.
- Google Podcasts (shut down 2024). Our Podcast articles referenced Google Podcasts indexing.
Without the research phase, we would have published a glossary recommending three features that no longer exist.
Phase 2: Writing From Verified Research (17 Agents)
Each writer agent read the research table and rewrote from scratch. The template:
- Appears in (which schema types, pulled from database joins)
- What it does (plain English for marketers AND developers)
- How AI engines use it (specific scenario: "When a user asks X, the AI pulls this field to do Y")
- What the specs say (schema.org definition with link; Google status with exact quote from docs)
- How to find your value (practical: "check your product packaging," "use your CMS date field")
- Format and code (verified JSON-LD, valid/invalid examples)
- Webflow implementation (static page embed, CMS template with field references, Schema HQ automation)
- Real examples (from research: actual brand JSON-LD, not "Example Company")
- Related fields (3-5 internal links)
- FAQ (2-3 questions developers and marketers actually search for)
The critical rule: no claim without a source from the research table. If the research didn't cover something, the writer flagged it instead of making it up.
17 writer agents in batches. Result: 223/223 articles rewritten. All quality flags at 100%.
Phases 3-5: Validation
Phase 3 (Code Validation): Parsed 935 JSON-LD blocks. Found 8 instances of @type: "State" (not a schema.org type, should be AdministrativeArea). Fixed.
Phase 4 (Fact Check): A separate Opus agent (not the writer) cross-referenced every article claim against the research table. Found 2 issues across 223 articles. Both fixed.
2 fact-check errors across 223 articles. Compare that to the 44 "critical errors" flagged by the first QA attempt that never checked a source.
Phase 5 (Link Audit): Found 26 broken internal links and 21 orphan articles. Fixed all 47. Final state: 1,228 internal links, zero broken, zero orphans.
Webflow CMS: Where Everything Went Sideways
Content done. Fact-checked. Interlinked. Pushing to Webflow CMS was supposed to take an hour.
It took multiple sessions across two days.
Background Agents Can't Get Permissions
We launched 4 background agents to create CMS items in parallel via the Webflow MCP. All 4 failed. Background agents in Claude Code cannot receive interactive permission prompts. The Webflow tool requires user approval, the approval dialog can't reach a background process, and the agent stops.
Fix: Everything through the main thread or CSV imports.
Webflow Soft-Deletes Slugs Permanently
We imported 23 schema types via CSV. Needed to reimport after updating content. Deleted the items in the Designer. Reimported.
Webflow created 23 NEW items with suffixed slugs: article-ae28c, product-62276, event-3de8d. The original slugs were permanently reserved by the "deleted" items. 100 items in a collection that should have 58.
The Webflow API also rejects creating or updating items with reserved slugs. No workaround via API or CSV.
Fix: Created entirely new CMS collections. Changed all URL patterns from /schema-type/ to /schema-glossary-types/ and /schema-term/ to /schema-glossary-terms/. Updated every internal link in all 281 articles. Twice (once in markdown, once in HTML).
MultiReference Fields Point to Collections, Not Items
We created a MultiReference field linking Schema Terms to Schema Types. Then we needed new Schema Types collection (because of the slug problem). The MultiReference still pointed to the old collection. The Webflow API cannot change a reference field's target collection.
Fix: Delete the field, create a new one. Relink everything.
Agents Are for Reasoning, Scripts Are for Execution
After fixing the collection structure, we needed to push updated content to all 58 type items and 223 term items in Webflow. We wrote an agent prompt: "Update each item one at a time via the Webflow MCP. No subagents. No optimization. Grind through all 58."
The agent pushed 1 item (about-page). Then it paused. Instead of processing item 2, it started deliberating. "Doing 57 more individual tool calls with full HTML bodies is impractical. Let me check if I can use the Webflow API directly via HTTP for better efficiency."
It spent 15 minutes trying to find the Webflow API token, exploring alternative approaches, and asking itself if there was a better way. It never pushed item 2.
We killed the agent and wrote a 50-line Python script instead. The script read from the database, looped through all 58 items, made HTTP PATCH requests to the Webflow API with a 0.5-second delay between calls, tracked progress to a file for crash recovery, and did a spot check every 10 items.
58 types pushed in 30 seconds. 223 terms pushed in 2 minutes. Zero failures. Every spot check passed.
The lesson: AI agents are good at reasoning (designing the pipeline, writing prompts, analyzing research data, making editorial decisions). They are bad at repetitive execution with large payloads. When the task is "do the same thing 223 times with different data," write a script.
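The replacement script looked roughly like this. A sketch only: the v2 endpoint path and `fieldData` payload shape are assumptions based on Webflow's Data API, and the real script's progress file and spot checks are omitted here.

```python
import json
import time
import urllib.request

API = "https://api.webflow.com/v2"  # Webflow Data API base URL (v2 assumed)

def push_items(items, collection_id, token, delay=0.5, dry_run=False):
    # Loop-and-PATCH version of the push script. `items` is a list of
    # (item_id, field_data) pairs read from the database. With
    # dry_run=True it returns the prepared requests instead of sending.
    prepared, pushed = [], 0
    for item_id, field_data in items:
        url = f"{API}/collections/{collection_id}/items/{item_id}"
        body = json.dumps({"fieldData": field_data}).encode()
        req = urllib.request.Request(
            url,
            data=body,
            method="PATCH",
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
        )
        if dry_run:
            prepared.append(req)
            continue
        urllib.request.urlopen(req)  # raises on a non-2xx response
        pushed += 1
        time.sleep(delay)  # stay under the rate limit
    return prepared if dry_run else pushed
```

No deliberation, no "is there a better way," no pausing after item 1. The loop does the same thing 223 times, which is exactly what loops are for.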
QA False Positives: The Wrong Field Name
After pushing all content, we ran a verification agent (QA1) to check that Webflow had correct data. It reported 100 terms with empty body content. The entire h-through-s alphabetical range appeared to have no articles.
We wrote a fix agent. The fix agent listed all 223 Webflow items and checked their body content. Every single one had content. Zero empty.
The QA1 agent had checked a field called body-2. The Terms collection uses a field called body. The field body-2 doesn't exist on that collection. The QA agent reported 100 items as empty because it was reading from a nonexistent field.
We deployed a fix agent for a problem that didn't exist, because the diagnostic agent checked the wrong field name. The fix agent discovered the actual state and corrected course. But we burned a full agent session on a false alarm.
The meta-lesson: when an AI agent reports a problem, verify the problem exists before deploying another AI agent to fix it. A one-line SQL query (SELECT COUNT(*) FROM schema_field WHERE body IS NOT NULL) would have caught this in 2 seconds.
If We Did This Again
We would create the CMS collections once, with final names and slugs, before writing a single article. The collection structure should be locked before content production starts. We changed it four times. Every change cascaded through every article.
Phase 6: Six Editorial Fix Passes After Publish
Content was live. The fact checks had passed. The links resolved. Then we started actually reading the articles.
Six categories of issues had slipped through Phases 1-5. None were fact errors. The claims were all verified. These were editorial and rendering bugs visible to anyone who opened a page.
1. Markdown tables stripped by Webflow RichText (218 articles). Webflow's RichText field silently drops <table> tags. Every | Field | Description | table in our type overviews and term "How to find your value" sections rendered as whitespace on the live page. Fix: convert every markdown table to a bullet list with - [**field**](/url) — description format before generating HTML. 45 types and 177 terms touched.
2. Meta-content leaks (76 articles). Sentences like "No live examples were found in the research for this field. Here is an illustrative example based on common implementations." These leaked the production process to readers. Fix: a regex pass covering nine disclaimer variants plus inline (illustrative) parentheticals. Remove the disclaimer, keep the example beneath it.
3. Em dashes in prose (18 articles). The voice rules forbid em dashes in prose. They had crept in from writer agents. Bullet-list separators (`- **label** — description`) are fine because they're structural, not prose. The regex replaced `\w — \w` patterns with a comma and `\w — [A-Z]` patterns with a period, with bullet-line and code-block protection. Prose em dashes: 0. Bullet-list separators preserved: 1,604.
4. Heading-word doubling (223 articles). Every term article had a ### With Schema HQ subheading followed by a paragraph starting "Schema HQ does X...". Rendered, this read as "HQ / Schema HQ" back-to-back. Fix: heading swap to ### In Schema HQ across all 223 terms. Deterministic, one SQL UPDATE.
5. Formulaic paragraph openers (222 articles). After the heading swap, 222 paragraphs still opened with "Schema HQ ...". That's the batch-writing fingerprint. A template rewriter with 13 verb patterns (includes, maps, does not, supports, reads, sets, generates, outputs, lets you, builds, publishes/emits, requires, can map) plus hash-based variant rotation distributed three different phrasings across articles. After four corrective passes for edge cases (possessive Schema HQ's, broken-grammar infinitives, redundant field-name repetition), the result: 0 openers starting with "Schema HQ", 161 unique first-two-word patterns across 222 articles, max repetition 20×.
6. Placeholder brand names in examples (15 articles). Hypothetical "Acme Corp" references in prose and JSON-LD format examples. Fix: replace with real attributed brands verified via WebFetch — Rotten Tomatoes movie pages for actor/director/contentRating/dateCreated, Google Search Central for review-snippet schema, schema.org canonical examples for Course/Rating/ScholarlyArticle, MIT OCW for course codes. For unverifiable sections, drop them entirely (Path B) rather than ship fabricated attributions.
Each pass followed the same pattern: SQL query to identify affected articles, targeted regex or AI rewrite, regenerate HTML from updated markdown, push to Webflow via the Python HTTP script, track touched slugs in /tmp/*_progress.txt for resume support. 0.5s delay between API calls.
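The transform step of that pattern, using pass 3's em-dash rules as the example, might look like this sketch (regexes approximate; the SQL select, HTML regeneration, and Webflow push steps are left out):

```python
import re

def fix_em_dashes(md: str) -> str:
    # Pass 3's transform, approximately: prose em dashes become commas,
    # or periods before a capital. Bullet separators and fenced code
    # blocks are protected and left untouched.
    out, in_code = [], False
    for line in md.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code
        if in_code or line.lstrip().startswith("- "):
            out.append(line)  # protected: code block or bullet separator
            continue
        line = re.sub(r"(\w) — ([A-Z])", r"\1. \2", line)
        line = re.sub(r"(\w) — (\w)", r"\1, \2", line)
        out.append(line)
    return "\n".join(out)
```

Deterministic transforms like this one ran as plain scripts; only the passes that needed judgment (the opener rewrites, the brand replacements) went to an agent.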
Cumulative Phase 6 result: 796 Webflow writes across six passes. Zero failures.
And then we found the real problem.
The Whitespace Bug
After all six fixes pushed, a spot-check on the Webflow CMS editor for the website type showed blank sections under the Required, Recommended, and Optional headings. The fixes we had just made had not rendered.
First diagnosis: "This must be the publish gap. Webflow Data API updates drafts; the site needs to be re-published for changes to go live."
That was wrong.
The user pushed back: "The screenshot is from the CMS editor, not the live site. Don't fuck with me." He was right. The CMS editor shows staged content immediately. If the editor was showing empty, something else was happening.
We ran a canary. The user manually typed a single bullet item into the website record's body field in the Webflow editor, formatting the word as both bold and a link. Saved. We queried the Webflow API and inspected the stored HTML.
The difference was in one character.
Our API-generated HTML:
<ul>
<li><a href="/schema-glossary-terms/name"><strong>name</strong></a> — The site name...</li>
</ul>
Webflow's stored HTML from the user's manual edit:
<ul><li><a href="/schema-glossary-terms/name"><strong>name</strong></a> — The site name...</li></ul>
No newlines between tags. Webflow's RichText parser silently drops list children when whitespace separates <ul>, <li>, and </ul>. The Data API GET echoes back the stored HTML (misleading you into thinking the content is there), but the editor's internal block representation has no bullet nodes. So the editor shows empty sections, and so does the rendered live page.
The Python markdown library outputs <ul>\n<li>...</li>\n</ul> by default. Every article we had pushed since the initial upload had this structure. Webflow accepted the payload, returned success, and the Data API GET showed the HTML with bullets present. Everything looked right through the API. The editor and the rendered page both showed empty sections.
The fix: a compact() helper that strips whitespace between tags (re.sub(r'>\s+<', '><', ...)) while preserving content inside <pre> code blocks so JSON examples keep their formatting.
import re

def compact(html):
    # Collapse whitespace between tags, but leave <pre> blocks intact so
    # JSON examples keep their formatting.
    out, i = [], 0
    while i < len(html):
        pre = re.search(r'<pre[^>]*>[\s\S]*?</pre>', html[i:])
        if pre:
            out.append(re.sub(r'>\s+<', '><', html[i:i + pre.start()]))
            out.append(pre.group(0))
            i += pre.end()
        else:
            out.append(re.sub(r'>\s+<', '><', html[i:]))
            break
    return ''.join(out)
Applied to all 281 articles, regenerated HTML in the database, re-pushed to Webflow. After the compact push, a second live-page check confirmed bullets rendering correctly across the corpus.
This is the kind of bug that can't be caught by unit tests or API response validation. The API reports success. The stored data looks correct. The symptom only appears in the visual rendering layer. The root cause is whitespace handling inside a parser we don't control.
We verified by fetching 10 sample live pages from karpi-2-0.webflow.io and inspecting rendered HTML directly: bullet counts, heading counts, internal link counts, JSON-LD presence. All 181 internal links extracted from those pages resolved to 200. Only then was it safe to publish to production.
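That spot check reduces to a small audit over fetched HTML. A sketch; the fetching and the 200-status link checks are omitted, and the counted signals mirror the ones just listed:

```python
import re

def audit_page(html: str) -> dict:
    # The live-page spot check, reduced to countable signals: rendered
    # bullets, headings, internal links, and JSON-LD presence.
    return {
        "bullets": len(re.findall(r"<li\b", html)),
        "headings": len(re.findall(r"<h[1-6]\b", html)),
        "internal_links": re.findall(r'href="(/schema-glossary-[^"]+)"', html),
        "has_json_ld": '<script type="application/ld+json">' in html,
    }
```

A page that passed the API check but failed the whitespace parser shows up here immediately: `bullets` comes back 0 even though the stored HTML contains `<li>` tags.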
What Makes This Content Different
Schema.org gives you a property definition. Google gives you a requirements checklist. Blog posts give you a high-level overview. None of them tell you the full picture.
Take offers in Product schema. Schema.org says it accepts an Offer or Demand object. Google says it's Required for Merchant Listings and must include price and priceCurrency. Neither tells you:
- That `price` should be a string in Google's implementation despite schema.org accepting Number
- That `availability` values must use full `https://schema.org/InStock` URLs, not short `InStock` strings
- That `priceValidUntil` in the past silently disables your rich snippet
- How to implement conditional availability in Webflow CMS (which doesn't support conditional JSON-LD natively)
- That AI answer engines pull price data from `offers` to answer "how much does X cost" queries, and without structured offers data, your product is invisible to transactional AI responses
Our offers article covers all of this. 1,141 words. Schema.org definition with link. Google requirement with exact quote. Webflow implementation for static pages, CMS templates, and Schema HQ. Real brand JSON-LD examples. Valid/invalid code pairs for the 6 most common mistakes. FAQ covering Offer vs AggregateOffer, sale prices, and free products.
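Those traps can even be encoded as a quick pre-publish lint. A sketch under the assumptions in the bullets above; it is not a substitute for Google's Rich Results Test:

```python
from datetime import date

FULL_AVAILABILITY = "https://schema.org/"

def lint_offer(offer: dict, today: date) -> list[str]:
    # Checks mirroring the offers gotchas above: Offer type present,
    # price as a string, full-URL availability, priceValidUntil not past.
    problems = []
    if offer.get("@type") != "Offer":
        problems.append('missing @type: "Offer"')
    if not isinstance(offer.get("price"), str):
        problems.append("price should be a string for Google")
    availability = offer.get("availability", "")
    if availability and not availability.startswith(FULL_AVAILABILITY):
        problems.append("availability must be a full schema.org URL")
    valid_until = offer.get("priceValidUntil")
    if valid_until and date.fromisoformat(valid_until) < today:
        problems.append("priceValidUntil is in the past (rich snippet disabled)")
    return problems
```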
We built this for the developer implementing Product schema on a Webflow site at 11pm who needs to know exactly what format offers requires, and for the marketer who needs to understand why this field matters for Google Shopping visibility and AI search citations.
That person has nowhere to go today. Schema.org is too dry. Google's docs are too sparse. Blog posts are too shallow. This glossary fills the gap by gathering verified information from every source, linking it, and presenting it in one place.
Production Numbers
| Metric | Count |
|---|---|
| Schema types covered | 58 |
| Field articles | 223 |
| Research records (verified against sources) | 223 |
| Internal links | 1,228 |
| JSON-LD code blocks | 935 |
| Google statuses verified with exact quotes | 223 |
| Live brand examples collected | 156 |
| Bullet-list separators preserved through fix passes | 1,604 |
| Webflow writes across Phase 6 fix passes | 796 |
| Push failures across all phases | 0 |
| Live-page internal links verified 200 | 181/181 |
| Broken links (final) | 0 |
| Orphan articles (final) | 0 |
| Fact check errors (final) | 0 |
Models and configuration:

- Opus 4.6 1M, max effort: Project coordination, research agents, writer agents, fact check
- Opus 4.6 (200K), max effort: Product writer (produced highest source citation density)
- Sonnet 4.6, high effort: Code validation, link audits, URL pattern migrations, mechanical fixes
Time: ~3 days of intermittent work. The token-limited pause after the brute force session was the most productive non-working time of the project, because it forced the pivot from "write more" to "verify first."
Platform: Claude Max 20x subscription (no API billing). Hit the token ceiling during the initial writing session and waited 2 hours for reset.
What We Would Do Differently
Research before writing, always. 110,000 words felt productive. But 56% of the Google requirement claims were wrong or unverifiable. Writing first and fact-checking later meant we wrote everything twice. Researching first and writing once would have cut the project time in half.
Lock the CMS structure before producing content. Every collection rename, slug change, and URL pattern update cascaded through all 223 articles. Four structural changes meant four rounds of find-and-replace across the entire database.
Use a database from day one. "How many articles are missing FAQ sections?" required grep across 301 files. In SQLite, it's `SELECT COUNT(*) FROM schema_field WHERE has_faq_section = 0`. The database enabled the pipeline, the quality metrics, and the export workflow.
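As a minimal sketch of that check against the table described above (the table and column names come from the example in the text; the rows here are invented):

```python
import sqlite3

# In-memory stand-in for the project database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE schema_field (slug TEXT, has_faq_section INTEGER)")
con.executemany(
    "INSERT INTO schema_field VALUES (?, ?)",
    [("offers", 1), ("price", 0), ("priceValidUntil", 0)],
)

# The one-line answer that grep across 301 files can't give you.
missing = con.execute(
    "SELECT COUNT(*) FROM schema_field WHERE has_faq_section = 0"
).fetchone()[0]
print(missing)  # → 2
```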
Standard Opus for writing, 1M for coordination. The 1M context window shines when holding the full project state (collection IDs, pipeline status, conversation history). For writing individual articles, standard Opus produces better output because it focuses reasoning on the task instead of context management.
Never fight Webflow's CMS from the API. Webflow CMS is built for humans in the Designer. Slug reservations, soft-deletes, and immutable references are all designed around manual editing. Build the structure in the Designer first, verify it works, then use the API only for content updates.
Use scripts for bulk execution, agents for reasoning. An agent that pushes 223 items to an API is a script with extra overhead. Write the Python script. Use agents for the parts that require judgment: designing the pipeline, analyzing research data, making editorial decisions about content structure. The moment the task becomes "repeat this 223 times," stop using an agent.
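A bulk push like that reduces to a short script. This sketch assumes Webflow's v2 Data API item-update endpoint and invents the CMS field names (`name`, `post-body`); verify both against the current API docs before using anything like it.

```python
import json
from urllib import request

# Assumed Webflow v2 item-update endpoint; structure already exists
# in the Designer, so the API only touches content.
API = "https://api.webflow.com/v2/collections/{cid}/items/{iid}"

def build_payload(article: dict) -> bytes:
    # Field slugs "name" and "post-body" are invented for this sketch.
    return json.dumps({"fieldData": {
        "name": article["title"],
        "post-body": article["html"],
    }}).encode()

def push_all(items, cid, token):
    # The repetitive part: no agent, no reasoning, just a loop.
    for it in items:
        req = request.Request(
            API.format(cid=cid, iid=it["id"]),
            data=build_payload(it),
            method="PATCH",
            headers={"Authorization": f"Bearer {token}",
                     "Content-Type": "application/json"},
        )
        request.urlopen(req)  # real use: retries, backoff, rate limiting
```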
Verify the diagnostic before deploying the fix. When a QA agent flags 100 items as broken, run one SQL query to confirm before launching a fix agent. We burned a full session on a false positive because the QA agent checked the wrong field name. A 2-second database query would have caught it.
Verify visually, not via API GET. An API GET of a Webflow RichText field returns the HTML as stored, not as the editor renders it. When Webflow's parser silently drops content it can't translate to its internal block representation, the GET will still echo your original payload back. We spent an iteration chasing the wrong root cause because the API response looked correct. Visual verification in the editor, or a live-page fetch followed by grep for the expected element, is the only way to confirm content actually rendered.
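The live-page half of that check reduces to a fetch plus a pattern match. A minimal sketch (the fetch itself is omitted; pass in the downloaded HTML):

```python
import re

def rendered_ok(page_html: str, expected_pattern: str) -> bool:
    """Grep fetched live-page HTML for the element a push should
    have produced, instead of trusting the API GET echo."""
    return re.search(expected_pattern, page_html, flags=re.DOTALL) is not None

# A list that Webflow's parser silently dropped would fail this check
# even though the API GET still echoed the original payload back.
print(rendered_ok("<div><ul><li>Offer vs AggregateOffer</li></ul></div>",
                  r"<ul>\s*<li>"))  # → True
```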
Sync the full corpus, not just fix-touched items. We ran six targeted fix passes, each pushing only the articles it modified. Articles that didn't need any Phase 6 fix but had stale Webflow state from an earlier upload silently drifted out of sync. After the sixth pass, a full 281-item re-sync was required to catch every stale item. Next time: sync everything after every pass, even when the cost is extra API calls.
Compact HTML before pushing to Webflow RichText. The Python markdown library outputs whitespace between block tags by default. Webflow's RichText parser treats whitespace between `<ul>`, `<li>`, and `</ul>` as content boundaries and drops the list children. Every push must go through a compact helper that strips whitespace between tags while preserving `<pre>` blocks. This is not documented anywhere in Webflow's API docs. We found it by manually editing an item in the Webflow editor and comparing the stored HTML to what we had been pushing.
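A minimal version of that compact helper might look like this (a reconstruction of the idea, not the project's actual code):

```python
import re

def compact_html(html: str) -> str:
    """Strip inter-tag whitespace that Webflow's RichText parser treats
    as content boundaries, while leaving <pre> blocks untouched."""
    # Split out <pre>...</pre> so their internal whitespace survives.
    parts = re.split(r"(<pre\b.*?</pre>)", html, flags=re.DOTALL)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:           # odd indices are the captured <pre> blocks
            out.append(part)
        else:                    # collapse whitespace between adjacent tags
            out.append(re.sub(r">\s+<", "><", part))
    return "".join(out)
```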
FAQ
Can this approach work for other content types?
The research-first pipeline works for any domain where claims can be verified against authoritative sources. Schema markup was ideal because schema.org and Google's docs are the definitive references. For opinion content or thought leadership, Phase 1 would be competitive analysis instead of spec verification. The principle holds: verify before you write.
Why not use a single agent for everything?
Context limits and specialization. `Organization` has 84 fields. Researching all 84 (fetching 84+ web pages, processing content, writing records) exceeds what one session handles reliably. Splitting into parallel agents with duplicate protection let each agent focus on a manageable scope. The coordinator (Opus 1M) held the big picture while workers executed specific tasks.
How do you prevent AI hallucination across 223 articles?
Five layers. Phase 1 creates verified facts from fetched sources. Phase 2 writers can only use research table data. Phase 3 validates all code. Phase 4 cross-references every claim against research. Phase 5 checks structural integrity. The hallucination surface is minimized because writers have pre-verified facts instead of generating facts from training knowledge. Result: 2 errors across 223 fact-checked articles.
What about the 56% of fields "Not mentioned" by Google?
We document them honestly. "Not mentioned by Google" with the schema.org definition and practical implementation guidance. No fabricated Google endorsement. This accuracy is an advantage: when AI engines evaluate our glossary as a source, the honest attribution of what Google does and doesn't say builds credibility over sites that inflate every field's importance.
How do you catch bugs that only show up in the rendered output?
You can't, from an AI agent's perspective alone. The Webflow whitespace bug was invisible through every channel available to the coordinator: the markdown in the DB was correct, the generated HTML was correct, the API GET echoed back correct HTML, every automated check reported success. The bug only surfaced when a human looked at the CMS editor and said "the sections are empty." Visual verification has to be part of the pipeline. When the platform has a rendering or parsing layer you don't control, the only way to catch divergence between what you send and what renders is to fetch the rendered output and verify directly.
If your Webflow site needs schema markup that works for both Google and AI engines, Schema HQ handles it automatically. Or if you want to build it yourself, the glossary covers every field you'll need.