Why HTML-First Indexing
Because the plugin runs on rendered HTML, it works with any content architecture without special configuration.
Structured Content Sites
For sites using structured frontmatter with component sections, each page's components render into the final HTML — and that HTML is what gets indexed:
```yaml
---
title: "My Page"
sections:
  - sectionType: hero
    text:
      title: "Welcome"
      leadIn: "Get started"
      prose: "This content is automatically indexed"
  - sectionType: rich-text
    text:
      prose: "More searchable content here"
---
```
There's no need to declare component types, field names, or section schemas. Anything that ends up as visible text on the rendered page (after elements matching `excludeSelectors` are removed) is part of the index.
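To make the idea concrete, here is a minimal, dependency-free sketch of reducing rendered HTML to its visible text. The helper name `extractVisibleText` and the sample markup are hypothetical, and the regex-based approach is a simplification for illustration; it is not the plugin's actual implementation, which may well use a proper HTML parser.

```javascript
// Hypothetical helper for illustration only; not part of the plugin's API.
function extractVisibleText(html) {
  return html
    // Drop <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, ' ')
    // Replace remaining tags with a space so adjacent text does not fuse.
    .replace(/<[^>]+>/g, ' ')
    // Collapse whitespace runs into single spaces.
    .replace(/\s+/g, ' ')
    .trim();
}

// Assumed output of rendering the structured frontmatter above.
const structuredPage = `
  <section class="hero">
    <h1>Welcome</h1>
    <p class="lead-in">Get started</p>
    <p>This content is automatically indexed</p>
  </section>
  <section class="rich-text"><p>More searchable content here</p></section>`;

console.log(extractVisibleText(structuredPage));
// Welcome Get started This content is automatically indexed More searchable content here
```

The function never looks at `sectionType` or field names: whatever the components rendered is what ends up in the text.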
Traditional Markdown Sites
For traditional long-form content, the Markdown is rendered to HTML by `@metalsmith/layouts` before the plugin sees it:
```markdown
---
title: "My Article"
---

# Article Title

Long-form content that becomes searchable once rendered.
```
All body text is captured in the `content` field. Headings (h1–h6) are collected separately into the `headings` array so the client can jump to the matching section.
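As a rough illustration of how headings might be collected from rendered HTML, consider the sketch below. The helper name `collectHeadings` and the entry fields (`level`, `id`, `title`) are assumptions made for this example; the actual shape of the plugin's `headings` entries is not specified here.

```javascript
// Hypothetical helper; the entry fields (level, id, title) are assumptions.
function collectHeadings(html) {
  const headings = [];
  // Match <h1>..<h6> elements; \1 ensures the closing tag has the same level.
  const pattern = /<h([1-6])\b([^>]*)>([\s\S]*?)<\/h\1\s*>/gi;
  for (const match of html.matchAll(pattern)) {
    const [, level, attrs, inner] = match;
    // Pick up an id attribute, if present, so a client could link to the section.
    const idMatch = /(?:^|\s)id\s*=\s*["']([^"']*)["']/i.exec(attrs);
    headings.push({
      level: Number(level),
      id: idMatch ? idMatch[1] : null,
      // Strip inline markup inside the heading and normalize whitespace.
      title: inner.replace(/<[^>]+>/g, '').replace(/\s+/g, ' ').trim(),
    });
  }
  return headings;
}

// Assumed rendering of the Markdown article above, plus one extra heading.
const article = `
  <h1 id="article-title">Article Title</h1>
  <p>Long-form content that becomes searchable once rendered.</p>
  <h2 id="details">More <em>details</em></h2>`;

console.log(collectHeadings(article));
```

Keeping an `id` next to each heading is what would let a search client link straight to the matching section, provided the rendered headings carry anchors.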
Excluding Site Chrome
Use `excludeSelectors` to keep navigation, repeated promo banners, or related-post widgets out of the index:
```javascript
.use(search({
  // Elements matching these selectors are removed before text is indexed
  excludeSelectors: ['nav', 'header', 'footer', '.site-promo', '.related-posts']
}))
```
`<script>` and `<style>` elements are always stripped automatically.
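The effect of exclusion can be sketched without any dependencies, under heavy simplifying assumptions. The hypothetical helper `removeElementsByTag` below handles bare tag-name selectors only (such as `nav` or `footer`) and assumes a tag is never nested inside itself. Class selectors such as `.site-promo` need a real HTML parser with CSS selector support, so they are out of scope for this sketch, and none of this should be read as the plugin's own code.

```javascript
// Hypothetical helper: supports bare tag names only, no nesting of the same tag.
function removeElementsByTag(html, tagNames) {
  // script and style are always included, mirroring the behaviour described above.
  const tags = new Set(['script', 'style', ...tagNames.map((t) => t.toLowerCase())]);
  let result = html;
  for (const tag of tags) {
    if (!/^[a-z][a-z0-9-]*$/.test(tag)) {
      throw new Error(`Only bare tag names are supported, got: ${tag}`);
    }
    // The lookahead keeps <nav> from also matching custom elements like <nav-bar>.
    const pattern = new RegExp(`<${tag}(?=[\\s/>])[^>]*>[\\s\\S]*?<\\/${tag}\\s*>`, 'gi');
    result = result.replace(pattern, '');
  }
  return result;
}

const page =
  '<header><a href="/">Home</a></header>' +
  '<nav><a href="/docs">Docs</a></nav>' +
  '<main><p>Searchable body</p><script>track()</script></main>' +
  '<footer>Site footer</footer>';

console.log(removeElementsByTag(page, ['nav', 'header', 'footer']));
// <main><p>Searchable body</p></main>
```

Only the `<main>` content survives, which is the part of the page that is worth searching.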