Astro Content Collections Explained

A deep dive into Astro Content Collections, a type-safe way to manage your Markdown and MDX content with schema-validated frontmatter.

Managing content in a static site can quickly become unwieldy as your project grows. Markdown files accumulate, frontmatter fields drift out of sync, and referencing one piece of content from another becomes error-prone. Astro Content Collections address these problems by providing a structured, type-safe system for organizing and validating your content. This guide looks at how they work and how to get the most out of them in your Astro 6 projects.

What Are Content Collections?

Content Collections are Astro's built-in system for organizing, validating, and querying your Markdown, MDX, and other content files. Instead of loosely placing content files in directories and manually importing them, you get a formal schema, type checking, and a typed query API. Think of them as a lightweight database for your content, where every entry is validated against a schema you define, and TypeScript flags any access to a property the schema does not declare.

Setting Up Collections

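In Astro 6, collections are defined in a src/content.config.ts file. Each collection declares a loader that says where its entries come from (for local Markdown and MDX, the built-in glob() loader pointed at a folder such as src/content/blog/) and a schema that describes its frontmatter. The older layout, a config.ts file inside src/content/ with one implicit collection per subdirectory, belongs to the legacy API from earlier Astro versions. Astro uses Zod for schema validation, which means you get runtime validation and TypeScript type inference from a single source of truth. When you build your site, Astro validates every content file against its schema and reports clear errors if any frontmatter is missing or incorrectly typed.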

Defining a Schema

Using Zod, you define exactly what fields each content entry must have, their types, and any default values. For example, a blog post collection might require a title (string), a date (date), and an optional description (string). You can also define enum fields for categories, arrays for tags, and even nested objects for complex metadata. The schema is enforced at build time, so invalid content never makes it to production.
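As a rough sketch of the schema described above, the config fragment below defines a blog collection. It assumes the Content Layer conventions of recent Astro versions (a src/content.config.ts file, the glob() loader, and z imported from astro/zod). The folder path, the category values, and the tags, draft, and seo fields are hypothetical and only there for illustration.

```typescript
// src/content.config.ts (assumed location; adjust to your Astro version)
import { defineCollection } from 'astro:content';
import { glob } from 'astro/loaders';
import { z } from 'astro/zod';

const blog = defineCollection({
  // Load every Markdown and MDX file under the (hypothetical) blog folder.
  loader: glob({ pattern: '**/*.{md,mdx}', base: './src/content/blog' }),
  schema: z.object({
    title: z.string(),                        // required
    date: z.coerce.date(),                    // frontmatter value coerced to a Date
    description: z.string().optional(),       // optional
    category: z.enum(['guide', 'news']).default('guide'), // enum with a default
    tags: z.array(z.string()).default([]),    // array, empty when omitted
    draft: z.boolean().default(false),
    // Nested object for more complex metadata (illustrative only).
    seo: z.object({ canonical: z.string().url() }).optional(),
  }),
});

export const collections = { blog };
```

With a schema like this, a post that omits title or supplies a category outside the enum fails validation instead of being published with bad data.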

Querying Collections

Once your content is organized into collections, you can query it using the getCollection() function. This function returns all entries in a collection. Each entry exposes its validated frontmatter on a data property, along with an id and the raw body; to turn the body into HTML, pass the entry to render(), which returns a Content component you can place in your template. You can filter entries by frontmatter values, sort them by date or any other field, and render them in your page templates. Because the results are fully typed, your IDE provides autocompletion for every frontmatter field.

Filtering and Sorting

The getCollection() function accepts an optional filter function that lets you select entries based on any criteria. For example, you can filter blog posts by tag, select only published entries, or find content within a date range. Combined with JavaScript array methods like sort() and slice(), you can create complex content queries that would be difficult to achieve with a simple file-based approach.
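The filter, sort, and slice pattern can be sketched in plain TypeScript. This is a simplified model, not Astro's implementation: the posts array stands in for the result of getCollection(), and the entry shape, the sample data, and the latestByTag helper are all hypothetical. In an Astro project, a predicate like the one below could be passed as the second argument to getCollection().

```typescript
// Hypothetical entry shape, modeled on what a blog collection might return.
interface BlogEntry {
  id: string;
  data: { title: string; date: Date; tags: string[]; draft: boolean };
}

// Sample data standing in for the entries of a blog collection.
const posts: BlogEntry[] = [
  { id: 'intro', data: { title: 'Intro', date: new Date('2024-01-10'), tags: ['astro'], draft: false } },
  { id: 'zod-tips', data: { title: 'Zod Tips', date: new Date('2024-03-05'), tags: ['astro', 'zod'], draft: false } },
  { id: 'wip', data: { title: 'Work in Progress', date: new Date('2024-04-01'), tags: ['astro'], draft: true } },
  { id: 'css', data: { title: 'CSS Notes', date: new Date('2024-02-20'), tags: ['css'], draft: false } },
];

// Predicate with the same shape as a collection filter callback:
// keep entries that are published and carry the given tag.
const isPublishedWithTag = (tag: string) => (entry: BlogEntry): boolean =>
  !entry.data.draft && entry.data.tags.includes(tag);

// Filter, sort newest first, then keep at most `limit` entries.
// filter() returns a new array, so the input is not reordered.
function latestByTag(entries: BlogEntry[], tag: string, limit: number): BlogEntry[] {
  return entries
    .filter(isPublishedWithTag(tag))
    .sort((a, b) => b.data.date.valueOf() - a.data.date.valueOf())
    .slice(0, limit);
}

const latestAstro = latestByTag(posts, 'astro', 2).map((p) => p.id);
console.log(latestAstro); // [ 'zod-tips', 'intro' ]
```

The draft entry is dropped even though it is the newest, because the predicate runs before the sort.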

Live Collections in Astro 6

Astro 6 includes Live Collections, which extend the Content Collections model to data fetched at request time instead of build time. A regular collection is read and validated once, when the site is built. A live collection is defined with a live loader in src/live.config.ts and queried with getLiveCollection() and getLiveEntry() each time a page is rendered on demand, so frequently changing content is current without a rebuild. Seeing edits to a local Markdown file appear in the browser as you write is a separate matter: the dev server already reloads the page when a content file changes, for regular collections too. Live Collections can use the same kind of Zod schema and typed results as regular collections, so you get fresh data without giving up content integrity.

Relationships Between Collections

Content Collections also support relationships between entries. You can reference an entry in one collection from another by storing its ID in your frontmatter. For example, an author collection might contain author profiles, and blog posts can reference an author by ID. When you query a blog post, the validated field holds a typed reference (the collection name plus the entry ID) rather than the full author object; passing that reference to getEntry() returns the author entry, which removes the need for manual lookups or fragile string matching.

Tip: Use the reference() function in your schema to create typed references between collections. This ensures that every reference points to a valid entry and ties the field's type to the collection it targets.
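To make the idea concrete, a reference stored in frontmatter boils down to a collection name plus an entry ID that is looked up in the target collection. The sketch below models that lookup in plain TypeScript. It is not Astro's implementation: the Reference type, the sample data, and the resolveAuthor helper are hypothetical. In an Astro project, reference() in the schema and getEntry() from astro:content play these roles.

```typescript
// A reference as it might look after validation: collection name plus entry ID.
interface Reference<C extends string> {
  collection: C;
  id: string;
}

interface AuthorEntry {
  id: string;
  data: { name: string };
}

interface PostEntry {
  id: string;
  data: { title: string; author: Reference<'authors'> };
}

// Sample data standing in for two collections (all names are hypothetical).
const authors: AuthorEntry[] = [
  { id: 'ada', data: { name: 'Ada Lovelace' } },
  { id: 'grace', data: { name: 'Grace Hopper' } },
];

const post: PostEntry = {
  id: 'hello-world',
  data: { title: 'Hello World', author: { collection: 'authors', id: 'ada' } },
};

// Stand-in for the lookup step: find the entry whose ID matches the reference.
// Returns undefined when the reference does not match any entry.
function resolveAuthor(ref: Reference<'authors'>, all: AuthorEntry[]): AuthorEntry | undefined {
  return all.find((entry) => entry.id === ref.id);
}

const author = resolveAuthor(post.data.author, authors);
console.log(author?.data.name); // Ada Lovelace
```

Because the field is typed as a reference to the authors collection, a reference into a different collection is rejected by the type checker before anything runs.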

Content Collections and Search

Content Collections pair beautifully with Pagefind for search. Because all your content lives in a structured, validated format, you can ensure that every piece of content has consistent metadata like titles, dates, and descriptions. When Pagefind indexes your built pages, it uses this metadata to display rich, informative search results. The combination of Content Collections for authoring and Pagefind for discovery creates a powerful content management experience without any server-side infrastructure.

Best Practices

When working with Content Collections, keep a few best practices in mind. Always define a comprehensive schema — it is your safety net against content drift. Use default values for optional fields to keep your frontmatter concise. Organize related content into separate collections rather than cramming everything into one. And take advantage of the TypeScript integration by using the generated types throughout your project for maximum type safety.