SEO & Discoverability

Built-in tools for search engine optimization

Core Framework Feature

SEO support is built into the Harpy.js core. Every new project generates robots.txt and sitemap.xml out of the box, so your application is discoverable by search engines from day one.

Harpy.js provides an SEO module that generates the robots.txt and sitemap.xml files for your application. It is built on NestJS's dependency injection system and offers both default implementations and extensible services for advanced use cases.

Quick Start

For new projects created with the Harpy.js CLI, SEO is automatically configured. For existing projects, add the SeoModule to your app module:

TypeScript
    import { Module } from '@nestjs/common';
    import { SeoModule } from '@harpy-js/core';

    @Module({
      imports: [
        SeoModule.forRoot({
          // Falls back to the local dev server when BASE_URL is not set.
          baseUrl: process.env.BASE_URL || 'http://localhost:3000',
        }),
      ],
    })
    export class AppModule {}

⚑ Instant Results: Once imported, your application automatically serves /robots.txt and /sitemap.xml endpoints with sensible defaults.

What Gets Generated

robots.txt

The robots.txt file tells search engine crawlers which parts of your site they can access. Here's what the default configuration generates:

Text
    User-agent: *
    Allow: /
    Disallow: /api/
    Disallow: /private/

    Sitemap: https://example.com/sitemap.xml

    Host: https://example.com

sitemap.xml

The sitemap provides search engines with a structured list of your pages, including metadata about update frequency and priority:

XML
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com</loc>
        <lastmod>2025-12-13T12:00:00.000Z</lastmod>
        <changefreq>daily</changefreq>
        <priority>1</priority>
      </url>
      <url>
        <loc>https://example.com/about</loc>
        <lastmod>2025-12-13T12:00:00.000Z</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>

Key Benefits:

  • Faster indexing of new and updated pages
  • Better crawl budget allocation
  • Improved discovery of deep or dynamic pages
  • Priority hints for important content
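To make the mapping from sitemap entries to XML concrete, here is a minimal sketch of how such entries could be serialized. It is not the Harpy.js implementation: `renderSitemap` and `escapeXml` are hypothetical helpers, and the `SitemapUrl` interface is a local copy of the documented shape with `alternates` left out.

```typescript
// Local copy of the documented SitemapUrl shape (alternates omitted for brevity).
interface SitemapUrl {
  url: string;
  lastModified?: Date | string;
  changeFrequency?: 'always' | 'hourly' | 'daily' | 'weekly' | 'monthly' | 'yearly' | 'never';
  priority?: number;
}

// Escape the five characters that are special in XML.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: serialize entries into a sitemap document.
// Optional fields are emitted only when they are set.
function renderSitemap(entries: SitemapUrl[]): string {
  const urlBlocks = entries.map((entry) => {
    const lines = [`    <loc>${escapeXml(entry.url)}</loc>`];
    if (entry.lastModified !== undefined) {
      const stamp =
        entry.lastModified instanceof Date
          ? entry.lastModified.toISOString()
          : entry.lastModified;
      lines.push(`    <lastmod>${escapeXml(stamp)}</lastmod>`);
    }
    if (entry.changeFrequency !== undefined) {
      lines.push(`    <changefreq>${entry.changeFrequency}</changefreq>`);
    }
    if (entry.priority !== undefined) {
      lines.push(`    <priority>${entry.priority}</priority>`);
    }
    return ['  <url>', ...lines, '  </url>'].join('\n');
  });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urlBlocks,
    '</urlset>',
  ].join('\n');
}

const sitemapXml = renderSitemap([
  {
    url: 'https://example.com',
    lastModified: new Date('2025-12-13T12:00:00.000Z'),
    changeFrequency: 'daily',
    priority: 1.0,
  },
  { url: 'https://example.com/about', changeFrequency: 'monthly', priority: 0.8 },
]);
console.log(sitemapXml);
```

Escaping matters mostly for URLs with query strings, where a raw `&` would make the document invalid XML.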

Custom SEO Service

For most applications, you'll want to customize which URLs appear in your sitemap. Create a custom service by extending BaseSeoService:

Step 1: Create Your SEO Service

TypeScript
    import { Injectable } from '@nestjs/common';
    import { BaseSeoService, SitemapUrl, RobotsConfig } from '@harpy-js/core';

    @Injectable()
    export class SeoService extends BaseSeoService {
      // URLs to list in /sitemap.xml.
      getSitemapUrls(): Promise<SitemapUrl[]> {
        const now = new Date();

        return Promise.resolve([
          {
            url: this.baseUrl,
            lastModified: now,
            changeFrequency: 'daily',
            priority: 1.0,
          },
          {
            url: `${this.baseUrl}/about`,
            lastModified: now,
            changeFrequency: 'monthly',
            priority: 0.8,
          },
        ]);
      }

      // Rules served at /robots.txt.
      getRobotsConfig(): RobotsConfig {
        return {
          rules: {
            userAgent: '*',
            allow: '/',
            disallow: ['/api/', '/admin/'],
          },
          sitemap: `${this.baseUrl}/sitemap.xml`,
          host: this.baseUrl,
        };
      }
    }

Step 2: Register Your Custom Service

TypeScript
    import { Module } from '@nestjs/common';
    import { SeoModule } from '@harpy-js/core';
    import { SeoService } from './seo.service';

    @Module({
      imports: [
        // Registers the custom service in place of the default implementation.
        SeoModule.forRootWithService(SeoService, {
          baseUrl: process.env.BASE_URL || 'http://localhost:3000',
        }),
      ],
    })
    export class AppModule {}

💡 Protected Access: The baseUrl property is protected in BaseSeoService, giving your custom service access to the configured base URL without manually passing it around.

Dynamic Sitemaps from Database

Real-world applications often need to generate sitemaps from database content like blog posts, products, or user profiles. Here's how to integrate with TypeORM:

TypeScript
    import { Injectable } from '@nestjs/common';
    import { BaseSeoService, SitemapUrl } from '@harpy-js/core';
    import { InjectRepository } from '@nestjs/typeorm';
    import { Repository } from 'typeorm';
    import { Post } from './entities/post.entity';

    @Injectable()
    export class SeoService extends BaseSeoService {
      constructor(
        @InjectRepository(Post)
        private postsRepository: Repository<Post>,
      ) {
        super();
      }

      async getSitemapUrls(): Promise<SitemapUrl[]> {
        const staticPages: SitemapUrl[] = [
          {
            url: this.baseUrl,
            lastModified: new Date(),
            changeFrequency: 'daily',
            priority: 1.0,
          },
        ];

        // Fetch only the columns the sitemap needs, for published posts
        const posts = await this.postsRepository.find({
          select: ['slug', 'updatedAt'],
          where: { published: true },
        });

        // One sitemap entry per blog post
        const dynamicPages = posts.map((post) => ({
          url: `${this.baseUrl}/blog/${post.slug}`,
          lastModified: post.updatedAt,
          changeFrequency: 'weekly' as const,
          priority: 0.7,
        }));

        return [...staticPages, ...dynamicPages];
      }

      getRobotsConfig() {
        return {
          rules: { userAgent: '*', allow: '/' },
          sitemap: `${this.baseUrl}/sitemap.xml`,
        };
      }
    }

⚠️ Performance Tip: For large databases, cache sitemap results or split the output into several files referenced from a sitemap index (the sitemap protocol caps a single file at 50,000 URLs), so that crawler requests do not each trigger a full database query.
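One way to approach the caching part is a small time-to-live wrapper around the function that loads the URLs. The sketch below is framework-independent and makes no claim about what Harpy.js offers built in: `cacheFor` is a hypothetical helper, and the loader is a stand-in for a repository query.

```typescript
// Wrap an async loader so that it runs at most once per `ttlMs` window.
// `now` is injectable so the behaviour can be checked without real timers.
function cacheFor<T>(
  ttlMs: number,
  loader: () => Promise<T>,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: Promise<T>; expiresAt: number } | undefined;

  return () => {
    const current = now();
    if (cached !== undefined && current < cached.expiresAt) {
      return cached.value; // Still fresh: reuse the stored promise.
    }
    const value = loader();
    const entry = { value, expiresAt: current + ttlMs };
    cached = entry;
    // Do not keep a failed load around; the next call should retry.
    value.catch(() => {
      if (cached === entry) cached = undefined;
    });
    return value;
  };
}

// Usage sketch with a fake clock and a stand-in for the database query.
let fakeTime = 0;
let loadCount = 0;
const loadUrls = (): Promise<string[]> => {
  loadCount += 1; // Stands in for a repository query.
  return Promise.resolve(['https://example.com/blog/hello']);
};
const getCachedUrls = cacheFor(60 * 60 * 1000, loadUrls, () => fakeTime);

getCachedUrls();
getCachedUrls();            // Served from the cache; the loader is not called again.
fakeTime += 60 * 60 * 1000; // One hour later the entry has expired.
getCachedUrls();            // Triggers a second load.
console.log(loadCount);     // prints 2
```

Because the promise itself is cached, concurrent requests that arrive while a load is in flight share that one query. In a custom service, getSitemapUrls() could delegate to such a wrapper.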

Multi-language Sitemaps

For internationalized applications, you can use the alternates property to specify alternate language versions of each page:

TypeScript
    getSitemapUrls(): Promise<SitemapUrl[]> {
      const locales = ['en', 'fr', 'es'];
      const pages = ['/', '/about', '/contact'];

      const urls: SitemapUrl[] = [];

      // One entry per locale/page pair; each entry lists every translation.
      for (const locale of locales) {
        for (const page of pages) {
          urls.push({
            url: `${this.baseUrl}/${locale}${page}`,
            lastModified: new Date(),
            changeFrequency: 'weekly',
            priority: page === '/' ? 1.0 : 0.8,
            alternates: {
              languages: locales.reduce((acc, lang) => {
                acc[lang] = `${this.baseUrl}/${lang}${page}`;
                return acc;
              }, {} as Record<string, string>),
            },
          });
        }
      }

      return Promise.resolve(urls);
    }

This generates proper <xhtml:link> tags in your sitemap, helping search engines understand the relationship between translated versions of your content.
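For orientation, the following sketch shows roughly what one such entry looks like once its language map is rendered. The helper names (`renderLocalizedUrl`, `escapeAttr`) are illustrative and not part of Harpy.js, and the exact output of the framework may differ.

```typescript
// Escape characters that would break an XML attribute or text node.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Hypothetical helper: render one <url> element with its language alternates.
// `languages` maps an hreflang code to the absolute URL of that translation.
function renderLocalizedUrl(loc: string, languages: Record<string, string>): string {
  const links = Object.entries(languages).map(
    ([lang, href]) =>
      `    <xhtml:link rel="alternate" hreflang="${lang}" href="${escapeAttr(href)}"/>`,
  );
  return ['  <url>', `    <loc>${escapeAttr(loc)}</loc>`, ...links, '  </url>'].join('\n');
}

const localizedEntry = renderLocalizedUrl('https://example.com/en/about', {
  en: 'https://example.com/en/about',
  fr: 'https://example.com/fr/about',
});
console.log(localizedEntry);
```

For these elements to be valid, the enclosing `<urlset>` has to declare the `xmlns:xhtml="http://www.w3.org/1999/xhtml"` namespace. Each language version should also list itself among its alternates, which the loop in the example above already does.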

Advanced robots.txt Configuration

You can define different rules for different crawlers and specify multiple sitemaps:

TypeScript
    getRobotsConfig(): RobotsConfig {
      return {
        // One rule group per crawler; '*' covers every other bot.
        rules: [
          {
            userAgent: 'Googlebot',
            allow: '/',
            crawlDelay: 0,
          },
          {
            userAgent: 'Bingbot',
            allow: '/',
            crawlDelay: 1,
          },
          {
            userAgent: '*',
            disallow: ['/api/', '/admin/', '/private/'],
          },
        ],
        sitemap: [
          `${this.baseUrl}/sitemap.xml`,
          `${this.baseUrl}/sitemap-images.xml`,
        ],
        host: this.baseUrl,
      };
    }

Use Cases:

  • Crawl delays: Ask crawlers that honor Crawl-delay (Bingbot, for example) to slow down so they do not overwhelm your server; Googlebot ignores this directive
  • Multiple sitemaps: Separate sitemaps for different content types (pages, images, videos)
  • Bot-specific rules: Customize behavior for Google, Bing, or other specific crawlers
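To show how a multi-rule configuration maps onto the text file crawlers read, here is a minimal rendering sketch. It is an illustration, not the Harpy.js implementation: `RobotsRule`, `toList`, and `renderRobotsTxt` are hypothetical names, and the interfaces are local copies of the documented shapes.

```typescript
// Local copies of the documented shapes; RobotsRule is a name made up here.
interface RobotsRule {
  userAgent: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
  crawlDelay?: number;
}

interface RobotsConfig {
  rules: RobotsRule | RobotsRule[];
  sitemap?: string | string[];
  host?: string;
}

// Normalize "one value or many" string fields to an array.
function toList(value: string | string[] | undefined): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: one block per rule group, then sitemap and host lines.
function renderRobotsTxt(config: RobotsConfig): string {
  const rules = Array.isArray(config.rules) ? config.rules : [config.rules];

  const groups = rules.map((rule) => {
    const lines: string[] = [];
    for (const agent of toList(rule.userAgent)) lines.push(`User-agent: ${agent}`);
    for (const path of toList(rule.allow)) lines.push(`Allow: ${path}`);
    for (const path of toList(rule.disallow)) lines.push(`Disallow: ${path}`);
    if (rule.crawlDelay !== undefined) lines.push(`Crawl-delay: ${rule.crawlDelay}`);
    return lines.join('\n');
  });

  const footer: string[] = [];
  for (const sitemap of toList(config.sitemap)) footer.push(`Sitemap: ${sitemap}`);
  if (config.host !== undefined) footer.push(`Host: ${config.host}`);

  // Rule groups and the footer are separated by blank lines.
  const sections = groups.concat(footer.length > 0 ? [footer.join('\n')] : []);
  return sections.join('\n\n') + '\n';
}

const robotsTxt = renderRobotsTxt({
  rules: [
    { userAgent: 'Bingbot', allow: '/', crawlDelay: 1 },
    { userAgent: '*', disallow: ['/api/', '/admin/'] },
  ],
  sitemap: 'https://example.com/sitemap.xml',
});
console.log(robotsTxt);
```

Keeping each crawler's directives in its own blank-line-separated group matters because a crawler applies the group that matches its user agent most specifically, not the union of all groups.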

TypeScript Types

The SEO module is fully typed for excellent IDE support:

TypeScript
    interface SitemapUrl {
      url: string;
      lastModified?: Date | string;
      changeFrequency?:
        | 'always'
        | 'hourly'
        | 'daily'
        | 'weekly'
        | 'monthly'
        | 'yearly'
        | 'never';
      priority?: number; // 0.0 to 1.0
      alternates?: {
        languages?: Record<string, string>;
      };
    }

    interface RobotsConfig {
      // A single rule object, or an array of rule objects
      // (the array element type is elided here).
      rules: {
        userAgent: string | string[];
        allow?: string | string[];
        disallow?: string | string[];
        crawlDelay?: number;
      } | Array<{...}>;
      sitemap?: string | string[];
      host?: string;
    }

Testing Your Configuration

Once your application is running, test your SEO endpoints:

Shell
    # Test robots.txt
    curl http://localhost:3000/robots.txt

    # Test sitemap.xml
    curl http://localhost:3000/sitemap.xml

🔍 Validation Tools:

Best Practices

✅ Update Frequencies

Set realistic changeFrequency values. Use 'daily' for the homepage, 'weekly' for blog posts, and 'monthly' for static pages.

✅ Priority Values

Reserve priority: 1.0 for your most important pages. Use 0.8-0.9 for major sections and 0.5-0.7 for standard content.
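A quick sanity check before entries are served can catch mistakes such as out-of-range priorities. The sketch below is a hypothetical helper, not part of Harpy.js. It flags relative URLs, priorities outside the 0.0 to 1.0 range that the sitemap protocol allows, and duplicate entries.

```typescript
// Only the fields this check looks at; a subset of the documented SitemapUrl.
interface SitemapEntry {
  url: string;
  priority?: number;
}

// Hypothetical helper: return a human-readable list of problems (empty if none).
function findSitemapProblems(entries: SitemapEntry[]): string[] {
  const problems: string[] = [];
  // Prototype-free object so that no URL collides with an inherited key.
  const seen: Record<string, boolean> = Object.create(null);

  for (const entry of entries) {
    if (!/^https?:\/\//.test(entry.url)) {
      problems.push(`${entry.url}: not an absolute http(s) URL`);
    }
    if (entry.priority !== undefined && (entry.priority < 0 || entry.priority > 1)) {
      problems.push(`${entry.url}: priority ${entry.priority} is outside 0.0 to 1.0`);
    }
    if (seen[entry.url]) {
      problems.push(`${entry.url}: listed more than once`);
    }
    seen[entry.url] = true;
  }
  return problems;
}

const sitemapProblems = findSitemapProblems([
  { url: 'https://example.com', priority: 1.0 },
  { url: '/about', priority: 0.8 },                    // relative URL
  { url: 'https://example.com/blog', priority: 1.5 },  // priority too high
  { url: 'https://example.com', priority: 0.5 },       // duplicate
]);
console.log(sitemapProblems);
```

A check like this fits naturally into a unit test for a custom getSitemapUrls() implementation.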

✅ Caching Strategy

The SEO controllers include cache headers by default (24h for robots.txt, 1h for sitemap.xml). Adjust these based on how frequently your content changes.

✅ Environment Variables

Always use process.env.BASE_URL for your base URL configuration. This ensures correct URLs across development, staging, and production environments.
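Since every sitemap URL is built by concatenating onto the base URL, it can help to normalize the environment value once at startup. The following sketch assumes a hypothetical helper named `resolveBaseUrl`; the fallback mirrors the one used in the module examples above.

```typescript
// Hypothetical helper: validate and normalize a configured base URL.
// - falls back to the local dev server when the value is missing or blank
// - rejects values without an http(s) scheme
// - strips trailing slashes so `${baseUrl}/about` never yields a double slash
function resolveBaseUrl(
  raw: string | undefined,
  fallback: string = 'http://localhost:3000',
): string {
  const candidate = (raw ?? '').trim() || fallback;
  if (!/^https?:\/\//.test(candidate)) {
    throw new Error(`BASE_URL must start with http:// or https://, got "${candidate}"`);
  }
  return candidate.replace(/\/+$/, '');
}

// In an application this would be called as resolveBaseUrl(process.env.BASE_URL).
console.log(resolveBaseUrl('https://example.com/')); // prints https://example.com
console.log(resolveBaseUrl(undefined));              // prints http://localhost:3000
```

Failing fast on a scheme-less value such as `example.com` avoids silently publishing a sitemap whose entries crawlers cannot resolve.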

Why SEO Matters in Modern Web Development

Search engine optimization isn't just about rankings; it's about making your application discoverable, accessible, and successful:

Organic Traffic

Properly configured sitemaps help search engines discover and index your content faster, leading to increased organic traffic.

Lower Acquisition Costs

Good SEO reduces reliance on paid advertising by improving organic visibility and reducing customer acquisition costs.

Credibility & Trust

Users trust search results. Higher rankings signal authority and build credibility with your audience.

Global Reach

Multi-language sitemap support helps international audiences find your content in their preferred language.

Built for Production from Day One

By including SEO as a core framework feature, Harpy.js ensures that developers don't have to remember to add these critical features later. Your application is search-engine ready from the moment you start development, following industry best practices automatically.

Next Steps