SEO & Discoverability
Built-in tools for search engine optimization
Core Framework Feature
SEO support is built into Harpy.js core. Every new project automatically includes robots.txt and sitemap.xml generation out of the box, making your application discoverable by search engines from day one.
The SEO module, built on top of NestJS's dependency injection system, ships with sensible default implementations and extensible services for advanced use cases.
Quick Start
For new projects created with the Harpy.js CLI, SEO is automatically configured. For existing projects, add the SeoModule to your app module:
```typescript
import { Module } from '@nestjs/common';
import { SeoModule } from '@harpy-js/core';

@Module({
  imports: [
    SeoModule.forRoot({
      baseUrl: process.env.BASE_URL || 'http://localhost:3000',
    }),
  ],
})
export class AppModule {}
```

⚡ Instant Results: Once imported, your application automatically serves /robots.txt and /sitemap.xml endpoints with sensible defaults.
What Gets Generated
robots.txt
The robots.txt file tells search engine crawlers which parts of your site they can access. Here's what the default configuration generates:
```
User-agent: *
Allow: /
Disallow: /api/
Disallow: /private/

Sitemap: https://example.com/sitemap.xml

Host: https://example.com
```

sitemap.xml
The sitemap provides search engines with a structured list of your pages, including metadata about update frequency and priority:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2025-12-13T12:00:00.000Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2025-12-13T12:00:00.000Z</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Key Benefits:
- Faster indexing of new and updated pages
- Better crawl budget allocation
- Improved discovery of deep or dynamic pages
- Priority hints for important content
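For intuition, XML like the example above can be produced from a list of entries by a small serializer. The sketch below is illustrative only and is not the module's actual code; the `SitemapUrl` shape mirrors the types documented later in this page:

```typescript
// Illustrative sitemap serializer; NOT the Harpy.js implementation.
interface SitemapUrl {
  url: string;
  lastModified?: Date;
  changeFrequency?: string;
  priority?: number;
}

function toSitemapXml(urls: SitemapUrl[]): string {
  const entries = urls
    .map((u) => {
      // Only emit the optional tags that are actually set on the entry.
      const parts = [`    <loc>${u.url}</loc>`];
      if (u.lastModified) parts.push(`    <lastmod>${u.lastModified.toISOString()}</lastmod>`);
      if (u.changeFrequency) parts.push(`    <changefreq>${u.changeFrequency}</changefreq>`);
      if (u.priority !== undefined) parts.push(`    <priority>${u.priority}</priority>`);
      return `  <url>\n${parts.join('\n')}\n  </url>`;
    })
    .join('\n');
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${entries}\n</urlset>`
  );
}

// Usage sketch:
console.log(toSitemapXml([{ url: 'https://example.com', priority: 1 }]));
```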
Custom SEO Service
For most applications, you'll want to customize which URLs appear in your sitemap. Create a custom service by extending BaseSeoService:
Step 1: Create Your SEO Service
```typescript
import { Injectable } from '@nestjs/common';
import { BaseSeoService, SitemapUrl, RobotsConfig } from '@harpy-js/core';

@Injectable()
export class SeoService extends BaseSeoService {
  getSitemapUrls(): Promise<SitemapUrl[]> {
    const now = new Date();

    return Promise.resolve([
      {
        url: this.baseUrl,
        lastModified: now,
        changeFrequency: 'daily',
        priority: 1.0,
      },
      {
        url: `${this.baseUrl}/about`,
        lastModified: now,
        changeFrequency: 'monthly',
        priority: 0.8,
      },
    ]);
  }

  getRobotsConfig(): RobotsConfig {
    return {
      rules: {
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/admin/'],
      },
      sitemap: `${this.baseUrl}/sitemap.xml`,
      host: this.baseUrl,
    };
  }
}
```

Step 2: Register Your Custom Service
```typescript
import { Module } from '@nestjs/common';
import { SeoModule } from '@harpy-js/core';
import { SeoService } from './seo.service';

@Module({
  imports: [
    SeoModule.forRootWithService(SeoService, {
      baseUrl: process.env.BASE_URL || 'http://localhost:3000',
    }),
  ],
})
export class AppModule {}
```

💡 Protected Access: The baseUrl property is protected in BaseSeoService, giving your custom service access to the configured base URL without manually passing it around.
Dynamic Sitemaps from Database
Real-world applications often need to generate sitemaps from database content like blog posts, products, or user profiles. Here's how to integrate with TypeORM:
```typescript
import { Injectable } from '@nestjs/common';
import { BaseSeoService, SitemapUrl } from '@harpy-js/core';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { Post } from './entities/post.entity';

@Injectable()
export class SeoService extends BaseSeoService {
  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
  ) {
    super();
  }

  async getSitemapUrls(): Promise<SitemapUrl[]> {
    const staticPages: SitemapUrl[] = [
      {
        url: this.baseUrl,
        lastModified: new Date(),
        changeFrequency: 'daily',
        priority: 1.0,
      },
    ];

    // Fetch blog posts from database
    const posts = await this.postsRepository.find({
      select: ['slug', 'updatedAt'],
      where: { published: true },
    });

    const dynamicPages = posts.map((post) => ({
      url: `${this.baseUrl}/blog/${post.slug}`,
      lastModified: post.updatedAt,
      changeFrequency: 'weekly' as const,
      priority: 0.7,
    }));

    return [...staticPages, ...dynamicPages];
  }

  getRobotsConfig() {
    return {
      rules: { userAgent: '*', allow: '/' },
      sitemap: `${this.baseUrl}/sitemap.xml`,
    };
  }
}
```

⚠️ Performance Tip: For large databases, consider caching sitemap results or implementing pagination with sitemap index files to avoid overwhelming your database with every crawler request.
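One way to apply that caching tip is a small time-based in-memory cache around the expensive query. The wrapper below is a generic sketch under the assumption that an in-process cache is acceptable (a single server, or tolerable per-instance staleness); the `CACHE_TTL_MS` value is an arbitrary example, not a Harpy.js setting:

```typescript
// Generic TTL cache wrapper; illustrative, not part of the Harpy.js API.
const CACHE_TTL_MS = 60 * 60 * 1000; // example: 1 hour

interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

// Wraps an async producer so repeated calls within the TTL reuse the result.
function cached<T>(ttlMs: number, compute: () => Promise<T>): () => Promise<T> {
  let entry: CacheEntry<T> | null = null;
  return async () => {
    const now = Date.now();
    if (entry && now < entry.expiresAt) {
      return entry.value; // serve cached result, skipping the database
    }
    const value = await compute();
    entry = { value, expiresAt: now + ttlMs };
    return value;
  };
}

// Usage sketch: wrap the sitemap query once, reuse it on every crawler request.
const getCachedUrls = cached(CACHE_TTL_MS, async () => {
  // ...would call this.postsRepository.find(...) in the service above
  return [{ url: 'https://example.com/blog/hello', priority: 0.7 }];
});
```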
Multi-language Sitemaps
For internationalized applications, you can use the alternates property to specify alternate language versions of each page:
```typescript
getSitemapUrls(): Promise<SitemapUrl[]> {
  const locales = ['en', 'fr', 'es'];
  const pages = ['/', '/about', '/contact'];

  const urls: SitemapUrl[] = [];

  for (const locale of locales) {
    for (const page of pages) {
      urls.push({
        url: `${this.baseUrl}/${locale}${page}`,
        lastModified: new Date(),
        changeFrequency: 'weekly',
        priority: page === '/' ? 1.0 : 0.8,
        alternates: {
          languages: locales.reduce((acc, lang) => {
            acc[lang] = `${this.baseUrl}/${lang}${page}`;
            return acc;
          }, {} as Record<string, string>),
        },
      });
    }
  }

  return Promise.resolve(urls);
}
```

This generates proper <xhtml:link> tags in your sitemap, helping search engines understand the relationship between translated versions of your content.
Advanced robots.txt Configuration
You can define different rules for different crawlers and specify multiple sitemaps:
```typescript
getRobotsConfig(): RobotsConfig {
  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: '/',
        crawlDelay: 0,
      },
      {
        userAgent: 'Bingbot',
        allow: '/',
        crawlDelay: 1,
      },
      {
        userAgent: '*',
        disallow: ['/api/', '/admin/', '/private/'],
      },
    ],
    sitemap: [
      `${this.baseUrl}/sitemap.xml`,
      `${this.baseUrl}/sitemap-images.xml`,
    ],
    host: this.baseUrl,
  };
}
```

Use Cases:
- Crawl delays: Prevent aggressive crawlers from overwhelming your server
- Multiple sitemaps: Separate sitemaps for different content types (pages, images, videos)
- Bot-specific rules: Customize behavior for Google, Bing, or other specific crawlers
TypeScript Types
The SEO module is fully typed for excellent IDE support:
```typescript
interface SitemapUrl {
  url: string;
  lastModified?: Date | string;
  changeFrequency?:
    | 'always'
    | 'hourly'
    | 'daily'
    | 'weekly'
    | 'monthly'
    | 'yearly'
    | 'never';
  priority?: number; // 0.0 to 1.0
  alternates?: {
    languages?: Record<string, string>;
  };
}

interface RobotsConfig {
  rules: {
    userAgent: string | string[];
    allow?: string | string[];
    disallow?: string | string[];
    crawlDelay?: number;
  } | Array<{...}>;
  sitemap?: string | string[];
  host?: string;
}
```

Testing Your Configuration
Once your application is running, test your SEO endpoints:
```shell
# Test robots.txt
curl http://localhost:3000/robots.txt

# Test sitemap.xml
curl http://localhost:3000/sitemap.xml
```

🔍 Validation Tools:
- Google's robots.txt Tester
- Google Search Console - Submit and validate sitemaps
- XML Sitemap Validator
Best Practices
✅ Update Frequencies
Set realistic changeFrequency values. Use 'daily' for homepage, 'weekly' for blog posts, and 'monthly' for static pages.
✅ Priority Values
Reserve priority: 1.0 for your most important pages. Use 0.8-0.9 for major sections and 0.5-0.7 for standard content.
✅ Caching Strategy
The SEO controllers include cache headers by default (24h for robots.txt, 1h for sitemap.xml). Adjust these based on how frequently your content changes.
✅ Environment Variables
Always use process.env.BASE_URL for your base URL configuration. This ensures correct URLs across development, staging, and production environments.
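The frequency and priority guidance above can be centralized in a small helper so values stay consistent across your sitemap. The page categories and the `toSitemapEntry` helper below are illustrative assumptions, not part of the Harpy.js API:

```typescript
// Illustrative helper applying the best-practice values above;
// the categories and mappings are assumptions, not a Harpy.js API.
type PageKind = 'home' | 'blog' | 'static';

const FREQUENCY: Record<PageKind, 'daily' | 'weekly' | 'monthly'> = {
  home: 'daily',    // homepage changes often
  blog: 'weekly',   // posts get edits and comments
  static: 'monthly' // about/contact rarely change
};

const PRIORITY: Record<PageKind, number> = {
  home: 1.0,  // reserve 1.0 for the most important page
  blog: 0.7,  // standard content
  static: 0.5 // low-churn pages
};

// Read the canonical base URL from the environment, with a dev fallback.
const baseUrl = process.env.BASE_URL || 'http://localhost:3000';

function toSitemapEntry(path: string, kind: PageKind) {
  return {
    url: `${baseUrl}${path}`,
    lastModified: new Date(),
    changeFrequency: FREQUENCY[kind],
    priority: PRIORITY[kind],
  };
}
```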
Canonical Domain Redirects
Fix SEO Issues with Just 4 Lines of Code
One of the most common SEO problems is having multiple URLs pointing to the same content—like http://, https://, and www variants. This confuses search engines and dilutes your rankings. Harpy.js solves this automatically with permanent redirects.
The Problem: Duplicate Content Penalties
Search engines like Google penalize websites that serve the same content from multiple URLs. Without proper redirects, your site could be accessible via:
- ❌ http://example.com
- ❌ https://example.com
- ❌ http://www.example.com
- ❌ https://www.example.com
Result: Google Search Console will flag these as "Page with redirect" errors, your link equity gets split across multiple URLs, and your rankings suffer.
The Solution: One Canonical URL
Harpy.js automatically issues 301 Permanent Redirects to consolidate all traffic to your preferred canonical domain:
https://example.com (Your canonical URL)

All other variants automatically redirect here with proper caching headers.
Implementation: Simple & Powerful
Add these four options to your setupHarpyApp() call in your main.ts file:
```typescript
import { NestFactory } from '@nestjs/core';
import { FastifyAdapter } from '@nestjs/platform-fastify';
import { setupHarpyApp } from '@harpy-js/core';
import { AppModule } from './app.module';

async function bootstrap() {
  const adapter = new FastifyAdapter();
  const app = await NestFactory.create(AppModule, adapter);

  await setupHarpyApp(app, {
    enforceRedirects: true,    // Enable canonical redirects
    mainDomain: 'example.com', // Your canonical domain
    enforceHttps: true,        // Redirect HTTP → HTTPS
    redirectWww: true,         // Redirect www → non-www
  });

  await app.listen(process.env.PORT || 3000, '0.0.0.0');
}
bootstrap();
```

Configuration Options:
```typescript
interface HarpyAppOptions {
  enforceRedirects?: boolean; // Enable/disable all redirects
  mainDomain?: string;        // Your canonical domain (e.g., 'example.com')
  enforceHttps?: boolean;     // Force HTTPS protocol
  redirectWww?: boolean;      // Redirect www to non-www (or vice versa)
  // ... other options
}
```

- enforceRedirects: Master switch to enable/disable all redirect logic
- mainDomain: Your canonical domain without protocol or www (e.g., 'example.com', 'harpyjs.org')
- enforceHttps: Redirects all HTTP requests to HTTPS
- redirectWww: Redirects www to non-www (or set to false for opposite)
How It Works Behind the Scenes
Fastify Hook Integration
Uses a Fastify onRequest hook that runs early in the request lifecycle—before any route handlers.
Proxy-Aware
Automatically detects x-forwarded-proto headers from proxies like Vercel, Cloudflare, and Nginx.
301 Permanent Redirects
Uses 301 status codes (not 302) with 1-year cache headers, telling search engines this redirect is permanent.
Localhost Exemption
Automatically skips redirects for localhost and 127.0.0.1 so local development works without hassle.
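The behavior described above can be approximated as a pure function that decides the redirect target for a request. This is an illustrative sketch of the logic, not Harpy.js's actual implementation; the `RedirectOptions` shape and `canonicalRedirect` name are assumptions for the example:

```typescript
// Illustrative sketch of canonical-redirect logic; NOT the Harpy.js source.
interface RedirectOptions {
  mainDomain: string;    // canonical domain, e.g. 'example.com'
  enforceHttps: boolean; // redirect http -> https
  redirectWww: boolean;  // redirect www.example.com -> example.com
}

// Returns the URL to 301-redirect to, or null when no redirect is needed.
// `proto` would come from the request URL, or from the x-forwarded-proto
// header when running behind a proxy like Vercel, Cloudflare, or Nginx.
function canonicalRedirect(
  host: string,
  proto: string,
  path: string,
  opts: RedirectOptions,
): string | null {
  // Skip redirects for local development hosts.
  const bareHost = host.split(':')[0];
  if (bareHost === 'localhost' || bareHost === '127.0.0.1') return null;

  let targetProto = proto;
  let targetHost = host;

  if (opts.enforceHttps && proto === 'http') targetProto = 'https';
  if (opts.redirectWww && bareHost === `www.${opts.mainDomain}`) {
    targetHost = opts.mainDomain;
  }

  // Already canonical: no redirect.
  if (targetProto === proto && targetHost === host) return null;
  return `${targetProto}://${targetHost}${path}`;
}

console.log(
  canonicalRedirect('www.example.com', 'http', '/about', {
    mainDomain: 'example.com',
    enforceHttps: true,
    redirectWww: true,
  }),
); // "https://example.com/about"
```

In a real Fastify app this decision would run inside an `onRequest` hook, replying with status 301 and a long-lived `Cache-Control` header when a target URL is returned.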
Real-World Impact
SEO Benefits:
- ✓ Consolidated Link Equity: All backlinks point to one URL, maximizing ranking power
- ✓ No Duplicate Content: Search engines index only your canonical URL
- ✓ Clean Search Console: Eliminates "Page with redirect" warnings
- ✓ Better Crawl Efficiency: Search bots don't waste time on redirects
- ✓ HTTPS Security: Forces secure connections, boosting trust signals
Why This Matters:
Most frameworks require complex server configuration, middleware chains, or third-party packages to handle canonical redirects. Harpy.js makes it framework-native—just four configuration options and you're done. This is exactly the kind of developer experience that prevents costly SEO mistakes before they happen.
Why SEO Matters in Modern Web Development
Search engine optimization isn't just about rankings—it's about making your application discoverable, accessible, and successful:
Organic Traffic
Properly configured sitemaps help search engines discover and index your content faster, leading to increased organic traffic.
Lower Acquisition Costs
Good SEO reduces reliance on paid advertising by improving organic visibility and reducing customer acquisition costs.
Credibility & Trust
Users trust search results. Higher rankings signal authority and build credibility with your audience.
Global Reach
Multi-language sitemap support helps international audiences find your content in their preferred language.
Built for Production from Day One
By including SEO as a core framework feature, Harpy.js ensures that developers don't have to remember to add these critical features later. Your application is search-engine ready from the moment you start development, following industry best practices automatically.