feat(seo): add robots.txt and dynamic sitemaps; dynamic canonical/hreflang; include marketplace sitemap in index

Open anthhub opened this issue 3 weeks ago • 1 comments

Summary

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

[!Tip] Close issue syntax: Fixes #<issue number> or Resolves #<issue number>, see documentation for more details.

Impact Areas

Please check the areas this PR affects:

  • [ ] Multi-threaded Dialogues
  • [ ] AI-Powered Capabilities (Web Search, Knowledge Base Search, Question Recommendations)
  • [ ] Context Memory & References
  • [ ] Knowledge Base Integration & RAG
  • [ ] Quotes & Citations
  • [ ] AI Document Editing & WYSIWYG
  • [ ] Free-form Canvas Interface
  • [ ] Other

Screenshots/Videos

| Before | After |
|---|---|
| ... | ... |

Checklist

[!IMPORTANT]
Please review the checklist below before submitting your pull request.

  • [ ] This change requires a documentation update, included: Refly Documentation
  • [x] I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • [x] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • [x] I've updated the documentation accordingly.
  • [x] I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Summary by CodeRabbit

  • New Features

    • Added sitemaps, a sitemap index, and robots.txt to improve crawling and indexing.
    • Built automated sitemap generation as part of the build process.
    • Improved SEO with canonical links and Open Graph URL metadata.
  • Changes

    • Updated routing: new workflow-template share route and a redirect that preserves query parameters for share links.
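A minimal sketch of how a redirect target could keep the incoming query string, assuming the behavior described in the bullet above. `buildTemplateRedirect` is a hypothetical helper, not the PR's actual `WorkflowTemplateRedirect` component; the `/workspace` fallback for a missing `shareId` follows the review notes further down this page.

```typescript
// Hypothetical helper: computes where a legacy /app/:shareId link should land.
// `search` is the raw query string, e.g. window.location.search ("?utm=1" or "").
function buildTemplateRedirect(shareId: string | undefined, search: string): string {
  // Assumed edge case: without a shareId there is nothing to show, so fall back.
  if (!shareId) {
    return "/workspace";
  }
  // Accept the query with or without its leading "?".
  const query = search.length > 0 && search.charAt(0) !== "?" ? "?" + search : search;
  // Encode the id so unusual characters cannot break the path.
  return "/workflow-template/" + encodeURIComponent(shareId) + query;
}
```

In a React Router setup the returned string would typically be passed to a `Navigate` element with `replace`, so the old URL does not stay in the history stack.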

anthhub avatar Dec 01 '25 13:12 anthhub

Walkthrough

Adds SEO infrastructure: sitemap generation scripts (JS/TS), static sitemap assets and robots.txt, build script changes to run sitemap generation, GlobalSEO updates for canonical and hreflang links, and routing changes introducing WorkflowTemplateRedirect and canonical /workflow-template/:shareId route.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build Configuration: apps/web/package.json | Added generate:sitemap npm script and updated build to run sitemap generation after rsbuild build. |
| Static SEO Assets: apps/web/public/robots.txt, apps/web/public/sitemap.xml, apps/web/public/sitemap-templates.xml, apps/web/public/sitemap_index.xml | Added robots.txt, sitemap.xml, sitemap-templates.xml, and a sitemap_index.xml referencing sitemap.xml and sitemap-templates.xml. |
| Sitemap Generation Scripts: apps/web/scripts/generate-sitemap.js, apps/web/scripts/generate-sitemap.ts | Added JS and TS scripts that fetch public workflow templates from the API, build sitemap XML (main, templates, index), write to dist/ and mirror to public/, and handle fetch errors/fallbacks. TS file exports three generator functions. |
| SEO Component: apps/web/src/components/GlobalSEO.tsx | Added useMemo hooks to compute canonical/og URLs from window.location, render <link rel="canonical">, <meta property="og:url">, and conditional hreflang alternates derived from window.ENV.HREFLANGS. |
| Routing & Redirects: apps/web/src/routes/index.tsx, apps/web/src/routes/redirects.tsx | Added WorkflowTemplateRedirect component; changed /app/:shareId to use that redirect; introduced new route /workflow-template/:shareId rendering WorkflowAppPage; removed old /workflow-template route. |
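For orientation, the XML-building step summarized in the table could look roughly like the sketch below. The function and type names (`SitemapEntry`, `escapeXml`, `buildSitemapXml`, `buildSitemapIndexXml`) are assumptions made for illustration and are not taken from generate-sitemap.ts; only the element names and namespace come from the sitemaps.org protocol.

```typescript
// Hypothetical entry shape; the PR's real types are not shown on this page.
interface SitemapEntry {
  loc: string;     // absolute URL of the page
  lastmod?: Date;  // optional last-modified date
}

const SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

// Escape the five characters that are special in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// YYYY-MM-DD (UTC), one of the W3C date formats the protocol accepts for <lastmod>.
function toW3cDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Build a <urlset> document from a list of entries.
function buildSitemapXml(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? "<lastmod>" + toW3cDate(entry.lastmod) + "</lastmod>"
      : "";
    return "  <url><loc>" + escapeXml(entry.loc) + "</loc>" + lastmod + "</url>";
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="' + SITEMAP_NS + '">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

// Build a <sitemapindex> document that points at the individual sitemap files.
function buildSitemapIndexXml(sitemapUrls: string[]): string {
  const items = sitemapUrls.map(
    (url) => "  <sitemap><loc>" + escapeXml(url) + "</loc></sitemap>",
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="' + SITEMAP_NS + '">',
    ...items,
    "</sitemapindex>",
  ].join("\n");
}
```

Escaping matters here because share URLs may carry query strings, and a raw `&` inside `<loc>` makes the file invalid XML.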

Sequence Diagram(s)

sequenceDiagram
    participant Build as Build Process
    participant Script as generate-sitemap (JS/TS)
    participant API as Public API
    participant FS as File System
    participant Web as Web Server

    Build->>Script: npm run generate:sitemap
    Script->>API: GET /v1/template/list?scope=public&page=1&pageSize=1000
    alt API returns templates
        API-->>Script: templates list
        Script->>Script: map templates -> sitemap entries
    else API error or empty
        API-->>Script: error / empty
        Script->>Script: use fallback template entry
    end
    Script->>FS: write sitemap.xml, sitemap-templates.xml, sitemap_index.xml to dist/
    Script->>FS: mirror files to public/ (dev)
    FS-->>Script: files written
    Script-->>Build: generation complete
    Web->>Web: serve sitemaps & robots.txt to crawlers
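The fetch-or-fallback branch in the diagram can be sketched as follows. This is an illustration under assumptions, not the script from the PR: `TemplateSummary`, `FALLBACK_TEMPLATES`, `withFallback`, and `loadTemplates` are hypothetical names, the fallback entry is a placeholder, and the fetcher is injected so the sketch does not guess at the API's response shape.

```typescript
// Hypothetical minimal view of a public template; the real API shape is not shown here.
interface TemplateSummary {
  shareId: string;
  updatedAt?: string;
}

// Placeholder fallback entry (assumed); used so the templates sitemap is never empty.
const FALLBACK_TEMPLATES: TemplateSummary[] = [{ shareId: "example-template" }];

// Pure decision step: use fetched templates if there are any, otherwise the fallback.
function withFallback(fetched: TemplateSummary[] | null): TemplateSummary[] {
  return fetched !== null && fetched.length > 0 ? fetched : FALLBACK_TEMPLATES;
}

// Wraps an injected fetcher so that a rejected request degrades to the fallback
// instead of failing the build.
async function loadTemplates(
  fetchTemplates: () => Promise<TemplateSummary[]>,
): Promise<TemplateSummary[]> {
  try {
    return withFallback(await fetchTemplates());
  } catch {
    return withFallback(null);
  }
}
```

Keeping the decision in a pure function makes the "API error or empty" branch easy to unit test without a network call.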

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay attention to: API fetch error handling and timeouts in generate-sitemap scripts
  • Verify XML generation conforms to sitemap schema and date formatting
  • Confirm build script ordering ensures sitemaps are generated after assets are available
  • Check WorkflowTemplateRedirect preserves search params and edge-case fallback to /workspace
  • Validate GlobalSEO use of window and ENV.HREFLANGS is server-safe (or guarded)
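On the last point, one way to keep canonical URL computation safe outside the browser is to pass the location in explicitly and tolerate its absence. The sketch below is an assumption about a reasonable approach, not the code in GlobalSEO.tsx; `LocationLike` and `getCanonicalUrl` are hypothetical names, and dropping the query string and hash is a common convention that the PR may or may not follow.

```typescript
// Hypothetical subset of window.location that the computation needs.
interface LocationLike {
  origin: string;
  pathname: string;
}

// Returns origin + pathname (query string and hash are dropped by assumption),
// or undefined when no location is available, e.g. during server rendering.
function getCanonicalUrl(loc: LocationLike | undefined): string | undefined {
  if (!loc) {
    return undefined;
  }
  return loc.origin + loc.pathname;
}

// A browser caller could guard the global like this:
//   const canonical = getCanonicalUrl(
//     typeof window === "undefined" ? undefined : window.location,
//   );
// and skip rendering <link rel="canonical"> when the result is undefined.
```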

Possibly related PRs

  • refly-ai/refly#1615 — Touches apps/web/src/routes/index.tsx and .../redirects.tsx, related routing/redirect changes.
  • refly-ai/refly#1665 — Modifies apps/web/src/components/GlobalSEO.tsx, related to SEO canonical/hreflang updates.
  • refly-ai/refly#1405 — Changes navigation around shareId-based routes; related to introducing /workflow-template/:shareId and redirects.

Suggested reviewers

  • lefarcen
  • mrcfps

Poem

🐰 I hopped through code and left a trail,

sitemaps sprout where crawlers sail.
Robots read my tiny rhyme,
Canonical hops land just in time.
Hreflangs sing across the vale.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description uses the repository template but lacks substantive content; it contains only template scaffolding with uncompleted sections and no actual summary of changes. | Add a detailed summary explaining the SEO changes, list any issue numbers being fixed, describe motivation/context, and specify which impact areas are affected. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately captures the main SEO-related changes: robots.txt, dynamic sitemaps, canonical/hreflang tags, and marketplace sitemap inclusion. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
✨ Finishing touches
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment
  • [ ] Commit unit tests in branch feat/seo-files

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63334fe488bc867f15cf9b0a8f31559f4aace00b and c9733f0a5ee130a054dbd71bb46822eb778fad45.

📒 Files selected for processing (1)
  • apps/web/public/sitemap_index.xml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/web/public/sitemap_index.xml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build / Build

coderabbitai[bot] avatar Dec 01 '25 13:12 coderabbitai[bot]