DocHub
Express application structure, middleware stack, three-tier model, and service layer

Architecture Overview

DocHub is an Express.js application that serves documentation in three tiers: visual HTML overviews, rendered markdown pages, and JSON API manifests. It runs on port 3002 alongside the CMS (ports 3000/3001) on the same DigitalOcean droplet.

Three-Tier Model

| Tier | Route | Handler | Content Source |
|---|---|---|---|
| Tier 1: Overviews | /overview/* | routes/overview.ts | Self-contained HTML files in content/_overviews/ |
| Tier 2: Docs | /docs/* | routes/docs.ts | Markdown files in content/{project}/{subproject}/ |
| Tier 3: API | /api/* | routes/api.ts | JSON manifests generated from metadata + raw markdown |

Tier 1 pages are visual summaries with flow diagrams and architecture maps. They link down to Tier 2 for developer detail. Tier 3 serves the same content as machine-readable JSON for Claude CLI.

Application Entry Point

The server is defined in src/index.ts. It configures middleware in this order:

  1. Trust proxy: app.set('trust proxy', 1) so Express trusts nginx’s X-Forwarded-Proto header when setting secure cookies
  2. Helmet — CSP headers allowing 'unsafe-inline' for styles (needed for inline CSS in templates) and Google Fonts
  3. Morgan — HTTP request logging in combined format
  4. Body parsers — JSON and URL-encoded with default limits
  5. Session — PostgreSQL-backed via connect-pg-simple, shared with CMS through .ipnoelp.com cookie domain
  6. Passport — Google OAuth initialization
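
The ordering above can be sketched roughly as follows. This is an illustrative configuration, not the actual contents of src/index.ts: the DATABASE_URL and SESSION_SECRET variable names, the exact CSP directive lists, and every cookie option other than the domain are assumptions. It needs the express, helmet, morgan, express-session, connect-pg-simple, passport, and pg packages installed.

```typescript
import express from "express";
import helmet from "helmet";
import morgan from "morgan";
import session from "express-session";
import connectPgSimple from "connect-pg-simple";
import passport from "passport";
import { Pool } from "pg";

const app = express();
// Assumption: the connection string is supplied via DATABASE_URL.
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const PgSession = connectPgSimple(session);

// 1. Trust the first proxy hop (nginx) so req.secure reflects X-Forwarded-Proto.
app.set("trust proxy", 1);

// 2. Helmet with a CSP that permits inline styles and Google Fonts.
app.use(
  helmet({
    contentSecurityPolicy: {
      directives: {
        defaultSrc: ["'self'"],
        styleSrc: ["'self'", "'unsafe-inline'", "https://fonts.googleapis.com"],
        fontSrc: ["'self'", "https://fonts.gstatic.com"],
      },
    },
  })
);

// 3. Request logging in combined format.
app.use(morgan("combined"));

// 4. Body parsers with default size limits.
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// 5. PostgreSQL-backed session; the cookie domain lets the CMS share it.
app.use(
  session({
    store: new PgSession({ pool, tableName: "session" }),
    secret: process.env.SESSION_SECRET ?? "dev-only-secret", // assumption
    resave: false,
    saveUninitialized: false,
    cookie: {
      domain: ".ipnoelp.com",
      secure: process.env.NODE_ENV === "production", // assumption
    },
  })
);

// 6. Passport must come after the session middleware it depends on.
app.use(passport.initialize());
app.use(passport.session());
```

The session middleware sits after trust proxy because express-session only sends a secure cookie when it considers the connection secure, which behind nginx depends on the forwarded protocol header being trusted.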

Route Registration

Routes are mounted in this order:

| Mount Point | Auth | Handler | Purpose |
|---|---|---|---|
| /auth | Public | routes/auth.ts | Google OAuth login/logout/callback |
| /api/health | Public | routes/health.ts | Health check with DB status |
| /overview/* | Protected | routes/overview.ts | Visual HTML overview pages (Tier 1) |
| /api/* | Protected | routes/api.ts | Manifest, raw, search, report endpoints (Tier 3) |
| /docs/* | Protected | routes/docs.ts | Rendered markdown documentation pages (Tier 2) |
| / | Protected | inline | Redirects to /overview/ (hub landing page) |
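
Mount order matters because /api/health is registered before the protected /api/* router, which is what keeps the health check public. The sketch below is a simplified model of that precedence, not Express's actual matching algorithm; the Mount type and resolveMount helper are hypothetical names introduced for illustration.

```typescript
// Simplified model of the mount table above (not Express's real matcher).
type Auth = "public" | "protected";

interface Mount {
  prefix: string;
  auth: Auth;
  handler: string;
}

// Same order as the route registration table.
const mounts: Mount[] = [
  { prefix: "/auth", auth: "public", handler: "routes/auth.ts" },
  { prefix: "/api/health", auth: "public", handler: "routes/health.ts" },
  { prefix: "/overview", auth: "protected", handler: "routes/overview.ts" },
  { prefix: "/api", auth: "protected", handler: "routes/api.ts" },
  { prefix: "/docs", auth: "protected", handler: "routes/docs.ts" },
  { prefix: "/", auth: "protected", handler: "inline" },
];

// Return the first mount whose prefix matches on a path-segment boundary.
// The root mount "/" only matches the exact path "/".
function resolveMount(path: string): Mount | undefined {
  return mounts.find((m) =>
    m.prefix === "/"
      ? path === "/"
      : path === m.prefix || path.startsWith(m.prefix + "/")
  );
}
```

In this model, swapping the /api/health and /api entries would make the health check resolve to the protected router instead.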

Auth is conditional: if GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are not set, all routes bypass authentication (dev mode).
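
A minimal sketch of that conditional check, assuming an empty string counts as "not set". The isAuthEnabled and mayProceed helpers are hypothetical; the environment is passed in as a plain object so the logic can be exercised without touching the real process environment.

```typescript
// Environment passed explicitly for testability; the app would supply its real env.
type Env = Record<string, string | undefined>;

// Auth is enabled only when both OAuth credentials are present and non-empty.
function isAuthEnabled(env: Env): boolean {
  return Boolean(env.GOOGLE_CLIENT_ID) && Boolean(env.GOOGLE_CLIENT_SECRET);
}

// Hypothetical guard: in dev mode every request proceeds,
// otherwise only authenticated requests do.
function mayProceed(env: Env, isAuthenticated: boolean): boolean {
  return !isAuthEnabled(env) || isAuthenticated;
}
```

Setting only one of the two variables leaves auth disabled in this sketch, which mirrors the "both must be set" wording above.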

Service Layer

Three services do the real work:

| Service | File | Singleton | Purpose |
|---|---|---|---|
| MarkdownService | services/MarkdownService.ts | markdownService | Parses frontmatter, renders markdown to HTML, builds nav trees, full-text search |
| ManifestService | services/ManifestService.ts | manifestService | Generates JSON manifests from _project.yaml and _subproject.yaml metadata |
| TemplateService | services/TemplateService.ts | templateService | Wraps rendered content in HTML layout with header, sidebar, breadcrumbs, and TOC |

All three are instantiated as module-level singletons. MarkdownService and ManifestService accept an optional contentDir constructor argument (defaults to CONTENT_DIR env var or ./content). TemplateService reads styles.css from src/templates/ at construction time.
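
The singleton pattern and the contentDir fallback might look roughly like this. It is a sketch only: the class body is reduced to the constructor, and the second env parameter is a hypothetical addition that stands in for the process environment so the fallback can be demonstrated in isolation.

```typescript
type Env = Record<string, string | undefined>;

// Resolution order: explicit argument, then CONTENT_DIR, then ./content.
function resolveContentDir(contentDir?: string, env: Env = {}): string {
  return contentDir ?? (env.CONTENT_DIR || "./content");
}

class MarkdownService {
  readonly contentDir: string;

  // `env` is an illustrative parameter; real code would likely read the
  // process environment directly.
  constructor(contentDir?: string, env: Env = {}) {
    this.contentDir = resolveContentDir(contentDir, env);
  }
}

// Module-level singleton: every importer of this module shares one instance.
const markdownService = new MarkdownService();
```

Accepting contentDir as an optional argument keeps the shared singleton convenient for the running server while still letting scripts or tests construct a separate instance against a different directory.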

The overview route (routes/overview.ts) does NOT use these services — it serves self-contained HTML files directly from content/_overviews/.

Directory Structure

DocHub/
  src/
    index.ts                    # Express app, middleware, route mounting
    config/database.ts          # PostgreSQL pool (pg)
    middleware/auth.ts           # isAuthenticated, isAllowedDomain
    routes/
      auth.ts                   # Google OAuth + login page HTML
      health.ts                 # GET /api/health
      overview.ts               # Visual HTML overview pages (Tier 1)
      api.ts                    # Manifests, raw markdown, search, reports (Tier 3)
      docs.ts                   # Rendered HTML pages (Tier 2)
    services/
      MarkdownService.ts        # Markdown parsing and rendering
      ManifestService.ts        # JSON manifest generation
      TemplateService.ts        # HTML template rendering
    templates/
      styles.css                # All CSS for rendered Tier 2 pages
    scripts/
      build-site.ts             # Pre-render manifests + raw files to build/
      daily-agent.ts            # Claude API validation agent
  content/
    _master/index.md            # Hub landing page for /docs/
    _overviews/                 # Tier 1 self-contained HTML files
      architecture.html
      content-pipeline.html
      auth-deployment.html
    {project}/                  # Tier 2 markdown documentation
  build/                        # Pre-rendered output (gitignored)
  reports/                      # Daily agent JSON reports (gitignored)
  deploy/                       # nginx config, droplet setup script
  scripts/
    daily-run.sh                # Cron script: pull, build, agent, restart

API Endpoint Summary

DocHub exposes these endpoint groups (see API Endpoints docs for full request/response details):

| Group | Routes | Auth | Handler |
|---|---|---|---|
| Health | GET /api/health | Public | routes/health.ts |
| Auth | GET /auth/login, /auth/google, /auth/google/callback, /auth/logout, /auth/me | Public | routes/auth.ts |
| Overviews | GET /overview/, GET /overview/:id | Protected | routes/overview.ts |
| Manifests | GET /api/manifest, /api/manifest/:project, /api/manifest/:project/:subproject | Protected | routes/api.ts |
| Raw Content | GET /api/raw/:project/:subproject/:page | Protected | routes/api.ts |
| Search | GET /api/search?q=term | Protected | routes/api.ts |
| Reports | GET /api/report/latest, GET /report | Protected | routes/api.ts / inline |
| Docs | GET /docs/, /docs/:project/, /docs/:project/:subproject/, /docs/:project/:subproject/:page | Protected | routes/docs.ts |

All protected routes return 401 (API) or redirect to /auth/login (pages) when unauthenticated. In dev mode (no OAuth credentials), all routes are public.
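
The split between a 401 and a redirect can be illustrated with a small decision function. This sketch assumes "API" means any path under /api/ and treats everything else as a page; the Denial type and denyUnauthenticated helper are hypothetical names.

```typescript
// What an unauthenticated request to a protected route receives.
type Denial =
  | { kind: "status"; status: 401 }
  | { kind: "redirect"; location: "/auth/login" };

// Assumption: API routes are identified by the /api/ path prefix.
function denyUnauthenticated(path: string): Denial {
  return path.startsWith("/api/")
    ? { kind: "status", status: 401 }
    : { kind: "redirect", location: "/auth/login" };
}
```

Returning a status code rather than a redirect for API paths lets a non-browser client detect the failure directly instead of receiving the login page HTML.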

Data Structures

Session Table

The session table is the only database table DocHub uses. It is created automatically by connect-pg-simple:

| Column | Type | Purpose |
|---|---|---|
| sid | varchar (PK) | Session ID |
| sess | json | Serialized session data (Passport user object) |
| expire | timestamp | Session expiration time |

Internal Interfaces

| Interface | Service | Description |
|---|---|---|
| DocPage | MarkdownService | Parsed page: { title, summary, content (HTML), toc, references } |
| NavItem | MarkdownService | Navigation tree node: { label, path, children[], order } |
| ProjectMeta | ManifestService | Project: { id, name, description } |
| SubprojectMeta | ManifestService | Subproject: { id, name, description, doc_count } |
| AgentReport | daily-agent | Report: { generated_at, summary, projects[], issues[], recommendations[] } |
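
As TypeScript declarations, these shapes might look like the following. Field names come from the table; the field types, the TocEntry shape, and the sortNav helper (which assumes order is a numeric sort key) are assumptions for illustration and may not match the real source.

```typescript
// Hypothetical shape for one table-of-contents entry.
interface TocEntry {
  level: number;
  text: string;
  id: string;
}

interface DocPage {
  title: string;
  summary: string;
  content: string; // rendered HTML
  toc: TocEntry[];
  references: string[];
}

interface NavItem {
  label: string;
  path: string;
  children: NavItem[];
  order: number;
}

interface ProjectMeta {
  id: string;
  name: string;
  description: string;
}

interface SubprojectMeta extends ProjectMeta {
  doc_count: number;
}

// Sort a nav tree by `order` at every level without mutating the input.
function sortNav(items: NavItem[]): NavItem[] {
  return [...items]
    .sort((a, b) => a.order - b.order)
    .map((item) => ({ ...item, children: sortNav(item.children) }));
}
```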

There is no ORM. Apart from the session store, the only database access is a single SELECT 1 in the health check endpoint. Content is entirely filesystem-based.
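
A health check built around a single SELECT 1 could be sketched like this. The Queryable interface, the checkHealth helper, and the response fields are hypothetical; the real endpoint's response shape is not documented here.

```typescript
// Minimal database client shape needed by the check; anything with a
// promise-returning query(sql) method can be passed in.
interface Queryable {
  query(sql: string): Promise<unknown>;
}

// Hypothetical response shape.
interface HealthResult {
  status: "ok" | "degraded";
  db: "up" | "down";
}

// Run SELECT 1 and report whether the database answered.
async function checkHealth(db: Queryable): Promise<HealthResult> {
  try {
    await db.query("SELECT 1");
    return { status: "ok", db: "up" };
  } catch {
    // Any query failure is reported as the database being down.
    return { status: "degraded", db: "down" };
  }
}
```

Taking the client as a parameter means the check can be exercised with a stub object rather than a live PostgreSQL connection.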

Process Management

DocHub uses PM2 (installed globally on the droplet) for process management, independent from the CMS start/stop scripts. The CMS scripts explicitly avoid killing port 3002 so both systems coexist.

| Aspect | Detail |
|---|---|
| Process manager | PM2 (global install) |
| Start | pm2 start ecosystem.config.js |
| Restart | pm2 restart dochub |
| Persistence | pm2 startup && pm2 save (survives reboots) |
| Memory limit | 256MB (auto-restart when exceeded) |
| Logs | /var/log/dochub/out.log, /var/log/dochub/error.log |
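
The settings in the table map onto PM2 app options roughly as shown below. This is a typed sketch of the configuration object, not the project's actual ecosystem.config.js; the script path and the PORT environment entry are assumptions.

```typescript
// Subset of PM2 app options used in this sketch.
interface Pm2App {
  name: string;
  script: string;
  max_memory_restart: string;
  out_file: string;
  error_file: string;
  env: Record<string, string>;
}

const ecosystem: { apps: Pm2App[] } = {
  apps: [
    {
      name: "dochub", // the name used by `pm2 restart dochub`
      script: "dist/index.js", // assumption: compiled entry point
      max_memory_restart: "256M", // restart when memory use exceeds 256MB
      out_file: "/var/log/dochub/out.log",
      error_file: "/var/log/dochub/error.log",
      env: { PORT: "3002" }, // assumption: port supplied via environment
    },
  ],
};
```

In a plain ecosystem.config.js file, the same object would be assigned to module.exports so that pm2 start can load it.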