Architecture

System architecture of rimae/scan: component breakdown, data flow, and directory structure.


High-Level Architecture

                  +-------------------+
                  |     Browser       |
                  |  (React + Vite)   |
                  +--------+----------+
                           |
                    HTTPS / WSS
                           |
                  +--------v----------+
                  |      Caddy        |
                  |  (Reverse Proxy)  |
                  |  Automatic HTTPS  |
                  +--------+----------+
                           |
              +------------+------------+
              |                         |
     /api/*, /ws/*           All other paths
              |              (frontend embedded in binary)
              |
     +--------v------------------------------------------+
     |             rimae-scan (single Go binary)         |
     |                                                   |
     |  +-------------------+  +----------------------+  |
     |  |    Echo v4 API    |  |  robfig/cron v3      |  |
     |  |    HTTP handlers  |  | In-process scheduler |  |
     |  |    Port 8000      |  |  18 scheduled jobs   |  |
     |  +--------+----------+  +----------+-----------+  |
     |           |                        |              |
     +-----------+------------------------+--------------+
                 |                        |
        +--------+--------+     (scheduled tasks run
        |                 |      as goroutines against
   +----v------+    +-----v----+ the same DB pool)
   | PostgreSQL|    |  Redis   |
   |    16     |    |    7     |
   | Database  |    | Rate     |
   |           |    | Limiting |
   |           |    | + Cache  |
   +-----------+    +----------+

Component Breakdown

Caddy (Reverse Proxy)

Caddy sits in front of the Go binary. It terminates TLS via Let's Encrypt and forwards all traffic to port 8000. The frontend is embedded in the Go binary -- Caddy does not serve static files.

rimae-scan (Single Go Binary)

The entire backend is a single Go binary compiled from cmd/server/main.go. It embeds the frontend static files, database migrations, and all application logic. There are no separate worker or beat processes.

Entry point: cmd/server/main.go -- parses flags (--port, --config, --migrate, --seed, --version), connects to PostgreSQL and Redis, wires the Echo v4 router, starts the in-process cron scheduler, and listens on the configured port (default 8000).
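The flag handling described above can be sketched with the standard library flag package. This is a minimal illustration, not the project's actual main.go: the options struct, the parseFlags helper, the flag types, and every default other than port 8000 are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the flags listed above. Field types and all defaults
// except the port are assumptions made for this sketch.
type options struct {
	Port    int
	Config  string
	Migrate bool
	Seed    bool
	Version bool
}

// parseFlags parses args (without the program name) into options.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("rimae-scan", flag.ContinueOnError)
	fs.IntVar(&o.Port, "port", 8000, "HTTP listen port")
	fs.StringVar(&o.Config, "config", "", "path to the config file")
	fs.BoolVar(&o.Migrate, "migrate", false, "run database migrations")
	fs.BoolVar(&o.Seed, "seed", false, "load seed data")
	fs.BoolVar(&o.Version, "version", false, "print version and exit")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Go's flag package accepts both -port and --port, so the double-dash spelling used in this document works unchanged.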

Graceful shutdown: The binary traps SIGINT/SIGTERM, drains HTTP connections (15-second timeout), stops the cron scheduler, and closes the database pool.

Echo v4 API Server

The HTTP layer uses Echo v4, configured in internal/api/router.go.

Middleware stack (applied in order):

  1. Recover -- Panic recovery (Echo built-in)
  2. SecurityHeaders -- X-Content-Type-Options, X-Frame-Options, Content-Security-Policy, Referrer-Policy, Permissions-Policy, HSTS
  3. RateLimit -- Token-bucket rate limiting backed by Redis. Per-IP for auth endpoints, per-user for writes and exports
  4. CSRF -- Cross-site request forgery protection
  5. Gzip -- Compresses responses larger than 1 KB (Echo built-in)
  6. CORS -- Allows requests from the configured domain and localhost:3000 (development)

API route groups:

Package                Prefix                    Description
handlers.Health        /api/health               Health check -- status, version, component connectivity
handlers.Auth          /api/auth                 Login, setup wizard, logout, token refresh, API key management
handlers.Users         /api/users                User CRUD and role assignment
handlers.Assets        /api/assets               Asset inventory (hosts, OS packages, IPs, criticality)
handlers.Cves          /api/cves                 CVE explorer with filtering, sorting, detail views
handlers.Advisories    /api/advisories           OS-specific security advisories (USN, ALSA, DSA, VMSA)
handlers.Apps          /api/config/app-configs   Application instance configuration and probing
handlers.VulnMatches   /api/vuln-matches         Vulnerability-to-asset correlation results
handlers.Summary       /api/summary              Dashboard statistics, trends, top CVEs
handlers.Github        /api/github               GitHub repositories, dependency manifests, scan results
handlers.Export        /api/export               Report generation (CSV, JSON, HTML, PDF, XLSX)
handlers.SBOM          /api/export/sbom          CycloneDX and SPDX SBOM export
handlers.ScanRuns      /api/scan-runs            Scan execution history and manual trigger
handlers.Alerts        /api/alerts               Alert management
handlers.SiemAlerts    /api/siem-alerts          SIEM alert correlation and status management
handlers.Audit         /api/audit-log            Audit log (admin only)
handlers.Branding      /api/branding             White-label branding configuration
handlers.ConfigOther   /api/config/os-versions   OS version configuration and onboarding agent
handlers.VulnSources   /api/config/vuln-sources  Vulnerability source configuration and crawl triggers
handlers.Platforms     /api/config/platforms     Integration platform configuration (SIEM, scanner, ticketing, etc.)
handlers.SLA           /api/config/sla-policies  SLA policy configuration and overdue tracking
handlers.DockerImages  /api/docker-images        Container image inventory
handlers.System        /api/system               System administration (status, force-correlate, maintenance mode)

robfig/cron v3 Scheduler

The scheduler runs in-process within the Go binary and is initialized in internal/scheduler/scheduler.go. It runs 18 periodic jobs as goroutines that share the database connection pool. Task scheduling does not depend on Redis.

Scheduled jobs:

Job                      Schedule                 Description
integration-sync         Every 5 minutes          Sync integration platform configurations
inventory-sync           Every 15 minutes         Collect asset data from Wazuh, HTTP probes, Docker, Ceph
crawl-all                Hourly                   Fetch data from 70+ vulnerability sources
full-correlation         Daily (06:00 UTC)        Run the 7-step correlation engine
retention-cleanup        Daily (04:00 UTC)        Data retention cleanup and maintenance
github-org-scan          Every 12 hours           Scan GitHub organization repositories and manifests
agent-version-discovery  Every 30 minutes         LLM agent version detection for new OS releases
sla-breach-check         Daily (midnight UTC)     Check for SLA policy breaches
ticket-create            Hourly (at :05)          Create tickets in configured ticketing platforms
ticket-sync              Every 30 minutes         Sync ticket status from external platforms
shared-asset-sync        Every 15 minutes         Sync asset data to shared schema
shared-vuln-summary      Every 6 hours            Update shared vulnerability summary
threat-intel-sync        Every 4 hours            Pull indicators from threat intelligence platforms
threat-intel-enrich      Hourly (at :15)          Enrich CVEs with threat intelligence data
threat-intel-push        Every 6 hours (at :45)   Push indicators to threat intel platforms
threat-intel-reenrich    Every 12 hours (at :30)  Re-enrich stale threat intel data
notification-dispatch    Every 2 minutes          Dispatch pending notifications (Slack, webhook)
scanner-sync             Every 6 hours            Sync findings from external scanners

Each job runs in its own goroutine with panic recovery. Job status (last run, next run, error state) is tracked in memory and exposed via /api/system/status.
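Panic recovery with in-memory status tracking might look like the following sketch. Tracker, JobStatus, and Wrap are hypothetical names, and next-run tracking is omitted. Wrap returns a plain func(), which is the shape that robfig/cron's AddFunc accepts, so the sketch itself needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// JobStatus is a hypothetical shape for the per-job state described above.
type JobStatus struct {
	LastRun time.Time
	Err     error
}

// Tracker keeps job status in memory, guarded by a mutex because jobs
// run concurrently in their own goroutines.
type Tracker struct {
	mu     sync.Mutex
	status map[string]JobStatus
}

func NewTracker() *Tracker { return &Tracker{status: make(map[string]JobStatus)} }

// Wrap returns a func() that runs job, converts a panic into an error,
// and records the outcome under name.
func (t *Tracker) Wrap(name string, job func() error) func() {
	return func() {
		var err error
		func() {
			defer func() {
				if r := recover(); r != nil {
					err = fmt.Errorf("panic: %v", r)
				}
			}()
			err = job()
		}()
		t.mu.Lock()
		t.status[name] = JobStatus{LastRun: time.Now(), Err: err}
		t.mu.Unlock()
	}
}

// Status returns the last recorded status for name.
func (t *Tracker) Status(name string) (JobStatus, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s, ok := t.status[name]
	return s, ok
}

func main() {
	t := NewTracker()
	t.Wrap("crawl-all", func() error { panic("feed unreachable") })()
	t.Wrap("ticket-sync", func() error { return errors.New("timeout") })()
	s, _ := t.Status("crawl-all")
	fmt.Println(s.Err)
}
```

Because the recover call sits inside the wrapper, one failing job cannot take down the scheduler or the API server that shares its process.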

PostgreSQL 16

Primary data store. Key tables:

Table                 Description
assets                Host inventory (hostname, OS, IP addresses, Wazuh agent ID, criticality)
os_packages           Installed OS packages per asset
app_instances         Detected application instances per asset
app_configs           Application definitions (probe methods, CPE strings, advisory sources)
infra_components      Infrastructure components (Ceph daemons, Docker engines)
docker_images         Container images found on assets
cves                  CVE records with CVSS, EPSS, KEV, and exploit metadata
advisories            OS-specific security advisories
advisory_packages     Package-level fix information within advisories
ecosystem_advisories  Language ecosystem advisories (npm, PyPI, Go, Rust, etc.)
vuln_matches          Correlation results linking assets to vulnerabilities
vuln_source_configs   Configuration for each vulnerability data source
os_version_configs    Tracked OS distributions and versions
scan_runs             History of crawl, correlation, and probe runs
users                 User accounts with roles (admin, analyst, read_only)
secrets               Encrypted storage for integration API keys (AES-256-GCM)
sla_policies          SLA breach thresholds by severity
integration_configs   External platform configurations (SIEM, scanner, ticketing, etc.)
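The secrets table stores values encrypted with AES-256-GCM. The sketch below shows that primitive using only the Go standard library. It assumes a ready 32-byte key (key derivation is out of scope here) and a nonce-prefixed layout; seal, open, and that layout are illustrative choices, not the project's confirmed storage format.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext || tag.
// The nonce-prefixed layout is an assumption of this sketch.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the data was modified.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed value too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, _ := seal(key, []byte("example-api-key"))
	plain, err := open(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM authenticates as well as encrypts, so a tampered row fails to decrypt instead of yielding a corrupted API key.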

Redis 7

Two purposes:

  1. Rate limiting -- Fixed-window and token-bucket counters for API rate limiting
  2. Caching -- Transient cache for frequently accessed data

Redis is NOT used as a message broker or task queue. All task scheduling is handled by the in-process robfig/cron scheduler.


Data Flow

1. Vulnerability Ingestion

External Sources (NVD, MITRE, GHSA, vendor feeds, ...)
        |
        v
  [robfig/cron: crawl-all job]
        |
        v
  Crawler packages (internal/crawler/*)
        |
        v
  PostgreSQL (cves, advisories, ecosystem_advisories)

Crawlers are organized by category under internal/crawler/:

Category        Examples
cve/            NVD, MITRE, CIRCL, CISA KEV, EPSS, ExploitDB, Metasploit, Nuclei, OSV, VulnCheck
csaf/           CISA, Red Hat, Microsoft, Cisco, Siemens, CERT-Bund, NCSC-NL
ecosystem/      GHSA, PyPA, RustSec, npm, Go, RubyGems, Composer, Sonatype, OSSF Malicious Packages
vendor/         Red Hat, Debian DSA, Broadcom/VMware, HashiCorp, Grafana, Jenkins, Kubernetes, Ceph, ISC
os/             Ubuntu USN, AlmaLinux errata, Proxmox, ESXi VMSA
oval/           OVAL vulnerability definitions
national_cert/  US-CERT, BSI, ANSSI, NCSC-UK, JPCERT, JVN, ENISA, NCSC-NL, ACSC, CCN-CERT
threat_intel/   GreyNoise, OTX, ThreatFox, SANS ISC
supply_chain/   deps.dev, OpenSSF Scorecard
container/      Grype DB, Trivy DB
weakness/       CWE, CAPEC, MITRE ATT&CK

Each crawler implements the interface defined in internal/crawler/base.go and is registered via internal/crawlereg/.

2. Asset Inventory Collection

Infrastructure (Wazuh, HTTP endpoints, Ceph, Docker)
        |
        v
  [robfig/cron: inventory-sync job]
        |
        v
  Inventory probes (internal/inventory/*)
        |
        v
  PostgreSQL (assets, os_packages, app_instances, infra_components, docker_images)

3. GitHub Scanning

GitHub API (organization repos)
        |
        v
  [robfig/cron: github-org-scan job]
        |
        v
  Org scanner + manifest parsers (internal/github/*)
        |
        v
  PostgreSQL (github repos, dependencies)

Supported manifest formats: package-lock.json, yarn.lock, pnpm-lock.yaml, Pipfile.lock, poetry.lock, go.mod, go.sum, requirements.txt, Cargo.toml, Cargo.lock, Gemfile.lock, composer.lock, pom.xml, build.gradle.
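A scanner first has to decide which parser applies to a file. One simple way to do that is a lookup on the file's base name, sketched below. The ecosystemFor helper and the ecosystem labels (which follow common OSV-style names) are assumptions; how internal/github/ actually dispatches is not described in this document.

```go
package main

import (
	"fmt"
	"path"
)

// manifestEcosystems maps each supported manifest file name to the package
// ecosystem it belongs to. The labels are illustrative.
var manifestEcosystems = map[string]string{
	"package-lock.json": "npm",
	"yarn.lock":         "npm",
	"pnpm-lock.yaml":    "npm",
	"Pipfile.lock":      "PyPI",
	"poetry.lock":       "PyPI",
	"requirements.txt":  "PyPI",
	"go.mod":            "Go",
	"go.sum":            "Go",
	"Cargo.toml":        "crates.io",
	"Cargo.lock":        "crates.io",
	"Gemfile.lock":      "RubyGems",
	"composer.lock":     "Packagist",
	"pom.xml":           "Maven",
	"build.gradle":      "Maven",
}

// ecosystemFor reports the ecosystem for a repository-relative path, or
// false when the file is not a supported manifest.
func ecosystemFor(repoPath string) (string, bool) {
	eco, ok := manifestEcosystems[path.Base(repoPath)]
	return eco, ok
}

func main() {
	for _, p := range []string{"services/api/go.mod", "web/yarn.lock", "README.md"} {
		eco, ok := ecosystemFor(p)
		fmt.Println(p, eco, ok)
	}
}
```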

4. Correlation

  [robfig/cron: full-correlation job]
        |
        v
  Correlation Engine (internal/correlation/)
  7-step pipeline:
    1. Load asset inventory
    2. Load vulnerability data
    3. Match packages against advisories
    4. Match apps against CVEs/CPEs
    5. Score matches (composite: 7 weighted signals)
    6. Deduplicate results
    7. Write/update vuln_match records
        |
        v
  PostgreSQL (vuln_matches)

The scoring engine uses pluggable signal modules, each implementing signals.Signal defined in internal/correlation/signals/signal.go:

Signal                     File                  Description
CVSS Base Score            cvss.go               CVSS v3.1 and v4 base scores
EPSS Probability           epss.go               EPSS probability score
KEV Boost                  kev.go                CISA Known Exploited Vulnerabilities catalog (with ransomware boost)
Exploit Availability       exploit.go            ExploitDB, Metasploit, Nuclei, XDB availability
SIEM Correlation           siem.go               Active exploitation signals from SIEM alerts
Asset Criticality          asset_criticality.go  Business-criticality weighting of the affected asset
ThreatIntel Corroboration  threatintel.go        Corroboration from threat intelligence platform indicators

Each signal returns a value in the 0-10 range. The CompositeScorer in internal/correlation/scoring.go computes a weighted average across all active signals. Signal weights can be overridden per vulnerability source via the database.

5. Integration Sync

External Platforms (Jira, Slack, MISP, Qualys, Tenable, ...)
        |
        v
  [robfig/cron: integration-sync, ticket-*, threat-intel-*, scanner-sync, notification-dispatch]
        |
        v
  Orchestrators (internal/integrations/)
        |
        v
  Target adapters (internal/integrations/targets/)
        |
        v
  PostgreSQL (integration_configs, siem alerts, tickets)

Integration target adapters live in internal/integrations/targets/:

  • Ticketing: Jira, Zendesk, Frappe Helpdesk
  • Threat Intelligence: MISP, OpenCTI, VirusTotal, Vulners, Yeti
  • Scanners: Qualys, Tenable, Tenable.sc, Snyk, Shodan
  • Notifications: Slack, Webhook

Each adapter implements the interface in internal/integrations/targets/base.go and is registered via the factory in internal/integrations/targets/factory.go.
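A factory of this kind typically maps a platform name to a constructor that validates stored configuration. The sketch below shows the idea. Adapter, Constructor, Factory, and webhookAdapter are hypothetical; the actual interface in base.go and the factory API are not shown in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// Adapter is a hypothetical stand-in for the interface in
// internal/integrations/targets/base.go.
type Adapter interface {
	Category() string
	Deliver(message string) error
}

// Constructor builds an adapter from its stored configuration.
type Constructor func(cfg map[string]string) (Adapter, error)

// Factory resolves a platform name to a constructor.
type Factory struct{ ctors map[string]Constructor }

func NewFactory() *Factory { return &Factory{ctors: make(map[string]Constructor)} }

// Register makes a platform available under the given name.
func (f *Factory) Register(platform string, c Constructor) { f.ctors[platform] = c }

// Build constructs the adapter for platform, or fails for unknown names.
func (f *Factory) Build(platform string, cfg map[string]string) (Adapter, error) {
	c, ok := f.ctors[platform]
	if !ok {
		return nil, fmt.Errorf("unknown integration platform %q", platform)
	}
	return c(cfg)
}

// webhookAdapter records messages instead of sending them, so the sketch
// runs without network access.
type webhookAdapter struct {
	url  string
	sent []string
}

func (w *webhookAdapter) Category() string { return "notification" }

func (w *webhookAdapter) Deliver(message string) error {
	w.sent = append(w.sent, message)
	return nil
}

func newWebhook(cfg map[string]string) (Adapter, error) {
	if cfg["url"] == "" {
		return nil, errors.New("webhook: url is required")
	}
	return &webhookAdapter{url: cfg["url"]}, nil
}

func main() {
	f := NewFactory()
	f.Register("webhook", newWebhook)
	a, err := f.Build("webhook", map[string]string{"url": "https://example.invalid/hook"})
	fmt.Println(a.Category(), err)
	_, err = f.Build("unregistered", nil)
	fmt.Println(err)
}
```

Validating configuration inside the constructor means a misconfigured platform fails when the adapter is built, before any sync job tries to use it.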

6. User Interaction

  Browser (React SPA)
        |
        v
  Caddy -> Echo v4 API (single Go binary)
        |
        v
  PostgreSQL (reads) / Scheduler (triggers scans on demand)

Directory Structure

Source Layout

cmd/
  server/
    main.go              # Application entry point
    embed.go             # Embedded frontend + migrations via go:embed
    frontend/            # Built React static files
    migrations/          # SQL migration files (golang-migrate)

internal/
  agents/                # LLM-powered agents (source discovery, version onboarding)
  api/                   # Echo v4 application setup
    router.go            # Route registration and middleware wiring
    handlers/            # HTTP handler structs per resource
    middleware/          # Auth, CSRF, rate limiting, security headers
  auth/                  # Password hashing (argon2id), JWT tokens, revocation
  config/                # Application configuration loading and seed data
  correlation/           # 7-step correlation engine
    scoring.go           # Composite risk scorer
    signals/             # 7 pluggable scoring signals
  crawler/               # 70+ vulnerability source crawlers
    base.go              # Base crawler interface
    cve/                 # CVE databases (NVD, MITRE, CIRCL, etc.)
    csaf/                # CSAF advisory feeds
    ecosystem/           # Language ecosystem advisories
    vendor/              # Vendor-specific advisories
    os/                  # OS advisory feeds
    oval/                # OVAL vulnerability definitions
    national_cert/       # National CERT advisories
    threat_intel/        # Threat intelligence feeds
    supply_chain/        # Supply chain data sources
    container/           # Container vulnerability databases
    weakness/            # CWE, CAPEC, ATT&CK
  crawlereg/             # Crawler registry
  crypto/                # AES-256-GCM encryption with Argon2id KDF
  db/                    # sqlc-generated queries + models, connection pool, migrations
  export/                # Report generation
    csv.go               # CSV export
    json.go              # JSON export
    html.go              # HTML export (html/template)
    pdf.go               # PDF export (maroto v2)
    xlsx.go              # XLSX spreadsheet export
    cyclonedx.go         # CycloneDX SBOM export
    spdx.go              # SPDX SBOM export
  github/                # GitHub org scanning, manifest parsing, upstream resolution
  integrations/          # Integration orchestrators
    targets/             # Target adapters (SIEM, scanner, threat_intel, ticketing, notification)
  inventory/             # Asset inventory (Wazuh, HTTP probes, Docker, Ceph)
  redis/                 # Redis client for rate limiting and caching
  scheduler/             # robfig/cron job definitions
  shared/                # Shared utilities (asset sync, vuln summary)

frontend/                # React + Vite + Tailwind CSS frontend
  src/
    views/               # 25 page components
    components/          # Shared UI components
    ui/                  # Primitive UI components (Badge, Skeleton, etc.)
    auth/                # AuthProvider, ProtectedRoute, PublicRoute
    hooks/               # useApi, usePaginatedQuery, useApiMutation
    branding/            # White-label branding system
  dist/                  # Built static files (embedded in the Go binary; Caddy does not serve them)

scripts/                 # Operational scripts
  install.sh             # Full interactive installer
  upgrade.sh             # Upgrade script
  uninstall.sh           # Uninstaller with confirmation prompts
  backup.sh              # Backup (database, config, certificates)
  health-check.sh        # Comprehensive health checker
  seed-defaults.sh       # Default data seeder

systemd/                 # Systemd unit files
packaging/               # Package build files
  debian/                # deb package control files
  rpm/                   # RPM spec file

Installation Directories

/usr/bin/rimae-scan-server          # API server binary (frontend embedded)
/usr/bin/rimae-scan-scheduler       # Scheduler binary
/usr/lib/rimae-scan/workers/        # Worker binaries (crawlers, sync, reports)

/etc/rimae-scan/                    # Configuration
  rimae-scan.conf                   # Main config file (env vars)
  compliance/factory/               # Signed compliance catalogs

/var/lib/rimae-scan/                # Persistent data
  git_cache/                        # Cloned GitHub repositories
  grype_db/                         # Grype vulnerability database cache
  trivy_db/                         # Trivy vulnerability database cache
  export_tmp/                       # Temporary export file staging

/var/log/rimae-scan/                # Log files (scripts output)
/var/backups/rimae-scan/            # Backup archives

Technology Stack

Layer                  Technology
Language               Go 1.26
Web framework          Echo v4
Database driver        pgx/v5 (pgxpool)
Query generation       sqlc
Database               PostgreSQL 16
Migrations             golang-migrate (embedded via go:embed)
Scheduler              robfig/cron v3 (in-process)
Auth (JWT)             golang-jwt/jwt/v5
Auth (passwords)       argon2id
Auth (LDAP)            go-ldap/ldap/v3
Encryption             AES-256-GCM + Argon2id KDF
HTTP client            net/http (stdlib)
Serialization          encoding/json (stdlib)
Templating             html/template (stdlib)
XML parsing            encoding/xml (stdlib)
PDF generation         maroto v2
SBOM formats           CycloneDX, SPDX
Cache / Rate limiting  Redis 7
Frontend               React, Vite, Tailwind CSS
Reverse proxy          Caddy 2
Logging                log/slog (stdlib, JSON output)

Extensibility

rimae/scan uses Go interfaces for extensibility. Custom crawlers and integration adapters implement the corresponding interface and register with the application; export formats are added as individual Go files in internal/export/.

Custom Crawlers

Crawlers implement the interface in internal/crawler/base.go and register in internal/crawlereg/. The interface defines methods for fetching, parsing, and persisting vulnerability data from a source.

Custom Exporters

Export formats are individual Go files in internal/export/. Each exporter writes a specific format (CSV, JSON, HTML, PDF, XLSX, CycloneDX, SPDX) given a set of vulnerability match data.

Integration Adapters

Integration targets implement the adapter interface in internal/integrations/targets/base.go. New adapters for SIEM, scanner, threat intelligence, ticketing, or notification platforms register via the factory in internal/integrations/targets/factory.go.