Troubleshooting

Common issues, diagnostics, and fixes for rimae/scan deployments.

Service Won't Start

Check the Configuration File

Verify that /etc/rimae-scan/rimae-scan.conf exists and contains valid values:

sudo cat /etc/rimae-scan/rimae-scan.conf

Required fields:

  • DATABASE_URL -- Must be a valid PostgreSQL pgx/v5 connection string
  • REDIS_URL -- Must point to a reachable Redis instance
  • SECRET_KEY -- Must be set (used for JWT signing)
  • DOMAIN -- Must be set (used for OIDC redirect URLs and TLS)
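The required-field check above can be scripted. The following is a minimal sketch, assuming the configuration file uses plain KEY=VALUE lines; the function name check_required_keys is hypothetical and not part of rimae/scan. It only checks that each key has a non-empty value, not that the value is valid.

```shell
# check_required_keys FILE
# Prints each required key that is missing or has an empty value.
# Returns 0 when all keys are set, 1 otherwise.
# Assumption: the file uses plain KEY=VALUE lines.
check_required_keys() {
  conf_file="$1"
  missing=0
  for key in DATABASE_URL REDIS_URL SECRET_KEY DOMAIN; do
    # Require at least one non-whitespace character after "KEY=".
    if ! grep -Eq "^[[:space:]]*${key}=[^[:space:]]" "$conf_file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as check_required_keys /etc/rimae-scan/rimae-scan.conf from a shell that can read the file (root may be required).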

Check the Database

Verify PostgreSQL is running and the database exists:

sudo systemctl status postgresql
psql -U rimae-scan -d rimae-scan -c "SELECT 1;"

If the database does not exist:

sudo -u postgres createuser rimae-scan
sudo -u postgres createdb -O rimae-scan rimae-scan

Check Redis

Verify Redis is running and reachable:

sudo systemctl status redis
redis-cli ping

Expected response: PONG

Check Service Logs

# rimae-scan service (API + scheduler)
sudo journalctl -u rimae-scan -n 100 --no-pager

# Caddy reverse proxy
sudo journalctl -u caddy -n 100 --no-pager

Common Startup Errors

| Error | Cause | Solution |
| --- | --- | --- |
| connection refused on port 5432 | PostgreSQL not running | sudo systemctl start postgresql |
| connection refused on port 6379 | Redis not running | sudo systemctl start redis |
| FATAL: password authentication failed | Wrong database credentials | Check DATABASE_URL in rimae-scan.conf |
| no such table / relation does not exist | Migrations not applied | Run rimae-scan migrate up |
| SECRET_KEY not set | Missing configuration | Set a random 64-character string in rimae-scan.conf |

Migration Failures

Database Migration Errors

# Apply pending migrations
rimae-scan migrate up

# Check current migration version
rimae-scan migrate version

If a migration fails partway through:

  1. Check the error message for the specific SQL that failed
  2. Verify the database schema matches the expected state:
    psql -U rimae-scan -d rimae-scan -c "\dt"
    
  3. If the database is in an inconsistent state, you may need to manually fix the schema and stamp the migration:
    rimae-scan migrate force <version>
    

Warning: Never manually edit the schema_migrations table unless you understand the migration dependency chain. Incorrect versions can cause future migrations to skip critical changes.


Crawl Failures

Source Not Crawling

  1. Verify the source is enabled in the Vulnerability Sources settings
  2. Check the last_crawl_status field -- if it shows error, check the worker logs
  3. Use the Test Connection button to verify the source URL is reachable
  4. For authenticated sources, verify the API key reference is valid

Trigger a Manual Crawl

# Via API
curl -X POST -H "Authorization: Bearer $TOKEN" \
  https://rimae-scan.example.com/api/scan-runs/trigger/<source-slug>

# Check scan run history
curl -H "Authorization: Bearer $TOKEN" \
  "https://rimae-scan.example.com/api/scan-runs/?source_slug=<source-slug>&page_size=5"

Common Crawl Errors

| Symptom | Cause | Solution |
| --- | --- | --- |
| last_crawl_status: error | Source URL unreachable or returning errors | Test the URL manually, check DNS and firewall rules |
| Crawl completes but record_count is 0 | Parser mismatch or feed format changed | Check the parser_type matches the actual feed format |
| Rate limiting (HTTP 429) | Crawling too frequently | Increase crawl_interval_minutes for the source |
| Authentication errors | Expired or invalid API key | Update the API key in source configuration |
| Scheduled task not running | Service not running | Check sudo systemctl status rimae-scan |

Performance Tuning

Database Performance

  • Slow queries: Check the PostgreSQL slow query log. rimae/scan's paginated list endpoints use COUNT(*) subqueries, which can be slow on very large tables.
  • Connection pooling: Ensure DATABASE_URL is a valid pgx/v5 connection string and that connection pooling is configured appropriately for your workload.
  • Vacuuming: PostgreSQL autovacuum should be enabled. For large tables (vuln_matches, cves), manual VACUUM ANALYZE may help after bulk data loads.

Redis Performance

  • Memory: Monitor Redis memory usage. Rate limiting data has TTLs and is self-cleaning.
  • Connection timeouts: The rate limit middleware uses a 1-second socket timeout. If Redis is overloaded, rate limiting falls back to an in-process counter.

Scheduler Performance

  • Cron jobs: The embedded robfig/cron scheduler runs all jobs as goroutines within the main process.
  • Concurrency: Each crawl task runs independently as a goroutine. Monitor via the system status endpoint.
  • Memory: Memory usage grows with large crawl results. If needed, restart the rimae-scan service to reclaim memory.

API Performance

  • Pagination: Always use pagination parameters (page, page_size) on list endpoints. The maximum page_size is 500.
  • Export endpoints: PDF exports can be resource-intensive for large data sets. Use filters to limit the data scope.
  • Rate limiting: Export endpoints have a lower per-user rate limit than standard endpoints to prevent resource exhaustion.

Log Locations and What to Look For

Changing Log Level

The log level can be changed at runtime without restarting the service:

# Via API (admin only)
curl -X PUT -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"log_level": "debug"}' \
  https://rimae-scan.example.com/api/system/log-level

# Or set in /etc/rimae-scan/rimae-scan.conf (requires restart)
LOG_LEVEL=debug

Valid levels: debug, info (default), warn, error.

Systemd Journal (Native Install)

# All rimae/scan log output
sudo journalctl -u rimae-scan --since "1 hour ago"

# Follow live logs
sudo journalctl -u rimae-scan -f

Log File (Native Install)

# Log file (rotated weekly or at 500MB)
tail -f /var/log/rimae-scan/rimae-scan.log

Docker (Docker Compose Install)

docker compose logs api --tail=100

Key Log Patterns

| Pattern | Meaning |
| --- | --- |
| HTTP 5xx | Server error -- check full traceback in logs |
| LDAP connection failed | LDAP server unreachable -- check network and TLS settings |
| OIDC discovery fetch failed | OIDC provider unreachable -- check OIDC_ISSUER_URL |
| endoflife.date lookup failed | External API timeout -- transient, will retry |
| NVD CPE lookup failed | NVD API timeout or rate limit -- transient |
| Anthropic API returned 4xx | Invalid API key or request -- check ANTHROPIC_API_KEY |
| Redis unavailable for rate limiting | Redis down -- rate limiting falls back to in-process counter |
| Database health check failed | PostgreSQL unreachable -- check connection string and service status |
| Onboarding failed for ... | Agent pipeline error -- check the full traceback |
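To see at a glance which of these patterns occur in a log excerpt, a small counting helper can be useful. This is a sketch only: the function name summarize_log_patterns is hypothetical, and it assumes the messages appear verbatim in the log lines. It covers only the fixed-string patterns from the list above.

```shell
# summarize_log_patterns
# Reads log text on stdin and prints "<count><TAB><pattern>" for every
# known pattern that appears on at least one line.
# Assumption: the messages appear verbatim in the log output.
summarize_log_patterns() {
  log_text=$(cat)
  for pattern in \
    "LDAP connection failed" \
    "OIDC discovery fetch failed" \
    "endoflife.date lookup failed" \
    "NVD CPE lookup failed" \
    "Redis unavailable for rate limiting" \
    "Database health check failed"; do
    # -F matches a fixed string, -c counts matching lines.
    # grep exits non-zero when the count is 0, hence "|| true".
    count=$(printf '%s\n' "$log_text" | grep -Fc -- "$pattern" || true)
    if [ "$count" -gt 0 ]; then
      printf '%s\t%s\n' "$count" "$pattern"
    fi
  done
}
```

For example: sudo journalctl -u rimae-scan --since "1 hour ago" --no-pager | summarize_log_patterns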

Health Check Interpretation

The /health endpoint provides a detailed status summary. It does not require authentication.

curl https://rimae-scan.example.com/health

Response:

{
  "status": "ok",
  "version": "0.1.0",
  "database": "connected",
  "redis": "connected",
  "scheduler_jobs": 8,
  "uptime_seconds": 86432.15
}

Status Values

| Status | Meaning | Action |
| --- | --- | --- |
| ok | All components healthy | No action needed |
| degraded | Redis is disconnected but database is up | Rate limiting falls back to in-process mode. Check Redis service. |
| unhealthy | Database is disconnected | Immediate action required. Check PostgreSQL service and connection string. |
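For monitoring, the three status values can be mapped to exit codes. The sketch below uses the exit-code convention of Nagios-compatible check plugins (0 OK, 1 warning, 2 critical, 3 unknown). The function name health_exit_code is hypothetical, and the sed-based extraction assumes the response contains a single "status" field, as in the sample response above.

```shell
# health_exit_code
# Reads the /health JSON on stdin and returns:
#   0 for "ok", 1 for "degraded", 2 for "unhealthy", 3 for anything else.
health_exit_code() {
  # Extract the value of the "status" field without requiring jq.
  status=$(sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
  case "$status" in
    ok) return 0 ;;
    degraded) return 1 ;;
    unhealthy) return 2 ;;
    *) return 3 ;;
  esac
}
```

Typical use is curl -sS https://rimae-scan.example.com/health | health_exit_code, followed by a check of $?. An empty or unparseable response, for example when the service is down, yields 3.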

System Status (Authenticated)

The GET /api/system/status endpoint requires authentication and reports additional detail, including the scheduler job count and the active task count:

| Field | Description |
| --- | --- |
| version | rimae/scan application version |
| db_connected | PostgreSQL reachability |
| redis_connected | Redis reachability (PING test) |
| scheduler_jobs | Number of registered cron jobs |
| active_tasks | Number of currently executing tasks across all workers |

Common Issues Quick Reference

Login Fails with Valid Credentials

  1. Check AUTH_MODE in rimae-scan.conf -- is it set to internal when the user was created via LDAP?
  2. For LDAP users: verify LDAP server reachability and user filter
  3. For OIDC users: verify the callback URL matches https://<DOMAIN>/login
  4. Check the rate limit -- after 10 failed login attempts per minute from the same IP, further attempts are blocked

Exports Produce Empty Files

  1. Verify that vulnerability match data exists for the applied filters
  2. Try the export without filters to confirm the exporter works
  3. For PDF exports, maroto v2 is compiled into the Go binary -- no additional dependencies required

Agent Produces Low-Confidence Results

  1. Verify ANTHROPIC_API_KEY is set if LLM fallback is desired
  2. Check the resolution report for the specific stage that returned not_found
  3. For new or unusual distributions, you may need to manually configure the advisory feed URL
  4. Re-run the agent after correcting any configuration issues

Setup Wizard Returns 409 Conflict

The setup endpoint is only available when zero users exist in the database. To check how many users currently exist:

psql -U rimae-scan -d rimae-scan -c "SELECT count(*) FROM users;"

If users exist but you have lost access, you will need to reset the admin password directly in the database or clear the users table.

Wazuh Connection Fails with Hostname Error

If rimae/scan reports a hostname mismatch or connection error when contacting Wazuh:

  1. Verify WAZUH_API_URL in /etc/rimae-scan/rimae-scan.conf contains the correct URL (including port, typically 55000)
  2. Ensure the URL matches the hostname or IP that the Wazuh manager's TLS certificate was issued for
  3. If Wazuh uses a self-signed certificate, set WAZUH_TLS_VERIFY=false in rimae-scan.conf or in Settings > Integrations
  4. If you have configured Wazuh settings via the UI, note that UI overrides take priority over rimae-scan.conf -- check the Integrations page for the saved values

Crawlers Show 0 Records

If a crawl completes successfully but reports record_count: 0:

  1. Check that the persistence layer is working by verifying records directly:
    psql -U rimae-scan -d rimae-scan -c "SELECT slug, last_crawl_record_count FROM vuln_sources WHERE slug = '<source-slug>';"
    
  2. Verify the source's fetch_url is reachable and returning data (use the Test Connection button in Settings)
  3. Check if the parser type matches the actual response format -- feed format changes can cause zero-record parses
  4. Review worker logs for parsing warnings: sudo journalctl -u rimae-scan --since "1 hour ago" | grep -i parse

Theme Toggle Does Not Work

The dark/light theme toggle requires v1.0.18 or later, which introduced the CSS variable system. On older versions:

  • The toggle button may appear but not change the UI appearance
  • Upgrade to v1.0.18+ to get the CSS variable-based theming
  • After upgrading, clear your browser cache to pick up the new stylesheets

API Returns 500 on Datetime Fields

If API responses return HTTP 500 errors with a traceback mentioning datetime or naive/aware:

  • This is caused by a mismatch between naive (no timezone) and timezone-aware datetime objects
  • Ensure your PostgreSQL database uses timezone-aware timestamps (TIMESTAMP WITH TIME ZONE)
  • If you migrated from an older version, check the database timezone setting:
    psql -U rimae-scan -d rimae-scan -c "SHOW timezone;"
    
    The timezone should be set to UTC.
  • Restart the API service after any database timezone changes: sudo systemctl restart rimae-scan

Scheduled Tasks Not Running

  1. Check the service: sudo systemctl status rimae-scan
  2. Verify Redis is reachable (used for rate limiting)
  3. Check for errors in the logs: sudo journalctl -u rimae-scan -n 50
  4. Verify cron jobs are registered: check the /health endpoint for scheduler_jobs count

Getting Help

If the troubleshooting steps above do not resolve your issue:

  1. Check the GitHub Issues for known problems
  2. Collect diagnostic information:
    • rimae/scan version (/health endpoint)
    • Relevant service logs (last 100 lines)
    • Configuration (redact secrets)
    • Steps to reproduce
  3. Open a new issue with the collected information
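Redacting the configuration by hand is error-prone, so a small filter can help. This is a sketch under stated assumptions: the function name redact_conf is hypothetical, the file is assumed to use KEY=VALUE lines, and the list of sensitive key fragments (SECRET, PASSWORD, TOKEN, API_KEY) is a guess rather than an official list. It also masks the password part of URLs of the form scheme://user:password@host.

```shell
# redact_conf FILE
# Prints the config file with sensitive values replaced by <redacted>.
# Assumptions: KEY=VALUE lines; the key-name fragments below are a guess.
redact_conf() {
  sed -E \
    -e 's/^([[:space:]]*[A-Za-z0-9_]*(SECRET|PASSWORD|TOKEN|API_KEY)[A-Za-z0-9_]*=).*/\1<redacted>/' \
    -e 's#(://[^:/@]*:)[^@]*@#\1<redacted>@#' \
    "$1"
}
```

Run redact_conf /etc/rimae-scan/rimae-scan.conf and review the output before posting it. A password that itself contains an @ character would be only partially masked.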