Data Stores

RedactedWorld uses four data storage technologies, each chosen for a specific workload pattern.

PostgreSQL

PostgreSQL is the primary relational database. Each service owns one or more schemas and manages its own migrations via TypeORM. Cross-service data access happens exclusively through gRPC calls; a service never queries another service's schema directly.

Schema Overview

| Schema | Owning Service | Key Tables | Description |
|---|---|---|---|
| auth | auth-service | sessions, refresh_tokens, spicedb_sync_log | Session tracking and Keycloak sync metadata |
| users | user-service | users, user_profiles, user_preferences | User accounts, profile data, and per-user settings (theme, notifications) |
| orgs | org-service | organizations, memberships, invitations, teams | Organization hierarchy, team membership, and invitation workflows |
| chat | chat-service | channels, messages, message_reads | Chat channels (direct and group) with message history and read receipts |
| notifications | notification-service | notifications, notification_preferences, delivery_log | In-app and email notifications with delivery tracking |
| forums | forum-service | boards, threads, posts, moderation_actions | Discussion forums with threaded replies and moderation |
| files | file-service | file_metadata, upload_sessions | File metadata and multipart upload state (binary objects live in MinIO) |
| domains | domain-service | domains, verification_records, subdomains, verification_schedule | Registered domains, DNS TXT verification status, and re-verification cron state |
| scans | scan-service | scan_jobs, scan_configs, scan_schedules, tool_registry | Scan job definitions, scheduling rules, and tool configuration |
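
The ownership rule can be pictured as a lookup from schema to owning service. The sketch below is illustrative only: the map is transcribed from the table above, while `ownsSchema` and `assertSchemaAccess` are hypothetical helper names and not part of any RedactedWorld service.

```typescript
// Ownership map transcribed from the schema overview table.
const SCHEMA_OWNERS: Record<string, string> = {
  auth: "auth-service",
  users: "user-service",
  orgs: "org-service",
  chat: "chat-service",
  notifications: "notification-service",
  forums: "forum-service",
  files: "file-service",
  domains: "domain-service",
  scans: "scan-service",
};

// Hypothetical guard: true only when `service` is the owner of `schema`.
function ownsSchema(service: string, schema: string): boolean {
  return SCHEMA_OWNERS[schema] === service;
}

// Hypothetical helper: throws when a service targets a schema it does not own.
function assertSchemaAccess(service: string, schema: string): void {
  if (!ownsSchema(service, schema)) {
    const owner = SCHEMA_OWNERS[schema] ?? "unknown";
    throw new Error(
      `${service} may not query schema "${schema}" (owner: ${owner}); call the owner over gRPC instead`
    );
  }
}
```

A check like this could run in a code review script or test suite; whether the platform enforces the rule in code or only by convention is not stated here.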

SpiceDB

SpiceDB provides fine-grained, relationship-based authorization inspired by Google Zanzibar. Rather than checking role strings, the platform checks whether a relationship exists in a graph (e.g., "Is user X a member of org Y that owns domain Z?").

Schema Definition

definition user {}

definition organization {
    relation owner: user
    relation admin: user
    relation member: user

    permission manage = owner + admin
    permission view = owner + admin + member
    permission delete = owner
}

definition domain {
    relation organization: organization
    relation verified_by: user

    // Referenced by scan_job as domain->manage.
    permission manage = organization->manage
    permission scan = organization->manage + organization->member
    permission view = organization->view
    permission delete = organization->manage
    permission verify = organization->manage
}

definition scan_job {
    relation domain: domain
    relation initiated_by: user

    // The initiating user can always view and cancel their own job.
    permission view = domain->view + initiated_by
    permission cancel = domain->manage + initiated_by
    permission delete = domain->manage
    permission download_report = domain->view
}

How It Works

  1. When a user creates an organization, the auth-service writes organization:org-123#owner@user:user-456 to SpiceDB.
  2. When a domain is added to an organization, domain:example.com#organization@organization:org-123 is written.
  3. On every API request, the API Gateway calls SpiceDB's CheckPermission API to verify that the caller has the required permission on the target resource.
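
To make the steps above concrete, here is a minimal in-memory sketch of how the two relationships resolve a view check on a domain. It is a stand-in for illustration, not the SpiceDB client: `TupleStore`, `parseTuple`, `canViewOrg`, and `canViewDomain` are hypothetical names, and only the `view` permissions of `organization` and `domain` from the schema are modeled.

```typescript
// A relationship in the form resource#relation@subject, as written in the steps above.
interface Tuple {
  resource: string;
  relation: string;
  subject: string;
}

// Splits "organization:org-123#owner@user:user-456" into its three parts.
function parseTuple(text: string): Tuple {
  const match = /^([^#@]+)#([^#@]+)@([^#@]+)$/.exec(text);
  if (!match) {
    throw new Error(`malformed tuple: ${text}`);
  }
  return { resource: match[1], relation: match[2], subject: match[3] };
}

// In-memory stand-in for the relationship store.
class TupleStore {
  private tuples: Tuple[] = [];

  write(text: string): void {
    this.tuples.push(parseTuple(text));
  }

  // Subjects directly attached to `resource` through `relation`.
  subjects(resource: string, relation: string): string[] {
    return this.tuples
      .filter((t) => t.resource === resource && t.relation === relation)
      .map((t) => t.subject);
  }
}

// organization: permission view = owner + admin + member
function canViewOrg(store: TupleStore, org: string, user: string): boolean {
  return ["owner", "admin", "member"].some((rel) =>
    store.subjects(org, rel).includes(user)
  );
}

// domain: permission view = organization->view
function canViewDomain(store: TupleStore, domain: string, user: string): boolean {
  return store
    .subjects(domain, "organization")
    .some((org) => canViewOrg(store, org, user));
}

const store = new TupleStore();
store.write("organization:org-123#owner@user:user-456"); // step 1
store.write("domain:example.com#organization@organization:org-123"); // step 2

console.log(canViewDomain(store, "domain:example.com", "user:user-456")); // true
console.log(canViewDomain(store, "domain:example.com", "user:user-999")); // false
```

The check walks from the domain to its organization and then to the user, which is the graph traversal SpiceDB performs on the gateway's behalf in step 3.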

ClickHouse

ClickHouse stores scan telemetry and results for analytical queries. It is optimized for fast aggregation over large volumes of time-series scan data.

Use cases:

  • Historical scan result storage (findings, severity counts, timestamps)
  • Trend analysis dashboards (vulnerability counts over time per domain)
  • SLA reporting (scan duration percentiles, failure rates)

Data flows into ClickHouse from the scan-service after each job completes. The report-service reads from ClickHouse when generating aggregate reports.
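
As an illustration of the SLA figures mentioned above, the sketch below computes duration percentiles and a failure rate over a handful of in-memory rows. The `ScanRow` shape and its field names are assumptions made for this example, and the nearest-rank percentile is a plain-TypeScript substitute; in production the same numbers would presumably come from an aggregate query using ClickHouse's own quantile functions, whose results can differ slightly.

```typescript
// Hypothetical shape of one completed scan job; field names are assumed.
interface ScanRow {
  domain: string;
  durationSeconds: number;
  failed: boolean;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("no samples");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Summarizes a non-empty batch of rows into the SLA figures; throws on an empty batch.
function slaSummary(rows: ScanRow[]) {
  const durations = rows.map((r) => r.durationSeconds);
  const failures = rows.filter((r) => r.failed).length;
  return {
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    failureRate: failures / rows.length,
  };
}
```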

Elasticsearch

Elasticsearch provides full-text search over scan findings and reports.

Use cases:

  • Searching vulnerability descriptions across all scans for a domain
  • Filtering findings by CVE ID, severity, or affected port
  • Powering the global search bar in the Angular frontend

The report-service indexes structured findings into Elasticsearch after processing raw scan output.
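
The filtering use cases above map naturally onto an Elasticsearch bool query: full-text terms go in `must`, exact-match filters in `filter`. The following is a minimal sketch of building such a request body. The field names (`description`, `cve_id`, `severity`, `port`) and the `buildFindingsQuery` helper are assumptions for illustration; the actual index mapping is not documented here.

```typescript
// Hypothetical search criteria; every field is optional.
interface FindingFilter {
  text?: string;
  cveId?: string;
  severity?: string;
  port?: number;
}

// Builds a query DSL body: scored full-text match plus unscored exact filters.
function buildFindingsQuery(f: FindingFilter) {
  const must: object[] = [];
  const filter: object[] = [];

  if (f.text) must.push({ match: { description: f.text } });
  if (f.cveId) filter.push({ term: { cve_id: f.cveId } });
  if (f.severity) filter.push({ term: { severity: f.severity } });
  if (f.port !== undefined) filter.push({ term: { port: f.port } });

  // With no full-text clause, match every document and rely on the filters.
  if (must.length === 0) must.push({ match_all: {} });

  return { query: { bool: { must, filter } } };
}
```

Putting the exact-match conditions in `filter` rather than `must` keeps them out of relevance scoring, which suits fields such as severity or port where a finding either matches or does not.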