Deployment: CredForge Databases¶
This document outlines the deployment steps for the required databases: MongoDB 7.0 and PostgreSQL 16.
Database Requirements¶
- MongoDB 7.0
- PostgreSQL 16
Deployment Methods¶
1. Docker Compose (Recommended for Dev/Test Environments)¶
MongoDB 7.0¶
version: '3.8'
services:
  mongodb:
    image: mongo:7.0
    container_name: mongodb
    ports:
      - "27017:27017"
    environment:
      - MONGO_INITDB_ROOT_USERNAME=admin
      - MONGO_INITDB_ROOT_PASSWORD=your_secure_password
    volumes:
      - mongodb_data:/data/db

volumes:
  mongodb_data:
PostgreSQL 16¶
version: '3.8'
services:
  postgres:
    image: postgres:16
    container_name: postgres
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=admin
      - POSTGRES_PASSWORD=your_secure_password
      - POSTGRES_DB=postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
To deploy using Docker Compose:
- Create a docker-compose.yml file with both services
- Run docker-compose up -d
- Verify services are running with docker-compose ps
2. Custom Installation on Ubuntu 22.04 (Recommended for UAT Environments)¶
MongoDB 7.0 Installation¶
Import MongoDB public GPG key:
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | \
sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg \
--dearmor
Create list file for MongoDB:
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
Update package database and install MongoDB:
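For example (the mongodb-org meta-package comes from the repository added above):
sudo apt-get update
sudo apt-get install -y mongodb-org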
Start and enable MongoDB:
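For example, with systemd (the official packages install the service as mongod):
sudo systemctl start mongod
sudo systemctl enable mongod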
PostgreSQL 16 Installation¶
Add PostgreSQL repository:
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
Install PostgreSQL:
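For example (postgresql-16 comes from the PGDG repository added above):
sudo apt-get update
sudo apt-get install -y postgresql-16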
Start and enable PostgreSQL:
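For example, with systemd:
sudo systemctl start postgresql
sudo systemctl enable postgresql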
3. AWS Managed Databases (Recommended for Production Environments)¶
Amazon DocumentDB (MongoDB Compatible)¶
- Create a DocumentDB cluster (a CLI sketch follows this list):
  - Go to AWS Console → DocumentDB
  - Click "Create cluster"
  - Select engine version compatible with MongoDB 7.0
  - Configure instance class (recommended: db.r6g.large or higher)
  - Set up VPC and security groups
  - Configure backup retention period
- Set up encryption at rest
- Configure security:
  - Configure security groups to allow access from application servers
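As an alternative to the console, the same setup can be sketched with the AWS CLI; the identifiers, credentials, and security group below are placeholders to adjust for your account, and the engine version should be whatever DocumentDB offers for your compatibility target:
aws docdb create-db-cluster \
  --db-cluster-identifier credforge-docdb \
  --engine docdb \
  --master-username credforgeadmin \
  --master-user-password your_secure_password \
  --vpc-security-group-ids sg-0123456789abcdef0 \
  --backup-retention-period 7 \
  --storage-encrypted

aws docdb create-db-instance \
  --db-instance-identifier credforge-docdb-1 \
  --db-instance-class db.r6g.large \
  --engine docdb \
  --db-cluster-identifier credforge-docdb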
Amazon RDS for PostgreSQL 16¶
- Create an RDS instance (a CLI sketch follows this list):
  - Go to AWS Console → RDS
  - Click "Create database"
  - Choose PostgreSQL 16
  - Select instance class (recommended: db.r6g.large or higher)
  - Configure storage (recommended: 100GB+ with autoscaling)
  - Set up VPC and security groups
  - Enable Multi-AZ deployment for high availability
  - Configure backup retention period
- Enable encryption at rest
- Configure security:
  - Configure security groups to allow access from application servers
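A comparable AWS CLI sketch for the RDS instance; the identifiers, credentials, security group, and exact engine version are placeholders:
aws rds create-db-instance \
  --db-instance-identifier credforge-postgres \
  --engine postgres \
  --engine-version 16.3 \
  --db-instance-class db.r6g.large \
  --allocated-storage 100 \
  --max-allocated-storage 500 \
  --master-username credforgeadmin \
  --master-user-password your_secure_password \
  --vpc-security-group-ids sg-0123456789abcdef0 \
  --multi-az \
  --storage-encrypted \
  --backup-retention-period 7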
Database Schema Setup¶
PostgreSQL Database Setup¶
Required PostgreSQL Databases¶
cred_forge_db
Create database and user:¶
-- Create database
CREATE DATABASE cred_forge_db;

-- Create user with password
CREATE USER credforge_user WITH PASSWORD 'your_secure_password';

-- Connect to cred_forge_db (e.g. \c cred_forge_db) before running the schema-level grants below
-- Grant privileges on cred_forge_db
GRANT CONNECT ON DATABASE cred_forge_db TO credforge_user;
GRANT USAGE ON SCHEMA public TO credforge_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO credforge_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO credforge_user;

-- Set default privileges for future tables and sequences
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO credforge_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT USAGE, SELECT ON SEQUENCES TO credforge_user;
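To verify connectivity and the grants as the application user, a quick check of this shape can be run (the host is a placeholder):
psql -h localhost -U credforge_user -d cred_forge_db -c '\conninfo'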
Create Tables¶
Database to use - cred_forge_db
CREATE TABLE forge_request_workflow_state (
    request_id VARCHAR(255),
    user_id VARCHAR(255),
    reference_id VARCHAR(255),
    client_id VARCHAR(255),
    workflow_endpoint VARCHAR(255),
    workflow_strategy VARCHAR(255),
    workflow_config JSONB,
    input_json JSONB DEFAULT '{}'::jsonb,
    ecm JSONB,
    features JSONB,
    rules_output JSONB,
    engine_history JSONB,
    created_at TIMESTAMP NOT NULL,
    created_date TIMESTAMP NOT NULL,
    updated_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_date);
-- Optional indices for better query performance
CREATE INDEX idx_request_workflow_state_reference_id ON forge_request_workflow_state(reference_id);
CREATE INDEX idx_request_workflow_state_user_id ON forge_request_workflow_state(user_id);
CREATE INDEX idx_request_workflow_state_request_id ON forge_request_workflow_state(request_id);
CREATE INDEX idx_request_workflow_state_client_id ON forge_request_workflow_state(client_id);
CREATE INDEX idx_request_workflow_state_workflow_endpoint ON forge_request_workflow_state(workflow_endpoint);
CREATE INDEX idx_request_workflow_state_workflow_strategy ON forge_request_workflow_state(workflow_strategy);
CREATE INDEX idx_request_workflow_state_created_date ON forge_request_workflow_state(created_date);
CREATE TABLE forge_ecm_response_log (
    id SERIAL,
    request_id VARCHAR(255) NOT NULL,
    user_id VARCHAR(255),
    reference_id VARCHAR(255),
    engine VARCHAR(255),
    client_id VARCHAR(255),
    external_status VARCHAR(255),
    external_response_code INTEGER,
    external_retries INTEGER,
    external_latency FLOAT,
    ecm_data_source VARCHAR(255),
    ecm_message VARCHAR(255),
    metadata_json JSONB,
    ecm_status_code INTEGER,
    created_at TIMESTAMP NOT NULL,
    created_date TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_date);
-- Indices for better query performance
CREATE INDEX idx_ecm_response_log_request_id ON forge_ecm_response_log(request_id);
CREATE INDEX idx_ecm_response_log_reference_id ON forge_ecm_response_log(reference_id);
CREATE INDEX idx_ecm_response_log_user_id ON forge_ecm_response_log(user_id);
CREATE INDEX idx_ecm_response_log_engine ON forge_ecm_response_log(engine);
CREATE INDEX idx_ecm_response_log_created_date ON forge_ecm_response_log(created_date);
CREATE TABLE forge_application_logs (
    id CHAR(36),
    request_id VARCHAR(255),
    client_id VARCHAR(255),
    reference_id VARCHAR(255),
    user_id VARCHAR(255),
    forge_component VARCHAR(255) NOT NULL,
    api_endpoint VARCHAR(255),
    engine_name VARCHAR(255),
    workflow_name VARCHAR(255),
    level VARCHAR(50) NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    message TEXT NOT NULL,
    extra_data JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    created_date date GENERATED ALWAYS AS ("timestamp"::date) STORED
) PARTITION BY RANGE (created_at);
-- Create indexes
CREATE INDEX idx_application_logs_request_id ON forge_application_logs(request_id);
CREATE INDEX idx_application_logs_client_id ON forge_application_logs(client_id);
CREATE INDEX idx_application_logs_reference_id ON forge_application_logs(reference_id);
CREATE INDEX idx_application_logs_user_id ON forge_application_logs(user_id);
CREATE INDEX idx_application_logs_forge_component ON forge_application_logs(forge_component);
CREATE INDEX idx_application_logs_api_endpoint ON forge_application_logs(api_endpoint);
CREATE INDEX idx_application_logs_engine_name ON forge_application_logs(engine_name);
CREATE INDEX idx_application_logs_workflow_name ON forge_application_logs(workflow_name);
CREATE INDEX idx_application_logs_level ON forge_application_logs(level);
CREATE INDEX idx_application_logs_created_date ON forge_application_logs(created_date);
Create Table Partitions¶
CREATE OR REPLACE FUNCTION public.create_daily_partition(parent_table text, partition_date date, days_ahead integer DEFAULT 0)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
    partition_name text;
    partition_start timestamp;
    partition_end timestamp;
    partition_col_name text;
    d date := partition_date;
    end_date date := partition_date + days_ahead;
BEGIN
    -- Detect the partition key column
    SELECT a.attname INTO partition_col_name
    FROM pg_partitioned_table pt
    JOIN pg_class c ON pt.partrelid = c.oid
    JOIN pg_attribute a ON a.attrelid = pt.partrelid AND a.attnum = ANY(pt.partattrs)
    WHERE c.relname = parent_table
    LIMIT 1;

    IF partition_col_name IS NULL THEN
        RAISE EXCEPTION 'Partition column not found for table %', parent_table;
    END IF;

    -- Iterate using WHILE loop (safe for DATE)
    WHILE d <= end_date LOOP
        partition_start := d::timestamp;
        partition_end := (d + 1)::timestamp;
        partition_name := parent_table || '_' || to_char(d, 'YYYY_MM_DD');
        BEGIN
            IF NOT EXISTS (
                SELECT 1
                FROM pg_class c
                JOIN pg_namespace n ON n.oid = c.relnamespace
                WHERE c.relname = partition_name
                  AND n.nspname = 'public'
            ) THEN
                EXECUTE format(
                    'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
                    partition_name,
                    parent_table,
                    partition_start,
                    partition_end
                );
                RAISE NOTICE 'Created partition % for table % on %', partition_name, parent_table, partition_col_name;
            END IF;
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'Failed to create partition %: %', partition_name, SQLERRM;
        END;
        d := d + INTERVAL '1 day';
    END LOOP;
    RETURN;
END;
$function$;
Execute the following statements manually, or schedule them as cron jobs:
SELECT create_daily_partition('forge_request_workflow_state', CURRENT_DATE, 10);
SELECT create_daily_partition('forge_ecm_response_log', CURRENT_DATE, 10);
SELECT create_daily_partition('forge_application_logs', CURRENT_DATE, 10);
To set up the cron jobs using pg_cron:
Enable pg_cron:
CREATE EXTENSION IF NOT EXISTS pg_cron;
SELECT * FROM pg_extension WHERE extname = 'pg_cron';
GRANT USAGE ON SCHEMA cron TO credforge_user;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA cron TO credforge_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA cron GRANT EXECUTE ON FUNCTIONS TO credforge_user;
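If CREATE EXTENSION fails because the library is not preloaded, add pg_cron to shared_preload_libraries and restart PostgreSQL (on Amazon RDS this is set through the DB parameter group rather than postgresql.conf). A minimal postgresql.conf sketch; the target database is an assumption and should match wherever the partition functions were created:
shared_preload_libraries = 'pg_cron'
cron.database_name = 'cred_forge_db'    # assumption: schedule jobs in the application database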
Set crons -
SELECT cron.schedule(
    'create_daily_partitions_for_forge_request_workflow_state',
    '0 0 * * *',
    $$SELECT create_daily_partition('forge_request_workflow_state', CURRENT_DATE, 10);$$
);
SELECT cron.schedule(
    'create_daily_partitions_for_forge_ecm_response_log',
    '0 0 * * *',
    $$SELECT create_daily_partition('forge_ecm_response_log', CURRENT_DATE, 10);$$
);
SELECT cron.schedule(
    'create_daily_partitions_for_forge_application_logs',
    '0 0 * * *',
    $$SELECT create_daily_partition('forge_application_logs', CURRENT_DATE, 10);$$
);
Purge Table Partitions¶
CREATE OR REPLACE FUNCTION public.cleanup_old_partitions_for_table(parent_table text, retention_days integer DEFAULT 90)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
    partition_record record;
    partition_date date;
BEGIN
    FOR partition_record IN
        SELECT tablename, schemaname
        FROM pg_tables
        WHERE schemaname = 'public'
          AND tablename LIKE parent_table || '_%_%_%'
    LOOP
        BEGIN
            -- Extract the last three parts of the table name which contain the date
            partition_date := to_date(
                split_part(partition_record.tablename, '_', -3) || '_' ||
                split_part(partition_record.tablename, '_', -2) || '_' ||
                split_part(partition_record.tablename, '_', -1),
                'YYYY_MM_DD'
            );
            IF partition_date < current_date - retention_days THEN
                EXECUTE format('DROP TABLE IF EXISTS %I.%I',
                    partition_record.schemaname,
                    partition_record.tablename);
                RAISE NOTICE 'Deleted partition: %', partition_record.tablename;
            END IF;
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'Error processing partition %: %', partition_record.tablename, SQLERRM;
            CONTINUE;
        END;
    END LOOP;
END;
$function$;
Execute the following statements manually, or schedule them as cron jobs:
SELECT cleanup_old_partitions_for_table('forge_request_workflow_state', 7);
SELECT cleanup_old_partitions_for_table('forge_ecm_response_log', 7);
SELECT cleanup_old_partitions_for_table('forge_application_logs', 7);
To set up the cleanup cron jobs using pg_cron, enable pg_cron by following the cron setup steps above.
Set crons -
SELECT cron.schedule(
    'cleanup_old_partitions_for_forge_request_workflow_state',
    '0 0 * * *',
    $$SELECT cleanup_old_partitions_for_table('forge_request_workflow_state', 7);$$
);
SELECT cron.schedule(
    'cleanup_old_partitions_for_forge_ecm_response_log',
    '0 0 * * *',
    $$SELECT cleanup_old_partitions_for_table('forge_ecm_response_log', 7);$$
);
SELECT cron.schedule(
    'cleanup_old_partitions_for_forge_application_logs',
    '0 0 * * *',
    $$SELECT cleanup_old_partitions_for_table('forge_application_logs', 7);$$
);
MongoDB Database Setup¶
Create databases and user¶
// Connect to MongoDB
use admin

// Create application user
db.createUser({
  user: "credforge_user",
  pwd: "your_secure_password",
  roles: [
    { role: "readWrite", db: "cred_forge_db" }
  ]
})
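To verify the new user, a quick connection check of this shape can be used (host and port are placeholders; authSource is admin because the user was created there):
mongosh "mongodb://credforge_user:your_secure_password@localhost:27017/cred_forge_db?authSource=admin" --eval "db.runCommand({ ping: 1 })"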
Create collections in MongoDB¶
// 1. Clients
db.createCollection("clients");
db.clients.createIndex({ "client_id": 1 });
db.clients.createIndex({ "auth_token": 1 });
// 2. Ecm Client Metadata
db.createCollection("ecm_client_metadata");
db.ecm_client_metadata.createIndex({ "client_id": 1 });
db.ecm_client_metadata.createIndex({ "engine": 1 });
// 3. Ecm Output Log
db.createCollection("ecm_output_log");
db.ecm_output_log.createIndex({ "reference_id": 1 });
db.ecm_output_log.createIndex({ "user_id": 1 });
db.ecm_output_log.createIndex({ "engine": 1 });
db.ecm_output_log.createIndex({ "request_id": 1 });
db.ecm_output_log.createIndex({ "client_id": 1 });
// 4. Ecm Request Log
db.createCollection("ecm_request_log");
db.ecm_request_log.createIndex({ "reference_id": 1 });
db.ecm_request_log.createIndex({ "user_id": 1 });
db.ecm_request_log.createIndex({ "request_id": 1 });
db.ecm_request_log.createIndex({ "engine": 1 });
db.ecm_request_log.createIndex({ "client_id": 1 });
// 5. Ecm Response Log
db.createCollection("ecm_response_log");
db.ecm_response_log.createIndex({ "reference_id": 1 });
db.ecm_response_log.createIndex({ "request_id": 1 });
db.ecm_response_log.createIndex({ "user_id": 1 });
db.ecm_response_log.createIndex({ "engine": 1 });
db.ecm_response_log.createIndex({ "client_id": 1 });
// 6. Request Workflow State
db.createCollection("request_workflow_state");
db.request_workflow_state.createIndex({ "client_id": 1 });
db.request_workflow_state.createIndex({ "created_date": 1 });
db.request_workflow_state.createIndex({ "user_id": 1 });
db.request_workflow_state.createIndex({ "request_id": 1 });
db.request_workflow_state.createIndex({ "reference_id": 1 });
db.request_workflow_state.createIndex({ "workflow_strategy": 1 });
db.request_workflow_state.createIndex({ "workflow_endpoint": 1 });
// 7. Unauthorized Requests
db.createCollection("unauthorized_requests");
db.unauthorized_requests.createIndex({ "client_id": 1 });
db.unauthorized_requests.createIndex({ "request_id": 1 });
// 8. Usage
db.createCollection("usage");
db.usage.createIndex({ "client_id": 1 });
db.usage.createIndex({ "request_id": 1 });
db.usage.createIndex({ "api_name": 1 });
// 9. Workflow Configs
db.createCollection("workflow_configs");
db.workflow_configs.createIndex({ "workflow_name": 1 });
db.workflow_configs.createIndex({ "client_id": 1 });
// 10. Workflow Endpoint Config
db.createCollection("workflow_endpoint_config");
db.workflow_endpoint_config.createIndex({ "workflow_endpoint": 1 });
db.workflow_endpoint_config.createIndex({ "client_id": 1 });
// 11. Workflow Event Log
db.createCollection("workflow_event_log");
db.workflow_event_log.createIndex({ "client_id": 1 });
db.workflow_event_log.createIndex({ "reference_id": 1 });
db.workflow_event_log.createIndex({ "request_id": 1 });
db.workflow_event_log.createIndex({ "user_id": 1 });
db.workflow_event_log.createIndex({ "event_type": 1 });
Important Notes¶
- Replace your_secure_password with a strong, secure password in all scripts
- Ensure proper network security rules are in place before running these scripts
- For production environments, use AWS Secrets Manager or similar services to manage credentials
- Regularly rotate database credentials
- Monitor database access logs for any unauthorized access attempts
Security Considerations¶
- Always use strong passwords
- Enable encryption at rest
- Configure network security (firewalls, security groups)
- Regular backups
- Monitor database performance and logs
- Keep databases updated with security patches
Backup and Recovery¶
MongoDB¶
- Regular automated backups
- Point-in-time recovery (for DocumentDB)
- Backup verification
- Recovery testing
PostgreSQL¶
- Automated snapshots
- Point-in-time recovery
- WAL archiving
- Backup verification
- Recovery testing
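For self-managed deployments, logical backups along these lines can complement snapshots; the hosts, credentials, and paths are placeholders:
# MongoDB logical backup
mongodump --uri="mongodb://credforge_user:your_secure_password@localhost:27017/cred_forge_db?authSource=admin" --out=/backups/mongo/$(date +%F)
# PostgreSQL logical backup in custom format (restore with pg_restore)
pg_dump -h localhost -U credforge_user -d cred_forge_db -Fc -f /backups/postgres/cred_forge_db_$(date +%F).dump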
Monitoring¶
- Set up CloudWatch metrics (for AWS deployments)
- Configure database-specific monitoring tools
- Set up alerts for:
- High CPU usage
- Low disk space
- Connection count
- Error rates
- Replication lag
Maintenance¶
- Regular security updates
- Performance optimization
- Index maintenance
- Vacuum operations (PostgreSQL)
- Database statistics updates
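For the PostgreSQL maintenance items above, statements along these lines can be run (or scheduled) during low-traffic windows; the table names are taken from the schema in this document:
-- Reclaim dead space and refresh planner statistics
VACUUM (ANALYZE) forge_ecm_response_log;
-- Refresh planner statistics only
ANALYZE forge_request_workflow_state;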
Scaling Ladder¶
The following table provides recommended node counts for PostgreSQL and MongoDB deployments at various daily request volumes, assuming the per-node baseline specifications below:
- PostgreSQL: 1GB RAM, 2 vCPU, 25GB disk (3000 IOPS, 125 MBps)
- MongoDB: 4GB RAM, 2 vCPU, 100GB disk (3000 IOPS, 125 MBps)
| Requests / Day | PostgreSQL Nodes | MongoDB Nodes |
|---|---|---|
| 10K | 1 | 1 |
| 25K | 2 | 2 |
| 50K | 3 | 3 |
| 75K | 4 | 4 |
| 100K | 5 | 5 |
Notes:
- For high availability and failover, consider running at least 2 nodes (primary + standby/replica) at all times, even at lower request volumes.
- Scale up node count and/or instance size if you observe CPU, RAM, or disk IOPS/throughput consistently above 50% utilization.
- Adjust disk size and IOPS as data volume grows.
- Monitor database metrics and tune accordingly.