IDX Market Data Platform

Real-time Indonesian stock market data pipeline — from exchange feed to client dashboard

Parse Rate: 3.1M/s (messages per second)
Tickers: 900+ (IDX listed stocks)
Latency: <50ms (feed-to-DB p50)
Components: 12 (Docker services)
Uptime: 99.9% (during market hours)

What This Platform Does

The platform receives the real-time ITCH binary feed from the Indonesia Stock Exchange (IDX), parses it at wire speed, stores ticks and aggregated bars in hot storage (QuestDB), drains historical data to cold storage (ClickHouse), computes metrics (the proprietary HD, plus RSI), and serves everything through a Go/Fiber API to web clients and an AmiBroker plugin.

Tech Stack

  • Rust Parser: Zero-alloc ITCH decoder. 3.1M msg/s, <0.3µs per message.
  • Redpanda: Kafka-compatible message bus. Single binary, 6 topics.
  • QuestDB: Hot time-series store. ILP ingestion, SAMPLE BY aggregation.
  • ClickHouse: Cold analytical store. 40:1 compression, unlimited history.
  • Redis: Sub-ms cache. Sessions, rate limits, WS fan-out.
  • PostgreSQL: Auth DB. Users, keys, audit log, partitioned tables.
  • Go + Fiber: Single binary API. REST + WS + SSE + Admin + Portal.
  • TradingView: Professional charting via UDF datafeed protocol.

Architecture

Single binary Go API serving three domains — Admin, Portal, Data API

System Data Flow

Exchange (ITCH Feed) → Parser (Rust) → Bus (Redpanda) → Hot Store (QuestDB) → API (Go/Fiber) → Clients (TradingView)

API Server — Three Domains

Domain | Path | Auth | Scope
Admin Panel | /admin/* | Session Auth + CSRF | Manage users, keys, plugins; HD config, tier config, audit
User Portal | /portal/* | Session Auth + CSRF | Own keys, machines, usage; subscription management
Data API | /v1/* | JWT Auth (RS256) | Rate limited, tier enforced; REST + WS + SSE
All three domains run in a single Go binary on port :18090. Shared connection pools for PostgreSQL, Redis, QuestDB, and ClickHouse.

Data Pipeline

Zero-loss path from exchange feed to client-facing API

1. Exchange Feed
   RabbitMQ AMQP from IDX feed provider. Prefetch=500, <1ms local network latency.
2. Rust Parser (zero-alloc)
   Binary ITCH decode at 3.1M msg/s. Handles Trade, Orderbook, Snapshot, Index messages.
   Outputs: QuestDB (ILP TCP), Redpanda (async)
3. OHLCV Aggregator (Rust)
   Consumes idx.ticks, builds 1-minute OHLCV bars. Filters board=RG only.
   Outputs: QuestDB idx_ohlcv, Redpanda idx.ohlcv, Redis PUBLISH
4. Metric Worker (Rust)
   Consumes idx.ohlcv, computes HD (7-step pipeline) + RSI-14 per 1m bar.
   Outputs: QuestDB metrics_hd, Redis cache (sub-ms), ClickHouse archive
5. Go API Server (:18090)
   QuestDB for hot data (14 days), ClickHouse for historical, Redis for real-time cache + WS pub/sub.
6. Clients
   TradingView Chart (UDF), AmiBroker Plugin (REST + WS), Admin Panel (HTMX), User Portal (HTMX).
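
Step 3 of the pipeline can be illustrated with a small sketch. This is not the Rust aggregator itself; it is a minimal Python illustration that assumes ticks arrive in time order and carry the idx_ticks columns (ticker, board, price, volume, ts in Unix seconds). The `Bar` type and `aggregate_1m` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    ticker: str
    minute: int   # bar start, Unix seconds floored to the minute
    open: float
    high: float
    low: float
    close: float
    volume: int


def aggregate_1m(ticks):
    """Fold (ticker, board, price, volume, ts) ticks into 1-minute OHLCV bars."""
    bars = {}  # (ticker, minute) -> Bar
    for ticker, board, price, volume, ts in ticks:
        if board != "RG":  # regular board only; NG/TN trades are skipped
            continue
        minute = ts - ts % 60
        bar = bars.get((ticker, minute))
        if bar is None:
            bars[(ticker, minute)] = Bar(ticker, minute, price, price, price, price, volume)
        else:
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price  # ticks are assumed to be in time order
            bar.volume += volume
    return sorted(bars.values(), key=lambda b: (b.ticker, b.minute))
```

A tick on the NG or TN board never opens or updates a bar, which is what keeps negotiated prices out of the OHLCV series.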

Latency Budget

Hop | p50 | p99 | Notes
RabbitMQ → Parser | <1ms | <2ms | Local network, prefetch 500
Parser → QuestDB | ~50ms | ~150ms | ILP batch 500 msgs / 100ms
Parser → Redpanda | <5ms | <10ms | Async produce
Redpanda → Metric Worker | <5ms | <20ms | Consumer lag
Go API query | 1-5ms | 10-50ms | QuestDB HTTP SQL
End-to-end | <100ms | <300ms |

Services & Ports

All services run as Docker containers with idxmdp- prefix

Service | Container | Port(s) | Purpose
Go API | go-api | :18090 | REST + WS + SSE + Admin + Portal + Chart + Docs
Prometheus exporter | (same binary as Go API) | :2112 | /metrics scrape endpoint
Rust Parser | idxmdp-parser | - | ITCH feed → QuestDB + Redpanda
Metric Worker | idxmdp-metric-worker | - | HD + RSI computation
QuestDB | idxmdp-questdb | :19000 (HTTP) :19009 (ILP TCP) | Hot storage (7 tables)
ClickHouse | idxmdp-clickhouse | :28123 (HTTP) :29001 (native) | Cold storage (historical)
Redpanda | idxmdp-redpanda | :29092 :28082 :29644 | Message bus (6 topics)
Redis | idxmdp-redis | :26379 | Cache, sessions, pub/sub
PostgreSQL | idxmdp-postgres | :25432 | Auth DB (16 tables)
CH Drain | idxmdp-ch-drain | - | QuestDB → ClickHouse drain
Dashboard | idxmdp-dashboard | :18080 | Legacy Rust monitoring
Grafana | idxmdp-grafana | :13000 | Monitoring dashboards (2 pre-provisioned)
Prometheus | idxmdp-prometheus | :19090 | Metrics collection + alert rules
Alertmanager | idxmdp-alertmanager | :19093 | Alert routing → Telegram
QuestDB Tables (7)
Table | Description | Key Columns
idx_ticks | Raw trade ticks | ticker, board, price, volume, ts
idx_ohlcv | 1-minute OHLCV bars | ticker, open, high, low, close, volume, ts
idx_snapshot | Best bid/ask snapshots | ticker, bid, ask, bid_vol, ask_vol, ts
idx_orderbook | Order book updates | ticker, side, price, volume, ts
idx_index | Index values (IHSG etc) | index_code, value, ts
idx_contracts | Contract metadata | ticker, name, board, ts
metrics_hd | HD + RSI metric values | ticker, freq, hd, rsi, ts
Redpanda Topics (6)
Topic | Producer | Consumer | Partitions
idx.ticks | Parser | Aggregator, CH Drain | 10
idx.ohlcv | Aggregator | Metric Worker, WS | 10
idx.orderbook | Parser | (deferred to M4) | 10
idx.snapshot | Parser | Snapshot cache | 10
idx.index | Parser | Index display | 10
metrics.hd | Metric Worker | Real-time HD | 10

Credentials (Dev Environment)

All credentials below are for local development only

These credentials are for local development. Never use them in production.
Service | Host:Port | Username | Password
PostgreSQL | localhost:25432 | admin | secret
Redis | localhost:26379 | - | idxmdp_redis_dev_2026
ClickHouse | localhost:28123 | default | (none)
QuestDB | localhost:19000 | - | (none)
Redpanda | localhost:28082 | - | (none)
Grafana | localhost:13000 | admin | admin (change on first login)
Prometheus | localhost:19090 | - | (none)
Alertmanager | localhost:19093 | - | (none)

Admin Panel Access

URL | Username | Password | Role
/auth/login | superadmin | Admin123456 | superadmin
Environment Variables
PORT=18090
DATABASE_URL=postgres://admin:secret@localhost:25432/idx_admin?sslmode=disable
REDIS_ADDR=localhost:26379
REDIS_PASSWORD=idxmdp_redis_dev_2026
QUESTDB_HOST=localhost
QUESTDB_HTTP_PORT=19000
CLICKHOUSE_DSN=http://localhost:28123
JWT_PRIVATE_KEY_PATH=secrets/jwt_private.pem
JWT_PUBLIC_KEY_PATH=secrets/jwt_public.pem
JWT_ISSUER=idx-market-data
CHART_LIB_PATH=../charting_library-master
SUPERADMIN_PASSWORD=Admin123456

Docker Compose

All containers use idxmdp- prefix on idxmdp_net network

# Start all services
docker compose up -d

# View parser logs
docker compose logs -f idxmdp-parser

# Check container status
docker ps --filter "name=idxmdp"

# Restart parser
docker compose restart idxmdp-parser

Volume Mounts

Service | Volume | Purpose
QuestDB | ./data/questdb | Time-series data
ClickHouse | ./data/clickhouse | Historical data
PostgreSQL | ./data/postgres | Auth database
Redis | ./data/redis | AOF persistence
Redpanda | ./data/redpanda | Kafka log segments

Storage & Retention

Hot/cold storage strategy with automatic data lifecycle

Store | Data | Retention | Strategy
QuestDB | Ticks, Snapshots | 3 days | Partition TTL cron (DROP PARTITION)
QuestDB | OHLCV, metrics_hd | 14 days | Partition TTL cron
ClickHouse | OHLCV + Ticks | Unlimited | Columnar compression (~40:1)
Redis | Metric cache | 24h TTL | Key expiry
PostgreSQL | Audit log | 6 months | Monthly partitions, auto-detach
The idxmdp-ch-drain container periodically reads completed QuestDB partitions and inserts them into ClickHouse, typically running at 16:30 WIB after market close.
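
The partition TTL logic can be sketched as follows. This is an illustration only, assuming daily partitions named by date and the retention values from the table above; `expired_partitions` and `drop_statement` are hypothetical helpers, and the generated SQL follows QuestDB's `ALTER TABLE ... DROP PARTITION LIST` form, which should be checked against the deployed QuestDB version.

```python
from datetime import date, timedelta

# Retention in days per QuestDB table, taken from the retention table above.
RETENTION_DAYS = {
    "idx_ticks": 3,
    "idx_snapshot": 3,
    "idx_ohlcv": 14,
    "metrics_hd": 14,
}


def expired_partitions(table, partitions, today):
    """Return the daily partitions older than the table's retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS[table])
    return [p for p in sorted(partitions) if p < cutoff]


def drop_statement(table, partitions):
    """Build one DROP PARTITION statement, or None when nothing has expired."""
    if not partitions:
        return None
    names = ", ".join(f"'{p.isoformat()}'" for p in partitions)
    return f"ALTER TABLE {table} DROP PARTITION LIST {names};"
```

A cron job would run this once a day after the drain has completed, so that a partition is only dropped once its data is safely in ClickHouse.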

Public Endpoints

No authentication required

Method | Path | Description
GET | /health | Basic liveness → {"ok":true}
GET | /ready | Deep check: Redis, PG, QuestDB, ClickHouse
POST | /v1/auth/token | API key → JWT (15 min TTL)
POST | /v1/auth/refresh | Refresh expiring JWT
Token Exchange Example
// Request
POST /v1/auth/token
Content-Type: application/json

{ "api_key": "idx_live_a1b2c3d4..." }

// Response
{
  "ok": true,
  "data": {
    "token": "eyJhbGciOiJSUzI1NiIs...",
    "expires_in": 900,
    "tier": "pro"
  }
}
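
On the client side, a response like the one above can be unwrapped with a few lines. This is a minimal sketch that assumes every response carries the `ok` flag with either `data` or `error`; `ApiError` and `unwrap` are hypothetical names, not part of any shipped client.

```python
import json


class ApiError(Exception):
    """Raised when the API returns ok=false."""


def unwrap(body: str):
    """Return the `data` payload of a success response, or raise ApiError."""
    envelope = json.loads(body)
    if envelope.get("ok") is True:
        return envelope.get("data")
    raise ApiError(envelope.get("error", "unknown error"))
```

A caller would typically read `token` and `expires_in` from the returned payload and schedule a refresh before the 900 seconds run out.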

Response Envelope

All API responses use a standard envelope:

Success
{ "ok": true,
  "data": {...},
  "meta": { "latency_us": 142 } }
Error
{ "ok": false,
  "error": "description",
  "meta": { ... } }

Data API

JWT required — Authorization: Bearer {token}

Method | Path | Description
GET | /v1/snapshot/:ticker | Latest tick for a symbol
GET | /v1/ohlcv/:ticker | OHLCV bars (QuestDB + ClickHouse hybrid)
GET | /v1/symbols | All available tickers
GET | /v1/metrics/latest | Latest metric from Redis (sub-ms)
GET | /v1/metrics/history | Metric history (QuestDB/ClickHouse)
GET | /v1/hd/chart/:ticker | HD dashboard (enterprise only)
OHLCV Query Example
GET /v1/ohlcv/BBCA?tf=1m&from=2026-03-25&to=2026-03-26
Authorization: Bearer eyJ...

{
  "ok": true,
  "data": [{
    "ticker": "BBCA",
    "date": "2026-03-25T02:00:00Z",
    "bar_time": "09:00",
    "freq": "1m",
    "open": 8900, "high": 8925,
    "low": 8875, "close": 8925,
    "volume": 123456,
    "freq_cnt": 42,
    "hd": 0
  }]
}
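
The query above can be built programmatically, and the example response suggests that `bar_time` is the WIB (UTC+7) wall-clock time of the UTC `date` field. The sketch below rests on that reading, which is inferred from the sample values rather than stated by the API; the helper names and the base URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

WIB = timezone(timedelta(hours=7), "WIB")  # Western Indonesia Time, UTC+7


def ohlcv_url(base, ticker, tf, start, end):
    """Build the OHLCV query URL for one ticker."""
    query = urlencode({"tf": tf, "from": start, "to": end})
    return f"{base}/v1/ohlcv/{ticker}?{query}"


def bar_time_wib(date_utc: str) -> str:
    """Convert a bar's UTC `date` string to HH:MM in WIB."""
    ts = datetime.strptime(date_utc, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return ts.astimezone(WIB).strftime("%H:%M")
```

For the sample bar, 02:00 UTC maps to 09:00 WIB, which matches the `bar_time` shown in the response.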

Plugin Endpoints

AmiBroker DLL wire protocol — API key in body, not JWT

Method | Path | Description
POST | /v1/plugin/activate | Machine licensing (key + machine ID)
POST | /v1/plugin/deactivate | Release machine slot
POST | /v1/plugin/heartbeat | Keepalive + token refresh (every 10 min)
POST | /v1/plugin/report | DLL error/event telemetry
Plugin Lifecycle Flow
1. Activate
   Send API key + machine_id + machine_name + version. Returns JWT + tier capabilities.
2. Heartbeat (every 10 min)
   Send current token. Returns fresh token with new expiry.
3. Fetch Data
   Use JWT from activate/heartbeat on GET /v1/ohlcv, /v1/symbols, /v1/snapshot.
4. Stream (optional)
   WS /v1/stream with JWT. Subscribe to tickers for real-time 1m bars.
5. Deactivate
   On DB unload, release machine slot. Send api_key + machine_id.

Streaming (WS/SSE)

Real-time OHLCV bars via WebSocket or Server-Sent Events

WebSocket
WS /v1/stream
Bidirectional. Subscribe/unsubscribe per ticker. JWT in header. Best for plugins.
Server-Sent Events
SSE /v1/live
Server-push only. Query params for symbols. Auto-reconnect. Best for web clients.
WebSocket Protocol
// Subscribe
{"action":"subscribe", "tickers":["BBCA","BBRI"], "freq":"1m"}

// Unsubscribe
{"action":"unsubscribe", "tickers":["BBRI"]}

// Incoming bar
{"ticker":"BBCA", "bar_time":"09:15", "open":8900,
 "high":8925, "low":8875, "close":8925,
 "volume":45000, "hd":0}

Tier Limits

Tier | WebSocket | Max Symbols
Free | Blocked | -
Pro | Allowed | 50
Enterprise | Allowed | Unlimited
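
The subscribe/unsubscribe protocol and the tier limits above combine into a simple rule for each incoming message. The sketch below is one possible server-side check, not the Go implementation; `apply_message` is a hypothetical name, and the choice to reject an oversized subscribe outright (rather than truncate it) is an assumption.

```python
MAX_SYMBOLS = {"free": 0, "pro": 50, "enterprise": None}  # None means unlimited


def apply_message(tier, subscribed, message):
    """Return the new subscription set after one client message."""
    if tier == "free":
        raise PermissionError("WebSocket is blocked on the free tier")
    tickers = set(message["tickers"])
    if message["action"] == "unsubscribe":
        return subscribed - tickers
    if message["action"] != "subscribe":
        raise ValueError(f"unknown action: {message['action']}")
    updated = subscribed | tickers
    limit = MAX_SYMBOLS[tier]
    if limit is not None and len(updated) > limit:
        raise ValueError(f"{tier} tier allows at most {limit} symbols")
    return updated
```

Counting the union of old and new tickers means that re-subscribing to a symbol already in the set does not consume an extra slot.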

TradingView UDF Datafeed

UDF-compatible endpoints for the TradingView charting library

Method | Path | Description
GET | /udf/config | Datafeed configuration
GET | /udf/symbols?symbol=BBCA | Symbol resolution
GET | /udf/search?query=BB&limit=30 | Symbol search
GET | /udf/history?symbol=X&resolution=R&from=T&to=T | OHLCV bars (columnar)
GET | /udf/time | Server Unix timestamp
Supported resolutions: 1 (1m), 5, 15, 30, 60 (1h), D (daily). The chart is available at /chart?symbol=BBCA&interval=D&theme=dark.
History Response Format
// Success (columnar format)
{ "s": "ok",
  "t": [1711324800, 1711411200],
  "o": [8900, 8925],
  "h": [8950, 8975],
  "l": [8875, 8900],
  "c": [8925, 8950],
  "v": [123456, 98765] }

// No data (with hint)
{ "s": "no_data", "nextTime": 1711238400 }
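
Turning row-oriented bars into the columnar shape shown above is a small transformation. The sketch below assumes each bar is a dict with the keys t, o, h, l, c, v; `to_udf_history` is a hypothetical helper name.

```python
def to_udf_history(bars, next_time=None):
    """Convert a list of bar dicts into a UDF history response."""
    if not bars:
        response = {"s": "no_data"}
        if next_time is not None:
            response["nextTime"] = next_time  # hint: where older data can be found
        return response
    ordered = sorted(bars, key=lambda bar: bar["t"])  # ascending time
    response = {"s": "ok"}
    for column in "tohlcv":
        response[column] = [bar[column] for bar in ordered]
    return response
```

All six arrays have the same length, and index i across them describes one bar.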

Authentication

Dual auth model — JWT for API, sessions for Admin/Portal

API Key Lifecycle

1. Create Key
   Generate 32 random bytes → format as idx_live_{64 hex} → SHA-256 hash stored in DB. Full key shown once.
2. Exchange for JWT
   Web: POST /v1/auth/token. Plugin: POST /v1/plugin/activate. Returns RS256 JWT valid for 15 minutes.
3. Revoke
   Revoked keys fail on next token exchange. Active JWTs expire naturally (max 15 min).
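
Key creation as described in step 1 can be sketched in a few lines. One assumption here is that the SHA-256 hash is taken over the full formatted key string rather than over the raw 32 bytes; the source does not say which. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets

PREFIX = "idx_live_"


def hash_api_key(key: str) -> str:
    """SHA-256 hex digest of the full key string (assumed hashing input)."""
    return hashlib.sha256(key.encode("ascii")).hexdigest()


def create_api_key():
    """Return (full_key, stored_hash). The full key is shown to the user once."""
    key = PREFIX + secrets.token_bytes(32).hex()  # 32 bytes -> 64 hex characters
    return key, hash_api_key(key)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare hashes in constant time."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

Because only the hash is stored, a lost key cannot be recovered and has to be replaced with a new one.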
JWT Claims Structure
{
  "sub": "user-uuid",
  "iss": "idx-market-data",
  "tier": "pro",
  "key_id": "key-uuid",
  "machine_id": "sha256(...)",  // plugin only
  "exp": 1711929600,
  "iat": 1711928700
}
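
A client can read these claims to decide when to refresh. The sketch below only decodes the payload segment and does not verify the RS256 signature, so it is suitable for scheduling on the client and not for authorization decisions. `read_claims` and `seconds_left` are hypothetical names.

```python
import base64
import json


def read_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_left(claims: dict, now: int) -> int:
    """Seconds until the token expires (negative once expired)."""
    return claims["exp"] - now
```

With the sample claims, `exp - iat` is 900 seconds, matching the 15-minute TTL.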

Tier System

Single source of truth in internal/config/tier.go

Feature | Free | Pro | Enterprise
Rate Limit | 60 req/min | 600 req/min | Unlimited
Timeframes | Daily only | All (1m-1d) | All (1m-1d)
History Depth | 90 days | 730 days | Unlimited
HD Access | Stripped | Obfuscated (FNV) | Raw values
WebSocket | Blocked | 50 symbols | Unlimited
Max Machines | 1 | 2 | 5
Streaming | No | Yes | Yes
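
The per-minute rate limits can be illustrated with an in-memory sliding window. The platform uses a Redis Lua script for this; the class below is a single-process stand-in meant only to show the windowing logic, and `SlidingWindow` is a hypothetical name.

```python
from collections import deque

LIMITS = {"free": 60, "pro": 600, "enterprise": None}  # requests per minute


class SlidingWindow:
    """Allow at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        if self.limit is None:  # unlimited tier
            return True
        while self.hits and self.hits[0] <= now - self.window:
            self.hits.popleft()  # forget requests that left the window
        if len(self.hits) >= self.limit:
            return False
        self.hits.append(now)
        return True
```

Unlike a fixed per-minute counter, the window slides with each request, so a burst at the end of one minute still counts against the start of the next.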

Tier Override Priority

API Key Override (highest) > DB Tier Config (medium) > Hardcoded tier.go (fallback)
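
The priority order above amounts to a first-match lookup across three layers. The sketch below illustrates that lookup; the setting names and the `resolve` function are hypothetical, and the hardcoded values are copied from the feature table.

```python
# Fallback layer; values taken from the tier feature table.
HARDCODED = {
    "free": {"rate_limit_per_min": 60, "max_machines": 1},
    "pro": {"rate_limit_per_min": 600, "max_machines": 2},
}


def resolve(tier, setting, key_override=None, db_config=None):
    """Return the effective value of a tier setting for one API key."""
    layers = (
        key_override or {},               # highest: per-key override
        (db_config or {}).get(tier, {}),  # medium: DB tier config
        HARDCODED[tier],                  # fallback: hardcoded defaults
    )
    for layer in layers:
        if setting in layer:
            return layer[setting]
    raise KeyError(setting)
```

A per-key override only affects the settings it names; everything else still falls through to the DB config and then to the hardcoded defaults.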

RBAC & HD Access

Role-based access control + tier-based data filtering

Roles

Role | Scope | Access
superadmin | System | Everything: user management, HD config, tier config, imports
admin | System | View all users/keys, manage plugins, view audit
user | Self | Own keys, own machines, request upgrades, view own usage

HD (Hidden Delta) Access by Tier

Tier | HD Behavior | Implementation
Free | HD = 0 (stripped) | Middleware zeroes hd field on all bars
Pro | HD obfuscated (relative) | FNV hash seeded by client_id; relative values preserved
Enterprise | Raw HD (full access) | No modification, raw computed values
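
The source does not spell out how the FNV hash is applied, so the following is only one way such a scheme could work: derive a positive per-client scale factor from an FNV-1a hash of the client ID and multiply every HD value by it. Scaling by a positive constant keeps sign and ordering, which is one reading of "relative values preserved". The scale range and all function names are assumptions.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, wrapped to 32 bits
    return h


def client_scale(client_id: str) -> float:
    """Per-client factor in [0.5, 1.5); the range is an arbitrary choice."""
    return 0.5 + (fnv1a_32(client_id.encode("utf-8")) % 1000) / 1000.0


def filter_hd(tier: str, client_id: str, hd: float) -> float:
    """Apply the tier rule to one HD value."""
    if tier == "enterprise":
        return hd                            # raw value
    if tier == "pro":
        return hd * client_scale(client_id)  # obfuscated, order preserved
    return 0                                 # free tier: stripped
```

Two Pro clients see different absolute numbers for the same bar, yet each client's own series still rises and falls in step with the raw HD.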

Monitoring

Real-time observability across the full platform stack

Monitoring Stack

1. Go API (:2112/metrics)
   Exposes Prometheus-format metrics: request rates, latency histograms, error counters, rate limiter stats, audit health, plugin gauges.
2. Prometheus (:19090)
   Scrapes Go API metrics every 15s. Evaluates alert rules (feed staleness, error rates, pool exhaustion). Stores 15-day time-series history.
   Outputs: Grafana datasource, Alert rules
3. Grafana (:13000)
   Visualization layer. 2 pre-provisioned dashboards with auto-configured Prometheus datasource. Login: admin / admin.
4. Alertmanager (:19093)
   Routes alerts from Prometheus to Telegram bot. Requires TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID env vars.

Dashboards

  • /ops/latency (built-in): QuestDB ages, Redis/PG/CH health, Redpanda consumer lag, parser channel buffers, pipeline performance
  • /ops/ranking (built-in): Top 20 stocks by volume, value, trade frequency (auto-refresh 10s)
  • IDX API Dashboard (Grafana), 10 panels: request rate, latency p50/p95/p99, error rate (5xx), active WS connections, active SSE connections, rate limiter fail-open counter, audit fallback writes, stale plugin activations, request rate by status code, plugin activations.
  • IDX Platform Overview (Grafana), 8 panels: service health, QuestDB write rate (ILP rows/sec), ClickHouse insert rate, Redpanda consumer lag, Redis memory usage, Redis connected clients, disk usage, CPU usage.
Grafana dashboards are auto-provisioned from monitoring/grafana/dashboards/. Datasource auto-configured to Prometheus at :19090. No manual setup needed — just docker compose up -d grafana.

Access Points

Tool | URL | Auth | Purpose
Grafana | http://localhost:13000 | admin / admin | Time-series dashboards (API + Platform)
Prometheus | http://localhost:19090 | none | PromQL queries, alert rule status
Alertmanager | http://localhost:19093 | none | Active alerts, silences, routing
Pipeline Monitor | /ops/latency | none | Live pipeline health (built into Go API)
Market Ranking | /ops/ranking | none | Top 20 volume/value/frequency
Prometheus Metrics Exposed
# Request metrics
http_requests_total{method, path, status}
http_request_duration_seconds{method, path}

# Rate limiting
rate_limit_hits_total
rate_limit_fail_open_total

# Audit
audit_writes_total
audit_fallback_writes_total

# Plugin
plugin_activations_active
plugin_heartbeat_failures_total

# WebSocket / SSE
ws_connections_active
sse_connections_active
Docker Compose Commands
# Start monitoring stack
docker compose -f docker-compose.dev.yml up -d grafana prometheus alertmanager

# Check status
docker ps --filter "name=idxmdp-grafana"
docker ps --filter "name=idxmdp-prometheus"

# View Grafana logs
docker logs idxmdp-grafana --tail 20

# Restart monitoring
docker compose -f docker-compose.dev.yml restart grafana prometheus

Alert Thresholds

Metric | Warning | Critical | Action
Feed → DB latency | >5s | >60s | Check parser logs, QuestDB health
Request p99 | >50ms | >200ms | Check DB pool exhaustion, slow queries
rate_limit_fail_open | >0 for 5m | >0 for 15m | Redis connectivity issue
audit_fallback_writes | >0 | >0 for 10m | PostgreSQL connection issue
Consumer lag (Redpanda) | >1000 | >10000 | Metric worker or CH drain backpressure
Alertmanager requires TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID env vars. Without them, Alertmanager will restart-loop. Set via .env or skip Alertmanager if Telegram bot isn't configured yet.

HD & RSI Metrics

Proprietary market indicators computed in real-time by the Rust metric worker

HD (Hidden Delta) — 7-Step Pipeline

1. Price Delta
   delta = close - prev_close
2. Signed Volume
   signed_vol = (delta >= 0) ? +volume : -volume, accumulated only during gate hours (09:05 to 14:50 WIB)
3. Cumulative Signed Volume
   cum_signed = running_sum(signed_vol), reset daily
4. Net Signed Volume
   net_signed = cum_positive + cum_negative
5. MAPO
   mapo = ema(net_signed, period=20), where MAPO stands for Moving Average Price x Outstanding
6. DMAPO
   dmapo = mapo - prev_day_mapo, frozen at 15:00 WIB
7. HD Value
   hd = net_signed - dmapo, the final Hidden Delta indicator
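
The seven steps can be traced in code for a single ticker and a single trading day. This is one possible reading of the formulas above, not the Rust MetricEngine: it assumes the EMA is seeded with the first net_signed value and uses alpha = 2 / (period + 1), treats bars outside the gate as contributing zero signed volume, takes `prev_day_mapo` as the previous day's frozen snapshot supplied by the caller, and does not model the 15:00 freeze itself. `hd_series` is a hypothetical name.

```python
from datetime import time

GATE_OPEN, GATE_CLOSE = time(9, 5), time(14, 50)  # default gate, WIB


def hd_series(bars, prev_close, prev_day_mapo, period=20):
    """bars: ordered (time_wib, close, volume) tuples for one trading day."""
    alpha = 2.0 / (period + 1)  # assumed EMA smoothing factor
    cum_pos = cum_neg = 0.0     # daily accumulators, so they start at zero
    mapo = None
    out = []
    for t, close, volume in bars:
        delta = close - prev_close                            # step 1
        if GATE_OPEN <= t <= GATE_CLOSE:                      # step 2
            signed = volume if delta >= 0 else -volume
        else:
            signed = 0
        cum_pos += max(signed, 0)                             # step 3
        cum_neg += min(signed, 0)
        net_signed = cum_pos + cum_neg                        # step 4
        if mapo is None:                                      # step 5
            mapo = net_signed
        else:
            mapo = alpha * net_signed + (1 - alpha) * mapo
        dmapo = mapo - prev_day_mapo                          # step 6
        out.append(net_signed - dmapo)                        # step 7
        prev_close = close
    return out
```

Under this reading, HD measures how far the running net signed volume has moved away from its own smoothed path, shifted by the previous day's MAPO level.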

Gate Times (Configurable via Admin)

Gate | Default | Description
Gate Open | 09:05:00 WIB | Start accumulating signed volume
Gate Close | 14:50:00 WIB | Stop accumulating signed volume
DMAPO Freeze | 15:00:00 WIB | Freeze daily MAPO snapshot
Gate times are configurable via /admin/hd-config (superadmin only) with hot-reload via Redis pub/sub to the Rust worker. No restart needed.

RSI (Relative Strength Index)

// RSI-14 (Wilder's smoothing) on 1-minute close prices
gain = wilder_avg(positive_deltas, 14)
loss = wilder_avg(abs(negative_deltas), 14)   // losses as positive magnitudes
rs   = gain / loss                            // if loss = 0, rsi = 100
rsi  = 100 - (100 / (1 + rs))
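
A runnable version of Wilder's RSI looks like the sketch below. It is a reference illustration, not the Rust engine: the first average is a simple mean over the first `period` deltas, later values use Wilder's recursive smoothing, and a zero average loss is mapped to 100 by convention. `rsi_wilder` is a hypothetical name.

```python
def _rsi(avg_gain: float, avg_loss: float) -> float:
    if avg_loss == 0:
        return 100.0  # no losses in the window (convention)
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_wilder(closes, period=14):
    """Return RSI values, one per close starting at index `period`."""
    if len(closes) <= period:
        return []
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]  # positive magnitudes
    avg_gain = sum(gains[:period]) / period  # seed: simple average
    avg_loss = sum(losses[:period]) / period
    out = [_rsi(avg_gain, avg_loss)]
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period  # Wilder smoothing
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(_rsi(avg_gain, avg_loss))
    return out
```

At least `period + 1` closes are needed before the first value appears, so a 14-period RSI on 1-minute bars starts 14 minutes after the first bar.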

Key Decisions

Architecture and design choices with rationale

Why One Binary (not 8)?

The reference idx-data-api used 8 separate binaries. We consolidated because:

  • Solo operator = one process to monitor, one set of logs
  • Fiber handles REST + WebSocket on one port
  • Shared connection pools (no duplicated PG/Redis connections)
  • Fewer containers = less Docker complexity
Why QuestDB + ClickHouse (dual storage)?
  • QuestDB: Optimized for ingestion (ILP, 10M+ rows/s). SAMPLE BY for OHLCV aggregation.
  • ClickHouse: Optimized for analytical queries. 40:1 compression.
  • Drain: QuestDB partitions are drained to ClickHouse, then dropped once they pass the retention window (3 days for ticks and snapshots, 14 days for OHLCV and metrics).
Why Redpanda (not Kafka)?
  • Single binary (no ZooKeeper/KRaft)
  • Kafka-compatible API (same client libraries)
  • Lower resource overhead for single-node
  • Built-in admin API at :29644
Why Rust Parser (not Go)?
  • 3.1M msg/s vs ~500K in Go (6x faster)
  • Zero-alloc parsing (<0.3µs per message)
  • No GC pauses during critical ingestion path
  • Binary ITCH protocol needs tight memory control
Why Board='RG' Filter?

IDX has boards: RG (Regular), NG (Negotiated), TN (Tunai/Cash). Only RG represents true market price discovery. NG/TN are negotiated off-market and would distort OHLCV.

Why HTMX for Admin (not React)?
  • Admin panel = 12 CRUD pages, not a charting app
  • HTMX via embed.FS = zero npm, zero build step, zero CORS
  • Tailwind + DaisyUI = modern UI with minimal effort
  • Alpine.js for lightweight interactivity

Project Stages

Scope-driven development stages with phases inside each — not time-boxed sprints

Each Stage delivers one major capability. Stages contain Phases — ordered implementation steps. The tree below shows what was built, which services were introduced, and current status.
S1: Parser Backbone (DONE)
Live ITCH feed → parsed data into hot storage
  • P1 — Rust ITCH Parser
    Zero-alloc binary decoder for 7 message types: Trade, Snapshot, Orderbook (bid/ask), Contract, Index. Benchmarked at 3.1M msg/s with <0.3µs per message. Fat LTO + native CPU (AVX2/SSE4.2).
    Rust binary
  • P2 — Dual RabbitMQ Consumer
    Two AMQP exchanges (itchdata + idxdata) on one shared channel. Prefetch=500, zero lock contention. Auto-reconnect on connection loss.
    RabbitMQ (remote)
  • P3 — QuestDB Ingestion
    6 parallel ILP writers over TCP with TCP_NODELAY. Batch 500 msgs or 100ms flush. Auto-reconnect. Writes to 6 tables.
    QuestDB
  • P4 — OHLCV Aggregation
    1-minute bar state machine using FxHashMap per symbol. Filters board=RG only (excludes NG/TN negotiated trades).
  • P5 — Pipeline Channels
    9 crossbeam bounded channels for backpressure: consumer→parser (65K), parser→aggregator (65K), parser→writers (256–8K).
  • P6 — Dashboard + LogView
    Built-in Rust HTTP dashboards. OHLCV charts, ranking, latency page (:18080). Live message stream, per-table stats (:18081).
What was achieved
  • Live IDX feed ingestion at 12K msg/s with 250x headroom
  • 6 QuestDB tables populated: idx_ticks, idx_ohlcv, idx_snapshot, idx_orderbook, idx_index, idx_contracts
  • Criterion benchmarks with TSV history tracking (13 benchmark points)
  • Operations guide + decision log documented

Services Introduced

Rust Parser: Data Ingestion Engine
Receives binary ITCH feed from IDX exchange via RabbitMQ AMQP. Decodes 7 message types at wire speed. Produces parsed records to QuestDB (ILP TCP) and Redpanda (async Kafka). This is the core backbone: everything downstream depends on it.
Parse Rate: 3.1M/s · Threads: 9 · Per Message: <0.3µs

QuestDB: Hot Time-Series Store
Ingestion-optimized columnar database. Receives data via ILP (Influx Line Protocol) over TCP at port 19009. Purpose-built for time-series with native SAMPLE BY for OHLCV aggregation. Stores the last 3–14 days of live data depending on table.
Tables: 7 · Retention: 3-14d · HTTP Port: :19000
S2: Metric Pipeline (DONE)
Proprietary indicators (HD + RSI) computed from live OHLCV bars
  • P1 — Redpanda Message Bus
    Kafka-compatible single-binary message broker. 6 topics (idx.ticks, idx.ohlcv, idx.orderbook, idx.snapshot, idx.index, metrics.hd), 10 partitions each. Replaces need for full Kafka cluster.
    Redpanda
  • P2 — ClickHouse Cold Storage
    Columnar analytical database for unlimited history. 40:1 compression ratio. Receives drained data from QuestDB via ch-drain worker. ReplacingMergeTree for deduplication.
    ClickHouse, CH Drain
  • P3 — Redis Cache Layer
    In-memory cache for sub-millisecond metric lookups. Stores latest HD/RSI per ticker with 24h TTL. Also used for WS pub/sub fan-out, sessions, and rate limiting.
    Redis
  • P4 — HD Engine (7-Step Pipeline)
    Rust MetricEngine trait. Computes Hidden Delta: price delta → signed volume → cumulative → net signed → MAPO (EMA-20) → DMAPO (daily diff) → HD value. Gate times: 09:05–14:50 WIB.
  • P5 — HD Accuracy Verification
    100% match vs Go reference implementation across 606 tickers. Accuracy fixture with CSV comparison gate in CI.
  • P6 — RSI-14 Engine
    Standard Wilder's RSI on 1-minute close prices. 14-period lookback. 4 unit tests. Runs alongside HD in the same metric worker.
  • P7 — Metric Worker Consumer
    Kafka consumer group metric-worker-hd consuming idx.ohlcv topic. Computes HD+RSI per bar, writes to QuestDB metrics_hd + Redis cache + ClickHouse archive.
  • P8 — QuestDB Partition TTL
    Cron job drops old partitions: 3 days for ticks/snapshots/orderbook, 14 days for OHLCV/metrics. Prevents storage exhaustion.
  • P9 — Historical Backfill
    Replay tool ingests historical fixture files (10.3M messages) to populate ClickHouse with pre-launch data.
  • P10 — ClickHouse Drain
    Consumer group ch-drain reads from multiple Redpanda topics and inserts into ClickHouse. Typically runs after market close (16:30 WIB).
  • P11 — Orderbook Kafka Deferral
    Orderbook publishing to Kafka disabled (82% of message volume). QuestDB still receives orderbook data. Re-enable for M4 order flow metric.
What was achieved
  • HD metric: 100% accuracy match vs Go reference across all 606 tickers
  • RSI-14 Wilder's with 4 passing tests
  • Full hot/cold storage pipeline: QuestDB (hot, 3-14d) → ClickHouse (cold, unlimited)
  • Redpanda message bus with 6 topics, 2 consumer groups operational
  • Redis cache serving sub-ms metric lookups

Services Introduced

Redpanda: Kafka-Compatible Message Bus
Single-binary Kafka replacement. No ZooKeeper/KRaft needed. Serves as the event backbone: parser publishes parsed data, consumers (metric worker, CH drain, WS broadcaster) subscribe to topics. Built-in admin API + Prometheus metrics at :29644.
Topics: 6 · Partitions: 10 · Kafka Port: :29092

ClickHouse: Cold Analytical Store
Columnar OLAP database for unlimited historical data. Achieves 40:1 compression ratio. Uses ReplacingMergeTree for deduplication. Receives data from CH Drain worker after QuestDB partitions age out. Powers historical OHLCV queries and analytics.
Compression: 40:1 · Retention: Unlimited · HTTP Port: :28123

Redis: In-Memory Cache + Pub/Sub
Sub-millisecond key-value store. Caches latest HD/RSI metric values per ticker (24h TTL). Powers WebSocket fan-out via PUBLISH/SUBSCRIBE. Also handles session storage, rate limit counters (Lua sliding window), and API response caching.
Latency: <1ms · Cache TTL: 24h · Port: :26379

Metric Worker: Real-Time Indicator Engine
Rust consumer that subscribes to idx.ohlcv topic. Computes HD (7-step pipeline) and RSI-14 for every 1-minute bar. Outputs to QuestDB (metrics_hd table), Redis (sub-ms cache), and ClickHouse (historical archive). Hot-reloadable gate times via Redis pub/sub.
Tickers: 606 · Metrics: 2 · Bar Interval: 1m
S3: Reconstructed API Layer (DONE)
Single Go binary serving REST + WS + SSE + Admin + Portal
  • P1 — Foundation + Auth
    Go/Fiber project structure with internal/ package layout. PostgreSQL migrations (16 tables). JWT RS256 issuer. API key hashing (SHA-256). Rate limiter (Redis Lua sliding window). Health/ready endpoints.
    PostgreSQL, Go/Fiber
  • P2 — Data Endpoints
    GET /v1/snapshot/:ticker (QuestDB), GET /v1/ohlcv/:ticker (hybrid QuestDB+ClickHouse), GET /v1/symbols, GET /v1/metrics/latest (Redis sub-ms), GET /v1/hd/chart/:ticker. Tier enforcement in RBAC middleware.
  • P3 — Plugin Endpoints
    AmiBroker DLL wire protocol: POST /v1/plugin/activate (machine licensing), /deactivate, /heartbeat (10-min keepalive), /report (telemetry). Stale plugin cleanup job (5-min). API key auto-expire (hourly).
  • P4 — WebSocket + SSE Streaming
    WS /v1/stream (Redis PubSub → WS broadcast). SSE /v1/live (alternative for web). Subscribe/unsubscribe JSON protocol matching plugin expectations. Tier gating (Free=blocked, Pro=50 sym, Enterprise=unlimited).
  • P5 — Admin Panel
    HTMX + Tailwind + Alpine.js + DaisyUI. Session auth + CSRF. Dashboard, users, API keys, plugins, audit, CSV import. HD config page (gate times + hot-reload). Tier config page (per-tier variables + per-key overrides). Subscription approval queue.
  • P6 — User Portal
    Self-service: own API keys, own machines, usage stats, setup guide. Subscription page (tier comparison, upgrade requests). Email verification + password reset flows.
  • P7 — Zero-Silent-Error + Polish
    Audit write-ahead buffer with fallback file. Prometheus counters/gauges. Graceful shutdown (drain + close pools). TradingView UDF datafeed integration. Response envelope matching plugin expectations.
What was achieved
  • Single Go binary replaces 8 separate services from reference project
  • Three-domain architecture: Admin (/admin/*), Portal (/portal/*), Data API (/v1/*)
  • PostgreSQL with 16 tables: users, API keys, plugins, audit, settings, subscriptions, payments
  • Full AmiBroker plugin compatibility (exact endpoint paths + response format)
  • TradingView charting via UDF datafeed at /chart
  • Rate limiting, RBAC, HD obfuscation, tier enforcement from single TierMatrix

Services Introduced

Go API Server: Unified Application Server
Single Fiber binary serving everything: REST API with JWT auth, WebSocket streaming, SSE, Admin panel (HTMX), User portal, TradingView UDF, Prometheus metrics. Shared connection pools for all databases. Replaces 8 separate binaries from reference architecture.
Port: :18090 · Domains: 3 · Endpoints: 50+

PostgreSQL: Authentication & State Database
Relational database for all application state: user accounts, API keys (SHA-256 hashed), plugin activations, audit log (monthly partitions), system settings, subscription requests, payment records. CITEXT for case-insensitive email. Supports hot-reload of tier + HD configs.
Tables: 16 · JWT Signing: RS256 · Port: :25432
S4: Monitoring & Alerting (DONE)
Prometheus + Grafana + Alertmanager + Telegram notifications
  • P1 — Prometheus Metrics
    Go API exposes /metrics at :2112. Request rates, latency histograms, error counters, rate limit hits, audit fallback writes, plugin activation gauges.
    Prometheus
  • P2 — Grafana Dashboards
    Pre-configured dashboards for pipeline health, API performance, and database status. Auto-provisioned data sources.
    Grafana
  • P3 — Alertmanager Rules
    Alert rules for feed staleness (>60s), request p99 (>200ms), rate limit fail-open, audit fallback writes, DB connection pool exhaustion.
    Alertmanager
  • P4 — Telegram Bot Integration
    Alertmanager → Telegram bot for ops notifications. Separate channels planned: ops (private), status (public), clients (private).
  • P5 — Ops Dashboards (Go API)
    /ops/latency page: QuestDB table health, Redis/PG/CH status, Redpanda consumer lag, parser channel capacities. /ops/ranking: top 20 by volume, value, frequency.
  • P6 — HD Hot-Reload
    Admin changes HD gate times → PostgreSQL → Redis PUBLISH "config:hd" → Rust metric worker picks up new config → next bar uses new gates. Zero downtime.
What was achieved
  • Full observability stack: Prometheus → Grafana → Alertmanager → Telegram
  • Self-hosted ops pages at /ops/latency and /ops/ranking (live data, no external tools needed)
  • HD configuration hot-reload without service restart
S5: WebSocket & SSE Streaming (DONE)
Real-time data push to clients (built inside S3 Phase 4)
  • P1 — WebSocket Server
    WS /v1/stream — bidirectional. Subscribe/unsubscribe per ticker with JSON protocol. Redis PubSub → WS broadcast. JWT in Authorization header.
  • P2 — Server-Sent Events
    GET /v1/live — server-push only. Query params for symbols + metrics. Auto-reconnect built in. Best for web clients that don't need bidirectional.
  • P3 — Tier Gating
    Free = WS blocked entirely. Pro = max 50 symbols. Enterprise = unlimited. HD values stripped/obfuscated per tier in stream.
S5 was implemented as part of S3 Phase 4. Listed separately for architectural clarity — streaming is a distinct capability.
S6: Additional Metrics (PARTIAL)
Expanding the indicator library beyond HD
  • M1 — RSI-14 (Wilder's)
    Relative Strength Index with 14-period lookback on 1-minute close prices. Smoothed average gains/losses. 4 unit tests passing.
  • M2 — MACD
    Moving Average Convergence Divergence. EMA-12 / EMA-26 / Signal-9. On demand.
  • M3 — OBV
    On-Balance Volume. Cumulative volume with sign determined by close direction.
  • M4 — Order Flow
    Requires re-enabling orderbook Kafka publishing (currently deferred, 82% of volume).
  • M5–M9 — Future Indicators
    Bollinger Bands, Stochastic, ATR, VWAP, custom signals. Each = 1 engine file in Rust.
Each new metric follows the same pattern: implement MetricEngine trait in Rust, add to worker consumer, write to metrics_hd table + Redis. ~1 session per metric.
S7: Stock Screener (HOLD)
Multi-criteria stock filtering and ranking engine
  • P1 — Screener Query Engine
    Filter stocks by metric thresholds (HD > X, RSI < 30, volume > Y). Combine multiple conditions with AND/OR logic.
  • P2 — Saved Screens
    Users save custom screener configurations. Alert when stocks match criteria.
  • P3 — Screener API
    REST endpoint: GET /v1/screener?filters=... Returns ranked list of matching tickers.
On hold — requires discussion on scope, tier access, and which metrics to expose in screener filters.
S8: Broker Scraper (HOLD)
External broker data collection and integration
  • P1 — Broker Data Source
    Scrape or integrate with broker APIs for additional data not available in ITCH feed.
  • P2 — Data Normalization
    Normalize broker-specific formats into platform standard schema.
On hold — needs discussion on target brokers, data scope, and legal considerations.
S9: Status Page (DONE)
Public-facing service status + incident communication
  • P1 — Error Response Integration
    5xx and degraded responses include status_url field pointing to public status page. Clients can show "check status" link automatically.
  • P2 — /ready Degraded Mode
    GET /ready returns degraded status when any backend (Redis, PG, QuestDB, CH) is unhealthy. Prometheus alert triggers on degraded state.
  • P3 — Instatus Setup
    External hosted status page (Instatus.com). Manual setup required: create account, configure components, set STATUS_PAGE_URL env var.
S10: Client Dashboard (DONE)
TradingView charts + documentation + ops monitoring pages
  • P1 — TradingView Chart
    Professional charting at /chart with UDF datafeed. Supports 1m, 5m, 15m, 30m, 1h, daily resolutions. Dark theme. Symbol search. HD metric overlay for enterprise tier.
  • P2 — Documentation Site
    Comprehensive docs at /docs. Architecture, pipeline, services, credentials, API reference, tier system, metrics, key decisions. Interactive with collapsible sections and keyboard navigation.
  • P3 — Pipeline Latency Monitor
    /ops/latency — live dashboard showing QuestDB table health, Redis/PG/CH connectivity, Redpanda consumer lag, parser channel buffer capacity, pipeline performance stats.
  • P4 — Market Ranking
    /ops/ranking — top 20 stocks by volume, value, and trade frequency. Auto-refreshes every 10 seconds during trading hours.
S11
Payment Integration
Xendit/Midtrans gateway for automated tier upgrades
HOLD
  • P1 — Database Schema (prepared in S3)
    Tables ready: subscription_plans, subscriptions, payments, webhook_events. Seed data for Pro Monthly (Rp 299K), Pro Annual (Rp 2.99M), Enterprise Monthly (Rp 999K), Enterprise Annual (Rp 9.99M).
  • P2 — PaymentGateway Interface
    internal/payment/gateway.go defines CreateInvoice, VerifyWebhook, GetPaymentStatus, CancelSubscription. Webhook stub routes return 501.
  • P3 — Xendit Integration
    Native IDR support, virtual accounts (BCA/BNI/BRI/Mandiri), QRIS, e-wallets (OVO/Dana/GoPay). HMAC webhook verification.
  • P4 — Billing Automation
    Recurring billing, grace period (7 days), auto-downgrade on expiry. Portal billing page for payment history + receipts.
On hold — database schema and interfaces prepared. Actual payment gateway integration deferred until customer base justifies it. Manual tier upgrades via admin panel work in the meantime.

Project Repository

idx-market-data-platform/
  go-api/                      Go/Fiber API server (:18090)
  rust-workers/                Parser + metric worker + aggregator + ch-drain
  docker/                      Docker compose configs + Grafana/Prometheus
  schema/                      PostgreSQL (16 tables) + ClickHouse schemas
  fixture/                     Test fixtures, HD accuracy CSV, replay data
  charting_library-master/     TradingView charting library (licensed)
  docs/                        Tech specs, decisions log, operations guide

Service Architecture Summary

12 services in total. All use idxmdp- container prefix on idxmdp_net Docker network. Fully independent — no shared services with other projects.
Rust Parser
Ingestion
Metric Worker
HD + RSI
Redpanda
Message Bus
QuestDB
Hot Store
ClickHouse
Cold Store
Redis
Cache + PubSub
PostgreSQL
Auth DB
Go API
:18090

Telegram Alerts Setup

Get notified on your phone the moment a stock matches your criteria

The Screener can deliver alerts directly to your Telegram. This guide walks you through the one-time setup and creating your first alert. Once linked, every alert you create gets delivered to your chat — not a shared channel.

What you'll need

  • A Telegram account on your phone
  • An account on this platform (login required)
  • About 5 minutes

Step 1 — Find your Telegram chat ID

The platform needs to know which Telegram chat to send your alerts to. Find your numeric chat ID:

  1. Open Telegram on your phone
  2. Search for the bot @userinfobot
  3. Tap Start
  4. The bot replies with your Id: — a positive integer like 560442208
  5. Copy that number — you'll need it in Step 3

Step 2 — Start a chat with the IDX Alerts bot

Critical step. Telegram bots cannot send messages to a user who has never messaged them first. Skip this and your alerts will silently fail with chat not found.
  1. In Telegram, search for @idx_testhink_bot
  2. Open the chat
  3. Tap Start (or send any message — /start is the convention)

That's it. You don't need to interact with the bot beyond this. It's purely for delivery.

Step 3 — Link Telegram in the dashboard

  1. Log into the dashboard
  2. Open Screener → Alerts tab
  3. In the Telegram delivery card, click Link Telegram
  4. A 6-digit code appears (e.g. 451096)
  5. In the form below the code, fill in the inputs left-to-right:
    • Left field: your chat ID from Step 1 (e.g. 560442208)
    • Right field: the 6-digit code from above — auto-filled, but verify it matches
  6. Click Verify

The card flips to show Linked to chat <your ID> with a green Linked badge. You're set up.

Step 4 — Create your first alert

In the same Alerts tab, scroll down to the Create alert form:

Field | What to enter
Ticker | The stock symbol, e.g. BBCA
Alert name | Anything memorable, e.g. BBCA oversold
Conditions | Click + Add condition, set field, operator, value

Worked example — notify me when BBRI's RSI drops below 30:

  • Ticker: BBRI
  • Alert name: BBRI oversold
  • Condition: field RSI, operator <, value 30

Click Create alert. The new alert appears in the Active alerts table above with status enabled.

Step 5 — Wait for delivery

The alert worker checks every 10 seconds. When the condition transitions from false to true, a message lands on your phone:

🔔 BBRI
BBRI oversold

The message includes the ticker and your alert name.

How delivery actually works

  • Each transition fires once. If RSI drops below 30, you get one message. If it rises back above 30 then drops again, you get another. While the condition stays true, the alert does not re-fire.
  • Quotas: 50 alerts per user.
  • Disable temporarily: toggle the alert in the Active alerts list to pause without deleting.
  • One ticker per alert: a single alert tracks one ticker. To watch multiple tickers, create one alert per ticker.
  • Cross-ticker conditions are not allowed for alerts (e.g. you cannot reference index membership). Use the Screener tab for cross-ticker scans.
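
The "fires once per transition" rule above is edge-triggered evaluation. A minimal sketch, assuming the worker keeps one boolean per alert between cycles; the function name `fire_on_rising_edge` is hypothetical:

```python
def fire_on_rising_edge(results, last_eval=False):
    """Return the cycle indices at which an alert would fire.

    results is the per-cycle outcome of the alert condition (True/False).
    An alert fires only when the outcome changes from False to True.
    """
    fired = []
    for cycle, current in enumerate(results):
        if current and not last_eval:
            fired.append(cycle)
        last_eval = current  # remember this cycle's outcome for the next one
    return fired
```

For an RSI series of 35, 29, 28, 31, 27 with the condition RSI < 30, the outcomes are False, True, True, False, True, so the alert fires on the second and fifth cycles only.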

Troubleshooting

Symptom: "code unknown or expired" on Verify
Likely cause: The code in the page expired, or the dashboard cached an old one
Fix: Refresh the browser tab and request a fresh Link Telegram code

Symptom: Alert stays enabled but no message arrives
Likely cause: You skipped Step 2 (didn't DM the bot first)
Fix: Open Telegram, DM @idx_testhink_bot once. The alert auto-retries on the next 10-second cycle; no need to recreate.

Symptom: "Verification failed: code does not match this user"
Likely cause: You pasted someone else's code
Fix: Click Re-link for a fresh code tied to your account

Symptom: Got a message but the data feels stale
Likely cause: The screener cache may be slow to refresh
Fix: Check the Live badge in the Alerts page header. Yellow or red means upstream isn't current.

Symptom: "Screener is disabled in this build." on the Screener tab
Likely cause: The platform operator has not enabled the screener (S7_ENABLED=false)
Fix: Contact the operator to flip the flag

Stop receiving alerts

  • Delete individual alerts from the Active alerts list, OR
  • Toggle an alert to disabled to pause it without losing the configuration, OR
  • Click Re-link in the Telegram delivery card and verify a different chat ID to redirect alerts elsewhere
What's happening behind the scenes

When you create an alert, it lives in postgres with last_eval = false. Every 10 seconds, the alert-worker service runs a cycle:

  1. SELECTs all enabled alerts
  2. HGETs the current snapshot from Redis (screener:state:tickers)
  3. For each alert, evaluates the condition tree against the snapshot row for that alert's ticker
  4. Compares the result to the alert's last_eval. On a false → true transition, the worker:
    • Inserts a row into alert_fired (PK on (alert_id, transition_ts) — idempotent, so worker restarts can't re-deliver the same transition)
    • Calls Telegram's sendMessage API with your linked chat ID
    • On 200 OK, marks delivery_status = sent
    • On 4xx/5xx, marks failed and retries with exponential backoff (1, 2, 4, 8, 16 seconds, max 5 attempts)

The dispatcher's most common failure mode is HTTP 400 chat not found — that's the bot-anti-spam rule from Step 2 biting late.
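
The retry schedule and the idempotent `alert_fired` insert can be sketched as follows. This is an illustration under stated assumptions, not the worker's code: `deliver` and `record_transition` are hypothetical names, a Python set stands in for the primary key on (alert_id, transition_ts), and the five delays are read as the waits before five retries that follow the first send (the text does not say whether the first send counts as one of the five attempts).

```python
import time

# 1, 2, 4, 8, 16 seconds, as listed in the retry rule above.
RETRY_DELAYS = [2 ** i for i in range(5)]

def deliver(send, sleep=time.sleep):
    """Call send() until it returns HTTP 200 or the retries run out.

    send is any callable returning an HTTP status code.
    Returns (delivery_status, number_of_tries).
    """
    tries = 1
    if send() == 200:
        return "sent", tries
    for delay in RETRY_DELAYS:
        sleep(delay)  # exponential backoff before each retry
        tries += 1
        if send() == 200:
            return "sent", tries
    return "failed", tries

def record_transition(fired, alert_id, transition_ts):
    """Insert once per (alert_id, transition_ts); duplicates are ignored."""
    key = (alert_id, transition_ts)
    if key in fired:
        return False  # already recorded, so do not deliver again
    fired.add(key)
    return True
```

Passing `sleep` as a parameter keeps the schedule testable without real waiting.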

IDX Symbol Names Refresh

Regenerate the ticker → company-name map shown in the chart's symbol search

The chart's symbol search displays the full company name next to each IDX ticker (e.g. BBCA → "PT Bank Central Asia Tbk"). That mapping lives in go-api/internal/refdata/idx_symbols.json and is embedded into the API binary at build time. This guide explains how to regenerate the file when new tickers list on IDX or when names change.

When to refresh

  • New IPO — ticker shows the bare code instead of company name in symbol search
  • Company rename after a merger / corporate action (e.g. EXCL → XLSmart Telecom Sejahtera)
  • Quarterly hygiene — KSEI republishes ownership data monthly; pulling a fresh snapshot picks up renames you might have missed

Typical cadence: every 2–3 months, or whenever the screener shows an unmapped ticker that traders ask about.

Why we don't fetch live from IDX

The official endpoints at www.idx.co.id/primary/ListedCompany/… and /secondary/get/v1/… are behind Cloudflare's bot challenge. Plain curl returns HTTP 403 with the “Just a moment…” interstitial. A headless-browser scraper would bypass it but is slow and brittle, so we use KSEI's republished ownership CSV instead — same roster, no bot challenge.

Source of truth: KSEI ownership data

KSEI (the Indonesian central securities depository) publishes monthly stock-ownership snapshots which include the full issuer_name for every listed company. The community repo aryakdaniswara/idx-stock-ownership mirrors these as structured CSV. We use the latest CSV's share_code + issuer_name columns.

Refresh procedure

The whole pipeline is a single bash session — no committed script, since the source URL changes per snapshot. Run from the repo root.

Step 1 — Find the latest KSEI CSV

# List recent CSV files in the source repo
curl -sS "https://api.github.com/repos/aryakdaniswara/idx-stock-ownership/contents/data" \
  | grep '"download_url".*\.csv'

Pick the most recent file (filename pattern kepemilikan_saham_YYYYMMDD.csv) and copy its download_url.

Step 2 — Download and extract unique pairs

# Replace URL with the latest from Step 1
curl -sS "https://raw.githubusercontent.com/aryakdaniswara/idx-stock-ownership/main/data/kepemilikan_saham_YYYYMMDD.csv" \
  -o /tmp/ksei.csv

# Extract unique (ticker, name) pairs from columns 2 + 3
awk -F',' 'NR>1 {print $2"\t"$3}' /tmp/ksei.csv | sort -u > /tmp/ksei_pairs.tsv
wc -l /tmp/ksei_pairs.tsv   # expect ~950

Step 3 — Pull current QDB ticker universe

We only emit entries for tickers that actually trade in our QDB — warrants and delisted codes get the bare-ticker fallback automatically.

curl -sS -G "http://localhost:19000/exec" \
  --data-urlencode "query=SELECT DISTINCT ticker FROM idx_ohlcv ORDER BY ticker" \
  | python3 -c "import json,sys; d=json.load(sys.stdin); [print(r[0]) for r in d['dataset']]" \
  > /tmp/qdb_tickers.txt

# Build the intersection (preserve all KSEI variants per ticker)
awk -F'\t' '
  NR==FNR { ksei[$1] = ksei[$1] "\n" $2; next }
  ($1 in ksei) {
    n = split(ksei[$1], variants, "\n")
    for (i=2; i<=n; i++) print $1"\t"variants[i]
  }
' /tmp/ksei_pairs.tsv /tmp/qdb_tickers.txt > /tmp/intersection.tsv

Step 4 — Title-case + write JSON

python3 << 'PYEOF' > go-api/internal/refdata/idx_symbols.json
import re, json

ticker_to_names = {}
with open('/tmp/intersection.tsv') as f:
    for line in f:
        line = line.rstrip('\n')
        if '\t' not in line: continue
        ticker, name = line.split('\t', 1)
        ticker_to_names.setdefault(ticker, []).append(name)

def pick_canonical(names):
    # Drop sub-class variants like "MVS GOTO ..." in favor of plain "GOTO ..."
    candidates = [n for n in names if not re.match(r'^(MVS|DR)\s', n)]
    if not candidates: candidates = names
    return min(candidates, key=len)

pairs = {t: pick_canonical(ns) for t, ns in ticker_to_names.items()}

def fix_parens(s):
    s = re.sub(r'\(\s+', '(', s); s = re.sub(r'\s+\)', ')', s); return s

ACRONYMS = {
    'PT', 'XL', 'CIMB', 'BNI', 'BCA', 'BRI', 'BTPN', 'BFI', 'AKR',
    'MNC', 'GMF', 'KAI', 'IDX', 'OCBC', 'NISP', 'UOB', 'HSBC', 'IFG',
    'BSI', 'WSBP', 'KB', 'IBK', 'QNB', 'SMBC', 'BTN', 'CBP',
}

def smart_title(s):
    s = fix_parens(s); out = []
    for word in s.split():
        m = re.match(r'^(\W*)(.*?)(\W*)$', word)
        prefix, body, suffix = (m.group(1), m.group(2), m.group(3)) if m else ('', word, '')
        if not body: out.append(word); continue
        upper = body.upper()
        if body.lower() == 'tbk':       out.append(prefix + 'Tbk' + suffix)
        elif upper in ACRONYMS:          out.append(prefix + upper + suffix)
        elif body[0].isalpha():          out.append(prefix + body.capitalize() + suffix)
        else:                            out.append(word)
    return ' '.join(out)

out = {
    "_comment": "IDX ticker -> full company name. Sourced from KSEI ownership data, filtered to tickers actually present in our QDB universe, title-cased. Tickers absent fall back to the bare ticker."
}
for ticker in sorted(pairs):
    cleaned = smart_title(pairs[ticker])
    if not cleaned.startswith('PT '): cleaned = 'PT ' + cleaned
    out[ticker] = cleaned

print(json.dumps(out, indent=2, ensure_ascii=False))
PYEOF

Step 5 — Hand-clean the diff

Smart-title gets ~95% right, but a handful of brand acronyms come out wrong (e.g. XLSmart → Xlsmart). Diff against the previous version and patch the obvious ones in-place:

git diff go-api/internal/refdata/idx_symbols.json | head -100

Common touch-ups:

  • Add new acronym to the ACRONYMS set if it appears in >1 company name
  • Single-occurrence quirks: just hand-edit the JSON value
  • Verify BBCA, TLKM, BMRI, BBRI, GOTO as smoke-check anchors

Step 6 — Rebuild + verify live

docker compose -f docker-compose.dev.yml build api
docker compose -f docker-compose.dev.yml up -d api

# Smoke test: search "BBCA" should return the full company name
curl -sS -b /tmp/cookies.txt "http://localhost:18090/udf/search?query=BBCA&limit=3" \
  | python3 -m json.tool

Expected first result:

{
  "description": "PT Bank Central Asia Tbk",
  "exchange": "IDX",
  "full_name": "IDX:BBCA",
  "symbol": "BBCA",
  "ticker": "BBCA",
  "type": "stock"
}

How the lookup is wired

  • File: go-api/internal/refdata/idx_symbols.json — flat { "TICKER": "Full Name" } map plus a leading _comment key.
  • Loader: go-api/internal/refdata/refdata.go uses //go:embed to embed the JSON, parses it on package init, and exposes refdata.CompanyName(ticker).
  • Consumer: go-api/internal/handler/udf.go, where UDFSearch and UDFSymbolResolve call the helper stockDescription(ticker), which falls back to the bare ticker for unmapped equities.
  • Search filter: UDFSearch also matches against the company name — typing “Bank Central” finds BBCA.

Adding a single ticker without a full refresh

If only one or two new tickers need adding (e.g. an IPO this week), skip the pipeline and patch the JSON directly:

{
  …
  "NEWX": "PT New Company Tbk",
  …
}

Then rebuild api. Keep the file alphabetically sorted to keep diffs reviewable.
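
If hand-editing feels error-prone, the same patch can be scripted. The sketch below is one possible approach, not a committed tool: `add_ticker` is a hypothetical helper that inserts one entry, keeps the `_comment` key first, sorts the tickers, and writes the file back in the same `indent=2` style the refresh script produces.

```python
import json

def add_ticker(path, ticker, name):
    """Add or update one ticker in the symbols JSON, keeping keys sorted."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    data[ticker] = name
    comment = data.pop("_comment", None)
    ordered = {}
    if comment is not None:
        ordered["_comment"] = comment  # keep the leading comment key first
    for key in sorted(data):
        ordered[key] = data[key]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(ordered, f, indent=2, ensure_ascii=False)
        f.write("\n")
```

Example call from the repo root: `add_ticker("go-api/internal/refdata/idx_symbols.json", "NEWX", "PT New Company Tbk")`. Going through `json.load` also catches a trailing comma or unescaped quote before the API rebuild does.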

Troubleshooting

Symptom: Ticker still shows bare code in search after rebuild
Likely cause: JSON didn't actually change OR build didn't include refdata package
Fix: Verify with git diff + docker compose build api --no-cache

Symptom: Search returns ALL tickers as bare code
Likely cause: JSON parse error on init, likely a trailing comma or unescaped quote
Fix: Check api logs: docker logs idxmdp-api 2>&1 | grep "refdata:" (the loader logs "failed to parse" on bad JSON)

Symptom: Company name is title-cased weirdly (e.g. Xlsmart)
Likely cause: Brand acronym not in the ACRONYMS set, so smart-title fell back to str.capitalize()
Fix: Either hand-edit that one entry, or add the acronym to the set and regenerate

Symptom: KSEI repo at the GitHub URL is gone or stale
Likely cause: Maintainer abandoned it (happens with one-person open-source data mirrors)
Fix: Search GitHub for idx saham indonesia csv for an alternative mirror, OR fall back to a Playwright scraper against IDX directly

Operations Guide

Start · Stop · Status · Monitoring · Logging — everything you need to run the platform day-to-day

Full source: docs/OPERATIONS-GUIDE.md. This page mirrors the runbook verbatim for quick in-browser access. All commands run from the repo root: /home/testhink/idx-market-data-platform.

Quick Links

URL | What it shows
http://localhost:18082 | Ops Dashboard: system-wide status, services, tables, topics
/ops/latency | Latency monitor: per-table age, LIVE/STALE, write rate
/ops/ranking | Top 20 by volume/value/frequency
http://localhost:19000 | QuestDB console: SQL queries
http://localhost:28123 | ClickHouse HTTP: direct queries
http://localhost:13000 | Grafana (login: admin / admin)

1. Start / Stop

# Start everything (all containers, dependency order)
docker compose -f docker-compose.dev.yml up -d

# Start individual services
docker compose -f docker-compose.dev.yml up -d questdb
docker compose -f docker-compose.dev.yml up -d parser
docker compose -f docker-compose.dev.yml up -d metric-worker

# Stop everything (preserves volumes)
docker compose -f docker-compose.dev.yml down

# Full wipe (removes data volumes — re-apply schemas after)
docker compose -f docker-compose.dev.yml down -v

2. Status Check

# Visual (recommended)
# → http://localhost:18082 — auto-refresh every 10s

# CLI
docker compose -f docker-compose.dev.yml ps
docker logs idxmdp-parser --tail 20
docker logs idxmdp-metric-worker --tail 20
docker logs idxmdp-ch-drain --tail 20

3. Monitoring Commands

Kafka / Redpanda

docker exec idxmdp-redpanda rpk topic list
docker exec idxmdp-redpanda rpk topic consume idx.ohlcv --num 1
docker exec idxmdp-redpanda rpk group describe metric-worker-hd
docker exec idxmdp-redpanda rpk group describe ch-drain

QuestDB (hot storage)

curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20idx_ticks"
curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20metrics_hd"
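
The curl commands above hand-encode spaces as %20. For longer queries it may be easier to build the URL programmatically; a small sketch using only the standard library, where `questdb_exec_url` is a hypothetical helper and the base URL is the dev port used throughout this guide:

```python
from urllib.parse import quote, urlencode

def questdb_exec_url(sql, base="http://localhost:19000"):
    """Build a QuestDB /exec URL with the SQL percent-encoded."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return f"{base}/exec?" + urlencode({"query": sql}, quote_via=quote)
```

Parentheses come out as %28 and %29 here, which is equivalent to the unencoded form in the curl examples.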

ClickHouse (cold storage)

docker exec idxmdp-clickhouse clickhouse-client --database market_data \
  --query "SELECT count() FROM idx_ticks"

docker exec idxmdp-clickhouse clickhouse-client \
  --query "SHOW TABLES FROM market_data"

Redis (cache)

docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 GET last:hd:BBCA
docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 KEYS "last:hd:*"
docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 DBSIZE

4. Logs & Levels

# Follow all / specific service
docker compose -f docker-compose.dev.yml logs -f
docker compose -f docker-compose.dev.yml logs -f parser

# Last N lines
docker logs idxmdp-parser --tail 50

# RUST_LOG levels (set in docker-compose.dev.yml)
RUST_LOG=idx_parser=info        # default
RUST_LOG=idx_parser=debug       # every skipped message
RUST_LOG=idx_parser=warn        # errors only

All services use Docker json-file driver with rotation: max 50 MB per file × 3 files = ~150 MB per service.

5. Maintenance

# QuestDB retention (keep 14 days)
./scripts/questdb-retention.sh 14 --dry-run
./scripts/questdb-retention.sh 14

# Recommended cron — daily after market close (16:30 WIB)
# 30 16 * * 1-5 /home/testhink/idx-market-data-platform/scripts/questdb-retention.sh 14

# Rebuild after code change
docker compose -f docker-compose.dev.yml build parser
docker compose -f docker-compose.dev.yml up -d

6. Daily Workflow (Trading Day)

07:50   docker compose -f docker-compose.dev.yml up -d
        Open http://localhost:18082 — verify all green

08:30   Pre-market: check logview for SNAP/BID/ASK messages

09:00   Market opens: check TRADE messages in logview
        Dashboard candles building (BBCA, TLKM etc.)
        ops dashboard: metrics_hd row count growing

During  Monitor via ops dashboard (:18082)
        docker logs idxmdp-parser --tail 5
        docker logs idxmdp-metric-worker --tail 5

16:15   Market closes — aggregator flushes open bars
16:30   Run retention: ./scripts/questdb-retention.sh 14

7. Graceful Shutdown & Morning Startup

Principle: stop writers upstream-first, let the pipeline drain, verify zero in-flight, then stop storage bottom-up. Start in the exact reverse order. Never use down -v — that wipes volumes.

7.1 Shutdown — Phase 1: Pre-shutdown drain check

Do not start shutdown until you confirm the pipeline is idle (after 16:15 WIB post-trade window).

# 1a. Verify no new ticks are flowing (should be stable between two checks ~10s apart)
curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20idx_ticks%20WHERE%20ts%20%3E%20dateadd('m',-1,now())"

# 1b. Verify Kafka consumer lag is zero for every group / every partition
docker exec idxmdp-redpanda rpk group describe ch-drain
docker exec idxmdp-redpanda rpk group describe metric-worker-hd
docker exec idxmdp-redpanda rpk group describe metric-worker-rsi
# Every partition must show LAG = 0. If not, WAIT — do not proceed.

7.2 Shutdown — Phase 2: Run daily retention (optional)

./scripts/questdb-retention.sh 14

7.3 Shutdown — Phase 3: Stop writers upstream-first (drain the pipeline)

# 3a. Stop parser FIRST — cuts off new data at the source
docker compose -f docker-compose.dev.yml stop parser

# 3b. Wait ~30s for the aggregator to flush in-memory 1m bars to Kafka
sleep 30

# 3c. Re-verify consumer lag is zero
docker exec idxmdp-redpanda rpk group describe metric-worker-hd
docker exec idxmdp-redpanda rpk group describe ch-drain

# 3d. Stop metric-worker (drains idx.ohlcv → metrics_hd / metrics_rsi)
docker compose -f docker-compose.dev.yml stop metric-worker

# 3e. Stop ch-drain (drains idx.* → ClickHouse)
docker compose -f docker-compose.dev.yml stop ch-drain

7.4 Shutdown — Phase 4: Stop read-side tier

docker compose -f docker-compose.dev.yml stop api dashboard logview ops
docker compose -f docker-compose.dev.yml stop grafana alertmanager prometheus node-exporter redis-exporter

7.5 Shutdown — Phase 5: Snapshot data stores

# 5a. Redis — force a background save so in-memory state hits disk
docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 BGSAVE
sleep 3
docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 LASTSAVE

# 5b. ClickHouse — optional lazy merge (skip if in a hurry; merges happen next start)
docker exec idxmdp-clickhouse clickhouse-client --database market_data \
  -q "OPTIMIZE TABLE idx_ticks FINAL" 2>/dev/null || true
docker exec idxmdp-clickhouse clickhouse-client --database market_data \
  -q "OPTIMIZE TABLE idx_ohlcv FINAL" 2>/dev/null || true

# 5c. QuestDB auto-commits its WAL on clean stop — nothing to do

7.6 Shutdown — Phase 6: Stop infrastructure bottom-up

# Redpanda must stop AFTER all consumers are gone (done in §7.3)
docker compose -f docker-compose.dev.yml stop redpanda

# Then storage
docker compose -f docker-compose.dev.yml stop clickhouse questdb redis postgres

# Verify all stopped
docker compose -f docker-compose.dev.yml ps
Faster alternative after §7.3 drain: once writers are stopped and lag is zero, you can collapse §7.4–7.6 into a single docker compose -f docker-compose.dev.yml stop — graceful SIGTERM to everything. Volumes are preserved. Data-loss risk: zero.

7.7 Startup — Phase 1: Infrastructure first (T-60min, 08:00 WIB)

cd /home/testhink/idx-market-data-platform

# Bring up storage + Kafka first; wait for healthchecks
docker compose -f docker-compose.dev.yml up -d questdb redpanda clickhouse redis postgres

# Poll until all five are (healthy) — usually <30s
for i in 1 2 3 4 5 6; do
  docker compose -f docker-compose.dev.yml ps questdb redpanda clickhouse redis postgres
  sleep 5
done

7.8 Startup — Phase 2: Smoke-test infrastructure

# QuestDB — row counts persisted from last session
curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20idx_ticks"
curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20idx_ohlcv"

# Redpanda — topics present (disk-persisted)
docker exec idxmdp-redpanda rpk topic list

# ClickHouse — tables present
docker exec idxmdp-clickhouse clickhouse-client -q "SHOW TABLES FROM market_data"

# Redis — ping
docker exec idxmdp-redis redis-cli -a idxmdp_redis_dev_2026 PING

# Postgres — users table
docker exec idxmdp-postgres psql -U idxmdp -d idxmdp -c "SELECT count(*) FROM users"

If any check fails, STOP here. Do not start writers on top of broken storage.

7.9 Startup — Phase 3: Start consumers before the producer

# ch-drain + metric-worker first — so they're already consuming
# by the time parser starts firing. Prevents startup lag spikes.
docker compose -f docker-compose.dev.yml up -d ch-drain metric-worker

# Confirm they joined their consumer groups cleanly
docker logs --since 30s idxmdp-ch-drain 2>&1 | tail -10
docker logs --since 30s idxmdp-metric-worker 2>&1 | tail -10

7.10 Startup — Phase 4: API, monitoring, read-side tier

docker compose -f docker-compose.dev.yml up -d api
curl -sf http://localhost:18090/health && echo ' API OK'

docker compose -f docker-compose.dev.yml up -d dashboard logview ops
docker compose -f docker-compose.dev.yml up -d prometheus grafana alertmanager node-exporter redis-exporter

7.11 Startup — Phase 5: Parser LAST (T-30min, 08:30 WIB)

# Parser is the data source — start it last so everything downstream is listening
docker compose -f docker-compose.dev.yml up -d parser

# Watch it connect to RabbitMQ and start consuming
docker logs -f --since 10s idxmdp-parser
# Expect: "RabbitMQ connected", message decode counts climbing

7.12 Startup — Phase 6: Pre-market verification (T-30 to T-0)

# 6a. All containers running and healthy
docker compose -f docker-compose.dev.yml ps

# 6b. Parser metrics flowing
curl -s http://localhost:9464/metrics | grep -E 'idx_parser_itch_bytes_total|idx_parser_messages_total'

# 6c. Consumer groups rejoined with zero lag
docker exec idxmdp-redpanda rpk group describe ch-drain
docker exec idxmdp-redpanda rpk group describe metric-worker-hd

# 6d. Pre-market ticks flowing after 08:45 WIB
curl -s "http://localhost:19000/exec?query=SELECT%20count()%20FROM%20idx_ticks%20WHERE%20ts%20%3E%20dateadd('m',-5,now())"

# 6e. Ops dashboard visual check
# → http://localhost:18082  — all 15+ services green
# → http://localhost:13000/d/pipeline-flow  — Grafana pipeline-flow dashboard

7.13 Startup — Phase 7: Market open (09:00 WIB)

# First trades should arrive within the first minute
docker logs --since 2m idxmdp-parser 2>&1 | grep -i trade | head -5

# 1m bars start building at 09:01
curl -s "http://localhost:19000/exec?query=SELECT%20ticker%2C%20ts%2C%20close%20FROM%20idx_ohlcv%20WHERE%20ts%20%3E%20dateadd('m'%2C-2%2Cnow())%20LIMIT%205"

# Chart at /chart?symbol=BBCA should show live candles updating

7.14 Recovery — if something goes wrong

Symptom: Consumer lag growing on startup
Likely cause: metric-worker started before Kafka was ready
Fix: docker compose restart metric-worker after confirming Redpanda is healthy

Symptom: Parser can’t connect to RabbitMQ
Likely cause: Upstream RabbitMQ down or credentials expired
Fix: Check docker logs idxmdp-parser; verify .env AMQP URL

Symptom: QuestDB not accepting writes
Likely cause: WAL corruption or disk full
Fix: df -h; docker logs idxmdp-questdb

Symptom: ClickHouse slow to start
Likely cause: Large merge from prior OPTIMIZE FINAL
Fix: Wait; docker logs idxmdp-clickhouse

Symptom: Chart shows stale bars
Likely cause: Browser cache or stale connection
Fix: Hard reload (Ctrl+Shift+R); check /health

Symptom: Gaps in today’s data after startup
Likely cause: Parser missed early messages during startup lag
Fix: Use the Backfill Guide runbook

8. Troubleshooting

Symptom: No trade data
Check: rpk topic list (idx.ticks missing)
Fix: Market may be closed; check RabbitMQ connectivity

Symptom: metrics_hd empty
Check: docker logs idxmdp-metric-worker
Fix: Check idx.ohlcv exists (needs trade ticks first)

Symptom: ClickHouse empty
Check: docker logs idxmdp-ch-drain
Fix: Check consumer-group lag; old-format messages are skipped

Symptom: Container restarting
Check: docker logs <container> --tail 50
Fix: Check config/connection errors

Symptom: Port conflict
Check: ss -tlnp | grep <port>
Fix: Find the collision with docker ps

Monitoring Guide

Grafana · Prometheus · Alertmanager · Telegram — the authoritative monitoring runbook

Full source: docs/MONITORING-GUIDE.md. The shorter Monitoring page is a UI-level summary; this one is the operational bible.

1. Scope

In scope: container health, QuestDB ILP throughput, ClickHouse inserts, Redpanda consumer lag, Redis memory & clients, host CPU/disk/mem, Go API RPS/latency/errors, Telegram alert routing.

Out of scope: business metrics, billing state, user analytics, market-data correctness (HD accuracy — see Backfill Guide).

The Rust parser/metric-worker/ch-drain binaries do not currently expose Prometheus metrics. Placeholder scrape jobs exist at ports 9464/9465/9466 in monitoring/prometheus/prometheus.yml (commented out). S2 follow-up: axum-based /metrics endpoints in rust-workers/src/metrics_server.rs.

2. Scrape Targets

Container | Port | Path | Exposes
idxmdp-api | 2112 | /metrics | Go/Fiber: RPS, latency histograms, error counters, WS/SSE counts
idxmdp-questdb | 9003 | /metrics | ILP committed rows/sec, table row counts
idxmdp-clickhouse | 9363 | /metrics | insert rate, system metrics
idxmdp-redpanda | 9644 | /public_metrics | consumer lag per group/topic
idxmdp-node-exporter | 9100 | /metrics | host CPU / memory / disk
idxmdp-redis-exporter | 9121 | /metrics | Redis memory / client count

3. Access Points

Tool | URL | Auth
Grafana | http://localhost:13000 | admin / admin
Prometheus | http://localhost:19090 | none
Alertmanager | http://localhost:19093 | none

4. Start / Verify Stack

# Bring up monitoring
docker compose -f docker-compose.dev.yml up -d grafana prometheus alertmanager \
  node-exporter redis-exporter

# Verify each scrape target is UP
curl -s http://localhost:19090/api/v1/targets | jq '.data.activeTargets[] | {job:.labels.job, health:.health}'

# Check a specific metric is being scraped
curl -s 'http://localhost:19090/api/v1/query?query=up' | jq '.data.result'

5. Dashboards (pre-provisioned)

Grafana
IDX API Dashboard
10 panels — request rate, latency p50/p95/p99, 5xx rate, active WS/SSE, rate-limit fail-open, audit fallback writes, stale plugin activations.
Grafana
IDX Platform Overview
8 panels — service health, QuestDB ILP rows/sec, ClickHouse insert rate, Redpanda lag, Redis memory/clients, disk, CPU.

6. Alert Thresholds

Metric | Warning | Critical | Action
Feed → DB latency | >5s | >60s | Check parser logs, QuestDB health
API request p99 | >50ms | >200ms | DB pool exhaustion, slow queries
rate_limit_fail_open | >0 for 5m | >0 for 15m | Redis connectivity
audit_fallback_writes | >0 | >0 for 10m | PostgreSQL connectivity
Redpanda consumer lag | >1000 | >10000 | Metric worker or CH drain backpressure
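
For the purely numeric rows, the thresholds reduce to a two-step comparison. The sketch below is illustrative only: the dictionary keys are hypothetical names rather than real Prometheus metric names, and the duration-based rows ("for 5m", "for 15m") are left out because they need a time window, which Prometheus alert rules handle.

```python
# (warning, critical) thresholds taken from the numeric rows of the table.
THRESHOLDS = {
    "feed_to_db_latency_s": (5, 60),
    "api_p99_ms": (50, 200),
    "redpanda_consumer_lag": (1000, 10000),
}

def severity(metric, value):
    """Classify a reading as ok, warning, or critical (strictly greater-than)."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"
```

Because the comparisons are strict, a consumer lag of exactly 1000 is still "ok" and 1001 is the first "warning" reading.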

7. Telegram Routing

# Required env vars (set in .env before starting alertmanager)
TELEGRAM_BOT_TOKEN=...
TELEGRAM_CHAT_ID=...

# Test send
docker exec idxmdp-alertmanager amtool alert add test severity=warning \
  --alertmanager.url=http://localhost:9093
Without these env vars Alertmanager will restart-loop. Either set them or remove the alertmanager service from the compose file until Telegram is configured.

8. Common Troubleshooting

Symptom: Target DOWN in Prometheus
Diagnosis: curl http://<container>:<port>/metrics from inside idxmdp-prometheus
Fix: Port / network mismatch in prometheus.yml

Symptom: Grafana shows "No Data"
Diagnosis: Check time range, datasource URL (should be http://prometheus:9090)
Fix: Re-provision datasource from monitoring/grafana/

Symptom: Alertmanager restart-loop
Diagnosis: docker logs idxmdp-alertmanager
Fix: Set Telegram env vars or disable service

Symptom: QuestDB metric gaps
Diagnosis: QuestDB was restarted (counter reset)
Fix: Query counters with rate(), which compensates for counter resets; use resets() in PromQL to confirm that a restart occurred

Backfill Guide

Recover · Recompute · Restore — the data-recovery swiss army knife

Full source: docs/BACKFILL-GUIDE.md. All commands run inside idxmdp-parser. The binary is at /usr/local/bin/backfill.

Quick Reference

I need to… | Run this
Recompute 1m bars from raw ticks for a time window | Runbook A (§6.1)
Recompute HD/RSI metrics from existing 1m bars | Runbook B (§6.2)
Restore historical OHLCV from CSV files | Runbook C (§6.3)
Repair HD/RSI flat-day after metric-worker cold-start | Runbook D (§6.4): ./scripts/repair-metrics.sh
Where to look first when something goes wrong:
• Parser logs: docker logs --since 1h idxmdp-parser
• Aggregator metrics: curl -s http://localhost:9464/metrics | grep aggregator
• QuestDB console: http://localhost:19000
• ClickHouse: docker exec idxmdp-clickhouse clickhouse-client -q "SELECT count() FROM idx_ohlcv WHERE date='2026-04-08'"

1. Tiers Rebuilt

Tier | Storage | Subcommand
Hot OHLCV | QuestDB idx_ohlcv | ohlcv
Cold OHLCV | ClickHouse idx_ohlcv | ohlcv, restore
Metrics (HD/RSI) | QuestDB metrics_* | metric

All subcommands re-use the live pipeline's own Aggregator and MetricEngine code so output is byte-identical. Every run is idempotent.

2. Binary Location

The backfill binary lives at rust-workers/target/release/backfill after cargo build --release --bin backfill. For production runs, always invoke it inside the parser container (docker exec idxmdp-parser backfill…) so it uses the same network, env vars and config as the live workers. The host-side binary is only for --dry-run validation.

3. Subcommand Reference

3.1 ohlcv — re-aggregate 1-minute bars

# One day, all tickers
docker exec idxmdp-parser backfill ohlcv \
  --from 2026-04-07 --to 2026-04-08

# A single minute, single ticker
docker exec idxmdp-parser backfill ohlcv \
  --from 2026-04-07T09:14:00 --to 2026-04-07T09:15:00 --ticker BBCA

# Dry-run (count ticks, no writes)
docker exec idxmdp-parser backfill ohlcv \
  --from 2026-04-07 --to 2026-04-08 --dry-run

3.2 metric — recompute HD or RSI

docker exec idxmdp-parser backfill metric \
  --engine hd --from 2026-04-01 --to 2026-04-08

docker exec idxmdp-parser backfill metric \
  --engine rsi --from 2026-04-07 --to 2026-04-08 --ticker BBCA
Stateful engines (HD, RSI) require warm-up. The tool automatically queries bars before --from and feeds them into the engine without writing output. For RSI-14 specifically, widen the warm-up to at least 14 prior bars or the first 14 rows will be NaN/0.
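
Why the first rows come out empty without warm-up can be seen in a minimal sketch of a textbook Wilder-style RSI. This is an illustration only: the struct, its field names, and the 50.0 value for a flat series are assumptions, and the platform's MetricEngine may differ in detail. The point is that no value exists until `period` price changes have been consumed.

```rust
/// Textbook Wilder-style RSI, shown only to illustrate warm-up.
/// Hypothetical sketch; not the platform's MetricEngine.
struct Rsi {
    period: usize,
    prev_close: Option<f64>,
    avg_gain: f64,
    avg_loss: f64,
    changes_seen: usize,
}

impl Rsi {
    fn new(period: usize) -> Self {
        Rsi { period, prev_close: None, avg_gain: 0.0, avg_loss: 0.0, changes_seen: 0 }
    }

    /// Feed one close. Returns None until `period` price changes have been seen.
    fn update(&mut self, close: f64) -> Option<f64> {
        let prev = self.prev_close.replace(close)?; // first bar has no change yet
        let change = close - prev;
        let (gain, loss) = if change > 0.0 { (change, 0.0) } else { (0.0, -change) };
        let n = self.period as f64;
        self.changes_seen += 1;
        if self.changes_seen <= self.period {
            // Seed phase: accumulate the simple mean of the first `period` changes.
            self.avg_gain += gain / n;
            self.avg_loss += loss / n;
            if self.changes_seen < self.period {
                return None;
            }
        } else {
            // Steady state: Wilder smoothing.
            self.avg_gain = (self.avg_gain * (n - 1.0) + gain) / n;
            self.avg_loss = (self.avg_loss * (n - 1.0) + loss) / n;
        }
        let total = self.avg_gain + self.avg_loss;
        if total == 0.0 {
            return Some(50.0); // assumed convention for a perfectly flat series
        }
        Some(100.0 * (self.avg_gain / total))
    }
}

fn main() {
    let mut rsi = Rsi::new(14);
    for (i, close) in (1..=16).map(|c| 9000.0 + c as f64).enumerate() {
        match rsi.update(close) {
            Some(v) => println!("bar {}: RSI = {:.1}", i + 1, v),
            None => println!("bar {}: warming up", i + 1),
        }
    }
}
```

In this sketch the 14 bars before --from play the role of the "warming up" rows: they are consumed but produce no output.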

3.3 restore — import CSV into ClickHouse

docker cp /home/testhink/dumps/2026-Q1.csv idxmdp-parser:/tmp/2026-Q1.csv
docker exec idxmdp-parser backfill restore /tmp/2026-Q1.csv

4. Time Format & UTC Gotcha

Accepted forms (all interpreted as UTC, --from inclusive, --to exclusive):

  • YYYY-MM-DD, e.g. 2026-04-07 (midnight UTC)
  • YYYY-MM-DDTHH:MM:SS (ISO 8601), e.g. 2026-04-07T09:14:00
  • YYYY-MM-DD HH:MM:SS, e.g. 2026-04-07 09:14:00

IDX trades on WIB (UTC+7). Subtract 7 hours before passing to backfill.

Event | WIB | UTC
Session 1 open | 09:00 | 02:00
Session 1 close | 12:00 | 05:00
Session 2 open | 13:30 | 06:30
Pre-close auction | 16:14 | 09:14
Market close | 16:15 | 09:15
If --dry-run reports 0 ticks retrieved for a window when the market was open, you forgot to subtract 7 hours.
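
The subtraction can be sketched as a small helper. The function name and return shape are hypothetical; the only fact relied on is that WIB is a fixed UTC+7 offset with no daylight saving time. The day shift matters for early-morning WIB times, which land on the previous UTC calendar day.

```rust
/// WIB is a fixed UTC+7 offset (no daylight saving time).
const WIB_OFFSET_MIN: i32 = 7 * 60;

/// Convert a WIB wall-clock time to UTC.
/// Returns (day_shift, hour, minute); day_shift is -1 when the UTC time
/// falls on the previous calendar day.
fn wib_to_utc(hour: u32, minute: u32) -> (i32, u32, u32) {
    let total = hour as i32 * 60 + minute as i32 - WIB_OFFSET_MIN;
    let day_shift = total.div_euclid(24 * 60);
    let rem = total.rem_euclid(24 * 60);
    (day_shift, (rem / 60) as u32, (rem % 60) as u32)
}

fn main() {
    for (label, h, m) in [("Session 1 open", 9, 0), ("Pre-close auction", 16, 14)] {
        let (shift, uh, um) = wib_to_utc(h, m);
        println!("{}: {:02}:{:02} WIB = {:02}:{:02} UTC (day shift {})", label, h, m, uh, um, shift);
    }
}
```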

5. Runbook A — Recover a missing minute bar

# 1. Detect (QuestDB)
SELECT ticker, count() FROM idx_ohlcv
WHERE ts >= '2026-04-07T09:14:00.000Z'
  AND ts <  '2026-04-07T09:15:00.000Z'
GROUP BY ticker;

# 2. Dry-run
docker exec idxmdp-parser backfill ohlcv \
  --from 2026-04-07T09:14:00 --to 2026-04-07T09:15:00 --dry-run

# 3. Run
docker exec idxmdp-parser backfill ohlcv \
  --from 2026-04-07T09:14:00 --to 2026-04-07T09:15:00

# 4. Verify — both tiers should now match

6. Runbook B — Recompute HD/RSI for a full day

# 1. Detect gaps
SELECT count() FROM metrics_hd
WHERE ts >= '2026-04-07T02:00:00.000Z'
  AND ts <  '2026-04-07T09:30:00.000Z';

# 2. Dry-run (watch for "Warm-up complete: N bars replayed")
docker exec idxmdp-parser backfill metric \
  --engine hd --from 2026-04-07 --to 2026-04-08 --dry-run

# 3. Run
docker exec idxmdp-parser backfill metric \
  --engine hd --from 2026-04-07 --to 2026-04-08

# 4. Verify
SELECT symbol, count(), min(ts), max(ts)
FROM metrics_hd WHERE ts IN '2026-04-07'
GROUP BY symbol LIMIT 10;

Note: IN '2026-04-07' is QuestDB shorthand for the entire UTC day.

7. Runbook C — Restore ClickHouse from CSV

docker cp /home/testhink/dumps/idx_ohlcv-2026Q1.csv idxmdp-parser:/tmp/restore.csv
docker exec idxmdp-parser backfill restore /tmp/restore.csv --dry-run
docker exec idxmdp-parser backfill restore /tmp/restore.csv

# Then re-run Runbook B to rebuild dependent metrics

8. Runbook D — Repair HD/RSI after a metric-worker cold-start

Use this when an unexpected restart leaves metric-worker running with empty in-memory state. The symptom is HD value=0.0 (or RSI stuck at 50.0) for every bar of the affected day across all symbols. Root cause: metric-worker started before ClickHouse was ready to serve queries, so warm-up logged "warm-up failed: ... — starting cold" and the engine never seeded.

1. Detect — confirm the cold-start case (not a real flat market):

-- QuestDB. zero_cnt ≈ total_cnt means the whole day is bogus.
SELECT count(*) AS total_cnt,
       sum(case when value=0 then 1 else 0 end) AS zero_cnt,
       count(distinct symbol) AS syms
FROM metrics_hd WHERE ts >= '2026-05-07';

# Cross-check metric-worker startup log
docker logs idxmdp-metric-worker --since 24h | grep -E "warm-up failed|warm-up: [0-9]+ bars replayed"

A warm-up failed: ... line on the most recent restart confirms the diagnosis.

2. One-command fix:

# today (UTC)
./scripts/repair-metrics.sh

# a specific past day
./scripts/repair-metrics.sh 2026-05-07

The script does four steps (not three — the first version of this runbook missed step 4, leaving the chart showing huge |delta| bars at the day boundary because ClickHouse still held the cold-start zeros while QuestDB had been repaired):

  1. Restarts idxmdp-metric-worker and waits for warm-up: N bars replayed (rebuilds in-memory state from ClickHouse).
  2. Drops the bad day's partition in QuestDB metrics_hd and metrics_rsi.
  3. Re-runs backfill metric --engine hd and --engine rsi for that day — both subcommands do their own warm-up before processing, so output values are continuous with the prior day. Writes to QuestDB only.
  4. Runs sync-metrics-hd-to-ch.py and sync-metrics-rsi-to-ch.py to mirror the corrected rows into ClickHouse metrics_hd / metrics_rsi. Charts at 1D+ resolution query CH (not QDB), so without this step the histogram shows tall green+red bars at the boundary where CH has stale zeros while QDB is clean.

3. Verify — pick a known symbol and confirm continuity:

-- QuestDB. Today's first value should equal yesterday's last value
-- (first bar of a new day inherits prev-day frozen DMAPO).
SELECT ts, value FROM metrics_hd
WHERE symbol = 'PIPA' AND ts >= '2026-05-06' AND ts < '2026-05-08'
ORDER BY ts;
Idempotent — safe to re-run. Elapsed time: ~60s warmup + ~25s HD backfill + ~25s RSI backfill. Not the fix: seeding refdata / symbol map — refdata is unaffected; this is purely an in-memory engine-state problem.

9. Troubleshooting

Where to look first

docker logs --since 1h idxmdp-parser
docker logs --since 1h idxmdp-metric-worker
docker logs --since 1h idxmdp-ch-drain

# QuestDB console
# → http://localhost:19000

# ClickHouse client
docker exec -it idxmdp-clickhouse clickhouse-client --database market_data

# Prometheus (backfill + aggregator counters)
# → http://localhost:19090/graph?g0.expr=aggregator_bars_emitted_total
# → http://localhost:19090/graph?g0.expr=backfill_rows_written_total
Symptom | Likely cause | Fix
Retrieved 0 ticks from QuestDB | Passed WIB time as UTC | Subtract 7 hours
Authentication failed | Wrong IDX__CLICKHOUSE__* | docker exec idxmdp-parser env | grep CLICKHOUSE
QuestDB ILP: connect refused | Running on host not in container | Use docker exec idxmdp-parser…
No bars found for the given range | Empty idx_ohlcv for window | Run backfill ohlcv first
No hd output produced | Warm-up window had no prior bars | Widen --from

Data Pipeline Flow

End-to-end flow from IDX ITCH feed to client dashboards and AmiBroker plugin

Full source: docs/DATA-PIPELINE-FLOW.md. The Data Pipeline page in the Overview group is a visual summary; this page contains the narrative.

Stage Map

1
IDX Exchange → RabbitMQ
Binary ITCH v5 feed published to two upstream AMQP exchanges (itchdata, idxdata).
2
Rust Parser (idxmdp-parser)
Zero-alloc decoder on 9 threads. Consumes from both exchanges with prefetch=500, emits into 9 crossbeam bounded channels for backpressure. 3.1M msg/s at <0.3µs per message.
3
QuestDB (hot)
6 parallel ILP writers over TCP (port 9009, NODELAY). Batch 500 msgs or 100ms flush. 7 tables: idx_ticks, idx_ohlcv, idx_snapshot, idx_orderbook, idx_index, idx_contracts, metrics_hd.
4
Redpanda (message bus)
Kafka-compatible single binary. 6 topics with 10 partitions each. Five are active: idx.ticks, idx.ohlcv, idx.snapshot, idx.index, metrics.hd. The sixth, the orderbook topic, is currently disabled (it accounts for 82% of volume and will be re-enabled for M4).
5
Metric Worker (idxmdp-metric-worker)
Kafka consumer group metric-worker-hd. Consumes idx.ohlcv, runs HD 7-step pipeline + RSI-14, writes to QuestDB metrics_hd/metrics_rsi + Redis cache + ClickHouse archive.
6
ClickHouse Drain (idxmdp-ch-drain)
Consumer group ch-drain. Reads all topics, inserts into ClickHouse ReplacingMergeTree. Typically runs after market close (16:30 WIB).
7
Go API (idxmdp-api)
Single Fiber binary serving Admin / Portal / Data API on :18090. REST + WebSocket + SSE + TradingView UDF. Reads from QuestDB (recent), ClickHouse (history), Redis (latest metrics).
8
Clients
TradingView charting library (web), AmiBroker plugin (desktop), admin HTMX panel.

Topic & Table Matrix

Data | Kafka topic | QuestDB table | ClickHouse table
Raw trades | idx.ticks | idx_ticks | idx_ticks
1m bars | idx.ohlcv | idx_ohlcv | idx_ohlcv
Snapshots | idx.snapshot | idx_snapshot | idx_snapshot
Orderbook | deferred | idx_orderbook | -
Index | idx.index | idx_index | idx_index
HD metric | metrics.hd | metrics_hd | metrics_hd

Retention

Tier | QuestDB | ClickHouse
Ticks / snapshots / orderbook | 3 days | unlimited
1m OHLCV | 14 days | unlimited
Metrics (HD / RSI) | 14 days | unlimited

Partition TTL runs as a cron job after market close. See Operations Guide §5.

Parser Tech Spec

Rust ITCH parser — 9-thread pipeline, zero-alloc decoder, 3.1M msg/s

Full source: docs/PARSER-TECH-SPEC.md.

Architecture — 9 Thread Pipeline

RabbitMQ ──▶ Consumer (x2, prefetch=500)
                  │
                  ▼
           Parser threads (x1, zero-alloc decoder)
                  │
        ┌─────────┼──────────┐
        ▼         ▼          ▼
   Aggregator  ILP Writers  Kafka Producer
   (1m bars)   (x6, NODELAY) (async)
        │         │          │
        ▼         ▼          ▼
     idx_ohlcv  6 QDB tables Redpanda (6 topics)

Supported Message Types

Type | Name | Purpose
1 | Trade | Price / volume / board (RG / NG / TN)
2 | Snapshot | OHLC + volume + value per ticker
3 | Orderbook Bid | Top-of-book bid update
4 | Orderbook Ask | Top-of-book ask update
5 | Index | IHSG and sector indices
6 | Contract | Instrument metadata
9 | Heartbeat / status | Feed health

Performance

Metric | Value | Notes
Parse rate | 3.1M msg/s | Criterion bench, fat LTO + native CPU
Per-message | <0.3 µs | Zero allocations on the hot path
Live load | ~12K msg/s | 250x headroom vs live feed
Channel capacity | 65K / 65K / 256–8K | Consumer → parser → writers

Board Filter

IDX has three boards: RG (Regular — real market price discovery), NG (Negotiated — off-market), TN (Tunai / Cash). Only RG ticks flow into idx_ohlcv. Filtering out NG/TN matches the Go reference implementation and prevents negotiated trades from distorting OHLCV.
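
A minimal sketch of the filter follows. The type and field names are hypothetical and the prices are made-up sample values; the only rule taken from the text is that RG trades alone contribute to OHLCV. Note how the negotiated trade would otherwise set the bar's high.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Board {
    Rg, // Regular
    Ng, // Negotiated
    Tn, // Tunai / Cash
}

impl Board {
    /// Parse the board code; unknown codes yield None (an assumption of this sketch).
    fn parse(code: &str) -> Option<Board> {
        match code {
            "RG" => Some(Board::Rg),
            "NG" => Some(Board::Ng),
            "TN" => Some(Board::Tn),
            _ => None,
        }
    }

    /// Only the Regular board feeds idx_ohlcv.
    fn feeds_ohlcv(self) -> bool {
        self == Board::Rg
    }
}

/// Hypothetical trade record with just the fields this sketch needs.
struct Trade {
    board: Board,
    price: u32,
}

/// High price of a bar, computed from RG trades only.
fn bar_high(trades: &[Trade]) -> Option<u32> {
    trades.iter().filter(|t| t.board.feeds_ohlcv()).map(|t| t.price).max()
}

fn main() {
    let trades = [
        Trade { board: Board::parse("RG").unwrap(), price: 9000 },
        Trade { board: Board::parse("NG").unwrap(), price: 12000 }, // off-market, ignored
        Trade { board: Board::parse("RG").unwrap(), price: 9050 },
    ];
    println!("bar high = {:?}", bar_high(&trades));
}
```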

Build Flags

# Cargo.toml release profile
[profile.release]
lto = "fat"
codegen-units = 1
panic = "abort"

# Build command
RUSTFLAGS="-C target-cpu=native" cargo build --release

Bench & Testing Guide

Criterion benchmarks, TSV history tracking, HD accuracy fixtures

Full source: docs/BENCH-GUIDE.md. All commands run from the repo root.

Run Benchmarks

# Full bench suite (from host, NOT container)
cd rust-workers && cargo bench

# Record a bench point with a note in docs/bench-history.tsv
./scripts/bench-record.sh "hd engine v2 with EMA fast-path"

# View historical trend
cat docs/bench-history.tsv

HD Accuracy Fixture

The HD engine has a 100% accuracy gate in CI: the Rust implementation is compared byte-for-byte against the Go reference across 606 tickers. The fixture lives at fixture/hd-accuracy.csv.

# Run the accuracy test locally
cd rust-workers && cargo test --release hd_accuracy

# If it fails, diff is written to /tmp/hd-diff.csv

Unit Tests

# Rust workers
cd rust-workers && cargo test --release

# Go API
cd go-api && go test ./...

# Go API with race detector
cd go-api && go test -race ./...

Replay Fixtures

# Replay a recorded ITCH dump into the live pipeline
docker exec idxmdp-parser replay /etc/idx-parser/fixture/itch-low26.txt

# Large fixture (10.3M messages)
docker exec idxmdp-parser replay /etc/idx-parser/fixture/full-day.txt

Project Status

Stage-by-stage progress snapshot — what's done, what's next, what's on hold

Full source: docs/PROJECT-STATUS.md. See also the Project Stages page for the detailed phase tree inside each stage.

Current Status

Completed
S1–S6, S9
Parser, Metrics, API, Auth, Monitoring
Next
S10
Client dashboard
On Hold
S7, S8, S11
Screener, Scraper, Payments

Stage Roster

Stage | Scope | Status
S1 | Parser Backbone: Rust ITCH decoder, QuestDB ingestion, OHLCV aggregator | DONE
S2 | Metric Pipeline: HD (7-step, 100% Go match), RSI-14, Redpanda, ClickHouse drain, backfill tool | DONE
S3 | API: Go/Fiber single binary with Admin / Portal / Data API, 16 Postgres tables | DONE
S4 | Telegram Bot: status alerts, admin commands | DONE
S5 | TradingView charting, HD/RSI chart overlays | DONE
S6 | RSI engine integration into metric worker + chart | DONE
S7 | Screener | HOLD
S8 | Broker scraper | HOLD
S9 | Monitoring: Prometheus, Grafana, Alertmanager, Telegram, Instatus | DONE
S10 | Client dashboard | NEXT
S11 | Payments (Xendit) | HOLD

Recent Milestones

  • 2026-04-08 — Operations / Monitoring guides expanded; backfill-guide quality fixes (TL;DR, RSI warm-up note, troubleshooting "where to look").
  • 2026-04-07 — S2-B5 HD accuracy fixture at 100% match vs Go reference across 606 tickers.
  • 2026-04-03 — S3 API DONE: 1 binary, 3 domains, 16 PG tables, HTMX admin, Xendit prep.
  • 2026-04-02 — S2 backfill tool + CI/CD gate DONE.

Enterprise Code Review

Comprehensive E2E audit — 2026-04-12 — Branch: feat/ch-qdb-hybrid-query

This review covers the full platform: Rust workers, Go API, infrastructure, security, and end-to-end data pipeline. Findings are grouped by severity with reasoning and file references.

Audit Scope

Five parallel review agents examined distinct domains simultaneously. Each agent performed an independent deep read of all files in its domain.

Domain | Agent Focus | Files Reviewed
Rust Workers | Consumer, parser, aggregator, db_writer, engines, ch_drain, metric_worker | 18 files
Go API | Handlers, middleware, DB clients, cache, audit, tier system, tests | 25+ files
Infrastructure | Docker Compose (dev/staging/prod), Dockerfiles, schemas, monitoring | 15 files
Security | OWASP Top 10, auth/authz, credential management, injection surfaces | All public endpoints + middleware
Data Pipeline | End-to-end trace from Kafka ingest to API response | 13 files in data path order

Findings Overview

Critical
12
Fix before Monday trading
Important
21
Fix this sprint
Minor
12
Backlog
Total
45
Unique de-duplicated findings

Data Flow with Failure Points

The diagram below traces a single market tick from vendor Redpanda through every processing stage to the client API response. Red markers indicate where data can be silently lost.

1
Vendor Redpanda (port 19092, SASL SCRAM-SHA-256)
Two topics: itchdata (snapshots, orderbook, index) and idxdata (trade ticks). Two independent consumer threads with separate group IDs. If one consumer fails, the other keeps flowing.
2
consumer.rs → crossbeam channel (65,536 cap)
Backpressure via pause()/resume() on Kafka partitions. When the channel is full, the consumer pauses fetching and retries with 50ms sleep. Auto-commit is true (5s interval) — offset commits happen based on wall time, not processing completion.
Backpressure: correct
3
parser.rs → aggregator.rs
Pipe-delimited ASCII parsing. Trade ticks use blocking send (never dropped). Snapshots, orderbook, and index use try_send — dropped if channel full.
Snapshot/Index: Kafka published BEFORE drop check
OHLCV bars: silent let _ = try_send
4
db_writer.rs → QuestDB ILP (TCP)
ILP over TCP with reconnect-once-then-drop. If QuestDB is down and the second write attempt also fails, the entire batch is permanently lost. The caller always calls buf.clear() regardless of success.
CRITICAL: batch lost on double-write failure
5
kafka_producer.rs → Internal Redpanda (29092)
Fire-and-forget publish. No backpressure propagation from internal Redpanda back to the consumer. If internal Redpanda is full, messages are silently dropped by librdkafka’s internal queue.
6
ch_drain.rs → ClickHouse
Consumes from internal Redpanda. Batches 1000 rows, HTTP INSERT with 3 retries. Two critical bugs: (a) auto-commit fires before CH insert confirms — on crash, offset is advanced but rows never reach CH. (b) buf.rows.clear() runs unconditionally even when insert fails — rows are permanently discarded after 3 retries.
CRITICAL: auto-commit before insert
CRITICAL: rows cleared on failure
7
metric_worker.rs → HD/RSI Engines → QuestDB + Redis
Warm-up from ClickHouse on startup (entire history buffered into RAM, an OOM risk). auto.offset.reset=latest means fresh consumer groups skip all historical messages. Sink errors (QDB + Redis) are logged, but the offset is committed anyway.
OOM risk on warm-up
Stale-seeded metrics post-restart
8
Go API → Hybrid CH + QDB Queries → Client
Split at now-1h: ClickHouse for cold data, QuestDB for hot. QDB wins on overlap. MetricHistory incorrectly calls QueryOHLCV (price bars) for the CH window instead of QueryMetrics (HD/RSI values).
MetricHistory CH window returns wrong data type
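
The "QDB wins on overlap" rule in stage 8 can be sketched independently of the handler language. The block below is a language-neutral illustration written in Rust; the Bar type and function name are hypothetical, and it assumes rows are keyed by timestamp only.

```rust
use std::collections::BTreeMap;

/// Hypothetical row shape: a timestamp (epoch seconds) and one value.
#[derive(Debug, Clone, PartialEq)]
struct Bar {
    ts: i64,
    close: f64,
}

/// Merge cold (ClickHouse) and hot (QuestDB) rows by timestamp.
/// Hot rows are inserted last, so they overwrite cold rows on overlap.
fn merge_hot_wins(cold: Vec<Bar>, hot: Vec<Bar>) -> Vec<Bar> {
    let mut by_ts = BTreeMap::new();
    for bar in cold {
        by_ts.insert(bar.ts, bar);
    }
    for bar in hot {
        by_ts.insert(bar.ts, bar);
    }
    by_ts.into_values().collect() // BTreeMap iterates in ascending ts order
}

fn main() {
    let cold = vec![Bar { ts: 60, close: 1.0 }, Bar { ts: 120, close: 2.0 }];
    let hot = vec![Bar { ts: 120, close: 2.5 }, Bar { ts: 180, close: 3.0 }];
    for bar in merge_hot_wins(cold, hot) {
        println!("{:?}", bar);
    }
}
```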

Severity Definitions

Level | Definition | SLA
CRITICAL | Data loss, security breach, or build failure. System is actively vulnerable or losing data. | Fix before next trading session
IMPORTANT | Reliability risk, correctness bug, or defense-in-depth gap. Not actively exploited but will cause incidents. | Fix within 1 week
MINOR | Code quality, latent risk, or operational improvement. Low probability of triggering. | Backlog

Data Pipeline Audit

End-to-end trace from Kafka ingest to API response — every failure point identified

C4: ch_drain Auto-Commits Before ClickHouse Insert

CRITICAL rust-workers/src/bin/ch_drain.rs:313-314

What happens: The ch_drain Kafka consumer uses enable.auto.commit = true with a 5-second interval. The consumer accumulates rows in memory and flushes to ClickHouse when the batch reaches 1,000 rows or 5 seconds pass. Since auto-commit fires on wall time (not on successful insert), there is a race window on every cycle.

Failure scenario:

  1. Consumer polls and buffers 800 rows over 4.5 seconds
  2. At 5.0 seconds, Kafka auto-commit fires — offsets for those 800 messages are committed to the broker
  3. At 5.0 seconds, the flush timer also fires — HTTP INSERT to ClickHouse begins
  4. ClickHouse returns 503 (overloaded). All 3 retry attempts fail
  5. The 800 rows are discarded (see C5 below). The offsets are already committed. Those messages will never be replayed

Impact: Up to 1,000 rows per table per event are permanently lost from ClickHouse cold storage. The user sees gaps in historical charts that don’t appear in QuestDB (which got the data via direct ILP write).

Fix: Switch to enable.auto.commit = false and commit offsets manually after ch.insert_json_rows() succeeds. The metric_worker already does this correctly at worker/mod.rs:177.

C5: ch_drain Clears Row Buffer on Insert Failure

CRITICAL rust-workers/src/bin/ch_drain.rs:424-431

What happens: After the insert attempt (success or failure), buf.rows.clear() and buf.last_flush = Instant::now() execute unconditionally — they are outside the match arms.

match ch.insert_json_rows(&database, buf.table, &buf.rows).await {
    Ok(()) => { total_inserted += count; ... }
    Err(e) => { total_errors += count; ... }
}
// OUTSIDE the match — runs regardless:
buf.rows.clear();          // ← rows gone forever
buf.last_flush = Instant::now();

Combined with C4: This guarantees that any ClickHouse hiccup results in permanent data loss. The rows cannot be retried (cleared from memory) and cannot be replayed from Kafka (offset already committed).

Fix: Move buf.rows.clear() into the Ok(()) arm only. On failure, leave rows in the buffer for retry on the next flush cycle.
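
Taken together, the C4 and C5 fixes mean that both the row buffer and the committed offset advance only after a confirmed insert. The sketch below models that ordering with the standard library alone. TableBuf, its fields, and the closure standing in for the ClickHouse insert are all hypothetical; the real code would call the Kafka client's manual commit where this sketch updates committed_offset.

```rust
/// Hypothetical stand-in for the per-table buffer in ch_drain.
struct TableBuf {
    rows: Vec<String>,
    /// Highest Kafka offset represented in `rows`.
    pending_offset: Option<i64>,
    /// Last offset considered committed.
    committed_offset: Option<i64>,
}

impl TableBuf {
    fn new() -> Self {
        TableBuf { rows: Vec::new(), pending_offset: None, committed_offset: None }
    }

    fn push(&mut self, offset: i64, row: String) {
        self.rows.push(row);
        self.pending_offset = Some(offset);
    }

    /// Flush through an arbitrary insert function.
    /// Rows are cleared and the offset is committed only on success.
    fn flush<F>(&mut self, mut insert: F) -> Result<usize, String>
    where
        F: FnMut(&[String]) -> Result<(), String>,
    {
        if self.rows.is_empty() {
            return Ok(0);
        }
        insert(&self.rows)?; // on Err: return early, rows stay buffered, nothing committed
        let flushed = self.rows.len();
        self.committed_offset = self.pending_offset.take(); // commit after the insert is confirmed
        self.rows.clear();
        Ok(flushed)
    }
}

fn main() {
    let mut buf = TableBuf::new();
    buf.push(41, "row-a".to_string());
    buf.push(42, "row-b".to_string());

    let first = buf.flush(|_rows| Err("503 Service Unavailable".to_string()));
    println!("first: {:?}, buffered = {}, committed = {:?}", first, buf.rows.len(), buf.committed_offset);

    let second = buf.flush(|_rows| Ok(()));
    println!("second: {:?}, buffered = {}, committed = {:?}", second, buf.rows.len(), buf.committed_offset);
}
```

A production version would also need an upper bound on the retained buffer (or would pause consumption) so that a long ClickHouse outage cannot grow memory without limit; that part is not shown.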

C6: db_writer Drops Entire Batch on QuestDB Double-Failure

CRITICAL rust-workers/src/db_writer.rs:148-156

What happens: The ILP writer attempts a TCP write. On failure, it reconnects and retries once. If the retry also fails, the function returns without error — and the caller’s macro always calls buf.clear() afterward.

fn flush(stream: &mut TcpStream, buf: &str, addr: &str, label: &str) {
    if let Err(e) = stream.write_all(buf.as_bytes()) {
        *stream = connect(addr, label);             // reconnect
        if let Err(e2) = stream.write_all(buf.as_bytes()) {
            tracing::error!("{}: retry write failed, {} bytes lost", ...);
            // ← returns normally, caller will .clear() the buffer
        }
    }
}

Impact: A QuestDB outage lasting more than one flush cycle (typically seconds) causes silent loss of ticks and OHLCV bars from the hot store. Since the parser doesn’t retry at the channel level, these rows are gone.

Fix: Return Result from flush(). On failure, skip buf.clear() so the batch is preserved for the next flush attempt. Add an idx_parser_db_writer_rows_lost_total counter.
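
One possible shape for that fix is sketched below. It is generic over std::io::Write so it can run without a QuestDB instance; try_flush, flush_batch, the reconnect closure, and the Flaky test writer are hypothetical names, not the existing db_writer API. The metrics counter is left out.

```rust
use std::io::{self, Write};

/// Write the batch, retrying once on a fresh connection from `reconnect`.
/// Returns Err if the reconnect or the second write fails.
fn try_flush<W: Write>(
    stream: &mut W,
    batch: &str,
    mut reconnect: impl FnMut() -> io::Result<W>,
) -> io::Result<()> {
    if stream.write_all(batch.as_bytes()).is_ok() {
        return Ok(());
    }
    *stream = reconnect()?;
    stream.write_all(batch.as_bytes())
}

/// Caller side: clear the buffer only when the flush succeeded.
fn flush_batch<W: Write>(
    stream: &mut W,
    buf: &mut String,
    reconnect: impl FnMut() -> io::Result<W>,
) -> bool {
    match try_flush(stream, buf.as_str(), reconnect) {
        Ok(()) => {
            buf.clear();
            true
        }
        Err(_) => false, // batch stays in `buf` for the next attempt
    }
}

/// Test double: a writer that is either healthy or always failing.
struct Flaky {
    healthy: bool,
    written: Vec<u8>,
}

impl Write for Flaky {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if !self.healthy {
            return Err(io::Error::new(io::ErrorKind::BrokenPipe, "down"));
        }
        self.written.extend_from_slice(data);
        Ok(data.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut stream = Flaky { healthy: false, written: Vec::new() };
    let mut buf = String::from("row-1\n");
    let sent = flush_batch(&mut stream, &mut buf, || Ok(Flaky { healthy: false, written: Vec::new() }));
    println!("outage: sent = {}, buffered bytes = {}", sent, buf.len());
    let sent = flush_batch(&mut stream, &mut buf, || Ok(Flaky { healthy: true, written: Vec::new() }));
    println!("recovered: sent = {}, buffered bytes = {}", sent, buf.len());
}
```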

I1: Aggregator OHLCV Bars Silently Dropped

IMPORTANT rust-workers/src/aggregator.rs:277-279

What happens: Completed OHLCV bars are sent via let _ = self.ohlcv_tx.try_send(cb). The let _ = pattern discards the Result — if the channel is full, the bar vanishes with no log, no metric, no alert.

Why it matters: A slow QuestDB writer causes the OHLCV channel to fill. Bars dropped here affect both QuestDB and ClickHouse (since the Kafka producer is downstream). Unlike tick drops (logged with a warning), OHLCV drops are completely invisible.

Fix: Log on drop and increment idx_parser_aggregator_ohlcv_drops_total.
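
A minimal sketch of that fix, using std::sync::mpsc::sync_channel as a stand-in for the crossbeam bounded channel named in the review. The Bar type, emit_bar, and the plain AtomicU64 counter are hypothetical; in the real worker the counter would be the Prometheus metric named above.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical completed-bar type.
#[derive(Debug)]
struct Bar {
    ticker: &'static str,
    close: u32,
}

/// Send a bar without blocking. On a full channel, count and log the drop
/// instead of discarding the Result.
fn emit_bar(tx: &SyncSender<Bar>, drops: &AtomicU64, bar: Bar) -> bool {
    match tx.try_send(bar) {
        Ok(()) => true,
        Err(TrySendError::Full(dropped)) => {
            drops.fetch_add(1, Ordering::Relaxed);
            eprintln!("ohlcv channel full, dropped bar {} close={}", dropped.ticker, dropped.close);
            false
        }
        Err(TrySendError::Disconnected(dropped)) => {
            eprintln!("ohlcv channel disconnected, lost bar {}", dropped.ticker);
            false
        }
    }
}

fn main() {
    let drops = AtomicU64::new(0);
    let (tx, rx) = sync_channel::<Bar>(1);
    emit_bar(&tx, &drops, Bar { ticker: "BBCA", close: 9000 });
    emit_bar(&tx, &drops, Bar { ticker: "PIPA", close: 120 }); // channel is full here
    println!("received {:?}, drops = {}", rx.try_recv().ok(), drops.load(Ordering::Relaxed));
}
```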

I3: Snapshot Published to Kafka Before QDB Drop Check

IMPORTANT rust-workers/src/pipeline.rs:251,272

What happens: The Kafka publish (kp.send_snapshot) fires before the try_send to the QuestDB channel. If the QDB channel is full, the snapshot is sent to Kafka (and eventually to ClickHouse via ch_drain) but never reaches QuestDB.

Impact: ClickHouse has snapshot/index rows that QuestDB doesn’t. Any query that reads from QuestDB exclusively will see gaps. The same pattern affects idx_index messages.

I4: Orderbook Depth Indexing Mismatch

IMPORTANT rust-workers/src/db_writer.rs:82 vs ch_drain.rs:212

Destination | Depth Index | Best Bid/Ask
QuestDB (ILP writer) | 0-based (enumerate() starts at 0) | depth = 0
ClickHouse (ch_drain) | 1-based (i + 1) | depth = 1

Any cross-database join or comparison on orderbook depth is off-by-one. Screener features that merge both sources will silently mis-label price levels.
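
Until both writers agree on one convention, any code that compares the two stores has to normalize the depth first. The helper below is a hypothetical sketch of that normalization; choosing zero-based as the common form is an assumption made here, not a decision recorded in the review.

```rust
/// Which store a depth value came from.
#[derive(Clone, Copy)]
enum Source {
    QuestDb,    // writes 0-based depth
    ClickHouse, // writes 1-based depth
}

/// Map a stored depth value to a zero-based level (0 = best bid/ask).
/// Returns None for a ClickHouse depth of 0, which is invalid under 1-based indexing.
fn level_from_depth(source: Source, depth: u32) -> Option<u32> {
    match source {
        Source::QuestDb => Some(depth),
        Source::ClickHouse => depth.checked_sub(1),
    }
}

fn main() {
    println!("QuestDB depth 0    -> level {:?}", level_from_depth(Source::QuestDb, 0));
    println!("ClickHouse depth 1 -> level {:?}", level_from_depth(Source::ClickHouse, 1));
}
```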

I5: MetricHistory Returns Wrong Data Type from ClickHouse

IMPORTANT go-api/internal/handler/data.go:337-355

What happens: The MetricHistory handler (for HD/RSI values) calls ch.QueryOHLCV() for the ClickHouse time window. QueryOHLCV reads from idx_ohlcv (price bars: open, high, low, close, volume) — not from metrics_hd or metrics_rsi (metric values: value, mapo, direction).

Impact: For any request where the date range extends more than 1 hour into the past, the CH portion returns OHLCV price bars instead of HD/RSI metric values. The QDB portion (last hour) is correct. The merged response is structurally valid JSON but semantically wrong — the client chart displays price data where it expects metric indicator values.

Fix: Create ch.QueryMetrics(ctx, symbol, metric, from, to) that reads from metrics_hd/metrics_rsi and call it from MetricHistory.

I10: HDChart hotCut Computed Per-Goroutine

IMPORTANT go-api/internal/handler/hdchart.go:127

What happens: The HD chart handler spawns 6 goroutines (one per timeframe: 1m, 5m, 15m, 30m, 1h, 1d). Each goroutine computes hotCut = time.Now().Add(-time.Hour) independently. If goroutine scheduling crosses a second boundary, different timeframes use different split points.

Impact: The 6-timeframe chart response has inconsistent overlap boundaries. One timeframe may show a gap or duplicate bar at the split point while others are clean. Visible as chart glitches near the 1-hour mark.

Fix: Compute hotCut once before launching goroutines and pass it as a parameter.

Rust Workers Review

Consumer, parser, aggregator, db_writer, Kafka producer, engines, ch_drain, metric_worker

C2: Credentials in Committed config.toml

CRITICAL rust-workers/config.toml:6,12,34

The Kafka SASL password (bridge2025!) and Redis password (idxmdp_redis_dev_2026) are in config.toml, which is committed to git and COPY-ed into the Docker image (Dockerfile line 72). Anyone with repo access or image registry access can extract these credentials.

Why this matters: The Kafka credentials grant read access to the live IDX market data feed. The Redis password grants access to cached metrics, session data, and tier configuration.

Fix: Replace config.toml values with placeholders. Supply real credentials only via IDX__* environment variables at runtime. The settings loader already supports this — config.toml should be a template, not a credential store.

I6: Unsafe UTF-8 in kafka_producer.rs

IMPORTANT rust-workers/src/kafka_producer.rs:58-98

What happens: The producer uses unsafe { self.buf.as_mut_vec() } to write JSON directly into a String’s internal buffer via serde_json::to_writer. If serde_json encounters an IO error mid-write, the String may contain partial UTF-8, violating its invariant and causing undefined behaviour on any subsequent string operation.

Fix: Replace with serde_json::to_string(tick) which is safe and has negligible cost since the buffer is cloned anyway.

I8: metric_worker auto.offset.reset=latest

IMPORTANT rust-workers/src/worker/mod.rs:81

What happens: When the metric worker starts with a new consumer group (no committed offset), it only processes bars arriving after startup. The warm-up from ClickHouse compensates, but if ch_drain is also lagging, the warm-up produces incomplete state.

Impact: HD/RSI values appear correct after ~14 bars but are stale-seeded for the first few minutes post-restart. This is visible as a small “jump” in the metric chart immediately after parser restart.

I9: warmup.rs Buffers Entire CH History in RAM

IMPORTANT rust-workers/src/worker/warmup.rs:52-53

What happens: resp.text().await? buffers the complete ClickHouse response (all rows from idx_ohlcv) into a single String before line-by-line parsing begins. At 700 symbols × 330 bars/day × N days, this easily reaches hundreds of MB.

Fix: Add WHERE ts >= now() - INTERVAL 30 DAY to bound the warm-up query, or use reqwest::Response::bytes_stream() for streaming line-by-line processing.
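
The streaming half of that fix can be illustrated with std::io::BufRead, which keeps only one line in memory at a time. The tab-separated "symbol, close" row format and the replay_lines name are assumptions for this sketch, and a Cursor stands in for the HTTP response body; wiring it to a real response stream is not shown.

```rust
use std::io::{BufRead, Cursor};

/// Parse "symbol<TAB>close" rows one line at a time, reusing a single line buffer.
/// Returns the number of rows handed to `on_bar`; malformed lines are skipped.
fn replay_lines<R: BufRead>(
    mut reader: R,
    mut on_bar: impl FnMut(&str, f64),
) -> std::io::Result<usize> {
    let mut line = String::new();
    let mut rows = 0;
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // end of stream
        }
        let mut cols = line.trim_end().split('\t');
        if let (Some(symbol), Some(close)) = (cols.next(), cols.next()) {
            if let Ok(close) = close.parse::<f64>() {
                on_bar(symbol, close);
                rows += 1;
            }
        }
    }
    Ok(rows)
}

fn main() -> std::io::Result<()> {
    let body = Cursor::new("BBCA\t9000\nPIPA\t120.5\n");
    let rows = replay_lines(body, |symbol, close| println!("{} {}", symbol, close))?;
    println!("replayed {} rows", rows);
    Ok(())
}
```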

I21: String Keys in HD/RSI Engine Hot Path

IMPORTANT rust-workers/src/engine/hd.rs:117

What happens: The engine uses FxHashMap<String, HdTickerState>. The .entry(symbol.to_owned()) call allocates a new heap String on every bar (~1,400/s), even for tickers already in the map. The main parser correctly uses fixed [u8; 16] keys.

Fix: Use SmallVec<[u8; 16]> or a fixed-size array key, matching the aggregator’s pattern.
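
The fixed-size array variant might look like the sketch below. std::collections::HashMap stands in for FxHashMap so the example has no external dependencies, and sym_key, TickerState, and on_bar are hypothetical names. Because [u8; 16] is Copy, the entry lookup does not touch the heap for tickers already in the map.

```rust
use std::collections::HashMap;

type SymKey = [u8; 16];

/// Pack a ticker into a zero-padded 16-byte key without heap allocation.
/// Returns None if the symbol is longer than 16 bytes.
fn sym_key(symbol: &str) -> Option<SymKey> {
    let bytes = symbol.as_bytes();
    if bytes.len() > 16 {
        return None;
    }
    let mut key = [0u8; 16];
    key[..bytes.len()].copy_from_slice(bytes);
    Some(key)
}

/// Hypothetical per-ticker engine state.
#[derive(Default)]
struct TickerState {
    bars_seen: u64,
}

/// Update state for one bar; the key is built on the stack, so no String is allocated.
fn on_bar(states: &mut HashMap<SymKey, TickerState>, symbol: &str) -> Option<u64> {
    let key = sym_key(symbol)?;
    let state = states.entry(key).or_default();
    state.bars_seen += 1;
    Some(state.bars_seen)
}

fn main() {
    let mut states = HashMap::new();
    on_bar(&mut states, "BBCA");
    let seen = on_bar(&mut states, "BBCA");
    println!("BBCA bars seen: {:?}, tickers tracked: {}", seen, states.len());
}
```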

M1: parse_u32 Wrapping Overflow

MINOR rust-workers/src/parser.rs:403-408

The parser uses wrapping_mul/wrapping_add to avoid panics, but a 10-digit volume string like "5000000000" passes the length guard (len ≤ 10) yet overflows a u32 (max 4,294,967,295). The wrapping produces a silently wrong value (705,032,704 instead of 5,000,000,000).

Fix: Replace with checked_mul(10)?.checked_add(...)? to return None on overflow.
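
A sketch of the checked variant, next to a wrapping variant that reproduces the wrong value quoted above. Both function names are hypothetical and the surrounding parser code is not shown; only the arithmetic is illustrated.

```rust
/// Parse ASCII digits into a u32, returning None on overflow or bad input.
fn parse_u32_checked(digits: &[u8]) -> Option<u32> {
    if digits.is_empty() || digits.len() > 10 {
        return None;
    }
    let mut value: u32 = 0;
    for &b in digits {
        if !b.is_ascii_digit() {
            return None;
        }
        value = value.checked_mul(10)?.checked_add((b - b'0') as u32)?;
    }
    Some(value)
}

/// Wrapping variant, kept only to show the silent overflow (assumes digit-only input).
fn parse_u32_wrapping(digits: &[u8]) -> u32 {
    let mut value: u32 = 0;
    for &b in digits {
        value = value.wrapping_mul(10).wrapping_add(b.wrapping_sub(b'0') as u32);
    }
    value
}

fn main() {
    let input = b"5000000000";
    println!("wrapping: {}", parse_u32_wrapping(input));
    println!("checked:  {:?}", parse_u32_checked(input));
}
```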

M3: Pipeline Comment Says “RabbitMQ”

MINOR rust-workers/src/pipeline.rs:19

Comment says “Raw bytes from RabbitMQ consumer” — should say Kafka/Redpanda after the migration.

M4: Redis KEYS Command in Ops Dashboard

MINOR rust-workers/src/bin/ops.rs:277

KEYS last:hd:* is O(N) and blocks Redis. Safe at current scale (~2,800 keys) but runs every 10 seconds on the ops dashboard. Replace with SCAN for non-blocking enumeration.

Go API Review

Handlers, middleware, DB clients, cache, audit, tier system, tests

C9: Redis Key Injection via Unvalidated metric Param

CRITICAL go-api/internal/handler/data.go:281

key := "last:" + metric + ":" + symbol

symbol is validated by ValidTicker (regex ^[A-Z0-9]{1,10}$), but metric is taken directly from c.Query("metric") with zero validation. An attacker can supply ?metric=../../session to probe arbitrary Redis key namespaces, potentially reading session tokens or rate-limit buckets.

Fix:

var validMetrics = map[string]struct{}{"hd": {}, "rsi": {}}
if _, ok := validMetrics[metric]; !ok {
    return c.Status(400).JSON(models.Err("invalid metric"))
}

C10: Silent Date Parse Failure Bypasses Tier Limits

CRITICAL go-api/internal/handler/data.go:322-323

from, _ := time.Parse("2006-01-02", fromStr)
to, _   := time.Parse("2006-01-02", toStr)

Parse errors are silently discarded. When time.Parse fails, it returns the zero time (0001-01-01 00:00:00 UTC). The history-depth clamp on line 327 clamps this to earliest = now - HistoryDays, which works for non-enterprise tiers. But enterprise tiers with IsUnlimitedHistory() == true skip the clamp entirely, passing year 0001 to the database and returning the entire ClickHouse history.

Compare with: OHLCVCached (lines 169-175) correctly returns HTTP 400 on parse failure.

Fix: Return 400 on bad date, matching the existing pattern.

I2: indexCache Thundering Herd

IMPORTANT go-api/internal/handler/udf.go:56-75

What happens: The index name cache uses a read-unlock → check-freshness → re-lock-and-write pattern with no singleflight guard. When the 5-minute TTL expires, all concurrent TradingView chart loads simultaneously call qdb.Indices(ctx) instead of just one. TradingView fires multiple /udf/symbols and /udf/search requests per chart load.

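Fix: Use golang.org/x/sync/singleflight to deduplicate concurrent refreshes. sync.Once alone does not fit, because it runs only once per instance while the refresh must repeat on every TTL expiry.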

I7: Tier Subscriber Has No Reconnect

IMPORTANT go-api/internal/config/tier_subscriber.go:41-44

When the Redis pubsub channel closes (Redis restart, network blip), the subscriber goroutine logs a warning and exits permanently. After this, no tier hot-reloads will be applied to this API instance until the process restarts. This is the only background job without reconnect logic — StartAuditReplay and StartKeyExpiry both have retry loops.

Fix: Wrap the subscribe/listen loop in an outer reconnect loop with exponential backoff.

I20: Admin Self-Promotion of Tier

IMPORTANT go-api/internal/handler/admin/users_edit.go:109-111

The admin UserEdit handler correctly gates role changes to superadmin-only. But tier changes have no such restriction. An admin can pass their own user ID and promote their account tier to “enterprise”, bypassing billing entirely.

Fix: Prevent admins from editing their own tier, or require superadmin role for tier changes.

Other Important Findings

ID | Finding | File
I17 | TokenRefresh doesn’t verify API key is still active before issuing new JWT | handler/auth.go:59-90
I19 | generateSecureToken ignores rand.Read error (zero-entropy token on failure) | handler/auth_session.go:440-443
I6 (audit) | Partial JSONL write on WriteByte('\n') failure corrupts fallback file | audit/async.go:117-135

Minor Findings

ID | Finding | File
M5 | UDF W/M resolutions silently alias to 1d instead of returning error | handler/udf.go:17-29
M6 | CSRF cookie missing Secure flag | middleware/csrf.go:29-35
M7 | Session cookie missing Secure flag | handler/auth_session.go:83-90

Security Audit

OWASP Top 10 — authentication, authorization, injection, credential management

C1: Telegram Bot Token in .env.example

CRITICAL .env.example:151

TELEGRAM_BOT_TOKEN=8715922974:AAHWx7cmM6WL1QD2CfMoqzZDwTPRVMA5a0s

.env.example is explicitly allowed through .gitignore (line 6: !.env.example), meaning it is committed and visible in git history. This token follows the exact structure of a real Telegram bot API token. Possession grants full bot control: receiving alert messages, sending to channels, enumerating chat IDs.

Immediate action: Revoke via @BotFather (/revoke), generate new token, replace in .env.example with CHANGE_ME_YOUR_BOT_TOKEN.

C3: /ops/* Endpoints Have Zero Authentication

CRITICAL go-api/cmd/server/main.go:177-180

Four routes registered before the JWT auth group:

app.Get("/ops/latency", handler.OpsLatencyPage())
app.Get("/ops/ranking", handler.OpsRankingPage())
app.Get("/ops/api/latency", handler.OpsLatencyAPI(qdb, ch, rdb, pool))
app.Get("/ops/api/ranking", handler.OpsRankingAPI(qdb))

/ops/api/latency returns: QuestDB table names and row counts, ClickHouse row counts, Redis DB size, PostgreSQL pool stats, Redpanda consumer group lag with topic names. This is a full infrastructure inventory that directly aids targeted attacks.

Fix: Wrap in SessionAuth + RequireSessionRole("admin").

C7: ClickHouse Has Empty Password

CRITICAL .env:31, docker-compose.dev.yml:98

The default ClickHouse user operates with no password. The clickhouse-users.xml network restriction allows connections from the entire Docker subnet 172.0.0.0/8. Any container on any bridge network on the host can query, insert, or drop all market data tables with no authentication.

C8: Prometheus Metrics Exposed Externally

CRITICAL go-api/cmd/server/main.go:358

http.ListenAndServe(":2112", mux)

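Bound to 0.0.0.0:2112 and published via Docker. The /metrics endpoint leaks rate limit counters, session counts, active WebSocket connections, and audit buffer stats. For a financial SaaS, this is operational intelligence for an attacker.

Fix: Stop exposing the port outside the host. Either remove the ports mapping for 2112 or publish it on loopback only (127.0.0.1:2112:2112), and let Prometheus scrape over the internal Docker network. Binding the Go listener itself to 127.0.0.1:2112 works only if Prometheus shares the container's network namespace; from a separate container, a loopback-bound listener is unreachable.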

I15: /udf/history Bypasses All Auth and Tier Limits

IMPORTANT go-api/cmd/server/main.go:199-204

The TradingView UDF endpoints are entirely unauthenticated. /udf/history proxies to the same QuestDB and ClickHouse backends as the paid /v1/ohlcv endpoint and returns full OHLCV data with no tier enforcement, no rate limiting, and no history depth limits. Anyone who reverse-engineers the UDF URL gets free access to data that customers pay for.

Fix: Gate the UDF group behind SessionAuth, or apply the same tier-based limits from OHLCVCached to UDFHistory.

I16: Rate Limiter Fails Open

IMPORTANT go-api/internal/middleware/ratelimit.go:39-42

Failing open is a documented design decision, but in a financial SaaS context where tier enforcement is revenue-critical, any Redis disruption (OOM, network partition, or key eviction under allkeys-lru) makes all rate limits disappear. A free-tier user becomes unlimited.

I18: Plugin Report Accepts Unauthenticated Data

IMPORTANT go-api/internal/handler/plugin.go:159-208

POST /v1/plugin/report is outside the JWT group. An unauthenticated attacker can write arbitrary strings into the reports table and inject content into structured log output. If logs are forwarded to a SIEM, this is log injection.
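
Authentication is the primary fix. As defense in depth, untrusted fields can also be neutralised before they reach a log line. The sketch below is one way to do that, with a hypothetical helper name and an arbitrary length cap.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizeField makes an untrusted string safe to embed in a single log
// line: control characters (including CR and LF) become spaces and the
// result is cut off after maxRunes runes. The limit is a hypothetical choice.
func sanitizeField(s string, maxRunes int) string {
	var b strings.Builder
	n := 0
	for _, r := range s {
		if n >= maxRunes {
			break
		}
		if unicode.IsControl(r) {
			r = ' '
		}
		b.WriteRune(r)
		n++
	}
	return b.String()
}

func main() {
	raw := "crash\n2026-04-12 level=error msg=\"forged entry\""
	fmt.Printf("%q\n", sanitizeField(raw, 64))
}
```

With newlines replaced, a forged payload stays inside the field it was submitted in and cannot start a new log record.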

Other Security Findings

ID | Severity | Finding
I17 | IMPORTANT | TokenRefresh does not check if API key is still active before issuing new JWT
I19 | IMPORTANT | generateSecureToken ignores rand.Read error: potential zero-entropy token
M8 | MINOR | Failed login attempts not audited (OWASP A09)
M6/M7 | MINOR | CSRF + session cookies missing Secure flag

Infrastructure Review

Docker Compose, Dockerfiles, schemas, monitoring, networking

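C11: go-api/Dockerfile References Go 1.25 (Flagged as Nonexistent; Verify)

CRITICAL go-api/Dockerfile:1

FROM golang:1.25-alpine AS builder

This tag was flagged on the premise that Go 1.25 does not exist and that 1.24.x is the latest stable release as of April 2026. That premise is out of date: Go 1.25 was released in August 2025, so golang:1.25-alpine should resolve on Docker Hub. Confirm with docker pull golang:1.25-alpine in the build environment before acting. If the pull does fail (for example, behind a registry mirror that lacks the tag), docker build fails with an image-not-found error, and because the CI job and the dev compose api service both use this Dockerfile, the API image cannot be built.

Fix: If the tag cannot be pulled, change to a tag the registry serves, such as golang:1.24-alpine. Otherwise no change is needed and the severity should be downgraded.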

C12: Postgres Audit Log Partitions Only Cover Through 2025-06

CRITICAL schema/postgres.sql:91-95

Only two partitions exist: audit_log_2025_01 (Jan 2025) and audit_log_2025_06 (Jun 2025). Today is 2026-04-12. Any audit write with created_at ≥ 2025-07-01 fails with PostgreSQL's "no partition of relation ... found for row" error, because no partition covers that range. All current audit logging is broken.

Fix: Create partitions for the current date range:

CREATE TABLE audit_log_2025_07 PARTITION OF audit_log
  FOR VALUES FROM ('2025-07-01') TO ('2026-01-01');
CREATE TABLE audit_log_2026_01 PARTITION OF audit_log
  FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');

I11: Alertmanager Has No Active Receiver

IMPORTANT monitoring/alertmanager/alertmanager.yml:57

The telegram-ops receiver has no telegram_configs (commented out). Alerts fire in Prometheus, reach Alertmanager, and are recorded only in the in-memory UI — no Telegram, no email, no PagerDuty. A QuestDBDown or APIDown critical alert will silently expire after 5 minutes without operator notification.

I12: No Memory/CPU Limits on Any Container

IMPORTANT docker-compose.dev.yml (all 18 services)

No service has deploy.resources.limits. On a single-host deployment, a runaway ClickHouse query or Redpanda log storm can OOM the entire host and take down all 18 containers simultaneously.

I13: Parser Missing depends_on for Outbound Redpanda

IMPORTANT docker-compose.dev.yml:247

The parser publishes to idxmdp-redpanda:9092 but only depends on questdb. If outbound-redpanda is not yet healthy at parser startup, initial publish attempts fail silently.

Other Infrastructure Findings

ID | Severity | Finding
I14 | IMPORTANT | metric-worker has no Docker health check
M9 | MINOR | QuestDB ILP port 19009 exposed to 0.0.0.0 with no auth
M10 | MINOR | Staging/prod compose references non-existent Dockerfile.drain and target: runtime
M11 | MINOR | Production QuestDB volume mount path wrong (/var/lib/questdb vs /root/.questdb)
M12 | MINOR | ClickHouse image tag 24.3 is floating; should pin to patch

Action Plan

Prioritised fix schedule with effort estimates

Before Monday Trading (Tonight)

These items are either actively losing data, actively exploitable, or blocking builds. Fix before 08:45 WIB Monday.

# | Action | Files | Est.
1 | C1 Revoke Telegram bot token via @BotFather, replace with placeholder in .env.example | .env.example | 2 min
2 | C4+C5 Fix ch_drain: disable auto-commit, commit after successful insert, move buf.rows.clear() into Ok arm | ch_drain.rs | 30 min
3 | C3 Add auth to /ops/*: wrap in SessionAuth + RequireSessionRole("admin") | main.go | 15 min
4 | C9 Validate metric param: whitelist {"hd","rsi"} | data.go:281 | 5 min
5 | C10 Return 400 on bad date in MetricHistory | data.go:322 | 5 min
6 | C12 Create Postgres audit log partitions for 2025-07 through 2027-01 | postgres.sql | 10 min

This Week

Reliability and defense-in-depth improvements. Schedule across sprint.

# | Action | Files
7 | C2 Scrub config.toml: placeholder values only, rotate bridge2025! | config.toml, .env
8 | C6 Fix db_writer: return Result from flush, preserve buffer on failure | db_writer.rs
9 | C7 Set ClickHouse password, restrict network access | .env, clickhouse-users.xml
10 | C8 Bind Prometheus metrics to 127.0.0.1:2112 | main.go, docker-compose
11 | C11 Fix Go version in Dockerfile (1.25 → 1.24) | go-api/Dockerfile
12 | I1 Add drop counters to aggregator OHLCV channel | aggregator.rs
13 | I4 Fix orderbook depth consistency (0-based everywhere) | ch_drain.rs
14 | I5 Implement ch.QueryMetrics() for MetricHistory CH window | clickhouse/client.go, data.go
15 | I7 Add reconnect loop to SubscribeTierConfig | tier_subscriber.go
16 | I15 Add auth/tier limits to UDF endpoints | main.go, udf.go

Backlog

# | Action | Files
17 | I6 Remove unsafe from kafka_producer.rs | kafka_producer.rs
18 | I9 Bound warm-up query or stream response | warmup.rs
19 | I10 Compute hotCut once before goroutines | hdchart.go
20 | I11 Wire Alertmanager Telegram receiver | alertmanager.yml
21 | I12 Add container resource limits | docker-compose.dev.yml
22 | I21 Optimize engine hash map keys | hd.rs, rsi.rs
23 | MINOR Remaining 12 minor findings | Various

Review methodology: Five parallel agents, each with isolated context, ran for roughly 4 to 5 minutes each, for a total of about 5 minutes of wall-clock time. Cross-domain findings were de-duplicated manually. Confidence scores for critical findings ranged from 80% to 100%.