
Infrastructure Architecture

Overview

OnChain Software's blockchain data infrastructure is built around a single high-performance node running an Erigon client that synchronizes blockchain data into a ClickHouse database. This foundation supports all our applications and provides reliable data access for trading algorithms, analytics platforms, and customer-facing products.

Node Architecture

Our infrastructure currently consists of a single production node with the following components:

  • Erigon Client: High-performance Ethereum client for blockchain synchronization
  • ClickHouse Database: Columnar database optimized for analytical queries
  • Core Tables: Three fundamental tables capturing all on-chain activity:
    • Blocks: Block headers and metadata
    • Transactions: Individual transaction details
    • Logs: Smart contract events and logs
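
To make the three core tables concrete, here is a minimal sketch that renders ClickHouse DDL for them from Python. Every column name and type below is an assumption chosen for illustration, as are the `Table` helper and the `block_timestamp` partition column; the actual production schema is not shown in this document. The monthly `PARTITION BY` clause is one possible way to partition by time.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Table:
    """Hypothetical description of one core table (not the real schema)."""
    name: str
    columns: Dict[str, str]          # column name -> ClickHouse type
    order_by: Tuple[str, ...]        # sorting key for the MergeTree engine
    time_column: str = "block_timestamp"

    def ddl(self) -> str:
        """Render a CREATE TABLE statement partitioned by month."""
        cols = ",\n".join(f"    {c} {t}" for c, t in self.columns.items())
        return (
            f"CREATE TABLE IF NOT EXISTS {self.name} (\n{cols}\n)\n"
            f"ENGINE = MergeTree\n"
            f"PARTITION BY toYYYYMM({self.time_column})\n"
            f"ORDER BY ({', '.join(self.order_by)})"
        )


# Assumed columns; adjust to the real schema.
BLOCKS = Table(
    "blocks",
    {"number": "UInt64", "hash": "String",
     "block_timestamp": "DateTime", "gas_used": "UInt64"},
    ("number",),
)
TRANSACTIONS = Table(
    "transactions",
    {"block_number": "UInt64", "transaction_index": "UInt32", "hash": "String",
     "from_address": "String", "to_address": "Nullable(String)",
     "block_timestamp": "DateTime"},
    ("block_number", "transaction_index"),
)
LOGS = Table(
    "logs",
    {"block_number": "UInt64", "log_index": "UInt32", "address": "String",
     "topics": "Array(String)", "data": "String",
     "block_timestamp": "DateTime"},
    ("block_number", "log_index"),
)
CORE_TABLES = [BLOCKS, TRANSACTIONS, LOGS]

if __name__ == "__main__":
    for table in CORE_TABLES:
        print(table.ddl(), end=";\n\n")
```

Running the script prints one statement per table, which could then be reviewed and applied with a ClickHouse client.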

Data Pipeline

  1. Real-time Collection: The Erigon client synchronizes with the Ethereum network on every block
  2. Storage: Raw blockchain data is stored in the ClickHouse core tables
  3. Processing: Derived tables and views extract specific data patterns (DEX swaps, price feeds, etc.)
  4. Access: (In Development) ClickHouse endpoint exposure for application connectivity
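
The processing step can be pictured as a filter over raw logs. The sketch below is an in-memory illustration only: in the pipeline described here this work would be done by derived tables and views inside ClickHouse, and the row layout (`topics`, `address`, `data` keys) and the function name `extract_events` are assumptions. An event type such as a DEX swap is identified by the first topic of the log, which is the hash of the event signature.

```python
from typing import Any, Dict, Iterable, List

Log = Dict[str, Any]


def extract_events(logs: Iterable[Log], topic0: str) -> List[Log]:
    """Return the logs whose first topic matches the given event signature hash.

    Comparison is case-insensitive because hex strings may be stored in
    either case. Logs without topics (anonymous events) are skipped.
    """
    wanted = topic0.lower()
    return [
        log for log in logs
        if log.get("topics") and log["topics"][0].lower() == wanted
    ]


if __name__ == "__main__":
    # Placeholder hashes, not real event signatures.
    SWAP_TOPIC = "0xAAAA"
    sample_logs = [
        {"address": "0xpool1", "topics": ["0xaaaa", "0x01"], "data": "0x"},
        {"address": "0xtoken", "topics": ["0xbbbb"], "data": "0x"},
        {"address": "0xother", "topics": [], "data": "0x"},
    ]
    for swap in extract_events(sample_logs, SWAP_TOPIC):
        print(swap["address"])
```

Keeping the signature hash as a parameter means the same filter can back several derived datasets, one per event type.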

Supported Networks

Current: Ethereum Mainnet

  • Full historical data from genesis
  • Real-time synchronization with latest blocks
  • Complete transaction and event log coverage

Planned Network Expansion

We're planning to expand our data coverage to include:

  • Base (Coinbase L2)
  • Optimism (Ethereum L2)
  • BNB Chain (Binance Smart Chain)
  • Polygon (Polygon PoS)
  • Additional networks based on product requirements

Data Quality & Update Frequencies

Node Data Freshness

  • Real-time Sync: Blockchain data is updated on every Ethereum block (~12 seconds)
  • Core Tables: Blocks, Transactions, and Logs are populated immediately upon block confirmation
  • Derived Tables: Materialized views and aggregations update automatically as new data arrives
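
A freshness check follows directly from the ~12 second block interval: compare the timestamp of the newest stored block with the current time and express the gap in blocks. The sketch below is one possible way to do that; the function names and the two-block tolerance are assumptions, not documented thresholds.

```python
import time

BLOCK_TIME_S = 12  # approximate Ethereum block interval in seconds


def sync_lag(latest_block_ts: int, now_ts: int,
             block_time_s: int = BLOCK_TIME_S) -> int:
    """Approximate number of blocks the stored head is behind wall-clock time."""
    return max(0, (now_ts - latest_block_ts) // block_time_s)


def is_fresh(latest_block_ts: int, now_ts: int,
             max_lag_blocks: int = 2) -> bool:
    """True if the stored head is within the tolerated number of blocks."""
    return sync_lag(latest_block_ts, now_ts) <= max_lag_blocks


if __name__ == "__main__":
    # Pretend the newest stored block was produced 30 seconds ago.
    now = int(time.time())
    print(sync_lag(now - 30, now), is_fresh(now - 30, now))
```

In practice `latest_block_ts` would come from a query for the maximum block timestamp in the blocks table.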

Catalog Documentation Updates

  • Minimum Frequency: Schema and documentation are updated at least weekly
  • Target Frequency: Daily updates to reflect infrastructure changes
  • Version Control: All changes tracked through Git for complete audit trail

Data Quality Assurance

  • Partitioned Storage: Tables partitioned by time for optimal query performance
  • Automated Processing: Materialized views ensure derived data consistency
  • Schema Validation: All incoming data validated against defined schemas
  • Monitoring: Continuous monitoring of data pipeline health and sync status
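
As an illustration of schema validation, the following sketch checks an incoming row against an expected set of fields and Python types before it would be inserted. The `LOG_SCHEMA` fields and the `validate_row` helper are hypothetical; the validation rules actually used in the pipeline are not described in this document.

```python
from typing import Any, Dict, List

# Assumed shape of a decoded log row; field names are illustrative.
LOG_SCHEMA: Dict[str, type] = {
    "block_number": int,
    "log_index": int,
    "address": str,
    "topics": list,
    "data": str,
}


def validate_row(row: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for field, expected in schema.items():
        if field not in row:
            errors.append(f"missing field: {field}")
            continue
        value = row[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


if __name__ == "__main__":
    bad_row = {"block_number": "18", "log_index": 0,
               "address": "0xabc", "topics": []}
    for problem in validate_row(bad_row, LOG_SCHEMA):
        print(problem)
```

Returning every problem at once, rather than raising on the first one, makes rejected rows easier to diagnose from monitoring output.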

Available Datasets

Ethereum Network Data

Our Ethereum dataset provides comprehensive blockchain information including:

  • Blocks: Core blockchain blocks with metadata and gas metrics
  • Transactions: Individual transaction details and execution results
  • Logs: Smart contract event logs and blockchain events
  • DEX Trading Data: Decentralized exchange swap transactions and liquidity metrics

Price & Market Data

Time-series pricing information optimized for analytics:

Materialized Views & Aggregations

Pre-computed data for enhanced query performance: