Skip to content

Data Access & Integration

Current Status: Development Phase

We are actively developing the data access layer with the following planned architecture:

  • ClickHouse Endpoint: Direct database access via HTTP/TCP protocols
  • API Server: Intermediary server for authentication and rate limiting
  • Customer Access: Secure endpoint exposure for paying customers and partners

Integration Patterns

Internal Applications

Direct ClickHouse Connections - Maximum performance for internal tools - Native SQL query capabilities - Real-time data access - Custom aggregation support

Use Cases: - Trading algorithm data feeds - Internal analytics dashboards - Risk management systems - Portfolio tracking applications

External Partners

API-based Access - Authentication and authorization - Rate limiting and usage tracking - Billing integration - Standardized data formats

Access Methods: - RESTful API endpoints - GraphQL query interface - WebSocket streaming for real-time data - Bulk data export capabilities

Development Tools

MCP Server Integration - AI/ML workflow integration - Model Context Protocol support - Automated data analysis - Machine learning pipeline integration

Authentication & Security

Internal Access

  • Direct database authentication
  • Role-based access control
  • Network-level security
  • Audit logging

External Access

  • API key authentication
  • OAuth 2.0 integration
  • Rate limiting per customer
  • Usage monitoring and billing

Query Optimization

Best Practices

  • Time-based Partitioning: Leverage table partitions for date range queries
  • Index Usage: Utilize primary and secondary indexes effectively
  • Aggregation Tables: Use materialized views for common aggregations
  • Column Selection: Select only necessary columns to minimize data transfer

Common Query Patterns

Historical Price Analysis

SELECT
    hour,
    token_address,
    avg(price_usd) as avg_price,
    volume_usd
FROM token_prices_hourly
WHERE hour >= '2024-01-01'
  AND token_address = '0x...'
ORDER BY hour;

DEX Trading Volume

SELECT
    date_trunc('day', block_time) as day,
    protocol,
    sum(amount_usd) as daily_volume
FROM dex_swaps
WHERE block_time >= now() - interval '30 days'
GROUP BY day, protocol
ORDER BY day DESC;

Transaction Fee Analytics

SELECT
    date_trunc('hour', block_time) as hour,
    avg(gas_price) as avg_gas_price,
    sum(gas_used * gas_price) as total_fees
FROM transactions
WHERE block_time >= now() - interval '7 days'
GROUP BY hour
ORDER BY hour;

Performance Considerations

Query Performance

  • Use appropriate time ranges to limit data scans
  • Leverage materialized views for complex aggregations
  • Consider data freshness requirements
  • Monitor query execution times

Data Freshness

  • Real-time Data: Core tables updated every ~12 seconds
  • Derived Tables: Materialized views update automatically
  • Aggregations: Hourly aggregations available with minimal delay
  • Historical Data: Complete history available from genesis

Monitoring & Support

Data Pipeline Health

  • Continuous synchronization monitoring
  • Automated alerting for data delays
  • Performance metric tracking
  • Error logging and investigation

Developer Support

  • Query optimization assistance
  • Schema design consultation
  • Integration pattern guidance
  • Performance troubleshooting

Future Enhancements

Planned Features

  • GraphQL endpoint for flexible queries
  • Real-time WebSocket streaming
  • Enhanced authentication systems
  • Multi-network data aggregation
  • Advanced caching layers

Scalability Improvements

  • Read replica deployment
  • Query result caching
  • Connection pooling optimization
  • Load balancing implementation