How the WAL+Parquet union query works
Last updated: 2026-05-05 · Pharlux v1.0.0 · By Ian Holt
The thing most observability platforms don't talk about: you cannot safely query data you have not persisted, and persisting it row by row hurts throughput. ClickHouse-style batch writers produce files that compress beautifully and scan fast, but typical batch intervals introduce tens of seconds of staleness before new data is queryable. In-memory buffers are queryable immediately but lose data on crash. The two requirements, durability and freshness, pull in opposite directions, and most platforms pick one and document around the other.
Pharlux ships a different design. A custom Apache DataFusion TableProvider unions an in-memory WAL buffer with on-disk Apache Parquet files into one consistent view at query time. Freshly ingested data is queryable as soon as it lands in the WAL, with no flush wait and no buffer-window staleness, and it is queryable through the same SQL surface as historical data on disk.
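To make the union idea concrete, here is a minimal sketch in plain Rust with no dependencies. It is not Pharlux's code and does not use DataFusion's actual `TableProvider` trait; `Row`, `Segment`, `UnionTable`, and their methods are hypothetical names chosen for illustration. In-memory `Vec`s stand in for Parquet files, and the sketch is single-threaded, so it ignores the synchronization a real implementation would need between ingest, flush, and scan.

```rust
/// One ingested event. The schema is hypothetical, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub ts_ms: u64,
    pub service: String,
    pub message: String,
}

/// Stand-in for one immutable on-disk Parquet file.
#[derive(Debug, Default)]
pub struct Segment {
    rows: Vec<Row>,
}

/// Hypothetical table that presents flushed segments and the
/// unflushed WAL buffer as a single scannable source.
#[derive(Debug, Default)]
pub struct UnionTable {
    segments: Vec<Segment>,
    wal: Vec<Row>,
}

impl UnionTable {
    /// Append a row to the WAL buffer. The very next scan sees it.
    pub fn ingest(&mut self, row: Row) {
        self.wal.push(row);
    }

    /// Move all buffered rows into a new segment. Each row lives in
    /// exactly one place before and after the flush, so a scan never
    /// returns a row twice and never misses one.
    pub fn flush(&mut self) {
        if self.wal.is_empty() {
            return;
        }
        let rows = std::mem::take(&mut self.wal);
        self.segments.push(Segment { rows });
    }

    /// Scan segment rows first, then WAL rows, keeping those that
    /// match the predicate. Callers cannot tell which side a row
    /// came from.
    pub fn scan(&self, pred: impl Fn(&Row) -> bool) -> Vec<Row> {
        self.segments
            .iter()
            .flat_map(|segment| segment.rows.iter())
            .chain(self.wal.iter())
            .filter(|row| pred(*row))
            .cloned()
            .collect()
    }
}
```

In this sketch, calling `ingest` and then `scan` returns the new row without any `flush` in between, and calling `flush` changes where rows are stored but not what `scan` returns. That invariant (a flush is invisible to queries) is the property the union view has to preserve, whatever the real storage layers look like.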