Apache DataSketches API
Apache DataSketches is an open-source library that provides production-quality implementations of sketch algorithms: Theta sketches (distinct counting with set operations), Quantiles sketches (percentile estimation), HLL (HyperLogLog, for cardinality estimation), CPC (Compressed Probabilistic Counting, also for cardinality estimation), Frequent Items sketches, and Tuple sketches. It is widely used in data warehouses and OLAP systems, including Apache Druid, Apache Spark, and Amazon Redshift. The library provides Java, C++, and Python APIs.
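To make the idea of a mergeable distinct-count sketch concrete, here is a minimal toy in pure Python. It is a K-Minimum-Values sketch, a simple relative of the Theta sketch, and it is not the DataSketches API or its implementation: the class name KMVSketch, its methods, and the default k are all hypothetical names chosen for illustration. It only shows the general pattern the library exposes, namely update a sketch with items, ask for an estimate, and union sketches built on different data.

```python
import hashlib


class KMVSketch:
    """Toy K-Minimum-Values sketch (illustration only, not the library's code)."""

    def __init__(self, k=1024):
        self.k = k                  # maximum number of hash values retained
        self.hashes = set()         # the smallest hash values seen so far
        self.theta = float("inf")   # hashes at or above this are ignored

    @staticmethod
    def _hash(item):
        # Map the item to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, item):
        h = self._hash(item)
        if h >= self.theta:
            return                  # too large to be among the k smallest
        self.hashes.add(h)          # duplicates hash identically, so no effect
        if len(self.hashes) > self.k:
            self.hashes.remove(max(self.hashes))
            self.theta = max(self.hashes)

    def estimate(self):
        if len(self.hashes) < self.k:
            return float(len(self.hashes))      # fewer than k hashes: exact
        return (self.k - 1) / max(self.hashes)  # standard KMV estimator

    def union(self, other):
        merged = KMVSketch(min(self.k, other.k))
        combined = sorted(self.hashes | other.hashes)
        merged.hashes = set(combined[:merged.k])
        merged.theta = min(self.theta, other.theta)
        if len(combined) > merged.k:
            merged.theta = max(merged.hashes)
        return merged


# Two overlapping streams: 30,000 ids each, 10,000 shared, 50,000 distinct overall.
a, b = KMVSketch(), KMVSketch()
for i in range(30_000):
    a.update(f"user-{i}")
for i in range(20_000, 50_000):
    b.update(f"user-{i}")

print("A estimate:", round(a.estimate()))
print("B estimate:", round(b.estimate()))
print("A union B estimate:", round(a.union(b).estimate()))
```

With k = 1024 the relative standard error of this estimator is roughly 1/sqrt(k), about 3%, so the printed values should land near 30,000, 30,000, and 50,000 without matching them exactly. The real library adds compact serialization, intersection and difference operators, and error bounds, none of which this toy attempts.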