Sketches for Size of Join Estimation
This package contains the implementation in C++ of the sketching algorithms for
size of join estimation. The content of the package can be divided into the
following main categories:

Schemes for generating pseudorandom numbers
+/1 random variables and bvalued hashes are at the core of all the
sketching techniques. The package contains efficient implementations of the
generating schemes based on BCH codes, ReedMuller codes, and CarterWegman
hashes. Implementations for fast rangesummation and dyadic mappings of the +/1
random variables are also provided.
For details see the papers:

Fast RangeSummable Random Variables for Efficient Aggregate Estimation
by F. Rusu and A. Dobra

PseudoRandom Number Generation for SketchBased Estimations
by F. Rusu and A. Dobra
Files:

Sketches for size of join estimation
The package contains the implementation of the known sketches for size of
join and second frequency moment estimation. It includes the original AGMS
sketches and the improved versions FastAGMS sketches, FastCount sketches, and
CountMin sketches.
For details consult the papers:

Statistical Analysis of Sketch Estimators
by F. Rusu and A. Dobra

Sketches for Size of Join Estimation
by F. Rusu and A. Dobra
Files:

Sketches over sampled data
The update performance of sketches can be significantly improved if only a
sample of the data is sketched, without significant degradation in the
accuracy. Simple sampling algorithms like Bernoulli sampling, sampling without
replacement, and sampling with replacement, are also included in the package.
For details see the paper:

Sketching Sampled Data Streams
by F. Rusu and A. Dobra
Files:

Synthetic data generator
The package contains a simple data generator for different distributions
with customizable parameters. This allows the testing of the implemented
algorithms.
Files:

Examples
Two usage examples are included in the package. The first example generates
size of join estimations with all the implemented sketching techniques. The
second example is for sketching Bernoulli samples.
Files:
Requirements
In order to compile the package, the GSL development library has to be installed.
Download
The package can be found here.
UC Merced 
EECS 
Home
Last updated: Thursday, July 28, 2011