Supplementary material:

- aistats14b-supp.pdf: formulae and error bounds for the Fast Gauss Transform (FGT) approximation; an extended discussion of the noise model, including a proof of Theorem 3.1 from the paper; and figures for additional experiments.

The following are GIF animations of the elastic embedding (EE) algorithm run on the infiniteMNIST dataset with N=1020000 images of handwritten digits. The optimization method is one of GD (gradient descent), FP (fixed-point iteration) or L-BFGS, and the gradient of the objective function is approximated using either the Fast Gauss Transform (FGT) or the Barnes-Hut (BH) method.

- infiniteMNIST.gif: all methods shown at once. In each plot, the runtime advances from 0 to 660 minutes in steps of 10 minutes.
- GD_FGT.gif, FP_FGT.gif, LBFGS_FGT.gif, GD_BH.gif, FP_BH.gif, LBFGS_BH.gif: each method separately. The runtime is limited to 11 hours for all methods.
- infiniteMNIST_digits.gif, infiniteMNIST_markers.gif: FGT run with L-BFGS optimization for 13 hours. The first animation shows a subset of 500 digits (to avoid clutter) drawn as actual digits; the second animation shows all 1020000 digits as colored markers.

The following are GIF animations of the EE algorithm run on a subset of the MNIST dataset with 6000 points, comparing the embedding obtained with exact gradients against the embeddings obtained with gradients approximated by BH or FGT:

- MNIST_6k_iterations.gif: over iterations.
- MNIST_6k_runtime.gif: over runtime.

All these animations can be viewed in a web browser or in a dedicated GIF image viewer.