Supplementary material: - icml13-proofs.pdf: derivation of Theorem 1.1 from the paper. - Newton.gif,Euler.gif,Halley.gif: GIF animation of the root-finding algorithms for one of the points in the Lena image dataset. The type of the step (normal step, bisection, etc.) is shown as a title. The perplexity is set to 30 (shown as black curve), 250 closest neighbors are preserved in the computation of the entropy (shown in red). The initial bounds are set in the interval [-8 3]. beta is computed to an accuracy of 1e-10. The initial value is log(beta) = -6.5. Notice that Newton's method requires one more iteration in the area of the solution to achieve the desired tolerance. - Newton_bad_init.gif,Euler_bad_init.gif,Halley_bad_init.gif: same setup as above, except in this case the initialization is located in a flat region. Notice very conservative steps of Halley's method, and enormously long initial steps of Newton's and Euler's method. All these animations may be seen with a web browser or with specialized GIF image viewers.