Numerical methods for modeling spectroscopic data

Chemical physicists have developed powerful theoretical and computational techniques that allow spectra of all types to be calculated given knowledge of the potential energy surfaces, dynamics, and radiation-matter couplings for a particular molecular system.  Experimental spectroscopists, however, are normally in the business of trying to learn about the physical parameters of a system of interest by measuring its spectra; that is, one needs a procedure for inverting the spectral observables to extract the physical parameters.  Only for certain special cases have inversion protocols been developed, and it is unlikely that unique direct inversions even exist for most spectral observables.  Therefore the problem has traditionally been approached by the "intelligently guided random search" method: choose a set of material parameters, calculate the spectra, compare with experiment, modify the parameters, recalculate the spectra, and continue iteratively until a best fit between theoretical prediction and experimental observation is found.  Since the spectra depend on the molecular parameters in a well-defined but complex and highly nonlinear fashion, converging on a best fit set of parameters is time-consuming and requires considerable judgment and experience on the part of the human operator involved.
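As a minimal sketch of the calculate-and-compare step in this loop, the Python fragment below uses a single Gaussian band as a stand-in for a real spectral calculation.  The toy model, the numerical values, and the names `model_spectrum` and `misfit` are illustrative assumptions, not the calculations used in the work described here.

```python
import math

def model_spectrum(params, freqs):
    """Toy forward model (assumption): one Gaussian band."""
    amplitude, center, width = params
    return [amplitude * math.exp(-0.5 * ((f - center) / width) ** 2) for f in freqs]

def misfit(params, freqs, observed):
    """Sum of squared differences between calculated and observed spectra."""
    calculated = model_spectrum(params, freqs)
    return sum((c - o) ** 2 for c, o in zip(calculated, observed))

freqs = [20000.0 + 50.0 * i for i in range(81)]   # wavenumber grid, cm^-1
true_params = (1.0, 22000.0, 600.0)               # hypothetical "unknown" values
observed = model_spectrum(true_params, freqs)     # synthetic stand-in for experiment

trial_params = (0.8, 21800.0, 700.0)              # one guess at the parameters
print("misfit for the trial guess:", misfit(trial_params, freqs, observed))
print("misfit for the generating parameters:", misfit(true_params, freqs, observed))
```

In the manual procedure, the operator inspects a number like this (or the overlaid spectra), adjusts the parameters, and repeats.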


We are exploring the applicability of several automated approaches to these problems.  Genetic algorithms are global optimization methods that mimic the mechanisms of natural selection described by genetics and the Darwinian theory of evolution.  In our application of genetic algorithms, the parameters that describe the molecular system are encoded as strings of numbers analogous to chromosomes.  Initially a number of different parameter sets are selected randomly, each corresponding to an individual in a population.  The spectra corresponding to each parameter set are calculated, and their closeness to the actual spectra to be modeled is used as a measure of the "fitness" of each member.  The fittest members of each generation are then allowed to "breed", exchanging genetic material and producing some "offspring" that are even more "fit" (match the target spectra more closely).  A low rate of mutation is also introduced to allow exploration of parts of parameter space not spanned by the original population.  Eventually, the population converges to a group with very similar parameters that fit the target spectra very well.
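The steps above can be sketched in a few lines of Python.  This is a generic real-valued genetic algorithm applied to a toy one-band Gaussian spectrum, not the implementation used in the published work: the truncation selection, uniform crossover, Gaussian mutation, elitism, and every name and numerical setting below are assumptions chosen for illustration.

```python
import math
import random

def model_spectrum(params, freqs):
    """Toy forward model (assumption): one Gaussian band."""
    amplitude, center, width = params
    return [amplitude * math.exp(-0.5 * ((f - center) / width) ** 2) for f in freqs]

def misfit(params, freqs, observed):
    """Lower misfit means a fitter individual."""
    return sum((c - o) ** 2 for c, o in zip(model_spectrum(params, freqs), observed))

def run_ga(freqs, observed, bounds, pop_size=40, generations=60,
           mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Initial population: parameter sets ("chromosomes") drawn uniformly within bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []  # best misfit found in each generation
    for _ in range(generations):
        population.sort(key=lambda p: misfit(p, freqs, observed))
        history.append(misfit(population[0], freqs, observed))
        parents = population[:pop_size // 2]   # the fittest half may breed
        children = [list(population[0])]       # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each gene is copied from one parent or the other.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Low-rate mutation: perturb a gene, then clip it back into bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.05 * (hi - lo))
                    child[i] = min(max(child[i], lo), hi)
            children.append(child)
        population = children
    best = min(population, key=lambda p: misfit(p, freqs, observed))
    history.append(misfit(best, freqs, observed))
    return best, history

freqs = [20000.0 + 50.0 * i for i in range(81)]            # wavenumber grid, cm^-1
observed = model_spectrum((1.0, 22000.0, 600.0), freqs)    # synthetic target spectrum
bounds = [(0.1, 2.0), (20000.0, 24000.0), (100.0, 1500.0)] # amplitude, center, width
best, history = run_ga(freqs, observed, bounds)
print("best parameters:", best)
print("misfit, first vs. last generation:", history[0], history[-1])
```

Because the best individual is carried over unchanged, the best misfit can never get worse from one generation to the next; how close it gets to the target depends on the population size, mutation rate, and number of generations.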


We obtained very good results with this method in modeling absorption and resonance Raman spectra for molecules with up to five Raman-active vibrations, but scalability to larger systems remains a problem.  More recently, we have begun to explore the large-scale optimization algorithms available in commercial software packages, and have found that the problem can be solved quite effectively with a MATLAB routine based on the Reflective Newton Trust Region algorithm.  Optimizations that typically require days to weeks of human time when performed interactively can be accomplished automatically in less than an hour of computer time.  The method can handle large molecules and mixtures of spectral broadening mechanisms, and is robust toward noise or missing data points.  Further work in this direction continues.
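For readers without MATLAB, a comparable bounded least-squares fit can be sketched with SciPy, whose `least_squares` function offers a Trust Region Reflective method (`method="trf"`).  This is a stand-in for the MATLAB routine mentioned above, not the code used in the published work; the one-band Gaussian model, the noise level, the bounds, and the starting guess are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def model_spectrum(params, freqs):
    """Toy forward model (assumption): one Gaussian band."""
    amplitude, center, width = params
    return amplitude * np.exp(-0.5 * ((freqs - center) / width) ** 2)

def residuals(params, freqs, observed):
    """Point-by-point difference between calculated and observed spectra."""
    return model_spectrum(params, freqs) - observed

freqs = np.linspace(20000.0, 24000.0, 81)        # wavenumber grid, cm^-1
true_params = np.array([1.0, 22000.0, 600.0])    # hypothetical "unknown" values
rng = np.random.default_rng(0)
# Synthetic "experimental" spectrum with a little added Gaussian noise.
observed = model_spectrum(true_params, freqs) + rng.normal(0.0, 0.01, freqs.size)

start = np.array([0.8, 21800.0, 700.0])          # initial guess
lower = np.array([0.1, 20000.0, 100.0])          # physically sensible bounds
upper = np.array([2.0, 24000.0, 1500.0])

result = least_squares(
    residuals, start,
    bounds=(lower, upper),
    method="trf",               # Trust Region Reflective
    x_scale=upper - lower,      # parameters differ by orders of magnitude
    args=(freqs, observed),
)
print("fitted parameters:", result.x)
print("final cost (half the sum of squared residuals):", result.cost)
```

In a sketch like this, missing data points could be handled by simply dropping them from `freqs` and `observed` before the fit, since the residual vector does not need to lie on a regular grid.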

Recent publications in this field


Mark Lilichenko and Anne Myers Kelley.  Application of artificial neural networks and genetic algorithms to modeling molecular electronic spectra in solution.  J. Chem. Phys. 114, 7094-7102 (2001).


Margaret H. Hennessy and Anne Myers Kelley.  Using real-valued multi-objective genetic algorithms to model molecular absorption spectra and Raman excitation profiles in solution.  Phys. Chem. Chem. Phys. 6, 1085-1095 (2004).

 

Eric Shorr and Anne Myers Kelley.  Automatic parameter optimization in modeling absorption spectra and resonance Raman excitation profiles.  Phys. Chem. Chem. Phys. 9, 4785-4792 (2007).
