Research in computational structure prediction can ultimately be reduced to two problems: sampling and scoring. The sampling problem is the task of searching the vast conformational space of a system. As Levinthal’s Paradox illustrates, this space is far too large to enumerate exhaustively, so sampling methods must be efficient and must concentrate on energetically favorable conformations that represent the system. Given a particular conformation in the sampled search space, the scoring problem is the task of determining how likely that conformation is to be observed. Energy functions fall into four major types: physics-based, knowledge-based, empirical, and, more recently, machine learning-based.
The score functions we develop in the Rosetta software are primarily based on a combination of physics-based calculations derived from molecular mechanics (e.g. Lennard-Jones 6-12 potentials) and knowledge-based probabilities derived from existing structural information (e.g. observed Ramachandran backbone angles). A particular system is scored by a linear summation of the individual energy terms derived using these methods.
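To make the idea of summing energy terms concrete, the following is a minimal sketch and not Rosetta's actual implementation. It assumes a textbook Lennard-Jones 6-12 form, a knowledge-based term taken as the negative log of an observed probability, and per-term weights. The function names, parameter values, and weights are all hypothetical and chosen only for illustration.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Textbook 6-12 potential: 4*eps*((sigma/r)**12 - (sigma/r)**6).

    epsilon (well depth) and sigma (zero-crossing distance) are
    illustrative values, not parameters taken from Rosetta.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def knowledge_based_energy(probability):
    """Convert an observed probability into an energy-like score.

    Assumes the common -ln(P) convention, so frequently observed
    states (high P) receive low scores.
    """
    return -math.log(probability)


def total_score(term_values, weights):
    """Linear summation of individual energy terms, one weight per term."""
    return sum(weights[name] * value for name, value in term_values.items())


# Hypothetical example: one atom pair and one backbone angle observation.
terms = {
    "lj": lennard_jones(4.0),
    "rama": knowledge_based_energy(0.25),
}
weights = {"lj": 1.0, "rama": 0.5}  # assumed weights, for illustration only
print(total_score(terms, weights))
```

Keeping each term as a separate function and combining them only in the final sum means a term can be reweighted or replaced without touching the others.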
All of the score functions we develop attempt to answer the question: given a 3-D representation of a chemical system, how likely is this conformation to be observed? In other words, is this conformation energetically favorable? Given a set of possible conformations, can we extract the lowest-energy, and therefore the most biochemically relevant, models?
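The link between a score and the likelihood of observing a conformation can be illustrated with a short sketch. It assumes, as an idealization, that scores behave like energies expressed in units of kT, so that relative likelihoods follow a Boltzmann distribution. The function names and the example scores are hypothetical.

```python
import math


def boltzmann_probabilities(scores, kT=1.0):
    """Relative probability of each conformation under a Boltzmann model.

    Scores are shifted by the minimum before exponentiating to keep the
    calculation numerically stable; the shift cancels on normalization.
    """
    min_score = min(scores)
    weights = [math.exp(-(s - min_score) / kT) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def lowest_energy(models):
    """Return the (name, score) pair with the lowest score."""
    return min(models, key=lambda model: model[1])


# Hypothetical scored conformations (names and values are made up).
models = [("model_a", -1.0), ("model_b", -3.0), ("model_c", 2.0)]
best = lowest_energy(models)
probs = boltzmann_probabilities([score for _, score in models])
print(best, probs)
```

Under this assumption the lowest-scoring model is also the most probable one, which is the reasoning behind selecting the lowest-energy models from a sampled set.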