Using a simple lattice model and Monte Carlo dynamics we have studied the kinetics (and some thermodynamics) of proteinlike heteropolymer folding. Our results agree with previous work on other simple models and also match some of the properties of the folding of real proteins. We find that our models display a two-stage folding behavior. First there is a rapid collapse to a compact state, followed by a slower stage in which the collapsed state rearranges itself to the native structure. We find that the folding time has a minimum plateau at intermediate temperatures and diverges at both high and low temperatures. The same is true for the collapse time. In this work we have examined the folding behavior as a function of sequence and have discovered several interesting results. The collapse time and the glass temperature are both sequence-independent (self-averaging) quantities. The folding time and the folding temperature are both sequence dependent. The folding time correlates approximately with the energy of the native state: the lower this energy, the faster the chain folds. This is consistent with the results found by Shakhnovich[15] that the larger the energy gap of the native state, the better the sequence folds. We did not measure the gap, since there is no clear or simple definition of the gap in our system. Another way to view this result is that sequences with unfrustrated native states (native states with no bad contacts) fold best; i.e., we want to minimize energetic frustration of the ground state. However, we expect that this may be a property of these simple systems and that in more complex systems other forms of frustration (geometric or energetic frustration of conformations other than the native state) may play an important role.
One would then expect that systems with reduced frustration should give rise to a large number of conformations that are rapidly connected kinetically to the native state (rapid compared to the folding time) or, as first proposed by Leopold and others,[33] a ``dominant folding funnel.''
An important point we have tried to stress is the issue of time scales, in particular the relevant physical time scales for this system and for protein folding in general. We note that there was no simple way to connect the computation time (Monte Carlo steps) to physical time. Rather than attempt to do so, we simply ran our simulations for a reasonable number of steps and then observed the folding time for the system. It is this folding time that now becomes the key time scale. For example, when we say a sequence does not fold, what we mean is that it does not fold within a time that is over an order of magnitude greater than the folding time for the fast sequences. Since we are looking at finite systems, we know that they will all fold given enough time. What is important is whether they fold in a reasonable time, where reasonable is the folding time for the faster sequences. For real proteins, this time scale would be some suitable biological time.
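The operational criterion described above can be illustrated with a short sketch. It is not the procedure used in the simulations; the function name `classify_folders`, the choice of the fastest observed sequence as the reference, and the default factor of 10 (one reading of "over an order of magnitude") are all assumptions made for illustration.

```python
def classify_folders(first_passage, max_steps, factor=10.0):
    """Label each sequence relative to the fastest observed folder.

    first_passage maps a (hypothetical) sequence name to the Monte Carlo step
    at which the native state was first reached, or None if it was not
    reached within max_steps.
    """
    observed = [t for t in first_passage.values() if t is not None]
    if not observed:
        # No sequence folded, so there is no reference time scale at all.
        return {name: "undetermined" for name in first_passage}

    # Assumed reference time: the fastest sequence, scaled by `factor`.
    cutoff = factor * min(observed)

    labels = {}
    for name, t in first_passage.items():
        if t is not None:
            labels[name] = "folds" if t <= cutoff else "does not fold"
        elif max_steps >= cutoff:
            # The run outlasted the cutoff without reaching the native state.
            labels[name] = "does not fold"
        else:
            # The run was too short to rule out folding on this time scale.
            labels[name] = "undetermined"
    return labels
```

The third label reflects the point made in the text: a statement that a sequence "does not fold" is only meaningful once the run length exceeds the reference time scale set by the fast sequences.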
By examining the behavior of the folding time versus temperature, we defined the glass transition temperature of this system. Below this temperature the kinetics slow down, causing the folding time to increase rapidly. In addition, the collapse times lose their self-averaging property and become sequence dependent. Most importantly, we observed that for the slow-folding sequences the folding temperature (the temperature at which the native state is half populated) is below the glass temperature. This indicates that these sequences will never fold, since at temperatures where the native state is thermodynamically stable it is kinetically inaccessible. Good folding sequences have $T_f$ greater than $T_g$, where $T_f$ denotes the folding temperature and $T_g$ the glass temperature. It has been suggested by others[19, 20] that a good design principle for optimizing folding would be to maximize the ratio $T_f/T_g$. We observe this result explicitly in our simulations.
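As a minimal sketch of the comparison described above, the folding temperature can be estimated as the temperature at which a measured native-state population crosses one half, and then compared with the glass temperature. The function names, the use of linear interpolation between sampled temperatures, and the assumption that the population decreases with increasing temperature are illustrative choices, not the method of the simulations.

```python
def folding_temperature(temps, native_pop):
    """Estimate the temperature at which the native population equals 0.5.

    temps must be in ascending order; native_pop[i] is the fraction of time
    spent in the native state at temps[i], assumed to fall as temperature
    rises. Returns None if the sampled curve never crosses 0.5.
    """
    points = list(zip(temps, native_pop))
    for (t_lo, p_lo), (t_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 0.5 >= p_hi and p_lo != p_hi:
            # Linear interpolation between the two bracketing samples.
            return t_lo + (p_lo - 0.5) * (t_hi - t_lo) / (p_lo - p_hi)
    return None


def is_good_folder(t_fold, t_glass):
    """True when the folding temperature lies above the glass temperature."""
    return t_fold is not None and t_fold > t_glass
```

With a population of 0.75 at temperature 1.0 and 0.25 at temperature 2.0 (made-up numbers, in arbitrary units), `folding_temperature` returns 1.5; a sequence with that value would count as a good folder for any glass temperature below 1.5.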
Perhaps the most interesting observation is that even simple systems such as these display a wide variety of complex and intriguing properties, many of which are shared by real proteins. This is particularly compelling in that one can much more easily study these simple systems and understand their behavior in great detail. By examining slightly more complex models we hope to understand how much of protein behavior is unique to proteins and how much is shared by the general class of heteropolymer systems. Hopefully, much of the apparent complexity of proteins will be understandable in the context of simpler model systems.
We would like to gratefully acknowledge the computational assistance of A. Schweitzer. We also thank S. Skourtis, P. G. Wolynes, and K. Dill for helpful discussions. J. N. O. is a Beckman Young Investigator. This work was funded by the Arnold and Mabel Beckman Foundation and by the National Science Foundation (Grant No. MCB-9018768). J. N. O. is in residence at the Instituto de Física e Química de São Carlos, Universidade de São Paulo, São Carlos, SP, Brazil during part of the summers.