Summary: | Single-molecule real time trajectories are embedded in high noise. To extract kinetic or dynamic information of the molecules from these trajectories often requires idealization of the data in steps and dwells. One major premise behind the existing single-molecule data analysis algorithms is the gaussian 'white' noise, which displays no correlation in time and whose amplitude is independent on data sampling frequency. This so-called 'white' noise is widely assumed but its validity has not been critically evaluated. We show that correlated noise exists in single-molecule real time trajectories collected from optical tweezers. The assumption of white noise during analysis of these data can lead to serious over- or underestimation of the number of steps depending on the algorithms employed. We present a statistical method that quantitatively evaluates the structure of the underlying noise, takes the noise structure into account, and identifies steps and dwells in a single-molecule trajectory. Unlike existing data analysis algorithms, this method uses Generalized Least Squares (GLS) to detect steps and dwells. Under the GLS framework, the optimal number of steps is chosen using model selection criteria such as Bayesian Information Criterion (BIC). Comparison with existing step detection algorithms showed that this GLS method can detect step locations with highest accuracy in the presence of correlated noise. Because this method is automated, and directly works with high bandwidth data without pre-filtering or assumption of gaussian noise, it may be broadly useful for analysis of single-molecule real time trajectories.
|