Rapid protein fragment search using hash functions based on the Fourier transform

Tatsuya Akutsu, Kentaro Onizuka, Masato Ishikawa

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Motivation: Since the protein structure database has been growing very rapidly in recent years, the development of efficient methods for searching for similar structures is very important. Results: This paper presents a novel method for searching for similar fragments of proteins. In this method, a hash vector (a vector of real numbers) is associated with each fixed-length fragment of three-dimensional protein structure. Each vector consists of low-frequency components of the Fourier-like spectrum for the distances between Cα atoms and the centroid. Then, we can analyze the similarity between fragments by evaluating the difference between hash vectors. The novel aspect of the method is that the following property is proved theoretically: if the root mean square distance between two fragments is small, then the distance between the hash vectors is small. Several variants of this method were compared with a naive method and a previous method using PDB data. The results show that the fastest one among the variants is 18-80 times faster than the naive method, and 3-10 times faster than the previous method. Contact: E-mail: takutsu@ims.u-tokyo.ac.jp.
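The hash-vector idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact "Fourier-like" transform, so the orthonormal cosine basis (DCT-II) used here is an assumption, as are the function names `centroid_distances`, `hash_vector`, `hash_distance` and the parameter `n_components`.

```python
import math


def centroid_distances(fragment):
    """Distance from each C-alpha coordinate (x, y, z) to the fragment centroid."""
    n = len(fragment)
    centroid = (
        sum(p[0] for p in fragment) / n,
        sum(p[1] for p in fragment) / n,
        sum(p[2] for p in fragment) / n,
    )
    return [math.dist(p, centroid) for p in fragment]


def hash_vector(fragment, n_components=4):
    """Low-frequency coefficients of the centroid-distance profile.

    Assumption: an orthonormal cosine basis (DCT-II) stands in for the
    paper's Fourier-like spectrum. n_components should not exceed the
    fragment length.
    """
    d = centroid_distances(fragment)
    n = len(d)
    vec = []
    for k in range(n_components):
        # Scaling that makes the cosine basis orthonormal.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        coeff = sum(d[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        vec.append(scale * coeff)
    return vec


def hash_distance(u, v):
    """Euclidean distance between two hash vectors."""
    return math.dist(u, v)
```

Because distances to the centroid do not change under rotation or translation, two rigidly moved copies of a fragment receive the same hash vector in this sketch. With the orthonormal basis assumed here, the hash distance between two fragments of length n is also at most sqrt(n) times their root mean square distance, so a large hash distance can be used to discard a candidate before a full structural comparison. The bound proved in the paper may be stated differently.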

Original language: English
Pages (from-to): 357-364
Number of pages: 8
Journal: Bioinformatics
Volume: 13
Issue number: 4
Publication status: Published - 1 Jan 1997