Protein Science


     


Protein Science (2004), 13:54-62. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society

Detection of homologous proteins by an intermediate sequence search

Bino John1 and Andrej Sali2

1 Laboratory of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, New York, New York 10021, USA
2 Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry, and California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California 94143, USA

Reprint requests to: Andrej Sali, Mission Bay Genentech Hall, Ste. N472D, 600 16th St., University of California at San Francisco, San Francisco, CA 94143, USA; e-mail: sali{at}salilab.org; fax: (415) 514-4231.

We developed a variant of the intermediate sequence search method (ISSnew) for detection and alignment of weakly similar pairs of protein sequences. ISSnew relates two query sequences by an intermediate sequence that is potentially homologous to both queries. The improvement was achieved by a more robust overlap score for a match between the queries through an intermediate. The approach was benchmarked on a data set of 2369 sequences of known structure with insignificant sequence similarity to each other (BLAST E-value larger than 0.001); 2050 of these sequences had a related structure in the set. ISSnew performed significantly better than both PSI-BLAST and a previously described intermediate sequence search method. PSI-BLAST could not detect correct homologs for 1619 of the 2369 sequences. In contrast, ISSnew assigned a correct homolog as the top hit for 121 of these 1619 sequences, while incorrectly assigning homologs for only nine targets; it did not assign homologs for the remainder of the sequences. We estimate that ISSnew may be able to assign the folds of domains in ~29,000 of the ~500,000 sequences unassigned by PSI-BLAST, with 90% specificity (1 - false-positive fraction). In addition, we show that the 15 alignments with the most significant BLAST E-values include the nearly best alignments constructed by ISSnew.
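The benchmark counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes that the "false positives fraction" is the share of incorrect assignments among all assignments ISSnew made, which the abstract does not state explicitly.

```python
# Counts taken directly from the abstract.
undetected_by_psiblast = 1619   # sequences with no correct PSI-BLAST homolog
correct_top_hits = 121          # ISSnew top hit is a correct homolog
incorrect_assignments = 9       # ISSnew assigned a wrong homolog

# Assumption: specificity = 1 - (incorrect / all assignments made).
assigned = correct_top_hits + incorrect_assignments
specificity = 1 - incorrect_assignments / assigned

# Fraction of the PSI-BLAST failures that ISSnew resolves correctly.
rescued_fraction = correct_top_hits / undetected_by_psiblast

# Sequences for which ISSnew made no assignment at all.
left_unassigned = undetected_by_psiblast - assigned

# Extrapolated figures quoted in the abstract (~29,000 of ~500,000).
estimated_fraction = 29_000 / 500_000

print(f"assigned: {assigned}, left unassigned: {left_unassigned}")
print(f"benchmark specificity: {specificity:.3f}")
print(f"rescued fraction on benchmark: {rescued_fraction:.3f}")
print(f"estimated fraction genome-wide: {estimated_fraction:.3f}")
```

Under this reading, the benchmark specificity (about 0.93) is slightly above the 90% quoted for the extrapolation, and the extrapolated fraction (about 0.058) is somewhat lower than the benchmark fraction (about 0.075).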

Keywords: protein homology; protein evolution; sequence alignment; comparative protein structure modeling; fold assignment
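The idea of relating two queries through an intermediate sequence can be sketched as follows. This is a minimal illustration, not the ISSnew implementation: the `Hit` record, the `overlap_fraction` score (shared length divided by the shorter aligned region), and the thresholds `max_e` and `min_overlap` are all hypothetical choices made here, since the abstract does not define the actual overlap score.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit: a query aligned to a region of an intermediate."""
    intermediate: str  # identifier of the intermediate sequence
    start: int         # first aligned position on the intermediate (inclusive)
    end: int           # one past the last aligned position (exclusive)
    e_value: float     # significance of the query-to-intermediate match


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared region on the intermediate, relative to the shorter aligned region.

    Assumed scoring scheme for illustration; not the score used by ISSnew.
    """
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    shorter = min(a.end - a.start, b.end - b.start)
    return shared / shorter


def link_via_intermediates(
    hits_a: List[Hit],
    hits_b: List[Hit],
    max_e: float = 1e-3,       # assumed significance cutoff
    min_overlap: float = 0.5,  # assumed minimum overlap
) -> List[Tuple[str, float]]:
    """Return (intermediate, overlap) pairs that connect query A to query B."""
    # Index the hits of query B by intermediate for quick lookup.
    b_by_intermediate: Dict[str, List[Hit]] = {}
    for hb in hits_b:
        if hb.e_value <= max_e:
            b_by_intermediate.setdefault(hb.intermediate, []).append(hb)

    links: List[Tuple[str, float]] = []
    for ha in hits_a:
        if ha.e_value > max_e:
            continue
        for hb in b_by_intermediate.get(ha.intermediate, []):
            score = overlap_fraction(ha, hb)
            if score >= min_overlap:
                links.append((ha.intermediate, score))
    # Best-overlapping intermediates first.
    return sorted(links, key=lambda link: link[1], reverse=True)


# Toy data: both queries match overlapping regions of I1 but disjoint regions of I2.
hits_a = [Hit("I1", 10, 110, 1e-5), Hit("I2", 0, 50, 1e-4)]
hits_b = [Hit("I1", 60, 160, 1e-6), Hit("I2", 200, 260, 1e-8)]
links = link_via_intermediates(hits_a, hits_b)
print(links)
```

In this toy case only I1 links the two queries: both match it significantly, and their aligned regions share 50 of 100 positions. I2 is rejected because the two queries match non-overlapping regions, which is the situation an overlap score is meant to filter out (for example, two queries matching different domains of a multidomain intermediate).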




This article has been cited by other articles:

H. Li, X. Dai, and X. Zhao
A nearest neighbor approach for automated transporter prediction and categorization from protein sequences
Bioinformatics, May 1, 2008; 24(9): 1129-1136.

M. Darewicz, J. Dziuba, and P. Minkiewicz
Computational Characterisation and Identification of Peptides for in silico Detection of Potentially Celiac-Toxic Proteins
Food Science and Technology International, April 1, 2007; 13(2): 125-133.

J. Sim, S.-Y. Kim, and J. Lee
Prediction of protein solvent accessibility using fuzzy k-nearest neighbor method
Bioinformatics, June 15, 2005; 21(12): 2844-2849.

J. Espadaler, R. Aragues, N. Eswar, M. A. Marti-Renom, E. Querol, F. X. Aviles, A. Sali, and B. Oliva
Detecting remotely related proteins by their interactions and sequence similarity
PNAS, May 17, 2005; 102(20): 7151-7156.



