1.
Article in English | MEDLINE | ID: mdl-38316555

ABSTRACT

The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.


Subject(s)
Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Sequence Alignment , Protein Conformation , Software , Algorithms , Sequence Analysis, Protein/methods
2.
bioRxiv ; 2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37503235

ABSTRACT

The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.

3.
NAR Genom Bioinform ; 5(1): lqad007, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36814456

ABSTRACT

Zooplankton are important eukaryotic constituents of marine ecosystems characterized by limited motility in the water. These metazoans predominantly occupy intermediate trophic levels and energetically link primary producers to higher trophic levels. Through processes including diel vertical migration (DVM) and production of sinking pellets, they also contribute to the biological carbon pump, which regulates atmospheric CO2 levels. Despite their prominent role in marine ecosystems, and perhaps because of their staggering diversity, much remains to be discovered about zooplankton biology. In particular, the circadian clock, which is known to affect important processes such as DVM, has been characterized only in a handful of zooplankton species. We present annotated de novo assembled transcriptomes from a diverse, representative cohort of 17 marine zooplankton representing six phyla and eight classes. These transcriptomes represent the first sequencing data for a number of these species. Subsequently, using translated proteomes derived from these data, we demonstrate in silico the presence of orthologs to most core circadian clock proteins from model metazoans in all sequenced species. Our findings, bolstered by sequence searches against publicly available data, indicate that the molecular machinery underpinning endogenous circadian clocks is widespread and potentially well conserved across marine zooplankton taxa.

4.
Syst Biol ; 70(3): 608-622, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33252676

ABSTRACT

Detecting the signature of selection in coding sequences and associating it with shifts in phenotypic states can unveil genes underlying complex traits. Of the various signatures of selection exhibited at the molecular level, changes in the pattern of selection at protein-coding genes have been of main interest. To this end, phylogenetic branch-site codon models are routinely applied to detect changes in selective patterns along specific branches of the phylogeny. Many of these methods rely on a prespecified partition of the phylogeny to branch categories, thus treating the course of trait evolution as fully resolved and assuming that phenotypic transitions have occurred only at speciation events. Here, we present TraitRELAX, a new phylogenetic model that alleviates these strong assumptions by explicitly accounting for the uncertainty in the evolution of both trait and coding sequences. This joint statistical framework enables the detection of changes in selection intensity upon repeated trait transitions. We evaluated the performance of TraitRELAX using simulations and then applied it to two case studies. Using TraitRELAX, we found an intensification of selection in the primate SEMG2 gene in polygynandrous species compared to species of other mating forms, as well as changes in the intensity of purifying selection operating on sixteen bacterial genes upon transitioning from a free-living to an endosymbiotic lifestyle. [Evolutionary selection; intensification; $\gamma$-proteobacteria; genotype-phenotype; relaxation; SEMG2.]


Subject(s)
Evolution, Molecular , Phenotype , Selection, Genetic , Animals , Codon , Models, Genetic , Phylogeny , Primates/genetics
5.
Virology ; 513: 114-128, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29065352

ABSTRACT

The order Herpesvirales includes animal viruses with large double-stranded DNA genomes replicating in the nucleus. The main capsid protein in the best-studied family Herpesviridae contains a domain with HK97-like fold related to bacteriophage head proteins, and several virion maturation factors are also homologous between phages and herpesviruses. The origin of herpesvirus DNA replication proteins is less well understood. While analyzing the genomes of herpesviruses in the family Malacoherpesviridae, we identified nearly 30 families of proteins conserved in other herpesviruses, including several phage-related domains in morphogenetic proteins. Herpesvirus DNA replication factors have a complex evolutionary history: some are related to cellular proteins, but others are closer to homologs from large nucleocytoplasmic DNA viruses. Phylogenetic analyses suggest that the core replication machinery of herpesviruses may have been recruited from the same pool as in the case of other large DNA viruses of eukaryotes.


Subject(s)
Evolution, Molecular , Herpesviridae/enzymology , Herpesviridae/genetics , Viral Proteins/genetics , Giant Viruses/genetics , Phylogeny , Sequence Homology, Amino Acid
6.
Protist ; 167(6): 526-543, 2016 12.
Article in English | MEDLINE | ID: mdl-27744090

ABSTRACT

Certain protist lineages bear cytoskeletal structures that are germane to them and define their individual group. Trichomonadida are excavate parasites united by a unique cytoskeletal framework, which includes tubulin-based structures such as the pelta and axostyle, but also other filaments such as the striated costa whose protein composition remains unknown. We determined the proteome of the detergent-resistant cytoskeleton of Tetratrichomonas gallinarum. 203 proteins with homology to Trichomonas vaginalis were identified, which contain significantly more long coiled-coil regions than control protein sets. Five candidates were shown to associate with previously described cytoskeletal structures including the costa, and the expression of a single T. vaginalis protein in T. gallinarum induced the formation of accumulated, striated filaments. Our data suggest that filament-forming proteins of protists other than actin and tubulin share common structural properties with metazoan intermediate filament proteins, while not being homologous. These filament-forming proteins might have evolved many times independently in eukaryotes, or simultaneously in a common ancestor but with different evolutionary trajectories downstream in different phyla. The broad variety of filament-forming proteins uncovered, and with no homologs outside of the Trichomonadida, once more highlights the diverse nature of eukaryotic proteins with the ability to form unique cytoskeletal filaments.


Subject(s)
Proteome , Protozoan Proteins/metabolism , Trichomonadida/ultrastructure , Animals , Cytoskeleton/metabolism , Cytoskeleton/ultrastructure , Intermediate Filament Proteins/metabolism , Microscopy, Electron, Transmission , Parasites/metabolism , Parasites/ultrastructure , Trichomonadida/metabolism