1.
J Clin Med ; 12(20)2023 Oct 10.
Article in English | MEDLINE | ID: mdl-37892566

ABSTRACT

Primary immune thrombocytopenia (ITP) is a complex autoimmune disease whose hallmark is a deregulation of cellular and humoral immunity leading to increased destruction and reduced production of platelets. The heterogeneity of presentation and clinical course hampers personalized approaches for diagnosis and management. In 2021, the Spanish ITP Group (GEPTI) of the Spanish Society of Hematology and Hemotherapy (SEHH) updated a consensus document that had been launched in 2011. The updated guidelines have been the reference for the diagnosis and management of primary ITP in Spain ever since. Nevertheless, the emergence of new tools and strategies makes it advisable to review them again. For this reason, we have updated the main recommendations. Our aim is to provide a practical tool to facilitate the comprehensive management of all aspects of primary ITP.

2.
Hemasphere ; 7(8): e936, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37476303

ABSTRACT

The International Prognostic Score of thrombosis in Essential Thrombocythemia (IPSET-thrombosis) and its revised version have been proposed to guide thrombosis prevention strategies. We evaluated both classifications to prognosticate thrombosis in 1366 contemporary essential thrombocythemia (ET) patients prospectively followed from the Spanish Registry of ET. The cumulative incidence of thrombosis at 10 years, taking death as a competing risk, was 11.4%. The risk of thrombosis was significantly higher in the high-risk IPSET-thrombosis and high-risk revised IPSET-thrombosis categories, but no differences were observed among the lower risk categories. Patients allocated to high-risk IPSET-thrombosis (subdistribution hazard ratio [SHR], 3.7 [95% confidence interval, CI, 1.6-8.7]) and high-risk revised IPSET-thrombosis (SHR, 3.2 [95% CI, 1.4-7.45]) showed an increased risk of arterial thrombosis, whereas both scoring systems failed to predict venous thrombosis. The incidence rate of thrombosis in intermediate-risk revised IPSET-thrombosis (aged >60 years, JAK2-negative, and no history of thrombosis) was very low regardless of the treatment administered (0.9% and 0% per year with and without cytoreduction, respectively). Dynamic application of the revised IPSET-thrombosis showed a low rate of thrombosis when patients without history of prior thrombosis switched to a higher risk category after reaching 60 years of age. In conclusion, IPSET-thrombosis scores are useful for identifying patients at high risk of arterial thrombosis, whereas they fail to predict venous thrombosis. Controlled studies are needed to determine the appropriate treatment of ET patients assigned to the non-high-risk categories.

3.
Rev. clín. esp. (Ed. impr.) ; 223(6): 340-349, jun.- jul. 2023.
Article in Spanish | IBECS | ID: ibc-221349

ABSTRACT

Aims: The aim is to evaluate a management program for direct oral anticoagulants (DOACs) in non-valvular atrial fibrillation (NVAF) patients according to their profiles, appropriateness of dosing, patterns of crossover, effectiveness and safety. This is an observational, longitudinal, prospective study in a cohort of patients attended in daily clinical practice in a regional hospital in Spain, with a 3-year follow-up plan for patients initiating dabigatran, rivaroxaban or apixaban between January 2012 and December 2016. Methods: We analyzed 490 episodes of treatment (apixaban 2.5 mg, 9.4%; apixaban 5 mg, 21.4%; dabigatran 75 mg, 0.6%; dabigatran 110 mg, 12.4%; dabigatran 150 mg, 19.8%; rivaroxaban 15 mg, 17.8%; rivaroxaban 20 mg, 18.6%) in 445 patients. In total, 13.6% of patients on dabigatran, 9.7% on rivaroxaban, and 3.9% on apixaban switched to other DOACs or changed dosing. Results: Apixaban was the DOAC most frequently switched to. The most frequent reasons for switching were toxicity (23.8%), bleeding (21.4%) and renal deterioration (16.7%). Inappropriate dosing was found in 23.8% of episodes. Rates of stroke and transient ischemic attack (TIA) were 1.64 and 0.54 events/100 patient-years, respectively, while rates of major bleeding, clinically relevant non-major (CRNM) bleeding and intracranial bleeding were 2.4, 5, and 0.5 events/100 patient-years, respectively. Gastrointestinal and genitourinary bleeding were the most common types of bleeding events (BE). On multivariable analysis, prior stroke and age were independent predictors of stroke/TIA. Concurrent platelet inhibitors, male gender and age were independent predictors of BE. Conclusion: This study complements the scant data available on the use of DOACs in NVAF patients in Spain, confirming a good safety and effectiveness profile (AU)


Subject(s)
Humans , Male , Female , Aged , Practice Patterns, Physicians' , Atrial Fibrillation/drug therapy , Anticoagulants/administration & dosage , Dabigatran/administration & dosage , Rivaroxaban/administration & dosage , Follow-Up Studies , Prospective Studies , Longitudinal Studies , Treatment Outcome , Administration, Oral , Spain
4.
Rev Clin Esp (Barc) ; 223(6): 340-349, 2023.
Article in English | MEDLINE | ID: mdl-37105383

ABSTRACT

AIMS: The aim is to evaluate a management program for direct oral anticoagulants (DOACs) in non-valvular atrial fibrillation (NVAF) patients according to their profiles, appropriateness of dosing, patterns of crossover, effectiveness and safety. This is an observational, longitudinal, prospective study in a cohort of patients attended in daily clinical practice in a regional hospital in Spain, with a 3-year follow-up plan for patients initiating dabigatran, rivaroxaban or apixaban between January 2012 and December 2016. METHODS: We analyzed 490 episodes of treatment (apixaban 2.5 mg, 9.4%; apixaban 5 mg, 21.4%; dabigatran 75 mg, 0.6%; dabigatran 110 mg, 12.4%; dabigatran 150 mg, 19.8%; rivaroxaban 15 mg, 17.8%; rivaroxaban 20 mg, 18.6%) in 445 patients. In total, 13.6% of patients on dabigatran, 9.7% on rivaroxaban, and 3.9% on apixaban switched to other DOACs or changed dosing. RESULTS: Apixaban was the DOAC most frequently switched to. The most frequent reasons for switching were toxicity (23.8%), bleeding (21.4%) and renal deterioration (16.7%). Inappropriate dosing was found in 23.8% of episodes. Rates of stroke and transient ischemic attack (TIA) were 1.64 and 0.54 events/100 patient-years, respectively, while rates of major bleeding, clinically relevant non-major (CRNM) bleeding and intracranial bleeding were 2.4, 5, and 0.5 events/100 patient-years, respectively. Gastrointestinal and genitourinary bleeding were the most common types of bleeding events (BE). On multivariable analysis, prior stroke and age were independent predictors of stroke/TIA. Concurrent platelet inhibitors, male gender and age were independent predictors of BE. CONCLUSION: This study complements the scant data available on the use of DOACs in NVAF patients in Spain, confirming a good safety and effectiveness profile.


Subject(s)
Atrial Fibrillation , Ischemic Attack, Transient , Stroke , Humans , Male , Atrial Fibrillation/complications , Atrial Fibrillation/drug therapy , Atrial Fibrillation/chemically induced , Rivaroxaban/adverse effects , Dabigatran/adverse effects , Anticoagulants/adverse effects , Ischemic Attack, Transient/chemically induced , Ischemic Attack, Transient/drug therapy , Prospective Studies , Spain , Stroke/prevention & control , Stroke/chemically induced , Hemorrhage/chemically induced , Hemorrhage/epidemiology , Hemorrhage/drug therapy , Retrospective Studies
5.
Ann Hematol ; 102(2): 447-456, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36422672

ABSTRACT

The SARS-CoV-2 pandemic has favored the expansion of telemedicine. Patients with Philadelphia-negative chronic myeloproliferative neoplasms (Ph-MPN) might be good candidates for virtual follow-up. In this study, we aimed to analyze the follow-up of patients with Ph-MPN in Spain during COVID-19, its effectiveness, and its acceptance among patients. We present a multicenter retrospective study from 30 centers. Five hundred forty-one patients were included, with a median age of 67 years (yr). With a median follow-up of 19 months, 4410 appointments were recorded. The median number of visits per patient was 7 and the median interval between visits was 2.7 months; significantly more visits, at a higher frequency, were registered in myelofibrosis (MF) patients. Of all visits, 60.1% were in-person, 39.5% were by telephone, and 0.3% were by videocall, with a predominance of telephone visits for essential thrombocythemia (ET) and polycythemia vera (PV) patients over MF patients, as well as for younger patients (< 50 yr). The proportion of phone visits significantly decreased after the first semester of the pandemic. Pharmacological modifications were performed in only 25.7% of the visits, and, considering overall management, ET patients needed fewer global treatment changes. Telephone contact effectiveness reached 90% and only 5.4% required a complementary in-person appointment. Although 56.2% of the cohort preferred in-person visits, 90.5% of our patients claimed to be satisfied with follow-up during the pandemic, with 83% positive comments. In view of our results, telemedicine has proven effective and efficient, and might continue to play a complementary role in Ph-MPN patients' follow-up.


Subject(s)
COVID-19 , Myeloproliferative Disorders , Polycythemia Vera , Primary Myelofibrosis , Thrombocythemia, Essential , Humans , Aged , Pandemics , Retrospective Studies , Patient Satisfaction , Spain/epidemiology , SARS-CoV-2 , Myeloproliferative Disorders/epidemiology , Myeloproliferative Disorders/therapy , Polycythemia Vera/epidemiology , Primary Myelofibrosis/epidemiology , Thrombocythemia, Essential/epidemiology
6.
BMJ Open ; 12(11): e062873, 2022 11 04.
Article in English | MEDLINE | ID: mdl-36332946

ABSTRACT

INTRODUCTION: To date, no pancreatic stump closure technique has been shown to be superior to any other in distal pancreatectomy. Although several studies have shown a trend towards better results in transection using a radiofrequency device (radiofrequency-assisted transection (RFT)), no randomised trial for this purpose has been performed to date. Therefore, we designed a randomised clinical trial, with the hypothesis that this technique used in distal pancreatectomies is superior to mechanical closure in reducing clinically relevant postoperative pancreatic fistula (CR-POPF). METHODS AND ANALYSIS: TRANSPAIRE is a multicentre randomised controlled trial conducted in seven Spanish pancreatic centres that includes 112 patients undergoing elective distal pancreatectomy for any indication who will be randomly assigned to RFT or classic stapler transections (control group) in a ratio of 1:1. The primary outcome is the CR-POPF percentage. Sample size is calculated with the following assumptions: 5% one-sided significance level (α), 80% power (1-β), expected POPF in control group of 32%, expected POPF in RFT group of 10% and a clinically relevant difference of 22%. Secondary outcomes include postoperative results, complications, radiological evaluation of the pancreatic stump, metabolomic profile of postoperative peritoneal fluid, survival and quality of life. Follow-ups will be carried out in the outpatient clinic at 1, 6 and 12 months postoperatively. ETHICS AND DISSEMINATION: TRANSPAIRE has been approved by the CEIM-PSMAR Ethics Committee. This project is being carried out in accordance with national and international guidelines and the basic principles of protection of human rights and dignity established in the Declaration of Helsinki (64th General Assembly, Fortaleza, Brazil, October 2013). For studies with biological samples, Law 14/2007 on Biomedical Research will be followed. We have defined a dissemination strategy, whose main objective is the participation of stakeholders and the transfer of knowledge to support the exploitation of activities. REGISTRATION DETAILS: ClinicalTrials.gov Registry (NCT04402346).
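
The sample size assumptions listed in this abstract can be explored with a short calculation. The sketch below is illustrative only: the abstract does not say which formula the investigators used, so the unpooled normal-approximation formula for comparing two proportions is an assumption, and the function name is hypothetical. Under that assumption the inputs give 40 patients per arm, fewer than the 112 planned in total; the difference may reflect a different formula, a continuity correction, or an allowance for dropouts that the abstract does not describe.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-arm sample size for a one-sided comparison of two proportions.

    Assumption: unpooled normal approximation; the trial protocol may
    have used a different formula.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Inputs taken from the abstract: 32% POPF in controls, 10% with RFT.
n_per_arm = sample_size_per_group(0.32, 0.10)
print(n_per_arm)
```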


Subject(s)
Pancreatectomy , Humans , Multicenter Studies as Topic , Pancreas/surgery , Pancreatectomy/adverse effects , Pancreatectomy/methods , Pancreatic Fistula/etiology , Pancreatic Fistula/prevention & control , Postoperative Complications/etiology , Quality of Life , Randomized Controlled Trials as Topic , Risk Factors
7.
J Transl Med ; 20(1): 373, 2022 08 18.
Article in English | MEDLINE | ID: mdl-35982500

ABSTRACT

BACKGROUND: Recently, extensive cancer genomic studies have revealed mutational and clinical data of large cohorts of cancer patients. For example, the Pan-Lung Cancer 2016 dataset (part of The Cancer Genome Atlas project) summarises the mutational and clinical profiles of different subtypes of Lung Cancer (LC). Mutational and clinical signatures have been used independently for tumour typification and prediction of metastasis in LC patients. Is it then possible to achieve better typifications and predictions when combining both data streams? METHODS: In a cohort of 1144 Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LSCC) patients, we studied the number of missense mutations (hereafter, the Total Mutational Load, TML) and the distribution of clinical variables for different classes of patients. Using the TML and different sets of clinical variables (tumour stage, age, sex, smoking status, and packs of cigarettes smoked per year), we built Random Forest classification models that calculate the likelihood of developing metastasis. RESULTS: We found that LC patients differing in age, smoking status, and tumour type had significantly different mean TMLs. Although TML was an informative feature, its effect was secondary to the "tumour stage" feature. However, its contribution to the classification is not redundant with the latter; models trained using both TML and tumour stage performed better than models trained using only one of these variables. We found that models trained on the entire dataset (i.e., without using dimensionality reduction techniques) and without resampling achieved the highest performance, with an F1 score of 0.64 (95% CrI [0.62, 0.66]). CONCLUSIONS: Clinical variables and TML should be considered together when assessing the likelihood of LC patients progressing to metastatic states, as the information these encode is not redundant. Altogether, we provide new evidence of the need for comprehensive diagnostic tools for metastasis.
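
For readers unfamiliar with the metric reported in this abstract, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that definition; the labels are invented for illustration and are not the study's data, and the function is hypothetical rather than taken from the authors' pipeline.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = metastasis, 0 = no metastasis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(score)
```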


Subject(s)
Adenocarcinoma of Lung , Carcinoma, Non-Small-Cell Lung , Carcinoma, Squamous Cell , Lung Neoplasms , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/pathology , Carcinoma, Non-Small-Cell Lung/pathology , Carcinoma, Squamous Cell/genetics , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Mutation/genetics
8.
Pharmaceuticals (Basel) ; 15(7)2022 Jun 23.
Article in English | MEDLINE | ID: mdl-35890078

ABSTRACT

Primary immune thrombocytopenia (ITP) is an autoimmune disorder that causes low platelet counts and subsequent bleeding risk. Although current corticosteroid-based ITP therapies are able to improve platelet counts, up to 70% of subjects with an ITP diagnosis do not achieve a sustained clinical response in the absence of treatment, thus requiring a second-line therapy option as well as additional care to prevent bleeding. Less than 40% of patients treated with thrombopoietin analogs, 60% of those treated with splenectomy, and 20% or fewer of those treated with rituximab or fostamatinib reach sustained remission in the absence of treatment. Therefore, optimizing therapeutic options for ITP management is mandatory. The pathophysiology of ITP is complex and involves several mechanisms that are apparently unrelated. These include the clearance of autoantibody-coated platelets by splenic macrophages or by the complement system, hepatic desialylated platelet destruction, and the inhibition of platelet production from megakaryocytes. The number of pathways involved may challenge treatment but, at the same time, offers the possibility of unveiling a variety of new targets as the knowledge of the involved mechanisms progresses. The aim of this work, after reviewing the limitations of the current treatments, is to perform a thorough review of the mechanisms of action, pharmacokinetics/pharmacodynamics, efficacy, safety, and development stage of the novel ITP therapies under investigation. Hopefully, several of the options included herein may allow us to personalize ITP management according to the needs of each patient in the near future.

9.
Vaccines (Basel) ; 10(6)2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35746569

ABSTRACT

Worldwide vaccination against SARS-CoV-2 has allowed the detection of hematologic autoimmune complications. Adverse events (AEs) of this nature had been previously observed in association with other vaccines. The underlying mechanisms are not totally understood, although mimicry between viral and self-antigens plays a relevant role. It is important to remark that, although the incidence of these AEs is extremely low, their evolution may lead to life-threatening scenarios if treatment is not readily initiated. Hematologic autoimmune AEs have been associated with both mRNA and adenoviral vector-based SARS-CoV-2 vaccines. The main reported entities are secondary immune thrombocytopenia, immune thrombotic thrombocytopenic purpura, autoimmune hemolytic anemia, Evans syndrome, and a newly described disorder, so-called vaccine-induced immune thrombotic thrombocytopenia (VITT). The hallmark of VITT is the presence of anti-platelet factor 4 autoantibodies able to trigger platelet activation. Patients with VITT present with thrombocytopenia and may develop thrombosis in unusual locations such as cerebral beds. The management of hematologic autoimmune AEs does not differ significantly from that of these disorders in a non-vaccine context, thus addressing autoantibody production and bleeding/thromboembolic risk. This means that clinicians must be aware of their distinctive signs in order to diagnose them and initiate treatment as soon as possible.

10.
Proc Data Compress Conf ; 2021: 193-202, 2021 Mar.
Article in English | MEDLINE | ID: mdl-34778549

ABSTRACT

Computing the matching statistics of patterns with respect to a text is a fundamental task in bioinformatics, but a formidable one when the text is a highly compressed genomic database. Bannai et al. gave an efficient solution for this case, which Rossi et al. recently implemented, but it uses two passes over the patterns and buffers a pointer for each character during the first pass. In this paper, we simplify their solution and make it streaming, at the cost of slowing it down slightly. This means that, first, we can compute the matching statistics of several long patterns (such as whole human chromosomes) in parallel while still using a reasonable amount of RAM; second, we can compute matching statistics online with low latency and thus quickly recognize when a pattern becomes incompressible relative to the database. Our code is available at https://github.com/koeppl/phoni.
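
As a point of reference for the term used in this abstract, the matching statistics of a pattern assign to each pattern position the length of the longest substring starting there that also occurs in the text. The sketch below is a naive quadratic illustration of that definition only. It is not the streaming algorithm of the paper, it reports lengths without the matching positions, and the function name and example strings are hypothetical.

```python
def matching_statistics(pattern, text):
    """For each i, length of the longest prefix of pattern[i:] found in text.

    Naive illustration; compressed-index methods avoid rescanning the text.
    """
    lengths = []
    for i in range(len(pattern)):
        length = 0
        # Extend the match one character at a time while it still occurs.
        while i + length < len(pattern) and pattern[i:i + length + 1] in text:
            length += 1
        lengths.append(length)
    return lengths


ms = matching_statistics("TACAG", "GATTACA")
print(ms)  # [4, 3, 2, 1, 1]
```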

13.
Proc Worksh Algorithm Eng Exp ; 2021: 60-72, 2021.
Article in English | MEDLINE | ID: mdl-35355938

ABSTRACT

Prefix-free parsing (PFP) was introduced by Boucher et al. (2019) as a preprocessing step to ease the computation of Burrows-Wheeler Transforms (BWTs) of genomic databases. Given a string S, it produces a dictionary D and a parse P of overlapping phrases such that BWT(S) can be computed from D and P in time and workspace bounded in terms of their combined size |PFP(S)|. In practice D and P are significantly smaller than S and computing BWT(S) from them is more efficient than computing it from S directly, at least when S is the concatenation of many genomes. In this paper, we consider PFP(S) as a data structure and show how it can be augmented to support full suffix tree functionality, still built and fitting within O(|PFP(S)|) space. This entails the efficient computation of various primitives to simulate the suffix tree: computing a longest common extension (LCE) of two positions in S; reading any cell of its suffix array (SA), of its inverse (ISA), of its BWT, and of its longest common prefix array (LCP); and computing minima over ranges and next/previous smaller value queries over the LCP. Our experimental results show that the PFP suffix tree can be efficiently constructed for very large repetitive datasets and that its operations perform competitively with other compressed suffix trees that can only handle much smaller datasets.
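
The primitives named in this abstract (SA, ISA, BWT, LCP and LCE) can be illustrated on a toy string. The sketch below computes them naively from the uncompressed string, which is exactly what the PFP-based structure is designed to avoid, so it shows only what each query returns. The function names are hypothetical, and the terminator `$` is assumed to be the smallest character.

```python
def suffix_array(s):
    """Starting positions of the suffixes of s in lexicographic order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def inverse_suffix_array(sa):
    """isa[pos] is the lexicographic rank of the suffix starting at pos."""
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


def lce(s, i, j):
    """Longest common extension: length of the common prefix of s[i:] and s[j:]."""
    n = 0
    while i + n < len(s) and j + n < len(s) and s[i + n] == s[j + n]:
        n += 1
    return n


def lcp_array(s, sa):
    """lcp[k] is the LCE of the suffixes ranked k-1 and k (lcp[0] = 0)."""
    return [0] + [lce(s, sa[k - 1], sa[k]) for k in range(1, len(sa))]


def bwt(s, sa):
    """Character preceding each sorted suffix (wrapping around at position 0)."""
    return "".join(s[i - 1] for i in sa)


s = "banana$"
sa = suffix_array(s)
print(sa)                        # [6, 5, 3, 1, 0, 4, 2]
print(inverse_suffix_array(sa))  # [4, 3, 6, 2, 5, 1, 0]
print(lcp_array(s, sa))          # [0, 0, 1, 3, 0, 0, 2]
print(bwt(s, sa))                # annb$aa
```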

14.
J Bioinform Comput Biol ; 17(3): 1950011, 2019 06.
Article in English | MEDLINE | ID: mdl-31230498

ABSTRACT

Signaling pathways are responsible for the regulation of cell processes, such as monitoring the external environment, transmitting information across membranes, and making cell fate decisions. Given the increasing amount of biological data available and the recent discoveries showing that many diseases are related to the disruption of cellular signal transduction cascades, in silico discovery of signaling pathways in cell biology has become an active research topic in recent years. However, reconstruction of signaling pathways remains a challenge, mainly because of the need for systematic approaches for predicting causal relationships, such as edge direction and activation/inhibition among interacting proteins in the signal flow. We propose an approach for predicting signaling pathways that integrates protein interactions, gene expression, phenotypes, and protein complex information. Our method first finds candidate pathways using a directed-edge-based algorithm and then defines a graph model to include causal activation relationships among proteins in candidate pathways, using cell cycle gene expression and phenotypes to infer consistent pathways in yeast. Then, we incorporate protein complex coverage information for deciding on the final predicted signaling pathways. We show that our approach improves the predictive results of the state of the art using different ranking metrics.


Subject(s)
Cell Cycle , Computational Biology/methods , Multiprotein Complexes/metabolism , Signal Transduction , Algorithms , Cell Cycle/genetics , Computer Graphics , Data Visualization , Gene Expression , Protein Interaction Mapping/methods , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
15.
Bioinformatics ; 35(20): 4120-4128, 2019 10 15.
Article in English | MEDLINE | ID: mdl-30887042

ABSTRACT

MOTIVATION: Genome repositories are growing faster than our storage capacities, challenging our ability to store, transmit, process and analyze them. While genomes are not very compressible individually, those repositories usually contain myriads of genomes or genome reads of the same species, thereby creating opportunities for orders-of-magnitude compression by exploiting inter-genome similarities. A useful compression system, however, cannot be usable only for archival; it must allow direct access to the sequences, ideally in transparent form so that applications do not need to be rewritten. RESULTS: We present a highly compressed filesystem that specializes in storing large collections of genomes and reads. The system obtains orders-of-magnitude compression by using Relative Lempel-Ziv, which exploits the high similarities between genomes of the same species. The filesystem transparently stores the files in compressed form, intercepting the system calls of the applications without the need to modify them. A client/server variant of the system stores the compressed files in a server, while the client's filesystem transparently retrieves and updates the data from the server. The data between client and server are also transferred in compressed form, which saves an order of magnitude in network time. AVAILABILITY AND IMPLEMENTATION: The C++ source code of our implementation is available for download at https://github.com/vsepulve/relz_fs.
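
The idea behind Relative Lempel-Ziv, as named in this abstract, is to encode a sequence as a list of pointers into a fixed reference sequence. The sketch below is a naive greedy parser written for illustration; it is not the relz_fs implementation, the function names are hypothetical, and the choice to emit a literal character when no match exists is an assumption about how unmatched symbols could be handled.

```python
def rlz_parse(reference, target):
    """Greedy parse of target into (position, length) pointers into reference.

    Characters absent from the reference are emitted as one-character literals.
    """
    phrases = []
    i = 0
    while i < len(target):
        pos, length = -1, 0
        # Grow the phrase while it still occurs somewhere in the reference.
        while i + length < len(target):
            found = reference.find(target[i:i + length + 1])
            if found < 0:
                break
            pos, length = found, length + 1
        if length == 0:
            phrases.append(target[i])  # literal: no match in the reference
            i += 1
        else:
            phrases.append((pos, length))
            i += length
    return phrases


def rlz_decode(reference, phrases):
    """Rebuild the target from the reference and the phrase list."""
    return "".join(
        p if isinstance(p, str) else reference[p[0]:p[0] + p[1]]
        for p in phrases
    )


reference = "ACGTACGTTAGC"
target = "ACGTTAGGACGT"
phrases = rlz_parse(reference, target)
print(phrases)  # [(4, 7), (2, 1), (0, 4)]
```

Twelve characters are represented by three pointers here; on collections of near-identical genomes the phrases become much longer, which is where the large compression ratios come from.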


Subject(s)
Data Compression , Genome , Software
16.
Comput J ; 61(5): 773-788, 2018 May.
Article in English | MEDLINE | ID: mdl-29795706

ABSTRACT

Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees.

17.
PLoS One ; 12(9): e0183460, 2017.
Article in English | MEDLINE | ID: mdl-28937982

ABSTRACT

Many proteins work together with others in groups called complexes in order to achieve a specific function. Discovering protein complexes is important for understanding biological processes and predicting protein functions in living organisms. Large-scale, high-throughput techniques have made it possible to compile protein-protein interaction networks (PPI networks), which have been used in several computational approaches for detecting protein complexes. Those predictions might guide future biological experimental research. Some approaches are topology-based, where highly connected proteins are predicted to be complexes; some propose different clustering algorithms using partitioning or overlaps among clusters for networks modeled with unweighted or weighted graphs; and others use density of clusters and information based on protein functionality. However, some schemes still require much processing time, or the quality of their results can be improved. Furthermore, most of the results obtained with computational tools are not accompanied by an analysis of false positives. We propose an effective and efficient mining algorithm for discovering highly connected subgraphs, which is our base for defining protein complexes. Our representation is based on transforming the PPI network into a directed acyclic graph that reduces the number of represented edges and the search space for discovering subgraphs. Our approach considers weighted and unweighted PPI networks. We compare our best alternative using PPI networks from Saccharomyces cerevisiae (yeast) and Homo sapiens (human) with state-of-the-art approaches in terms of clustering, biological metrics and execution times, as well as three gold standards for yeast and two for human. Furthermore, we analyze false positive predicted complexes by searching the PDBe (Protein Data Bank in Europe) database in order to identify matching protein complexes that have been purified and structurally characterized. Our analysis shows that more than 50 yeast protein complexes and more than 300 human protein complexes found to be false positives according to our prediction method, i.e., not described in the gold standard complex databases, in fact contain protein complexes that have been characterized structurally and documented in PDBe. We also found that some of these protein complexes have recently been classified as part of a Periodic Table of Protein Complexes. The latest version of our software is publicly available at http://doi.org/10.6084/m9.figshare.5297314.v1.


Subject(s)
Algorithms , Models, Molecular , Protein Interaction Mapping/methods , Proteins/metabolism , Humans , Internet , Saccharomyces cerevisiae , Software
18.
Inf Retr Boston ; 20(3): 253-291, 2017.
Article in English | MEDLINE | ID: mdl-28596702

ABSTRACT

Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-k document retrieval (find the k documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple [Formula: see text] model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.
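
The three queries defined in this abstract can be stated precisely with a brute-force sketch. The code below scans every document, so it only fixes what each query should return; the compressed indexes of the paper answer the same queries without scanning. The function names and the example collection are hypothetical, occurrences are counted with overlaps, and ties in the top-k ranking are broken by document number as an arbitrary choice.

```python
def count_occurrences(text, pattern):
    """Number of (possibly overlapping) occurrences of pattern in text."""
    return sum(
        1
        for i in range(len(text) - len(pattern) + 1)
        if text.startswith(pattern, i)
    )


def document_listing(docs, pattern):
    """Identifiers of all documents in which pattern appears."""
    return [d for d, text in enumerate(docs) if pattern in text]


def document_counting(docs, pattern):
    """Number of documents in which pattern appears."""
    return len(document_listing(docs, pattern))


def top_k_documents(docs, pattern, k):
    """The k documents in which pattern appears most often."""
    freqs = [(count_occurrences(text, pattern), d) for d, text in enumerate(docs)]
    ranked = sorted((f for f in freqs if f[0] > 0), key=lambda f: (-f[0], f[1]))
    return [d for _, d in ranked[:k]]


docs = ["GATTACA", "TATATA", "CCCC", "ATAT"]
print(document_listing(docs, "TA"))    # [0, 1, 3]
print(document_counting(docs, "TA"))   # 3
print(top_k_documents(docs, "TA", 2))  # [1, 0]
```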
