Results 1 - 8 of 8
1.
Entropy (Basel) ; 26(4)2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38667833

ABSTRACT

Structural properties of the currency market were examined with the use of topological networks. Relationships between currencies were analyzed by constructing minimal spanning trees (MSTs). The dissimilarities between time series of currency returns were measured in various ways: by applying Euclidean distance, Pearson's linear correlation coefficient, Spearman's rank correlation coefficient, Kendall's coefficient, partial correlation, dynamic time warping measure, and Kullback-Leibler relative entropy. For the constructed MSTs, their topological characteristics were analyzed and conclusions were drawn regarding the influence of the dissimilarity measure used. It turned out that the strength of most types of correlations was highly dependent on the choice of the numeraire currency, while partial correlations were invariant in this respect. It can be stated that a network built on the basis of partial correlations provides a more adequate illustration of pairwise relationships in the foreign exchange market. The data for quotations of 37 of the most important world currencies and four precious metals in the period from 1 January 2019 to 31 December 2022 were used. The outbreak of the COVID-19 pandemic in 2020 and Russia's invasion of Ukraine in 2022 triggered changes in the topology of the currency network. As a result of these crises, the average distances between tree nodes decreased and the centralization of graphs increased. Our results confirm that currencies are often pegged to other currencies due to countries' geographic locations and economic ties. The detected structures can be useful in descriptions of the currency market, can help in constructing a stable portfolio of the foreign exchange rates, and can be a valuable tool in searching for economic factors influencing specific groups of countries.
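
The tree construction described in this abstract can be sketched in a few lines. The example below is only an illustration, not the authors' code: it assumes Pearson correlation as the similarity, the commonly used mapping d = sqrt(2(1 - rho)) to turn it into a distance, and Prim's algorithm for the tree. The series names (`AAA`, `BBB`, ...) and values are made up.

```python
import math

def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def correlation_distance(x, y):
    # Assumed mapping from correlation to distance: d = sqrt(2 * (1 - rho)).
    return math.sqrt(max(0.0, 2.0 * (1.0 - pearson(x, y))))

def minimum_spanning_tree(series):
    """Prim's algorithm; returns a list of (node_in_tree, new_node, distance)."""
    names = list(series)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge that connects the current tree to a new node.
        best = min(
            ((a, b, correlation_distance(series[a], series[b]))
             for a in in_tree for b in names if b not in in_tree),
            key=lambda edge: edge[2],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Hypothetical return series, for illustration only.
returns = {
    "AAA": [0.1, -0.2, 0.3, 0.0, -0.1],
    "BBB": [0.2, -0.1, 0.2, 0.1, -0.2],
    "CCC": [-0.1, 0.2, -0.3, 0.1, 0.1],
    "DDD": [0.0, 0.1, -0.2, 0.2, 0.0],
}
mst_edges = minimum_spanning_tree(returns)
for a, b, d in mst_edges:
    print(f"{a} - {b}: {d:.3f}")
```

Swapping `correlation_distance` for another dissimilarity (rank correlation, dynamic time warping, and so on) changes the tree, which is the kind of sensitivity the abstract examines.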

2.
Ecol Evol ; 13(10): e10520, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37809360

ABSTRACT

Time series are a critical component of ecological analysis, used to track changes in biotic and abiotic variables. Information can be extracted from the properties of time series for tasks such as classification (e.g., assigning species to individual bird calls); clustering (e.g., clustering similar responses in population dynamics to abrupt changes in the environment or management interventions); prediction (e.g., accuracy of model predictions to original time series data); and anomaly detection (e.g., detecting possible catastrophic events from population time series). These common tasks in ecological research all rely on the notion of (dis-) similarity, which can be determined using distance measures. A plethora of distance measures have been described, predominantly in the computer and information sciences, but many have not been introduced to ecologists. Furthermore, little is known about how to select appropriate distance measures for time-series-related tasks. Therefore, many potential applications remain unexplored. Here, we describe 16 properties of distance measures that are likely to be of importance to a variety of ecological questions involving time series. We then test 42 distance measures for each property and use the results to develop an objective method to select appropriate distance measures for any task and ecological dataset. We demonstrate our selection method by applying it to a set of real-world data on breeding bird populations in the UK and discuss other potential applications for distance measures, along with associated technical issues common in ecology. Our real-world population trends exhibit a common challenge for time series comparisons: a high level of stochasticity. We demonstrate two different ways of overcoming this challenge, first by selecting distance measures with properties that make them well suited to comparing noisy time series and second by applying a smoothing algorithm before selecting appropriate distance measures. In both cases, the distance measures chosen through our selection method are not only fit-for-purpose but are consistent in their rankings of the population trends. The results of our study should lead to an improved understanding of, and greater scope for, the use of distance measures for comparing ecological time series and help us answer new ecological questions.
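
The second strategy mentioned in this abstract, smoothing before measuring distance, can be illustrated with a small sketch. The abstract does not name the smoothing algorithm or the distance measures selected, so a centered moving average and Euclidean distance are assumptions made here for illustration, and the two series are synthetic.

```python
import math

def moving_average(series, window=3):
    """Centered moving average; the window shrinks at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Two synthetic series sharing one trend but carrying opposite noise.
trend = [10, 11, 12, 13, 14, 15, 16, 17]
noisy_a = [t + e for t, e in zip(trend, [1, -1, 1, -1, 1, -1, 1, -1])]
noisy_b = [t + e for t, e in zip(trend, [-1, 1, -1, 1, -1, 1, -1, 1])]

raw_distance = euclidean(noisy_a, noisy_b)
smoothed_distance = euclidean(moving_average(noisy_a), moving_average(noisy_b))
print(raw_distance, smoothed_distance)
```

Because the two series differ only in their noise, the distance after smoothing is smaller than the raw distance, which is the behavior one would want when comparing underlying trends.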

3.
BMC Bioinformatics ; 22(1): 170, 2021 Mar 31.
Article in English | MEDLINE | ID: mdl-33789571

ABSTRACT

BACKGROUND: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al. in Safari an OMC. Robust statistics, 2019. https://login.proxy.bib.uottawa.ca/login?url=https://learning.oreilly.com/library/view/-/9781119214687/?ar&orpq&email=^u). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests were used to identify those genes most significantly associated with Williams Syndrome (WS). RESULTS: Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05). CONCLUSIONS: Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE, but TW, TWR, T, S, and P correlations were in close proximity. The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available for use to perform all necessary computations for the proposed methods.
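
The sensitivity of Pearson correlation to outliers, which motivates this article, is easy to reproduce. The sketch below compares Pearson with Spearman on made-up data containing a single outlier. It does not implement the Taba, TabWil, or TabWil rank estimators, whose definitions are not given in the abstract.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    # Spearman's coefficient is Pearson's coefficient computed on ranks.
    return pearson(ranks(x), ranks(y))

# Illustrative data: a perfect linear relation, then one corrupted value.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_clean = [2, 4, 6, 8, 10, 12, 14, 16]
y_outlier = y_clean[:-1] + [-100]

print("clean:  ", pearson(x, y_clean), spearman(x, y_clean))
print("outlier:", pearson(x, y_outlier), spearman(x, y_outlier))
```

In this toy case one corrupted observation is enough to turn the Pearson coefficient negative, while the rank-based coefficient drops but stays positive.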


Subjects
Correlation of Data, Computer Simulation, Humans
4.
Front Microbiol ; 11: 567769, 2020.
Article in English | MEDLINE | ID: mdl-33304326

ABSTRACT

Phages are viruses that infect bacteria. They can be classified into two categories based on their lifestyles: temperate and lytic. Metavirome sequencing can now generate a large number of fragments from the viral genomic sequences of an entire environmental community, which makes it impossible to determine their lifestyles through experiments. Thus, there is a need to develop computational methods for annotating phage contigs and predicting their lifestyles. Alignment-based methods for classifying phage lifestyle are limited by incompletely assembled genomes and nucleotide databases. Alignment-free methods based on the frequencies of k-mers have been widely used for genome and metagenome comparison, as they do not rely on the completeness of genomes or nucleotide databases. To mimic fragmented metagenomic sequences, the genomic sequences of temperate and lytic phages were split into non-overlapping fragments of different lengths. I then comprehensively compared nine alignment-free dissimilarity measures, with a wide range of choices of k-mer length and Markov order, for predicting the lifestyles of these phage contigs. The dissimilarity measure d2S performed better than the other dissimilarity measures for classifying the lifestyles of phages. Thus, I propose that the alignment-free method d2S can be used for predicting the lifestyles of phages derived from metagenomic data.
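
The general idea of an alignment-free comparison can be sketched as follows: split sequences into non-overlapping fragments, count k-mers, and compare the frequency profiles. The example uses plain Euclidean distance between k-mer frequency vectors as a stand-in. It is not the d2S measure, which additionally corrects counts with a Markov background model; the sequences and parameter values are made up.

```python
import math
from collections import Counter

def split_fragments(sequence, length):
    """Non-overlapping fragments; a trailing remainder shorter than
    `length` is dropped (an assumption made for this sketch)."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, length)]

def kmer_frequencies(sequence, k):
    """Relative frequency of every k-mer that occurs in the sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def kmer_euclidean(seq_a, seq_b, k=2):
    """Euclidean distance between two k-mer frequency profiles."""
    fa, fb = kmer_frequencies(seq_a, k), kmer_frequencies(seq_b, k)
    return math.sqrt(sum((fa.get(m, 0.0) - fb.get(m, 0.0)) ** 2
                         for m in set(fa) | set(fb)))

# Toy sequences, for illustration only.
genome = "ACGTACGTACGGTTAACCGG"
fragments = split_fragments(genome, 8)
print(fragments)
print(kmer_euclidean(fragments[0], fragments[1], k=2))
```

A classifier could then assign each fragment the lifestyle of its nearest labeled reference under the chosen dissimilarity; that step is omitted here.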

5.
R Soc Open Sci ; 5(1): 171545, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29410857

ABSTRACT

We use an information-theoretic measure of linguistic similarity to investigate the organization and evolution of scientific fields. An analysis of almost 20 million papers from the past three decades reveals that linguistic similarity is related to, but different from, expert and citation-based classifications, leading to an improved view of the organization of science. A temporal analysis of the similarity of fields shows that some fields (e.g. computer science) are becoming increasingly central, but that on average the similarity between pairs of disciplines has not changed in recent decades. This suggests that tendencies of convergence (e.g. multi-disciplinarity) and divergence (e.g. specialization) of disciplines are in balance.
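
The abstract does not say which information-theoretic measure is used. As one plausible illustration, the sketch below computes the Jensen-Shannon divergence between the word-frequency distributions of two texts; the choice of measure, the whitespace tokenization, and the example sentences are all assumptions.

```python
import math
from collections import Counter

def word_distribution(text):
    """Relative word frequencies using naive whitespace tokenization."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with no words in common."""
    vocab = set(p) | set(q)
    mixture = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return entropy(mixture) - 0.5 * (entropy(p) + entropy(q))

# Made-up snippets standing in for two fields' corpora.
physics = word_distribution("energy field particle energy quantum field")
biology = word_distribution("cell gene protein cell energy gene")
print(jensen_shannon(physics, biology))
```

Lower values indicate more similar vocabularies, so a matrix of such values across fields could serve as the dissimilarity input for clustering or network analysis.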

6.
Ecol Evol ; 7(13): 4835-4843, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28690812

ABSTRACT

The amount of variation in species composition among sampling units, or beta diversity, has become a primary tool for connecting the spatial structure of species assemblages to ecological processes. Many different measures of beta diversity have been developed. Among them, the total variance in the community composition matrix has been proposed as a single-number estimate of beta diversity. In this study, I first show that this measure summarizes the compositional variation among sampling units after nonlinear transformation of species abundances. Therefore, it is not always adequate for estimating beta diversity. Next, I propose an alternative approach for calculating beta diversity in which variance is substituted by a weighted measure of concentration (i.e., an inverse measure of evenness). The relationship between this new measure of beta diversity and so-called multiple-site dissimilarity measures is also discussed.
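
The variance-based estimate that this abstract critiques can be written down compactly. The sketch below computes the total variance of a sites-by-species matrix as the sum of squared deviations from the species means divided by n - 1. The Hellinger transformation is included as one example of a nonlinear transformation commonly applied first; that choice is an assumption, and the alternative concentration-based measure proposed in the article is not implemented because the abstract does not define it.

```python
import math

def hellinger(matrix):
    """Square root of relative abundances within each sampling unit (row)."""
    return [[math.sqrt(v / sum(row)) for v in row] for row in matrix]

def total_variance(matrix):
    """Sum of squared deviations from column means, divided by (n - 1)."""
    n = len(matrix)
    n_species = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(n_species)]
    ss_total = sum((row[j] - means[j]) ** 2
                   for row in matrix for j in range(n_species))
    return ss_total / (n - 1)

# Toy abundance data: rows are sampling units, columns are species.
abundances = [
    [10, 0, 5],
    [8, 2, 5],
    [0, 12, 3],
]
beta_total = total_variance(hellinger(abundances))
print(beta_total)
```

Identical sampling units give a value of zero, and the value grows as units share fewer species.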

7.
Ciênc. rural ; 41(9): 1503-1508, Sep. 2011. illus., tab.
Article in Portuguese | LILACS | ID: lil-600727

ABSTRACT

The objective of this work was to evaluate the consistency of the clustering pattern obtained from combinations of two dissimilarity measures and four clustering methods, in scenarios formed by combinations of the number of cultivars and the number of variables, using real data from corn (Zea mays L.) cultivars and simulated data. Real data were used from five variables measured in 69 corn cultivar competition trials, in which the number of cultivars evaluated ranged from 9 to 40. To investigate the results with larger numbers of cultivars and variables, 1,000 experiments were simulated under the standard normal distribution for each of the 54 scenarios formed by combining the number of cultivars (20, 30, 40, 50, 60, 70, 80, 90, and 100) with the number of variables (5, 6, 7, 8, 9, and 10). Correlation, multicollinearity diagnostic, and cluster analyses were carried out. The consistency of the clustering pattern was evaluated by the cophenetic correlation coefficient. Clustering pattern consistency decreases as the number of cultivars and variables increases. Euclidean distance provides greater clustering pattern consistency than Manhattan distance. Clustering pattern consistency among the methods increases in the following order: Ward's method, complete linkage, single linkage, and average linkage between groups.
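
The cophenetic correlation coefficient used in this study is the Pearson correlation between the original pairwise distances and the heights at which each pair is first merged in the dendrogram. The sketch below is a naive illustration, not the authors' procedure: it supports only single and complete linkage (via `min` and `max`), leaves out Ward's method and average linkage, and uses made-up points.

```python
import math
from itertools import combinations

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def cophenetic_correlation(points, dist, linkage=min):
    """linkage=min gives single linkage, linkage=max complete linkage."""
    n = len(points)
    d = {frozenset(pair): dist(points[pair[0]], points[pair[1]])
         for pair in combinations(range(n), 2)}

    def between(a, b):
        return linkage(d[frozenset((i, j))] for i in a for j in b)

    clusters = [frozenset([i]) for i in range(n)]
    cophenetic = {}
    while len(clusters) > 1:
        # Merge the two closest clusters and record the merge height
        # for every pair of points joined by this merge.
        a, b = min(combinations(clusters, 2), key=lambda ab: between(*ab))
        height = between(a, b)
        for i in a:
            for j in b:
                cophenetic[frozenset((i, j))] = height
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]

    pairs = list(d)
    return pearson([d[p] for p in pairs], [cophenetic[p] for p in pairs])

# Made-up observations (e.g. cultivars described by two variables).
points = [(0, 0), (1, 0), (0, 2), (6, 5), (7, 7)]
ccc_euclidean = cophenetic_correlation(points, euclidean)
ccc_manhattan = cophenetic_correlation(points, manhattan)
print(ccc_euclidean, ccc_manhattan)
```

Values closer to 1 mean the dendrogram preserves the original distances more faithfully, which is the sense of "consistency" in the abstract.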

8.
Ciênc. rural ; 38(8): 2138-2145, Nov. 2008. illus., tab.
Article in Portuguese | LILACS | ID: lil-511990

ABSTRACT

The aims of this work were to compare clustering methods based on two dissimilarity measures (standardized average Euclidean distance and Mahalanobis generalized distance) and to obtain information on genetic divergence among common bean (Phaseolus vulgaris L.) cultivars. Fourteen common bean cultivars were evaluated in nine experiments conducted in Santa Maria, Rio Grande do Sul State, Brazil (latitude 29°42' S, longitude 53°49' W, altitude 95 m), in the agricultural years 2000/2001 to 2004/2005. A randomized block design with three replications was used to evaluate the following traits: grain yield, number of pods per plant, number of seeds per pod, weight of 100 grains, final plant population, number of days from emergence to flowering, number of days from emergence to harvest, height of insertion of the first and of the last pod, and degree of lodging. Clusters based on the standardized average Euclidean distance differ from those formed on the basis of the Mahalanobis generalized distance. Tocher's method and the hierarchical methods of single linkage, Ward, complete linkage, median, average linkage within groups, and average linkage between groups, when based on the Mahalanobis generalized distance, form concordant groups. The cultivar 'Iraí' behaves differently from the other cultivars.
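
One of the two dissimilarity measures named in this abstract can be sketched briefly. The example assumes a common definition of the standardized average Euclidean distance: each trait is standardized to zero mean and unit standard deviation, and the squared differences are averaged over the number of traits before taking the square root. The abstract does not spell out the formula, so this definition and the trait values below are assumptions.

```python
import math
import statistics

def standardize(rows):
    """Center each column and scale it by its sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def mean_standardized_euclidean(rows):
    """Matrix of sqrt(mean over traits of squared standardized differences)."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(z[i], z[j])) / p)
             for j in range(n)] for i in range(n)]

# Hypothetical cultivar means for two traits on very different scales.
traits = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]
distances = mean_standardized_euclidean(traits)
for row in distances:
    print([round(v, 3) for v in row])
```

Standardizing keeps a trait measured in large units from dominating the distance. The Mahalanobis generalized distance goes a step further by also accounting for correlations among traits through the inverse of their covariance matrix, which is one reason the two measures can yield different clusters.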
