1.
PLoS One; 16(4): e0247751, 2021.
Article in English | MEDLINE | ID: mdl-33826612

ABSTRACT

There are many psychological applications that require collapsing the information in a two-mode (e.g., respondents-by-attributes) binary matrix into a one-mode (e.g., attributes-by-attributes) similarity matrix. This process requires the selection of a measure of similarity between binary attributes. A vast number of binary similarity coefficients have been proposed in fields such as biology, geology, and ecology. Although previous studies have reported cluster analyses of binary similarity coefficients, there has been little exploration of how cluster memberships are affected by the base rates (percentage of ones) for the binary attributes. We conducted a simulation experiment that compared two-cluster K-median partitions of 71 binary similarity coefficients based on their pairwise correlations obtained under 15 different base-rate configurations. The results reveal that some subsets of coefficients consistently group together regardless of the base rates. However, there are other subsets of coefficients that group together for some base rates, but not for others.


Subjects
Algorithms; Computer Simulation; Models, Theoretical
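
The collapsing step described in the abstract above can be illustrated with a minimal sketch. It assumes two well-known coefficients (Jaccard and simple matching) as stand-ins for the 71 coefficients studied; the function names, the toy `RESPONSES` matrix, and the convention of setting the diagonal to 1.0 are illustrative assumptions, not taken from the article.

```python
from itertools import combinations


def contingency(x, y):
    """Return counts a (1,1), b (1,0), c (0,1), d (0,0) for two binary vectors."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d


def jaccard(x, y):
    """Jaccard coefficient: joint presences over all non-(0,0) cases."""
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def simple_matching(x, y):
    """Simple matching coefficient: agreements (1,1 and 0,0) over all cases."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)


def one_mode(matrix, coef):
    """Collapse a respondents-by-attributes binary matrix into an
    attributes-by-attributes similarity matrix using coefficient `coef`."""
    cols = list(zip(*matrix))  # one tuple per attribute
    m = len(cols)
    sim = [[1.0] * m for _ in range(m)]  # diagonal set to 1.0 by convention
    for i, j in combinations(range(m), 2):
        sim[i][j] = sim[j][i] = coef(cols[i], cols[j])
    return sim


# Hypothetical data: 4 respondents (rows) by 3 attributes (columns).
RESPONSES = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]

JACCARD_SIM = one_mode(RESPONSES, jaccard)
SMC_SIM = one_mode(RESPONSES, simple_matching)
```

In this toy matrix the third attribute has a low base rate (one "1" in four rows). Simple matching credits joint absences and Jaccard does not, so the two coefficients can rank attribute pairs differently, which is the kind of base-rate sensitivity the abstract refers to.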
2.
Br J Math Stat Psychol; 73(3): 375-396, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31512759

ABSTRACT

Most partitioning methods used in psychological research seek to produce homogeneous groups (i.e., groups with low intra-group dissimilarity). However, there are also applications where the goal is to provide heterogeneous groups (i.e., groups with high intra-group dissimilarity). Examples of these anticlustering contexts include construction of stimulus sets, formation of student groups, assignment of employees to project work teams, and assembly of test forms from a bank of items. Unfortunately, most commercial software packages are not equipped to accommodate the objective criteria and constraints that commonly arise for anticlustering problems. Two important objective criteria for anticlustering based on information in a dissimilarity matrix are: a diversity measure based on within-cluster sums of dissimilarities; and a dispersion measure based on the within-cluster minimum dissimilarities. In many instances, it is possible to find a partition that provides a large improvement in one of these two criteria with little (or no) sacrifice in the other criterion. For this reason, it is of significant value to explore the trade-offs that arise between these two criteria. Accordingly, the key contribution of this paper is the formulation of a bicriterion optimization problem for anticlustering based on the diversity and dispersion criteria, along with heuristics to approximate the Pareto efficient set of partitions. A motivating example and computational study are provided within the framework of test assembly.


Subjects
Cluster Analysis; Models, Statistical; Psychology/statistics & numerical data; Algorithms; Computer Heuristics; Computer Simulation; Educational Measurement/statistics & numerical data; Humans; Neuropsychological Tests/statistics & numerical data; Psychometrics/statistics & numerical data
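
The two anticlustering criteria named in the abstract above can be sketched as follows. This is a minimal illustration, assuming both criteria are to be maximized and using brute-force comparison of a few candidate partitions in place of the article's heuristics; the function names and the toy dissimilarity matrix `D` are hypothetical.

```python
from itertools import combinations


def within_pairs(labels, dissim):
    """Yield the dissimilarity of every pair of objects in the same group."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            yield dissim[i][j]


def diversity(labels, dissim):
    """Diversity: sum of within-group dissimilarities."""
    return sum(within_pairs(labels, dissim))


def dispersion(labels, dissim):
    """Dispersion: minimum within-group dissimilarity."""
    return min(within_pairs(labels, dissim), default=float("inf"))


def pareto_front(candidates, dissim):
    """Keep candidate partitions not dominated on (diversity, dispersion),
    treating both criteria as 'larger is better'."""
    scored = [(diversity(p, dissim), dispersion(p, dissim), p) for p in candidates]
    return [
        s for s in scored
        if not any(
            t[0] >= s[0] and t[1] >= s[1] and (t[0] > s[0] or t[1] > s[1])
            for t in scored
        )
    ]


# Hypothetical symmetric dissimilarity matrix for 4 objects.
D = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]

# The three ways to split 4 objects into two groups of two.
CANDIDATES = [[0, 0, 1, 1], [0, 1, 1, 0], [0, 1, 0, 1]]
FRONT = pareto_front(CANDIDATES, D)
```

Here a single partition is best on both criteria, so the Pareto set has one member. With larger problems the set typically contains several partitions that trade diversity against dispersion, and full enumeration is no longer practical.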
3.
Br J Math Stat Psychol; 72(1): 155-182, 2019 Feb.
Article in English | MEDLINE | ID: mdl-29633235

ABSTRACT

Affinity propagation is a message-passing-based clustering procedure that has received widespread attention in domains such as biological science, physics, and computer science. However, its implementation in psychology and related areas of social science is comparatively scant. In this paper, we describe the basic principles of affinity propagation, its relationship to other clustering problems, and the types of data for which it can be used for cluster analysis. More importantly, we identify the strengths and weaknesses of affinity propagation as a clustering tool in general and highlight potential opportunities for its use in psychological research. Numerical examples are provided to illustrate the method.


Subjects
Algorithms; Cluster Analysis; Pattern Recognition, Automated/methods; Psychology/methods; Computer Simulation; Humans; Research; Research Design
4.
Br J Math Stat Psychol; 70(1): 1-24, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28130935

ABSTRACT

The emergence of Gaussian model-based partitioning as a viable alternative to K-means clustering fosters a need for discrete optimization methods that can be efficiently implemented using model-based criteria. A variety of alternative partitioning criteria have been proposed for more general data conditions that permit elliptical clusters, different spatial orientations for the clusters, and unequal cluster sizes. Unfortunately, many of these partitioning criteria are computationally demanding, which makes the multiple-restart (multistart) approach commonly used for K-means partitioning less effective as a heuristic solution strategy. As an alternative, we propose an approach based on iterated local search (ILS), which has proved effective in previous combinatorial data analysis contexts. We compared multistart, ILS and hybrid multistart-ILS procedures for minimizing a very general model-based criterion that assumes no restrictions on cluster size or within-group covariance structure. This comparison, which used 23 data sets from the classification literature, revealed that the ILS and hybrid heuristics generally provided better criterion function values than the multistart approach when all three methods were constrained to the same 10-min time limit. In many instances, these differences in criterion function values reflected profound differences in the partitions obtained.


Subjects
Algorithms; Cluster Analysis; Data Interpretation, Statistical; Models, Statistical; Normal Distribution; Computer Simulation
5.
Multivariate Behav Res; 43(1): 29-49, 2008.
Article in English | MEDLINE | ID: mdl-26788971

ABSTRACT

Clusterwise linear regression is a multivariate statistical procedure that attempts to cluster objects with the objective of minimizing the sum of the error sums of squares for the within-cluster regression models. In this article, we show that the minimization of this criterion makes no effort to distinguish the error explained by the within-cluster regression models from the error explained by the clustering process. In some cases, most of the variation in the response variable is explained by clustering the objects, with little additional benefit provided by the within-cluster regression models. Accordingly, there is tremendous potential for overfitting with clusterwise regression, which is demonstrated with numerical examples and simulation experiments. To guard against the misuse of clusterwise regression, we recommend a benchmarking procedure that compares the results for the observed empirical data with those obtained across a set of random permutations of the response measures. We also demonstrate the potential for overfitting via an empirical application related to the prediction of reflective judgment using high school and college performance measures.
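
The permutation benchmark recommended in the abstract above might look like the following sketch. It assumes a single predictor, two clusters, and exhaustive enumeration of partitions in place of a real clusterwise regression heuristic; all function names, the `min_size` constraint, and the toy data are hypothetical.

```python
import random
from itertools import product


def sse_line(xs, ys):
    """Error sum of squares for a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))


def best_two_cluster_sse(xs, ys, min_size=3):
    """Minimum, over all two-cluster partitions, of the summed within-cluster
    regression SSE (brute force; feasible only for very small n)."""
    n = len(xs)
    best = float("inf")
    for tail in product((0, 1), repeat=n - 1):
        labels = (0,) + tail  # fix object 0 in cluster 0 to skip mirror images
        groups = [[i for i in range(n) if labels[i] == k] for k in (0, 1)]
        if min(len(g) for g in groups) < min_size:
            continue
        total = sum(
            sse_line([xs[i] for i in g], [ys[i] for i in g]) for g in groups
        )
        best = min(best, total)
    return best


def permutation_benchmark(xs, ys, n_perm=20, seed=0):
    """Compare the observed criterion with values obtained after randomly
    permuting the response, which breaks any real x-y relationship."""
    rng = random.Random(seed)
    observed = best_two_cluster_sse(xs, ys)
    null = []
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        null.append(best_two_cluster_sse(xs, shuffled))
    return observed, null


# Hypothetical data: two groups of four points, each lying exactly on its own line.
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [2, 4, 6, 8, 15, 14, 13, 12]
OBSERVED, NULL = permutation_benchmark(XS, YS)
```

If the permuted responses yield criterion values close to the observed one, the apparent fit owes more to the clustering step than to the within-cluster regressions, which is the overfitting risk the abstract describes.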

6.
Br J Math Stat Psychol; 58(Pt 2): 319-32, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16293203

ABSTRACT

Partitioning indices associated with the within-cluster sums of pairwise dissimilarities often exhibit a systematic bias towards clusters of a particular size, whereas minimization of the partition diameter (i.e. the maximum dissimilarity element across all pairs of objects within the same cluster) does not typically have this problem. However, when the partition-diameter criterion is used, there is often a myriad of alternative optimal solutions that can vary significantly with respect to their substantive interpretation. We propose a bicriterion partitioning approach that considers both diameter and within-cluster sums in the optimization problem and facilitates selection from among the alternative optima. We developed several MATLAB-based exchange algorithms that rapidly provide excellent solutions to bicriterion partitioning problems. These algorithms were evaluated using synthetic data sets, as well as an empirical dissimilarity matrix.


Subjects
Models, Psychological; Cluster Analysis; Humans; Psychology/methods; Psychology/statistics & numerical data
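
The tie-breaking idea in the abstract above can be illustrated with a small sketch. It assumes a lexicographic rule (smallest diameter first, then smallest within-cluster sum) as one simple way to combine the two criteria; this rule, the function names, and the toy matrix are assumptions for illustration and are not the article's MATLAB exchange algorithms.

```python
from itertools import combinations


def within_pairs(labels, dissim):
    """Yield the dissimilarity of every pair of objects in the same cluster."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            yield dissim[i][j]


def diameter(labels, dissim):
    """Partition diameter: largest dissimilarity within any cluster."""
    return max(within_pairs(labels, dissim), default=0)


def within_sum(labels, dissim):
    """Sum of pairwise dissimilarities within clusters."""
    return sum(within_pairs(labels, dissim))


def pick_partition(candidates, dissim):
    """Choose the candidate with the smallest diameter, breaking ties by the
    smallest within-cluster sum (an assumed lexicographic rule)."""
    return min(
        candidates,
        key=lambda p: (diameter(p, dissim), within_sum(p, dissim)),
    )


# Hypothetical symmetric dissimilarity matrix for 4 objects.
D2 = [
    [0, 3, 3, 5],
    [3, 0, 5, 3],
    [3, 5, 0, 1],
    [5, 3, 1, 0],
]

CANDIDATES2 = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]
CHOSEN = pick_partition(CANDIDATES2, D2)
```

The first two candidates share the minimum diameter of 3, so the diameter criterion alone cannot separate them; the within-cluster sum (4 versus 6) selects the first.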