1.
IEEE Trans Neural Netw Learn Syst ; 34(8): 4153-4166, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34752411

ABSTRACT

Social reviews are indispensable resources for modern consumers' decision making. To influence the reviews for financial gain, some companies may choose to pay groups of fraudsters rather than individuals to demote or promote products and services. This is because consumers are more likely to be misled by a large number of similar reviews produced by a group of fraudsters. Semantic relations such as content similarity (CS) and polarity similarity are important factors characterizing solicited group frauds. Recent approaches to fraudster group detection employed handcrafted features of group behaviors that failed to capture the semantic relation of review text from the reviewers. In this article, we propose the first neural approach, HIN-RNN, a heterogeneous information network (HIN) compatible recurrent neural network (RNN) for fraudster group detection that makes use of semantic similarity and requires no handcrafted features. The HIN-RNN provides a unifying architecture for representation learning of each reviewer, with the initial vector as the sum of word embeddings (SoWEs) of all review text written by the same reviewer, concatenated with the ratio of negative reviews. Given a co-review network representing reviewers who have reviewed the same items with similar ratings and the reviewers' vector representation, a collaboration matrix is captured through the HIN-RNN training. The proposed approach is demonstrated to be effective with marked improvement over state-of-the-art approaches on both the Yelp (22% and 12% in terms of recall and F1-value, respectively) and Amazon (4% and 2% in terms of recall and F1-value, respectively) datasets.
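The initial reviewer vector described in the abstract above (sum of word embeddings over all of a reviewer's text, concatenated with the ratio of negative reviews) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `reviewer_initial_vector`, the dictionary-based embedding lookup, and the choice to skip out-of-vocabulary tokens are all assumptions made for this example.

```python
import numpy as np

def reviewer_initial_vector(reviews, embeddings, dim):
    """Build an initial vector for one reviewer.

    reviews    : list of (tokens, is_negative) pairs, one per review
    embeddings : dict mapping a token to a NumPy vector of length `dim`
    dim        : embedding dimensionality
    """
    sowe = np.zeros(dim)
    for tokens, _ in reviews:
        for tok in tokens:
            # Assumption of this sketch: unknown tokens are simply skipped.
            if tok in embeddings:
                sowe += embeddings[tok]
    # Ratio of negative reviews; defined as 0.0 for a reviewer with no reviews.
    negative_ratio = (
        sum(1 for _, neg in reviews if neg) / len(reviews) if reviews else 0.0
    )
    # Concatenate the summed embedding with the scalar ratio.
    return np.concatenate([sowe, [negative_ratio]])
```

The result has length `dim + 1`: the last component carries the polarity signal, and the rest carries the content signal.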

2.
Article in English | MEDLINE | ID: mdl-36279342

ABSTRACT

Motivated by potential financial gain, companies may hire fraudster groups to write fake reviews to either demote competitors or promote their own businesses. Such groups are considerably more successful in misleading customers, as people are more likely to be influenced by the opinion of a large group. To detect such groups, a common approach is to model fraudster groups as static networks, which overlooks the longitudinal behavior of a reviewer and thus the dynamics of coreview relations among reviewers in a group. Hence, these approaches are incapable of excluding outlier reviewers, which are fraudsters intentionally camouflaging themselves in a group and genuine reviewers who happen to coreview in fraudster groups. To address this issue, we propose "FGDT", a framework for "fraudster group detection through temporal relations." FGDT first capitalizes on the effectiveness of the HIN-recurrent neural network (RNN) in both learning reviewers' representations and capturing the collaboration between reviewers. The HIN-RNN models the coreview relations of reviewers in a group in a fixed time window of 28 days. We refer to this as spatial relation learning representation to signify the generalizability of this work to other networked scenarios. Then, we use an RNN on the spatial relations to predict the spatio-temporal relations of reviewers in the group. In the third step, a graph convolution network (GCN) refines the reviewers' vector representations using these predicted relations. These refined representations are then used to remove outlier reviewers. The average of the remaining reviewers' representation is then fed to a simple fully connected layer to predict if the group is a fraudster group or not. Exhaustive experiments of FGDT showed a 5% (4%), 12% (5%), and 12% (5%) improvement over three of the most recent approaches on precision, recall, and F1-value over the Yelp (Amazon) dataset, respectively.

3.
Comput Med Imaging Graph ; 70: 173-184, 2018 12.
Article in English | MEDLINE | ID: mdl-29691123

ABSTRACT

Achieving high performance in the detection and characterization of architectural distortion in screening mammograms is important for efficient early detection of breast cancer. Viewing a mammogram image as a rough surface that can be described using fractal theory is a well-recognized approach. This paper presents a new fractal-based computer-aided detection (CAD) algorithm for characterizing various breast tissues in screening mammograms with a particular focus on distinguishing between architectural distortion and normal breast parenchyma. The proposed approach is based on two underlying assumptions: (i) monitoring the variation pattern of fractal dimension, with the changes of the image resolution, is a useful tool to distinguish textural patterns of breast tissue, (ii) the bidimensional empirical mode decomposition (BEMD) algorithm appropriately generates a multiresolution representation of the mammogram. The proposed CAD has been tested using different validation datasets of mammographic regions of interest (ROIs) extracted from the Digital Database for Screening Mammography (DDSM) database. The validation ROI datasets contain architectural distortion (AD), normal breast tissue, and AD surrounding tissue. The highest classification performance, in terms of area under the receiver operating characteristic curve, of Az = 0.95 was achieved when the proposed approach was applied to distinguish 187 architectural distortion depicting regions from 2191 normal breast parenchyma regions. The obtained results validate the underlying hypothesis and demonstrate the effectiveness of capturing the variation of the fractal dimension measurements within an appropriate multiscale representation of the digital mammogram. Results also reveal that this tool has the potential of prescreening other key and common mammographic signs of early breast cancer.


Subject(s)
Diagnosis, Computer-Assisted , Image Processing, Computer-Assisted/methods , Mammography , Algorithms , Breast Neoplasms/diagnostic imaging , Female , Humans
4.
Article in English | MEDLINE | ID: mdl-28113406

ABSTRACT

In extreme cold weather, living organisms produce Antifreeze Proteins (AFPs) to counter the otherwise lethal intracellular formation of ice. Structures and sequences of various AFPs exhibit a high degree of heterogeneity; consequently, the prediction of AFPs is considered to be a challenging task. In this research, we propose to handle this arduous manifold learning task using the notion of localized processing. In particular, an AFP sequence is segmented into two sub-segments, each of which is analyzed for amino acid and di-peptide compositions. We propose to use only the most significant features using the concept of information gain (IG) followed by a random forest classification approach. The proposed RAFP-Pred achieved an excellent performance on a number of standard datasets. We report a high Youden's index (sensitivity+specificity-1) value of 0.75 on the standard independent test data set, outperforming the AFP-PseAAC, AFP_PSSM, AFP-Pred, and iAFP by a margin of 0.05, 0.06, 0.14, and 0.68, respectively. The verification rate on the UniProtKB dataset is found to be 83.19 percent, which is substantially superior to the 57.18 percent reported for the iAFP method.
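Two pieces of the abstract above lend themselves to a short sketch: the amino acid composition of two sub-segments, and Youden's index (sensitivity + specificity - 1). The code below is illustrative only. Splitting the sequence at its midpoint is an assumption of this example (the abstract does not say where the split falls), and the di-peptide composition, information-gain selection, and random forest steps are omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def aa_composition(segment):
    """Fraction of each standard amino acid in one segment."""
    counts = Counter(segment)
    n = len(segment)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]

def two_segment_features(sequence):
    """Concatenate the compositions of the two halves of a sequence."""
    mid = len(sequence) // 2  # midpoint split: an assumption of this sketch
    return aa_composition(sequence[:mid]) + aa_composition(sequence[mid:])

def youden_index(tp, fn, tn, fp):
    """Youden's index from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1
```

For example, a classifier with sensitivity 0.90 and specificity 0.85 has a Youden's index of 0.75, the same value the abstract reports (the individual sensitivity and specificity figures in this example are made up for illustration).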


Subject(s)
Antifreeze Proteins/chemistry , Computational Biology/methods , Dipeptides/chemistry , Machine Learning , Sequence Analysis, Protein/methods , Algorithms , Antifreeze Proteins/analysis , Antifreeze Proteins/classification , Databases, Protein , Dipeptides/analysis , ROC Curve
5.
IEEE Trans Neural Netw Learn Syst ; 29(8): 3573-3587, 2018 08.
Article in English | MEDLINE | ID: mdl-28829320

ABSTRACT

Class imbalance is a common problem in the case of real-world object detection and classification tasks. Data of some classes are abundant, making them an overrepresented majority, and data of other classes are scarce, making them an underrepresented minority. This imbalance makes it challenging for a classifier to appropriately learn the discriminating boundaries of the majority and minority classes. In this paper, we propose a cost-sensitive (CoSen) deep neural network, which can automatically learn robust feature representations for both the majority and minority classes. During training, our learning procedure jointly optimizes the class-dependent costs and the neural network parameters. The proposed approach is applicable to both binary and multiclass problems without any modification. Moreover, as opposed to data-level approaches, we do not alter the original data distribution, which results in a lower computational cost during the training process. We report the results of our experiments on six major image classification data sets and show that the proposed approach significantly outperforms the baseline algorithms. Comparisons with popular data sampling techniques and CoSen classifiers demonstrate the superior performance of our proposed method.
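As a rough illustration of the cost-sensitive idea, the sketch below computes a class-weighted cross-entropy in which errors on each class are scaled by a per-class cost. This is a deliberately simplified stand-in, not the method of the paper: the abstract describes learning the class-dependent costs jointly with the network parameters, whereas here the costs are fixed inputs. The function name and the normalization by total weight are assumptions of this example.

```python
import numpy as np

def cost_sensitive_cross_entropy(logits, labels, class_costs):
    """Weighted cross-entropy with fixed per-class costs.

    logits      : array of shape (n_samples, n_classes)
    labels      : integer array of shape (n_samples,)
    class_costs : array of shape (n_classes,), larger for minority classes
    """
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class for each sample.
    per_sample = -log_probs[np.arange(len(labels)), labels]
    # Scale each sample's loss by the cost of its true class.
    weights = class_costs[labels]
    return float((weights * per_sample).sum() / weights.sum())
```

Unlike resampling, this leaves the data distribution untouched and changes only how much each mistake counts, which is the property the abstract highlights for CoSen approaches.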

6.
IEEE Trans Pattern Anal Mach Intell ; 38(3): 431-46, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27046489

ABSTRACT

We present a framework to automatically detect and remove shadows in real world scenes from a single image. Previous works on shadow detection put a lot of effort in designing shadow variant and invariant hand-crafted features. In contrast, our framework automatically learns the most relevant features in a supervised manner using multiple convolutional deep neural networks (ConvNets). The features are learned at the super-pixel level and along the dominant boundaries in the image. The predicted posteriors based on the learned features are fed to a conditional random field model to generate smooth shadow masks. Using the detected shadow masks, we propose a Bayesian formulation to accurately extract shadow matte and subsequently remove shadows. The Bayesian formulation is based on a novel model which accurately models the shadow generation process in the umbra and penumbra regions. The model parameters are efficiently estimated using an iterative optimization procedure. Our proposed framework consistently performed better than the state-of-the-art on all major shadow databases collected under a variety of conditions.

7.
IEEE Trans Image Process ; 25(7): 3372-3383, 2016 Jul.
Article in English | MEDLINE | ID: mdl-28113718

ABSTRACT

Indoor scene recognition is a multi-faceted and challenging problem due to the diverse intra-class variations and the confusing inter-class similarities that characterize such scenes. This paper presents a novel approach that exploits rich mid-level convolutional features to categorize indoor scenes. Traditional convolutional features retain the global spatial structure, which is a desirable property for general object recognition. We, however, argue that the structure-preserving property of the convolutional neural network activations is not of substantial help in the presence of large variations in scene layouts, e.g., in indoor scenes. We propose to transform the structured convolutional activations to another highly discriminative feature space. The representation in the transformed space not only incorporates the discriminative aspects of the target data set but also encodes the features in terms of the general object categories that are present in indoor scenes. To this end, we introduce a new large-scale data set of 1300 object categories that are commonly present in indoor scenes. Our proposed approach achieves a significant performance boost over the previous state-of-the-art approaches on five major scene classification data sets.

8.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 3965-3968, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28269153

ABSTRACT

Aiming at improving the performance of computer-aided detection of architectural distortion (AD) in mammograms, this paper investigates whether textural patterns of AD surrounding tissue (ST) have the potential to detect AD signatures. More specifically, for characterizing the presence of AD, we investigated the application of textural analysis for discriminating between AD surrounding tissue and normal breast parenchyma. We evaluated the underlying hypothesis using a dataset of 2544 regions of interest (ROI) obtained from the Digital Database for Screening Mammography (DDSM). The ROI dataset contained 353 ST regions and 2191 normal parenchyma related regions. The bidimensional empirical mode decomposition (BEMD) algorithm was first applied to extract, from each ROI, the 2D intrinsic mode functions (2DIMF) or detail subbands. Then, statistical signatures of IMF layers were computed and used along with the fractal dimension, estimated from the original ROI, for discriminating ST from the normal breast tissue. The statistical analysis of various textural descriptors demonstrated a significant difference between characteristics of AD surrounding tissue and normal breast parenchyma. The highest AD recognition result of Az = 0.869, obtained from the textural analysis of AD surrounding tissue, is very promising and comparable with Az = 0.913 produced from characterizing AD regions.


Subject(s)
Algorithms , Breast/diagnostic imaging , Image Processing, Computer-Assisted/methods , Mammography/methods , Breast Neoplasms/diagnostic imaging , Databases as Topic , Female , Fractals , Humans , ROC Curve
9.
Article in English | MEDLINE | ID: mdl-26736212

ABSTRACT

Among the different and common mammographic signs of early-stage breast cancer, architectural distortion is the most difficult to identify. In this paper, we propose a new multiscale statistical texture analysis to characterize the presence of architectural distortion by distinguishing between textural patterns of architectural distortion and normal breast parenchyma. The proposed approach first applies the bidimensional empirical mode decomposition algorithm to decompose each mammographic region of interest into a set of adaptive and data-driven two-dimensional intrinsic mode function (IMF) layers that capture details or high-frequency oscillations of the input image. Then, a model-based approach is applied to IMF histograms to acquire the first order statistics. The normalized entropy measure is also computed from each IMF and used as a complementary textural feature for the recognition of architectural distortion patterns. For evaluating the proposed AD characterization approach, we used a mammographic dataset of 187 true positive regions (i.e. depicting architectural distortion) and 887 true negative (normal parenchyma) regions, extracted from the DDSM database. Using the proposed multiscale textural features and the nonlinear support vector machine classifier, the best classification performance achieved, in terms of the area under the receiver operating characteristic curve (or Az value), was 0.88.


Subject(s)
Breast Neoplasms/diagnostic imaging , Mammography , Algorithms , Area Under Curve , Databases, Factual , Entropy , Female , Humans , ROC Curve
10.
IEEE Trans Cybern ; 44(10): 1962-77, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24686310

ABSTRACT

The expectation maximization (EM) algorithm is the standard training algorithm for the hidden Markov model (HMM). However, EM faces a local convergence problem in HMM estimation. This paper attempts to overcome this problem of EM and proposes hybrid metaheuristic approaches to EM for HMM. In our earlier research, a hybrid of a constraint-based evolutionary learning approach to EM (CEL-EM) improved HMM estimation. In this paper, we propose a hybrid simulated annealing stochastic version of EM (SASEM) that combines simulated annealing (SA) with EM. The novelty of our approach is that we develop a mathematical reformulation of HMM estimation by introducing a stochastic step between the EM steps and combine SA with EM to provide better control over the acceptance of stochastic and EM steps for better HMM estimation. We also extend our earlier work and propose a second hybrid, EA-SASEM, which is a combination of an evolutionary algorithm (EA) and the proposed SASEM. The proposed EA-SASEM uses the best constraint-based EA strategies from CEL-EM and the stochastic reformulation of HMM. The complementary properties of EA and SA and the stochastic reformulation of HMM in SASEM provide EA-SASEM with sufficient potential to find a better estimation for HMM. To the best of our knowledge, this type of hybridization and mathematical reformulation have not been explored in the context of EM and HMM training. The proposed approaches have been evaluated through comprehensive experiments to justify their effectiveness in signal modeling using the TIMIT speech corpus. Experimental results show that the proposed approaches obtain higher recognition accuracies than both the EM algorithm and CEL-EM.
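The abstract above says that SA controls the acceptance of stochastic and EM steps but does not give the rule. The sketch below shows the standard Metropolis acceptance criterion from simulated annealing, applied to a log-likelihood being maximized. Whether SASEM uses exactly this rule is an assumption; the function name `sa_accept` and its parameters are hypothetical.

```python
import math
import random

def sa_accept(current_ll, candidate_ll, temperature, rng=random):
    """Metropolis acceptance test for a maximization problem.

    current_ll   : log-likelihood of the current model
    candidate_ll : log-likelihood of the proposed model
    temperature  : positive annealing temperature
    rng          : object with a random() method returning a float in [0, 1)
    """
    # An improvement (or tie) is always accepted.
    if candidate_ll >= current_ll:
        return True
    # A worse candidate is accepted with probability exp(delta / T), delta < 0.
    return rng.random() < math.exp((candidate_ll - current_ll) / temperature)
```

At high temperature worse candidates are accepted often, which lets the search leave a local maximum; as the temperature is lowered the rule approaches plain hill climbing, which is the behavior of EM alone.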

11.
IEEE Trans Pattern Anal Mach Intell ; 32(11): 2106-12, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20603520

ABSTRACT

In this paper, we present a novel approach of face identification by formulating the pattern recognition problem in terms of linear regression. Using a fundamental concept that patterns from a single-object class lie on a linear subspace, we develop a linear model representing a probe image as a linear combination of class-specific galleries. The inverse problem is solved using the least-squares method and the decision is ruled in favor of the class with the minimum reconstruction error. The proposed Linear Regression Classification (LRC) algorithm falls in the category of nearest subspace classification. The algorithm is extensively evaluated on several standard databases under a number of exemplary evaluation protocols reported in the face recognition literature. A comparative study with state-of-the-art algorithms clearly reflects the efficacy of the proposed approach. For the problem of contiguous occlusion, we propose a Modular LRC approach, introducing a novel Distance-based Evidence Fusion (DEF) algorithm. The proposed methodology achieves the best results ever reported for the challenging problem of scarf occlusion.
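The classification rule described in the abstract above (regress the probe onto each class-specific gallery by least squares, then choose the class with the minimum reconstruction error) can be sketched in a few lines. The function name `lrc_predict` and the dictionary layout of the galleries are assumptions of this example; image preprocessing and the modular/DEF extension for occlusion are not shown.

```python
import numpy as np

def lrc_predict(class_galleries, probe):
    """Nearest-subspace classification by per-class least squares.

    class_galleries : dict mapping a label to a matrix of shape (q, p_i)
                      whose columns are vectorized gallery images of that class
    probe           : vector of length q
    """
    best_label, best_error = None, np.inf
    for label, X in class_galleries.items():
        # Least-squares coefficients that best reconstruct the probe
        # from this class's gallery columns.
        beta, *_ = np.linalg.lstsq(X, probe, rcond=None)
        # Distance between the probe and its class-specific reconstruction.
        error = np.linalg.norm(probe - X @ beta)
        if error < best_error:
            best_label, best_error = label, error
    return best_label
```

Each class contributes its own small regression problem, so adding a new class only requires adding its gallery matrix; no joint retraining is involved.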


Subject(s)
Algorithms , Linear Models , Pattern Recognition, Automated/methods , Artificial Intelligence , Biometry/methods , Face , Female , Humans , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Least-Squares Analysis , Male
12.
IEEE Trans Syst Man Cybern B Cybern ; 39(1): 182-97, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19068441

ABSTRACT

This paper attempts to overcome the tendency of the expectation-maximization (EM) algorithm to locate a local rather than global maximum when applied to estimate the hidden Markov model (HMM) parameters in speech signal modeling. We propose a hybrid algorithm for estimation of the HMM in automatic speech recognition (ASR) using a constraint-based evolutionary algorithm (EA) and EM, the CEL-EM. The novelty of our hybrid algorithm (CEL-EM) is that it is applicable for estimation of the constraint-based models with many constraints and large numbers of parameters (which use EM) like HMM. Two constraint-based versions of the CEL-EM with different fusion strategies have been proposed using a constraint-based EA and the EM for better estimation of HMM in ASR. The first one uses a traditional constraint-handling mechanism of EA. The other version transforms a constrained optimization problem into an unconstrained problem using Lagrange multipliers. Fusion strategies for the CEL-EM use a staged-fusion approach where EM has been plugged with the EA periodically after the execution of EA for a specific period of time to maintain the global sampling capabilities of EA in the hybrid algorithm. A variable initialization approach (VIA) has been proposed using a variable segmentation to provide a better initialization for EA in the CEL-EM. Experimental results on the TIMIT speech corpus show that CEL-EM obtains higher recognition accuracies than the traditional EM algorithm as well as a top-standard EM (VIA-EM, constructed by applying the VIA to EM).


Subject(s)
Artificial Intelligence , Markov Chains , Pattern Recognition, Automated/methods , Speech Recognition Software , Algorithms , Humans , Models, Statistical , Normal Distribution , Reproducibility of Results