1.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3290-3304, 2024 May.
Article in English | MEDLINE | ID: mdl-38190688

ABSTRACT

This study proposes a set of generic rules for revising existing neural networks for 3D point cloud processing into rotation-equivariant quaternion neural networks (REQNNs), so that the feature representations of the networks become rotation-equivariant and permutation-invariant. Rotation equivariance of features means that the feature computed on a rotated input point cloud is the same as the result of applying the same rotation to the feature computed on the original input point cloud. We find that rotation equivariance of features is naturally satisfied if a neural network uses quaternion features. Interestingly, we prove that such a network revision also makes the gradients of features in the REQNN rotation-equivariant w.r.t. inputs, and the training of the REQNN rotation-invariant w.r.t. inputs. Permutation invariance, in turn, means that the intermediate-layer features remain unchanged when the input points are reordered. We also evaluate the stability of knowledge representations of REQNNs, and the robustness of REQNNs to adversarial rotation attacks. Experiments have shown that REQNNs outperform traditional neural networks in terms of both classification accuracy and robustness on rotated testing samples.
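
The equivariance property described in this abstract can be checked numerically on a toy example. The sketch below is not the authors' REQNN: it assumes 3D points are stored as pure quaternions (0, x, y, z), that a rotation acts as r * q * conj(r) for a unit quaternion r, and that a "layer" is just a sum of inputs with real-valued weights. All function names are hypothetical.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def qconj(q):
    """Quaternion conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(q, r):
    """Rotate quaternion q by unit quaternion r: r * q * conj(r)."""
    return qmul(qmul(r, q), qconj(r))

def axis_angle(axis, theta):
    """Unit quaternion for a rotation by theta radians about axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(theta / 2.0) / norm
    return (math.cos(theta / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def real_weighted_sum(points, weights):
    """Toy 'layer' (an assumption): real-weighted sum of quaternion inputs."""
    return tuple(sum(w * p[k] for p, w in zip(points, weights))
                 for k in range(4))

if __name__ == "__main__":
    pts = [(0.0, 1.0, 2.0, 3.0), (0.0, -1.0, 0.5, 2.0)]
    weights = [0.3, -1.2]
    r = axis_angle((1.0, 1.0, 0.0), 0.7)
    rotated_then_layer = real_weighted_sum([rotate(p, r) for p in pts], weights)
    layer_then_rotated = rotate(real_weighted_sum(pts, weights), r)
    # The two results should agree up to floating-point error.
    print(max(abs(a - b) for a, b in zip(rotated_then_layer, layer_then_rotated)))
```

Because the weights are real scalars, they commute with the quaternion rotation, which is why rotating before or after the toy layer gives the same feature. The sum over points also does not depend on the order of the points.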

2.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 4625-4640, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38271170

ABSTRACT

Various attribution methods have been developed to explain deep neural networks (DNNs) by inferring the attribution/importance/contribution score of each input variable to the final output. However, existing attribution methods are often built upon different heuristics. There remains a lack of a unified theoretical understanding of why these methods are effective and how they are related. Furthermore, there is still no universally accepted criterion for judging whether one attribution method is preferable to another. In this paper, we resort to Taylor interactions and, for the first time, discover that fourteen existing attribution methods, which define attributions based on entirely different heuristics, actually share the same core mechanism. Specifically, we prove that the attribution scores of input variables estimated by the fourteen attribution methods can all be mathematically reformulated as a weighted allocation of two typical types of effects, i.e., independent effects of each input variable and interaction effects between input variables. The essential difference among these attribution methods lies in the weights used to allocate the different effects. Inspired by these insights, we propose three principles for fairly allocating the effects, which serve as new criteria to evaluate the faithfulness of attribution methods. In summary, this study offers a new unified perspective from which to revisit fourteen attribution methods, and it theoretically clarifies the essential similarities and differences among them. In addition, the proposed principles enable a direct and fair comparison among different methods under the unified perspective.
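
The idea of attributions as a weighted allocation of independent and interaction effects can be illustrated on a two-variable toy model. The sketch below is an assumption-laden illustration, not the paper's Taylor-interaction formulation: the model, the zero baseline, and the choice of two well-known attribution rules (the Shapley value and leave-one-out occlusion) are all picked here for demonstration only.

```python
from itertools import permutations

def model(x1, x2):
    """Toy model: independent effect 2*x1 plus interaction effect 3*x1*x2."""
    return 2.0 * x1 + 3.0 * x1 * x2

def value(subset, x, baseline=(0.0, 0.0)):
    """Model output when only variables in `subset` keep their real values."""
    masked = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(*masked)

def shapley(x):
    """Exact Shapley values by averaging marginal gains over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        present = set()
        for i in order:
            before = value(present, x)
            present.add(i)
            phi[i] += (value(present, x) - before) / len(orders)
    return phi

def occlusion(x):
    """Leave-one-out attribution: output drop when one variable is masked."""
    full = set(range(len(x)))
    return [value(full, x) - value(full - {i}, x) for i in range(len(x))]

if __name__ == "__main__":
    x = (1.0, 2.0)  # independent effect of x1 is 2, interaction effect is 6
    print("shapley  ", shapley(x))
    print("occlusion", occlusion(x))
```

For x = (1, 2) the independent effect of x1 is 2 and the interaction effect is 6. The Shapley value splits the interaction equally between the two variables, while leave-one-out occlusion credits the whole interaction to each variable, so the two rules differ only in the weights they use for the same underlying effects.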

3.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 5099-5113, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35994541

ABSTRACT

Compared to traditional learning from scratch, knowledge distillation sometimes makes the DNN achieve superior performance. In this paper, we provide a new perspective to explain the success of knowledge distillation based on information theory, i.e., quantifying knowledge points encoded in intermediate layers of a DNN for classification. To this end, we consider the signal processing in a DNN as a layer-wise process of discarding information. A knowledge point refers to an input unit whose information is discarded much less than that of other input units. Thus, we propose three hypotheses for knowledge distillation based on the quantification of knowledge points. 1. The DNN learning from knowledge distillation encodes more knowledge points than the DNN learning from scratch. 2. Knowledge distillation makes the DNN more likely to learn different knowledge points simultaneously. In comparison, the DNN learning from scratch tends to encode various knowledge points sequentially. 3. The DNN learning from knowledge distillation is often more stably optimized than the DNN learning from scratch. To verify the above hypotheses, we design three types of metrics with annotations of foreground objects to analyze feature representations of the DNN, i.e., the quantity and the quality of knowledge points, the learning speed of different knowledge points, and the stability of optimization directions. In experiments, we diagnosed various DNNs on different classification tasks, including image classification, 3D point cloud classification, binary sentiment classification, and question answering, which verified the above hypotheses.
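
The notion of a knowledge point can be made concrete with a small sketch. The abstract does not give the selection rule, so everything below is an assumption: each input unit is assumed to come with a precomputed score of how much information is discarded, a unit counts as a knowledge point when its score is below the mean by more than a hypothetical `margin`, and "quality" is taken as the fraction of knowledge points that fall on annotated foreground units.

```python
def knowledge_points(discarded, margin):
    """Indices of units whose discarded information is well below the mean.

    `discarded` holds one hypothetical score per input unit; `margin` is a
    hypothetical threshold on the gap to the mean score.
    """
    mean = sum(discarded) / len(discarded)
    return [i for i, h in enumerate(discarded) if mean - h > margin]

def foreground_ratio(points, foreground):
    """Fraction of knowledge points lying on annotated foreground units."""
    if not points:
        return 0.0
    return sum(1 for i in points if i in foreground) / len(points)

if __name__ == "__main__":
    scores = [0.9, 0.2, 0.8, 0.1, 0.85]  # made-up per-unit scores
    points = knowledge_points(scores, margin=0.3)
    print("quantity:", len(points))
    print("quality :", foreground_ratio(points, foreground={1, 2}))
```

Under these assumptions, the number of selected units plays the role of the quantity of knowledge points and the foreground ratio plays the role of their quality.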

4.
PLoS Comput Biol ; 18(10): e1010594, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36215325

ABSTRACT

Advanced volumetric imaging methods and genetically encoded activity indicators have permitted a comprehensive characterization of whole brain activity at single neuron resolution in Caenorhabditis elegans. The constant motion and deformation of the nematode nervous system, however, pose a great challenge for consistent identification of densely packed neurons in a behaving animal. Here, we propose a cascade solution for long-term and rapid recognition of head ganglion neurons in freely moving C. elegans. First, potential neuronal regions from a stack of fluorescence images are detected by a deep learning algorithm. Second, 2-dimensional neuronal regions are fused into 3-dimensional neuron entities. Third, by exploiting the neuronal density distribution surrounding a neuron and relative positional information between neurons, a multi-class artificial neural network transforms engineered neuronal feature vectors into digital neuronal identities. With a small number of training samples, our bottom-up approach is able to process each volume (1024 × 1024 × 18 voxels) in less than 1 second and achieves an accuracy of 91% in neuronal detection and above 80% in neuronal tracking over a long video recording. Our work represents a step towards rapid and fully automated algorithms for decoding whole brain activity underlying naturalistic behaviors.
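
The second stage of the cascade, fusing 2-dimensional regions into 3-dimensional entities, can be sketched as follows. The abstract does not state the fusion rule, so this is only one plausible approach under stated assumptions: each detected region is reduced to a (z, x, y) centroid, and a region is linked to an entity when that entity's last region lies in the previous slice within a hypothetical distance `max_dist`.

```python
import math

def fuse_regions(detections, max_dist=2.0):
    """Greedily link 2D detections in adjacent z-slices into 3D entities.

    detections: list of (z, x, y) centroids, with integer slice index z.
    Returns a list of entities, each a list of indices into `detections`.
    The linking rule and `max_dist` are assumptions made for illustration.
    """
    order = sorted(range(len(detections)), key=lambda i: detections[i][0])
    entities = []
    for i in order:
        z, x, y = detections[i]
        best = None  # (distance, entity) of the closest linkable entity
        for entity in entities:
            last_z, last_x, last_y = detections[entity[-1]]
            if z - last_z != 1:
                continue  # only link to an entity ending in the previous slice
            dist = math.hypot(x - last_x, y - last_y)
            if dist <= max_dist and (best is None or dist < best[0]):
                best = (dist, entity)
        if best is not None:
            best[1].append(i)
        else:
            entities.append([i])
    return entities

if __name__ == "__main__":
    regions = [(0, 10.0, 10.0), (1, 10.5, 10.0), (2, 11.0, 10.2),
               (0, 30.0, 30.0), (1, 30.0, 31.0), (5, 10.0, 10.0)]
    for entity in fuse_regions(regions):
        print([regions[i] for i in entity])
```

A greedy nearest-centroid link is simple and fast, which fits the speed emphasis of the abstract, but it ignores region shape and overlap; a production pipeline would likely need a more robust matching criterion.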


Subject(s)
Brain, Caenorhabditis elegans, Animals, Caenorhabditis elegans/physiology, Brain/physiology, Neurons/physiology

5.
IEEE Trans Pattern Anal Mach Intell ; 43(11): 3949-3963, 2021 Nov.
Article in English | MEDLINE | ID: mdl-32396071

ABSTRACT

In this paper, we present a method to mine object-part patterns from conv-layers of a pre-trained convolutional neural network (CNN). The mined object-part patterns are organized by an And-Or graph (AOG). This interpretable AOG representation consists of a four-layer semantic hierarchy, i.e., semantic parts, part templates, latent patterns, and neural units. The AOG associates each object part with certain neural units in feature maps of conv-layers. The AOG is constructed with very few annotations (e.g., 3-20) of object parts. We develop a question-answering (QA) method that uses active human-computer communications to mine patterns from a pre-trained CNN, in order to explain features in conv-layers incrementally. During the learning process, our QA method uses the current AOG for part localization. The QA method actively identifies objects whose feature maps cannot be explained by the AOG. Then, our method asks people to annotate parts on the unexplained objects, and uses the answers to discover CNN patterns corresponding to the newly labeled parts. In this way, our method gradually grows new branches and refines existing branches on the AOG to semanticize CNN representations. In experiments, our method exhibited a high learning efficiency. Our method used about 1/6 to 1/3 of the part annotations for training, but achieved similar or better part-localization performance than fast-RCNN methods.

6.
IEEE Trans Pattern Anal Mach Intell ; 43(10): 3416-3431, 2021 Oct.
Article in English | MEDLINE | ID: mdl-32224452

ABSTRACT

This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part. Our method does not require additional annotations of object parts or textures for supervision. Instead, we use the same training data as traditional CNNs. During the learning process, our method automatically assigns each interpretable filter in a high conv-layer to an object part of a certain category. Such explicit knowledge representations in conv-layers of the CNN help people clarify the logic encoded in the CNN, i.e., they answer what patterns the CNN extracts from an input image and uses for prediction. We have tested our method using different benchmark CNNs with various architectures to demonstrate the broad applicability of our method. Experiments have shown that our interpretable filters are much more semantically meaningful than traditional filters.

7.
IEEE Trans Pattern Anal Mach Intell ; 43(11): 3863-3877, 2021 Nov.
Article in English | MEDLINE | ID: mdl-32386138

ABSTRACT

This paper introduces an explanatory graph representation to reveal object parts encoded inside convolutional layers of a CNN. Given a pre-trained CNN, each filter in a conv-layer usually represents a mixture of object parts. We develop a simple yet effective method to learn an explanatory graph, which automatically disentangles object parts from each filter without any part annotations. Specifically, given the feature map of a filter, we mine neural activations from the feature map that correspond to different object parts. The explanatory graph is constructed to organize each mined part as a graph node. Each edge connects two nodes whose corresponding object parts usually co-activate and keep a stable spatial relationship. Experiments show that each graph node consistently represented the same object part through different images, which boosted the transferability of CNN features. The explanatory graph transferred features of object parts to the task of part localization, and our method significantly outperformed other approaches.

8.
IEEE Trans Pattern Anal Mach Intell ; 38(3): 532-45, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27046496

ABSTRACT

We categorize this research in terms of its contribution to both graph theory and computer vision. From the theoretical perspective, this study can be considered as the first attempt to formulate the idea of mining maximal frequent subgraphs in the challenging domain of messy visual data, and as a conceptual extension to the unsupervised learning of graph matching. We define a soft attributed pattern (SAP) to represent the common subgraph pattern among a set of attributed relational graphs (ARGs), considering both their structure and attributes. Regarding the differences between ARGs with fuzzy attributes and conventional labeled graphs, we propose a new mining strategy that directly extracts the SAP with the maximal graph size without applying node enumeration. Given an initial graph template and a number of ARGs, we develop an unsupervised method to modify the graph template into the maximal-size SAP. From a practical perspective, this research develops a general platform for learning the category model (i.e., the SAP) from cluttered visual data (i.e., the ARGs) without labeling "what is where," thereby opening the possibility for a series of applications in the era of big visual data. Experiments demonstrate the superior performance of the proposed method on RGB/RGB-D images and videos.

9.
Int J Clin Exp Med ; 8(5): 6829-34, 2015.
Article in English | MEDLINE | ID: mdl-26221221

ABSTRACT

AIM: To study the role of protoporphyrin IX (pPIX) in mitochondrial metabolism of hydrogen peroxide (H2O2). METHODS: O2 (-) specific fluorescent markers DMA (9,10-dimethylanthracene) and SOSG (Singlet Oxygen Sensor Green reagent) were used for measurement of singlet oxygen ((1)O2). The pPIX-catalyzed conversion of H2O2 into (1)O2 was monitored in vitro under varied H2O2 content, temperature, and pH value in the reaction. An ex vivo mitochondrial model was used to analyze the effects of ferrochelatase (FECH) and high-energy X-rays on this catalytic reaction. RESULTS: In complete darkness, measurable (1)O2 was generated when 1.5 mM of H2O2 was incubated with 24 µM of pPIX at 37°C for 3 hours. Mitochondrial yield of H2O2 was 0.11±0.03 nmole/mg/min. Mitochondrial FECH significantly improved the catalytic ability of pPIX to convert H2O2 into (1)O2. In the presence of high-energy X-rays, incubation of 14.4 µM of pPIX with 0.54 µM of H2O2 also generated (1)O2, during which the fluorescence density of 1.05 µM of DMA decreased by 41.5% (P < 0.05). This conversion was not observed when pPIX was replaced with the structurally similar hematoporphyrin. CONCLUSION: pPIX can catalyze conversion of H2O2 into (1)O2.
