Results 1 - 5 of 5
1.
Nat Commun ; 14(1): 5952, 2023 Sep 23.
Article in English | MEDLINE | ID: mdl-37741834

ABSTRACT

Emerging data-intensive computation has driven the advanced packaging and vertical stacking of integrated circuits, for minimized latency and energy consumption. Yet a monolithic three-dimensional (3D) integrated structure with interleaved logic and high-density memory layers has been difficult to achieve due to challenges in managing the thermal budget. Here we experimentally demonstrate a monolithic 3D integration of atomically thin molybdenum disulfide (MoS2) transistors and 3D vertical resistive random-access memories (VRRAMs), with the MoS2 transistors stacked between the bottom-plane and top-plane VRRAMs. The whole fabrication process is integration-friendly (below 300 °C), and the measurement results confirm that the top-plane fabrication does not affect the bottom-plane devices. The MoS2 transistor can drive each layer of VRRAM into four resistance states. Circuit-level modeling of the monolithic 3D structure demonstrates smaller area, faster data transfer, and lower energy consumption than a planar memory. Such a platform holds great potential for energy-efficient 3D on-chip memory systems.
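The abstract notes that the MoS2 transistor drives each VRRAM layer into four resistance states, which amounts to two bits per cell. A minimal sketch of such multi-level encoding follows; the resistance values, thresholds, and noise margin are entirely hypothetical illustrations, not figures from the paper.

```python
# Illustrative multi-level RRAM cell model: four resistance states encode
# a 2-bit value. All resistance values (ohms) are hypothetical placeholders.
import bisect

# Hypothetical nominal resistances for the four states, low to high.
LEVELS = [1e3, 1e4, 1e5, 1e6]
# Decision boundaries at the midpoints between adjacent nominal levels.
BOUNDS = [(a + b) / 2 for a, b in zip(LEVELS, LEVELS[1:])]

def encode(bits: int) -> float:
    """Program a 2-bit value (0-3) as a nominal resistance."""
    assert 0 <= bits <= 3
    return LEVELS[bits]

def decode(resistance: float) -> int:
    """Read back the 2-bit value from a (possibly noisy) resistance."""
    return bisect.bisect_right(BOUNDS, resistance)

# A cell drifting 20% off its nominal resistance still decodes correctly.
assert decode(encode(2) * 1.2) == 2
assert [decode(encode(b)) for b in range(4)] == [0, 1, 2, 3]
```

The wide spacing between levels is what gives multi-level cells their read margin; real devices would use log-spaced thresholds and account for drift and read noise.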

2.
Article in English | MEDLINE | ID: mdl-36063528

ABSTRACT

Deep neural network (DNN) model compression is a popular and important optimization method for efficient and fast hardware acceleration. However, the compressed model is usually fixed, without the capability to tune the computing complexity (i.e., latency in hardware) on the fly depending on dynamic latency requirements, workloads, and computing hardware resource allocation. To address this challenge, dynamic DNNs with run-time adaptation of computing structures have been constructed through training with a cross-entropy objective function consisting of multiple subnets sampled from the supernet. Our investigations in this work show that the performance of dynamic inference relies heavily on the quality of subnet sampling. To construct a dynamic DNN with multiple high-quality subnets, we propose a progressive subnetwork searching framework embedded with several proposed new techniques, including trainable noise ranking, channel-group sampling, selective fine-tuning, and subnet filtering. Our proposed framework empowers the target dynamic DNN with higher accuracy for all subnets compared with prior works, on both the CIFAR-10 (Canadian Institute for Advanced Research, 10 classes) and ImageNet datasets. Specifically, compared with US-NN, our method achieves an average accuracy gain of 0.9% for AlexNet, 2.5% for ResNet18, 1.1% for VGG11 (Visual Geometry Group), and 0.58% for MobileNetv1 on the ImageNet dataset. Moreover, to demonstrate run-time tuning of the computing latency of a dynamic DNN in a real computing system, we have deployed our constructed dynamic networks on an Nvidia Titan graphics processing unit (GPU) and an Intel Xeon central processing unit (CPU), showing great improvement over prior works. The code is available at https://github.com/ASU-ESIC-FAN-Lab/Dynamic-inference.
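The channel-group sampling idea mentioned above can be sketched as follows. This is an assumption-laden illustration, not the paper's code: channels are partitioned into fixed-size groups, and each sampled subnet keeps a whole number of groups per layer, so every subnet has a hardware-friendly regular width.

```python
# Sketch of channel-group subnet sampling from a supernet (illustrative
# only): subnet widths are always multiples of the group size.
import random

def sample_subnet(layer_channels, group_size, rng):
    """Return per-layer channel counts for one randomly sampled subnet.

    layer_channels: full (supernet) channel count per layer.
    group_size: channels per group; sampled widths are multiples of this.
    """
    widths = []
    for full in layer_channels:
        n_groups = full // group_size
        kept = rng.randint(1, n_groups)   # keep at least one group
        widths.append(kept * group_size)
    return widths

rng = random.Random(0)
supernet = [64, 128, 256]
subnet = sample_subnet(supernet, group_size=16, rng=rng)
# Every width is a multiple of the group size and within the supernet width.
assert all(w % 16 == 0 and 16 <= w <= f for w, f in zip(subnet, supernet))
```

Sampling whole groups rather than arbitrary channels is what keeps the resulting subnets efficient on real hardware, since memory accesses stay contiguous.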

3.
IEEE Trans Neural Netw Learn Syst ; 33(9): 4930-4944, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33735086

ABSTRACT

Large deep neural network (DNN) models pose a key challenge to energy efficiency, because off-chip DRAM accesses consume significantly more energy than arithmetic or SRAM operations. This motivates intensive research on model compression, with two main approaches. Weight pruning leverages the redundancy in the number of weights; it can be performed in a non-structured manner, which offers higher flexibility and pruning rates but incurs index accesses due to irregular weight locations, or in a structured manner, which preserves the full matrix structure at a lower pruning rate. Weight quantization leverages the redundancy in the number of bits in weights. Compared to pruning, quantization is much more hardware-friendly and has become a "must-do" step for FPGA and ASIC implementations. Thus, any evaluation of the effectiveness of pruning should be on top of quantization. The key open question is: with quantization, what kind of pruning (non-structured versus structured) is most beneficial? This question is fundamental because the answer determines the design aspects we should focus on to avoid diminishing returns from certain optimizations. This article provides a definitive answer to the question for the first time. First, we build ADMM-NN-S by extending and enhancing ADMM-NN, a recently proposed joint weight pruning and quantization framework, with algorithmic support for structured pruning, dynamic ADMM regulation, and masked mapping and retraining. Second, we develop a methodology for a fair and fundamental comparison of non-structured and structured pruning in terms of both storage and computation efficiency. Our results show that ADMM-NN-S consistently outperforms the prior art: 1) it achieves 348×, 36×, and 8× overall weight pruning on LeNet-5, AlexNet, and ResNet-50, respectively, with (almost) zero accuracy loss, and 2) we demonstrate, for the first time, that fully binarized (all-layer) DNNs can be lossless in accuracy in many cases. These results provide a strong baseline for our study and lend it credibility. Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of either storage or computation efficiency. Thus, we conclude that structured pruning has greater potential than non-structured pruning. We encourage the community to focus on studying DNN inference acceleration with structured sparsity.
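The storage-efficiency argument against non-structured pruning can be made concrete with a back-of-envelope sketch. This is an illustration of the general trade-off, not ADMM-NN-S itself, and the bit widths are hypothetical.

```python
# Why non-structured pruning pays an index-storage tax that structured
# pruning avoids: irregularly surviving weights each need a stored
# position, while row-pruned matrices stay dense in the kept rows.
# Bit widths below are hypothetical illustrations.

def nonstructured_bits(rows, cols, keep_ratio, weight_bits=8, index_bits=16):
    """Storage for irregular sparsity: each kept weight carries an index."""
    kept = int(rows * cols * keep_ratio)
    return kept * (weight_bits + index_bits)

def structured_bits(rows, cols, keep_ratio, weight_bits=8):
    """Storage for row pruning: kept rows are dense, no per-weight index."""
    kept_rows = int(rows * keep_ratio)
    return kept_rows * cols * weight_bits

# Same 4x pruning rate on a 64x64 layer:
ns = nonstructured_bits(64, 64, 0.25)   # 1024 weights * 24 bits = 24576
st = structured_bits(64, 64, 0.25)      # 16 rows * 64 * 8 bits  = 8192
assert ns > st
```

With aggressive quantization the weight payload shrinks but the index overhead does not, which is one intuition for why the comparison must be made "on top of quantization".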

4.
Article in English | MEDLINE | ID: mdl-34529561

ABSTRACT

Traditional Deep Neural Network (DNN) security research has mostly focused on the well-known adversarial input example attack. Recently, another dimension of adversarial attack, namely attack on DNN weight parameters, has been shown to be very powerful. As a representative example, the Bit-Flip based adversarial weight Attack (BFA) injects an extremely small number of faults into weight parameters to hijack the executing DNN function. Prior works on BFA focus on un-targeted attacks that can hack all inputs into a random output class by flipping a very small number of weight bits stored in computer memory. This paper proposes the first targeted BFA-based (T-BFA) adversarial weight attack on DNNs, which can intentionally mislead selected inputs to a target output class. The objective is achieved by identifying the weight bits that are highly associated with the classification of a targeted output through a class-dependent weight-bit searching algorithm. Our proposed T-BFA's performance is successfully demonstrated on multiple DNN architectures for image classification tasks. For example, by merely flipping 27 out of 88 million weight bits of ResNet-18, our T-BFA can misclassify all images from the Hen class into the Goose class (i.e., a 100% attack success rate) on the ImageNet dataset, while maintaining 59.35% validation accuracy.
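The fault model behind BFA/T-BFA can be illustrated with a toy sketch. This is an assumption of the common setting (weights stored as 8-bit two's-complement integers), not the paper's search algorithm: a single flipped memory bit can swing a weight across most of its range.

```python
# Toy bit-flip fault model for an 8-bit two's-complement weight.

def flip_bit(weight_int8: int, bit: int) -> int:
    """Flip one bit of an 8-bit two's-complement weight and return it."""
    assert -128 <= weight_int8 <= 127 and 0 <= bit <= 7
    raw = weight_int8 & 0xFF                  # reinterpret as unsigned byte
    raw ^= 1 << bit                           # the injected fault
    return raw - 256 if raw >= 128 else raw   # back to signed

# Flipping the sign bit of a small positive weight makes it strongly
# negative: 3 -> -125.
assert flip_bit(3, 7) == -125
# Flipping a low-order bit barely changes the value: 3 -> 2.
assert flip_bit(3, 0) == 2
```

This asymmetry is why bit-searching attacks concentrate on high-order bits of influential weights: a handful of such flips, as in the 27-bit ResNet-18 example above, is enough to redirect a whole class.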

5.
Bioinformatics ; 36(6): 1814-1822, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31688914

ABSTRACT

MOTIVATION: Detecting cancer gene expression and transcriptome changes with mRNA-sequencing or array-based data is important for understanding the molecular mechanisms underlying carcinogenesis and cellular events during cancer progression. Previous studies detected differentially expressed genes across patients within a single cancer type, ignoring the role of mRNA expression changes in driving tumorigenic mechanisms that are either universal or specific to different tumor types. To address this problem, we introduce two network-based multi-task learning frameworks, NetML and NetSML, to discover common differentially expressed genes shared across different cancer types as well as differentially expressed genes specific to each cancer type. The proposed frameworks consider the common latent gene co-expression modules and gene-sample biclusters underlying the multiple cancer datasets to learn knowledge across different tumor types. RESULTS: Large-scale experiments on simulated and real cancer high-throughput datasets validate that the proposed network-based multi-task learning frameworks achieve better sample classification than models without knowledge sharing across different cancer types. The common and cancer-specific molecular signatures detected by the multi-task learning frameworks on The Cancer Genome Atlas ovarian, breast, and prostate cancer datasets are correlated with known marker genes and enriched in cancer-relevant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology terms. AVAILABILITY AND IMPLEMENTATION: Source code is available at: https://github.com/compbiolabucf/NetML. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
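The kind of output the frameworks aim for can be sketched with a small illustration. The gene names and set logic below are hypothetical examples, not NetML's optimization: given per-cancer lists of detected differentially expressed genes (DEGs), split them into signatures shared by all cancer types and signatures unique to one type.

```python
# Illustrative split of detected DEGs into common vs. cancer-specific
# signatures. Gene sets are hypothetical examples.

def split_degs(degs_per_cancer):
    """degs_per_cancer: dict mapping cancer type -> set of DEG names."""
    all_sets = list(degs_per_cancer.values())
    common = set.intersection(*all_sets)   # shared by every cancer type
    specific = {                           # found in exactly this type
        cancer: genes - set().union(*(s for c, s in degs_per_cancer.items()
                                      if c != cancer))
        for cancer, genes in degs_per_cancer.items()
    }
    return common, specific

degs = {
    "ovarian":  {"TP53", "BRCA1", "MUC16"},
    "breast":   {"TP53", "BRCA1", "ERBB2"},
    "prostate": {"TP53", "AR"},
}
common, specific = split_degs(degs)
assert common == {"TP53"}
assert specific["prostate"] == {"AR"}
```

The value of the multi-task formulation is that the shared signal (here, the common set) is estimated jointly across datasets rather than rediscovered independently in each one.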


Subjects
Software, Transcriptome, Biomarkers, Gene Regulatory Networks, Genome, Humans