1.
Vavilovskii Zhurnal Genet Selektsii ; 28(4): 443-455, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39040972

ABSTRACT

Analysis of hyperspectral images is of great interest in plant studies. It is used more and more widely, so the development of hyperspectral image processing methods is a pressing task. This paper presents a hyperspectral image processing pipeline that includes preprocessing, basic statistical analysis, visualization of a multichannel hyperspectral image, and solving classification and clustering problems using machine learning methods. The current version of the package implements the following methods: construction of a confidence interval of an arbitrary level for the difference of sample means; testing the similarity of intensity distributions of spectral lines for two sets of hyperspectral images using the Mann-Whitney U test and Pearson's chi-squared goodness-of-fit test; visualization in two-dimensional space using the dimensionality reduction methods PCA, ISOMAP and UMAP; classification using linear or ridge regression, random forest and CatBoost; clustering of samples using the EM algorithm. The software pipeline is implemented in Python using the Pandas, NumPy, OpenCV, SciPy, Sklearn, Umap, CatBoost and Plotly libraries. The source code is available at: https://github.com/igor2704/Hyperspectral_images. The pipeline was applied to identify melanin pigment in the shell of barley grains based on hyperspectral data. Visualization based on the PCA, UMAP and ISOMAP methods, as well as the use of clustering algorithms, showed that a linear separation of grain samples with and without pigmentation could be performed with high accuracy based on hyperspectral data. The analysis revealed statistically significant differences in the distribution of median intensities for samples of images of grains with and without pigmentation. Thus, it was demonstrated that hyperspectral images can be used to determine the presence or absence of melanin in barley grains with high accuracy. The flexible and convenient tool created in this work will significantly increase the efficiency of hyperspectral image analysis.
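One of the methods listed in the abstract, a confidence interval of an arbitrary level for the difference of sample means, can be sketched in plain Python. This is a minimal illustration and not the package's implementation: it assumes a large-sample normal approximation with unpooled variances, the function name `diff_of_means_ci` is hypothetical, and the intensity values are made-up toy numbers.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_of_means_ci(a, b, level=0.95):
    """Normal-approximation confidence interval for mean(a) - mean(b).

    Assumption: samples are independent and large enough for the normal
    approximation; variances are not pooled.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference of two independent sample means.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


# Hypothetical median intensities for two groups of grain images (toy data).
pigmented = [0.62, 0.58, 0.65, 0.60, 0.63]
unpigmented = [0.41, 0.45, 0.39, 0.44, 0.42]

low, high = diff_of_means_ci(pigmented, unpigmented, level=0.95)
print(f"95% CI for the difference of means: [{low:.3f}, {high:.3f}]")
```

An interval that excludes zero, as with these toy values, is consistent with a difference between the two groups. For small samples a Student's t critical value would be the more usual choice.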

2.
Mol Biol (Mosk) ; 57(2): 155-165, 2023.
Article in Russian | MEDLINE | ID: mdl-37000645

ABSTRACT

Nonribosomal peptides play an important role in the vital activity of bacteria and have an extremely broad range of biological activity. In particular, they act as antibiotics, toxins, surfactants, and siderophores, and also perform a number of other specific functions. Biosynthesis of these molecules does not occur on ribosomes but is carried out by special enzymes whose genes form clusters in bacterial genomes. We hypothesized that the presence of nonribosomal peptide synthesis pathways is a specific feature of bacterial metabolism, which may affect other vital processes of the cell, including translation. This work was the first to show the relationship between the translation regulation mechanism of protein-coding genes in bacteria, which is largely determined by the efficiency of translation elongation, and the presence in the genomes of gene clusters for the biosynthesis of nonribosomal peptides. Bioinformatic analysis of the translation elongation efficiency of protein-coding genes was performed in 11679 bacterial genomes, some of which contained gene clusters of nonribosomal peptide biosynthesis and some of which did not. The analysis showed that bacteria whose genomes contained clusters of nonribosomal peptide biosynthetic genes and those without such gene clusters differ significantly in the molecular mechanisms that ensure translation efficiency. Thus, among microorganisms whose genomes contain gene clusters of nonribosomal peptide synthetases, a significantly smaller proportion is characterized by optimized regulation of the number of local inverted repeats, while most of them have genomes optimized by the averaged energy of inverted-repeat hairpins in mRNA and additionally by codon composition. Our results suggest that the presence of nonribosomal peptide biosynthetic pathways in bacteria may influence the structure of the overall bacterial metabolism, which is also expressed in the specific mechanisms of ribosomal protein biosynthesis.


Subject(s)
Bacteria , Peptides , Bacteria/genetics , Peptides/chemistry , Computational Biology , Genome, Bacterial , Multigene Family
3.
Vavilovskii Zhurnal Genet Selektsii ; 27(7): 737-745, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38213704

ABSTRACT

The development of next-generation sequencing technologies has provided new opportunities for genotyping various organisms, including plants. Genotyping by sequencing (GBS) is used to identify genetic variability more rapidly, and is more cost-effective than whole-genome sequencing. GBS has demonstrated its reliability and flexibility for a number of plant species and populations. It has been applied to genetic mapping, molecular marker discovery, genomic selection, genetic diversity studies, variety identification, conservation biology and evolutionary studies. However, the reduction in sequencing time and cost has led to the need to develop efficient bioinformatics analyses for an ever-expanding amount of sequenced data. Bioinformatics pipelines for GBS data analysis serve this purpose. Due to the similarity of data processing steps, existing pipelines are mainly characterised by a combination of software packages specifically selected either to process data for certain organisms or to process data from any organism. However, despite the use of efficient software packages, these pipelines have some disadvantages. For example, there is a lack of process automation (in some pipelines, each step must be started manually), which significantly reduces the performance of the analysis. In the majority of pipelines, there is no possibility of automatic installation of all necessary software packages; for most of them, it is also impossible to switch off unnecessary or completed steps. In the present work, we have developed the GBS-DP bioinformatics pipeline for GBS data analysis. The pipeline can be applied to various species. The pipeline is implemented using the Snakemake workflow engine. This implementation makes it possible to fully automate both the computation and the installation of the necessary software packages. Our pipeline is able to analyze large datasets (more than 400 samples).

4.
Vavilovskii Zhurnal Genet Selektsii ; 27(7): 859-868, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38500740

ABSTRACT

The pigment composition of plant seed coat affects important properties such as resistance to pathogens, pre-harvest sprouting, and mechanical hardness. The dark color of barley (Hordeum vulgare L.) grain can be attributed to the synthesis and accumulation of two groups of pigments. Blue and purple grain color is associated with the biosynthesis of anthocyanins. Gray and black grain color is caused by melanin. These pigments may accumulate in the grain shells both individually and together. Therefore, it is difficult to visually distinguish which pigments are responsible for the dark color of the grain. Chemical methods are used to accurately determine the presence/absence of pigments; however, they are expensive and labor-intensive. Therefore, the development of a new method for quickly assessing the presence of pigments in the grain would help in investigating the mechanisms of genetic control of the pigment composition of barley grains. In this work, we developed a method for assessing the presence or absence of anthocyanins and melanin in the barley grain shell based on digital image analysis using computer vision and machine learning algorithms. A protocol was developed to obtain digital RGB images of barley grains. Using this protocol, a total of 972 images were acquired for 108 barley accessions. The seed coat of these accessions may contain anthocyanins, melanins, or pigments of both types. Chemical methods were used to accurately determine the pigment content of the grains. Four models based on computer vision techniques and convolutional neural networks of different architectures were developed to predict grain pigment composition from images. The U-Net network model based on the EfficientNetB0 topology showed the best performance on the holdout set (the value of the "accuracy" parameter was 0.821).

5.
Vavilovskii Zhurnal Genet Selektsii ; 26(8): 787-797, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36694720

ABSTRACT

Phospholipases A2 (PLA2) are capable of hydrolyzing the sn-2 position of glycerophospholipids to release fatty acids and lysophospholipids. The PLA2 superfamily enzymes are widespread and present in most mammalian cells and tissues, regulating metabolism, remodeling the membrane and maintaining its homeostasis, producing lipid mediators and activating inflammatory reactions, so disruption of PLA2-regulated lipid metabolism often leads to various diseases. In this study, 29 PLA2 genes in the human genome were systematically collected and described based on literature and sequence analyses. Mapping the PLA2 genes in the human genome showed that they are located on 12 chromosomes, with some of them forming clusters. Their RVI scores, which estimate gene tolerance to the mutations that accumulate in the human population, demonstrated that the G4-type PLA2 genes belonging to one of the two largest clusters (4 genes) were the most tolerant. In contrast, the genes encoding G6-type PLA2s (G6B, G6F, G6C, G6A), localized outside the clusters, had a reduced tolerance to mutations. Analysis of the associations between PLA2 genes and human diseases reported in the literature showed that 24 such genes were associated with 119 diseases belonging to 18 groups; in total, 229 disease/PLA2 gene relationships were described, revealing that G4, G2 and G7-type PLA2 proteins were involved in the largest number of diseases compared to other PLA2 types. Three groups of diseases turned out to be associated with the greatest number of PLA2 types: neoplasms, circulatory and endocrine system diseases. Phylogenetic analysis showed that a common origin can be established only for secretory PLA2s (G1, G2, G3, G5, G10 and G12). The remaining PLA2 types (G4, G6, G7, G8, G15 and G16) could be considered evolutionarily independent. Our study has found that the PLA2 genes most tolerant to mutations in humans (G4, G2, and G7 types) are associated with the largest number of disease groups.

6.
Vavilovskii Zhurnal Genet Selektsii ; 25(1): 64-70, 2021 Feb.
Article in English | MEDLINE | ID: mdl-34901704

ABSTRACT

Determining the quantitative content of chlorophylls in plant leaves from their reflection spectra is an important task both in monitoring the state of natural and industrial phytocenoses, and in laboratory studies of normal and pathological processes during plant growth. The use of machine learning methods for these purposes is promising, since these methods allow inferring the relationships between input and output variables (a prediction model), and in order to improve the quality of the prediction, a researcher may modify predictors and select a set of method parameters. Here, we present the results of the implementation and evaluation of the random forest algorithm for predicting the total concentration of chlorophylls a and b from the reflection spectra of plant leaves in the visible and infrared wavelengths. We used the reflection spectra for 276 leaf samples from 39 plant species obtained from open sources. 181 samples were from the sycamore maple (Acer pseudoplatanus L.). The reflection spectrum represented wavelengths from 400 to 2500 nm with a step of 1 nm. The training set consisted of 85 % of the A. pseudoplatanus L. samples, and the performance was evaluated on the remaining 15 % of the samples of this species (validation sample). Six models based on the random forest algorithm with different predictors were evaluated. The method parameters were selected by five-fold cross-validation. For the first model, the intensity of the reflection spectra without any transformation was used. Based on the analysis of this model, the optimal ranges of wavelengths for the remaining five models were selected. The best results were obtained by models that used a two-point estimation of the derivative of the reflection spectrum in the visible wavelength range as input data. We compared one of these models (the two-point estimation of the derivative of the reflection spectrum in the range of 400-800 nm with a step of 1 nm) with a model by other authors (which is based on the functional dependence between two unknown parameters selected by the least squares method and two reflection coefficients, the choice of which is described in the article). The predictions of the random forest model were compared with those of the other authors' model both on the validation sample of maple and on the sample from other plant species. In the first case, the predictions of the method based on a random forest had a lower estimate of the standard deviation. In the second case, the predictions of this method had a large error for low chlorophyll concentrations, while the third-party method had acceptable predictions. The article provides an analysis of the results, as well as recommendations for using this machine learning method to assess the quantitative content of chlorophylls in leaves.
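The two-point estimation of the derivative of a reflection spectrum mentioned in the abstract can be illustrated with a short Python sketch. The abstract does not say whether a forward or another two-point scheme was used, so the forward difference below is an assumption, as are the function name `two_point_derivative` and the toy reflectance values.

```python
def two_point_derivative(reflectance, step_nm=1.0):
    """Forward-difference estimate of dR/d(lambda) on a uniform wavelength grid.

    Assumption: neighbouring samples are exactly `step_nm` apart, as in a
    spectrum recorded with a 1 nm step. Returns len(reflectance) - 1 values.
    """
    return [
        (reflectance[i + 1] - reflectance[i]) / step_nm
        for i in range(len(reflectance) - 1)
    ]


# Hypothetical reflectance values at 400, 401, 402 and 403 nm (toy data).
spectrum = [0.050, 0.052, 0.055, 0.054]
derivative = two_point_derivative(spectrum, step_nm=1.0)
print(derivative)
```

In a setup like the one described, each derivative value would become one predictor for the random forest model in place of the raw reflectance.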

7.
Vavilovskii Zhurnal Genet Selektsii ; 25(3): 251-259, 2021 May.
Article in Russian | MEDLINE | ID: mdl-34901721

ABSTRACT

The expression of eukaryotic genes can be regulated at several stages, including the translation of mRNA. It is known that the structure of mRNA can affect both the efficiency of interaction with the translation apparatus in general and the choice of translation initiation sites. To study the translated fraction of the transcriptome, experimental methods of analysis were developed, the most informative of which is ribosome profiling (RP, Ribo-seq). Originally developed for use in yeast systems, this method has been adapted for research on translation mechanisms in many plant species. This technology includes the isolation of the polysomal fraction and high-throughput sequencing of a pool of mRNA fragments associated with ribosomes. Comparing the coverage of transcripts by ribosome profiling reads with the transcription levels of the genes allows the translation efficiency to be evaluated for each transcript. The exact positions of ribosomes determined on mRNA sequences make it possible to identify translated open reading frames and switching between the translation of several reading frames, a phenomenon in which two or more overlapping frames are read from one mRNA and different proteins are synthesized. The advantage of this method is that it provides quantitative estimates of ribosome coverage of mRNA and can detect relatively rare translation events. Using this technology, it was possible to identify and classify plant genes by the type of regulation of their expression at the transcription, translation, or both levels. Features of the mRNA structure that affect translation levels have been revealed: the formation of G2 quadruplexes and the presence of specific motifs in the 5'-UTR region, GC content, the presence of alternative translation starts, and the influence of uORFs on the translation of downstream mORFs. In this review, we briefly describe the RP methodology and the prospects for its application to study the structural and functional organization and regulation of plant gene expression.
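The per-transcript translation efficiency referred to in the abstract is commonly computed as the ratio of normalized ribosome footprint density to normalized mRNA density. The Python sketch below assumes RPKM normalization, which is only one of several options and is not stated in the abstract. The function names and read counts are hypothetical.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def translation_efficiency(ribo_reads, rna_reads, length_nt, ribo_total, rna_total):
    """Ratio of Ribo-seq density to RNA-seq density for one transcript.

    Returns None when the transcript has no RNA-seq coverage, because the
    ratio is undefined in that case.
    """
    rna_density = rpkm(rna_reads, length_nt, rna_total)
    if rna_density == 0:
        return None
    return rpkm(ribo_reads, length_nt, ribo_total) / rna_density


# Toy numbers: a 1000-nt transcript in two libraries of one million reads each.
te = translation_efficiency(
    ribo_reads=200, rna_reads=100, length_nt=1000,
    ribo_total=1_000_000, rna_total=1_000_000,
)
print(te)
```

Values above 1 indicate that a transcript is covered by ribosomes more densely than its mRNA abundance alone would suggest. In practice the ratio is usually compared between conditions on a log scale.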

8.
Vavilovskii Zhurnal Genet Selektsii ; 25(3): 269-275, 2021 May.
Article in English | MEDLINE | ID: mdl-34901723

ABSTRACT

Viroids belong to a very interesting class of molecules attracting researchers in phytopathology and molecular evolution. Here we review recent literature data concerning the genetics of Potato spindle tuber viroid (PSTVd) and the mechanisms related to its pathological effect on the host plants. PSTVd can be transmitted vertically through microspores and macrospores, but not with pollen from another infected plant. The 359-nucleotide-long genomic RNA of PSTVd is highly structured and its 3D-conformation is responsible for interaction with host cellular factors to mediate replication, transport between tissues during systemic infection and the severity of pathological symptoms. RNA replication is prone to errors and infected plants contain a population of mutated forms of the PSTVd genome. Interestingly, at 7 DAI, only 25 % of the newly synthesized RNAs were identical to the master copy, but this proportion increased to up to 70 % at 14 DAI and remained the same afterwards. PSTVd infection induces the immune response in host plants. There are PSTVd strains with a severe, a moderate or a mild pathological effect. Interestingly, viroid replication itself does not necessarily induce strong morphological or physiological symptoms. In the case of PSTVd, disease symptoms may occur due to RNA interference, which decreases the expression levels of some important cellular regulatory factors, such as potato StTCP23 from the gibberellic acid pathway with a role in tuber morphogenesis or tomato FRIGIDA-like protein 3 with an early flowering phenotype. This association between the small segments of viroid genomic RNAs complementary to the untranslated regions of cellular mRNAs and disease symptoms provides a way for new resistant cultivars to be developed by genetic editing. To conclude, viroids provide a unique model to reveal the fundamental features of living systems, which appeared early in evolution and still remain undiscovered.

9.
Vavilovskii Zhurnal Genet Selektsii ; 24(4): 340-347, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33659816

ABSTRACT

The color of the grain shell of cereals is an important feature that characterizes the pigments and metabolites contained in it. The grain shell is the main barrier between the grain and the environment, so its characteristics are associated with a number of important biological functions: moisture absorption, grain viability, resistance to pre-harvest germination. The presence of pigments in the shell affects various technological properties of the grain. Color characteristics, as well as the appearance of the grain shell are an important indicator of plant diseases. In addition, the color of the grains serves as a classifying feature of plants. Genetic control of the color formation of both grains and other plant organs is exerted by genes encoding enzymes involved in the biosynthesis of pigments, as well as regulatory genes. For a number of pigments, these genes are well understood, but for some pigments, such as melanin, which causes the black color of grains in barley, the molecular mechanisms of biosynthesis are still poorly understood. When studying the mechanisms of genetic control of grain color, breeders and geneticists are constantly faced with the need to assess the color characteristics of their shell. The technical means of addressing this problem include spectrophotometers, spectrometers, hyperspectral cameras. However, these cameras are expensive, especially with high resolution, both spatial and spectral. An alternative is to use digital cameras that allow you to get high-quality images with high spatial and color resolution. In this regard, recently, in the field of plant phenotyping, methods for evaluating the color and texture characteristics of cereals based on the analysis of two-dimensional images obtained by digital cameras have been intensively developed. This mini-review is devoted to the main tasks related to the analysis of color and texture characteristics of cereals, and to methods of their description based on digital images.

10.
Genetika ; 52(7): 788-803, 2016 Jul.
Article in Russian | MEDLINE | ID: mdl-29368867

ABSTRACT

Phenomics is a field of science at the junction of biology and informatics that addresses the problem of rapid, accurate estimation of the plant phenotype; it has developed rapidly because of the need to analyze phenotypic characteristics in large-scale genetic and breeding experiments in plants. It is based on the methods of computer image analysis and the integration of biological data. Owing to automation, new approaches make it possible to considerably accelerate the process of estimating the characteristics of a phenotype, to increase its accuracy, and to eliminate the subjectivity inherent in human assessment. The review presents the main technologies of high-throughput plant phenotyping in both controlled and field conditions, their advantages and disadvantages, and the prospects of their use for efficiently solving problems of plant genetics and breeding.


Subject(s)
Image Processing, Computer-Assisted , Phenotype , Plant Breeding/methods , Plants/anatomy & histology , Plants/genetics
11.
Genetika ; 50(2): 172-80, 2014 Feb.
Article in Russian | MEDLINE | ID: mdl-25711025

ABSTRACT

In this study, genetic and monosomic analyses of the leaf pubescence of ANK 7A, ANK 7B, and ANK 7C wheat isogenic lines were carried out based on the Novosibirsk 67 wheat variety. According to visual analysis, the variety-recipient has a soft, uniform pubescence, and the lines have trichomes on the surfaces of their leaves inherited from two Chinese varieties and one Soviet variety. Using the high-throughput phenotyping method LHDetect2, which allows one to allocate the phenotypic classes of offspring in crosses based on the quantitative characteristics of leaf pubescence, it was found that chromosome 7B of the isogenic lines has a gene that determines the presence of long trichomes, and chromosome 7D of the Novosibirsk 67 variety has a gene that increases the density of pubescence. The obtained data allowed for the formulation of a hypothesis for the existence of a homoallelic series of genes that control leaf pubescence in the chromosomes of the seventh homeologous group of common wheat.


Subject(s)
Phenotype , Plant Leaves/genetics , Triticum/genetics , Chromosome Mapping , Chromosomes, Plant , Genetic Markers , Genotype , Plant Leaves/anatomy & histology
12.
Genetika ; 47(6): 836-41, 2011 Jun.
Article in Russian | MEDLINE | ID: mdl-21866865

ABSTRACT

Computer-aided image processing was used to study the morphology of leaf hairiness in the wheat cultivars Saratovskaya 29 and Golubka, as well as the introgressed strain 102/00i of the cultivar Rodina carrying the hairiness control gene introgressed from Aegilops speltoides. Morphological differences in leaf hairiness were detected and described in detail. The genetic control of hairiness was studied in two cultivars (Golubka and Saratovskaya 29) with similar hairiness patterns. Crossing these cultivars with the cultivar Rodina showed a monogenic inheritance in the cultivar Golubka and a digenic inheritance in the cultivar Saratovskaya 29, which has a denser hairiness. In the strain 102/00i and the cultivar Golubka, the number of trichomes was positively correlated with their mean length. The cultivar Golubka was used as an example to study the effect of environmental conditions on the formation of hairiness. Plants of these cultivars were found to form more but shorter trichomes.


Subject(s)
Plant Leaves/genetics , Triticum/genetics
15.
Mol Biol (Mosk) ; 38(1): 69-81, 2004.
Article in Russian | MEDLINE | ID: mdl-15042837

ABSTRACT

The review describes several modules of the GeneExpress integrated computer system concerning the regulation of gene expression in eukaryotes. Approaches to the presentation of experimental data in databases are considered. The employment of GeneExpress in computer analysis and modeling of the organization and function of genetic systems is illustrated with examples. GeneExpress is available at http://wwwmgs.bionet.nsc.ru/mgs/gnw/.


Subject(s)
Gene Expression Regulation , Systems Integration , Animals , Databases, Genetic , Evolution, Molecular , Promoter Regions, Genetic , RNA, Messenger/genetics , Vertebrates/genetics
17.
Bioinformatics ; 17(11): 1035-46, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11724732

ABSTRACT

MOTIVATION: It is known that the physico-chemical characteristics of proteins underlying specific folding of the polypeptide chain and the protein function are evolutionarily conserved. Detection of such characteristics while analyzing homologous sequences would substantially expand knowledge of protein function, structure, and evolution. These characteristics are maintained constant, in particular, by co-ordinated substitutions. In this process, the destabilizing effect of a substitution may be compensated by another substitution at a different position within the same protein, making the overall change in this protein characteristic insignificant. Consequently, the patterns of co-ordinated substitutions contain important information on conserved physico-chemical properties of proteins, which calls for their investigation and for the development of corresponding methods and software for correlation analysis of protein sequences, available to a wide range of users.
RESULTS: A software package for analyzing correlated amino acid substitutions at different positions within aligned protein sequences was developed. The approach involves searching for evolutionarily conserved physico-chemical characteristics of proteins based on information on the pairwise correlations of amino acid substitutions at different protein positions. The software was applied to analyze DNA-binding domains of the homeodomain class. As a result, two conserved physico-chemical characteristics were identified that are preserved due to the co-ordinated substitutions at certain groups of positions in the protein sequence. Possible functional roles of these characteristics are discussed.
AVAILABILITY: The program package is available at http://wwwmgs.bionet.nsc.ru/programs/CRASP/.
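The idea of pairwise correlation between substitutions at two alignment positions can be sketched in Python as follows. This is an illustration under stated assumptions, not the CRASP algorithm: it uses the Kyte-Doolittle hydropathy scale as the physico-chemical property, a plain Pearson correlation, a hypothetical function name `column_correlation`, and a made-up four-sequence alignment.

```python
from math import sqrt

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def column_correlation(alignment, i, j, scale=HYDROPATHY):
    """Correlate a property between alignment columns i and j (0-based)."""
    x = [scale[seq[i]] for seq in alignment]
    y = [scale[seq[j]] for seq in alignment]
    return pearson(x, y)


# Toy gap-free alignment: column 0 becomes less hydrophobic as column 2
# becomes more hydrophobic, so their summed hydropathy stays nearly constant.
alignment = ["IGK", "AGS", "SGA", "KGI"]
r_02 = column_correlation(alignment, 0, 2)
r_01 = column_correlation(alignment, 0, 1)
print(round(r_02, 3), r_01)
```

A strongly negative correlation between two columns, as in this toy case, is the kind of signal a compensatory substitution pair would produce. A real analysis would also need to handle gaps and the phylogenetic relatedness of the sequences.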


Subject(s)
Proteins/chemistry , Proteins/genetics , Software , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Binding Sites/genetics , Chemical Phenomena , Chemistry, Physical , Computational Biology , Conserved Sequence , Evolution, Molecular , Homeodomain Proteins/chemistry , Homeodomain Proteins/genetics , Models, Molecular , Protein Structure, Tertiary