ABSTRACT
The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, is included within a sequence of 338,234 bp (SEQ89E). This paper presents the strategy used in sequencing SEQ89E and an analysis of its open reading frames. The BX-C sequence (BXCALL) contains 314,895 bp and was obtained by deleting putative genes that are located at each end of SEQ89E and appear to be functionally unrelated to the BX-C. Only 1.4% of BXCALL codes for the three homeodomain-containing proteins of the complex. Principal findings include a putative ABD-A protein (ABD-AII) larger than a previously known ABD-A protein and a putative glucose transporter-like gene (1521 bp) located at or near the boundary between the bithoraxoid (bxd) and infra-abdominal-2 (iab-2) regions, on the strand opposite that of the homeobox-containing genes.
Subject(s)
Drosophila/genetics , Genes, Insect , Animals , Codon , Introns , Molecular Sequence Data , Open Reading Frames , Restriction Mapping
ABSTRACT
The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, has now been entirely sequenced and comprises approximately 315,000 bp, only 1.4% of which codes for protein. Analysis of this sequence reveals significantly overrepresented DNA motifs of unknown, as well as known, functions in the non-protein-coding portion of the sequence. The following types of motifs in that portion are analyzed: (i) concatamers of mono-, di-, and trinucleotides; (ii) tightly clustered hexanucleotides (spaced ≤ 5 bases apart); (iii) direct and reverse repeats longer than 20 bp; and (iv) a number of motifs known from biochemical studies to play a role in the regulation of the BX-C. The hexanucleotide AGATAC is remarkably overrepresented and is surmised to play a role in chromosome pairing. The positions of sites of highly overrepresented motifs are plotted for those motifs that occur at more than five sites in the sequence when fewer than 0.5 occurrences are expected. Expected values are based on a third-order Markov chain, which is the optimal order for representing the BXCALL sequence.
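The abstract does not state how the expected values were computed from the third-order Markov chain. As an illustration only, the sketch below uses a common maximum-likelihood estimator in which the expected count of a word is built from the observed counts of its overlapping 4-mers and 3-mers in the same sequence. The function names (count_overlapping, expected_count, overrepresentation) are hypothetical, and the estimator is an assumption rather than the authors' documented procedure.

```python
def count_overlapping(seq: str, word: str) -> int:
    """Count occurrences of word in seq, allowing overlaps."""
    n = len(word)
    return sum(1 for i in range(len(seq) - n + 1) if seq.startswith(word, i))


def expected_count(seq: str, word: str, order: int = 3) -> float:
    """Expected count of word under an order-k Markov chain fitted to seq.

    Assumed estimator (not taken from the abstract):
      E(w) = N(w[0:k+1]) * prod_i N(w[i:i+k+1]) / N(w[i:i+k]),  i = 1 .. len(w)-k-1
    where N(.) is the observed overlapping count in seq.
    """
    k = order
    if len(word) <= k:
        raise ValueError("word must be longer than the Markov order")
    expected = float(count_overlapping(seq, word[: k + 1]))
    for i in range(1, len(word) - k):
        context = count_overlapping(seq, word[i : i + k])
        if context == 0:
            return 0.0  # the context never occurs, so the word cannot be expected
        expected *= count_overlapping(seq, word[i : i + k + 1]) / context
    return expected


def overrepresentation(seq: str, word: str, order: int = 3) -> float:
    """Ratio of observed to expected count; infinite if nothing is expected."""
    exp = expected_count(seq, word, order)
    obs = count_overlapping(seq, word)
    return obs / exp if exp > 0 else float("inf")
```

Under these assumptions, a call such as overrepresentation(sequence, "AGATAC") would give the observed-to-expected ratio for the hexanucleotide discussed above, where sequence stands for the non-protein-coding portion of the sequence supplied by the reader. A motif would meet the plotting criterion described in the abstract when its observed count exceeds five while expected_count returns less than 0.5.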