Overlapping Genes?
Liam Reilly
| 12-12-2025
· Information Team
Genomes, the blueprints of life, contain vast stretches of DNA, which encode the proteins necessary for cellular function.
Among these, overlapping genes (OLGs) represent a fascinating genetic arrangement where two or more genes share nucleotide sequences partially or entirely within the same genomic region.

Defining Overlapping Genes

An overlapping gene occurs when expressible nucleotide sequences of two genes intersect within the genome. This overlap may happen on the same DNA strand, using different reading frames, or on opposite strands coding complementary sequences. Essentially, a single DNA segment simultaneously encodes multiple distinct protein products through alternative reading frames or initiation codons. This arrangement allows compact genomes, particularly in viruses, to maximize informational output from limited DNA.

Evolutionary Origins and Models

Overlapping genes often arise via a process known as overprinting, where point mutations in an existing gene create a new open reading frame (ORF) in an alternate reading frame, leading to a novel protein while preserving the original gene function. In many organisms — notably viruses and some bacteria — one frame represents the ancestral gene, while the overlapping frame is newly originated.
In some cases, overlapping proteins evolve under different selective pressures: the ancestral frame remains under strong purifying selection to conserve its function, while the de novo frame can undergo positive or relaxed purifying selection — allowing rapid adaptation or functional innovation.
However, over long evolution, the de novo protein may also become subject to stabilizing selection, resulting in “conserved overlaps” where both reading frames maintain their encoded proteins with minimal sequence divergence.

Functional Significance and Biological Impact

Initially discovered in viral genomes—where compactness is critical—overlapping genes have been identified across bacteria, archaea, and eukaryotes, including humans. Their presence expands the protein-coding potential without increasing genome size. Proteins encoded by OLGs often have unique, sometimes essential, roles in cellular processes, stress responses, and pathogen infectivity.
For instance, many viral overlapping proteins contribute to immune evasion, replication efficiency, or cell entry mechanisms. In mammals, overlapping genes have been implicated in tumor suppression, aging, and metabolic regulation. The ability to encode multiple proteins from the same DNA segment exemplifies genetic economy and innovation.

Challenges in Detection and Annotation

Overlapping genes are notoriously difficult to identify using conventional genome annotation techniques, which often overlook alternative reading frames or antisense transcripts. This oversight leads to underestimation of their prevalence and functional importance.
Sophisticated bioinformatic tools integrating sequence conservation, expression data, and evolutionary analysis are advancing OLG detection. Recent high-quality datasets from experimentally confirmed viral overlapping genes enable training of algorithms to discover overlooked OLGs in higher organisms.

Applications in Synthetic Biology

The unique capacity of overlapping genes to encode distinct proteins from a single nucleotide sequence inspires synthetic biology approaches to design compact genetic circuits with multiple functions. Recent advances utilize computational modeling and deep generative algorithms to engineer overlapping genes, suggesting applications in gene therapy, vaccine design, and biological computing.
Dr. Luis M. Muñoz-Baena, an academic researcher, who stated: "Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes."
Overlapping genes represent a compelling genomic feature that enables multiple distinct proteins to be encoded within the same DNA sequence through alternative reading frames and strand orientations. Their evolutionary emergence via overprinting and subsequent selective pressures illustrate the delicate balance of functional conservation and innovation. Present across diverse life forms, overlapping genes enhance coding capacity, contribute to crucial biological functions, and challenge traditional genome annotation methods.