BACKGROUND: The rodent-specific reproductive homeobox (Rhox) gene cluster on the X chromosome has been reported to contain twelve homeobox-containing genes, Rhox1-12.
RESULTS: We have identified a 40 kb genomic region within the Rhox cluster that is duplicated eight times in tandem, resulting in eight paralogues each of Rhox2 and Rhox3 and seven paralogues of Rhox4. Transcripts have been identified for the majority of these paralogues, and all but three are predicted to produce full-length proteins with functional potential. We predict that there are a total of thirty-two Rhox genes at this genomic location, making it the most gene-rich homeobox cluster identified in any species. From the 95% sequence similarity between the eight duplicated genomic regions and the synonymous substitution rate of the Rhox2, Rhox3 and Rhox4 paralogues, we predict that the duplications occurred after the divergence of mouse and rat and that this is the youngest homeobox cluster identified to date. Molecular evolutionary analysis reveals that this cluster is an actively evolving region, with the Rhox2 and Rhox4 paralogues under diversifying selection and Rhox3 evolving neutrally. The biological significance of this duplication is emphasised by the identification of an important role for Rhox2 and Rhox4 in regulating the initial stages of embryonic stem (ES) cell differentiation.
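The total of thirty-two genes can be checked against the paralogue counts given above. The short tally below is only an illustrative sketch: it assumes that the nine genes other than Rhox2, Rhox3 and Rhox4 in the original twelve-gene description remain single-copy, and the variable names are hypothetical rather than taken from the study.

```python
# Illustrative tally only; names are hypothetical, not from the study.
previously_reported = 12  # Rhox1-12, as originally described

# Paralogue counts stated in the RESULTS paragraph.
paralogue_counts = {"Rhox2": 8, "Rhox3": 8, "Rhox4": 7}

# Assumption: every gene other than Rhox2, Rhox3 and Rhox4 is single-copy.
single_copy = previously_reported - len(paralogue_counts)

total_rhox_genes = single_copy + sum(paralogue_counts.values())
print(total_rhox_genes)
```

Under that assumption the tally agrees with the predicted total of thirty-two Rhox genes.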
CONCLUSION: The gene-rich Rhox cluster provides the mouse with significant biological novelty that we predict could provide a substrate for speciation. Moreover, this unique cluster may explain species differences in ES cell derivation and maintenance between mouse, rat and human.