Accumulating data from large-scale transcriptome studies have identified a class of poorly understood non-protein-coding RNAs, including microRNAs, piwi-interacting RNAs (piRNAs), and long non-coding RNAs (lncRNAs). A number of studies suggest that lncRNAs modulate the expression of protein-coding genes in a variety of tissues and organs by altering chromatin modification, transcription, mRNA decay, protein subcellular localization, and other key processes. Although much work remains to identify the roles of lncRNAs in reproduction-related systems, they are likely to exert widespread effects during these processes. In this review, we highlight our emerging understanding of how lncRNAs regulate gene expression, and we discuss the physiological role of this new class of molecular regulators in neurobiology, cardiology, endocrinology, metabolism, muscle biology, and female reproductive disorders.
Over the past several decades, the indispensable roles of RNA in numerous biological processes have been demonstrated, but only recently have a small number of functional RNAs attracted significant attention. A particularly striking finding is that the genomes of complex organisms contain large amounts of non-protein-coding sequences that scale consistently with developmental complexity (1). Transcripts from these sequences, previously dismissed as “transcriptional noise”, are now known to have specific roles in multiple biological processes and are referred to as noncoding RNAs (ncRNAs). The number of ncRNAs in eukaryotes is vast and exceeds that of protein-coding genes (2). Among the different types of ncRNAs, transcripts longer than 200 nucleotides that lack any distinct open reading frame are defined as long non-coding RNAs (lncRNAs).
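The operational definition above (longer than 200 nucleotides, no distinct open reading frame) can be illustrated with a minimal Python sketch. This is not a method used in the studies cited here. The function names are hypothetical, and the cutoff of 100 codons for a "distinct" open reading frame is an assumption made only for illustration, since the definition above does not quantify it. The sketch scans only the three forward reading frames and counts only open reading frames that begin with ATG and end at a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the length, in codons, of the longest forward-strand ORF.

    An ORF here starts at ATG and must be closed by a stop codon;
    the stop codon itself is not counted.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        in_orf = False
        current = 0
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf = True
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                in_orf = False
            else:
                current += 1
    return best


def is_candidate_lncrna(seq: str, min_length: int = 200,
                        max_orf_codons: int = 100) -> bool:
    """Apply the length rule from the text plus an ASSUMED ORF cutoff."""
    return len(seq) > min_length and longest_orf_codons(seq) < max_orf_codons


if __name__ == "__main__":
    noncoding = "GCA" * 100                  # 300 nt, no start codon in any frame
    coding = "ATG" + "GCC" * 150 + "TAA"     # 456 nt with a 151-codon ORF
    print(is_candidate_lncrna(noncoding))
    print(is_candidate_lncrna(coding))
```

In practice, coding potential is assessed with dedicated tools and additional evidence rather than a single length threshold, so this filter is best read as a restatement of the definition rather than a classifier.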
The identification of ncRNAs has increased dramatically in recent years, and it is now known that one class of ncRNAs termed microRNAs plays diverse roles in both physiological (3, 4) and pathological processes (5, 6), and this is likely to also be the case for lncRNAs. The principles behind their mechanisms of action are presented here using a selection of lncRNAs whose functions have been evaluated via the most robust methods available. In this review, we will summarize our current understanding of lncRNA functions in molecular biology and specifically in female reproductive systems.
Although only a small number of lncRNAs have been well characterized to date, they have been shown to play vital roles in molecular, cellular, physiological, and pathological functions by fine-tuning the expression of protein-coding genes (7, 8). It is clear that lncRNAs usually function through interactions with DNA, other RNAs, and proteins, either through direct base pairing with complementary sequences or through secondary structure generated by RNA folding. Here, we will classify these regulatory non-coding transcripts into those that regulate local gene expression in cis (Figure 1) versus those that function far from the transcription site and perform their regulatory roles in trans (Figure 2).
Figure 1. Cis-regulation of local gene expression by lncRNAs. A. lncRNAs may influence chromatin architecture and/or transcription by interacting with transcription factors or chromatin-modulating proteins. B. Enhancer RNAs (eRNAs) expand the regulatory capacity of cis-regulatory lncRNAs by coordinating transcriptional activation or stabilizing enhancer-promoter looping. Pol II, RNA polymerase II; TF, transcription factor.
Figure 2. Trans-regulation of unlinked gene expression by lncRNAs. A. lncRNAs may regulate distant gene expression by recruiting chromatin-modifying proteins and transcription factors, and by influencing nuclear structure to regulate DNA transcription and RNA processing. B. Trans-acting lncRNAs can interact with proteins and/or other RNA molecules. Pol II, RNA polymerase II; TF, transcription factor.
Numerous lncRNAs have been shown to play a cis-regulatory role in the expression of nearby genes. A portion of these lncRNAs can influence chromatin architecture and/or transcription by interacting with transcription factors or chromatin-modulating proteins and promoting their recruitment to nearby gene loci, where they control the transcriptional activity of these genes. For instance, the lncRNA ANRIL binds to SUZ12, one of the subunits of the histone methyltransferase polycomb repressive complex 2 (PRC2), and depletion of ANRIL disrupts SUZ12 occupancy at the p15INK4B locus and thus increases expression of this gene (9). Similarly, the lncRNA linc-HOXA1 in mouse embryonic stem cells has been shown to repress expression of the Hoxa1 gene, which is located 50 kb away from linc-HOXA1, by recruiting PURB as a transcriptional regulator (10). In another study, the lncRNA HoxBlinc was reported to promote hematopoietic development by up-regulating Hoxb gene expression. HoxBlinc acts as a regulator of chromatin loop structures by guiding Set1/MLL1 to the Hoxb gene locus to control lineage-specific transcription (11).
In addition to the mechanism described above, the discovery of enhancer RNAs (eRNAs), a class of lncRNAs synthesized at enhancers, has greatly expanded our understanding of the regulatory capacity of cis-regulatory lncRNAs (12). There is direct evidence that eRNAs play functional roles in nearby gene expression (13, 14), and the regulatory action of eRNAs requires both the sequences that mediate transcription factor binding and the specific sequences that encode the eRNA transcript, although the critical determining factors for this specificity have not been identified (13). In human breast cancer cells, induced eRNAs play important roles in the induction of target coding genes by increasing the strength of specific enhancer looping that is initiated by ERα binding (14). These enhancer-like functions show a more complex role for cis-regulatory lncRNAs in transcription factor binding and subsequent gene transcription and provide a novel field for the study of lncRNAs in gene regulation.
LncRNAs were initially reported to be cis-regulatory factors that interact with neighboring genes, but now extensive numbers of both cis- and trans-acting lncRNAs have been identified. HOTAIR was the first lncRNA shown to operate in trans on chromosomes other than its original site of transcription. Chromatin immunoprecipitation followed by hybridization to tiling microarrays interrogating all human promoters (ChIP-chip) showed that HOTAIR recruits the PRC2 complex to specific target genes genome-wide, leading to H3K27 trimethylation and epigenetic silencing of metastasis suppressor genes (15). Thus lncRNAs such as HOTAIR regulate chromatin states and gene expression in regions far away from their transcription sites.
In addition to recruiting chromatin-modifying proteins, trans-regulating lncRNAs also influence nuclear structure and organization or interact with proteins and/or other RNA molecules. Some lncRNAs influence nuclear architecture in order to regulate DNA transcription, RNA processing, and other steps in gene expression. MALAT1 is a well-known nuclear speckle-retained lncRNA that is recruited to nuclear speckles by directly interacting with multiple splicing-associated proteins (16). In addition, MALAT1 acts as a scaffold that helps to position nuclear speckles at active gene loci (17). Trans-acting lncRNAs can also function by interacting with proteins and other RNAs. Traditionally, proteins have been thought to be the major scaffolds in biological processes, but recent evidence suggests that lncRNAs can play a similar role. For example, the lncRNA NORAD functions as a molecular decoy for the RNA-binding proteins PUMILIO1 and PUMILIO2, which accelerate mRNA decay and inhibit translation of their mRNA targets (18). Also, the p53-responsive lncRNA GUARDIN has been identified as an RNA scaffold between BRCA1 and BARD1 and thus plays an important role in maintaining genomic stability in cells exposed to genotoxic stress as well as under steady-state conditions (19). Taken together, these observations show that lncRNAs are capable of functioning through multiple mechanisms at different points in the process of gene expression.
Recently, lncRNAs have been implicated in numerous cellular processes ranging from embryonic stem cell pluripotency (20, 21) to immune responses (22, 23), cell cycle regulation (24), cell proliferation, and cell death. The roles of lncRNAs in the regulation of the endocrine system, reproductive system, and other systems have also been explored, among which the best characterized are the nervous and cardiovascular systems.
Neural development is a highly stereotyped process that requires precise spatiotemporal regulation of cell proliferation and differentiation, and lncRNAs appear to function as a regulatory mechanism to fine-tune neuronal development and function. LncRNAs are abundantly expressed in the central nervous system (25), and several lncRNAs have been implicated in neuronal development and in the differentiation of neurons (26). For instance, the lncRNA RMST is regulated by the transcription factor REST, which is known as a master negative regulator of neurogenesis. RMST in turn drives the recruitment of the neural transcription factor SOX2 to key neurogenesis-promoting genes. Loss of RMST blocks exit from the embryonic stem cell state and the initiation of neural differentiation (27). Another recent study has reinforced the important role that lncRNAs play in the brain by means of a collection of 18 lncRNA knockout mouse lines, three of which exhibit peri- or post-natal lethality and two of which show distinct developmental defects (28).
Epigenetic regulation is crucial during cardiovascular development, and lncRNAs play important roles in this process. Transcriptome analyses in the AC16 immortalized adult ventricular cardiomyocyte cell line identified a wide array of transcripts and found that lncRNAs are involved in cardiac development, physiology, and pathology (29). Fendrr, for instance, is essential for heart development in mice and likely exerts its functions by binding to PRC2 chromatin-remodeling complexes, and mouse embryos lacking Fendrr display upregulation of several transcription factors – such as Foxf1 and Pitx2 – that control lateral plate and cardiac mesoderm differentiation (30). Another example of a lncRNA required for cardiomyocyte lineage commitment is Braveheart (Bvht), which is necessary for the activation of a core network of cardiovascular genes, functions upstream of mesoderm posterior 1 (MesP1), and is thus indispensable for the development of cardiovascular progenitor cells. This function is carried out through a mechanism similar to that of Fendrr described above (31). Other lncRNAs have also been identified specifically in the heart and shown to be directly related to cardiac biology. For example, the lncRNA Mhrt is likely to protect the heart from pathological hypertrophy by antagonizing the activity of Brg1, a chromatin-remodeling enzyme that promotes aberrant gene expression and cardiomyopathy in response to stress (32).
The identification of a growing set of lncRNAs in endocrine organs – including the pancreas, pituitary, hypothalamus, thyroid, and parathyroid glands – has provided new insights into the properties of lncRNAs. For example, gonadotropin-releasing hormone (GnRH), which is secreted by neurons of the hypothalamus, is crucial for the regulation of normal reproductive development and function. Dysregulation of the GnRH gene Gnrh1 has been implicated in the incorrect timing of puberty, reproductive deficiencies, and infertility. A mouse Gnrh1 enhancer-derived noncoding RNA, GnRH-E1 RNA, is considered to be an inducer of Gnrh1 transcription in GnRH neuronal cell lines, and GnRH-E1 RNA is thus suggested to participate in the development and maturation of GnRH neurons (33). Pancreatic islets serve a critical role in metabolic homeostasis through insulin secretion (34), and a comprehensive strand-specific transcriptome map of human pancreatic islets and β cells identified more than 1,100 intergenic and antisense islet-cell lncRNA genes. These islet lncRNAs are dynamically regulated and have been shown to be an integral part of the β cell differentiation and maturation process. Moreover, 42% of mouse orthologous transcripts are found in islet cells and are regulated in a manner similar to their human counterparts (35). Profiling analyses have shown that lncRNAs are widely expressed in the endocrine system; however, little is known about the function of lncRNAs in endocrine-related organs, and this needs to be further explored.
Adipose tissue plays multiple roles in energy storage and expenditure, endocrine signaling, and immune-metabolic crosstalk. There are two principal types of adipose tissues in mammals, namely brown adipose tissue (BAT), which is specialized in producing heat and consuming energy as a defense against cold and obesity (36), and white adipose tissue, which is in charge of storing chemical energy in the form of triglycerides. To a certain extent, lncRNAs are involved in adipocyte differentiation, development, and function. Whole-transcriptome RNA-Seq identified 175 lncRNAs that are specifically regulated during adipogenesis in mice, most of which are required for adipocyte differentiation (37). More recently, RNA-Seq analysis of transcriptomes in different adipose tissues identified 127 BAT-restricted lncRNAs. One of them, lnc-BATE1, is implicated in the establishment and maintenance of BAT and has been shown to play a role in thermogenesis. Further experiments showed that knockdown of lnc-BATE1 impaired the differentiation of brown adipocytes, as revealed by decreased expression of brown fat markers and mitochondrial markers (38).
LncRNAs are also involved in muscle differentiation. The lncRNA H19 is abundantly expressed in fetal tissues and in adult muscles. H19 gives rise to the microRNAs miR-675-3p and miR-675-5p, which down-regulate the anti-differentiation SMAD transcription factors to promote differentiation and regeneration of skeletal muscle. H19-deficient mice display abnormal skeletal muscle regeneration, which can be rescued by ectopic expression of miR-675-3p and miR-675-5p (39). In addition, H19 functions as a molecular sponge to modulate the let-7 family of microRNAs and thus to control muscle differentiation (40).
Although the functions of lncRNAs in normal biological processes and development, such as in neurobiology, endocrinology, cardiology, and muscle biology as described above, have been well characterized, the challenge remains to determine the roles of lncRNAs in the reproductive system (41). From a regulatory perspective, lncRNAs are very likely to be involved in the development of the female reproductive system and in regulating fertility (42).
Folliculogenesis is a complex process that is regulated by a broad molecular network, and the mammalian ovarian follicle consists of germ cells, somatic cells (both cumulus and granulosa cells), and follicular fluid (43). The identification of lncRNAs in the follicular microenvironment and their expression in the different compartments of the ovary would improve our understanding of oocyte growth and maturation. Transcripts of the well-characterized lncRNAs H19 and Xist are present in bovine transzonal projections, which connect the oocyte with the surrounding somatic cells in follicles. It is through these projections that the somatic compartment of the follicle continues to nurture the oocyte after it becomes transcriptionally quiescent, and through which these lncRNAs may act during folliculogenesis. H19 is an imprinted gene expressed from the maternal allele, whereas Xist is known to regulate transcriptional inactivation of the X chromosome in females (44). Another study showed that expression of the lncRNA AK124742 was up-regulated in cumulus cells from mature oocytes that later developed into high-quality embryos, whereas its expression was much lower in those from oocytes that resulted in poor-quality embryos. Therefore, AK124742 is believed to be significantly correlated with oocyte maturation, as well as with fertilization, embryo quality, and clinical pregnancy outcomes (45). Similarly, a microarray analysis identified a total of 20,563 lncRNAs expressed in human cumulus cells. Among them, 124 lncRNAs were consistently upregulated in high-quality cumulus cells, while 509 lncRNAs were downregulated. Those lncRNAs expressed in cumulus cells might be involved in oocyte development and early embryogenesis by regulating neighboring protein-coding genes (46). Another group updated the existing data on differentially expressed genes between granulosa cells and cumulus cells by identifying a number of lncRNA transcripts in parts of the genome other than coding regions (47).
Anti-Müllerian hormone (AMH) is an important hormone regulating folliculogenesis in the ovary, and the Amhr2 protein is a pivotal molecule in mediating AMH signaling (48). In addition, lncRNA-Amhr2 has been shown to activate the Amhr2 gene in mouse ovarian granulosa cells. Global transcriptome sequencing between compact cumulus cells from germinal vesicle cumulus oocyte complexes and expanded cumulus cells from metaphase II cumulus oocyte complexes showed differential expression of numerous lncRNA molecules. Interestingly, 12 of these differentially expressed lncRNAs are encoded within introns of genes, including ADAMTS9, AQP2, AQP5, CACNA1C, CHRM3, DPP4, FABP6, GPC5, HAS2, HSD11B1, ITGA6 and WNT5A, which are known to be involved in proliferation, steroidogenesis, and apoptosis in granulosa cells (49). These observations emphasize the importance of lncRNAs in oocytes and peripheral somatic cells. The identification of lncRNAs in cumulus cells and granulosa cells provides new clues as to how lncRNAs participate in the differentiation of follicular somatic cells. However, further studies are still required to characterize their biological functions during folliculogenesis.
Mammalian ovulation triggers the reorganization of follicular somatic cells through hypertrophy and hyperplasia, leading to the ovulation of a mature ovum and subsequent corpus luteum formation and progesterone secretion. However, little is known about the effects of lncRNAs on ovulation. A high-throughput RNA-seq assay was performed to identify differentially expressed lncRNAs between the ovaries of multiparous and uniparous goats. Combined with cis role analysis, 24 lncRNAs were predicted to overlap with cis-regulatory elements involved in the four ovulation-related pathways of progesterone-mediated oocyte maturation, steroid biosynthesis, oocyte meiosis, and GnRH signaling. This expanded our understanding of lncRNA biology and provided clues for how lncRNA mediates the regulation of goat ovulation and lambing (50). In another study, it was shown that Neat1 knockout mice fail to become pregnant due to corpus luteum dysfunction and low progesterone levels. Neat1 is a well-characterized lncRNA that exclusively localizes to the subnuclear domain paraspeckles and induces the relocation of Sfpq, one of the core paraspeckle proteins, from gene promoter regions to the paraspeckles (51, 52). Also, Neat1 sequesters Sfpq in paraspeckles in luteal cells, and together these observations suggest that Neat1 is essential for the formation of the corpus luteum and for the subsequent establishment of pregnancy. However, the precise molecular mechanism through which Neat1 functions remains to be determined (53).
Folliculogenesis is a primary determinant of female fertility, and dysregulation of different steps in the process can lead to multiple female reproductive disorders such as infertility, miscarriage, and adverse pregnancy outcomes. LncRNAs have recently been explored as epigenetic regulators in the female reproductive system and have shed new light on the etiology of numerous reproductive disorders. Here we mainly focus on the current evidence for the involvement of lncRNAs in polycystic ovary syndrome (PCOS) and premature ovarian insufficiency (POI).
The roles that miRNAs in follicular somatic cells and the follicular fluid play in oocyte maturation and the etiology of PCOS have been investigated (54, 55), and thus it is likely that the lncRNA expression profile might also reveal a role for these molecules in the pathological events in granulosa cells and cumulus cells that are associated with PCOS.
Several lncRNAs that are expressed in granulosa and cumulus cells have been specifically associated with PCOS, including PWRN2 and HCG26 (56, 57). A microarray analysis was performed on cumulus cells isolated from five patients with PCOS and five healthy women, and 627 differentially expressed lncRNAs were identified, of which five were confirmed by qRT-PCR to be consistent with the microarray data. Notably, approximately 20 lncRNAs were co-expressed with a type 2 diabetes mellitus candidate gene, neuropeptide Y receptor Y1 (NPY1R), and the co-expressed lncRNAs might play a vital role in the etiology of PCOS. Furthermore, the up-regulated lncRNA PWRN2 and its co-expressed gene ATP6V1G3 are likely to cause oocyte dysplasia in PCOS by lowering the pH of the follicular microenvironment (56). Another independent microarray analysis was performed to compare lncRNA expression in granulosa cells from seven women with PCOS and seven matched healthy women, and this showed that a lncRNA named HCG26 is up-regulated in PCOS patients. Functional analysis showed that knockdown of HCG26 inhibits cell proliferation while promoting aromatase gene expression and estradiol production. These findings indicate that HCG26 is involved in granulosa cell proliferation and steroidogenesis and thus contributes to the pathogenesis of PCOS (57).
In addition to lncRNAs in granulosa cells, significantly increased expression of the lncRNAs SRA and CTBP1-AS in peripheral blood leukocytes of patients with PCOS has also been observed. Also, serum levels of the lncRNA GAS5 have been shown to be decreased in PCOS patients with insulin resistance (58-60). Taken together, these studies provide compelling evidence that aberrantly expressed lncRNAs in granulosa cells or peripheral blood might underlie the pathophysiology of human PCOS, and they expand the database of diagnostic biomarkers and therapeutic targets for PCOS.
The role of lncRNAs has also been studied in rodent models of PCOS. For instance, deep sequencing of ovaries from a letrozole-induced PCOS rat model revealed the lncRNA expression profile, and key lncRNA-miRNA-mRNA networks were constructed and subjected to joint pathway analysis in order to explore the competitive endogenous RNA mechanisms involved in the PCOS model (61). In addition to rat models, results in mice have indicated that abnormally elevated SRA expression promotes cell proliferation, inhibits cell apoptosis, and induces the secretion of estradiol and progesterone in granulosa cells, and that it might be a risk factor for developing PCOS (62).
The causative genes of POI have been extensively studied in coding regions (63), and recently ncRNAs have begun to be explored as epigenetic regulators in the ovaries. Similar to PCOS, a fair number of studies have addressed the association of miRNAs with POI (64), but our understanding of lncRNAs in the pathophysiology of POI is still quite limited.
One of the few studies that have been performed showed expression of lncRNAs FMR4 and FMR6, which are transcribed from the fragile X-associated POI-correlated FMR1 gene locus, in the granulosa cells of FMR1 premutation carriers and control groups. There was a significant nonlinear association between the number of CGG repeats and the levels of FMR6. In addition, a significant negative correlation was observed between the number of oocytes retrieved and the expression level of FMR6 in granulosa cells (65).
Another recent study showed that overexpression of the lncRNA-Meg3-p53-p66Shc pathway in mice might be the main mechanism for cyclophosphamide-induced ovarian injury and POI and that knockdown of lncRNA-Meg3 can effectively reduce the activation effect of cyclophosphamide on follicles (66). Despite these studies, our current understanding of the association between lncRNAs and POI is still at an early stage, and the contribution of this class of regulatory factors in the etiology of human POI has yet to be determined.
Evidence for the involvement of lncRNAs in female reproductive disorders and for their potential roles in folliculogenesis has been provided and further verified (Table 1). However, only a small number of identified lncRNAs have been thoroughly studied mechanistically, and the publications available to date have been biased toward lncRNAs with observable functions in the female reproductive system. In addition, studying the functions of lncRNAs is difficult because of their relatively low expression levels and unusual evolutionary properties (67). For example, lncRNA promoters show greater sequence conservation than the background DNA and are almost as conserved as protein-coding gene promoters (68), yet many lncRNAs detected in one species appear not to be transcribed in another. This, along with the temporally and spatially restricted expression patterns of lncRNAs (69), likely explains why our understanding of lncRNA function is where the microRNA field was a decade ago.
In the post-genomic era, the door has been opened for us to see the greater potential of the dark matter of the genome using both new technologies and updated databases. LncRNAs will advance science and medicine as biomarkers, therapeutic targets, and diagnostic indexes in human disorders. Thus, continued studies are warranted to elucidate their physical properties, their molecular mechanisms of action, and their biological roles in both physiological ovarian function and in the pathogenesis of female reproductive disorders.
This study was supported by grants from the National Key Research and Development Program of China (2017YFC1001100) and the National Natural Science Foundation of China (81522018, 81471509, 81571406, and 81771541).