Conservation and function of long noncoding RNAs

How about some cool science as we head toward the weekend?

Let’s talk about long noncoding RNAs (lncRNAs). They are (somewhat arbitrarily) defined as RNA transcripts at least 200 nucleotides long that have little potential to code for proteins. Determining the function (if one exists) of a particular lncRNA can often be difficult. In part, this may be because lncRNAs evolve much more quickly than protein-coding genes do and therefore exhibit much less sequence conservation, which can make identifying orthologs in related organisms more difficult. Nevertheless, if a particular lncRNA is functionally important, we would usually expect to see copies of it in related organisms, so finding these homologs can be an important indicator of function.

A new paper in Genes and Development by Quinn et al. is a useful demonstration of this. The authors find evidence of 47 homologs of roX, an lncRNA involved in X chromosome dosage compensation, across 35 fruit fly species. The researchers identify roX homologs based on a combination of short regions of sequence conservation (“microhomology”), RNA secondary structure, and synteny (i.e., similarity in location along a chromosome). Here is the abstract (I believe the paper itself is open access):

Many long noncoding RNAs (lncRNAs) can regulate chromatin states, but the evolutionary origin and dynamics driving lncRNA–genome interactions are unclear. We adapted an integrative strategy that identifies lncRNA orthologs in different species despite limited sequence similarity, which is applicable to mammalian and insect lncRNAs. Analysis of the roX lncRNAs, which are essential for dosage compensation of the single X chromosome in Drosophila males, revealed 47 new roX orthologs in diverse Drosophilid species across ∼40 million years of evolution. Genetic rescue by roX orthologs and engineered synthetic lncRNAs showed that altering the number of focal, repetitive RNA structures determines roX ortholog function. Genomic occupancy maps of roX RNAs in four species revealed conserved targeting of X chromosome neighborhoods but rapid turnover of individual binding sites. Many new roX-binding sites evolved from DNA encoding a pre-existing RNA splicing signal, effectively linking dosage compensation to transcribed genes. Thus, dynamic change in lncRNAs and their genomic targets underlies conserved and essential lncRNA–genome interactions.
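
To make the “microhomology” idea a bit more concrete, here is a toy Python sketch. It is not the authors' pipeline, which also uses RNA secondary structure and synteny. It only looks for short stretches that are identical between two sequences. The function name `shared_kmers`, the window size `k=8`, and both example sequences are made up for illustration.

```python
from collections import defaultdict


def shared_kmers(seq_a, seq_b, k=8):
    """Return sorted (pos_a, pos_b, kmer) tuples for every k-mer found in both sequences."""
    # Index every k-mer of the first sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Scan the second sequence and record every exact match.
    hits = []
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        for i in index.get(kmer, ()):
            hits.append((i, j, kmer))
    return sorted(hits)


# Two hypothetical sequences that differ everywhere except one short motif.
motif = "GTTTTACG"
species_a = "AAAAAA" + motif + "CCCCCC"
species_b = "TTTT" + motif + "GGGGGGGG"

for pos_a, pos_b, kmer in shared_kmers(species_a, species_b, k=8):
    print(f"{kmer} at position {pos_a} in A and position {pos_b} in B")
```

A real search would presumably need to tolerate mismatches and would only be convincing when the hit also sits in a syntenic location, which is why the paper combines several lines of evidence rather than relying on sequence alone.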

I think it’s a neat demonstration of both the challenges and utility of using evolutionary conservation to help inform inferences regarding the functionality of non-coding genes.  As we continue to get a better grasp of which non-coding sequences are important and which ones are less so, I expect to see many more studies like this.

Carl Zimmer also has an excellent write-up of the paper here, which was the original inspiration for this post. Enjoy!

Edited to correct the number of species in which roX was detected.

78 thoughts on “Conservation and function of long noncoding RNAs”

  1. Before taking my leave… I am becoming more and more impressed with the history of the imperial old guard of geneticists such as Mark Ptashne.

    It would appear all of Molecular Biology is but a footnote to the efforts of the CSH Phage Group!

    Check out these two links:

    “Microbial Epigenetics Workshop”?! … who would have thought!? Hmmm – just the same, it would appear that the up-and-coming youngsters should be paying more heed to the classical papers authored by their predecessors, if Mark Ptashne is to be given his due.

  2. Allan Miller:
    Of course, the silencing of transposons is a slightly different issue. The main problem there is that they damage genes, so there is stronger selection – it’s mediated by that effect, rather than the incremental effect on gene size, which is pretty tiny per generation.

    One interesting wrinkle when it comes to epigenetic silencing of TEs is that there is often a cost involved with silencing when the TE has inserted itself near a gene. See Hollister and Gaut (2009) for a neat example:


    Transposable elements (TEs) are ubiquitous genomic parasites. The deleterious consequences of the presence and activity of TEs have fueled debate about the evolutionary forces countering their expansion. Purifying selection is thought to purge TE insertions from the genome, and TE sequences are targeted by hosts for epigenetic silencing. However, the interplay between epigenetic and evolutionary forces countering TE expansion remains unexplored. Here we analyze genomic, epigenetic, and population genetic data from Arabidopsis thaliana to yield three observations. First, gene expression is negatively correlated with the density of methylated TEs. Second, the signature of purifying selection is detectable for methylated TEs near genes but not for unmethylated TEs or for TEs far from genes. Third, TE insertions are distributed by age and methylation status, such that older, methylated TEs are farther from genes. Based on these observations, we present a model in which host silencing of TEs near genes has deleterious effects on neighboring gene expression, resulting in the preferential loss of methylated TEs from gene-rich chromosomal regions. This mechanism implies an evolutionary tradeoff in which the benefit of TE silencing imposes a fitness cost via deleterious effects on the expression of nearby genes.

  3. Tom,

    Thanks for the comment on methylation marks. I’m not a biologist, but I am writing a review article for engineers interested in cross-disciplinary topics (it’s going to be a 4th-rate review article since I’m no specialist, but for the sort of venue I’m writing for it’s hopefully educational), and I’m trying to assert an analogy between methylation marks and histone modifications as memory; in fact, I call epigenetic marks a form of RAM.

    Ptashne doesn’t like calling epigenetics a form of memory, but it looks to me like memory. Would you feel uncomfortable with the analogy of epigenetic marks and computer Random Access Memory?

    For starters:

    I’ll try not to derail Dave Carlson’s OP and I’ll try to limit my discussion of this, but the question about “Epigenetics is like Random Access Memory” is vital to a paper I’m writing. If the discussion gets long, I’ll move it elsewhere.

  4. TomMueller,

    Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

    Why not negative (purifying) selection?

    Into the mix we’d have to add those species (e.g., muntjac) that seem to undergo more rapid rearrangements than our relatives. There may be a genomic distinction that causes fewer rearrangements in us, or selection against them which does not operate in muntjac, but it could simply be stochastic as well.

    Interesting picture of the interphase chromosome BTW.

  5. stcordova:
    I may have to publish in Vixra, but I’m writing it partly for my own education.

    Does that education include any geology or genetics or human population growth that you’re patently ignoring on your other “Black Swan” thread?

  6. stcordova,

    Would you feel uncomfortable with the analogy of epigenetic marks and computer Random Access Memory?

    You didn’t ask me, but I would. I do have a foot in both camps, the informatic and the molecular biological, and I say ‘nah’. Same with ‘code’ 😉 Analogies are like bananas with an umbrella stuck in them, except when they are more like invisible battleships …

    They are OK for illustration, but people can readily understand the actual thing. Analogies have a tendency to lead people astray, by over-egging the similarities and forgetting there must be differences too.

  7. Something I learned from one of the NIH researchers today on microRNAs that I think has relevance to the lncRNAs. There is a lot of redundancy with the microRNAs and their gene regulation is mostly tuning, and it’s not unusual for 9 microRNAs to be involved in the regulation of one protein. You knock out one, there isn’t much noticeable phenotypic effect. You knock out all 9, it’s lethal!

    One paper was suggested as a resource:

    DICER was mentioned as well in connection with microRNAs, and I found this one on my own:

    Though miRNAs are at the opposite pole (so to speak) from lncRNAs in length. These RNA discoveries are happening at an explosive pace.

    It seems some miRNAs derive from lncRNAs, so there is relevance there.

    But there is this discovery reported this year:

    Extensive microRNA-mediated crosstalk between lncRNAs and mRNAs in mouse embryonic stem cells

    Recently, a handful of intergenic long noncoding RNAs (lncRNAs) have been shown to compete with mRNAs for binding to miRNAs and to contribute to development and disease. Beyond these reports, little is yet known of the extent and functional consequences of miRNA-mediated regulation of mRNA levels by lncRNAs. To gain further insights into lncRNA-mRNA miRNA-mediated crosstalk, we re-analyzed transcriptome-wide changes induced by the targeted knockdown of over 100 lncRNA transcripts in mouse embryonic stem cells (mESCs). We predicted that on average almost one fifth of the transcript level changes induced by lncRNAs are dependent on miRNAs that are highly abundant in mESCs. We validated these findings experimentally by temporally profiling transcriptome-wide changes in gene expression following the loss of miRNA biogenesis in mESCs. Following the depletion of miRNAs, we found that over 50% of lncRNAs and their miRNA-dependent mRNA targets were up-regulated coordinately, consistent with their interaction being miRNA-mediated. These lncRNAs are preferentially located in the cytoplasm and the response elements for miRNAs they share with their targets have been preserved in mammals by purifying selection. Lastly miRNA-dependent mRNA targets of each lncRNA tended to share common biological functions. Post-transcriptional miRNA-mediated crosstalk between lncRNAs and mRNA, in mESCs, is thus surprisingly prevalent, conserved in mammals and likely to contribute to critical developmental processes.

    Received July 22, 2014.
    Accepted March 17, 2015.

    Published by Cold Spring Harbor Laboratory Press

    Btw, Cold Spring Harbor — they harbor some top talent! I’ve always been impressed by the research that comes out of that place.

  8. Alright some more lncRNA stuff fresh from my 1st two days of class in the NIH. 🙂

    The term epigenetic has been evolving. It usually means methylation marks and histone modification, but some will also include non-coding RNAs!

    Something to be aware of is that some cells can duplicate without DNA! For example beginning with this paper in 1966:

    It became apparent fertilized eggs could develop to the blastula stage (a few thousand cells) with no DNA in the nucleus, just left over mRNAs.

    So, at least for somatic cell lines, there are likely some transgenerational RNAs following mitotic divisions. It is thus arguably an “epigenetic” mark.

    I would not be surprised if the non-coding RNAs are performing some sort of monitoring and surveillance between mitotic divisions. There is short-term intergenerational information being stored there and we don’t realize it!!

    The lncRNA is involved with the PRC2 polycomb complex, which modifies histones; histones signal chromatin accessibility, and chromatin accessibility affects gene expression. Whew!

    These are the more technical details:

    Noncoding RNA and Polycomb recruitment

    A plethora of noncoding (nc) RNAs has been revealed through the application of high-throughput analysis of the transcriptome, and this has led to an intensive search for possible biological functions attributable to these transcripts. A major category of functional ncRNAs that has emerged is for those that are implicated in coordinate gene silencing, either in cis or in trans. The archetype for this class is the well-studied long ncRNA Xist which functions in cis to bring about transcriptional silencing of an entire X chromosome in female mammals. An important step in X chromosome inactivation is the recruitment of the Polycomb repressive complex PRC2 that mediates histone H3 lysine 27 methylation, a hallmark of the inactive X chromosome, and recent studies have suggested that this occurs as a consequence of PRC2 interacting directly with Xist RNA. Accordingly, other ncRNAs have been linked to PRC2 targeting either in cis or in trans, and here also the mechanism has been proposed to involve direct interaction between PRC2 proteins and the different ncRNAs. In this review, I discuss the evidence for and against this hypothesis, in the process highlighting alternative models and discussing experiments that, in the future, will help to resolve existing discrepancies.

    Genome-scale analysis of PRC2 targeting by lncRNAs

    The evidence implicating lncRNAs in targeting PRC2 to defined loci led to the idea that lncRNAs could play a wider role, directing PRC2 complexes to target loci genome-wide. PRC2 target loci have been mapped in several cell types using ChIP-seq methods. Peaks of occupancy number ∼1–2 × 10³ and occur in relatively broad domains that correspond closely with CpG islands of the associated genes (Ku et al. 2008). A recent study identified a novel type of promoter-associated short RNAs transcribed from CpG island promoters (Kanhere et al. 2010). A large proportion of the short RNAs were predicted to form stem–loop structures resembling those seen with the A-repeat region in Xist RNA. Consistent with such a link, EMSA assays indicated binding of the short RNAs to PRC2, and RIP assays demonstrated that short promoter RNAs co-IP with PRC2 subunits. The EMSA assays indicated an interaction between A-repeat/short promoter RNAs and the SUZ12 PRC2 subunit but not the EZH2 subunit, as was reported in studies on Xist and HOTAIR ncRNAs (Kaneko et al. 2010; Tsai et al. 2010). Further studies are required to define the domain in Suz12 responsible for ncRNA interaction and to resolve the apparent discrepancy regarding which PRC2 subunit interacts with the ncRNAs (see further discussion below).

  9. stcordova,
    Uhmmm… I think you are misreading Ptashne.

    Epigenetics implies by definition “memory” of sorts. Ptashne merely remarks that nucleosome modification be it DNA methylation or Histone acetylation plays a role in down regulating gene expression but plays no role in propagating “memory”. Memory propagation is done by transcription factors. I advise you to read the last few paragraphs of his letter.


  10. Allan Miller: Analogies are like bananas with an umbrella stuck in them, except when they are more like invisible battleships …

    Your analogy about bananas was delicious! … metaphorically speaking! 😉

  11. Allan Miller,

    Hi again Allan… btw I quoted you on the AP Biology teachers’ forum to which you do have access.

    Regarding chromosome architecture and its importance… I think it important! If so, it stands to reason that whatever contributes to chromosome architecture would be important even if redundant and even if commonalities across lineages represent convergent evolution, i.e. the TEs being tamed and subsequently co-opted need not be identical from lineage to lineage.

    but we rehash…

  12. Memory propagation is done by transcription factors.

    I think the RNAs have something to say. 🙂 If by memory propagation we also mean memory propagated during mitotic division, I think the RNAs play a role.
    Chromatin replication and epigenome maintenance

    Constance Alabert & Anja Groth

    Stability and function of eukaryotic genomes are closely linked to chromatin structure and organization. During cell division the entire genome must be accurately replicated and the chromatin landscape reproduced on new DNA. Chromatin and nuclear structure influence where and when DNA replication initiates, whereas the replication process itself disrupts chromatin and challenges established patterns of genome regulation. Specialized replication-coupled mechanisms assemble new DNA into chromatin, but epigenome maintenance is a continuous process taking place throughout the cell cycle. If DNA synthesis is perturbed, cells can suffer loss of both genome and epigenome integrity with severe consequences for the organism.

    A sneak peek behind the paywall; notice that ncRNAs are mentioned.

    Maintenance of epigenetic states

    In a simplified view, restoration of epigenetic states after replication involves three processes. As we have discussed above, the first two involve: transmission of chromatin marks to the two new DNA strands, relying on the symmetrical nature of mCpGs and random segregation of parental (H3–H4)₂ tetramers; and maturation processes linked to replication, such as histone deacetylation, DNA methylation and H3K9me1. In addition, chromatin states are shaped by a broad range of general chromatin maintenance mechanisms, including transcription-based processes, non-coding RNAs (ncRNAs), spreading of histone PTMs and crosstalk between marks. The prevailing view has been that rapid restoration of chromatin states after replication would be necessary for transmission of epigenetic information to daughter cells. However, emerging evidence shows that establishment of some methylation marks on new histones is a slow process that continues in daughter cells. Here, we highlight the cell cycle dynamics of chromatin restoration, and readers are referred to reviews on epigenetic memory for in-depth discussion of general maintenance mechanisms.

  13. Help me out here Allan… Not just protein, but RNA can also play the role of “transcription factor”? True or False?

  14. “transcription factor”?

    I was going by the wiki definition. Am I mistaken to use it? Is there another convention?

    I’m not a biologist.

    Transcription factor

    From Wikipedia, the free encyclopedia

    In molecular biology and genetics, a transcription factor (sometimes called a sequence-specific DNA-binding factor) is a protein that binds to specific DNA sequences, thereby controlling the rate of transcription of genetic information from DNA to messenger RNA.

  15. TomMueller,

    Sure, chromosome architecture is important, but I suspect it is quite plastic nonetheless. You see a lot of to-ing and fro-ing between acrocentric and metacentric chromosomes in mammals depending on the polarity of female meiosis, and without some level of plasticity, we’d be struggling to account for karyotype differences between species more generally.

    And you could not have selection for linkage disequilibrium if loci could not shift, potentially influencing regions of chromatin control.

  16. TomMueller,

    Help me out here Allan… Not just protein, but RNA can also play the role of “transcription factor”? True or False?

    I’m no expert, but I’d say ‘true’. But the convention does appear to be to say they are proteins, and nucleic acid regulators are therefore something else! As an RNA-World-er, I’d say nucleic acids must have taken the role initially.

  17. PIWI seems to regulate lncRNAs. If lncRNAs are regulated, it really suggests to me they have function!

    Transposable element dynamics and PIWI regulation impacts lncRNA and gene expression diversity in Drosophila ovarian cell cultures.
    Sytnikova YA, Rahman R, Chirn GW, Clark JP, Lau NC.

    Piwi proteins and Piwi-interacting RNAs (piRNAs) repress transposable elements (TEs) from mobilizing in gonadal cells. To determine the spectrum of piRNA-regulated targets that may extend beyond TEs, we conducted a genome-wide survey for transcripts associated with PIWI and for transcripts affected by PIWI knockdown in Drosophila ovarian somatic sheet (OSS) cells, a follicle cell line expressing the Piwi pathway. Despite the immense sequence diversity among OSS cell piRNAs, our analysis indicates that TE transcripts are the major transcripts associated with and directly regulated by PIWI. However, several coding genes were indirectly regulated by PIWI via an adjacent de novo TE insertion that generated a nascent TE transcript. Interestingly, we noticed that PIWI-regulated genes in OSS cells greatly differed from genes affected in a related follicle cell culture, ovarian somatic cells (OSCs). Therefore, we characterized the distinct genomic TE insertions across four OSS and OSC lines and discovered dynamic TE landscapes in gonadal cultures that were defined by a subset of active TEs. Particular de novo TEs appeared to stimulate the expression of novel candidate long noncoding RNAs (lncRNAs) in a cell lineage-specific manner, and some of these TE-associated lncRNAs were associated with PIWI and overlapped PIWI-regulated genes. Our analyses of OSCs and OSS cells demonstrate that despite having a Piwi pathway to suppress endogenous mobile elements, gonadal cell TE landscapes can still dramatically change and create transcriptome diversity.

    © 2014 Sytnikova et al.; Published by Cold Spring Harbor Laboratory Press.

    It’s those guys at Cold Spring Harbor again. Man they’re good.

  18. Here’s a big daddy repetitive lncRNA that’s megabases long and repeats a 359 bp long theme. They found a function for this behemoth.

    Repetitive centromeric satellite RNA is essential for kinetochore formation and cell division

    Chromosome segregation requires centromeres on every sister chromatid to correctly form and attach the microtubule spindle during cell division. Even though centromeres are essential for genome stability, the underlying centromeric DNA is highly variable in sequence and evolves quickly. Epigenetic mechanisms are therefore thought to regulate centromeres. Here, we show that the 359-bp repeat satellite III (SAT III), which spans megabases on the X chromosome of Drosophila melanogaster, produces a long noncoding RNA that localizes to centromeric regions of all major chromosomes. Depletion of SAT III RNA causes mitotic defects, not only of the sex chromosome but also in trans of all autosomes. We furthermore find that SAT III RNA binds to the kinetochore component CENP-C, and is required for correct localization of the centromere-defining proteins CENP-A and CENP-C, as well as outer kinetochore proteins. In conclusion, our data reveal that SAT III RNA is an integral part of centromere identity, adding RNA to the complex epigenetic mark at centromeres in flies.

  19. The following is my instructors’ suggested supplemental reading for lncRNAs this week. My class is about 75% focused on miRNAs, but since they are related to lncRNAs, the lncRNAs were covered this week in class.

    lncRNAs are kind of a funny classification. The 200 bp limit is a bit of an accident as the company who found them arbitrarily assigned the number. There are lncRNAs that may be shorter. Also, lncRNAs are sort of a miscellaneous category right now, so they aren’t well defined.

    I was surprised that introns, pseudogenes, and alternatively spliced exons can generate lncRNAs, so the classification is a bit of a hodgepodge right now.

    Anyway here is the suggested reading list:

    WEEK 2. Long Non-Coding RNAs: Biology and Functions
    1. Quinn JJ, Chang HY. (2015). “Unique features of long non-coding RNA biogenesis and function.” Nat Rev Genet. 17(1):47-62.

    2. Raveh E, et al. (2015). “The H19 Long non-coding RNA in cancer initiation, progression and metastasis – a proposed unifying theory.” Mol Cancer. 14(1): 184

    3. Hezroni H, et al. (2015). “Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.” Cell Rep. 11(7): 1110-22.

    4. Hung CL, et al. (2014). “A long noncoding RNA connects c-Myc to tumor metabolism.” Proc Natl Acad Sci U S A. 111(52): 18697-702.

    5. Liu YR, et al. (2015). “Long noncoding RNAs in hepatocellular carcinoma: Novel insights into their mechanism.” World J Hepatol. 7(28): 2781-91.

    6. Sigdel KR, et al. (2015). “The Emerging Functions of Long Noncoding RNA in Immune Cells: Autoimmune Diseases.” J Immunol Res. 2015: 848790.

  20. stcordova,

    Thanks for the references, Sal. I have a rather overwhelming number of other things to do at the moment, but I will try to check these out when I have some more time.

  21. I have a rather overwhelming number of other things to do at the moment

    Hey, I totally understand now that my semester is finally in swing.

    Good luck with your stuff too!

  22. This is one of the frontier papers of lncRNAs even though it was written in 2012. It’s behind a paywall, but I’ll give you the abstract:

    Epigenetic Regulation by Long Noncoding RNAs
    Jeannie T. Lee

    Recent studies show that transcription of the mammalian genome is not only pervasive but also enormously complex. It is estimated that an average of 10 transcription units, the vast majority of which make long noncoding RNAs (lncRNAs), may overlap each traditional coding gene. These lncRNAs include not only antisense, intronic, and intergenic transcripts but also pseudogenes and retrotransposons. Do they universally have function, or are they merely transcriptional by-products of conventional coding genes? A glimpse into the molecular biology of multiple emerging lncRNA systems reveals the “Wild West” landscape of their functions and mechanisms and the key problems to solve in the years ahead toward understanding these intriguing macromolecules.

    It was a very enlightening read. It echoes the sentiments of the NIH researchers.

  23. This is a free article. You can get the PDF from this link:

    Cell cycle regulation by long non-coding RNAs


    The mammalian cell cycle is precisely controlled by cyclin-dependent kinases (CDKs) and related pathways such as the RB and p53 pathways. Recent research on long non-coding RNAs (lncRNAs) indicates that many lncRNAs are involved in the regulation of critical cell cycle regulators such as the cyclins, CDKs, CDK inhibitors, pRB, and p53. These lncRNAs act as epigenetic regulators, transcription factor regulators, post-transcription regulators, and protein scaffolds. These cell cycle-regulated lncRNAs mainly control cellular levels of cell cycle regulators via various mechanisms, and may provide diversity and reliability to the general cell cycle. Interestingly, several lncRNAs are induced by DNA damage and participate in cell cycle arrest or induction of apoptosis as DNA damage responses. Therefore, deregulations of these cell cycle regulatory lncRNAs may be involved in tumorigenesis, and they are novel candidate molecular targets for cancer therapy and diagnosis.


    The STAT3-Binding Long Noncoding RNA lnc-DC Controls Human Dendritic Cell Differentiation


    Long noncoding RNAs (lncRNAs) play important roles in diverse biological processes; however, few have been identified that regulate immune cell differentiation and function. Here, we identified lnc-DC, which was exclusively expressed in human conventional dendritic cells (DCs). Knockdown of lnc-DC impaired DC differentiation from human monocytes in vitro and from mouse bone marrow cells in vivo and reduced capacity of DCs to stimulate T cell activation. lnc-DC mediated these effects by activating the transcription factor STAT3 (signal transducer and activator of transcription 3). lnc-DC bound directly to STAT3 in the cytoplasm, which promoted STAT3 phosphorylation on tyrosine-705 by preventing STAT3 binding to and dephosphorylation by SHP1. Our work identifies a lncRNA that regulates DC differentiation and also broadens the known mechanisms of lncRNA action.
