Some evidence ALUs and SINES aren’t junk and garbologists are wrong

Larry Moran, Dan Graur and other garbologists (promoters of the junkDNA perspective) have argued SINES and ALU elements are non-functional junk. That claim may have been a quasi-defensible position a decade ago, but real science marches forward. Dan Graur can only whine and complain about the hundreds of millions of dollars spent at the NIH and elsewhere that now strengthen his unwitting claim in 2013, “If ENCODE is right, Evolution is wrong.”

Larry said in Junk in Your Genome: SINES

In humans, the largest family of SINEs is called Alu elements after the fact that the sequence is cleaved by the restriction endonuclease Alu. These SINEs are also derived from 7SL RNA but the rearrangement is different from that in mouse. (They have a common ancestor.) There are about one million Alu elements in the human genome.

SINEs make up about 13% of the human genome. The largest proportion, by far, is Alu elements but there are small numbers of SINEs derived from other cellular RNAs such as the U RNAs required for splicing and snoRNAs (Garcia-Perez et al. 2007).

SINEs are parasites (selfish DNA). They are not essential for human survival and reproduction, especially the huge majority of SINEs that are defective. Thus, at least 13% of the human genome is clearly junk. The total amount of junk DNA contributed by all transposable elements is 44% of the genome (Kidwell 2005).

Thursday, February 07, 2008

Where to begin? First off, Larry’s claim was made over 8 years ago. Larry ran the risk of becoming the butt of jokes since a scientific discovery here and there could overturn his precarious claims.

Below is a video associated with a 2015 paper from the National Academy of Sciences, a mere 7 years after Larry’s claim was made. One important aspect of SINES is the CTCF binding site motifs often found in them. The motifs can’t be randomly located, otherwise they would not properly create functional chromatin extrusion loops. Further, these CTCF binding site motifs must be coordinated to “point” in the right direction many base pairs away in order for these extrusion loops to form. See this amazing video of extrusion loops and CTCF binding sites (which are often found in SINES):

Waves of Retrotransposon Expansion Remodel Genome Organization and CTCF Binding in Multiple Mammalian Lineages

CTCF-binding locations represent regulatory sequences that are highly constrained over the course of evolution [sic]. To gain insight into how these DNA elements are conserved and spread through the genome, we defined the full spectrum of CTCF-binding sites, including a 33/34-mer motif, and identified over five thousand highly conserved, robust, and tissue-independent CTCF-binding locations by comparing ChIP-seq data from six mammals. Our data indicate that activation of retroelements has produced species-specific expansions of CTCF binding in rodents, dogs, and opossum, which often functionally serve as chromatin and transcriptional insulators.

We therefore searched for an alternative mechanism for the de novo creation in a common mammalian ancestor of the thousands of CTCF-binding events now found throughout mammals. Despite the generally high conservation of CTCF motif-word usage, we noted that specific sets of motif-words were overrepresented in rodents (mouse and rat), dog, and opossum (Figure 4A). We found that the vast majority of these overrepresented motif-words are embedded within SINE transposons (Figures 4B and S4).
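The orientation point made above (CTCF motifs having to “point” toward each other across a loop) can be illustrated with a minimal sketch. It assumes the commonly described convergent arrangement, in which the upstream motif lies on the forward strand and the downstream motif on the reverse strand. The `MotifSite` type, the coordinates, and the adjacent-pair rule are all hypothetical simplifications for illustration, not the method of either paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotifSite:
    position: int  # hypothetical coordinate in base pairs
    strand: str    # "+" for forward strand, "-" for reverse strand


def is_convergent(a: MotifSite, b: MotifSite) -> bool:
    """Return True if the upstream site is '+' and the downstream site is '-'."""
    left, right = (a, b) if a.position <= b.position else (b, a)
    return left.strand == "+" and right.strand == "-"


def convergent_pairs(sites):
    """List neighbouring motif pairs that face each other (toy adjacency rule)."""
    ordered = sorted(sites, key=lambda s: s.position)
    return [(x, y) for x, y in zip(ordered, ordered[1:]) if is_convergent(x, y)]


# Made-up example sites; only the first neighbouring pair is convergent.
sites = [
    MotifSite(1_000, "+"),
    MotifSite(250_000, "-"),
    MotifSite(400_000, "-"),
    MotifSite(900_000, "+"),
]
pairs = convergent_pairs(sites)
for left, right in pairs:
    print(f"convergent pair: {left.position} ({left.strand}) -> {right.position} ({right.strand})")
```

Real loop calling works from contact maps and ChIP-seq data rather than from adjacency alone, so this only shows what “pointing in the right direction” means for a pair of sites.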

The following 2015 paper lists many roles of ALU elements, about 7 years after Larry’s claims about ALUs were vomited onto the internet. Will he feel as confident now about his claims or will retractions be forthcoming?

The role of Alu elements in the cis-regulation of RNA processing

Alu elements are an important engine for functional diversity within the primate transcriptome. As building blocks of extra genetic material, retroelements are used to invent new ways to vary mRNA. The almost 300 nt long Alu element is an ideal player for several reasons: (1) Alus are frequently inserted into non-coding regions of pre-mRNAs, (2) when transcribed, they easily form stable secondary structures that seed a number of different RNA processing events, and (3) small changes to their sequence make them targets for a number of RNA-binding proteins that regulate gene expression. Depending on its location and specific sequence, the Alu element can induce different RNA processing events (Fig. 5). If two inverted Alus reside on each side of an exon, they can form a double-stranded RNA structure that may induce back-splicing and the formation of circular RNA. An intronic Alu with a mutated or edited sequence can induce alternative splicing or Alu exonization. Inverted repeat elements can also contribute to transcript variations in a more fine-tuned manner by inducing A-to-I editing within coding sequence. Also, Alus in introns and 3′UTRs can provide both miRNAs and their target sequences. In this review, we have only highlighted a few examples of how Alu elements may contribute to transcriptome variation in primates. These effects certainly combine with the better explored genomic variations that Alus create. Future studies will most likely reveal additional mechanisms on how these elements modulate our genetics.

Here is an example from that paper:
[Image: Alu example, Figure 5]

Possible Alu-induced RNA processing events. a Inverted Alus on each side of an exon that form a dsRNA structure may induce exonic RNA circularization. b An intronic Alu with a mutated or edited sequence can induce alternative splicing and/or Alu exonization. c Inverted Alu elements forming a dsRNA structure frequently induce A-to-I editing at nearby sites. d Within introns Alus can contribute to maturation of miRNAs. e Alu elements in 3′UTRs may act as miRNA targets.

154 thoughts on “Some evidence ALUs and SINES aren’t junk and garbologists are wrong”

  1. “Larry ran the risk of becoming the butt of jokes since a scientific discovery here and there could overturn his precarious claims.”

    Thank God! Irony isn’t dead after all.

  2. Sal,
    The first paper you cite is very interesting. The second appears to be a somewhat speculative review.
    Let’s discuss the first paper, then:

    In what way do you think that Schmidt et al. 2012 refutes Larry Moran’s claims?
    Please be very precise in your use of language.

  3. At least 50% of the human genome consists of transposon-related and virus-related sequences. The vast majority is defective transposon and viral sequences including bits and pieces of once-active units.

    There are several well-documented examples of sequences like these that have secondarily acquired a new function. We don’t know how many examples will be found in the future but even the most optimistic estimate is fewer than a few thousand. That’s a minuscule fraction of the entire genome.

    Opponents of junk DNA go out of their way to avoid this fact. What they do is point out a few examples of former transposon sequences that have acquired a function and pretend that ALL 50% of the genome now has a function.

    So, I ask Sal the obvious question. How much of the genome is devoted to the functions you are describing and how important is this in the junk DNA debate?
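The “minuscule fraction” claim in the comment above can be checked with back-of-the-envelope arithmetic. This is a rough sketch, not a measurement: the genome size of 3.2 billion base pairs is an assumed round figure, the 300 bp element length comes from the “almost 300 nt” Alu length quoted in the post, and the element counts are illustrative stand-ins for “a few thousand” examples versus the roughly one million Alu copies mentioned earlier.

```python
GENOME_BP = 3_200_000_000  # assumed round figure for the haploid human genome
ELEMENT_BP = 300           # "almost 300 nt" Alu length quoted in the post


def fraction_of_genome(n_elements, element_bp=ELEMENT_BP, genome_bp=GENOME_BP):
    """Fraction of the genome covered by n_elements of the given length."""
    return n_elements * element_bp / genome_bp


# Illustrative counts: a few thousand co-opted elements versus ~1 million Alus.
for n in (1_000, 5_000, 1_000_000):
    print(f"{n:>9,} elements -> {fraction_of_genome(n):.4%} of the genome")
```

Under these assumptions, even several thousand functional elements cover well under a tenth of a percent of the genome, while a million full-length copies would cover several percent.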

  4. Larry Moran: How much of the genome is devoted to the functions you are describing and how important is this in the junk DNA debate?

    Sal, do you suppose it will turn out to be as important to the junk DNA debate as IDEA clubs have been to the ID movement?

    (Larry, you may not know that Sal reported here that Somebody Important told him that there is probably no other creationist who has the grasp of the ENCODE results that he does. Beyond that, he has made A’s in a couple of grad courses at the NIH. No matter how limited, his education is brand new, and yours is old. And he is of course Very Very Smart to begin with. You may once have been smart, but now you’re just an encrustation of the Darwinian Establishment.)

  5. Tom English: Somebody Important told him that there is probably no other creationist who has the grasp of the ENCODE results that he does.

    I can believe that. Nobody grasps things the way Sal does.

  6. Salvador,

    If Larry Moran has ever vomited onto the Internet, then you have never done more than to shit on it. You might want to be a bit more careful about calibration.

  7. There is no more difficulty presented by examples of ‘functional’ ALU and SINE fragments than by the observation that some mutations are beneficial. Some /= all, nor anything like.

    The Have Cake/Eat Cake Dichotomy. Mutations are all harmful; transposons, through mutagenic effect, are rilly useful.

  8. So for what I can tell, this is the same crap Sal has been promoting about pseudogenes: find some secondary function for some minor portion of them and claim victory.

    Ironically, it seems to me like these things actually count as evidence of evolution of new functionality, something creos claim is impossible.

  9. Sal,

    I don’t know how you can use this paper to support your claims. The same data that show CTCF sites were used to show that there were TE expansions hundreds of millions of years ago. But you think life is only thousands of years old.

    I guess you didn’t want to throw out the baby with the bathwater.

  10. Creationists are constantly isolating “game changer” publications, while, as Larry has pointed out on his blog, those of us who work to assimilate research in our fields pull out stacks of 50 papers on particular topics.

    It is hugely ironic that, with all the “strengths and weaknesses” science-ed bills introduced in state legislatures around the U.S. each year, creationists rarely, if ever, discuss research running contrary to the theses they promote. I genuinely cannot recall a case in which a creationist said, “If I’m right about this, then W, X, Y, and Z must be wrong. So I need to figure out how so many scientists went so wrong regarding W, X, Y, and Z.”

  11. I’m a bit interested in whether Sal has published a book that contains illustrations by other people, without notification or permission. That would be a neat trick if you can get away with it.

  12. The cutest part of the post is when Sal adds “[sic]” after “the course of evolution”, just so nobody thinks he agrees with any evolutionists.

    Sal, why are you unwilling to discuss baraminology and “kinds?” I’ve offered you two posts in the last few weeks, and nothing from you. Would you join in if I tried a third? I’m thinking about one.

  13. Larry Moran:

    So, I ask Sal the obvious question. How much of the genome is devoted to the functions you are describing and how important is this in the junk DNA debate?

    Oh probably a small percent, JoeCoder says about 1%.

    But I was pointing out you might have been premature in your assertion that all SINES are parasitic. SINES are maybe a bit of a miscellaneous category, much as lncRNAs are a bit of a miscellaneous category. Now that we’re elucidating SINES, maybe it’s not so useful to aggregate all these types of SINES as one class since “miscellaneous” really isn’t one class.

    For lack of a better term, I’ll call SINES with CTCF binding sites CTCF-SINES. The orientation and positioning to effect chromatin extrusion doesn’t look random or parasitic to me.

    Regarding SINES that are ALUs which are involved in creating circular RNAs:

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543092/

    Circular RNAs are abundant, conserved, and associated with ALU repeats

    In summary, this genome-wide analysis based on RNase R resistance establishes circular RNA species as a common and abundant form of noncoding RNA. Using a conservative estimate, ecircRNAs appear to originate from >14% of transcribed genes in human fibroblasts and, in many cases, are more highly expressed than the associated, canonical mRNA. Circularization of transcribed exons is conserved between human and mice, is correlated with the presence of inverted repeats within long flanking introns, and may result from exon-skipping via a lariat structure or direct circularization driven by intronic pairing. In aggregate, these findings suggest that ecircRNAs are not the product of mere accidents of splicing but instead may regulate gene expression by affecting translation, RNAi, or through sequestration of RNA binding proteins.

    That doesn’t look like the ALUs have an incidental role in protein expression does it?

    And just this March, 2016:

    http://www.nature.com/articles/ncomms11215

    Circular RNA profiling reveals an abundant circHIPK3 that regulates cell growth by sponging multiple miRNAs

    Circular RNAs (circRNAs) represent a class of widespread and diverse endogenous RNAs that may regulate gene expression in eukaryotes. However, the regulation and function of human circRNAs remain largely unknown. Here we generate ribosomal-depleted RNA sequencing data from six normal tissues and seven cancers, and detect at least 27,000 circRNA candidates. Many of these circRNAs are differently expressed between the normal and cancerous tissues. We further characterize one abundant circRNA derived from Exon2 of the HIPK3 gene, termed circHIPK3. The silencing of circHIPK3 but not HIPK3 mRNA significantly inhibits human cell growth. Via a luciferase screening assay, circHIPK3 is observed to sponge to 9 miRNAs with 18 potential binding sites. Specifically, we show that circHIPK3 directly binds to miR-124 and inhibits miR-124 activity. Our results provide evidence that circular RNA produced from precursor mRNA may have a regulatory role in human cells.

    So, maybe some of this is a little early to declare what is and is not functional, but why the rush to judgement? If ALUs result in circular RNAs and circular RNAs are part of complex regulatory networks, that would be pretty cool.

    So what if these sorts of discoveries start to erode the mutational load problems of Ohno? Evolutionary biologists can always come up with some other theory that won’t really work but will at least get them by until the next discovery overturns their ideas.

    FWIW, I think there is a lot of junk in the genome because of the mutational load problem, but we don’t know what parts. But unlike you, I think the junk present now was once functional.

    But I do think it is a rush to judgement to call SINES and ALUs parasites, and if the papers I cite are vindicated, to continue to call them parasites is probably inappropriate.

    Thanks for your comments.

    PS
    some additional papers

    http://www.nature.com/ng/journal/v43/n11/full/ng.917.html

    Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals

    A fundamental challenge in biology is explaining the origin of novel phenotypic characters such as new cell types; the molecular mechanisms that give rise to novelties are unclear. We explored the gene regulatory landscape of mammalian endometrial cells using comparative RNA-Seq and found that 1,532 genes were recruited into endometrial expression in placental mammals, indicating that the evolution of pregnancy was associated with a large-scale rewiring of the gene regulatory network. About 13% of recruited genes are within 200 kb of a Eutherian-specific transposable element (MER20). These transposons have the epigenetic signatures of enhancers, insulators and repressors, directly bind transcription factors essential for pregnancy and coordinately regulate gene expression in response to progesterone and cAMP. We conclude that the transposable element, MER20, contributed to the origin of a novel gene regulatory network dedicated to pregnancy in placental mammals, particularly by recruiting the cAMP signaling pathway into endometrial stromal cells.

    and

    http://www.nature.com/ng/journal/v42/n7/abs/ng.600.html

    Transposable elements have rewired the core regulatory network of human embryonic stem cells

    Detection of new genomic control elements is critical in understanding transcriptional regulatory networks in their entirety. We studied the genome-wide binding locations of three key regulatory proteins (POU5F1, also known as OCT4; NANOG; and CTCF) in human and mouse embryonic stem cells. In contrast to CTCF, we found that the binding profiles of OCT4 and NANOG are markedly different, with only ~5% of the regions being homologously occupied. We show that transposable elements contributed up to 25% of the bound sites in humans and mice and have wired new genes into the core regulatory network of embryonic stem cells. These data indicate that species-specific transposable elements have substantially altered the transcriptional circuitry of pluripotent stem cells.

  14. Sal says,

    But unlike you, I think the junk present now was once functional.

    Are you kidding me? What ever gave you the impression that I believed something else?

    More than 50% of the human genome is derived from transposons and viruses that used to be functional. The vast majority of these sequences consist of bits and pieces of the original transposons and viruses and/or nearly intact sequences with mutations that render them non-functional. They look like junk. They are junk … with a few exceptions.

    I believe, just like you, that present junk came from functional sequences.

  15. Larry Moran: I believe, just like you, that present junk came from functional sequences.

    We may have a disagreement over the meaning of “functional”. After all, working LINEs, SINEs and endogenous viruses are not conserved by selection, which if I recall was one of your definitions of “functional”. The question to me is whether they’re functional with respect to the entity that owns the replication machinery (i.e. the human whose genome they occupy), which they clearly are not. Of course, populations of LINEs and SINEs in the genome do undergo selection of a sort, since broken ones don’t create new insertions, but that’s something different. I’ll grant you pseudogenes, though.

  16. I’ll grant you pseudogenes, though

    Note the diagram of the microRNA regulatory network and the PTEN bubble that represents the PTEN gene/protein/expression. Not shown in the diagram is the PTEN pseudogene PTENpg1 that acts as an miRNA sponge. Pseudogenes can participate in gene regulation.

    http://www.nature.com/nsmb/journal/v20/n4/full/nsmb.2516.html

    A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells

    PTEN is a tumor-suppressor gene that has been shown to be under the regulatory control of a PTEN pseudogene expressed noncoding RNA, PTENpg1. Here, we characterize a previously unidentified PTENpg1-encoded antisense RNA (asRNA), which regulates PTEN transcription and PTEN mRNA stability. We find two PTENpg1 asRNA isoforms, α and β. The α isoform functions in trans, localizes to the PTEN promoter and epigenetically modulates PTEN transcription by the recruitment of DNA methyltransferase 3a and Enhancer of Zeste. In contrast, the β isoform interacts with PTENpg1 through an RNA-RNA pairing interaction, which affects PTEN protein output through changes of PTENpg1 stability and microRNA sponge activity. Disruption of this asRNA-regulated network induces cell-cycle arrest and sensitizes cells to doxorubicin, which suggests a biological function for the respective PTENpg1 expressed asRNAs.

    Who knows, this could be more prevalent than assumed. We’re only beginning to catalogue miRNA regulatory networks and their interaction with pseudogene transcripts.

  17. BTW, Sal, if you think there’s a lot of junk in the genome, then ENCODE was wrong, because 20% junk can’t possibly be reconciled with the mutational load argument. Don’t you think?

  18. stcordova: Note the diagram of the microRNA regulatory network and the PTEN bubble blah, blah, blah…

    Once more, the fallacy that if one pseudogene has a function, they all probably do. And right after having that pointed out to you regarding retroelements.

  19. The video says there are about 10,000 ‘extrusion domains’. Let’s say all of them are associated with stalling at a CTCF binding sequence contained within a SINE. Does that make any significant inroad into explaining the bulk of these elements? These elements that, however one wants to characterise them, make copies of themselves in the millions?

    Consider this: you have a sequence that is capable of transposition. Is it not likely that this will lead to ‘overrepresentation’ of everything in it, once you have made a few million copies?
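The point in the comment above, that copying a transposable sequence inflates the count of every word it carries, can be illustrated with a toy simulation. Everything here is hypothetical: the random background, the arbitrary 12-mer standing in for a motif word, the element sequence, and the copy number are chosen only to show the effect, and the model ignores mutation, selection, and real insertion biases.

```python
import random


def count_word(seq, word):
    """Count (possibly overlapping) occurrences of word in seq."""
    return sum(1 for i in range(len(seq) - len(word) + 1) if seq.startswith(word, i))


def insert_copies(genome, element, n_copies, rng):
    """Insert n_copies of element at random positions of the original genome."""
    cuts = sorted(rng.randrange(len(genome) + 1) for _ in range(n_copies))
    pieces, prev = [], 0
    for cut in cuts:
        pieces.append(genome[prev:cut])
        pieces.append(element)
        prev = cut
    pieces.append(genome[prev:])
    return "".join(pieces)


rng = random.Random(0)                 # fixed seed so the run is repeatable
background = "".join(rng.choice("ACGT") for _ in range(20_000))
motif = "CCCTCTAGTGGC"                 # arbitrary 12-mer, illustration only
element = "GGCCGGGCGC" + motif + "AAAAAAAAAA"  # made-up element carrying the word
n_copies = 50

before = count_word(background, motif)
expanded = insert_copies(background, element, n_copies, rng)
after = count_word(expanded, motif)
print(f"motif occurrences before: {before}, after {n_copies} insertions: {after}")
```

In this toy model the word becomes overrepresented purely because the element carrying it was copied, which is the alternative explanation the comment raises; it says nothing either way about whether any given copy is functional.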

  20. stcordova,

    FWIW, I think there is a lot of junk in the genome because of the mutational load problem

    So you think that, in the 6000 year history of the planet, an accumulation of slightly deleterious mutations has led to extensive pseudogenisation? How does that even work, population-genetically? Not to mention the fact that most of the genome isn’t pseudogene.

  21. Sal… let’s say for argument’s sake that we could reduce the size of the human genome by 90% – something many geneticists think we could do (including popular bloggers who are all more qualified than you such as Moran, Myers, Graur, Gregory, Matzke, etc.)

    If we could reduce the size of the human genome by 90%, then would we really need all those ALUs to help with genome folding? Probably not! Most animals don’t have ALUs (a class of SINE particular to primates) and through trial and error nature has found other ways to order and fold their genomes. Also there are animals with trimmed down genomes containing almost no junk DNA (such as the Fugu which have roughly the same number of genes as humans) – in these animals nature has also found a way to order and fold their genomes.

    So if the only function ALUs have is to help us cope with an enormous genome, which is bloated thanks to all the repetitive elements such as ALUs that have duplicated themselves over millions of years, then can you really argue that they are functional?

    You are effectively arguing that something is functional because its stated purpose is to help us cope with having so many copies of that thing.

  22. Dazz:

    You garbologist!

    A true garbologist insists it is junk without even looking and considering possibilities. I listed examples where garbology theory was proven wrong:

    SINES with CTCF binding
    ALUs that generate circRNAs
    pseudo genes that serve as miRNA sponges

    I said I think there is lots of junk, but I don’t know how much and I don’t go around insisting entire classes of DNA are automatically junk because of evolutionary theory.

    Humans appeared to have been much stronger and smarter in ages past. We’ve lost capability based on examination of human remains in the past so I think we can no longer use some of what we once had.

    AceofSpades:

    including popular bloggers who are all more qualified than you such as Moran, Myers, Graur, Gregory, Matzke, etc.)

    Your list is nothing compared to the 442 researchers who were part of ENCODE and Roadmap across 32 institutions, conducting 588 million dollars of research on 40 classes of experiments such as those depicted below. Some of the experiments, like 5C, relate to the experiments that elucidated the chromatin extrusion loops mentioned in the OP.

    Collectively this research consortium dwarfs whatever PZ Myers says, and PZ has published next to nothing in the last 15 years except his Happy Atheist flop of a book. You want to keep citing him, go ahead. As for Matzke, he couldn’t even bring himself to answer a simple math question that a six-year-old could answer: whether 500 fair coins lying 100% heads on a table were the product of chance or not. So two names on your list aren’t exactly something to brag about.

    You also keep looking at the repetition of ALUs as evidence of junk. VLSI circuits of computers are repetitive arrays, but that doesn’t imply they are junk! It is the dynamic changes in VLSI states that make them part of a complex system. In an analogous manner, ALUs and the RNAs they are transcribed into are dynamic in form due to methylation of the DNA (an epigenetic change) and Adenosine to Inosine changes (a post-transcriptional change).

    Here is a paper on ALUs and embryonic stem cells:

    http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0011173

    Adenosine to Inosine (A-to-I) RNA editing is a site-specific modification of RNA transcripts, catalyzed by members of the ADAR (Adenosine Deaminase Acting on RNA) protein family. RNA editing occurs in human RNA in thousands of different sites. Some of the sites are located in protein-coding regions but the majority is found in non-coding regions, such as 3′UTRs, 5′UTRs and introns – mainly in Alu elements. While editing is found in all tissues, the highest levels of editing are found in the brain. It was shown that editing levels within protein-coding regions are increased during embryogenesis and after birth and that RNA editing is crucial for organism viability as well as for normal development. In this study we characterized the A-to-I RNA editing phenomenon during neuronal and spontaneous differentiation of human embryonic stem cells (hESCs). We identified high editing levels of Alu repetitive elements in hESCs and demonstrated a global decrease in editing levels of non-coding Alu sites when hESCs are differentiating, particularly into the neural lineage. Using RNA interference, we showed that the elevated editing levels of Alu elements in undifferentiated hESCs are highly dependent on ADAR1. DNA microarray analysis showed that ADAR1 knockdown has a global effect on gene expression in hESCs and leads to a significant increase in RNA expression levels of genes involved in differentiation and development processes, including neurogenesis. Taken together, we speculate that A-to-I editing of Alu sequences plays a role in the regulation of hESC early differentiation decisions.

    And something to note:

    http://embor.embopress.org/content/6/9/831

    Adenosine‐to‐inosine (A‐to‐I) RNA editing increases the complexity of the human transcriptome and is essential for maintenance of normal life in mammals. Most A‐to‐I substitutions occur within repetitive elements in the genome, mainly in Alu repeats. The phenomenon of A‐to‐I editing is far less abundant in mice, rats, chickens and flies than in humans, which correlates with the relative under‐representation of Alu repeats in these non‐primate genomes. Here, we review the recent results of bioinformatic and laboratory approaches that have estimated the extent of the editing phenomenon.

    ….
    The human editosome

    The ADAR family acts on double‐stranded RNA (dsRNA), which is formed by the coupling of complementary regions within a single transcript. Adar1‐knockout mice show an embryonic lethal phenotype (Wang et al, 2000; Hartner et al, 2004), whereas Adar2‐knockout mice suffer from convulsions and die prematurely (Higuchi et al, 2000). A third enzyme, ADAR3, contains both single‐ and double‐stranded RNA‐binding domains, although its deaminating activity has not yet been proven (Chen et al, 2000).

    The nucleoside inosine (I) is interpreted as guanosine (G) by the translation and splicing machinery, therefore there are several biological consequences of A‐to‐I substitutions: a change of codon, which leads to amino‐acid substitution (as exemplified by the glutamate receptor GluR‐B; reviewed in Seeburg et al, 1998); the insertion or elimination of a splice site (Rueter et al, 1999); or, potentially, the disruption of a stop codon. In addition to these recoding events, A‐to‐I editing might have other regulatory effects. For example, interferon‐induced ADAR1 is active on viral RNA (Patterson & Samuel, 1995). Although it has been proposed that viruses benefit from A‐to‐I substitutions (Polson et al, 1996), massive hyper‐editing of viral transcripts could be a cellular protection mechanism (Scadden & Smith, 1997). Recent works have also emphasized the multitude of A‐to‐I editing events in repetitive elements in the genome, the meaning of which has not yet been elucidated.

    and more recently from PNAS:

    http://www.ncbi.nlm.nih.gov/pubmed/20566853

    Human and chimpanzee genomes are almost identical, yet humans express higher brain capabilities. Deciphering the basis for this superiority is a long sought-after challenge. Adenosine-to-inosine (A-to-I) RNA editing is a widespread modification of the transcriptome. The editing level in humans is significantly higher compared with nonprimates, due to exceptional editing within the primate-specific Alu sequences, but the global editing level of nonhuman primates has not been studied so far. Here we report the sequencing of transcribed Alu sequences in humans, chimpanzees, and rhesus monkeys. We found that, on average, the editing level in the transcripts analyzed is higher in human brain compared with nonhuman primates, even where the genomic Alu structure is unmodified. Correlated editing is observed for pairs and triplets of specific adenosines along the Alu sequences. Moreover, new editable species-specific Alu insertions, subsequent to the human-chimpanzee split, are significantly enriched in genes related to neuronal functions and neurological diseases. The enhanced editing level in the human brain and the association with neuronal functions both hint at the possible contribution of A-to-I editing to the development of higher brain function. We show here that combinatorial editing is the most significant contributor to the transcriptome repertoire and suggest that Alu editing adapted by natural selection may therefore serve as an alternate information mechanism based on the binary A/I code.

  23. John Harshman: Once more, the fallacy that if one pseudogene has a function, they all probably do. And right after having that pointed out to you regarding retroelements.

    What’s more, he even says in this very thread that he acknowledges there is lots of junk.

    His entire issue is with a strawman he invented in his head. He’s arguing against a position nobody holds, that absolutely ALL retroelements and various forms of ncDNA are claimed to be junk.

    His opening post begins:

    Larry Moran, Dan Graur and other garbologists (promoters of the junkDNA perspective), have argued SINES and ALU elements are non-functional junk.

    By not actually explaining that these people openly state that NOT all of it is junk, he can then go on to “debunk” this constructed strawman-view and pretend evolutionary biologists and biochemists are somehow at odds with the latest science.

    And all of this has been pointed out to him before. What a despicable person he is.

  24. From the PNAS paper above:

    These findings bring to mind information storage models. As the number of potential editing sites in each Alu-containing transcript is high, usually several dozens, the potential for combinatorial encrypted information is enormous. Binary use of A or I in millions of sites in the neural cell transcriptome can be considered equivalent to the 0’s and 1’s used for information storage and processing by computers. It is tempting to speculate that the more abundant RNA editing found in the human brain may contribute to the more advanced human capabilities such as memory, learning, and cognition. This suggestion is consistent with the hypothesis that the advantage of complex organisms lies in the development of a digital programming system based on noncoding RNA signaling (46, 56). The combinatorial posttranscriptional RNA editing of noncoding sequences may therefore contribute to higher brain functions and may play a role in the evolution of human specialization.

    “Binary use of A or I in millions of sites in the neural cell transcriptome can be considered equivalent to the 0’s and 1’s used for information storage and processing by COMPUTERS.”

    BWAHAHA!

  25. stcordova: A true garbologist insists it is junk without even looking and considering possibilities.

    Then a true garbologist does not exist. And never did.

    For your argument to work, there has to be lots and lots of true garbologists, yet you have still not found even a single one.

  26. stcordova,

    You read scientific papers with the same mindset people read the Bible: looking for apparent connections to some predefined idea and finding them, like a psycho finds weird connections everywhere he looks.

  27. Doesn’t the mutational load have some math to back up the 90% junk figure in humans? For all I know, it’s not only that there must be “lots” of junk; we can approximate how much of the genome is junk based on the mutational load argument.

    Sal
    I said I think there is lots of junk, but I don’t know how much

  28. dazz:
    Doesn’t the mutational load have some math to back up the 90% junk figure in humans? For all I know, it’s not only that there must be “lots” of junk; we can approximate how much of the genome is junk based on the mutational load argument.

    Larry M has posted detailed estimates of the percentage of functional DNA of each type, along with justifications for each estimate.

    All Sal has to do is take each of these numbers and agree or disagree.

    Have I missed something, or has Sal completely passed up this opportunity?

    ETA: Here’s a link. I don’t think this is the very latest, but it should be close.

    http://sandwalk.blogspot.com/2008/02/theme-genomes-junk-dna.html

  29. From the link:

    Transposable Elements: (44% junk)

    DNA transposons:
    active (functional): <0.1%
    defective (nonfunctional): 3%
    retrotransposons:
    active (functional): <0.1%
    defective transposons
    (full-length, nonfunctional): 8%
    L1 LINES (fragments, nonfunctional): 16%
    other LINES: 4%
    SINES (small pseudogene fragments): 13%
    co-opted transposons/fragments: <0.1% [a]
    [a] Co-opted transposons and transposon fragments are those that have secondarily acquired a new function.

    Viruses (9% junk)

    DNA viruses
    active (functional): <0.1%
    defective DNA viruses: ~1%
    RNA viruses
    active (functional): <0.1%
    defective (nonfunctional): 8%
    co-opted RNA viruses: <0.1% [b]
    [b] Co-opted RNA viruses are defective integrated virus genomes that have secondarily acquired a new function.

    Pseudogenes (1.2% junk)
    (from protein-encoding genes): 1.2% junk
    co-opted pseudogenes: <0.1% [c]
    [c] Co-opted pseudogenes are formerly defective pseudogenes that have secondarily acquired a new function.

    Ribosomal RNA genes:
    essential 0.22%
    junk 0.19%

    Other RNA encoding genes
    tRNA genes: <0.1% (essential)
    known small RNA genes: <0.1% (essential)
    putative regulatory RNAs: ~2% (essential)

    Protein-encoding genes: (9.6% junk)
    transcribed region:
    essential 1.8%
    intron junk (not included above) 9.6% [d]
    [d] Intron sequences account for about 30% of the genome. Most of these sequences qualify as junk but they are littered with defective transposable elements that are already included in the calculation of junk DNA.

    Regulatory sequences:
    essential 0.6%

    Origins of DNA replication
    <0.1% (essential)

    Scaffold attachment regions (SARS)
    <0.1% (essential)

    Highly Repetitive DNA (1% junk)
    α-satellite DNA (centromeres)
    essential 2.0%
    non-essential 1.0%
    telomeres
    essential (less than 1000 kb, insignificant)

    Intergenic DNA (not included above)
    conserved 2% (essential)
    non-conserved 26.3% (unknown but probably junk)

    Total Essential/Functional (so far) = 8.7%
    Total Junk (so far) = 65%
    Unknown (probably mostly junk) = 26.3%
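
The totals above can be checked by adding up the figures in the list. The sketch below is only a tally of the numbers as quoted. The dictionary names are made up for illustration, and the "<0.1%" entries are left out because the list gives no exact value for them, which is presumably why the listed essential items come out a little under 8.7%.

```python
import math

# Headline junk figures from the quoted list (percent of genome).
junk = {
    "transposable elements": 44.0,
    "viruses": 9.0,
    "pseudogenes": 1.2,
    "ribosomal RNA genes (junk part)": 0.19,
    "intron junk": 9.6,
    "highly repetitive DNA": 1.0,
}

# Essential figures that have an explicit number in the list.
# Entries given only as "<0.1%" are omitted (no exact value is stated).
essential = {
    "ribosomal RNA genes": 0.22,
    "putative regulatory RNAs": 2.0,
    "protein-encoding transcribed region": 1.8,
    "regulatory sequences": 0.6,
    "alpha-satellite DNA": 2.0,
    "conserved intergenic DNA": 2.0,
}

# Breakdown of the 44% transposable-element figure.
transposable_parts = [3.0, 8.0, 16.0, 4.0, 13.0]

unknown = 26.3

junk_total = sum(junk.values())
essential_listed = sum(essential.values())
transposable_total = sum(transposable_parts)

print(f"junk: {junk_total:.2f}%")
print(f"essential (listed values only): {essential_listed:.2f}%")
print(f"transposable elements: {transposable_total:.1f}%")
print(f"stated totals add to: {8.7 + 65 + unknown:.1f}%")
```

On these inputs the junk items add to roughly 65% and the transposable-element items add to 44%, which matches the stated figures.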

  30. Larry is clearly no “garbologist” according to Sal’s definition of the term.
    I think a retraction and an apology are in order here.

  31. stcordova,

    A true garbologist insists it is junk without even looking and considering possibilities.

    There is, therefore, no true garbologist.

    I don’t go around insisting entire classes of DNA are automatically junk because of evolutionary theory.

    Nor does anyone else.

    [eta – posted before I saw other replies saying the same thing. Whack that straw Sal! Whack it good!]

  32. stcordova,

    Your list is nothing compared to the 442 researchers who were part of ENCODE and Roadmap across 32 institutions […]

    The hell you say. And all of them agree with Ewan Birney on interpretation, and disagree with Graur? You’ve checked?

  33. stcordova: “Oh probably a small percent, JoeCoder says about 1%.”

    I consider Sal a good friend and we had been discussing these ALU’s, but he must’ve misunderstood something I said because I don’t think these 5000 ALU’s constitute 1% of the genome. Likely far less.

    Wikipedia says “it is estimated that about 10.7% of the human genome consists of Alu sequences.” The source above says they found five thousand CTCF binding sites. 5000 out of 1 million ALU’s is 0.5%. 0.5% times 10.7% is only about 0.0535% of the whole genome (0.000535 as a fraction). And even that amount is assuming every nucleotide of those 5000 ALU’s is functional.

    I’m just an enthusiast and I don’t have much formal training in biology, so I welcome anyone who can refine or correct these numbers.
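
For anyone who wants to redo the estimate above, here is a minimal sketch of the arithmetic. The function and parameter names are hypothetical. The inputs are the figures quoted in the comment (5,000 ALUs with binding sites, about 1 million ALUs in total, ALUs as 10.7% of the genome), and it assumes, as the comment does, that every nucleotide of those 5,000 ALUs counts.

```python
def alu_binding_site_share(alus_with_sites=5_000,
                           total_alus=1_000_000,
                           alu_share_of_genome=0.107):
    """Fraction of the whole genome covered by the ALUs that carry binding sites."""
    share_of_alus = alus_with_sites / total_alus   # 0.5% of all ALUs
    return share_of_alus * alu_share_of_genome     # scale by the ALU share of the genome


fraction = alu_binding_site_share()
print(f"fraction of genome: {fraction:.6f}")
print(f"as a percentage:    {fraction * 100:.4f}%")
```

The product is 0.000535 as a fraction, which is about 0.0535% when written as a percentage.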

  34. Wikipedia says “it is estimated that about 10.7% of the human genome consists of Alu sequences.”

    Thanks for weighing in, Joe. However, the PNAS paper on the role of primate ALUs in the development and operation of the brain could make the functionality of all ALUs possible.

    Given that BitCoin has a redundancy factor of 440, and that the brain also has substantial redundancies, from the perspective of redundant RAM, the ALUs could arguably be all functional, at least in terms of reserve memory capacity.

    As a caution however, we really don’t completely understand the chemical basis for how the brain remembers and learns. We know the epigenetic memory (in terms of histone modifications) may play a role in human memory, and now the ncRNAs as storage devices may play a role (the EDITOSOME), but there is the yet undiscovered territory of the glycome.

    I like that, yet another “-ome”: the RNA editosome. 🙂

  35. BTW, Sal, if you think there’s a lot of junk in the genome, then ENCODE was wrong. Because 20% junk can’t possibly be reconciled with the mutational load argument. Don’t you think?

    Even without ENCODE, and using Larry’s figures for the fraction of the genome he thinks is functional, we may be past the limit of mutational load that can be carried in the human genome. That limit, according to Muller is around 1 bad mutation per individual per generation.

    Suppose 10% of the human genome is functional, that’s about 330,000,000 base pairs that are under constraint. That’s plenty of opportunity for more than 1 defect to emerge per individual per generation. Even synonymous changes can create functional compromise, btw.

    I’ll tell you the main reason I think a lot of the genome has to have function. 330,000,000 bases equal 660,000,000 bits of information which is about 82.5 megabytes of information. That doesn’t seem like quite enough information to provide manufacturing instructions to make something as complex as the human brain or human immune system or the eye or the hearing system, etc.

    That intuition is based on how much software we need to make systems that seem inferior to the human brain.

    One solution is that a lot of information is outside the genome, like in the glycome and epiproteome and who knows where else.
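
The storage figure in this comment can be reproduced with a few lines. This is just a sketch of the conversion, assuming 2 bits per base (four possible nucleotides) and decimal megabytes (1,000,000 bytes). The function name is made up for illustration.

```python
BITS_PER_BASE = 2          # four nucleotides -> 2 bits each (assumption)
BITS_PER_BYTE = 8
BYTES_PER_MEGABYTE = 1_000_000


def genome_information(bases):
    """Return (bits, megabytes) needed to store `bases` nucleotides."""
    bits = bases * BITS_PER_BASE
    megabytes = bits / BITS_PER_BYTE / BYTES_PER_MEGABYTE
    return bits, megabytes


bits, megabytes = genome_information(330_000_000)
print(f"{bits:,} bits = {megabytes} MB")
```

With 330,000,000 bases this gives 660,000,000 bits, or 82.5 MB, the same numbers as in the comment.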

  36. stcordova: Even without ENCODE, and using Larry’s figures for the fraction of the genome he thinks is functional, we may be past the limit of mutational load that can be carried in the human genome. That limit, according to Muller is around 1 bad mutation per individual per generation.

    So why not look up Larry’s posts on that very topic and tell us where his numbers are wrong.

    Or engage some of the people here who are actual experts on population genetics. It’s not like the topic hasn’t been studied.

  37. stcordova,

    I think I remember Larry’s answer to this, and IIRC regulatory genes are a lot more lenient to mutation. With 2% of the genome coding for proteins, we only get 2 mutations there, and that’s well within the capabilities of purifying selection to avoid mutational meltdown. (I seem to remember something about being haploid making things twice as good.)

  38. stcordova: I’ll tell you the main reason I think a lot of the genome has to have function. 330,000,000 bases equal 660,000,000 bits of information which is about 82.5 megabytes of information. That doesn’t seem like quite enough information to provide manufacturing instructions to make something as complex as the human brain or human immune system or the eye or the hearing system, etc.

    But why assume that everything has to be specified in the DNA?

    As I see it, much of the information needed for the brain or the eye comes from reality, rather than from the DNA. Adaptive feedback, during development, adjusts the growing structure so that it works well. I don’t think it could work any other way.

  39. stcordova: That intuition is based on how much software we need to make systems that seem inferior to the human brain.

    So you can’t see how a brain / human being could be designed with so few genes… so much for the design inference.

  40. Petrushka,

    You’re no longer on my ignore list. Congratulations. 🙂

    Larry’s estimate:

    Essential/Functional (so far) = 8.7%

    http://sandwalk.blogspot.ca/2011/05/whats-in-your-genome.html

    If ENCODE’s 80% figure is right, then we get about 45-82 mutations per individual per generation. See page 13, where I quote Graur:

    http://www.creationevolutionuniversity.org/public_blogs/skepticalzone/paper31.pdf

    So Larry’s estimate is about 10% the size of the ENCODE estimate, and a back-of-the-envelope scaling brings the number of mutations under his estimate to 4.5 to 8.2 per generation per individual. Now one might say synonymous mutations or tolerable point mutations don’t matter. But we now know synonymous mutations do matter because of regulatory targeting and binding site specificity for molecular machines like the Vitamin D receptor, micro RNAs, and who knows what else! And one doesn’t want to be just on the edge of the Muller limit. To be within the Muller limit we’d have to be invoking Ohno’s numbers or worse, like say 2%-5% functionality.
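
The back-of-the-envelope scaling above can be written out as a short sketch. It assumes the number of mutations that matter scales linearly with the functional fraction of the genome, which is the assumption the paragraph makes implicitly. The function name is hypothetical. The second call uses the 8.7% figure directly instead of rounding the ratio to 10%.

```python
def scale_mutation_range(low, high, ratio):
    """Scale a (low, high) mutation range by the ratio of functional fractions."""
    return low * ratio, high * ratio


# Range quoted for an 80% functional genome.
ENCODE_RANGE = (45, 82)

# Rounded ratio used in the comment: "about 10% the size of the ENCODE estimate".
rough = scale_mutation_range(*ENCODE_RANGE, ratio=0.10)

# Unrounded ratio from the two stated percentages: 8.7% / 80%.
unrounded = scale_mutation_range(*ENCODE_RANGE, ratio=0.087 / 0.80)

print(f"rounded ratio:   {rough[0]:.1f} to {rough[1]:.1f}")
print(f"unrounded ratio: {unrounded[0]:.1f} to {unrounded[1]:.1f}")
```

With the rounded 10% ratio the range is 4.5 to 8.2. With 8.7/80 it comes out slightly higher, roughly 4.9 to 8.9, which does not change the argument.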

    I’m just saying it’s premature to ignore any class of DNA as non-functional. If we had done that we would have been impeded in seeing the role of ALU repetitive elements in brain development and operation. The insistence on junkDNA, the garbological viewpoint, is a science stopper.

    I think there is some junk in our DNA, but I don’t know how much, and I certainly won’t classify repetitive elements a priori as junk, especially now in light of A-to-I editing and the existence of the epitranscriptome.

    The evidence of some junk is testified by the abundance of birth defects and compromised functions in the human population (like allergies and myopia, etc.).

    As I said, at an intuitive level, the 82.5 megabytes doesn’t seem to provide enough developmental instruction to create the human nervous system or immune system, maybe not even lesser systems in the human body.

  41. stcordova: The evidence of some junk is testified by the abundance of birth defects and compromised functions in the human population (like allergies and myopia, etc.).

    You lost me there
