Science Friday: A Molecular Biology & Evolution Sampler

My favorite subject-specific journal is Molecular Biology and Evolution (MBE).  This journal publishes on topics primarily related to molecular evolution and evolutionary genomics, which are among my favorite subjects in biology. I’m happy to report that the latest issue of MBE is out today, and there are lots of great articles that I think will be of interest to folks here, many of which are open-access.

I sadly don’t have time to write up any of these articles, but I thought it might be useful to “sample” a few in case any of you would like to read and discuss them.  Here are a handful that seem particularly interesting:

Population Structure Shapes Copy Number Variation in Malaria Parasites (open-access)

Abstract:

If copy number variants (CNVs) are predominantly deleterious, we would expect them to be more efficiently purged from populations with a large effective population size (Ne) than from populations with a small Ne. Malaria parasites (Plasmodium falciparum) provide an excellent organism to examine this prediction, because this protozoan shows a broad spectrum of population structures within a single species, with large, stable, outbred populations in Africa, small unstable inbred populations in South America and with intermediate population characteristics in South East Asia. We characterized 122 single-clone parasites, without prior laboratory culture, from malaria-infected patients in seven countries in Africa, South East Asia and South America using a high-density single-nucleotide polymorphism/CNV microarray. We scored 134 high-confidence CNVs across the parasite exome, including 33 deletions and 102 amplifications, which ranged in size from <500 bp to 59 kb, as well as 10,107 flanking, biallelic single-nucleotide polymorphisms. Overall, CNVs were rare, small, and skewed toward low frequency variants, consistent with the deleterious model. Relative to African and South East Asian populations, CNVs were significantly more common in South America, showed significantly less skew in allele frequencies, and were significantly larger. On this background of low frequency CNV, we also identified several high-frequency CNVs under putative positive selection using an FST outlier analysis. These included known adaptive CNVs containing rh2b and pfmdr1, and several other CNVs (e.g., DNA helicase and three conserved proteins) that require further investigation. Our data are consistent with a significant impact of genetic structure on CNV burden in an important human pathogen.
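
As a rough illustration of the statistic behind an FST outlier scan, the sketch below computes Wright's FST for a single biallelic locus from per-population allele frequencies. It assumes equally weighted populations and takes hypothetical frequencies as input; the abstract does not describe the paper's actual estimator or pipeline, so this should not be read as their method.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: frequency of one allele in each population
    (hypothetical inputs; populations are weighted equally).
    """
    if not freqs:
        raise ValueError("need at least one population")
    # Mean allele frequency across populations.
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity if all populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    if h_total == 0:
        return 0.0  # locus is fixed everywhere: no differentiation to measure
    return (h_total - h_within) / h_total
```

A variant whose FST sits far in the upper tail of the genome-wide distribution is a candidate for population-specific selection, which is the logic the abstract appeals to for rh2b and pfmdr1.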

No Accumulation of Transposable Elements in Asexual Arthropods (open-access)

Abstract:

Transposable elements (TEs) and other repetitive DNA can accumulate in the absence of recombination, a process contributing to the degeneration of Y-chromosomes and other nonrecombining genome portions. A similar accumulation of repetitive DNA is expected for asexually reproducing species, given their entire genome is effectively nonrecombining. We tested this expectation by comparing the whole-genome TE loads of five asexual arthropod lineages and their sexual relatives, including asexual and sexual lineages of crustaceans (Daphnia water fleas), insects (Leptopilina wasps), and mites (Oribatida). Surprisingly, there was no evidence for increased TE load in genomes of asexual as compared to sexual lineages, neither for all classes of repetitive elements combined nor for specific TE families. Our study therefore suggests that nonrecombining genomes do not accumulate TEs like nonrecombining genomic regions of sexual lineages. Even if a slight but undetected increase of TEs were caused by asexual reproduction, it appears to be negligible compared to variance between species caused by processes unrelated to reproductive mode. It remains to be determined if molecular mechanisms underlying genome regulation in asexuals hamper TE activity. Alternatively, the differences in TE dynamics between nonrecombining genomes in asexual lineages versus nonrecombining genome portions in sexual species might stem from selection for benign TEs in asexual lineages because of the lack of genetic conflict between TEs and their hosts and/or because asexual lineages may only arise from sexual ancestors with particularly low TE loads.

 

Evolution of Prdm Genes in Animals: Insights from Comparative Genomics (open-access)

Abstract:

Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes.

This next one is on a topic that comes up here from time to time, and I think it will be of interest to many of you.  Sadly, it’s paywalled, but if you don’t have access through a university library, feel free to send me a pm.

Are Human Translated Pseudogenes Functional?

Abstract:

By definition, pseudogenes are relics of former genes that no longer possess biological functions. Operationally, they are identified based on disruptions of open reading frames (ORFs) or presumed losses of promoter activities. Intriguingly, a recent human proteomic study reported peptides encoded by 107 pseudogenes. These peptides may play currently unrecognized physiological roles. Alternatively, they may have resulted from accidental translations of pseudogene transcripts and possess no function. Comparing between human and macaque orthologs, we show that the nonsynonymous to synonymous substitution rate ratio (ω) is significantly smaller for translated pseudogenes than other pseudogenes. In particular, five of 34 translated pseudogenes amenable to evolutionary analysis have ω values significantly lower than 1, indicative of the action of purifying selection. This and other findings demonstrate that some but not all translated pseudogenes have selected functions at the protein level. Hence, neither ORF disruption nor presence of protein product disproves or proves gene functionality at the protein level.
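
For readers unfamiliar with the statistic, ω is the ratio of the nonsynonymous substitution rate (d_N) to the synonymous substitution rate (d_S). The standard reading, given here as background rather than as the paper's exact procedure, is:

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying selection on the protein sequence} \\
\omega \approx 1 & \text{neutral evolution, as expected for a nonfunctional pseudogene} \\
\omega > 1 & \text{positive (diversifying) selection}
\end{cases}
```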

There are a lot of other interesting papers in this issue, but for the sake of brevity, I’ll stop here.  Happy reading!

205 thoughts on “Science Friday: A Molecular Biology & Evolution Sampler”

  1. LINE-1 retrotransposons occupy about 17% of the human genome. Widely regarded as junk.

    It turns out they transpose in somatic cells and are apparently essential for enabling neurons to differentiate. It is pretty important that neurons differentiate so that we can think and remember and store information in an orderly way. We can remember smells, tastes, sounds, sights, ideas because neurons are differentiated. Only recently has the L1 “junk” been realized to have function. Had I had the junk DNA debate 10 years ago, I’d not have been in as good a position as today:

    The finding that L1 retrotransposons are active in somatic cells and have structural and functional consequences in neuronal genomes challenges the dogma that neurons are genetically stable entities (Muotri et al., 2005; Coufal et al., 2009; Baillie et al., 2011; Evrony et al., 2012; Perrat et al., 2013). The consequences of L1 activity in neurons resulting in neuronal mosaicism are evident throughout embryo development and in the adult brain (Thomas et al., 2012). This novel mechanism may contribute to genomic neuronal diversity across neurons in the same individual: the “one human, multiple genomes” phenomenon that has been demonstrated in a number of different cell types (Lupski, 2013).

    Current tally of function:

    coding region : 2%
    introns (including Alus of 10%) : 23-25% possibly
    LINE-1 (L1) retrotransposons, other LINEs: 17% possibly

    44% possibly functional.

  2. Sal, to colewd:

    ENCODE is considered medical science mainstream. The NIH has thrown 288 million dollars at the ENCODE project and another 188 million at its sister project Roadmap. That’s almost half a billion dollars wagered on the hypothesis the genome has lots of function.

    You may feel a tad alone among the TSZ regulars, but you are not alone as far as the medical researchers who study non-coding DNA.

    No, Sal, the NIH is not wagering that the genome has lots of function, and in particular, it isn’t wagering that the percentage of junk DNA is small. It is wagering that the ENCODE project is worthwhile.

    You keep pretending that the NIH is betting against junk DNA. Here, for example:

    I don’t worry about Larry Moran and Dan Graur. Francis Collins is putting his money where his mouth isn’t, on exploring JunkDNA. To the tune of 600 million dollars:

    ENCODE : about 200 million
    Roadmap: about 200 million
    E4: about 200 million

    ” If ENCODE is right, then Evolution is wrong.” Dan Graur.

    There is a 600 million dollar bet by the NIH that ENCODE is right.

    False. The NIH is betting that the ENCODE project is worthwhile, not that the inflated functionality figures are correct.

  3. stcordova:
    LINE-1 retrotransposons occupy about 17% of the human genome. Widely regarded as junk.

    It turns out they transpose in somatic cells and are apparently essential for enabling neurons to differentiate. It is pretty important that neurons differentiate so that we can think and remember and store information in an orderly way. We can remember smells, tastes, sounds, sights, ideas because neurons are differentiated. Only recently has the L1 “junk” been realized to have function. Had I had the junk DNA debate 10 years ago, I’d not have been in as good a position as today:

    Current tally of function:

    coding region : 2%
    introns (including Alus of 10%) : 23-25% possibly
    LINE-1 (L1) retrotransposons, other LINEs: 17% possibly

    44% possibly functional.

    You’re just grasping at straws there. Pretending that, because some minuscule percentage of retrotransposons have been found to have secondarily acquired functions, it is reasonable to postulate that the rest of them are “possibly” functional is the ‘possible, therefore probable’ fallacy.

  4. stcordova: Introns are about 25% of the genome. They are considered junk. If introns are functional that figure of 90% goes down to 73% – 25% introns + 2% coding is 27% functional, therefore 73% at most is junk.

    Here is a paper on introns and their function.
    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3325483/

    So you have found a paper that suggests a function for some introns.

    Run the numbers Sal. Don’t just throw out these random references and then say “maybe”.
    Having identified a functional intron does not all introns functional make.
    Having identified an interaction, or some sort of activity in a genomic region, does not mean that region has important organismal function of any sort. If all it took to identify a functional region was some sort of activity, then by definition the whole genome would be functional merely because it is copied at cell division.

    How many functional introns were identified and what proportion of ALL introns do they make up?

    And then remember, if it really is junk and it’s not conserved, then it is mutating over generations. Sooner or later, something will mutate such that it happens to be transcribed or possibly translated, or becomes similar to a regulatory region of some sort (so that regulatory proteins and RNAs can recognize it and bind to it), and by chance it might secondarily acquire a function. A calculation of probability will show that this is rare, but with a big genome it’s bound to happen. In other words, we expect some of the junk regions to be active and we expect some of the active regions to be functional (and therefore, of course, not junk).

    But that means we can’t just make these simplistic inferences that “some introns functional = all introns functional”. To make a case for it not being junk you have to actually determine the relative proportions.
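
To make the point about proportions concrete, here is a minimal sketch of the arithmetic. The 2% coding and 25% intron figures are the ones quoted above; the 1% figure for "introns with a demonstrated function" is purely hypothetical, chosen only to show how much the total depends on that one number.

```python
def functional_fraction(coding, intron_share, intron_functional_share):
    """Genome-wide functional fraction if only part of intronic DNA is functional.

    All arguments are fractions in [0, 1]:
      coding                  - share of the genome that is coding sequence
      intron_share            - share of the genome that is intronic
      intron_functional_share - share of intronic DNA shown to be functional
    """
    return coding + intron_share * intron_functional_share

# Scenario quoted above: every intron is assumed functional.
all_introns = functional_fraction(0.02, 0.25, 1.0)
# Hypothetical alternative: only 1% of intronic DNA has a demonstrated function.
few_introns = functional_fraction(0.02, 0.25, 0.01)

print(f"all introns functional: {all_introns:.2%}")
print(f"1% of introns functional: {few_introns:.2%}")
```

The first case reproduces the 27% functional (73% junk) figure; the second gives 2.25%. A single paper showing function for some introns does not tell you which of these is closer to the truth.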

  5. Besides, Sal, every single one of your vague references to the possible function of some junk regions, besides committing the ‘possible, therefore probable’ fallacy, would fail the Onion test.

  6. Rumraket,

    some minuscule percentage of retrotransposons have been found

    Indeed we’ve only scratched the surface, and we might not have found it if we just sat back and accepted a faith-based approach that there is nothing but junk to find as you and others here have insisted.

    We don’t know, but you still insist we know and that it’s junk. The truth is you BELIEVE and put your faith and trust in an idea that doesn’t have direct empirical evidence, and you still stand by it despite the fact the medical community has found some hints that your faith-based idea is at risk of being falsified.

    The NIH is betting that the ENCODE project is worthwhile,

    OK, so that means we pursued ENCODE because we don’t really understand the genome as well as Flint claimed.

  7. stcordova:OK, so that means we pursued ENCODE because we don’t really understand the genome as well as Flint claimed.

    As usual, this is not honest. I didn’t claim the genome was fully understood, and this is clear from the comments of those who know far better than I. I claimed that we know it well enough to understand that you are misrepresenting it. I think probably your error here lies in making your religious motivations so obvious.

    On the one side, we have those whose understanding of the genome is sufficient to say that nearly all of it is junk — even if some of the “junk” turns out to be marginally functional or can possibly through mutation acquire a function later. On the other side, we have people like you, who take it as religious doctrine that there is no or minimal junk, and do everything you can to cherry-pick, cast doubt, misrepresent, misdirect, and otherwise undermine a religiously uncongenial set of observations.

    I don’t know why you can’t see that this effort is counterproductive. If you tried to make a case that the 90% junk estimates might be high, and maybe 80% could be closer, this would better disguise your religious convictions. Otherwise, NON-religious readers regard you as being right sometimes the same way a stopped clock is right sometimes – by accident and despite rather than because of your convictions.

  8. stcordova

    OK, so that means we pursued ENCODE because we don’t really understand the genome as well as Flint claimed.

    LOL! Sal keeps desperately trying to cram his YEC God into those ever shrinking gaps.

  9. stcordova: Indeed we’ve only scratched the surface, and we might not have found it if we just sat back and accepted a faith-based approach that there is nothing but junk to find as you and others here have insisted.

    But nobody has said trying to elucidate exactly what portions of the genome are functional and what that function is, isn’t worth doing.

    And I have to say I think your insinuation that the scientists who say it’s mostly junk are basing it on faith is downright stupid, because you surely MUST know by now that it isn’t true.
    There are actual arguments for why most of the genome is junk. They are arguments about proportions, based on multiple independent lines of investigation. For fucks sake Sal you know this, you can’t not know it at this point.

    And historically they actually started out believing it was all functional due to arguments from selection.

    And nobody has said it wasn’t worth doing the ENCODE project. We want to know the transcription levels of the genome, because these can help identify which regions are functional and which probably aren’t. And this in turn can help identify regions more worth studying to determine what their functions are. All of this is important work; nobody who thinks there’s mostly junk insists it isn’t worth finding out which parts in particular are and aren’t functional. You are imagining a scenario that simply doesn’t take place and assigning thoughts and motivations to scientists they don’t actually have.

    Why do you spew manifest falsehoods you KNOW are false?

  10. Rumraket:No, that isn’t the truth and you KNOW it isn’t the truth. You know it is a falsehood and you are now knowingly and deliberately telling something you know is untrue.

    I have come to disagree with this. I’m convinced that Sal “knows” everything through the Will To Believe, and that either the evidence supports him or the evidence is wrong, but evidence is really irrelevant. And I’m convinced that Sal simply cannot imagine any other way of “knowing” anything. If your faith has doctrines about DNA different from his, he uses the traditional religious way of resolving such disputes — he SAYS your faith is false. That makes it false.

  11. It’s not so much a lie as a violation of Lizzie’s prime directive: thou shalt not question the sincerity of another poster.

    As far as I can tell, everything else is secondary.

    One of the reasons science transcends religion and politics is that motive simply doesn’t matter. You get results or you don’t.

  12. petrushka:
    It’s not so much a lie as a violation of Lizzie’s prime directive: thou shalt not question the sincerity of another poster.

    As far as I can tell, everything else is secondary.

    One of the reasons science transcends religion and politics is that motive simply doesn’t matter. You get results or you don’t.

    Gauger and Axe get plenty of results, all of them bogus. Most of us would suspect this is because they structure their experiments inappropriately. We might even speculate as to what motivates them to do this.

    Similarly, Sal cites plenty of results. Some of us seem to sense a distinctive pattern to his citations, and note that this pattern, however artificial or forced, always lines up with his religious delusions. This MAY not be a coincidence.

  13. Flint: As usual, this is not honest. I didn’t claim the genome was fully understood,

    Accusing others of dishonesty is against the rules.

    Yes, technically, you did not claim the genome was fully understood.

    Flint: FIRST, nearly all of the genome is understood.

    Nearly all the genome is understood, or not. Extraordinary claims require extraordinary evidence.

    ETA: Near as I can tell Salvador didn’t claim that you said the genome was fully understood.

  14. Rumraket: That makes you a lying sack of shit. I implore you to stop being a lying sack of shit.

    This is against the rules, but apparently you have a special dispensation.

  15. Mung: Accusing others of dishonesty is against the rules.

    Yes, technically, you did not claim the genome was fully understood.

    Nearly all the genome is understood, or not. Extraordinary claims require extraordinary evidence.

    ETA: Near as I can tell Salvador didn’t claim that you said the genome was fully understood.

    Sal is determined to misrepresent what I said. My understanding is that the genome is understood WELL ENOUGH to establish that Sal’s a priori convictions are WAY WAY WAY off base. My understanding is based on those posters, all of whom are far more informed than I am, who mostly agree with one another and continue to point out Sal’s error to him. Which he simply ignores.

    But I will give Sal the benefit of the doubt here. For him, the evidence simply does not matter. If his faith requires falsehood, he is genuinely sincere in excreting that falsehood. Over and over. This is one of those theological paradoxes I admit I don’t understand. Apparently he can post falsehoods repeatedly, KNOWING they are false, but do so honestly. It’s a mystery.

  16. Mung: This is against the rules, but apparently you have a special dispensation.

    I suggest you ask for a refund on the price of entry then, sir.

  17. Mung: Nearly all the genome is understood, or not. Extraordinary claims require extraordinary evidence.

    If you want the evidence, open any contemporary book on molecular biology and biochemistry. There’s fifty years’ worth of accumulated evidence and understanding of genomes in there. Combine it with genetic load, interspecies genome size variations, the Human Genome Project and, ironically, the results of the ENCODE experiment (the actual results, the transcription assays, not the press releases), and that’s what tells us the genome is mostly junk.

  18. Flint: But I will give Sal the benefit of the doubt here. For him, the evidence simply does not matter. If his faith requires falsehood, he is genuinely sincere in excreting that falsehood. Over and over. This is one of those theological paradoxes I admit I don’t understand. Apparently he can post falsehoods repeatedly, KNOWING they are false, but do so honestly. It’s a mystery.

    “By definition, no apparent, perceived, or claimed evidence in any field, including history and chronology, can be valid if it contradicts the scriptural record” – Answers In Genesis

    “I think Martin Luther correctly distinguished between what he called the magisterial and ministerial uses of reason. The magisterial use of reason occurs when reason stands over and above the gospel like a magistrate and judges it on the basis of argument and evidence. The ministerial use of reason occurs when reason submits to and serves the gospel…. Should a conflict arise between the witness of the Holy Spirit to the fundamental truth of the Christian faith and beliefs based on argument and evidence, then it is the former which must take precedence over the latter.
    …We’ve already said that it’s the Holy Spirit who gives us the ultimate assurance of Christianity’s truth. Therefore, the only role left for argument and evidence to play is a subsidiary role.”
    – William Lane Craig

    “Second, if the magisterial role of reason were valid, then a person who had been given poor arguments for Christianity would have a just excuse before God for not believing in him. Suppose someone had been told to believe in God because of an invalid argument. Could he stand before God on the judgement day and say, “God, those Christians only gave me a lousy argument for believing in you. That’s why I didn’t believe”? Of course not! The bible says all men are without excuse. Even those who are given no good reason to believe and many persuasive reasons to disbelieve have no excuse, because the ultimate reason they do not believe is that they have deliberately rejected God’s Holy Spirit.” – William Lane Craig, Reasonable Faith, page 37

    I rest my case, Christianity makes people clinically insane.

  19. Allan Miller,

    Nearly 99% of the human genome does not encode proteins, and while there recently has been extensive biochemical annotation of the remaining noncoding fraction, it remains unclear whether or not the bulk of these DNA sequences have important functional roles. By comparing the genome sequences of different species we identify genomic regions that have evolved unexpectedly slowly, a signature of natural selection upon functional sequence. Using a high resolution evolutionary approach to find sequence showing evolutionary signatures of functionality we estimate that a total of 8.2% (7.1–9.2%) of the human genome is presently functional, more than three times as much than is functional and shared between human and mouse. This implies that there is an abundance of sequences with short lived lineage-specific functionality. As expected, most of the sequence involved in this functional “turnover” is noncoding, while protein coding sequence is stably preserved over longer evolutionary timescales. More generally, we find that the rate of functional turnover varies significantly across categories of functional noncoding elements. Our results provide a pan-mammalian and whole genome perspective on how rapidly different classes of sequence have gained and lost functionality down the human lineage.

    Allan, thanks for the paper. It was very interesting and helpful. The methodology they used to reach their conclusion was mutational turnover as compared to species that are part of the tree of life. They conclude that a large amount of sequential turnover in non coding DNA between animals means lack of function. Is this how you interpret their methods? How does this differ from Encode? I would hypothesize that the very large difference in guesstimates would be caused by different assumptions.

  20. And I have to say I think your insinuation that the scientists who say it’s mostly junk are basing it on faith is downright stupid, because you surely MUST know by now that it isn’t true.

    So what experimental technique do you cite that establishes the genome is 90% junk?

    Knock out doesn’t count since we can knock out a spare tire from a car, and it can still run, but it doesn’t mean the spare tire has no function.

    So if they don’t have experiments, it’s just a faith-based assumption, not scientific experimental basis.

    On the other hand if experimentalists put forward evidence of function, I accept that as scientific evidence.

    So what experiments exist that establish non-function of 90% of the genome? But even with a knockout experiment (which I said has problems for establishing non-function), do you think we can knock out 90-98% of the human genome and not expect some serious health issues if not lethality?

  21. colewd: How does this differ from Encode? I would hypothesize that the very large difference in guesstimates would be caused by different assumptions.

    It’s based on very different types of biochemical assays. The ENCODE project did assays that essentially measure the number of places in the genome where a transcription factor will bind and initiate transcription, and then measure the level of that transcription (how many copies are produced in the cells).

    They then, because basically the entire genome is at some point transcribed, use this fact to conclude that therefore all of that transcribed DNA must be functional.

    Problem is, large parts of those transcripts are expressed at levels below a single copy per cell.

    From first principles of DNA binding proteins, this result is predicted even for deliberately constructed, entirely random and nonfunctional DNA.

    In fact, there have been studies on synthetic and random DNA that mirror the transcription maps detected by the ENCODE project (how many places are transcribed and how many transcripts are produced when they are).

    This is pretty good evidence that ENCODE are making unjustified extrapolations from their results in their press releases and paper abstracts.
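
The “first principles” point can be sketched with a back-of-the-envelope calculation: how many exact matches to a short binding motif would you expect in completely random DNA? The function below is only an illustration. It assumes independent bases at 25% each, a single fixed motif, and both strands counted; the motif length and genome size used afterwards are hypothetical round numbers, not values from ENCODE.

```python
def expected_chance_sites(genome_length, motif_length):
    """Expected exact matches to one fixed motif in random DNA.

    Assumes independent bases with equal frequencies (A, C, G, T at 25% each)
    and counts both strands. Real binding tolerates mismatches, which would
    increase the number of chance sites rather than reduce it.
    """
    if motif_length <= 0 or motif_length > genome_length:
        raise ValueError("motif_length must be between 1 and genome_length")
    # Number of start positions where the motif could fit on one strand.
    positions = genome_length - motif_length + 1
    # Each position matches with probability (1/4)^motif_length; two strands.
    return 2 * positions / 4 ** motif_length
```

For a hypothetical 8-bp motif in a 3.2-billion-bp genome, `expected_chance_sites(3_200_000_000, 8)` comes out at roughly 100,000 sites arising by chance alone, none of which need be functional.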

  22. stcordova: do you think we can knockout 90-98% of the human genome and not expect some serious health issues if not lethality?

    You have to do actual research to determine which portions are functional. There’s a difference between knowing the relative proportions and knowing exactly which parts are the functional ones.

    Obviously you can’t just knock shit out randomly at a 90% level. You could do knock-out experiments, sure, but you’d first have to do a lot of comparative sequence analysis (inter- and intra-species) and even augment that with transcription level studies à la ENCODE. These two in combination would allow you to make rough estimates about which portions are most probably junk.
    For example, if some region shows no sequence conservation (varies a lot between species, possibly even between individuals) and is also transcribed at a level expected from spurious binding, that would represent a good area to try a knock-out experiment.

    Then you could proceed to do knock-outs of such regions one by one and see what happens. Probably in things like plants, flies and mice.

    Regardless, I don’t think knock-out experiments are required to show that most of the genome is junk. Comparative genetics and transcription level assays are actually more than enough to show this. Combine it with genetic load and inter-species genome size variations and it’s a done deal. Most of the genome is junk. Now we just have to figure out which particular parts, with a high degree of accuracy.

  23. Seriously, this is evidence from direct experimentation that the ENCODE project’s uber-hyped claims are baseless:

    Finding function in the genome with a null hypothesis

    To test DNA for function, we used a new technique to measure whether a piece of DNA can regulate a downstream gene (a barcoded DsRed reporter gene). One way to define functional DNA in the context of this experiment is ‘any piece of DNA that reproducibly regulates the reporter gene.’

    We tested about 2,000 native sequences from the genome (more about that in my next post), and, as a negative control, we also tested random DNAs, DNAs created by scrambling the sequences of genomic DNA.

    It turns out that most of the 1,300 random DNA sequences cause reproducible regulatory effects on the reporter gene.

  24. stcordova: So what experimental technique do you cite that establishes the genome is 90% junk?

    It isn’t just creationists who think that experiments, in labs, with white coats on and test tubes full of bubbling, colored liquids in hand, are the only valid way to gather data. But it’s true that a lot of creationists seem to think this. It’s wrong. Observational data is just as good as experimental data.

    And in this case we have the c-value paradox and its buddies the onion test and the fugu test, as well as sequence conservation or lack thereof, all of which tell us that there’s lots and lots of junk in the human genome, in the neighborhood of 90 percent.

  25. I, for one, think a spare tire is non-functional. It exists for potential function. Would it be odd if “spare genes,” genes for potential function, were found in the genome?

  26. Mung:
    I, for one, think a spare tire is non-functional. It exists for potential function. Would it be odd if “spare genes,” genes for potential function, were found in the genome?

    Depends on what you mean by “potential function”. A gene doesn’t have to do its thing all the time in order to evolve and be maintained by selection, just often enough that, on the average, organisms that have it have more reproductive success than those that don’t. A spare tire is functional because cars that have one do better (well, they don’t reproduce, but they have more success at being working cars) than those that don’t. But if you’re talking about a gene that has never done anything before, but just might do something in the future if it picks up a mutation, or the environment changes in a novel way, or it just starts being active for no particular reason, that gene is certainly non-functional and can neither evolve nor be maintained by selection.

  27. And in this case we have the c-value paradox and its buddies the onion test and the fugu test, as well as sequence conservation or lack thereof, all of which tell us that there’s lots and lots of junk in the human genome, in the neighborhood of 90 percent.

    ACGT sequences aren’t the only significant use of DNA. DNA serves as a scaffold for epigenetic machinery such as histones which are Random Access Memory units. The histones could easily be recruited for different uses in each organism, hence the amount of DNA required to function as a histone scaffold would vary between organisms.

    I might post on the hypothesis of DNA as a histone scaffold sometime, since it is reasonable to suppose that eukaryotes with high-powered brains may utilize the DNA-histone complexes as a means of learning and remembering.

    We also don’t know, partly because the chemistry is difficult, to what extent the RNAs floating around in the nuclear complex are epitranscriptomically modified. That is the 3rd of the 3 NIH initiatives after ENCODE and Roadmap, namely E4 (enabling exploration of the eukaryotic epitranscriptome). I do not know if that has been officially launched but the budget was comparable to ENCODE, in the 100-200 million range.

    Even assuming common ancestry is true, it does not immediately dictate level of function or lack thereof. No one should rightly claim we KNOW. We guess, we have faith, but without experimental evidence, it’s a statement of faith. That’s ok, but let’s not pretend the claim is actually experimentally demonstrated, and in fact maybe in principle it can’t be!

    However, the claim of “90% no function” can be falsified at least in principle, but it cannot be directly confirmed, hence the claim borders on not being science at all.

  28. stcordova:

    It’s hard to take seriously anyone who can say

    DNA serves as a scaffold for epigenetic machinery such as histones which are Random Access Memory units.

    or

    Even assuming common ancestry is true…

    In what way are histones RAM? And how can you discount all the evidence supporting common ancestry?

    Once again, naturally, you ignore almost everything I said and post as if I didn’t say it; for example, you reiterate the silly assumption that experimental science is the only valid science.

  29. Btw,

    Here is an illustration of how function was established for a section of DNA, specifically the section of DNA corresponding to a homolog of the Gcn5 gene in yeast.

    It was initially called P55, and its biochemical function (at least one of them) was elucidated with great difficulty. This is for a coding section; how much more difficult is it to elucidate function for non-coding regions such as an intron:

    http://www.pnas.org/content/92/14/6364.full.pdf

    The intron functional elucidation will be difficult because RNAs are very fragile.

    We don’t have the power right now to elucidate what goes on in the cellular RNA soups in the nuclear complex without destroying a lot of the RNAs, or at least disrupting how they really work in an undisturbed cell.

    Hopefully we can improve our techniques, but let’s stop being so presumptuous as to assert we KNOW. We don’t know. We’re only beginning to know.

  30. Mung:
    Onions don’t tell us squat about the human genome.

    Not directly, no. But they do tell us quite a bit about how genomes vary and constrain the plausible hypotheses on why. Fugu (which you forgot to mention) are more directly relevant.

  31. stcordova: Here is an illustration of how function was established for a section of DNA, specifically the section of DNA corresponding to a homolog of the Gcn5 gene in yeast.

    No, that’s a description of how the existence of a particular peptide was discovered. If you discover the sequence of that peptide you can link it to a bit of DNA sequence. But you don’t need the peptide to know that the DNA sequence has a function; the fact that it’s conserved over evolutionary time would tell you that long before you had detected the translated protein or had any notion of just what that protein did. Once again you seem to think that laboratory experiments are the only valid form of science.

  32. John Harshman: But they do tell us quite a bit about how genomes vary and constrain the plausible hypotheses on why. Fugu (which you forgot to mention) are more directly relevant.

    I’m a man. Ruled by my belly. Onion good. Fugu??? Is sold by McDonalds or Dominos as topping?

  33. It’s hard to take seriously anyone who can say

    DNA serves as a scaffold for epigenetic machinery such as histones which are Random Access Memory units.

    🙂

    Actually your comment gives me less incentive to take you seriously. 🙂

    Histones can be reversibly written and read (gee, RAM can be reversibly written and read):

    http://www.nature.com/nrd/journal/v13/n9/fig_tab/nrd4360_F1.html

    and

    http://www.sciencedirect.com/science/article/pii/S0955067405000190

    Methylation of histones: playing memory with DNA

    Nucleosomal histones can be methylated in vivo at multiple residues and defined methylation patterns are related to distinct functional readouts of chromosomal DNA. Histone methylation has emerged as an important post-translational modification involved in transcriptional regulation and genome integrity. Recent progress in determining the cis and trans determinants of this process revealed multiple roles for histone methylation in epigenetic memory of active and silent states. The analysis of imprinted, X-linked and heterochromatic sequences disclosed mechanistic similarities for heritable transcriptional repression, pointing to a common mode of action. Moreover, the view of histone methylation as a stable modification has recently been challenged by studies revealing a number of pathways that are capable of removing histone methylation. Thus, in addition to having great in vivo complexity, this modification appears more dynamic than was previously thought.

    See that bolded portion, memory baby, memory!

    And what is RAM without programming!

    http://www.nature.com/nsmb/journal/v20/n3/full/nsmb.2489.html

    Epigenetic programming and reprogramming during development

    and here is a good one (histones a part of epigenetic memory by the way):

    http://dev.biologists.org/content/139/21/3891.long

    Epigenetic memory and parliamentary privilege combine to evoke discussions on inheritance

    Understanding the basis of epigenetic memory is a fast-moving challenge in modern biology. At a recent Company of Biologists Workshop held at Steyning’s historic Wiston House, thirty researchers led by John Gurdon interrogated three central questions: how are cell type-specific programs generated, what mechanisms duplicate this programmatic information as cells divide, and how does epigenetics contribute to trans-generational inheritance? We report some of the emerging themes arising from this debate.

    From a paper in 2014 regarding histone in brain memory alone (note, they use the word “memory”):

    Another twist in the histone memory code

    For instance, histones may be post-transcriptionally modified (e.g., phosphorylation, methylation, acetylation, sumoylation and ubiquitination7). The combinations of these histone “marks” (sometimes referred to as the “histone code”) determine which transcription factors can bind to specific DNA regions to regulate the expression of particular genes. Chromatin architecture may also be modulated by the incorporation of histone variants8. The core histones come in 4 flavors (H2A, H2B, H3 and H4). Histone variants are non-allelic forms of the conventional core histones which are expressed at comparatively lower levels. All conventional histones, except H4, have histone variants. One of the most highly conserved histone variant is H2A.Z (which may replace H2A).

    Although great strides have been made in understanding the transcriptional processes mediating memory formation, the present paper by Zovkic and colleagues9 represents the first to examine a potential role for the incorporation of histone variants in memory formation.
    ….

    Memory formation and storage involves an orchestrated response of finely-tuned processes across several brain regions. By showing a unique, brain region-dependent and time-limited role for histone variants in the process, the current study adds yet another layer of complexity (a new twist in the histone code) to this process. We stand in awe of this complexity and are reminded of that sentiment forwarded by a pioneering memory researcher. To paraphrase Karl S Lashley11, we sometimes feel, in reviewing the evidence, that the necessary conclusion is that learning (and retention) is just not possible. It is difficult to conceive of a set of mechanisms that can be so finely tuned to allow for it. Nevertheless, in spite of such evidence against it, learning sometimes does occur. Although the findings of Zovkic and colleagues are novel and important, they also serve to emphasize how little we understand about this fundamental process.
    http://www.nature.com/cr/journal/v25/n2/full/cr2014134a.html

  34. stcordova,
    I’m appalled by your reasoning. Because RAM and histones share a characteristic, histones are RAM? Frogs and limes are both green, therefore frogs are limes? And then you pull abstracts in which the words “histone” and “memory” both appear as if that makes your point. You aren’t conducting a discussion or even an argument; you’re testifyin’.

  35. John Harshman:

    And then you pull abstracts in which the words “histone” and “memory” both appear as if that makes your point.

    How about you pull abstracts that say “histones aren’t memory.”

    You can pull abstracts that might say DNA is junk, but you can’t pull that stunt with epigenetic memory (of which histones are a part) because now the cat is out of the bag, and we’ve established some of the function of histones — they function as cellular memory (at least).

    As far as DNA acting as a scaffold for the histone memory units, how else are the individual histones located except by DNA?

    The papers make my point because that’s what the function is. Your complaint is about as silly as saying, “we shouldn’t use the idea of code because that’s an engineering rather than an evolutionary construct.”

    They use the word “memory” because that is the most appropriate term. Trying to explain the histone memory structure in terms of common descent and fitness is not the most descriptive way to do it.

    Engineering and computer metaphors like “memory” and “function” and “programming” and “reprogramming” are the conceptually appropriate words. Too bad for anti-design advocates if engineering metaphors suggest designed constructs, but that’s the most appropriate language to further scientific understanding.

  36. It might be helpful to point out that stretches of DNA are wound around the histones to form beads we call nucleosomes. Here is a visual depiction:

  37. Now the “brain” of the histones is stuck in their tails. 🙂

    You can see the little memory positions on the individual amino acid residues of the histone tails:

  38. sal, what part of the genome is used to store these supposed memories? In order to say it’s evidence against junk it would have to be in regions that are thought to be junk. Such as the vast swathes of retrotransposons. Further you would have to have some idea about what percentage of the genome is being used as this kind of storage mechanism.

    Besides, it doesn’t explain why some species of Onion would need five times more “memory” than others. Nor why some trees need 100 times more than humans or other species of trees.

    You’re just pulling obscure abstracts about genomic interactions and then making baseless inferences such as the idea that “this implies it isn’t junk”. No, it doesn’t. There is nothing there about the relative proportions of genomes or which parts in particular are used as this kind of “memory”.

  39. sal, what part of the genome is used to store these supposed memories?

    Look at the diagram I just provided, every 147 DNA base pairs wraps around the 8 histones forming an octamer, that implies most of the genome is a scaffold for this vast array of histones say on the order of:

    3 giga bases / 147 = 20 million histone octamer memory units in the human genome

    Given that 4 histones have around 50 possible states combined and that we have 8 histones per nucleosome, that translates to a back-of-the-envelope calculation of 50×2 = 100 bits, thus:

    100 bits x 20 million = 2 gigabits of histone memory capacity per cell
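
    The arithmetic above can be checked with a short Python sketch. It takes the comment’s figures at face value (a 3-gigabase genome, 147 base pairs per histone octamer, and the estimated 100 bits per octamer); the variable names are made up for illustration, and linker DNA between nucleosomes is ignored, as in the estimate itself.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# All three inputs are the commenter's own numbers, not measured values.
GENOME_BP = 3_000_000_000      # "3 giga bases"
BP_PER_OCTAMER = 147           # base pairs wrapped around one histone octamer
BITS_PER_OCTAMER = 100         # commenter's estimate (50 x 2), taken as given

# Whole octamers that fit along the genome (linker DNA ignored).
octamers = GENOME_BP // BP_PER_OCTAMER

# Total "capacity" under the commenter's assumption of 100 bits per octamer.
total_bits = octamers * BITS_PER_OCTAMER
total_gigabits = total_bits / 1e9

print(f"octamers per genome copy: {octamers:,}")
print(f"capacity: {total_gigabits:.2f} gigabits")
```

    The sketch gives roughly 20.4 million octamers and about 2 gigabits, so the rounded figures are internally consistent. It says nothing about whether 100 bits per octamer is a meaningful number; that input is simply assumed.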

    Considering the human brain has not quite 100 billion neurons and 100 trillion connections (1000 connections per neuron), there have to be developmental mechanisms that keep track of location etc.

    Since histones are part of the cellular differentiation mechanisms, the 2 gigabits of memory the histones provide per cell are just about the order of magnitude that one would expect to help the developmental mechanisms manage development of such a complex system as the brain and its 100 trillion connections.

    Besides, it doesn’t explain why some species of Onion would need five times more “memory” than others. Nor why some trees need 100 times more than humans or other species of trees.

    You assume since you don’t understand it, it must be useless. I’d say that attitude is clearly of no use to science, and could actually be a hindrance to science.

    What was the most recent common ancestor of a human and an onion? Why would anyone expect we would deploy histones and DNA the same way onions do?

    Now how does this relate to repetitive elements or transposable elements?

    The histones and cytosine methylation on the DNA affect chromatin accessibility (the picture above shows the unwound chromatin), and chromatin accessibility affects gene expression. The histones occupying the repetitive DNA as well as the cytosine methylation marks then are positioned to regulate the coding regions.

    That’s why if the non-coding regions are knocked out, the histones that occupy them are knocked out, and the histones in those repetitive regions help regulate the coding regions — that’s why the repetitive regions are likely not junk. This makes mechanical sense if you are willing to open your eyes. It’s not just voodoo of how junk DNA is important. If one sees the junk DNA as a scaffold for the regulatory histones, then that “extra” DNA totally makes sense.

    So how do evolutionists fix this embarrassment of their “junk narrative” now that even junk areas host histones that control coding regions– bwahaha, the epicycle of exaptation.

    Now a little background: in 2010 these authors laid out the exaptation evolutionary hypothesis and trashed it, only to turn around in 2011 and promote the very theory they trashed in 2010. But first the 2010 paper:

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2836006/
    Epigenetic histone modifications of human transposable elements: genome defense versus exaptation.

    These notions form the basis of two competing hypotheses pertaining to the role of epigenetic modifications of TEs in eukaryotic genomes: the genome defense hypothesis and the exaptation hypothesis. To date, all available evidence points to the genome defense hypothesis as the best explanation for the biological role of TE epigenetic modifications.

    Now in the face of studying the role of histones at those TE sites, they see the light, have to backtrack, but don’t even use the word “exaptation” even though it is clear the TEs now serve function because of the presence of histones at the TE sites:

    http://www.ncbi.nlm.nih.gov/pubmed/21215797
    Epigenetic regulation of transposable element derived human gene promoters.

    It was previously thought that epigenetic histone modifications of mammalian transposable elements (TEs) serve primarily to defend the genome against deleterious effects associated with their activity. However, we recently showed that, genome-wide, human TEs can also be epigenetically modified in a manner consistent with their ability to regulate host genes. Here, we explore the ability of TE sequences to epigenetically regulate individual human genes by focusing on the histone modifications of promoter sequences derived from TEs. We found 1520 human genes that initiate transcription from within TE-derived promoter sequences. We evaluated the distributions of eight histone modifications across these TE-promoters, within and between the GM12878 and K562 cell lines, and related their modification status with the cell-type specific expression patterns of the genes that they regulate. TE-derived promoters are significantly enriched for active histone modifications, and depleted for repressive modifications, relative to the genomic background. Active histone modifications of TE-promoters peak at transcription start sites and are positively correlated with increasing expression within cell lines

  40. stcordova,

    Indeed we’ve only scratched the surface, and we might not have found it if we just sat back and accepted a faith-based approach that there is nothing but junk to find as you and others here have insisted.

    This is wrong – annoyingly so. You are misrepresenting the argument. No-one is insisting that there is no point in looking at the whole genome because only 10% of it appears to be functional. It is, nonetheless, a fact that only 10% of the genome appears to be functional. We should not simply ignore apparent facts as if they are not there. Anyone wants to make a name for ’emselves by finding more function, they can go right ahead. No-one is stopping them. Conversely, research effort needs to be targeted to the places it is most likely to find fruit.

    And I ask again – why does it matter? Do you think there is some health problem being caused by this viewpoint, as if junk is some equivalent of an anti-vax position? What problem would it cause you if it was a scientific fact that only 10% of the genome is functional? Your arguments against this position are silly – find function for 1 LINE and 17% of the genome becomes functional. 1 intron, there’s another 25%! You must see how weak this is.

    One thing is certain: if function is found, it will not be Creationists that find it.

  41. stcordova: Look at the diagram I just provided, every 147 DNA base pairs wraps around the 8 histones forming an octamer, that implies most of the genome is a scaffold for this vast array of histones say on the order of:

    ffs Sal, the entire genome wraps around histones regardless of genome size. Your statement there would be true even if the ~90% junk was all removed. You’d still have the entirety of the remaining DNA wound around histones when it is packed in chromatin.

    It is simply how the genome is packed to make it take up less space. You’re now insinuating that merely because the cell packs its entire genome up by wrapping it around histones, that implies it’s all functional. You might as well then have said that because it’s all replicated during cell division that means it’s functional.

  42. stcordova: You assume since you don’t understand it, it must be useless. I’d say that attitude is clearly of no use to science, and could actually be a hindrance to science.

    No, I say it’s useless because of a combination of 1. knowing the actual sequence and being able to see that it is mostly dead transposable elements, 2. genetic load and 3. extreme interspecies genome size variations even for closely related species (such as Onions). At no fucking point do I or anyone else ever make the inference that “we don’t know – therefore junk”.

    Can you please just fucking stop trying to saddle us with a line of reasoning we don’t make and we manifestly reject and protest every time you bring it up? Stop LYING.

    I actually agree with you, if anybody really DID think like that it WOULD be an attitude of no use and a hindrance to science. Problem is, nobody thinks like that. It is a strawman you are creating and you are obviously (because you’ve been told otherwise many times now) doing it to score some sort of rhetorical point even though you know it is untrue.

  43. That methylation/demethylation affects activity is hardly big news. If one wishes to call that ‘memory’, one is at liberty to do so. If one thinks that reminiscent of engineering … fine. It’s a bit like sticking a blob of blu-tac on a clock face. Makes it ‘remember’ the time when the hands get stuck, and forget it when they move again. Gah, analogies!

  44. ffs Sal, the entire genome wraps around histones regardless of genome size. Your statement there would be true even if the ~90% junk was all removed

    No, you don’t understand. Fewer histones means fewer histones available for regulation.

    The histones have cell-type-specific, tissue-specific, and developmental-stage-specific states, which means different regulatory profiles per cell.

    That’s why one can use the same DNA (almost the same DNA) in each cell, but generate 213 different cell types, and likely 100 trillion slightly differentiated cells. The histones aid in the management of different expression profiles, as do the cytosine methylations on the DNA.

    The cytosine methylations themselves provide computational memory. Look at this diagram to see the memory changes during the development of oocyte to sperm to post-implantation embryo. It looks like bits flipping on and off quite frequently, just like RAM. Note that many of the cytosine methylation epigenetic marks are in the non-coding regions (like SINEs and LINEs):

    http://www.nature.com/nature/journal/v511/n7511/fig_tab/nature13544_F2.html
