Genetic load and junk.

Mung, to petrushka, elsewhere:

Everyone does not understand “genetic load” and those that do claim to understand are probably wrong. Why don’t you start an OP on genetic load and the genetic load argument? That would be interesting. Betting you won’t.

This is such an OP. I believe the genetic load argument*** was initially proposed by Susumu Ohno in 1972, whose paper also introduced the then-scare-quoted term “junk”. It’s brief, accessible, and worth a read for anyone who wishes to offer an opinion/understand (not necessarily in that order).

The short version: sequence-related function must be subject to deleterious mutations. Long genomes (such as those of most eukaryotes) contain too many bases for the entire genome to be considered functional in that way, given known mutation rates. The bulk of such genomes must either have functions that are not related to sequence, or no function at all.

Interestingly, the paper is hosted on the site of an anti-junk-er, Andras Pellionisz, a self-promoting double-PhD’d … er … maverick. Also of interest is that, contrary to some ID narratives, the idea was initially resisted by ‘Darwinists’, if that term is understood not as people who simply accept evolution, but as people who place most emphasis on Natural Selection. Perfectionism is not the sole preserve of Creationists.

More recent work has characterised the nonfunctional fraction, and this lends considerable empirical support to Ohno’s contentions.

[eta: link to comment]
***[eta: in relation to genome size, not the first time anyone, ever, discussed genetic load!]

209 thoughts on “Genetic load and junk.”

  1. The point about “Darwinists” initially resisting the concept is important and not many ID’ers (or even evolutionists) know about it.

    The general view was that natural selection would get rid of junk, because carrying around a lot of junk carries with it a cost in terms of the energy required to replicate it (and it takes more time to replicate a larger genome). This idea isn’t wrong; it’s definitely true for prokaryotes. It’s just that for eukaryotes there are other mechanisms at work that have a stronger influence on genome size than selection for energy cost and speed of replication.

  2. Incidentally, you won’t go 5 minutes on this before someone makes the mistake of equating junk with noncoding DNA. Let’s nip that one in the bud: one can have both nonfunctional coding DNA, and functional noncoding DNA.

  3. Allan Miller:
    Alan Fox,
    If you like …

    Quoting Ohno:

    Even if an allowance is made for the existence in multiplicates of certain genes, it is still concluded that, at the most, only 6% of our DNA base sequences is utilized as genes.

    I see more recent estimates put the percentage of the human genome under selective pressure at 12 – 15%.

  4. The onus, so to speak, is on the no-junkers to demonstrate that a significant percentage of sequence independent DNA is functional. More than a couple of percent.

    What, for example, are the odds that a junk sequence mutates into a useful sequence, as compared to a duplicate gene acquiring a new function?

  5. I’m going to sit back now, and wait for mung to correct everyone’s understanding of the genetic load argument.

    He could start by dissecting Larry Moran’s six part discussion and telling us what Larry got wrong.

  6. A few miscellaneous comments:

    1. JBS Haldane (in 1939) and HJ Muller (in 1950) introduced the mutational load and showed that it would be equal to $2\mu$, where $\mu$ is the mutation rate. If we have fitnesses that multiply over loci, which is a good first approximation, with $L$ loci the overall mean fitness is $(1-2\mu)^L$, or approximately $e^{-2L\mu}$.

    2. As amounts of DNA and mutation rates became clearer in the 1950s, it became apparent to population geneticists that there was too much DNA. This was known as the c-value paradox. Ohno, in 1972, combined this with evidence from genome size changes to argue that much of the genome was “junk DNA”.

    3. What about genetic drift? Motoo Kimura, Takeo Maruyama, and James Crow showed in 1963 in this paper that it would increase the mutational load if populations were small enough.

    4. Sites at which all variation is neutral have no mutational load.
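
A quick numerical check of the approximation in point 1 above, as a minimal sketch. The values of `mu` and `n_loci` are illustrative assumptions chosen only to make the arithmetic visible; they are not estimates taken from this thread.

```python
import math


def mean_fitness_exact(mu: float, n_loci: int) -> float:
    """Mean fitness when each locus contributes a factor (1 - 2*mu)."""
    return (1.0 - 2.0 * mu) ** n_loci


def mean_fitness_approx(mu: float, n_loci: int) -> float:
    """Exponential approximation e^(-2*L*mu), good when mu is small."""
    return math.exp(-2.0 * n_loci * mu)


# Assumed, illustrative inputs (not from the thread).
mu = 1e-8              # mutation rate per locus per generation
n_loci = 100_000_000   # number of loci at which mutations are deleterious

exact = mean_fitness_exact(mu, n_loci)
approx = mean_fitness_approx(mu, n_loci)
print(f"exact  = {exact:.6f}")
print(f"approx = {approx:.6f}")
```

With a small per-locus rate the two expressions should agree closely, which is why the exponential form is usually the one quoted.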

  7. Alan Fox: Quoting Ohno:

    I see more recent estimates put the percentage of the human genome under selective pressure at 12 – 15%.

    Have you just implicitly fallen into the non-coding = junk fallacy? “Within genes” and “under selective pressure” are not the same thing at all.

  8. John Harshman: Have you just implicitly fallen into the non-coding = junk fallacy? “Within genes” and “under selective pressure” are not the same thing at all.

    I might have inadvertently. I was presuming any genetic sequence under selective pressure must have some benefit to the organism/niche. I’m aware that there are sequences that code for RNA that itself is central to cell function; ribosomal RNA, for instance. Are you saying “gene” must be reserved for sequences that code for proteins?

  9. I think it’s possible in principle that most of the DNA could have function, despite the genetic load argument, if most of the genome coded for long non-coding RNAs that, once transcribed, would act as a scaffold to recruit factors that would then epigenetically modify the DNA to shut it down. But this stretches the definition of ‘function’. In many cases it would amount to saying that the only function of a stretch of DNA was to produce an RNA that would shut down that stretch of DNA because it had no other function.
    AFAIK, there isn’t any good evidence for extensive coverage of the genome by RNAs of this sort, but there are a few examples here and there.

  10. Just a terminological issue: “genetic load” is a general term for reductions in fitness associated with various evolutionary forces and genetic phenomena. More specific kinds of genetic load have names like segregational load, substitutional load, mutational load, etc.

    In this case we are talking about mutational load.

  11. Alan Fox: I might have inadvertently. I was presuming any genetic sequence under selective pressure must have some benefit to the organism/niche. I’m aware that there are sequences that code for RNA that itself is central to cell function; ribosomal RNA, for instance. Are you saying “gene” must be reserved for sequences that code for proteins?

    No. “Gene” refers to sequences that code for proteins or functional RNAs. But there are many functional sequences that don’t code for either, for example transcription factor binding sites.

  12. John Harshman: “Gene” refers to sequences that code for proteins or functional RNAs.

    OK.

    But there are many functional sequences that don’t code for either, for example transcription factor binding sites.

    Does that cover the difference between 6% and 12-15%? If transcription regulation is visible to selection, is it unreasonable to consider such sites as part of the genome?

    ETA I should have said “an essential part of the genome”? I’m thinking knockout mouse.

  13. Alan Fox: OK.
    Does that cover the difference between 6% and 12-15%? If transcription regulation is visible to selection, is it unreasonable to consider such sites as part of the genome?

    You use terms confusingly. “Genome” doesn’t mean “the genes”. It refers to all the DNA. Most of your genome is not part of a gene. Of the part that isn’t part of a gene, most is junk. I’m not sure where either estimate came from or what accounts for the difference. I merely point out that they are estimates of different things and were not intended to be the same.

  14. Joe Felsenstein,

    More specific kinds of genetic load have names like segregational load, […]

    I doubt anyone cares, but I’m a bit lukewarm on segregational load, and recombinational. It seems wrong to me to deprecate processes that supposedly break up adaptive gene combinations, when those very processes play a large part in bringing them into being. It’s as if one starts from a perfect, maximally fit diploid genome (arising as if by magic) which is then broken up by that darned meiosis. I’m lukewarm on the whole ‘costs of sex’ thing to be honest, but that’s maybe for another thread.

  15. petrushka: I’m going to sit back now, and wait for mung to correct everyone’s understanding of the genetic load argument.

    Why don’t you just thank Allan for bailing you out and count your blessings?

    This is the “genetic load” argument you think everyone knows?

  16. I see no deep contradictions being posted here. Not everyone is an expert, but everyone seems to agree that most eukaryotic genomes have a lot of DNA that seems not to be conserved. Plus, there seem to be plausible historical scenarios that explain why much of the genome is currently unused.

  17. Mung:

    This is the “genetic load” argument you think everyone knows?

    It’s the one I know. And I’ve written papers on related kinds of genetic loads, and textbook sections on mutational load.

    Got a correction to suggest?

  18. Mung bolded the word everyone.

    I suppose he wishes to call attention to the fact that not everyone in the world knows biology.

    But I would say most of the posters here have visited Sandwalk and have seen Larry rant about junk DNA. So most people here have a ready source for the genetic load argument.

    That would be the one to poke holes in if someone thinks the majority of DNA is not junk. At least in plants and animals.

  19. John Harshman: You use terms confusingly. “Genome” doesn’t mean “the genes”. It refers to all the DNA.

    Sure. I’m aware of that.

    Most of your genome is not part of a gene. Of the part that isn’t part of a gene, most is junk. I’m not sure where either estimate came from or what accounts for the difference. I merely point out that they are estimates of different things and were not intended to be the same.

    What I was wondering is, given a percentage, apparently around 10% or so* is certainly essential genetic information and thus under selective pressure, deleterious mutations being lost and beneficial mutations being retained, and at least half is definitely not essential and therefore not under selective pressure (what used to be called selfish DNA), is the jury still out on the possibility of some functional attribute for DNA that does not yet fall clearly into either category? REW mentions that some DNA might have function that does not depend on specific nucleotide sequences but nonetheless is important perhaps in some gross structural way.

    *in humans

  20. Alan Fox: is the jury still out on the possibility of some functional attribute for DNA that does not yet fall clearly into either category?

    I don’t think the jury is very far out. If there were much such “bulk DNA”, the lengths, though not the sequences, of that DNA would be conserved. Outside of a possible minimum length for an intron to contain splice signals, I don’t see much evidence of such a thing.

  21. Since my screen name is featured in the OP, am I allowed to ask when the no-junk people are expected to arrive with their counter-arguments?

    Or at least the correct interpretation of genetic load.

  22. colewd:
    Allan Miller,

    Do you have an opinion on how much of the genome is dedicated to embryo development?

    Embryos are an evilutionist myth. What good is half a baby with all its superirreduciblecomplexialidocious systems?

  23. petrushka: Mung bolded the word everyone.

    I suppose he wishes to call attention to the fact that not everyone in the world knows biology.

    Sure. That explains everything. Did you claim everyone knows biology? I can’t imagine how I might disagree with that claim. But then, I also don’t believe that cows exist inside computers.

  24. You know, Allan, I think you’re generally an upright guy, though I can’t say I’ve ever seen you admit you’ve ever been wrong about anything. But this isn’t the thread for that. 😉

    But like keiths often does, you decide to not provide any link to the original post you quote. Is it really that hard to provide a link to quoted text that originated on this site?

    It might be nice to know the original context. Or perhaps that has nothing to do with your OP. After all, everyone knows about the genetic load argument.

    Whatever.

    Haldane introduced the idea of the cost of selection. [Maybe it was someone before Haldane, whatever.] The concept of genetic load has been confused with the concept of cost of selection.

    We’ll probably never know what petrushka meant now, but if people who think they understand “genetic load” are conflating genetic load and cost of selection, then I at least have better grounds for my statement than petrushka has for his.

    Of course, I could be wrong about whether anyone has ever confused genetic load and cost of selection. But the OP totally misses out on that.

  25. Joe Felsenstein: 4. Sites at which all variation is neutral have no mutational load.

    Are you saying that “genetic load” and “mutational load” are the same?

  26. petrushka: He could start by dissecting Larry Moran’s six part discussion and telling us what Larry got wrong.

    Let’s pretend like you created an OP [you didn’t] in which you provided a link to Larry’s “six part discussion” [you didn’t].

  27. Joe Felsenstein: Just a terminological issue: “genetic load” is a general term for reductions in fitness associated with various evolutionary forces and genetic phenomena. More specific kinds of genetic load have names like segregational load, substitutional load, mutational load, etc.

    In this case we are talking about mutational load.

    Thanks Joe! But then, everyone already knew that. Sez petrushka.

  28. Let’s stipulate that not everyone is an expert and not everyone is qualified to teach population genetics at the graduate level.

    How is Larry Moran wrong about his estimate of junk DNA?

  29. Here’s the original statement by petrushka, which elicited my response quoted in the OP:

    petrushka: You go first. Everyone knows the genetic load argument, so assume it has been made and start there.

    So petrushka actually asserted that everyone knows the genetic load argument.

    It’s not that I wish to call attention to the fact that not everyone in the world knows biology. That’s a claim petrushka never made and therefore a claim I never disputed. But if that’s all you have …

  30. petrushka: How is Larry Moran wrong about his estimate of junk DNA?

    Would it kill you to provide a link, or a link to where you provided a link?

  31. link 1:

    Georgi goes on to explain four main arguments for junk DNA: genetic load, the C-value Paradox, transposons (selfish DNA), and modern evolutionary theory.

    Not much help.

  32. link 2:

    Intelligent Design Creationists have difficulty understanding the arguments for junk DNA and the evidence that supports those arguments. We try to explain the genetic load argument but it doesn’t seem to penetrate.

    But everyone understands the genetic load argument.

  33. petrushka: He could start by dissecting Larry Moran’s six part discussion and telling us what Larry got wrong.

    Please provide a link to the first post in Larry Moran’s six part discussion of genetic load.

  34. I will grant that you can read much faster than I can. It might take me an hour or two to fully digest a couple dozen Sandwalk posts along with all the discussion.

  35. You think there is a six part discussion on Sandwalk of genetic load. Maybe you’re just wrong. Isn’t that a possibility?

  36. I’m using a tablet and can’t do good searches. The subject of the six-parter is junk DNA.

  37. colewd,

    Do you have an opinion on how much of the genome is dedicated to embryo development?

    Human, I presume you mean? I’d be inclined to say ‘not much’. Most of the genome is involved in cellular process, rather than establishing a multicellular soma out of such cells.

  38. Mung,

    […] can’t say I’ve ever seen you admit you’ve ever been wrong about anything.

    I could say the same for you! For me: doesn’t happen very often, but it certainly does happen.

    But like keiths often does, you decide to not provide any link to the original post you quote. Is it really that hard to provide a link to quoted text that originated on this site?

    Not hard at all. It was an omission, rather than a decision per se. I saw a reference to genetic load and junk, so I stuck an OP up about it. Other aspects of genetic load were not the subject of the OP, which was triggered by that specific connection.

    And, btw, genetic load is the cost of selection – the population cost of carrying suboptimal variants.

  39. Allan Miller:

    colewd,

    Do you have an opinion on how much of the genome is dedicated to embryo development?

    Human, I presume you mean? I’d be inclined to say ‘not much’. Most of the genome is involved in cellular process, rather than establishing a multicellular soma out of such cells.

    I disagree. Most of the genome is not “involved” in any processes, other than sitting there and being replicated. It is junk DNA.

  40. The mutational load argument (called here by the broader term genetic load) is really very simple:

    1. The rate of point mutations in humans is about 100 mutations per generation in the (diploid) genome. Here is a discussion of this by Larry Moran. For our purposes it does not matter whether he is off by a factor of two. Or maybe even more.

    2. Humans have about 3 billion bases per haploid genome, or 6 billion in a diploid genome.

    3. The main theoretical result on mutational load is that the total load, that is, the reduction in fitness, is twice the mutation rate. Basically that is because any deleterious mutation is sooner or later eliminated from the population. So in a stretch of DNA that has 10,000 bases, we expect $10{,}000 \times 100 / 6{,}000{,}000{,}000$ mutations per generation. If all those sites are not junk, those mutations would be deleterious, and so the fitness of an average individual would be reduced by about 0.00016 by mutants in that stretch of DNA.

    4. The default way that fitnesses combine between parts of the genome that are involved in unrelated functions is that they multiply. This is because your probability of surviving two independent challenges is the product of the probabilities of each.

    5. We can figure out the product of $6{,}000{,}000{,}000 / 10{,}000$ such chunks of genome, each bringing a factor of 0.99984. That’s pretty close to $e^{-100}$. Which is an awfully small number. I don’t like your chances of surviving long enough to grow up and comment here.

    6. The way out of this disastrous prediction is to assume either that the mutation rate is really much lower (very much lower), or that mutations are in a conspiracy to have their fitnesses combine in some very strongly interacting way so that the result is that their combined fitnesses are nowhere near the product of their individual fitnesses, or …

    7. That in much of the genome (very much of it) mutations are not deleterious. Because junk.
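
The seven steps above can be sketched as a short calculation. This is only an illustration under the comment's own round numbers (100 mutations per diploid genome per generation, 6 billion bases, 10,000-base chunks, multiplicative fitness). The `functional_fraction` parameter and the function names are hypothetical additions for this sketch, used to show how the prediction changes when only part of the genome is sensitive to mutation. The exact per-chunk value is about 0.000167, which the comment rounds to 0.00016.

```python
import math

# Round numbers taken from the comment above.
MUTATIONS_PER_GENERATION = 100          # per diploid genome (step 1)
DIPLOID_GENOME_SIZE = 6_000_000_000     # bases (step 2)
CHUNK_SIZE = 10_000                     # bases per chunk (step 3)


def chunk_load() -> float:
    """Expected fitness reduction from mutations in one fully functional chunk."""
    return CHUNK_SIZE * MUTATIONS_PER_GENERATION / DIPLOID_GENOME_SIZE


def mean_fitness(functional_fraction: float) -> float:
    """Mean fitness if fitnesses multiply across chunks (steps 4 and 5).

    functional_fraction is a hypothetical knob for this sketch: the share
    of chunks in which mutations are deleterious (1.0 means no junk).
    """
    n_chunks = DIPLOID_GENOME_SIZE / CHUNK_SIZE
    functional_chunks = functional_fraction * n_chunks
    # Sum logs rather than multiplying 600,000 factors directly.
    log_fitness = functional_chunks * math.log1p(-chunk_load())
    return math.exp(log_fitness)


# Illustrative fractions only; none of these is an estimate from the thread.
for fraction in (1.0, 0.5, 0.1):
    print(f"functional fraction {fraction:.1f}: mean fitness {mean_fitness(fraction):.3e}")
```

With the whole genome treated as functional the result lands near the $e^{-100}$ of step 5; shrinking the functional fraction is the step 7 way out.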
