Science Friday: A Molecular Biology & Evolution Sampler

My favorite subject-specific journal is Molecular Biology and Evolution (MBE).  This journal publishes on topics primarily related to molecular evolution and evolutionary genomics, which are among my favorite subjects in biology. I’m happy to report that the latest issue of MBE is out today, and there are lots of great articles that I think will be of interest to folks here, many of which are open-access.

I sadly don’t have time to write up any of these articles, but I thought it might be useful to “sample” a few in case any of you would like to read and discuss them.  Here are a handful that seem particularly interesting:

Population Structure Shapes Copy Number Variation in Malaria Parasites (open-access)

Abstract:

If copy number variants (CNVs) are predominantly deleterious, we would expect them to be more efficiently purged from populations with a large effective population size (Ne) than from populations with a small Ne. Malaria parasites (Plasmodium falciparum) provide an excellent organism to examine this prediction, because this protozoan shows a broad spectrum of population structures within a single species, with large, stable, outbred populations in Africa, small unstable inbred populations in South America and with intermediate population characteristics in South East Asia. We characterized 122 single-clone parasites, without prior laboratory culture, from malaria-infected patients in seven countries in Africa, South East Asia and South America using a high-density single-nucleotide polymorphism/CNV microarray. We scored 134 high-confidence CNVs across the parasite exome, including 33 deletions and 102 amplifications, which ranged in size from <500 bp to 59 kb, as well as 10,107 flanking, biallelic single-nucleotide polymorphisms. Overall, CNVs were rare, small, and skewed toward low frequency variants, consistent with the deleterious model. Relative to African and South East Asian populations, CNVs were significantly more common in South America, showed significantly less skew in allele frequencies, and were significantly larger. On this background of low frequency CNV, we also identified several high-frequency CNVs under putative positive selection using an FST outlier analysis. These included known adaptive CNVs containing rh2b and pfmdr1, and several other CNVs (e.g., DNA helicase and three conserved proteins) that require further investigation. Our data are consistent with a significant impact of genetic structure on CNV burden in an important human pathogen.

No Accumulation of Transposable Elements in Asexual Arthropods (open-access)

Abstract:

Transposable elements (TEs) and other repetitive DNA can accumulate in the absence of recombination, a process contributing to the degeneration of Y-chromosomes and other nonrecombining genome portions. A similar accumulation of repetitive DNA is expected for asexually reproducing species, given their entire genome is effectively nonrecombining. We tested this expectation by comparing the whole-genome TE loads of five asexual arthropod lineages and their sexual relatives, including asexual and sexual lineages of crustaceans (Daphnia water fleas), insects (Leptopilina wasps), and mites (Oribatida). Surprisingly, there was no evidence for increased TE load in genomes of asexual as compared to sexual lineages, neither for all classes of repetitive elements combined nor for specific TE families. Our study therefore suggests that nonrecombining genomes do not accumulate TEs like nonrecombining genomic regions of sexual lineages. Even if a slight but undetected increase of TEs were caused by asexual reproduction, it appears to be negligible compared to variance between species caused by processes unrelated to reproductive mode. It remains to be determined if molecular mechanisms underlying genome regulation in asexuals hamper TE activity. Alternatively, the differences in TE dynamics between nonrecombining genomes in asexual lineages versus nonrecombining genome portions in sexual species might stem from selection for benign TEs in asexual lineages because of the lack of genetic conflict between TEs and their hosts and/or because asexual lineages may only arise from sexual ancestors with particularly low TE loads.

 

Evolution of Prdm Genes in Animals: Insights from Comparative Genomics (open-access)

Abstract:

Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes.

This next one is on a topic that comes up here from time to time, and I think it will be of interest to many of you.  Sadly, it’s paywalled, but if you don’t have access through a university library, feel free to send me a pm.

Are Human Translated Pseudogenes Functional?

Abstract:

By definition, pseudogenes are relics of former genes that no longer possess biological functions. Operationally, they are identified based on disruptions of open reading frames (ORFs) or presumed losses of promoter activities. Intriguingly, a recent human proteomic study reported peptides encoded by 107 pseudogenes. These peptides may play currently unrecognized physiological roles. Alternatively, they may have resulted from accidental translations of pseudogene transcripts and possess no function. Comparing between human and macaque orthologs, we show that the nonsynonymous to synonymous substitution rate ratio (ω) is significantly smaller for translated pseudogenes than other pseudogenes. In particular, five of 34 translated pseudogenes amenable to evolutionary analysis have ω values significantly lower than 1, indicative of the action of purifying selection. This and other findings demonstrate that some but not all translated pseudogenes have selected functions at the protein level. Hence, neither ORF disruption nor presence of protein product disproves or proves gene functionality at the protein level.
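
For readers unfamiliar with ω, here is a minimal sketch of how the ratio is formed and roughly how it is read. The function names, the example rates, and the fixed tolerance are illustrative assumptions, not taken from the paper; the abstract speaks of ω being *significantly* lower than 1, which implies a statistical test rather than a simple cutoff.

```python
def omega(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def rough_reading(w: float, tolerance: float = 0.1) -> str:
    """Crude interpretation of omega; the tolerance is an arbitrary assumption."""
    if w < 1 - tolerance:
        return "consistent with purifying selection"
    if w > 1 + tolerance:
        return "consistent with positive selection"
    return "consistent with neutral evolution"


# Hypothetical rates, chosen only to illustrate the three cases.
for dn, ds in [(0.02, 0.10), (0.10, 0.10), (0.30, 0.10)]:
    w = omega(dn, ds)
    print(f"dN={dn}, dS={ds}: {rough_reading(w)}")
```

A sequence that no longer matters to the organism is expected to drift toward ω ≈ 1, which is why a value well below 1 is taken as a hint of function at the protein level.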

There are a lot of other interesting papers in this issue, but for the sake of brevity, I’ll stop here.  Happy reading!

205 thoughts on “Science Friday: A Molecular Biology & Evolution Sampler”

  1. colewd: Thanks for the information. I really don’t understand this debate. Why call part of the genome that we don’t understand junk? How much of the function of the genome do we really understand? 1%?

    Because we DO understand most of it and understand that it IS junk. We probably understand over 90% of the genome.

    You have been misinformed if you think junk-DNA is based on an argument from ignorance.

  2. colewd:
    I really don’t understand this debate. Why call part of the genome that we don’t understand junk? How much of the function of the genome do we really understand? 1%?

    I think you have put your finger on the problem here. FIRST, nearly all of the genome is understood. I don’t know how you got the impression that it was 99% terra incognita, but it’s not.

    SECOND, you seem to be under the impression that we have labeled as “junk” anything not understood. This is exactly backwards. We have labeled as junk, those often very large segments of the genome which have been amply demonstrated to be junk. Certainly part of what is not fully understood MIGHT be junk and it MIGHT be functional or even essential. But nobody here is arguing from ignorance (except Sal, of course).

  3. Flint,

    Sal is honestly admitting what he does not know. Are you doing the same? How much do you think we understand of the function of the human genome? 10% 50% 90%? Allan Miller, what is your opinion?

  4. I don’t see anything honest being done. I see a fraction of one percent being bandied about as if it were a significant percentage.

    If it’s not conserved, sequence doesn’t matter, regardless of imagined function.

  5. colewd: Sal is honestly admitting what he does not know. Are you doing the same? How much do you think we understand of the function of the human genome? 10% 50% 90%? Allan Miller, what is your opinion?

    Mr. colewd, you need to understand that we have the sequences of all that genome. It’s mostly mutated retrotransposons and thousands of dead duplications, pseudogenes, retroviruses and so on. We have the sequence, we know what virus genes look like, what transposons look like, what regulatory regions look like etc. etc. By simply comparing them to the functional regions we can see they are dead, long deactivated, mutated and no longer functional genes of many sorts. Again, transposons, reverse transcriptase genes of viral origin, dead protein coding genes that are missing key exons, dead mutated regulatory regions (because they have sequences that are similar to functional regulatory regions, but have deactivating mutations in them), have premature stop codons and so on and so forth.

    The case for JunkDNA isn’t based on an argument from ignorance, it is actually very strong and supported by multiple independent lines of investigation. Genetic load, c-value paradox, lack of sequence-conservation, and knowing the actual sequences and comparing them to known functional regions. It’s known to be mostly junk.

  6. Australians are worried that a single pregnant corn snake has escaped into the wild. I’m sure Sal could allay their fears by citing genetic degradation. It’s always worked in the past.

  7. colewd,

    How much do you think we understand of the function of the human genome? 10% 50% 90%? Allan Miller, what is your opinion?

    I think there are good grounds to hedge at round about 10% of sequence being functional. This paper illustrates the kind of approach, by looking at constraint and conservation.

    Of the rest, we have 50% comprised of the same few repeats – dead transposon and virus. Although transposons can occasionally land in genes and become, effectively, mutations, there are no grounds to suppose that the explanation of this 50% resides anywhere other than the consequences of transposition itself. Every time a copy-paste transposon jumps, it expands the genome a fraction. The main opposing force to this is the mutagenic effect, rather than the additional cost of extra bases, which is trivial from the pov of selection.

    25% of the sequence is intron (untranslated gaps in coding sequence). There are some grounds for supposing that the presence of introns, in some genes, permits a functional action such as alternative splicing. But this explains only the presence of a break, not the length of that break, and is not particularly common. It appears unlikely that we really ‘need’ 25% of the genome to support splice variants in 2% of it, even if every gene had a splice variant, which would be very unlikely.

    So there is 50-75% of the genome right there whose ‘function’ is fairly well established – it has not got one. Unless, of course, someone makes a breakthrough. We can never be categorical. But Ohno’s argument needs to be borne in mind, for any proposal of function.
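
    One way to read the “50-75%” figure, sketched below, is as a range that depends on how much the two categories overlap. That reading is an assumption on my part rather than something the comment spells out; the percentages themselves are the ones quoted above.

```python
TRANSPOSON_PCT = 50   # repeat-derived sequence, as quoted above
INTRON_PCT = 25       # intronic sequence, as quoted above
CONSTRAINED_PCT = 10  # sequence showing constraint, as quoted above


def combined_range(a_pct: int, b_pct: int) -> tuple[int, int]:
    """Bounds on the union of two genome fractions.

    Low end: one category lies entirely inside the other.
    High end: the two categories do not overlap at all.
    """
    low = max(a_pct, b_pct)
    high = min(100, a_pct + b_pct)
    return low, high


low, high = combined_range(TRANSPOSON_PCT, INTRON_PCT)
print(f"repeats plus introns: {low}-{high}% of the genome")
print(f"left over after constrained sequence: {100 - CONSTRAINED_PCT - high}% "
      "(taking the upper bound)")
```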

  8. Allan Miller,

    Of course it does.

    Reference please

    What’s yours, meanwhile? No, let me guess – Design. Where, when, why, how? A smidgeon of that detail you repetitiously and pugnaciously demand would be ever so welcome. In your own time.

    You are obviously confused. Yours is the position that claims to have a step-by-step process for producing transposons and all biologic features. ID does not make such a claim. ID has a step-by-step process for determining design exists or not. And all other questions about it come after that.

  9. Frankie: Yours is the position that claims to have a step-by-step process for producing transposons and all biologic features.

    Do you have a citation for that?

  10. Frankie: ID has a step-by-step process for determining design exists or not.

    You were unable to demonstrate such a process for ATP synthase however. You just said it was designed.

  11. This is why OMagain is best ignored-

    Darwin (1859) and all evolutionists since have posited a step-by-step process. That is what natural selection and drift are, ie step-by-step processes.

    And yes the step-by-step design determining process has been used for ATP synthase. First there aren’t any explanations for it via stochastic processes- it seems no one even knows where to start. And it has a specification. So there you have it- the first steps eliminated necessity and chance and the last step was the specification.

    Again this has all been told to OM many, many times and yet here we are again.

  12. colewd:
    Flint,

    Sal is honestly admitting what he does not know. Are you doing the same? How much do you think we understand of the function of the human genome? 10% 50% 90%? Allan Miller, what is your opinion?

    Sigh. Sal is not “honestly admitting what he does not know”, he is dishonestly claiming that what is actually known, is somehow not known. I am not a biologist, only a spectator, but even I can spot the misdirection and BS of the creationist.

  13. Allan Miller,

    Thanks for the thoughtful answer. My skepticism here is based on where we are in understanding splicing codes, as you mentioned, and gene regulation. Where are all these codes? I am also struggling with the idea that introns are not functional. From my studies, the genome especially as it relates to cell division is a very tightly controlled process with lots of redundancy. Timing is critical. A possible experiment would be to shorten several introns that are long and see if there is a change in function. Do you think comparing a sequence of a known function to an unknown function is real evidence of lack of function? How would we know?

  14. Frankie,

    You are obviously confused. Yours is the position that claims to have a step-by-step process for producing transposons and all biologic features.

    No it doesn’t. No-one claims to know every step of every evolutionary sequence in detail. Nor is it essential for steps to be as small as you think Darwin thought they were to be evolutionary in that sense. His own contemporaries and supporters thought he was being a bit restrictive.

    ID does not make such a claim. ID has a step-by-step process for determining design exists or not.

    Use your process on the transposon then. Does design ‘exist’? I’ll make it easy – just the Alu sequence.

  15. Allan Miller:
    Frankie,

    No it doesn’t. No-one claims to know every step of every evolutionary sequence in detail. Nor is it essential for steps to be as small as you think Darwin thought they were to be evolutionary in that sense. His own contemporaries and supporters thought he was being a bit restrictive.

    Use your process on the transposon then. Does design ‘exist’? I’ll make it easy – just the Alu sequence.

    LoL! Allan never read Darwin nor anyone who came after him- what part of “slight successive modifications” don’t you understand, Allan? And we know no one has that detail, that is the point. You don’t have any way to test the claims of your position.

    And yes the design inference extends to transposons. See Spetner 1997

  16. colewd,

    My skepticism here is based on where we are in understanding splicing codes, as you mentioned, and gene regulation. Where are all these codes? I am also struggling with the idea that introns are not functional.

    The regulatory ‘code’ is located within the promoter/repressor region of a gene. It aids and abets binding of transcription factors. I must give a general answer to such a general question; I’m not being cagey.

    As to splicing, I think you may have missed the point on intron length. Even a functional intron would provide a little hidey-hole for parasitic sequence to insert itself, harmlessly provided the excised ends of the intronic sequence are not interrupted. The presence of an intron and its length are not necessarily correlated with the same functional role.

    You could also ask yourself whether organisms that produce no alternative transcripts still have introns. I actually don’t know – I propose it as a literature research project for you. I’m betting that they do.

    From my studies, the genome especially as it relates to cell division is a very tightly controlled process with lots of redundancy. Timing is critical. A possible experiment would be to shorten several introns that are long and see if there is a change in function.

    The timing of division is related to the time it takes to replicate all bases, but if this is as tightly constrained as you seem to suggest, neither transposons nor intron position/length would be able to vary significantly between or within species. c-value clearly does vary enormously, so I don’t think fiddling with a few bases here or there is going to make much difference.

    Do you think comparing a sequence of a known function to an unknown function is real evidence of lack of function?

    Not sure what you mean. If you are referring to the linked paper, that’s not what they did. They were using evolutionary rate as a proxy for functional constraint. If a sequence varies in line with the neutral expectation, it is reasonably inferred to be neutral, and hence not under purifying selection.

    If you mean the transposons, there are several million of them, all with the same basic sequence, but with clear disablement of their transpositional ability. They might have some other function – all the millions of them – but they are indisputably derived from transposons, and I see no need for a further explanation for their presence, given what transposition actually is.

  17. Frankie,

    LoL! Allan never read Darwin nor anyone who came after him- what part of “slight successive modifications” don’t you understand, Allan?

    LoL! Don’t you understand that no-one is under any obligation to take Darwin as Gospel? As I said, and will repeat for the hard of hearing, his contemporaries and supporters (eg Huxley) felt he was wrong on this point. And so it has proven, particularly at the genomic level.

    And yes the design inference extends to transposons. See Spetner 1997

    So some time in 1997 Spetner said something about transposons? Brill. That settles it, then. Transposons Are Designed! Spetner sez.

  18. OMagain: Do you have a citation for that?

    “But, as by this theory innumerable transitional forms must have existed…”

    “In looking for the gradations by which an organ in any species has been perfected, we ought to look exclusively to its lineal ancestors; but this is scarcely ever possible, and we are forced in each case to look to species of the same group, that is to the collateral descendants from the same original parent-form, in order to see what gradations are possible, and for the chance of some gradations having been transmitted from the earlier stages of descent, in an unaltered or little altered condition.”

    “Although we must be extremely cautious in concluding that any organ could not possibly have been produced by successive transitional gradations, yet, undoubtedly, grave cases of difficulty occur, some of which will be discussed in my future work.”

    etc. etc.

  19. Spetner: “I find it hard to believe that a process as precise and as well controlled as the transposition of genetic elements happens only by chance.”.

    That pretty much settles it. Spetner finds it hard, it’s hard.

  20. Mung: “But, as by this theory innumerable transitional forms must have existed…”

    Big FAIL by Mung. FrankenJoe said science claims to have a step-by-step process for producing transposons. Your 150+ years old Darwin quote about transitional fossil forms has nothing to do with FrankenJoe’s idiocy.

    It is pretty funny when you Fundy Creationists bend over backwards to prop up each other’s lies no matter how egregious and stupid they are.

  21. dazz,

    Here are 4 to start.
    J Steroid Biochem Mol Biol. 2010 Jul; 121(1-2): 130–135. Published online 2010 Mar 7. doi: 10.1016/j.jsbmb.2010.02.036

    Published in Advance August 24, 2010, doi: 10.1101/gr.107920.110 Genome Res. 2010. 20: 1352-1360

    Cancers (Basel). 2013 Dec; 5(4): 1242–1260. PMCID: PMC3875938 Published online 2013 Oct 21. doi: 10.3390/cancers5041242

    Organogenesis. 2015;11(3):95-104. doi: 10.1080/15476278.2015.1086052.

  22. Allan Miller,

    As to splicing, I think you may have missed the point on intron length. Even a functional intron would provide a little hidey-hole for parasitic sequence to insert itself, harmlessly provided the excised ends of the intronic sequence are not interrupted. The presence of an intron and its length are not necessarily correlated with the same functional role.

    Do you have evidence here? I think intron length could affect transcription timing. Will search for positive situation.

  23. Allan Miller,

    I have not read this yet but it appears relevant.

    Dev Cell. 2008 Mar; 14(3): 324–330.
    doi: 10.1016/j.devcel.2008.02.002
    PMCID: PMC2825037
    NIHMSID: NIHMS176861
    Intron Delays and Transcriptional Timing during Development

  24. Flint,

    Sigh. Sal is not “honestly admitting what he does not know”, he is dishonestly claiming that what is actually known, is somehow not known. I am not a biologist, only a spectator, but even I can spot the misdirection and BS of the creationist.

    I am not seeing it, and that may be my issue. If you can cite some specific examples that would help. If he is saying that something is not known and you can cite several sources of experimental evidence showing he is either ignorant or dishonest I will get it. If he simply has a different opinion then he is totally in bounds. He believes in YEC and I don’t agree with him but I have not seen his arguments as ignorant or dishonest.

  25. colewd,

    Do you have evidence here?

    For what – the non-effect of c value on replication? Well, supernumerary chromosomes, whole genome duplications, intra-species and inter-species variation seem to indicate that c value is not especially tightly constrained, by anything.

    I think intron length could affect transcription timing.

    Well yes, it’s bound to!

  26. colewd,

    Intron Delays and Transcriptional Timing during Development

    There is something of a distinction to be drawn between something having an effect on function and that effect being the reason it is there.

  27. Allan Miller,

    For what – the non-effect of c value on replication? Well, supernumerary chromosomes, whole genome duplications, intra-species and inter-species variation seem to indicate that c value is not especially tightly constrained, by anything.

    Maybe right, but I remain skeptical until alternative splicing and gene expression timing are better understood. There is clearly variation and supernumerary chromosomes are a good example, but “not especially tightly constrained” seems like a premature conclusion until more is understood.

  28. Allan Miller,

    There is something of a distinction to be drawn between something having an effect on function and that effect being the reason it is there.

    I will give you a do over on this comment. 🙂

  29. colewd,

    Maybe right, but I remain skeptical until alternative splicing and gene expression timing are better understood. There is clearly variation and supernumerary chromosomes are a good example, but “not especially tightly constrained” seems like a premature conclusion until more is understood.

    Sounds like nothing will persuade you, then. c-value varies enormously even within a genus (eg onions). How much more would need to be understood about genomes before my assertion (based on genomic evidence) was acceptable to you? If you think I am wrong, and c value is tightly constrained, then you implicitly think two species of onion cannot evolve by simple genomic amendments from a common ancestor. Which you are entitled to think, but what could possibly persuade you otherwise?

    Even 2 people vary by substantially more than the length of 1 intron. And, they vary in their introns. More, indeed, than in their SNPs, contrary to the naive ‘Gospel of Darwin’ idea coming from the local Creationist/ID fraternity that only micro-stepwise change is properly within the purview of evolutionary theory.

  30. colewd: Here are 4 to start.
    J Steroid Biochem Mol Biol. 2010 Jul; 121(1-2): 130–135. Published online 2010 Mar 7. doi: 10.1016/j.jsbmb.2010.02.036

    Published in Advance August 24, 2010, doi: 10.1101/gr.107920.110 Genome Res. 2010. 20: 1352-1360

    Cancers (Basel). 2013 Dec; 5(4): 1242–1260. PMCID: PMC3875938 Published online 2013 Oct 21. doi: 10.3390/cancers5041242

    Organogenesis. 2015;11(3):95-104. doi: 10.1080/15476278.2015.1086052.

    I don’t see anyone there that’s authored two or more of those papers. Am I missing something obvious?

  31. dazz,

    The subject matter is cell cycle proteins, vitamin D and beta-catenin among other proteins that can affect cell cycle control. Among approximately 50 papers I have on file on this subject there are common authors. The thesis is the effect that blood levels of vitamin D have on cell biochemistry, cell adhesion, stem cell production and the cell cycle. BTW make sure you get out in the sun :-) as it is the most efficient source of vitamin D for humans.

  32. colewd:
    dazz,

    The subject matter is cell cycle proteins, vitamin D and beta-catenin among other proteins that can affect cell cycle control. Among approximately 50 papers I have on file on this subject there are common authors. The thesis is the effect that blood levels of vitamin D have on cell biochemistry, cell adhesion, stem cell production and the cell cycle. BTW make sure you get out in the sun :-) as it is the most efficient source of vitamin D for humans.

    OK, so you haven’t authored any of those papers, I thought that’s what you meant. Thanks for clarifying

  33. Allan Miller,

    Thanks for the paper on SV indels. Need to think about this some before any comment. If this is creating more genome variation than SNPs, how is it that human genomes only vary by +-.5%?

  34. So let’s look at the 90% junk claim.

    Introns are about 25% of the genome. They are considered junk. If introns are functional that figure of 90% goes down to 73% – 25% introns + 2% coding is 27% functional, therefore 73% at most is junk.
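
    The arithmetic in that paragraph can be sketched as follows. It takes the comment’s own premise at face value, namely that every intronic base counts as functional, which is exactly the point disputed elsewhere in this thread. The function name is made up for illustration.

```python
INTRON_PCT = 25  # introns, as stated above
CODING_PCT = 2   # protein-coding sequence, as stated above


def max_junk_pct(functional_pcts: list[int]) -> int:
    """Upper bound on junk if the listed fractions are entirely functional
    and do not overlap one another."""
    return 100 - sum(functional_pcts)


# Premise under discussion: all intronic sequence is functional.
print(max_junk_pct([INTRON_PCT, CODING_PCT]))
```

    If only part of each intron's length is needed for its function, the functional share shrinks and the bound moves back up accordingly.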

    Here is a paper on introns and their function.
    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3325483/

    That paper makes me think introns actually have function. Why would you fault anyone, after reading that paper, if they go away believing introns have function?

    You say, “they’re extrapolating”. I could say the same of people who extrapolate and insist from their limited data set that 90% of the genome is junk.

    Because we DO understand most of it and understand that it IS junk. We probably understand over 90% of the genome.

    You have been misinformed if you think junk-DNA is based on an argument from ignorance.

    Am I mistaken to think you’d be really disappointed if medical researchers find more and more and more function in the genome?

    Will it bother you if 27% of the genome is functional? How about 50%, 60%, etc.

  35. colewd,

    If this is creating more genome variation than SNPs, how is it that human genomes only vary by +-.5%?

    +/- 0.5% on a 3,200 Mb genome is +/- 16 million bases per haploid genome …

  36. colewd:
    Allan Miller,

    Thanks for the paper on SV indels. Need to think about this some before any comment. If this is creating more genome variation than SNPs, how is it that human genomes only vary by +-.5%?

    If you look at a single human genome and compare it to the reference genome, they will differ at roughly 0.5% of bases on average (I’m not sure if this applies to African genomes, as they tend to be more diverse). This is not a particularly meaningful estimate if you want to look at population- or species-level diversity, however, since it greatly underestimates the amount of variation there is floating around. In any case, a 0.5% difference in a 3.2 billion base pair genome is still 16,000,000 nucleotide positions. It is easy to see why copy number variants (CNVs) can contribute more to total variation than SNPs, since the latter only involve a single base while CNVs typically involve many bases.
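
    For anyone who wants to check that figure, a quick sketch using only the numbers quoted above. The function name is hypothetical.

```python
GENOME_SIZE_BP = 3_200_000_000  # ~3.2 billion bp, as stated above
DIFF_FRACTION = 0.005           # ~0.5% of bases differing from the reference


def differing_positions(genome_size_bp: int, fraction: float) -> int:
    """Number of base positions implied by a fractional difference."""
    return round(genome_size_bp * fraction)


print(f"{differing_positions(GENOME_SIZE_BP, DIFF_FRACTION):,} positions")
```

    A small-sounding percentage is still a very large absolute number of bases on a genome of this size.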

  37. FIRST, nearly all of the genome is understood.

    Define what you mean by understood. We most certainly don’t know what a lot of those untranslated RNA transcripts do exactly, but here is a suspicion floating around those who are studying the question:

    http://www.nature.com/nrg/journal/v16/n2/full/nrg3863.html

    RNA-mediated epigenetic regulation of gene expression

    Diverse classes of RNA, ranging from small to long non-coding RNAs, have emerged as key regulators of gene expression, genome stability and defence against foreign genetic elements. Small RNAs modify chromatin structure and silence transcription by guiding Argonaute-containing complexes to complementary nascent RNA scaffolds and then mediating the recruitment of histone and DNA methyltransferases. In addition, recent advances suggest that chromatin-associated long non-coding RNA scaffolds also recruit chromatin-modifying complexes independently of small RNAs. These co-transcriptional silencing mechanisms form powerful RNA surveillance systems that detect and silence inappropriate transcription events, and provide a memory of these events via self-reinforcing epigenetic loops.

    Do we understand all the details of the epigenome? No. Much less do we understand how the RNA transcripts affect the epigenome. So we know very little.

  38. stcordova,

    That paper makes me think introns actually have function. Why would you fault anyone, after reading that paper, if they go away believing introns have function?

    I have already, twice in this thread, and previously as well, discussed the distinction between the presence of an intron being functional and its entire length being necessary for that function. Even if every single intronic gap is functional, that does not mean an extra 25% of the genome is functional.

    You say, “they’re extrapolating”. I could say the same of people who extrapolate and insist from their limited data set that 90% of the genome is junk.

    That is not from extrapolation. It was initially a reasonable surmise of Ohno’s based upon genetic load arguments (an argument you are delighted to pursue when it suits you). Since then, the genome has been characterised. Half of it is dead transposon. That’s not extrapolating from one transposon. That’s counting dead transposons.

  39. stcordova,

    Will it bother you if 27% of the genome is functional? How about 50%, 60%, etc.?

    How about you? What if only 10% really is functional? Why does it matter? Would you be disappointed to find we aren’t in danger of genetic meltdown?

  40. well, discussed the distinction between the presence of an intron being functional and its entire length being necessary for that function

    Sounds like the idea that a small strand in a rope can be dispensed with therefore not all the strands that make the rope serve a function — they are un-needed extras.

  41. stcordova: Sounds like the idea that a small strand in a rope can be dispensed with therefore not all the strands that make the rope serve a function — they are un-needed extras.

    Let’s say you have an intron 1,000 nucleotides long whose presence is important for facilitating the formation of multiple functional isoforms of a particular protein. Then an Alu insertion in the middle of that intron extends its length by 350 nucleotides. Are these additional 350 nucleotides functional?

  42. Dave wrote:

    Let’s say you have an intron 1,000 nucleotides long whose presence is important for facilitating the formation of multiple functional isoforms of a particular protein. Then an Alu insertion in the middle of that intron extends its length by 350 nucleotides. Are these additional 350 nucleotides functional?

    The intron paper suggests intron length has functional significance. Here is the importance of length:

    Sequence-dependent functions are mediated by sequence elements within the intron; length-dependent functions are mediated by the length of the intron, regardless of its nucleotide content; position-dependent functions are mediated by the position of the intron with respect to the exons; and splicing-dependent functions are mediated by the mere fact that splicing had occurred during the maturation of the mRNA.

    measured an elongation rate of RNAP II along this gene of about 1.1 kb min−1, suggesting that it is the introns transcription time alone that underlies this delay.

    It is a known theoretical result that negative feedback loops with a time delay may end up in oscillatory behavior. This was demonstrated in an artificial setup by engineering gene networks with time delays, and obtaining expression pulses whose cycle depended on the intron length (Swinburne et al., 2008). But it was also shown in physiological transcripts. The gene Hes7 is cyclically expressed in the presomitic mesoderm and regulates the somite segmentation. It had been recently shown that introns within the mouse Hes7 cause a 19-min delay in transcription, and that without this delay (i.e., if the introns are removed) the oscillations disappear and Hes7 is expressed steadily, leading to severe segmentation defects (Takashima et al., 2011). As expected from a length-dependent intronic function, the total length of all introns in Hes7 was found to be highly conserved across the eukaryotic domain (Seoighe and Korir, 2011). Large-scale analysis of additional 1875 genes identified at least 10 more genes whose total intron length is conserved much more than expected, suggesting a similar role in time delays (Seoighe and Korir, 2011). Interestingly, many of these genes are related to developmental processes, in which negative feedback time delay loops are expected to play an important role (Swinburne and Silver, 2008).
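    The claim that a delayed negative feedback loop can oscillate, while the same loop without the delay settles to a steady level, can be illustrated with a toy simulation. This is only a minimal sketch and not the model used by Takashima et al. or Swinburne et al.: the equation (a Hill-type repression term minus first-order decay), every parameter value, and the function names `simulate` and `late_amplitude` are assumptions made up for illustration, and time is in arbitrary units rather than minutes.

```python
def simulate(delay, t_end=100.0, dt=0.01, beta=2.0, gamma=1.0, hill=4, x0=0.5):
    """Euler integration of dx/dt = beta / (1 + x(t - delay)**hill) - gamma * x(t).

    All parameters are illustrative assumptions, not values fitted to Hes7.
    The product represses its own synthesis, but only 'sees' its level
    from `delay` time units ago (standing in for intron transcription time).
    """
    n_steps = int(round(t_end / dt))
    lag = int(round(delay / dt))
    xs = [x0] * (lag + 1)  # constant history for t <= 0
    for _ in range(n_steps):
        x_now = xs[-1]
        x_past = xs[-1 - lag]  # level `delay` time units ago
        production = beta / (1.0 + x_past ** hill)
        xs.append(x_now + dt * (production - gamma * x_now))
    return xs[lag:]  # trajectory from t = 0 onward


def late_amplitude(xs, fraction=0.3):
    """Peak-to-trough range over the final `fraction` of the trajectory."""
    tail = xs[int(len(xs) * (1.0 - fraction)):]
    return max(tail) - min(tail)


if __name__ == "__main__":
    for delay in (0.0, 3.0):
        amp = late_amplitude(simulate(delay))
        print(f"delay={delay}: late-time amplitude = {amp:.4f}")
```

    In this toy setup the run with no delay relaxes to a constant level, while the run with a delay keeps cycling, which mirrors the qualitative point in the quoted passage: removing the delay removes the oscillation. It says nothing about how much of any real intron's sequence matters, only that a delay of the right size can.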

    Alu’s are typically around 300 bp, not 350, and that is important.

    Nucleosome beads require a length of approximately 147 base pairs—

    147 × 2 = 294 ≈ 300

    there has to be some padding to achieve that number, plus the nucleosome histones provide important function in regulating chromatin accessibility which regulates gene transcription.

    And finally:

    http://www.hindawi.com/journals/scientifica/2012/545328/

    Alu Mobile Elements: From Junk DNA to Genomic Gems

    Alus, the short interspersed repeated sequences (SINEs), are retrotransposons that litter the human genomes and have long been considered junk DNA. However, recent findings that these mobile elements are transcribed, both as distinct RNA polymerase III transcripts and as a part of RNA polymerase II transcripts, suggest biological functions and refute the notion that Alus are biologically unimportant. Indeed, Alu RNAs have been shown to control mRNA processing at several levels, to have complex regulatory functions such as transcriptional repression and modulating alternative splicing and to cause a host of human genetic diseases. Alu RNAs embedded in Pol II transcripts can promote evolution and proteome diversity, which further indicates that these mobile retroelements are in fact genomic gems rather than genomic junks.

    The length of the intron has some fine tuning, thus each nucleotide is under some functional constraint in that sense.

  43. stcordova,

    That’s all great – and thanks for the correction on the length of Alu elements – but how about answering the actual question: If the 1000 bp intron tolerates the 300 bp insertion (i.e., the insertion does not result in negative fitness effects), is the extra 300 bp now functional? Given that some Alus are still actively transposing, this is probably an event that occurs with some regularity, so it’s not just a thought experiment per se.

  44. If the 1000 bp intron tolerates the 300 bp insertion (i.e., the insertion does not result in negative fitness effects), is the extra 300 bp now functional? Given that some Alus are still actively transposing, this is probably an event that occurs with some regularity, so it’s not just a thought experiment per se.

    I don’t know, but we do know many of the new ones in somatic cells are damaging. It would appear new Alu’s that appear in the present day are possibly not good.

    If one assumes evolution, then the inference that the Alu’s are junk seems to follow quite well from that assumption. If however we find widespread function in existing Alu’s and then find damage from newly inserted Alu’s this would be a paradox, but I still think existence of such paradoxes shouldn’t be immediate grounds for rejecting function of Alu’s any more than we should reject ENCODE’s claims because it poses paradoxes for traditional evolutionary theories.

  45. stcordova
    If one assumes evolution, then the inference that the Alu’s are junk seems to follow quite well from that assumption.

    Science doesn’t “assume” evolution, Sal. That evolution occurs and has occurred over the last 3.5+ billion years is a verified scientific fact. Deal with reality for a change.

  46. colewd,

    ENCODE is considered medical science mainstream. The NIH has thrown 288 million dollars at the ENCODE project and another 188 million at its sister project Roadmap. That’s almost half a billion dollars wagered on the hypothesis the genome has lots of function.

    You may feel a tad alone among the TSZ regulars, but you are not alone as far as the medical researchers who study non-coding DNA are concerned.
