{"id":29109,"date":"2015-11-28T11:04:13","date_gmt":"2015-11-28T11:04:13","guid":{"rendered":"http:\/\/theskepticalzone.com\/wp\/?p=29109"},"modified":"2018-03-02T04:23:19","modified_gmt":"2018-03-02T04:23:19","slug":"how-to-calculate-amino-acid-sequence-space","status":"publish","type":"post","link":"http:\/\/theskepticalzone.com\/wp\/how-to-calculate-amino-acid-sequence-space\/","title":{"rendered":"How to calculate amino acid sequence space"},"content":{"rendered":"<p>I see long-time commenter at Uncommon Descent, Mung, in a thread entitled\u00a0<em><a href=\"http:\/\/www.uncommondescent.com\/evolution\/backwards-eye-wiring-lee-spetner-comments\/#comment-590164\">Backwards eye wiring? Lee Spetner comments<\/a>,<\/em> asks:<\/p>\n<blockquote><p>How do you calculate the size of amino acid sequence space?<\/p><\/blockquote>\n<p>As this seems somewhat off-topic there, I thought I&#8217;d attempt to answer Mung&#8217;s question. I&#8217;ll try and be brief.<!--more--> The two most fascinating biochemicals are nucleic acids (RNA and DNA) and proteins. Proteins seem ubiquitous in cellular systems; they function as catalysts (enzymes), structural elements (keratin, collagen), signal molecules (hormones, pheromones), binding agents (antibodies). Proteins are linear sequences of amino acids joined by a condensation (called so because a molecule of water is lost) reaction forming a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Condensation_reaction\">peptide bond<\/a>. There are <a href=\"https:\/\/en.wikipedia.org\/wiki\/Proteinogenic_amino_acid\">twenty-one amino-acids<\/a> found in eukaryotes and twenty of them are directly represented in the genetic code. 
The special case is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Selenocysteine\">selenocysteine<\/a>, which is coded indirectly, and I&#8217;ll leave that out of the calculation for the sake of simplicity.<\/p>\n<p>So what number of different amino acid sequences could theoretically exist, given twenty possibilities for each aa in the polymer? I guess we shouldn&#8217;t count twenty monomers. For dimers, there are 400 possibilities. For trimers, we have 8,000 and so on. The general formula for the number of theoretically possible different protein sequences of length <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-a63eb5ff0272d3119fa684be6e7acce8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> is <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-ad6a57fe52ec7353a8765c95e7b19694_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#50;&#48;&#94;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"26\" style=\"vertical-align: 0px;\"\/>. So the answer for all possible sequences is the sum of this calculation from <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-de1ed12bab38e0cb542535c81b9395e1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#61;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"42\" style=\"vertical-align: 0px;\"\/> to, well, what? There are some very large proteins; <a href=\"https:\/\/en.wikipedia.org\/wiki\/Titin\">titin<\/a> is the largest known at around 30,000 aa&#8217;s. So I guess we should sum at least to that number.<\/p>\n<p>This is a very big number indeed! 
I leave it as an exercise for the reader to try representing the number that results when taking the upper limit of <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-a63eb5ff0272d3119fa684be6e7acce8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> as 30,000. \ud83d\ude42<\/p>\n<p>Now I&#8217;ve answered Mung&#8217;s question, would he like to enlarge on what it signifies?<\/p>\n<p>ETA categories and remove tautology<\/p>\n<p>ETA 2 correction <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-ad6a57fe52ec7353a8765c95e7b19694_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#50;&#48;&#94;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"26\" style=\"vertical-align: 0px;\"\/> not <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/theskepticalzone.com\/wp\/wp-content\/ql-cache\/quicklatex.com-41c1283c29e88022cdea90ccb961854e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#94;&#123;&#50;&#48;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"25\" style=\"vertical-align: 0px;\"\/> (hat tip Joe Felsenstein)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I see long-time commenter at Uncommon Descent, Mung, in a thread entitled\u00a0Backwards eye wiring? Lee Spetner comments, asks: How do you calculate the size of amino acid sequence space? 
As this seems somewhat off-topic there, I thought I&#8217;d attempt to &hellip; <a href=\"http:\/\/theskepticalzone.com\/wp\/how-to-calculate-amino-acid-sequence-space\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":12,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14,48,13],"tags":[],"class_list":["post-29109","post","type-post","status-publish","format-standard","hentry","category-information-theory","category-mathematics","category-probability-and-statistics"],"_links":{"self":[{"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/posts\/29109","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/comments?post=29109"}],"version-history":[{"count":0,"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/posts\/29109\/revisions"}],"wp:attachment":[{"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/media?parent=29109"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/categories?post=29109"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/theskepticalzone.com\/wp\/wp-json\/wp\/v2\/tags?post=29109"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}