Mung has drawn our attention to a post by Kirk Durston at ENV. This is my initial reaction to his method for estimating the likelihood of generating a protein with AA permease (amino acid membrane transport) capability.
Durston: “Hazen’s equation has two unknowns for protein families: I(Ex) and M(Ex). However, I have published a method to solve for a minimum value of I(Ex) using actual data from the Protein Family database (Pfam), …”
Translation: I have published a method to solve for a minimum value of I(Ex) among proteins that presently exist.
I downloaded 16,267 sequences from Pfam for the AA permease protein family. After stripping out the duplicates, 11,056 unique sequences for AA Permease remained.
Translation: I took some proteins that actually exist. I implicitly assume that they are a representative, unbiased sample of all the AA permeases that could exist.
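To see why the choice of sample matters, here is a toy sketch. It assumes the estimate is built from per-site variability in an alignment, scoring each column as log2(20) minus its Shannon entropy and summing over columns. That is an assumption about the general shape of such a method, not a reproduction of Durston's procedure; the function name `functional_bits` and the four-residue sequences are invented for illustration.

```python
import math
from collections import Counter

def functional_bits(alignment, alphabet_size=20):
    """Toy estimate: sum over columns of log2(alphabet_size) - H(column).

    `alignment` is a list of equal-length strings. Gaps and sequence
    weighting are ignored; this is an illustrative assumption only.
    """
    total = 0.0
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment)
        n = sum(counts.values())
        # Shannon entropy of the residues observed in this column
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += math.log2(alphabet_size) - entropy
    return total

# Hypothetical data: a sample of close relatives versus a more varied one.
narrow_sample = ["MKVL", "MKVL", "MKIL", "MKVL"]
broader_sample = narrow_sample + ["MRAL", "LKTF", "MQSI", "VKGL"]

print(f"narrow sample:  {functional_bits(narrow_sample):.2f} bits")
print(f"broader sample: {functional_bits(broader_sample):.2f} bits")
```

In this toy case the sample of close relatives scores more bits than the more varied one, simply because it shows less variation per site. Any estimate of this kind can only register the variation that happens to be in the sample.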
the results showed that a minimum of 466 [the 433 he plugs in later is the exponent of N, so it looks like the number of sites rather than a bit count; 2^-466 is about 10^-140, which matches his final figure] bits of functional information are required to code for AA permease.
Translation: the results show that the smallest number of bits in this minuscule and biased sample of the entire space is 466.
Using Hazen’s equation to solve for M(Ex), we find that M(Ex)/N is less than 10^-140 where N = 20^433.
Translation: starting from my extremely tiny sample of protein space, multiplying up any distortions (e.g. those due to common origin or evolution) and ignoring redundancy, modularity, exaptation, site-specific variations in constraint and the possibility of anything more economically specified than an existing protein, the chance of hitting a 466-bit AA permease by a mechanism not actually known in biology is – ta-dah! – 1 in 10^140.
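The arithmetic behind that last step can be checked directly. This is a minimal sketch, assuming Hazen's equation takes the form I(Ex) = -log2(M(Ex)/N), which the quoted passages do not spell out. The function names are purely illustrative.

```python
import math

def fraction_log10(bits):
    """log10 of M(Ex)/N implied by I(Ex) = -log2(M(Ex)/N) (assumed form)."""
    return -bits * math.log10(2)

def bits_from_log10_fraction(log10_fraction):
    """Bits of functional information implied by a given log10(M(Ex)/N)."""
    return -log10_fraction / math.log10(2)

# Compare the two candidate bit counts that appear in the quoted text.
for bits in (433, 466):
    print(f"{bits} bits -> M(Ex)/N ~ 10^{fraction_log10(bits):.1f}")

# Work backwards from the quoted bound of 10^-140.
print(f"10^-140 -> {bits_from_log10_fraction(-140):.1f} bits")

# Size of the space of all 433-residue sequences, expressed in bits.
print(f"log2(20^433) = {433 * math.log2(20):.1f} bits")
```

Under that assumption, a fraction of 10^-140 corresponds to roughly 465 bits, so "less than 10^-140" lines up with 466 bits. 433 bits would give only about 10^-130. N = 20^433 is the number of possible 433-residue sequences, which suggests 433 is a sequence length rather than a bit count.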