Given the importance of information theory to some intelligent design arguments, I thought it would be nice to have a toolkit of basic functions for the sorts of calculations associated with information theory, regardless of which side of the debate one is on.
What would those functions consist of?
Over in the In Slight Defense of Granville Sewell thread there’s a discussion going on between keiths and me over the entropy of a pair of dice. A tempest in a teapot, I’m sure, with any disagreement probably revolving around his use of the Boltzmann equation and my use of the Shannon measure.
Perhaps we can have some fun and learn something as well. Here’s a programming challenge:
Write the code necessary to calculate the entropy of a pair of standard six-sided dice.
What functions should we write to do this? Please explain your code. Extra credit for writing tests to validate the code. Let me know if you have any questions.
Feel free to offer your comments on probability, statistics, and information theory. No rule against making this thread informative as well as boring.
The discussion begins around here.
According to Mung, there is no such thing as a random throw of dice, so the entropy is whatever his god wants it to be – it depends on which way his god adjusts it.
It is therefore impossible to calculate such a value, as Mung’s god can change it on a whim.
To appeal to an omnipotent deity is to allow that at any moment the regularities of nature may be ruptured, that miracles may happen, and that the entropy of a pair of dice can be changed without regard to its previous value.
Normally in such a challenge, I’d post my code first and ask others to critique it and/or provide their own to compare with.
Will you be writing code Mung? Or just asking others to?
Mike Elzinga has a great entropy concept test:
It would be interesting to see the various participants in the other thread provide their answers. I volunteer to hold them all for simultaneous release.
Patrick,
I responded on the other thread.
For some definitions of “response”. 😉
Care to take a stab at answering those questions from your “subjective knowledge” perspective?
So the first method I decided I needed was one to sum the values in a collection. It turns out that a sum method already exists in Python and there will be one in the next version of Ruby, but there isn’t one in the current version of Ruby.
A sum() method is available in Rails, and one could pull in just the needed Rails gem, but that’s a bit of overkill for this project. The method itself is simple enough to write.
But why a sum method? Well, say I want to calculate probabilities from frequencies, or make sure all my probabilities add up to 1.
For the case of a pair of dice I need a way to count the “ways”. For each total I can count the ways to roll that total and then sum all of them (a sketch that generates this tally by enumeration follows the list):
1 way to roll a 2
2 ways to roll a 3
3 ways to roll a 4
4 ways to roll a 5
5 ways to roll a 6
6 ways to roll a 7
5 ways to roll an 8
4 ways to roll a 9
3 ways to roll a 10
2 ways to roll an 11
1 way to roll a 12
-----
36 ‘ways’
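Rather than counting those by hand, here’s a rough sketch (just an illustration, not part of my solution) that generates the same tally by enumerating all 36 ordered outcomes:

# Tally the ways to roll each sum by enumerating every (die1, die2) pair.
ways = Hash.new(0)
(1..6).each do |die1|
  (1..6).each do |die2|
    ways[die1 + die2] += 1
  end
end
(2..12).each { |total| puts "#{ways[total]} ways to roll a #{total}" }
puts "#{ways.values.inject(0, :+)} ways in total"   # 36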
I’ll post my code and tests in another post.
Well, given that the entropy can be different depending on the observer, I have no fear of being wrong. Do you have the “correct” answers?
Of course you would, but you’re a far better coder than I am.
Patrick,
I responded in the other thread.
require 'minitest/autorun'

# Sum the elements of an array (no built-in Array#sum in this version of Ruby).
def sum(array)
  array.inject(0, :+)
end

class SumTest < MiniTest::Test
  def test_sum_with_empty_array
    assert_equal 0, sum([]), "summed elements of an empty array should be zero"
  end

  def test_sum_of_a_pair_of_dice
    assert_equal 36, sum([1,2,3,4,5,6,5,4,3,2,1])
  end

  def test_sum_of_probabilities
    # Rounded to two decimal places, these probabilities sum to 1.01 rather than 1.
    assert_equal 1.01, sum([0.03,0.06,0.08,0.11,0.14,0.17,0.14,0.11,0.08,0.06,0.03])
  end

  def test_sum_of_frequencies_as_probabilities
    # Floating-point rounding leaves the sum a hair above 1.0.
    s = 36.0
    assert_equal 1.0000000000000002, sum([1/s,2/s,3/s,4/s,5/s,6/s,5/s,4/s,3/s,2/s,1/s])
  end
end
Test results:
D:\projects\information_theory>ruby test_sum.rb
Run options: --seed 37497
# Running:
....
Finished in 0.002068s, 1934.2360 runs/s, 1934.2360 assertions/s.
4 runs, 4 assertions, 0 failures, 0 errors, 0 skips
D:\projects\information_theory>
Of primary interest for our dice entropy is the number of ways each sum of two dice can be obtained, and the total of those counts:
1 way to throw a 2
2 ways to throw a 3
3 ways to throw a 4
4 ways to throw a 5
5 ways to throw a 6
6 ways to throw a 7
5 ways to throw an 8
4 ways to throw a 9
3 ways to throw a 10
2 ways to throw an 11
1 way to throw a 12
-----
36 ‘ways’
And that is because, from each frequency and the total, we can calculate the probability by dividing the frequency by the total:
D:\projects\information_theory>ruby -e 'puts 4/36.0'
0.1111111111111111
And given that, we can create a probability distribution.
And from the probability distribution we can calculate the entropy.
Or so says the source I am ‘cribbing’ from. 🙂
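Here’s a small sketch of that step (just an illustration), applying the same division across the whole tally to build the distribution:

ways  = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]    # frequencies for sums 2..12
total = ways.inject(0, :+)                    # 36
distribution = ways.map { |w| w / total.to_f }
puts distribution.inject(:+)                  # 1.0, up to floating-point error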
require 'minitest/autorun'

# Shannon entropy in bits: the sum over outcomes of p * log2(1/p).
def shannon_entropy(probability_distribution)
  probability_distribution.inject(0) { |sum, px| sum + px * Math.log2(1.0/px) }
end

class EntropyTest < MiniTest::Test
  def test_entropy_from_uniform_distribution_4
    distribution = Array.new(4) { 1.0 / 4 }
    assert_equal 2, shannon_entropy(distribution)
  end

  def test_entropy_from_uniform_distribution_8
    pr = 1.0 / 8
    assert_equal 3, shannon_entropy([pr,pr,pr,pr,pr,pr,pr,pr])
  end

  def test_entropy_from_uniform_distribution_16
    distribution = Array.new(16) { 1.0 / 16 }
    assert_equal 4, shannon_entropy(distribution)
  end
end
Test results:
D:\projects\information_theory>ruby test_entropy.rb
Run options: --seed 18023
# Running:
...
Finished in 0.000000s, Inf runs/s, Inf assertions/s.
3 runs, 3 assertions, 0 failures, 0 errors, 0 skips
D:\projects\information_theory>
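A quick sanity check on those uniform-distribution expectations (just an illustration): for n equally likely outcomes the Shannon entropy reduces to log2(n), which is where the expected values of 2, 3, and 4 bits come from.

puts Math.log2(4)    # 2.0
puts Math.log2(8)    # 3.0
puts Math.log2(16)   # 4.0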
Mung, why not code your probabilities as fractions, 1/36, 2/36 etc?
Hi. Good to see you. I think I am doing that. To know the fraction I have to know the sum. I’ve chosen to calculate the sum rather than hard code it.
My next post will show the results using fractions, and the following post will show the test results. Let me know if you have any questions.
require 'minitest/autorun'

# Shannon entropy in bits, as defined in the earlier post.
def shannon_entropy(probability_distribution)
  probability_distribution.inject(0) { |sum, px| sum + px * Math.log2(1.0/px) }
end

class EntropyTest < MiniTest::Test
  def test_entropy_from_non_uniform_distribution_dice_pair
    sum = 36.0
    distribution = [1,2,3,4,5,6,5,4,3,2,1].map { |ways| ways/sum }
    assert_equal 3.27, shannon_entropy(distribution)
  end
end
Test results:
D:\projects\information_theory>ruby dice_entropy.rb
Run options: --seed 49598
# Running:
F
Finished in 0.000000s, Inf runs/s, Inf assertions/s.
1) Failure:
EntropyTest#test_entropy_from_non_uniform_distribution_dice_pair [dice_entropy.rb:13]:
Expected: 3.27
Actual: 3.2744019192887706
1 runs, 1 assertions, 1 failures, 0 errors, 0 skips
D:\projects\information_theory>
So there you have it. I’ve calculated the Shannon entropy for a pair of dice. Not only that, but I have written and provided the functions to do so (in Ruby). The code is here for everyone to see.
The value is close enough to the value of 3.27 which I provided in the In Slight Defense of Granville Sewell thread.
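For reference, a quick check (just an illustration) that prints the unrounded value:

ways  = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
probs = ways.map { |w| w / 36.0 }
puts probs.inject(0) { |h, p| h + p * Math.log2(1.0 / p) }   # => 3.2744019192887706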
Mung,
3.27 bits/symbol is the answer to the wrong problem, as I explained here.
Are you saying it fails to calculate the Shannon entropy? Or are you saying that the problem you presented cannot be solved in terms of Shannon entropy?
Is there any chance you’ll present your own coding solutions to how to calculate Shannon entropy?
Mung,
See this.
Code, keiths.
I’ve given mine.
Why would I waste time coding it, Mung? It’s the concepts that matter, and you still haven’t grasped them.
What concept does my code fail to grasp?
I explained your error already, Mung.
We’re all waiting for your code.
Sorry for the delay in replying. Life is still too interesting by half.
Your reply didn’t actually answer Mike Elzinga’s challenge from the perspective of your view on entropy.
While I still don’t have the time to properly engage in the discussion, you are mistaken about entropy being observer dependent. Whether or not one knows the specific microstate, entropy is related to the number of microstates consistent with the macrostate defined by variables such as P, V, and T. It is related to the amount of energy in the system that is no longer available to do work. That doesn’t change based on the observer.
If I have the time I’ll dig out my old thermo and stat mech textbooks. Don’t expect that to happen in the next few weeks. Consider this, as before, a pure argument from authority.
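As a toy illustration of that microstate counting (a sketch, not a thermodynamic treatment): if all 36 ordered outcomes of two distinguishable dice are treated as equally probable microstates, the corresponding entropy is log2(36) bits, whoever the observer is.

puts Math.log2(36)   # ≈ 5.1699 bits when all 36 microstates are equally probable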
I have a healthy respect for authority.
Patrick,
I responded in the other thread.
Came across this today:
Thermodynamic Probability W and Entropy