German math problem 
[Nov. 20th, 201011:39 pm]
Greg

[  Tags    math  ] 
[  Current Mood 
  geeky  ] 
As I mentioned during the Germany trip recap, visiting Allianz Arena inspired a math problem. Here it is!
Allianz Arena is home to two German soccer teams  Bayern München (hereafter "B") and TSV 1860 München (hereafter "T"). Let's assume there are 2n weeks in the season, and one game per week, and half a team's games are at home and half on the road. Presumably B and T would have to work out who gets to play at home which week, but let's say they didn't and just randomly picked which weeks they play where. What is the expected value of the number of "conflict weeks"  i.e. both B and T are scheduled to play at home?
My first step was to try this out with n=1. Then you get the following possibilities (here I just show just the weeks the team plays at home):
Week 1  TB  T  B  
Week 2   B  T  TB 
# conflicts  1  0  0  1 

So the expected value of the number of conflicts is .5 for n=1.
Now, I took a guess at what the expected value was in general. As a rough guess, each team has to choose n weeks, so you might expected about half of them to overlap. But it seemed to me that once you do have a conflict week, the chances of another one would go down (since you've "used up" one game from both teams), so I thought it might be a bit less than n/2.
So  onwards to discovering the answer! It seemed fairly easy to write this out as a recurrence relation with the following parameters:
c(w,b,t) = expected number of conflict weeks if there are w weeks in the season, b home games for B left, t home games for T left. Base cases: c(w,b,0) = c(w,0,t) = 0 [no games left for one team means no conflicts] c(w,w,t) = t and c(w,b,w) = b [if one team has to play all their games in the remaining weeks, however many the other team has left is the number of conflicts]
And we write the induction step by looking at the first week: c(w,b,t) for 0 < b,t < w is:<td></td>
(wb)/w * (wt)/w * c(w1,b,t)  [chance that b and t don't play a game this week times what happens for the rest of the season] 
+ b/w * (wt)/w * c(w1,b1,t)  [chance b plays a game this week and t doesn't] 
+ (wb)/w * t/w * c(w1,b,t1)  [chance t plays a game this week and b doesn't] 
+ b/w * t/w * (1 + c(w1,b1,t1))  [chance both play a game this week, so the expected value goes up by 1] 
This looked correct, but was a little discouraging because it seemed very not obvious what an explicit formula would be. But it was easy enough to code a quick Python script to get a hint to what an answer would be.
The results were highly suggestive: c(2,1,1) = .5 (this is the case we worked out by hand above), c(4,2,2) = 1, c(6,3,3) = 1.5, c(8,4,4) = 2, c(10,5,5) = 2.5, etc. So it sure looked like c(2n,n,n)=n/2.
But I really wanted to prove it, but dealing with this kinda ugly recurrence relation didn't seem like the way to go. Also, this seemed like a combinatorial problem and it seemed like there should be a nice combinatorial way to express c(2n,n,n).
After some thought I came up with a good way of thinking about it. Let's say without loss of generality that B's home games are the first n of the season. Then you're just choosing T's home games and seeing how many overlap with the first n. This leads fairly naturally to the explicit formula: c(2n,n,n) = (sum_{i=0}^n [(n choose i) * (n choose (ni)) * i])/(2n choose n) Here i represents the number of conflicts, and the chance of getting that many conflicts is the same as choosing i games in the first n weeks of the season (which will be conflicts), and the remaining ni in the second n weeks of the season. In total, you're choosing n weeks out of 2n, which is where we get the denominator from.
This didn't seem like a huge improvement at first, but we can actually simplify this a lot. First of all, (n choose i) = (n choose (ni)), so we get c(2n,n,n) = (sum_{i=0}^n [(n choose i)^2 * i])/(2n choose n) Now, notice that (sum_{i=0}^n [(n choose i)^2]) = (2n choose n)). The right side is the number of ways of counting choosing n objects out of 2n choices, and the left side is the same  first you choose which i are in the first half, and then which (ni) are in the second half (since (n choose i)=(n choose (ni)). So this will help out in a minute. Now let's consider two cases: Case 1: n is odd, i.e. n=2k+1 for some integer k. Let's just look at the numerator of c(2n,n,n) and split it up in half to get sum_{i=0}^k [(n choose i)^2 * i] + sum_{j=k+1}^n [(n choose j)^2 * j] Here we can pair up i's and j's that sum to n  since 0+n=n and k+(k+1)=2k+1=n, this covers all of them. Since i+j=n, then (n choose i)=(n choose j), so we'll end up with sum_{i=0}^k [(n choose i)^2 * i + (n choose i)^2 * (ni)], or sum_{i=0}^k [(n choose i)^2 * n] Now we can unsplit this back into the i's and j's, giving n/2 to each side to get sum_{i=0}^k [(n choose i)^2 * (n/2)] + sum_{j=k+1}^n [(n choose j)^2 * (n/2)] which simplifies down to sum_{i=0}^n [(n choose i)^2 * (n/2)] (n/2) * sum_{i=0}^n [(n choose i)^2] Since we showed above that this sum is equal to (2n choose n), this means that c(2n,n,n) = ((n/2) * (2n choose n))/(2n choose n) = n/2, as desired! Case 2: n is even, i.e. n=2k for some integer k. This is the same as the previous case, except we have the additional value i=k which doesn't pair up with anything. But, the value that we're summing for i=k is ((n choose k)^2 * k) and k does equal n/2, so it reduces the same way!
In short, I love combinatorics! 

