Definitions
Common abbreviations:
- ExpGF - Expected Goals For (usually for a single game)
- ExpGFx - Team x's Expected Goals For (usually for a single game)
- GA - Goals Against
- GAx - Team x's Goals Against
- GF - Goals For
- GFx - Team x's Goals For
- GP - Games Played
- GPx - Team x's Games Played
- PP - Percentage of total possible points. Multiply by 3 to obtain the predicted points per game. This is used only for forecasted or adjusted values, never for actual standings.
- PTSx - The number of points earned by team x (3 * wins + 1 * draws)
Common functions:
- (x)^(y) - x raised to the power of y.
- SQRT(x) - The square root of x. Equivalent to (x)^(0.5).
- x..y - Evaluate the function for all integers from x to y inclusive. This is typically used when all the results will be summed or multiplied. We use this notation because the summation symbol is hard to render online without using images.
- (x)! - The factorial of x. Note that (0)! = 1.
- e^(x) - The mathematical constant e (~2.71828) raised to the power of x.
- PPMF(x, ExpGF) - The Poisson probability mass function. This calculates the chance that exactly x goals will be scored given an ExpGF. This function is equivalent to: ((ExpGF)^(x))*(e^(-(ExpGF)))/(x)!
- PCDF(x, ExpGF) - The Poisson cumulative distribution function. This calculates the chance that up to and including x goals will be scored given an ExpGF. This function is equivalent to the sum of PPMF((0..x), ExpGF).
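As a minimal sketch, the two Poisson functions can be written directly from their definitions. The Python names ppmf and pcdf are illustrative choices, not part of the original notation.

```python
import math

def ppmf(x, exp_gf):
    """Chance of exactly x goals given an expected-goals value (Poisson PMF)."""
    return (exp_gf ** x) * math.exp(-exp_gf) / math.factorial(x)

def pcdf(x, exp_gf):
    """Chance of x goals or fewer: the sum of ppmf(0..x, exp_gf)."""
    return sum(ppmf(i, exp_gf) for i in range(x + 1))
```

For example, ppmf(0, 1.5) is the chance that a team expected to score 1.5 goals is shut out, which equals e^(-1.5).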
Basic Model
Model as of 7/1/11
Assumption: In modern leagues, it is extremely unlikely for a team to score as many as 10 goals in a game, so the sums below stop at 10. This cutoff can be increased if needed, but that is unlikely to ever be necessary.
- Calculate the expected goals for each team using the following formula: ExpGFx = (GFx + GAy)/(GPx + GPy), where team y is team x's opponent.
- Calculate the sum of (PPMF(i,ExpGFx)*PPMF(i,ExpGFy)) where i=0..10. This is the chance of a tie.
- Calculate the sum of (PPMF(i, ExpGFy)*(1-PCDF(i, ExpGFx))) where i=0..9. Each term is the chance that team y scores exactly i goals and team x scores more than i, so the sum is the chance that team x will win. Do this for both teams to get each team's probability.
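The three steps can be sketched in Python as follows. The function names and the MAX_GOALS constant are illustrative assumptions. The win sum here weights each term by the chance that the opponent scores exactly i goals (PPMF), which is what makes the win, tie, and loss chances add up to 1 apart from the negligible mass beyond the 10-goal cutoff.

```python
import math

MAX_GOALS = 10  # cutoff from the assumption stated for this model

def ppmf(x, exp_gf):
    return (exp_gf ** x) * math.exp(-exp_gf) / math.factorial(x)

def pcdf(x, exp_gf):
    return sum(ppmf(i, exp_gf) for i in range(x + 1))

def expected_goals(gf_x, ga_y, gp_x, gp_y):
    # Step 1: ExpGFx = (GFx + GAy) / (GPx + GPy)
    return (gf_x + ga_y) / (gp_x + gp_y)

def tie_probability(exp_x, exp_y):
    # Step 2: both teams score exactly i goals, i = 0..10
    return sum(ppmf(i, exp_x) * ppmf(i, exp_y) for i in range(MAX_GOALS + 1))

def win_probability(exp_x, exp_y):
    # Step 3: team y scores exactly i goals, team x scores more than i, i = 0..9
    return sum(ppmf(i, exp_y) * (1 - pcdf(i, exp_x)) for i in range(MAX_GOALS))

# Made-up season totals for two hypothetical teams x and y
exp_x = expected_goals(gf_x=30, ga_y=28, gp_x=20, gp_y=20)
exp_y = expected_goals(gf_x=22, ga_y=20, gp_x=20, gp_y=20)
win_x = win_probability(exp_x, exp_y)
win_y = win_probability(exp_y, exp_x)
tie = tie_probability(exp_x, exp_y)
```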
Home Field Advantage Model
Model as of 7/1/11
- Calculate the expected goals for each team using step 1 of the Basic Model
- Calculate the adjustment factor, using the following formula: Adj = SQRT(LeagueGF/LeagueGA)
- For the home team, adjust the ExpGF by using the following formula: AdjExpGFx = Adj * ExpGFx
- For the away team, adjust the ExpGF by using the following formula: AdjExpGFy = ExpGFy / Adj
- Use the adjusted ExpGFx and ExpGFy to calculate the probabilities using steps 2-3 of the Basic Model
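A minimal sketch of the adjustment, assuming LeagueGF and LeagueGA are supplied as inputs (this page does not spell out how they are tallied). The function names and the sample numbers are hypothetical.

```python
import math

def home_field_adjustment(league_gf, league_ga):
    # Adj = SQRT(LeagueGF / LeagueGA)
    return math.sqrt(league_gf / league_ga)

def adjust_expected_goals(exp_home, exp_away, adj):
    # The home team's ExpGF is multiplied by Adj, the away team's is divided by it
    return exp_home * adj, exp_away / adj

# Made-up league totals and expected goals, for illustration only
adj = home_field_adjustment(league_gf=484, league_ga=400)
adj_home, adj_away = adjust_expected_goals(1.5, 1.2, adj)
```

Because one value is multiplied and the other divided by the same factor, the product of the two expected-goals values is unchanged; only the balance between home and away shifts.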
US Open Cup/Knockout Tournament Basic Model
Model as of 7/7/11
1. Calculate the Basic Model using the GF, GA, and GP from the regular season.
2. Take each team's expected goals from step 1 of the Basic Model and divide by three to get the expected goals for extra time (AET).
3. Calculate step 3 of the Basic Model for each team using the extra-time values from the previous step, and multiply the result by the tie probability from step 2 of the original Basic Model. Add this to each team's win probability from step 3 of the original Basic Model.
4. Calculate step 2 of the Basic Model using the extra-time values from step 2 of this model, and multiply the result by the tie probability from step 2 of the original Basic Model. This is the chance that the game is still tied after extra time.
5. Divide the result from step 4 by two, and add that value to each team's win probability.
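The five steps can be sketched as one function. This is an illustrative reading of the steps, not a definitive implementation: the helper names are assumptions, and the win sum weights each term by the chance that the opponent scores exactly i goals so that the outcomes add up to 1.

```python
import math

MAX_GOALS = 10

def ppmf(x, exp_gf):
    return (exp_gf ** x) * math.exp(-exp_gf) / math.factorial(x)

def pcdf(x, exp_gf):
    return sum(ppmf(i, exp_gf) for i in range(x + 1))

def tie_probability(exp_x, exp_y):
    return sum(ppmf(i, exp_x) * ppmf(i, exp_y) for i in range(MAX_GOALS + 1))

def win_probability(exp_x, exp_y):
    return sum(ppmf(i, exp_y) * (1 - pcdf(i, exp_x)) for i in range(MAX_GOALS))

def knockout_probabilities(exp_x, exp_y):
    # Step 1: regulation-time results from the Basic Model
    tie_reg = tie_probability(exp_x, exp_y)
    win_x = win_probability(exp_x, exp_y)
    win_y = win_probability(exp_y, exp_x)
    # Step 2: expected goals for extra time are one third of the regulation values
    et_x, et_y = exp_x / 3, exp_y / 3
    # Step 3: a win in extra time only counts if regulation ended in a tie
    win_x += tie_reg * win_probability(et_x, et_y)
    win_y += tie_reg * win_probability(et_y, et_x)
    # Step 4: chance that the game is still tied after extra time
    still_tied = tie_reg * tie_probability(et_x, et_y)
    # Step 5: split the remaining chance evenly between the two teams
    return win_x + still_tied / 2, win_y + still_tied / 2

# Made-up expected goals for two hypothetical teams
advance_x, advance_y = knockout_probabilities(1.45, 1.05)
```

Since a knockout game must produce a winner, the two returned values should add up to 1 apart from the negligible truncation at 10 goals.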
Sporting KC/Richmond Kickers Hack Formula
1. Calculate the GF, GA, and GP for the MLS teams in Round 3 of the USOC. Use these numbers for Sporting KC in the Knockout Basic Model.
2. Calculate the GF, GA, and GP for the non-MLS teams in Round 3 of the USOC. Use these numbers for Richmond in the Knockout Basic Model.
3. Calculate the results based on the Knockout Basic Model.
EAP
Model as of 7/22/11
1. Calculate the expected goals for: ExpGFx = GFx/GPx
2. Calculate the expected goals against: ExpGFy = GAx/GPx
3. Use those two values for steps 2 and 3 of the Basic Model.
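A rough sketch of these steps, using only a team's own goals for and against. The function names are hypothetical. The final conversion to points per game (3 for a win, 1 for a draw) is an assumption added for illustration and is not stated in the steps above.

```python
import math

MAX_GOALS = 10

def ppmf(x, exp_gf):
    return (exp_gf ** x) * math.exp(-exp_gf) / math.factorial(x)

def pcdf(x, exp_gf):
    return sum(ppmf(i, exp_gf) for i in range(x + 1))

def eap_probabilities(gf, ga, gp):
    exp_for = gf / gp       # step 1: ExpGFx = GFx / GPx
    exp_against = ga / gp   # step 2: ExpGFy = GAx / GPx
    # step 3: steps 2 and 3 of the Basic Model with those two values
    tie = sum(ppmf(i, exp_for) * ppmf(i, exp_against)
              for i in range(MAX_GOALS + 1))
    win = sum(ppmf(i, exp_against) * (1 - pcdf(i, exp_for))
              for i in range(MAX_GOALS))
    loss = sum(ppmf(i, exp_for) * (1 - pcdf(i, exp_against))
               for i in range(MAX_GOALS))
    return win, tie, loss

def points_per_game(win, tie):
    # ASSUMPTION: 3 points per win and 1 per draw, matching the PTSx definition
    return 3 * win + tie

# Made-up totals for a hypothetical team
win, tie, loss = eap_probabilities(gf=40, ga=20, gp=20)
```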
Pythagorean
Model as of 7/22/11
PP = GF^2 / (GF^2 + GA^2)
alternatively (if GF != 0):
PP = 1 / (1 + (GA/GF)^2).
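As a quick illustration, both forms can be written as one-line functions. The function names are hypothetical; the second form requires GF to be nonzero, as noted above.

```python
def pythagorean_pp(gf, ga):
    # PP = GF^2 / (GF^2 + GA^2)
    return gf ** 2 / (gf ** 2 + ga ** 2)

def pythagorean_pp_ratio(gf, ga):
    # Alternative form, valid only when gf != 0: PP = 1 / (1 + (GA/GF)^2)
    return 1 / (1 + (ga / gf) ** 2)
```

For a team with GF = 40 and GA = 20, the first form gives 1600 / 2000 = 0.8, and multiplying by 3 gives a forecast of 2.4 points per game.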
Pythagenpat
Model as of 7/22/11
k = league-wide constant. It is fitted empirically by adjusting it to minimize the mean squared residual between PP and ActPPG/3 (actual points per game divided by 3)
exp = ((GFx + GAx) / GPx) ^ k
PP = 1 / (1 + (GAx/GFx)^exp)
In baseball, the constant k is set to 0.29
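A minimal sketch of the formula and of one simple way to pick k, assuming GF:GA denotes the ratio GF/GA. The grid search in fit_k is only an illustration of reducing the mean squared residual; the function names, the candidate values, and the sample teams are all made up.

```python
def pythagenpat_pp(gf, ga, gp, k):
    # exp = ((GF + GA) / GP) ^ k
    exponent = ((gf + ga) / gp) ** k
    # PP = 1 / (1 + (GA / GF) ^ exp); requires gf != 0
    return 1 / (1 + (ga / gf) ** exponent)

def fit_k(teams, candidates):
    """Pick the candidate k with the smallest mean squared residual.

    teams is a list of (gf, ga, gp, actual_ppg) tuples.
    """
    def mean_squared_residual(k):
        return sum((pythagenpat_pp(gf, ga, gp, k) - ppg / 3) ** 2
                   for gf, ga, gp, ppg in teams) / len(teams)
    return min(candidates, key=mean_squared_residual)

# Hypothetical league: (GF, GA, GP, actual points per game)
league = [(40, 20, 20, 2.1), (25, 35, 20, 1.0), (30, 30, 20, 1.4)]
best_k = fit_k(league, [i / 100 for i in range(0, 101)])
```

A finer grid or a proper optimizer would work equally well; the grid is used here only because it keeps the example short.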
Basic Model Bounding
Model as of 7/22/11
nx = Games yet to be played for team x
EAPx = The EAP model calculated for team x
SIGMAx = SQRT(nx*EAPx*(3-EAPx))
The season point total that the Basic Model predicts for team x has approximately a 95% chance of falling within +/- 2*SIGMAx
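The bound can be sketched as below. The function names are hypothetical, and the sample inputs (16 games remaining, an EAP of 1.5, a predicted total of 50 points) are made up for illustration.

```python
import math

def sigma(games_remaining, eap):
    # SIGMAx = SQRT(nx * EAPx * (3 - EAPx))
    return math.sqrt(games_remaining * eap * (3 - eap))

def points_bounds(predicted_points, games_remaining, eap):
    # Approximate 95% range: predicted points +/- 2 * SIGMAx
    spread = 2 * sigma(games_remaining, eap)
    return predicted_points - spread, predicted_points + spread

low, high = points_bounds(predicted_points=50, games_remaining=16, eap=1.5)
# sigma is SQRT(16 * 1.5 * 1.5) = 6, so the range is 38 to 62
```

The range narrows as the season goes on, because SIGMAx shrinks with the number of games left.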
RPI
Model as of 7/22/11
PTSxy = The number of points team x has earned against team y this season
GPxy = The number of games team x has played against team y this season
OPPGxy = (PTSy - PTSyx)/(GPy - GPyx)
AOPPGx = SUM(GPxy*OPPGxy)/GPx
OOPPGxy = AOPPGy
AOOPPGx = SUM(GPxy*OOPPGxy)/GPx
RPIx = ((PTSx/GPx) + 2*AOPPGx + AOOPPGx) / 4
In all of the sums, assume it is the sum of the calculation for all possible opposing teams y.
e.g.
Teams a, b, c, and d play in a league
The SUM(GPxy*OPPGxy) term of AOPPGa becomes SUM(GPab*OPPGab, GPac*OPPGac, GPad*OPPGad)
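A sketch of the whole calculation, assuming head-to-head results are stored in nested dictionaries: pts[x][y] is PTSxy and gp[x][y] is GPxy. The data layout, the function name, and the three-team league are illustrative assumptions. The sketch also assumes every opponent has played at least one game against someone else, otherwise OPPGxy would divide by zero.

```python
def rpi(pts, gp):
    teams = list(pts)
    total_pts = {x: sum(pts[x].values()) for x in teams}  # PTSx
    total_gp = {x: sum(gp[x].values()) for x in teams}    # GPx

    def oppg(x, y):
        # OPPGxy = (PTSy - PTSyx) / (GPy - GPyx)
        return ((total_pts[y] - pts[y].get(x, 0))
                / (total_gp[y] - gp[y].get(x, 0)))

    # AOPPGx = SUM(GPxy * OPPGxy) / GPx, over opponents y that x has faced
    aoppg = {x: sum(gp[x][y] * oppg(x, y)
                    for y in gp[x] if gp[x][y] > 0) / total_gp[x]
             for x in teams}
    # AOOPPGx = SUM(GPxy * AOPPGy) / GPx, because OOPPGxy = AOPPGy
    aooppg = {x: sum(gp[x][y] * aoppg[y] for y in gp[x]) / total_gp[x]
              for x in teams}
    # RPIx = ((PTSx / GPx) + 2 * AOPPGx + AOOPPGx) / 4
    return {x: (total_pts[x] / total_gp[x] + 2 * aoppg[x] + aooppg[x]) / 4
            for x in teams}

# Hypothetical league: a beat b, a beat c, b and c drew
pts = {"a": {"b": 3, "c": 3}, "b": {"a": 0, "c": 1}, "c": {"a": 0, "b": 1}}
gp = {"a": {"b": 1, "c": 1}, "b": {"a": 1, "c": 1}, "c": {"a": 1, "b": 1}}
ratings = rpi(pts, gp)
```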