Date: Mon, 3 Nov 97 11:18 GMT
From: mackay@mrao.cam.ac.uk (David J.C. MacKay)
To: britdisc@csv.warwick.ac.uk, frisbee@mrao.cam.ac.uk
Subject: SE regs 97
------------------------------------------------------------------------
Southeast Regional Outdoor Ultimate Championships
Cambridge, November 1-2 1997
------------------------------------------------------------------------
Results:
    Team             infera rating
 1  UTI              11.2 +/- 0.2
 2  Red Shift        10.9 +/- 0.2
 3  First Touch      10.8 +/- 0.2
 4  Slowhawks        10.2 +/- 0.2
 5  Doh'hawks        10.1 +/- 0.2
 6  Strange Blue 2    9.7 +/- 0.2
 7  Skunks 1          9.5 +/- 0.2
 8  Strange Blue 1    9.2 +/- 0.2
 9  Skunks 2          8.3 +/- 0.2
Spirit of the game: Doh'hawks
------------------------------------------------------------------------
The start of the 97 regionals featured thick fog - fog thicker than
the winds of the 96 regionals were windy. But all the teams managed
occasionally to find the disc, each other, and the endzones during
their initial games. The sun burst through after an hour or two, and
the weather from then on was pleasant and gentle.
The Spirit in the tournament was really good, and I think everyone had
a good time. Skunks and Mohawks led the Halloween partying at Darwin
College, and UTI and First Touch were able to return to London for
their Halloween parties without adverse effects on their games.
More than half of the teams had a substantial number of beginners, and
I think that they really enjoyed learning from playing with the
top teams of the region.
Many thanks to everyone who came. Thank you for playing your games on time,
and for being clean and tidy guests. I think the Cambridge Rugby Club will
be willing to have us all again. Maybe a Summer tournament next year?
-------------------------------------------------------------------------
There now follow:
(A) Lost property.
(B) Discussion of the 'infera' software used to process the scores
and spit out the above ranking and ratings.
(C) Scores of all games.
(D) Photos from the tournament.
-------------------------------------------------------------------------
(A) =====================================================================
Lost property: found --
one key on chain with hospice medallion
one black hat, one pair black gloves
three shirts
one pair black tracksuit trousers
(B) =====================================================================
Discussion of Infera's ranks
==========================================================================
0) Background: Infera is a program which infers the most probable
ranking, by ability, of a set of teams based on the scores of any
games they have played. It is applicable to any tournament format.
Teams may have played different numbers of games, the games may have been
of different durations, and teams need not have been arranged
in equal strength pools.
A description of the program can be found on the web here.
http://wol.ra.phy.cam.ac.uk/ultimate/infera/
The basic idea is that the score of any one game provides information
about the relative rank of the two teams. A game does not provide
concrete information though; if two teams are close in ability, the
game might go either way. So a close win for team A over team B does
not show for certain that A is better than B; it's just more
probable. The longer the game, the more information it gives about
the teams' relative abilities. Infera uses probability theory to
figure out the most probable ranking.
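To make the point about game length concrete, here is a minimal sketch in
Python. It is NOT infera's actual model (see the web page above for that).
It assumes, purely for illustration, that each point is won independently by
team A with probability 1/(1+exp(-d)), where d is the rating difference, and
it puts a flat prior on d over a finite grid. The name prob_a_stronger and
the grid parameters are hypothetical.

```python
import math


def prob_a_stronger(points_a, points_b, half_width=5.0, steps=2001):
    """P(team A is the stronger team | score), under the ASSUMED model.

    Assumption (not taken from infera): every point is an independent
    trial won by A with probability p = 1 / (1 + exp(-d)), where d is the
    rating difference, and d has a flat prior on [-half_width, half_width].
    """
    total = 0.0     # posterior mass over the whole grid
    positive = 0.0  # posterior mass where d > 0 (A stronger)
    for i in range(steps):
        d = -half_width + 2.0 * half_width * i / (steps - 1)
        p = 1.0 / (1.0 + math.exp(-d))
        likelihood = p ** points_a * (1.0 - p) ** points_b
        total += likelihood
        if d > 0:
            positive += likelihood
        elif d == 0:
            positive += 0.5 * likelihood  # split the midpoint evenly
    return positive / total


if __name__ == "__main__":
    for score in [(6, 5), (12, 10), (13, 2)]:
        print(score, round(prob_a_stronger(*score), 3))
```

Under these assumptions a 6-5 win leaves only a bit over a 60% chance that
the winner really is the stronger team, the same ratio over twice as many
points (12-10) pushes that a little higher, and a 13-2 win leaves almost no
doubt.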
Infera was first used for real at the 1997 Southeast Regionals in
Cambridge. The tournament format was in fact a round robin, so it is
possible to compare infera's ranking with the rankings given by
more traditional methods which can be applied to round robins.
-----------------------------------------------------------------------
Results:
    Team            abbrev.  infera score   games won  goal difference
 1  UTI             UTI      11.2 +/- 0.2       8            67
 2  Red Shift       RS       10.9 +/- 0.2       7            44
 3  First Touch     FT       10.8 +/- 0.2       6            48
 4  Slowhawks       M1       10.2 +/- 0.2       5             8
 5  Doh'hawks       M2       10.1 +/- 0.2       4             1
 6  Strange Blue 2  SB2       9.7 +/- 0.2       2           -21
 7  Skunks 1        SK1       9.5 +/- 0.2       3           -34
 8  Strange Blue 1  SB1       9.2 +/- 0.2       1           -43
 9  Skunks 2        SK2       8.3 +/- 0.2       0           -70
------------------------------------------------------------------------
1) Comparison with goal difference
After a round robin, one possible way to rank teams is by goal
difference. In this tournie, it turned out that infera's rankings
were almost the same as the rankings you would get from goal
difference, except Red Shift and First Touch (who have goal
differences 44 and 48) came out switched. Maybe the reason that Red
Shift had a slightly poorer goal difference than FT is that UTI
really pulled out the stops in their last game, against RS. And this
final game was longer in duration by 33% than all other games in the
tournament, so this game has a slightly disproportionate effect on
the ranking by goal difference.
2) Comparison with traditional (win/lose) rankings.
Another traditional (and rather crude) performance measure for a round robin
is number of games won. In this tournament, it gives a clean ranking
of the teams, but a _different_ one from infera's. Skunks 1 come ahead
of SB2 by `games won', because the SK1/SB2 result was SK1 6 SB2 5.
But Infera gave SB2 a score of 9.7 +/- 0.2, and SK1 a score of 9.5
+/- 0.2.
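Both traditional measures can be recomputed directly from the scores listed
in section (C). The sketch below does that in Python; the score list is
copied from section (C), including the final and the UTI/M1 game as recorded
(13-7). The names SCORES and tally are illustrative only.

```python
from collections import defaultdict

# All 36 games, copied from section (C): "team points team points".
SCORES = """
RS 7 M1 3
RS 13 SB1 1
RS 8 FT 6
RS 9 M2 4
FT 13 SB1 3
SB1 11 SK2 4
UTI 13 SB1 3
SK2 2 M2 10
SB2 13 SK2 1
FT 13 SK2 0
UTI 13 M1 7
M1 13 SB2 6
SB2 8 FT 10
SK1 6 SB2 5
SK1 6 M1 11
UTI 13 SK1 3
M2 13 SK1 4
UTI 13 M2 5
UTI 9 FT 5
RS 13 SB2 1
M2 8 SB1 0
M2 7 SB2 4
RS 13 SK2 3
M1 10 SB1 5
SK1 6 SB1 4
SK2 1 M1 9
RS 13 SK1 2
UTI 8 SK2 2
FT 13 M2 2
UTI 13 SB2 2
FT 13 M1 3
SK1 9 SK2 3
M1 8 M2 5
SB1 6 SB2 9
FT 13 SK1 5
UTI 15 RS 3
"""


def tally(score_text):
    """Return (games won, goal difference) per team from the score lines."""
    wins = defaultdict(int)
    goal_diff = defaultdict(int)
    for line in score_text.strip().splitlines():
        team_a, pts_a, team_b, pts_b = line.split()
        pts_a, pts_b = int(pts_a), int(pts_b)
        goal_diff[team_a] += pts_a - pts_b
        goal_diff[team_b] += pts_b - pts_a
        wins[team_a] += pts_a > pts_b  # True counts as 1
        wins[team_b] += pts_b > pts_a
    return dict(wins), dict(goal_diff)


wins, goal_diff = tally(SCORES)
by_goal_diff = sorted(goal_diff, key=goal_diff.get, reverse=True)
by_wins = sorted(wins, key=wins.get, reverse=True)
print("by goal difference:", by_goal_diff)
print("by games won:      ", by_wins)
```

Ranking by goal difference swaps RS and FT relative to infera, and ranking
by games won swaps SB2 and SK1.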
So, why did infera put SB2 slightly ahead of SK1? The SB2/SK1 result
was obviously close (only a draw could have been closer, and the
hooter happened to go during an odd point). So to rank SB2
relative to SK1, Infera takes into account not only the SK1/SB2
result, but also the results against other teams, and the ranks of
those other teams.
So let's look at the other scores of SK1 and SB2 ...
   SK1  3  UTI 13        SB2  2  UTI 13
   SK1  2  RS  13        SB2  1  RS  13
   SK1  5  FT  13        SB2  8  FT  10   <<<<<<<<<<<<<<<<
   SK1  6  M1  11        SB2  6  M1  13
   SK1  4  M2  13        SB2  4  M2   7   <<<<<<<<<
   SK1  6  SB1  4        SB2  9  SB1  6
   SK1  9  SK2  3        SB2 13  SK2  1   <
Clearly, SB2 did much better against FT and against M2.
It's because of these strong results that SB2 was ranked a tiny bit higher.
Which is the fairer ranking? Infera reports what it reckons is the
_most_probable_ ranking, and it takes into account more than just the
simple win/lose outcome. Its estimate is that it is more probable,
given all the results, that SB2 was stronger than SK1, and that the
SK1/SB2 game happened to go the other way, rather than the alternative
hypothesis, that SK1 is better than SB2, but SB2 managed by fluke to
get a much better result against FT. What do you think?
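The common-opponent comparison above can be put into numbers with a few
lines of Python. This is only arithmetic on the scores in the table; raw
point margins are a rough proxy and not how infera weighs the evidence. The
names ROWS and margin_gaps are illustrative.

```python
# (opponent, SK1 points, opponent's points vs SK1,
#            SB2 points, opponent's points vs SB2), copied from the table.
ROWS = [
    ("UTI", 3, 13, 2, 13),
    ("RS", 2, 13, 1, 13),
    ("FT", 5, 13, 8, 10),
    ("M1", 6, 11, 6, 13),
    ("M2", 4, 13, 4, 7),
    ("SB1", 6, 4, 9, 6),
    ("SK2", 9, 3, 13, 1),
]


def margin_gaps(rows):
    """For each common opponent, SB2's margin minus SK1's margin.

    A positive number means SB2 did better than SK1 against that opponent.
    """
    gaps = {}
    for opponent, sk1_for, sk1_against, sb2_for, sb2_against in rows:
        sk1_margin = sk1_for - sk1_against
        sb2_margin = sb2_for - sb2_against
        gaps[opponent] = sb2_margin - sk1_margin
    return gaps


gaps = margin_gaps(ROWS)
for opponent, gap in gaps.items():
    print(f"{opponent:4s} {gap:+d}")
print("total", sum(gaps.values()))
```

Over the seven common opponents SB2 come out 15 points better than SK1 in
total, which has to be set against the single point by which SK1 won the
head-to-head game.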
==========================================================================
3) Counterfactuals concerning the final:
Just as the close outcome of the single SB2/SK1 game was overruled by
evidence from other games, the ranking by infera of the number 1 and 2 teams
isn't simply determined by the result of the "final" game that they
play against each other.
So people might be interested to know:
What would have happened if the score in the final had been closer? I
have plugged in a few alternative scores (the true score was UTI 15,
RS 3) to see how big a win the finalists needed to guarantee the
number 1 ranking. (RS went into the final with a slightly higher
ranking than UTI, on the basis of the previous 35 games.)
If the score had been UTI 15, RS 14 then the ranks would have come out:
1 RS 11.08
2 UTI 11.04
3 FT 10.82
4 M1 10.19
5 M2 10.14
6 SB2 9.68
7 SK1 9.52
8 SB1 9.18
9 SK2 8.35
So winning by one point would not have been enough for UTI to be
ranked number 1 (though it must be emphasised that the difference
between 11.08 and 11.04 is utterly tiny - and a sensible idea would be
to have the option of calling the overall outcome a tie when the
differences are so small relative to the remaining uncertainty).
The critical score in this case is 15-13. Had UTI won by more than two
points, infera would have given them the number 1 ranking.
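The idea of calling near-equal ratings a tie is easy to sketch. The Python
below is one hypothetical way to do it, not something infera itself does:
teams are sorted by rating, and a team shares the rank of the team just
above it when the two ratings differ by less than a threshold. The threshold
of 0.1 and the name ranks_with_ties are assumptions for illustration; the
ratings are the counterfactual ones listed above for UTI 15, RS 14.

```python
# Counterfactual ratings from the list above (final entered as UTI 15, RS 14).
COUNTERFACTUAL = {
    "RS": 11.08, "UTI": 11.04, "FT": 10.82, "M1": 10.19, "M2": 10.14,
    "SB2": 9.68, "SK1": 9.52, "SB1": 9.18, "SK2": 8.35,
}


def ranks_with_ties(ratings, threshold=0.1):
    """Assign ranks, sharing a rank when neighbours are within `threshold`.

    Only adjacent teams in the sorted order are compared, so a chain of
    close ratings would all share one rank (a design choice, not a rule
    taken from infera).
    """
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    previous_rating = None
    current_rank = 0
    for position, (team, rating) in enumerate(ordered, start=1):
        if previous_rating is None or previous_rating - rating >= threshold:
            current_rank = position  # clear gap: start a new rank
        ranks[team] = current_rank
        previous_rating = rating
    return ranks


ranks = ranks_with_ties(COUNTERFACTUAL)
for team, rank in ranks.items():
    print(rank, team)
```

With this rule RS and UTI would share first place in the 15-14 scenario, and
M1 and M2 would share fourth.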
We could ask, why this difference? Why did Red Shift come into the
final with a head start in the rankings? There are two simple
explanations: [i] UTI accidentally turned up late for their game with
Mohawks 1, and generously conceded five points, making the final score
13-7 instead of 13-2. [ii] UTI played a friendly game with Skunks 2
(with Nick Haslam switching sides) which ended with a score of 8-2.
If we 'correct' these two exceptional events, by entering the Mohawks
game as a 13-2 result, and, say, omitting the UTI/Skunks2 game from
the data, we find that it is now _UTI_ who enter the final with the
highest rank, and Red Shift would have had to beat them by more than
15-13 for infera to be persuaded that Red Shift were the number 1
team.
In conclusion,
(1) I think infera worked just fine and gave rankings that made
complete sense. I'll put in data from other tournaments if people
send it to me in the right format.
(2) When close hypothetical scores (eg 15-13 or 13-15) are put in for
a game between two teams (eg the final), the outcomes of other games
could sometimes overrule the outcome of that game. To ensure that
these effects are not spurious, I would recommend that when infera is
used, teams should not have penalties put onto their score for
turning up late to games, or totally failing to show up; this mucking
with the scores is the sort of thing which might on rare occasions
cause infera to be confused. It should be easy to find other ways to
penalise late teams, if necessary! Incidentally, I didn't include
any penalties in the tournament rules, and almost all the games at
the 97 Cambridge tournament ran on time.
(3) It might be good to declare two teams to have equal rank if their
infera ratings are closer than, say, 0.1.
(4) If you have any comments on infera, I may well have responded to
them already on the web pages I mentioned above. There is a huge
number of ways you can use infera; for example, if you want to tell
it only the win/lose/draw outcomes, instead of the actual scores, you
may. My chief reason for recommending it is that it allows you to choose
arbitrary tournament formats no matter how many teams turn up and
what games are played, and still easily get rankings out at the end
of the day. But you could use it, for example, to rank teams in a
long term league of several tournaments (even if some teams have
attended different numbers of tournaments). Alternatively you could
use infera to determine the number of tour points allocated to teams
for their performance at a tournament. Infera gives each team a rating
such that teams judged to have been very close end up with similar
ratings, and teams that are far better than the others get
proportionally bigger ratings. Instead of using some arbitrary fixed
numbers like 1st=200, 2nd=120, 3rd=80, the infera ratings would
return numbers that reflect how close the number 2 team came to the
number 1 team, etc.
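As an illustration of the tour-points idea in (4), here is one hypothetical
mapping in Python. The message does not specify how ratings would be turned
into points, so everything here is an assumption: a linear rescaling that
gives the top team 200 points and the bottom team 0. The name tour_points
and the 200-point scale are illustrative; the ratings are the ones in the
results table.

```python
# infera ratings from the results table.
RATINGS = {
    "UTI": 11.2, "RS": 10.9, "FT": 10.8, "M1": 10.2, "M2": 10.1,
    "SB2": 9.7, "SK1": 9.5, "SB1": 9.2, "SK2": 8.3,
}


def tour_points(ratings, top_points=200.0):
    """Linearly rescale ratings so the best team gets `top_points`.

    ASSUMPTION: the lowest-rated team is pinned to 0 points. Any other
    monotonic mapping would serve equally well as an example.
    """
    lowest = min(ratings.values())
    highest = max(ratings.values())
    if highest == lowest:
        # All teams equal: give everyone the same points.
        return {team: top_points for team in ratings}
    scale = top_points / (highest - lowest)
    return {team: (rating - lowest) * scale for team, rating in ratings.items()}


points = tour_points(RATINGS)
for team, value in points.items():
    print(f"{team:4s} {value:6.1f}")
```

Under this mapping the second and third teams land within a few points of
each other and not far behind the winner, in contrast to the 80-point and
40-point gaps of the fixed 200/120/80 scheme.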
(C) ============== Scores ============================
Tournament schedule can be seen here:
http://wol.ra.phy.cam.ac.uk/ultimate/schedule/9.html
Here are the scores:
# saturday nov 1 97 #
RS 7 M1 3
RS 13 SB1 1
RS 8 FT 6
RS 9 M2 4
FT 13 SB1 3
SB1 11 SK2 4
UTI 13 SB1 3
SK2 2 M2 10
SB2 13 SK2 1
FT 13 SK2 0
UTI 13 M1 7
# note: M1 were given 5 points by UTI as an apology for being late
M1 13 SB2 6
SB2 8 FT 10
SK1 6 SB2 5
SK1 6 M1 11
UTI 13 SK1 3
M2 13 SK1 4
UTI 13 M2 5
# sunday
UTI 9 FT 5
RS 13 SB2 1
M2 8 SB1 0
M2 7 SB2 4
RS 13 SK2 3
M1 10 SB1 5
SK1 6 SB1 4
SK2 1 M1 9
RS 13 SK1 2
UTI 8 SK2 2
FT 13 M2 2
UTI 13 SB2 2
FT 13 M1 3
SK1 9 SK2 3
M1 8 M2 5
SB1 6 SB2 9
FT 13 SK1 5
# Final ( to 15 points )
UTI 15 RS 3
(D) =====================================================================
Photos
=========================================================================
A few photos of fog, discs, teams and human pyramids are on the web here:
http://wol.ra.phy.cam.ac.uk/ultimate/pics/seregs97/
==========================================================================
David J.C. MacKay email: mackay@mrao.cam.ac.uk
www: http://wol.ra.phy.cam.ac.uk/mackay/
Cavendish Laboratory, tel: (01223) 339852 fax: 354599 home: 276411
Madingley Road, international code: +44 1223
Cambridge CB3 0HE. U.K. room: 982 Rutherford Building