<class 'pandas.core.frame.DataFrame'>
RangeIndex: 74 entries, 0 to 73
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 competition_id 74 non-null int64
1 season_id 74 non-null int64
2 country_name 74 non-null object
3 competition_name 74 non-null object
4 competition_gender 74 non-null object
5 competition_youth 74 non-null bool
6 competition_international 74 non-null bool
7 season_name 74 non-null object
8 match_updated 74 non-null object
9 match_updated_360 56 non-null object
10 match_available_360 10 non-null object
11 match_available 74 non-null object
dtypes: bool(2), int64(2), object(8)
memory usage: 6.1+ KB
Elo rating system, Statsbomb open-data minimal exploratory analysis.
ELO rating system for the 1. Bundesliga 2015/2016 season
Statsbomb open-data minimal exploratory analysis
ELO rating system (with arbitrary K and s factors)
Applying the ELO rating system to 1. Bundesliga 2015/2016 season matches
Finding optimal factors s and K based on the season’s data
Display the final ranking table with optimal K and s factors
Finding the most surprising win in the season
1. Statsbomb open-data minimal exploratory analysis
| | competition_id | season_id | country_name | competition_name | competition_gender | competition_youth | competition_international | season_name | match_updated | match_updated_360 | match_available_360 | match_available |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 9 | 27 | Germany | 1. Bundesliga | male | False | False | 2015/2016 | 2024-05-19T11:11:14.192381 | None | None | 2024-05-19T11:11:14.192381 |
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 306 entries, 0 to 305
Data columns (total 22 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 match_id 306 non-null int64
1 match_date 306 non-null object
2 kick_off 306 non-null object
3 competition 306 non-null object
4 season 306 non-null object
5 home_team 306 non-null object
6 away_team 306 non-null object
7 home_score 306 non-null int64
8 away_score 306 non-null int64
9 match_status 306 non-null object
10 match_status_360 306 non-null object
11 last_updated 306 non-null object
12 last_updated_360 0 non-null object
13 match_week 306 non-null int64
14 competition_stage 306 non-null object
15 stadium 306 non-null object
16 referee 306 non-null object
17 home_managers 306 non-null object
18 away_managers 306 non-null object
19 data_version 306 non-null object
20 shot_fidelity_version 306 non-null object
21 xy_fidelity_version 306 non-null object
dtypes: int64(4), object(18)
memory usage: 52.7+ KB
| | match_id | match_date | kick_off | competition | season | home_team | away_team | home_score | away_score | match_status | ... | last_updated_360 | match_week | competition_stage | stadium | referee | home_managers | away_managers | data_version | shot_fidelity_version | xy_fidelity_version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3890561 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Hoffenheim | Schalke 04 | 1 | 4 | available | ... | None | 34 | Regular Season | PreZero Arena | Felix Brych | Julian Nagelsmann | André Breitenreiter | 1.1.0 | 2 | 2 |
| 1 | 3890505 | 2016-04-02 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Bayern Munich | Eintracht Frankfurt | 1 | 0 | available | ... | None | 28 | Regular Season | Allianz Arena | Florian Meyer | Josep Guardiola i Sala | Niko Kovač | 1.1.0 | 2 | 2 |
| 2 | 3890511 | 2016-04-08 | 20:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Hertha Berlin | Hannover 96 | 2 | 2 | available | ... | None | 29 | Regular Season | Olympiastadion Berlin | Benjamin Brand | Pál Dárdai | Daniel Stendel | 1.1.0 | 2 | 2 |
| 3 | 3890515 | 2016-04-09 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Hamburger SV | Darmstadt 98 | 1 | 2 | available | ... | None | 29 | Regular Season | Volksparkstadion | Peter Sippel | Bruno Labbadia | Dirk Schuster | 1.1.0 | 2 | 2 |
| 4 | 3890411 | 2015-12-20 | 16:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Hertha Berlin | FSV Mainz 05 | 2 | 0 | available | ... | None | 17 | Regular Season | Olympiastadion Berlin | Peter Sippel | Pál Dárdai | Martin Schmidt | 1.1.0 | 2 | 2 |
5 rows × 22 columns
| | match_id | match_date | kick_off | competition | season | home_team | away_team | home_score | away_score | match_status | ... | last_updated_360 | match_week | competition_stage | stadium | referee | home_managers | away_managers | data_version | shot_fidelity_version | xy_fidelity_version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 305 | 3890259 | 2015-08-14 | 20:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Bayern Munich | Hamburger SV | 5 | 0 | available | ... | None | 1 | Regular Season | Allianz Arena | Bastian Dankert | Josep Guardiola i Sala | Bruno Labbadia | 1.1.0 | 2 | 2 |
| 299 | 3890265 | 2015-08-15 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Augsburg | Hertha Berlin | 0 | 1 | available | ... | None | 1 | Regular Season | WWK Arena | Tobias Welz | Markus Weinzierl | Pál Dárdai | 1.1.0 | 2 | 2 |
| 304 | 3890260 | 2015-08-15 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Bayer Leverkusen | Hoffenheim | 2 | 1 | available | ... | None | 1 | Regular Season | BayArena | Robert Hartmann | Roger Schmidt | Markus Gisdol | 1.1.0 | 2 | 2 |
| 300 | 3890264 | 2015-08-15 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Werder Bremen | Schalke 04 | 0 | 3 | available | ... | None | 1 | Regular Season | Wohninvest Weserstadion | Daniel Siebert | Viktor Skripnik | André Breitenreiter | 1.1.0 | 2 | 2 |
| 302 | 3890262 | 2015-08-15 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Darmstadt 98 | Hannover 96 | 2 | 2 | available | ... | None | 1 | Regular Season | Merck-Stadion am Böllenfalltor | Felix Brych | Dirk Schuster | Michael Frontzeck | 1.1.0 | 2 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 18 | 3890560 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Darmstadt 98 | Borussia Mönchengladbach | 0 | 2 | available | ... | None | 34 | Regular Season | Merck-Stadion am Böllenfalltor | Peter Sippel | Dirk Schuster | André Schubert | 1.1.0 | 2 | 2 |
| 17 | 3890562 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Borussia Dortmund | FC Köln | 2 | 2 | available | ... | None | 34 | Regular Season | Signal-Iduna-Park | Michael Weiner | Thomas Tuchel | Peter Stöger | 1.1.0 | 2 | 2 |
| 16 | 3890563 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Bayer Leverkusen | Ingolstadt | 3 | 2 | available | ... | None | 34 | Regular Season | BayArena | Guido Winkmann | Roger Schmidt | Ralph Hasenhüttl | 1.1.0 | 2 | 2 |
| 15 | 3890564 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Augsburg | Hamburger SV | 1 | 3 | available | ... | None | 34 | Regular Season | WWK Arena | Florian Meyer | Markus Weinzierl | Bruno Labbadia | 1.1.0 | 2 | 2 |
| 0 | 3890561 | 2016-05-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2015/2016 | Hoffenheim | Schalke 04 | 1 | 4 | available | ... | None | 34 | Regular Season | PreZero Arena | Felix Brych | Julian Nagelsmann | André Breitenreiter | 1.1.0 | 2 | 2 |
306 rows × 22 columns
2. ELO rating system (with arbitrary K and s factors)
https://en.wikipedia.org/wiki/Elo_rating_system
$ R_{\text{new}} = R_{\text{old}} + K (\text{Actual score} - \text{Expected score}) $
It is also possible to incorporate a goal-difference factor such as \(G = \frac{GD}{1+GD}\), where \(GD\) is the goal difference of the match.
\(\text{Actual score}\) for the team is:
* 1 for a win
* 0.5 for a draw
* 0 for a loss
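\(\text{Expected score}\) for the team: $ E = \frac{1}{1 + 10^{(R_{\text{opponent}} - R_{\text{team}})/s}} $, where \(R_{\text{team}}\) and \(R_{\text{opponent}}\) are the pre-match ratings and \(s\) is the scale factor.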
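The update rule above can be sketched in a few lines of Python. The function names `expected_score` and `update_rating` and the defaults `k=30` and `s=400` are illustrative choices rather than the notebook's own code, and the expected score is assumed to follow the standard logistic Elo form with scale factor s.

```python
def expected_score(rating: float, opponent_rating: float, s: float = 400.0) -> float:
    """Standard Elo expectation: 1 / (1 + 10 ** ((R_opponent - R_team) / s))."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / s))


def update_rating(rating: float, opponent_rating: float, actual_score: float,
                  k: float = 30.0, s: float = 400.0) -> float:
    """R_new = R_old + K * (actual score - expected score)."""
    return rating + k * (actual_score - expected_score(rating, opponent_rating, s))


# Two equally rated teams; the home team wins (actual score 1 vs 0).
home, away = 100.0, 100.0
new_home = update_rating(home, away, actual_score=1.0)
new_away = update_rating(away, home, actual_score=0.0)
print(new_home, new_away)  # 115.0 85.0
```

With equal ratings the expected score is 0.5 for both sides, so the winner gains K/2 points and the loser gives up the same amount.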
3. Applying the ELO rating system to 1. Bundesliga 2015/2016 season matches
Team Rating
0 Borussia Mönchengladbach 119.256246
1 Werder Bremen 117.325608
2 Bayern Munich 115.286875
3 Schalke 04 110.249651
4 Eintracht Frankfurt 105.792199
5 Borussia Dortmund 105.689099
6 FC Köln 105.357665
7 Bayer Leverkusen 104.791526
8 Hannover 96 100.557829
9 Wolfsburg 98.128545
10 Darmstadt 98 97.147549
11 Hamburger SV 96.802878
12 Augsburg 94.286453
13 Hoffenheim 89.494287
14 FSV Mainz 05 88.182061
15 Hertha Berlin 87.955994
16 Ingolstadt 86.235838
17 VfB Stuttgart 77.459697
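A table like the one above could be produced by replaying the matches in chronological order and updating both teams after each game. The sketch below is one possible way to do that, not the notebook's actual code: `run_season`, the tuple layout, and `k=30`, `s=400` are hypothetical, the starting rating of 100 is an assumption (both rating tables in this document average 100), and no home advantage is modelled. The sample rows are the matchweek 1 fixtures shown in the matches table.

```python
from collections import defaultdict

INITIAL_RATING = 100.0  # assumption: every team starts at 100

# (match_date, kick_off, home_team, away_team, home_score, away_score)
MATCHES = [
    ("2015-08-15", "15:30:00.000", "Augsburg", "Hertha Berlin", 0, 1),
    ("2015-08-15", "15:30:00.000", "Bayer Leverkusen", "Hoffenheim", 2, 1),
    ("2015-08-14", "20:30:00.000", "Bayern Munich", "Hamburger SV", 5, 0),
    ("2015-08-15", "15:30:00.000", "Werder Bremen", "Schalke 04", 0, 3),
    ("2015-08-15", "15:30:00.000", "Darmstadt 98", "Hannover 96", 2, 2),
]


def expected_score(rating, opponent_rating, s=400.0):
    """Standard Elo expectation with scale factor s."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / s))


def run_season(matches, k=30.0, s=400.0):
    """Replay matches in date order and return the final rating per team."""
    ratings = defaultdict(lambda: INITIAL_RATING)
    # ISO dates and zero-padded times sort chronologically as strings.
    for _, _, home, away, home_goals, away_goals in sorted(matches, key=lambda m: (m[0], m[1])):
        if home_goals > away_goals:
            actual_home = 1.0
        elif home_goals == away_goals:
            actual_home = 0.5
        else:
            actual_home = 0.0
        delta = k * (actual_home - expected_score(ratings[home], ratings[away], s))
        ratings[home] += delta  # the update is zero-sum:
        ratings[away] -= delta  # the away team moves by the opposite amount
    return dict(ratings)


ratings = run_season(MATCHES)
for team, rating in sorted(ratings.items(), key=lambda item: item[1], reverse=True):
    print(f"{team:20s} {rating:8.2f}")
```

Because each update is zero-sum, the mean rating stays at the initial value, which is why a starting value of 100 fits the tables shown in this document.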
4. Finding optimal factors s and K based on the season’s data
K=25, s=400, MSE= 0.17810227843637477
K=25, s=450, MSE= 0.17836864179747566
K=30, s=400, MSE= 0.17795245927761352
K=30, s=450, MSE= 0.17801034693620812
K=35, s=400, MSE= 0.1781143332074865
K=35, s=450, MSE= 0.177965323655386
K=40, s=400, MSE= 0.17849574518103187
K=40, s=450, MSE= 0.17814722318708967
K=45, s=400, MSE= 0.17903423014209696
K=45, s=450, MSE= 0.17849574518103187
best K: 30, best s: 400, best MSE: 0.17795245927761352
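The search above can be sketched as a small grid search. This is a hedged illustration only: it assumes the MSE is the mean squared difference between the home team's actual score and its pre-match expected score, and `season_mse`, the tuple layout, and the starting rating of 100 are hypothetical. The K and s grids are the values listed in the output above, and the sample rows come from the matches table, so the resulting numbers are not comparable to the full-season figures.

```python
from itertools import product

K_GRID = (25, 30, 35, 40, 45)
S_GRID = (400, 450)

# (match_date, home_team, away_team, home_score, away_score)
MATCHES = [
    ("2015-08-14", "Bayern Munich", "Hamburger SV", 5, 0),
    ("2015-08-15", "Augsburg", "Hertha Berlin", 0, 1),
    ("2015-08-15", "Darmstadt 98", "Hannover 96", 2, 2),
    ("2015-12-20", "Hertha Berlin", "FSV Mainz 05", 2, 0),
    ("2016-04-02", "Bayern Munich", "Eintracht Frankfurt", 1, 0),
    ("2016-04-08", "Hertha Berlin", "Hannover 96", 2, 2),
    ("2016-04-09", "Hamburger SV", "Darmstadt 98", 1, 2),
    ("2016-05-14", "Augsburg", "Hamburger SV", 1, 3),
]


def season_mse(matches, k, s, initial=100.0):
    """Replay the matches and return the mean squared prediction error."""
    ratings = {}
    squared_errors = []
    for _, home, away, home_goals, away_goals in sorted(matches):
        r_home = ratings.get(home, initial)
        r_away = ratings.get(away, initial)
        expected_home = 1.0 / (1.0 + 10.0 ** ((r_away - r_home) / s))
        actual_home = 1.0 if home_goals > away_goals else 0.5 if home_goals == away_goals else 0.0
        # Score the prediction before the ratings see the result.
        squared_errors.append((actual_home - expected_home) ** 2)
        delta = k * (actual_home - expected_home)
        ratings[home] = r_home + delta
        ratings[away] = r_away - delta
    return sum(squared_errors) / len(squared_errors)


best_k, best_s = min(product(K_GRID, S_GRID), key=lambda ks: season_mse(MATCHES, *ks))
best_mse = season_mse(MATCHES, best_k, best_s)
print(f"best K: {best_k}, best s: {best_s}, best MSE: {best_mse}")
```

Scoring each match before updating the ratings keeps the error a measure of predictive quality. The fit is still in-sample, because K and s are tuned on the same season they are evaluated on.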
5. Display the final ranking table with optimal K and s factors
Team Rating
0 Bayern Munich 399.843898
1 Borussia Dortmund 311.559629
2 Bayer Leverkusen 206.012058
3 Borussia Mönchengladbach 161.946876
4 Schalke 04 130.313364
5 FSV Mainz 05 109.815993
6 FC Köln 90.355869
7 Hertha Berlin 83.103770
8 Augsburg 63.408002
9 Wolfsburg 62.476312
10 Werder Bremen 55.539498
11 Hoffenheim 54.275737
12 Hamburger SV 47.719699
13 Darmstadt 98 42.814199
14 Eintracht Frankfurt 37.806151
15 Ingolstadt 35.617444
16 VfB Stuttgart -27.234742
17 Hannover 96 -65.373758
6. Finding the most surprising win in the season
Initial idea: the most surprising result is a win by the lower-rated team in the match with the greatest rating difference between the two teams.
match_id 3890553
match_date 2016-05-07
kick_off 15:30:00.000
competition Germany - 1. Bundesliga
season 2015/2016
home_team Eintracht Frankfurt
away_team Borussia Dortmund
home_score 1
away_score 0
match_status available
match_status_360 unscheduled
last_updated 2023-07-19T12:28:27.869912
last_updated_360 None
match_week 33
competition_stage Regular Season
stadium Deutsche Bank Park
referee Daniel Siebert
home_managers Niko Kovač
away_managers Thomas Tuchel
data_version 1.1.0
shot_fidelity_version 2
xy_fidelity_version 2
Name: 25, dtype: object
Largest rating gain
26.126711087581256
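One way to arrive at a "largest rating gain" is to replay the season and record, for every decisive match, how many points the winner gained. The sketch below illustrates that idea under assumptions: `most_surprising_win`, the tuple layout, the starting rating of 100, and `k=30`, `s=400` are hypothetical, and the sample rows are taken from the matches table rather than the full season. If the reported gain of about 26.13 was computed with K = 30, it corresponds to a pre-match expected score of roughly 1 - 26.13/30 ≈ 0.129 for the winner.

```python
# (match_date, home_team, away_team, home_score, away_score)
MATCHES = [
    ("2015-08-14", "Bayern Munich", "Hamburger SV", 5, 0),
    ("2015-08-15", "Augsburg", "Hertha Berlin", 0, 1),
    ("2015-08-15", "Darmstadt 98", "Hannover 96", 2, 2),
    ("2015-12-20", "Hertha Berlin", "FSV Mainz 05", 2, 0),
    ("2016-04-02", "Bayern Munich", "Eintracht Frankfurt", 1, 0),
    ("2016-04-08", "Hertha Berlin", "Hannover 96", 2, 2),
    ("2016-04-09", "Hamburger SV", "Darmstadt 98", 1, 2),
    ("2016-05-14", "Augsburg", "Hamburger SV", 1, 3),
]


def most_surprising_win(matches, k=30.0, s=400.0, initial=100.0):
    """Return (match, gain) for the win that earned the winner the most points."""
    ratings = {}
    best_match, best_gain = None, 0.0
    for match in sorted(matches):
        _, home, away, home_goals, away_goals = match
        r_home = ratings.get(home, initial)
        r_away = ratings.get(away, initial)
        expected_home = 1.0 / (1.0 + 10.0 ** ((r_away - r_home) / s))
        actual_home = 1.0 if home_goals > away_goals else 0.5 if home_goals == away_goals else 0.0
        delta = k * (actual_home - expected_home)
        # For a decisive match, |delta| is the winner's gain: K * (1 - E_winner).
        if home_goals != away_goals and abs(delta) > best_gain:
            best_match, best_gain = match, abs(delta)
        ratings[home] = r_home + delta
        ratings[away] = r_away - delta
    return best_match, best_gain


match, gain = most_surprising_win(MATCHES)
print(match)
print("Largest rating gain:", gain)
```

Ranking by the winner's gain is equivalent to ranking by the lowest pre-match expected score, so it captures the initial idea of a low-rated team beating a much higher-rated one.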