__repr__

4 month(s) ago uolter   comments 




Waiting for FIFA World cup 2014

5 month(s) ago uolter   comments 


Disclaimer

I'm not a fan of the game of football.

Data & Facts

Best teams in World Cups (so far)

This is the overall ranking by final placement.

   team       first  second  third  fourth  tot
0  Brasile        5       2      2       1   10
1  Italia         4       2      1       1    8
2  Germania       3       4      4       1   12
3  Argentina      2       2      0       0    4
4  Uruguai        2       0      0       3    5

5 rows × 6 columns

best team

World cup history

#   Year  Country                   firstp       secodp          thirdp      fourthp
0   1930  Uruguai                   Uruguai      Argentina       USA         Jugoslavia
1   1934  Italia                    Italia       Cecoslovacchia  Germania    Austria
2   1938  Francia                   Italia       Ungheria        Brasile     Svezia
3   1950  Brasile                   Uruguai      Brasile         Svezia      Spagna
4   1954  Svizzera                  Germania     Ungheria        Austria     Uruguai
5   1958  Svezia                    Brasile      Svezia          Francia     Germania
6   1962  Cile                      Brasile      Cecoslovacchia  Cile        Jugoslavia
7   1966  Inghilterra               Inghilterra  Germania        Portogallo  URSS
8   1970  Messico                   Brasile      Italia          Germania    Uruguai
9   1974  Germania                  Germania     Olanda          Polonia     Brasile
10  1978  Argentina                 Argentina    Olanda          Brasile     Italia
11  1982  Spagna                    Italia       Germania        Polonia     Francia
12  1986  Messico                   Argentina    Germania        Francia     Belgio
13  1990  Italia                    Germania     Argentina       Italia      Inghilterra
14  1994  USA                       Brasile      Italia          Svezia      Bulgaria
15  1998  Francia                   Francia      Brasile         Croazia     Olanda
16  2002  Corea del Sud / Giappone  Brasile      Germania        Turchia     Corea del Sud
17  2006  Germania                  Italia       Francia         Germania    Portogallo
18  2010  Sudafrica                 Spagna       Olanda          Germania    Uruguai

  • the first world cup was held in 1930
  • the last was in 2010
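A ranking like the one at the top of this post can be derived from the history table by counting how often each team appears in each placement column. The snippet below is only a minimal sketch of that idea using the standard library; the function name `medal_table` and the tuple layout are my own assumptions, and only the first three rows of the history table are included to keep it short.

```python
from collections import Counter

# (year, firstp, secodp, thirdp, fourthp): first three rows of the history table
history = [
    (1930, "Uruguai", "Argentina", "USA", "Jugoslavia"),
    (1934, "Italia", "Cecoslovacchia", "Germania", "Austria"),
    (1938, "Italia", "Ungheria", "Brasile", "Svezia"),
]

PLACES = ("first", "second", "third", "fourth")


def medal_table(rows):
    """Return (team, first, second, third, fourth, tot) tuples, best team first."""
    counts = {}
    for _year, *teams in rows:
        for place, team in zip(PLACES, teams):
            counts.setdefault(team, Counter())[place] += 1
    table = [
        (team, *(c[p] for p in PLACES), sum(c.values()))
        for team, c in counts.items()
    ]
    # Order by number of titles, then second places, then thirds, then fourths.
    table.sort(key=lambda row: row[1:5], reverse=True)
    return table


ranking = medal_table(history)
for row in ranking:
    print(*row)
```

Feeding all 19 rows of the history table to the same function should give the full ranking, sorted by titles rather than by total.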

Teams in 2014 world cup and their position in the medals ranking

#   team           first  second  third  fourth  tot
15  Germania           3       4      4       1   12
5   Brasile            5       2      2       1   10
22  Italia             4       2      1       1    8
14  Francia            1       1      2       1    5
1   Argentina          2       2      0       0    4
25  Olanda             0       3      0       1    4
20  Inghilterra        1       0      0       1    2
28  Spagna             1       0      0       1    2
26  Portogallo         0       0      1       1    2
9   Corea del Sud      0       0      0       1    1
7   Cile               0       0      1       0    1
12  Croazia            0       0      1       0    1
3   Belgio             0       0      0       1    1
27  Russia             0       0      0       1    1
11  Costa Rica         0       0      0       0    0

All matches so far ....

#  date      team_a      team_b   result
0  13/07/30  Francia     Messico  4-1
1  13/07/30  USA         Belgio   3-0
2  14/07/30  Yugoslavia  Brasile  2-1
3  14/07/30  Romania     Perù     3-1
4  15/07/30  Argentina   Francia  1-0

...here the full list.

Direct matches graph

network

Network diameter: 3

Betweenness centrality

  • Brasile 0.123351914014
  • Germania 0.102325396665
  • Italia 0.083260275077

Players in 2014 tournament.

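#  Pos  Giocatore           eta  Pres  Gol  Nazionale
0  P    Manuel Neuer         28    45    0  Germania
1  P    Roman Weidenfeller   33     1    0  Germania
2  D    Philipp Lahm         30   105    5  Germania
3  D    Per Mertesacker      29    96    4  Germania
4  D    Marcell Jansen       28    45    3  Germania

(Pos = position, Giocatore = player, eta = age, Pres = caps, Gol = goals, Nazionale = national team; P = goalkeeper, D = defender)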

here the full list.

Oldest players

  • Noel Valladares 37 (Honduras)
  • Giorgos Karagounis 37 (Grecia)
  • Didier Drogba 36 (Costa d'Avorio)
  • Daniel Van Buyten 36 (Belgio)
  • Gianluigi Buffon 36 (Italia)

Youngest players

  • Frank Bagnack 18 (Camerun)
  • Luke Shaw 18 (Inghilterra)
  • Adnan Januzaj 19 (Belgio)
  • José Giménez 19 (Uruguai)
  • Cristian Ramírez 19 (Ecuador)

Top scorers

                   eta  Pres  Gol  Nazionale
Miroslav Klose      35   131   68  Germania
Didier Drogba       36    99   63  Costa d’Avorio
Samuel Eto'o        33   115   55  Camerun
Cristiano Ronaldo   29   110   49  Portogallo
Lukas Podolski      28   112   46  Germania

DataSet

The dataset comes from the Italian Wikipedia page dedicated to the FIFA World Cup and the pages linked from it. By scraping those pages I collected a few CSV files:

Some code

Code and datasets are freely available on GitHub, along with some more results and visualizations. Feel free to change them.



What data is saying

6 month(s) ago uolter   comments 


On Sunday, 27 April, for the first time in history, two Popes - John XXIII and John Paul II - were declared Saints by the current Pope, Francis.

I tried to look at this event through the data coming from Twitter, analysing all posts with the hashtag #2popesaints.

I collected data via the streaming API from 8.30 am to 4.30 pm CET. For each tweet I kept the message text, the creation time, the device used to tweet and the geographic coordinates.

        text                                   created_at           geo                                              source
count   44854                                  44854                726                                              44854
first   NaN                                    2014-04-27 06:30:05  NaN                                              NaN
freq    359                                    16                   15                                               10920
last    NaN                                    2014-04-27 14:30:0   NaN                                              NaN
top     RT @catolicos_es: ¡San Juan XXIII ...  2014-04-27 10:00:06  { "type" : "Point", "coordinates" : [ 6.864163.  Twitter for iPhone
unique  19106                                  19056                642                                              210

(summary table)
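The count, unique, top and freq rows of a summary like this are simple to compute per column. The following is only an illustrative standard-library sketch, not the code used for the table; the `summarize` helper and the sample values are hypothetical, and missing values are assumed to be represented as `None`.

```python
from collections import Counter


def summarize(values):
    """Return count / unique / top / freq for one column, ignoring None."""
    present = [v for v in values if v is not None]
    if not present:
        return {"count": 0, "unique": 0, "top": None, "freq": 0}
    top, freq = Counter(present).most_common(1)[0]
    return {
        "count": len(present),        # non-missing values
        "unique": len(set(present)),  # distinct values
        "top": top,                   # most frequent value
        "freq": freq,                 # how often the top value occurs
    }


# Hypothetical 'source' column, for illustration only.
sources = ["Twitter for iPhone", "web", "Twitter for iPhone", None]
summary = summarize(sources)
print(summary)
```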

Frequency, peaks and crash

graph messages per mins

The graph above shows the number of tweets per minute. The volume reached its highest level around 10.30 am, when the two Popes were declared Saints during the ceremony. A few minutes after the peak the data stream reader crashed and I missed some data. My infrastructure was probably not strong enough to collect that much data in a few seconds.

At 4.30 pm data was flowing at an average rate of 93.23 posts per minute, while just after the crash the rate was 111.93 posts per minute. Even with the missing data, we can assume the peak was around 10.30 am.
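Per-minute counts like these can be obtained by truncating each creation time to the minute and counting. The sketch below assumes timestamps formatted as in the created_at column of the summary table; the function name and the sample timestamps (apart from the first one, which appears in the table) are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def tweets_per_minute(timestamps):
    """Count tweets per minute from 'YYYY-MM-DD HH:MM:SS' strings."""
    counts = Counter()
    for ts in timestamps:
        moment = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        counts[moment.replace(second=0)] += 1
    return counts


# Illustrative sample; only the first value is taken from the summary table.
sample = [
    "2014-04-27 06:30:05",
    "2014-04-27 06:30:40",
    "2014-04-27 06:31:02",
]

per_minute = tweets_per_minute(sample)
peak_minute, peak_count = per_minute.most_common(1)[0]
# Average over minutes that contain at least one tweet.
average = sum(per_minute.values()) / len(per_minute)
print(peak_minute, peak_count, average)
```

Note that minutes with no tweets at all (for example during a crash) do not appear in the counter, so they would need to be filled in before plotting.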

iPhone wins.

The iPhone was clearly the most widely used device for sending messages. Android came second, followed by web browsers, then BlackBerry and Windows Phone, which is not in very good shape in terms of market penetration.

Twitter for iPhone            10920
Twitter for Android           10403
web                            8855
Twitter for iPad               3423
TweetDeck                      2029
Twitter for BlackBerry®        1861
Twitter for Android            1397
Twitter for Android Tablets     918
Mobile Web                      585
HootSuite                       549
Twitter for Windows Phone       534
TweetCaster for Android         332
Twitter for BlackBerry          332
Facebook                        258
Twitter for Mac                 220
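A ranking like this is a frequency count over the source field. The sketch below assumes the raw field may arrive as an HTML anchor wrapping the client name, which was how the Twitter API delivered it as far as I know, so any tags are stripped before counting. The helper name and the sample values, including the URLs, are hypothetical.

```python
import re
from collections import Counter

TAG = re.compile(r"<[^>]+>")


def client_name(source):
    """Strip any HTML tags around the client name, e.g. '<a ...>web</a>' -> 'web'."""
    return TAG.sub("", source).strip()


# Hypothetical raw values, for illustration only.
raw_sources = [
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>',
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    "web",
]

device_counts = Counter(client_name(s) for s in raw_sources)
for name, count in device_counts.most_common():
    print(name, count)
```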

Where faith lies

I collected 44854 messages, but only a small subset of 726 came with geographic coordinates. Even so, putting them on a map shows, unsurprisingly, that they mostly come from Catholic countries, mainly in America and Europe.

openstreetmap
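Before the points can be placed on a map, the geo value has to be turned into a pair of numbers. This is a small sketch under two assumptions: that the value is stored as a JSON string shaped like the one in the summary table, and that the coordinates are ordered latitude first, then longitude (worth checking against a known location before trusting it). The function name and the sample coordinates are hypothetical.

```python
import json


def parse_point(geo):
    """Return (lat, lon) from a geo JSON string, or None if it is missing."""
    if not isinstance(geo, str):
        return None  # missing values such as NaN or None
    data = json.loads(geo)
    if data.get("type") != "Point":
        return None
    lat, lon = data["coordinates"]  # assumed order: latitude, longitude
    return lat, lon


# Hypothetical value, roughly in Rome; not taken from the dataset.
example = '{ "type" : "Point", "coordinates" : [ 41.9022, 12.4539 ] }'
print(parse_point(example))
print(parse_point(float("nan")))
```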


Language detected

The languages detected in the messages are not really surprising either:

language histogram

Most of them are in English - the hashtag was in English too - then Spanish, Italian and French.

One technical issue here: it’s very important to clean up all the data before starting to work on it. For language detection in particular, it is essential to remove all non-letter symbols, URLs, hashtags and Twitter account names (starting with @). When I started with dirty data, Swedish came fourth, ahead of Italian. That was obviously wrong.
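The clean-up step described above could look roughly like this. It is a minimal sketch with regular expressions, not the exact code used for the analysis; the function name and the patterns are my own, and the sample tweet combines the truncated text from the summary table with a made-up URL.

```python
import re

URL = re.compile(r"https?://\S+")
TAG_OR_MENTION = re.compile(r"[#@]\w+")      # hashtags and @accounts
NON_LETTERS = re.compile(r"[^\w\s]|[\d_]")   # punctuation, symbols, digits, underscores


def clean_tweet(text):
    """Keep only words, so that the language detector sees plain text."""
    text = URL.sub(" ", text)             # URLs first, before their parts get split
    text = TAG_OR_MENTION.sub(" ", text)
    text = NON_LETTERS.sub(" ", text)
    return " ".join(text.split())         # collapse runs of whitespace


# The URL below is a placeholder, for illustration only.
dirty = "RT @catolicos_es: ¡San Juan XXIII! http://t.co/abc #2popesaints"
cleaned = clean_tweet(dirty)
print(cleaned)
```

Accented letters are kept, because they carry a lot of signal for telling languages apart.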

Tools

This is the full set of tools I used for this analysis:

Code

Code is freely available on GitHub in rough form. Feel free to change it.



SQL Joins

7 month(s) ago uolter   comments 


.... and it's great when you want to learn the different kinds of joins:

sql joins
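To try two of the joins yourself, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names and rows are invented for illustration; only INNER JOIN and LEFT JOIN are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teams (name TEXT PRIMARY KEY);
    CREATE TABLE titles (team TEXT, year INTEGER);
    INSERT INTO teams VALUES ('Brasile'), ('Italia'), ('Olanda');
    INSERT INTO titles VALUES ('Brasile', 2002), ('Italia', 2006);
    """
)

# INNER JOIN: only teams that have a matching row in titles.
inner = conn.execute(
    "SELECT t.name, w.year FROM teams t "
    "INNER JOIN titles w ON w.team = t.name ORDER BY t.name"
).fetchall()

# LEFT JOIN: every team, with NULL (None in Python) where no title matches.
left = conn.execute(
    "SELECT t.name, w.year FROM teams t "
    "LEFT JOIN titles w ON w.team = t.name ORDER BY t.name"
).fetchall()

print(inner)
print(left)
conn.close()
```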



 