Introduction
Soccer is a well-liked sport with loads of statistics to investigate. On this venture, we explored the performances of two of the best soccer gamers of all time, Lionel Messi and Cristiano Ronaldo. We analyzed their objectives and assists knowledge from the seasons 2002/2003 to 2022/2023. Our goal was to seek out any attention-grabbing patterns or tendencies of their performances through the years.
Exploratory Knowledge Evaluation (EDA)
We began by loading the information right into a Pandas DataFrame and performing some exploratory knowledge evaluation. We noticed that Messi had the next imply and median for objectives, assists, and video games performed than Ronaldo. This advised that Messi was usually extra productive than Ronaldo. We additionally noticed that each gamers carried out exceptionally effectively within the 2018–19 season. Messi had the best variety of objectives and assists, whereas Ronaldo had the best variety of video games performed.
I visualized the information by plotting Messi’s and Ronaldo’s objectives towards video games. I noticed a optimistic correlation between the 2 variables, which was anticipated. Messi had the next slope, which indicated that he was extra more likely to rating a purpose in a recreation than Ronaldo. We additionally noticed that the variety of objectives and video games performed by each gamers elevated through the years, which advised that they have been turning into extra productive.
Characteristic Engineering
I created a brand new characteristic known as “Purpose Conversion Fee,” which was the ratio of objectives scored to video games performed. This characteristic was necessary as a result of it gave us a greater understanding of how environment friendly each gamers have been at changing their alternatives into objectives. We noticed that Messi had the next conversion price than Ronaldo, which advised that he was a extra medical finisher.
Speculation Testing
I carried out a speculation check to find out if there was a major distinction within the imply variety of objectives scored by Messi and Ronaldo. I used a two-sample t-test assuming equal variances and a significance degree of 0.05. Our check resulted in a t-statistic: -0.22474028517867933
p-value: 0.8234565161330686, I had dropped rows with lacking values first earlier than the speculation.
Modeling
I created a linear regression mannequin to foretell the variety of objectives scored by Ronaldo and Messi based mostly on the variety of video games performed and assists. The mannequin predicted 18 objectives for Ronaldo, and for Messi, the mannequin predicted 16 objectives.
Visualization
I plotted Messi’s and Ronaldo’s objectives and assists towards time. We noticed that each gamers’ performances fluctuated through the years however remained constantly excessive. I additionally noticed that Messi had the next variety of objectives and assists than Ronaldo.
I plotted the correlation matrix for the dataset, which confirmed that the variety of objectives scored was strongly positively correlated with the variety of video games performed and assists. We additionally noticed a optimistic correlation between assists and video games performed.
Conclusion
In conclusion, we explored the performances of Lionel Messi and Cristiano Ronaldo through the years. I noticed that Messi was usually extra productive than Ronaldo and had the next goal-conversion price. We additionally noticed that each gamers grew to become extra productive through the years. We created a mannequin to foretell Ronaldo’s objectives based mostly on the variety of video games performed and plotted the correlation matrix for the dataset. General, our evaluation offered insights into the efficiency of two of the best soccer gamers of all time.
Right here is the hyperlink to the code: https://github.com/HenryMorganDibie/Messi-VS-Ronaldo-Stats