
Portfolio

Rush's Logo.jpg
Rush's Music

I helped the owner of Rush's Music update several of the store's systems to make rental processing faster and easier. I consolidated most of the data collected over the summer into Excel files to speed up intake, which helped reduce overtime hours and expedite rental procedures.

Github_logo.png
La Liga Simulated Prediction
For a class project, I was asked to find data on a topic that interested me and build useful visualizations from it. I chose to predict the winner of the 2023-2024 La Liga season. I scraped and cleaned data from the web, applied a small amount of machine learning, and created visuals that explain what the data means, along with a table of predicted league standings.
Web Scraping Method
The code starts by scraping La Liga match data and shooting statistics from FBref.com for multiple seasons, covering 2023 back to 2020.

Next, it uses BeautifulSoup to parse the HTML of the league standings page and extract each team's URL for detailed data gathering.

The script then iterates over each season, collecting two types of data for each team: match fixtures and shooting statistics.

The match data and shooting stats are merged on the 'Date' column, filtered to La Liga matches, and tagged with 'Season' and 'Team' information.

Finally, all collected data is combined into a single DataFrame, its column names are converted to lowercase, and the result is saved as a CSV file, "matches_laliga.csv", for further analysis.
fbref_scrape_years(2024-2020).jpg
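The merge, filter, and save steps can be sketched roughly as follows. This is not the project's actual code: it uses only the Python standard library in place of the DataFrame merge so that it runs without any installs, and the column names ("Comp", "Result", "Sh", "SoT"), the team name, and every row are made up for illustration.

```python
import csv
import io


def merge_on_date(matches, shooting, season, team):
    """Inner-join two lists of row dicts on 'Date', keep La Liga rows, tag them."""
    shooting_by_date = {row["Date"]: row for row in shooting}
    merged = []
    for match in matches:
        shots = shooting_by_date.get(match["Date"])
        # Skip rows with no shooting stats and rows from other competitions.
        if shots is None or match["Comp"] != "La Liga":
            continue
        merged.append({**match, **shots, "Season": season, "Team": team})
    return merged


def to_csv_text(rows):
    """Lowercase the column names and serialise the rows as CSV text."""
    lowered = [{key.lower(): value for key, value in row.items()} for row in rows]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(lowered[0]))
    writer.writeheader()
    writer.writerows(lowered)
    return buffer.getvalue()


# Made-up rows for one hypothetical team (assumed column names).
matches = [
    {"Date": "2023-08-12", "Comp": "La Liga", "Result": "W"},
    {"Date": "2023-08-16", "Comp": "Copa del Rey", "Result": "D"},
    {"Date": "2023-08-20", "Comp": "La Liga", "Result": "L"},
]
shooting = [
    {"Date": "2023-08-12", "Sh": 14, "SoT": 6},
    {"Date": "2023-08-16", "Sh": 9, "SoT": 3},
]

all_rows = merge_on_date(matches, shooting, season=2023, team="Example FC")
csv_text = to_csv_text(all_rows)
print(csv_text)
```

In the real pipeline the same function would be called once per team and season, with the results appended to one list before being written to "matches_laliga.csv".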
Data Utilization
goals_vs_points.png
The graph illustrates a positive correlation between the number of goals scored (Goals For) and the total points accumulated in the season (Total Points) for La Liga teams, with a red regression line indicating that teams scoring more goals tend to earn more points.

Individual teams deviate from the trend: teams above the line earned more points than their goal tally would predict, and teams below it earned fewer. The shaded band around the regression line is its confidence interval.
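To make the idea of "above" and "below" the regression line concrete, here is a minimal sketch of an ordinary least-squares fit written with plain Python. The goal and point totals are invented for illustration, and the plotting library that produced the graph is not assumed here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up season totals: goals scored and points earned per team.
goals_for = [30, 38, 45, 52, 60, 71, 85]
points = [36, 39, 50, 51, 66, 70, 88]

slope, intercept = fit_line(goals_for, points)

# A positive residual means the team sits above the line (more points
# than its goals predict); a negative one means it sits below.
residuals = [p - (slope * g + intercept) for g, p in zip(goals_for, points)]
for goals, residual in zip(goals_for, residuals):
    position = "above" if residual > 0 else "below"
    print(f"{goals} goals: {residual:+.1f} points, {position} the line")
```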
The confusion matrix visualizes the performance of a classification model on La Liga match outcomes, with the diagonal cells (111 for 'Lose', 10 for 'Draw', 107 for 'Win') indicating correct predictions and off-diagonal cells showing misclassifications.

The diagonal counts suggest that the model predicts wins and losses far more reliably than draws: the correct counts for losses (111) and wins (107) are similar, while only 10 draws are classified correctly.
Confusion_Matrix.png
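For readers unfamiliar with how such a matrix is built, the sketch below counts actual versus predicted labels and derives accuracy and per-class recall. The eight matches are invented and are unrelated to the counts in the figure, and the project's own model and tooling are not assumed.

```python
LABELS = ["Lose", "Draw", "Win"]


def confusion_matrix(actual, predicted):
    """Rows are actual outcomes, columns are predicted outcomes."""
    index = {label: i for i, label in enumerate(LABELS)}
    matrix = [[0] * len(LABELS) for _ in LABELS]
    for truth, guess in zip(actual, predicted):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Made-up outcomes for eight matches.
actual = ["Win", "Win", "Draw", "Lose", "Lose", "Draw", "Win", "Lose"]
predicted = ["Win", "Lose", "Win", "Lose", "Lose", "Draw", "Win", "Win"]

matrix = confusion_matrix(actual, predicted)

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(LABELS)))
accuracy = correct / len(actual)

# Recall per class: correct predictions divided by actual occurrences.
recall = {
    label: matrix[i][i] / sum(matrix[i])
    for i, label in enumerate(LABELS)
}
print(matrix, accuracy, recall)
```

Comparing recall rather than raw diagonal counts avoids favouring outcomes that simply occur more often.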
This table was produced by aggregating and analyzing La Liga football match data, where a predictive model simulated outcomes for each team across the season, factoring in historical performance, current form, and various statistical inputs.

The table sums each team's wins (W), draws (D), and losses (L) over a 38-match season. Disparities in the totals suggest that the predictive model needs further refinement so that the simulated results align with realistic match counts.
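One way to spot the match-count disparities mentioned above is to tally the predicted outcomes per team and flag any team whose total differs from the expected number of matches. The sketch below is an assumption about how that check could look, not the project's code: the function names are hypothetical, the predictions are invented, and a 4-match season stands in for the real 38 to keep the example short. Points follow the usual rule of 3 for a win and 1 for a draw.

```python
from collections import Counter, defaultdict


def build_table(predictions):
    """Tally (team, outcome) pairs into a table sorted by points."""
    tallies = defaultdict(Counter)
    for team, outcome in predictions:
        tallies[team][outcome] += 1
    table = []
    for team, counts in tallies.items():
        wins, draws, losses = counts["W"], counts["D"], counts["L"]
        table.append({
            "team": team, "W": wins, "D": draws, "L": losses,
            "played": wins + draws + losses,
            "points": 3 * wins + draws,
        })
    table.sort(key=lambda row: row["points"], reverse=True)
    return table


def teams_with_wrong_match_count(table, expected_matches):
    """Return the teams whose W + D + L does not equal the expected total."""
    return [row["team"] for row in table if row["played"] != expected_matches]


# Made-up predictions; Team B is missing one match on purpose.
predictions = [
    ("Team A", "W"), ("Team A", "W"), ("Team A", "D"), ("Team A", "L"),
    ("Team B", "L"), ("Team B", "D"), ("Team B", "W"),
    ("Team C", "D"), ("Team C", "D"), ("Team C", "L"), ("Team C", "L"),
]

table = build_table(predictions)
flagged = teams_with_wrong_match_count(table, expected_matches=4)
print(table)
print(flagged)
```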