Portfolio Page – Coffee Quality Modelling

Project Overview

A self-directed project applying core machine learning techniques to open-source coffee quality data. The analysis examined relationships between sensory measures, using four methods: geospatial analysis to explore regional trends, regression to predict one score from the others, clustering to identify sensory profiles, and time series analysis to examine seasonal patterns and production stability.
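As a rough illustration of the regression step, the sketch below fits a one-predictor least-squares line in plain Python and reports R². It is not the project's own code: the names `aroma` and `flavour` and their values are made up for illustration, and the project itself lists scikit-learn and statsmodels for modelling.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and co-deviations of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical sensory scores, for illustration only (not from the dataset)
aroma = [7.0, 7.5, 8.0, 8.5]
flavour = [7.1, 7.4, 8.1, 8.4]

slope, intercept = fit_line(aroma, flavour)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, "
      f"R^2={r_squared(aroma, flavour, slope, intercept):.3f}")
```

With several predictors, the same idea would more likely be expressed through a library model than written by hand.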

Tools Used


  • Excel – Data Preparation | Visualisation | Analysis
  • Keynote – Presentation
  • Python (Jupyter | Anaconda) – Scripting Environment
  • pandas | numpy | os – Data Manipulation | File Handling
  • matplotlib | seaborn | pylab – Plotting | Visualisation
  • scikit-learn | statsmodels – Machine Learning | Statistical Modelling
  • folium – Geospatial Visualisation
  • Tableau – Dashboard Design

Skills Demonstrated


  • Script Writing
  • Exploratory Data Analysis | Data Wrangling | Aggregation | Subsetting
  • Linear Regression | Clustering (K-Means) | Model Evaluation
  • Time Series Analysis | Stationarity Testing | Lag Analysis
  • Geospatial Mapping
  • Visualisation | Dashboard Design
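
Lag analysis, one of the skills listed above, measures how strongly a series correlates with a shifted copy of itself. The following is a minimal sketch in plain Python, assuming an evenly spaced series; the function name `autocorrelation` and the monthly values are hypothetical and are not taken from the project.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Co-deviation between each value and the value `lag` steps earlier
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


# Hypothetical monthly values, for illustration only (not from the dataset)
monthly_values = [10, 14, 11, 15, 10, 14, 12, 15]

for lag in (1, 2):
    print(lag, round(autocorrelation(monthly_values, lag), 3))
```

A strongly positive value at a given lag suggests a pattern that repeats at that interval. Formal stationarity tests are better left to a statistics library.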

Data Sourced


This analysis uses a modified version of data originally sourced from the Coffee Quality Institute, made available on Kaggle.

Coffee Quality Dataset – Bean origin, variety, altitude, processing method, physical attributes, flavour metrics, and total quality score, with geospatial coordinates for most entries.

The dataset was accessed on 2 November 2024.

Key Insights


Insight.

Visual — Description

Abc.

Visual — Description

Abc.


Insight.

Visual — Description

Abc.

Visual — Description

Abc.


Insight.

Visual — Description

Abc.

Visual — Description

Abc.

Visual — Description

Abc.

Key Takeaways

Recommendations


  • Genre Strategy
    Abc.
  • Regional Focus
    Abc.
  • Competitive Positioning
    Abc.
  • Digital Strategy
    Abc.

Links & Deliverables


Tableau Dashboard *(link to be added)*