Project · MSc Coursework

Computational data analysis

Applied statistical modeling and unsupervised learning on real datasets, across two projects covering regression with regularization and Gaussian Mixture Models.

Regression · Regularization · GMM · Statistics · Python
Problem

Explore how different statistical models behave when applied to noisy, real-world data: specifically how regularization affects regression, and how mixture models handle physiological signal data.

Approach

Project 1, Regression: Built linear regression models with L1 (Lasso) and L2 (Ridge) regularization and tuned hyperparameters via grid search. Investigated how regularization strength affects coefficient sparsity, prediction error, and model interpretability.
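A minimal sketch of the kind of tuning described in Project 1, assuming scikit-learn and a synthetic dataset in place of the coursework data. The alpha grid, the 5-fold cross-validation, and the helper name `tune` are illustrative assumptions, not the original setup.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 10 features, only the first 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=1.0, size=200)

# Hypothetical grid of regularization strengths, log-spaced.
alphas = np.logspace(-3, 1, 9)


def tune(estimator, step_name):
    """Grid-search alpha with 5-fold CV; features are standardized inside each fold."""
    pipe = make_pipeline(StandardScaler(), estimator)
    grid = GridSearchCV(
        pipe,
        {f"{step_name}__alpha": alphas},
        cv=5,
        scoring="neg_mean_squared_error",
    )
    return grid.fit(X, y)


lasso_search = tune(Lasso(max_iter=10_000), "lasso")
ridge_search = tune(Ridge(), "ridge")

for name, search in [("lasso", lasso_search), ("ridge", ridge_search)]:
    coef = search.best_estimator_[-1].coef_
    print(name, search.best_params_,
          "nonzero coefficients:", int(np.count_nonzero(coef)),
          "CV MSE:", round(-search.best_score_, 3))

# Coefficient sparsity as a function of L1 strength: count nonzero Lasso coefficients.
X_std = StandardScaler().fit_transform(X)
sparsity = {
    a: int(np.count_nonzero(Lasso(alpha=a, max_iter=10_000).fit(X_std, y).coef_))
    for a in (0.01, 0.1, 1.0, 10.0)
}
print(sparsity)
```

The L1 penalty can set coefficients exactly to zero as alpha grows, whereas the L2 penalty only shrinks them toward zero, which is why sparsity is counted for Lasso alone.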

Project 2, Unsupervised learning: Applied Gaussian Mixture Models to physiological signal data (heart rate, blood pressure, skin conductance) to explore whether clusters correspond to distinct emotional states. Fit models with varying numbers of components and evaluated them using the Bayesian Information Criterion (BIC) and cluster coherence.
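The component-selection step in Project 2 could look roughly like the sketch below, assuming scikit-learn. The data are synthetic stand-ins with made-up values for the three signals, and the silhouette score is used here only as one possible proxy for "cluster coherence"; the metric actually used in the project is not specified.

```python
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for (heart rate, systolic blood pressure, skin conductance).
# Two made-up "states"; all numbers are illustrative assumptions.
centers = np.array([[65.0, 115.0, 2.0], [95.0, 135.0, 8.0]])
scales = np.array([5.0, 8.0, 1.0])
X = np.vstack([rng.normal(c, scales, size=(150, 3)) for c in centers])

# Put the signals on a common scale before fitting.
X = StandardScaler().fit_transform(X)

bic = {}  # lower is better
sil = {}  # silhouette score, defined only for k >= 2
for k in range(1, 7):
    gmm = GaussianMixture(
        n_components=k, covariance_type="full", n_init=5, random_state=0
    ).fit(X)
    bic[k] = gmm.bic(X)
    if k >= 2:
        sil[k] = silhouette_score(X, gmm.predict(X))

best_k = min(bic, key=bic.get)
print("BIC by k:", {k: round(v, 1) for k, v in bic.items()})
print("selected k:", best_k)
```

A low BIC only says that a given number of Gaussians describes the density well; whether the resulting clusters map onto emotional states still has to be checked against labels or domain knowledge.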

Validation

Evaluated model behavior through error metrics, parameter sensitivity analysis, and qualitative inspection of learned clusters. Checked for overfitting by comparing training and held-out performance, and examined whether identified clusters had interpretable structure.
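The overfitting check can be sketched as a comparison of training and held-out error across regularization strengths. This assumes scikit-learn and a deliberately small synthetic dataset with more features than training samples, so the gap is easy to see; the helper name `train_test_mse` and the alpha values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-in (assumption): 60 samples, 40 features, one informative feature.
X = rng.normal(size=(60, 40))
y = 2.0 * X[:, 0] + rng.normal(size=60)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)


def train_test_mse(alpha):
    """Fit Ridge on the training split and return (train MSE, held-out MSE)."""
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse = mean_squared_error(y_te, model.predict(X_te))
    return train_mse, test_mse


results = {alpha: train_test_mse(alpha) for alpha in (1e-3, 1.0, 100.0)}
for alpha, (train_mse, test_mse) in results.items():
    print(f"alpha={alpha:g}  train MSE={train_mse:.3f}  held-out MSE={test_mse:.3f}")
```

A training error far below the held-out error signals overfitting; increasing alpha raises the training error and, up to a point, typically narrows that gap.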

Outcome

Practical experience translating statistical theory into working models and interpreting results with appropriate caution, including recognizing when apparent patterns in unsupervised outputs are not actually supported by the data.