Ha Vu

Data Enthusiast


About Me

Numbers have always fascinated me. During high school, solving advanced algebra and calculus problems was the activity I anticipated the most when classes started. When I went to college, I realized that numbers can be so much more than problems on paper and blackboard. Numbers can tell interesting stories about monthly cosmetic sales, students' weekly performance in an online coding course, or a user's weekly message open rate on an online dating app. As I dove deeper, I learned to see correlation and causation relationships between different metrics, analyze A/B experiment results, and, with the right approach, even optimize segmentation for a business's promotional campaigns. These are all things I've learned through my academic and professional experience as an aspiring data scientist. As a data enthusiast, I hope to continue having the chance to apply data analytics, statistics, and machine learning to solving real-world business problems.

I'm an international student from Ho Chi Minh City, Vietnam. In my free time, you can usually find me sitting in a coffee shop, reading manga or browsing for dinner recipes.


Projects

For this project, my teammate and I used machine learning models to classify tweets from politicians in Northern Europe into four political categories: left, right, center, and independent. We used a pretrained Word2Vec model to transform the tweets into vectors and the CatBoost classifier to classify them.

As part of a team of four, I built an end-to-end data-intensive application that tracked the inventory of bikes and available docks at a given station in the NYC Citibike bike-sharing system, using PySpark and SQL on the Databricks platform.

For this competition, I predicted sub-seasonal temperatures in different regions of the U.S. for November and December 2022 using machine learning models and visualization, achieving a root-mean-squared error of 0.835 after model ensembling.


Experience

Graduate Cohort Study Group Leader

University of Rochester - The Learning Center   -   Rochester, NY

August 2023 – Present

Data Analyst

STEAM for Vietnam   -   Remote

June 2021 – Present

Data Science Intern

Tinder   -   Rochester, NY

June 2023 – August 2023

Math Learning Assistant

Agnes Scott College Resource Center for Math and Science   -   Decatur, GA

August 2020 – May 2022

Digital Data Intern

L’Oréal Vietnam   -   Ho Chi Minh City, Vietnam

March 2021 – May 2021

Marketing Intern

Pressure System Builders Vietnam (PSBV)   -   Ho Chi Minh City, Vietnam

June 2020 – July 2020

Career Peer

Agnes Scott College Office of Internship and Career Development   -   Decatur, GA

April 2020 – May 2021

SUMMIT Peer Advisor

Agnes Scott College Office of Academic Advising and Accessible Education   -   Decatur, GA

March 2020 – May 2021

Student Program Assistant

Agnes Scott College Center for Diversity and Inclusion   -   Decatur, GA

September 2018 – May 2020


Mathematical Projects

For this project, I created a mathematical model to control the feral cat population in a city. We used differential equations to measure the impact of the feral cat population on the city when left uncontrolled and suggested a solution to control the population.

For this project, I simulated the spread of fake news on social media, using the SIR differential equation model to measure the trend of fake news and explore ways to deter its spread.

This report summarizes my team's analysis of the paper "Dynamic Modeling of Exercise Effects on Plasma Glucose and Insulin Levels" by Anirban Roy and Robert S. Parker, which discusses the effects of exercise on plasma insulin and glucose levels in people with diabetes.


Connect