Impact of Zone and Construction Permits on Real Estate Prices in Washington, D.C.
This is a group project completed for the course MGT 6203 (Data Analytics in Business) as part of my graduate studies at Georgia Tech. The objective of this project was to evaluate how neighborhood investment and development, as measured by construction and building permits, affect residential sale prices in Washington, D.C. over the five-year period from 2018 to 2022. Publicly available datasets released by the Washington, D.C. government were used to analyze patterns at the Census Block Group level.
Holt-Winters Forecasting with CVG Temperature Data
The goal of this exercise is to use Holt-Winters forecasting, also known as triple exponential smoothing, to predict future monthly average temperatures in the Cincinnati area. The data was taken from the National Weather Service, and contains the monthly temperature averages in Fahrenheit from January 2000 through December 2022.
I used the great e-textbook Forecasting: Principles and Practice by Hyndman and Athanasopoulos to inform my knowledge and methodology.
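To make the smoothing steps concrete, here is a minimal sketch of additive Holt-Winters in plain Python. It is not the code used in the project: the function name `holt_winters_additive`, the smoothing parameters, the simple initialization, and the monthly values are all assumptions chosen for illustration, and the values are synthetic rather than the National Weather Service data.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive triple exponential smoothing."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season.
    level = sum(series[:m]) / m
    # Initial trend: per-step change between the means of the first two seasons.
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    # Initial seasonal offsets: first-season deviations from the initial level.
    seasonal = [y - level for y in series[:m]]

    for t, y in enumerate(series):
        s_prev = seasonal[t % m]
        prev_level, prev_trend = level, trend
        # Level: blend the deseasonalized observation with the previous projection.
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Seasonal: blend the detrended observation with the previous offset.
        seasonal[t % m] = gamma * (y - prev_level - prev_trend) + (1 - gamma) * s_prev

    n = len(series)
    # Forecast h steps ahead: level + h * trend + the matching seasonal offset.
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Synthetic monthly averages in Fahrenheit (made-up values), repeated for four years.
monthly_pattern = [30, 34, 43, 54, 63, 72, 76, 74, 67, 55, 44, 34]
history = monthly_pattern * 4

# Hypothetical smoothing parameters; in practice these would be fitted to the data.
forecast = holt_winters_additive(history, season_length=12,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=12)
print([round(value, 1) for value in forecast])
```

Because the synthetic series repeats exactly with no trend, the forecast simply reproduces the seasonal pattern; real temperature data would also exercise the level and trend updates.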
Titanic Classification with Support Vector Machines
This is the (classic) Titanic classification problem. The training data set contains information about each passenger, and whether they survived the sinking of the Titanic or not. Let’s see how accurate a model I can build, using Support Vector Machine classifiers with K-fold Cross Validation, to predict whether a passenger would survive.
The training data is provided by Kaggle.com, from their Titanic – Machine Learning from Disaster competition. Kaggle also provides a testing data set, but because of the nature of the competition it does not include the survival column. The testing data is therefore created by splitting the training data.
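The data handling described above (holding out part of the training data as a test set, then cross-validating on the rest) can be sketched in plain Python. This only illustrates the index bookkeeping, not the SVM fitting itself; the helper names, the row count of 100, the 20% test fraction, and the choice of 5 folds are all assumptions for illustration.

```python
import random


def train_test_split_indices(n_rows, test_fraction, seed):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k):
    """Yield (train, validation) index lists, one pair per fold."""
    # Deal the indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Each fold serves once as validation; the other folds form the training part.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical sizes: 100 labeled rows, 20% held out for testing, 5 folds.
train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2, seed=42)
folds = list(k_fold_indices(train_idx, k=5))

for fold_number, (train, validation) in enumerate(folds, start=1):
    print(fold_number, len(train), len(validation))
```

In each fold a classifier would be fitted on the `train` rows and scored on the `validation` rows, with the held-out test rows used only once, for the final accuracy estimate.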
Import Container Tracker
This is a script I designed to make it much easier to track shipping containers in bulk. It takes an input .txt file of desired container numbers and drives a Chrome window to input the numbers into SeaRates.com and return the last status in a .csv file.
If the site is slow or the container number does not yield any results, the script waits 20 seconds before outputting “No useful information can be output” and moving on. In these cases, I have to manually enter the container number into a separate site (typically that of the actual freight forwarder).
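The wait-then-fall-back behavior can be sketched as a small polling loop. This is an illustration under stated assumptions, not the script itself: the browser automation is replaced by a hypothetical `lookup` callable, and the function names, column headers, container numbers, and status text are invented for the example.

```python
import csv
import io
import time

NO_RESULT = "No useful information can be output"


def wait_for_status(lookup, container_number, timeout=20.0, poll_interval=0.5):
    """Poll lookup(container_number) until it returns a status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = lookup(container_number)
        if status:
            return status
        if time.monotonic() >= deadline:
            # Give up on this container so the rest of the batch can proceed.
            return NO_RESULT
        time.sleep(poll_interval)


def track_containers(container_numbers, lookup, out_file,
                     timeout=20.0, poll_interval=0.5):
    """Write one CSV row per container: its number and last known status."""
    writer = csv.writer(out_file)
    writer.writerow(["container_number", "last_status"])
    for number in container_numbers:
        status = wait_for_status(lookup, number, timeout, poll_interval)
        writer.writerow([number, status])


# Stand-in for the real site lookup: a dict of made-up statuses.
fake_statuses = {"ABCD1234567": "Discharged at port"}

buffer = io.StringIO()
track_containers(["ABCD1234567", "WXYZ7654321"], fake_statuses.get, buffer,
                 timeout=0.05, poll_interval=0.01)  # short timeout for the demo
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

Keeping the timeout per container means one slow or missing result costs at most the timeout and never blocks the rest of the batch.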
King County Housing Data Visualization
Interactive visualization in Tableau depicting relationships among house features, price, crime, and local health inspection scores using publicly available government data. This was originally created for my Data Visualization class at the University of Cincinnati.
Credit Card Evaluator
This is a Pandas utility that calculates total cashback based on my PNC credit card spending. I built it to compare several credit cards that I was considering. The user enters the cashback percentage to test for each spending category; the script then assigns the matching percentage to each row of the sample .csv data, calculates the cashback for that row, and sums the results.
For my own personal use, I tested the PNC Cash Rewards card, Chase Freedom Unlimited, and Citi Custom Cash card. Based on my existing spending data, the Chase card yielded the most cashback. I still use the Chase card today!
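The per-row calculation can be sketched as follows. To keep the example dependency-free it uses the standard library's `csv` module rather than Pandas, so it is a stand-in for the approach and not the utility itself. The column names, spending rows, and cashback rates are hypothetical and do not reflect my spending or the actual terms of any card.

```python
import csv
import io


def total_cashback(csv_text, rates, default_rate=0.0):
    """Sum cashback over all rows, using each row's category to pick its rate."""
    total = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Categories without an explicit rate fall back to the default rate.
        rate = rates.get(row["category"], default_rate)
        total += float(row["amount"]) * rate
    return round(total, 2)


# Made-up spending data with hypothetical column names.
sample_csv = """category,amount
Groceries,100.00
Gas,50.00
Dining,40.00
Other,200.00
"""

# Hypothetical cards: one with category bonuses, one with a flat rate.
category_card = total_cashback(
    sample_csv, {"Groceries": 0.02, "Gas": 0.04, "Dining": 0.03}, default_rate=0.01
)
flat_card = total_cashback(sample_csv, {}, default_rate=0.015)

print("category card:", category_card)
print("flat card:", flat_card)
```

Running the same spending data through each card's rate table gives totals that can be compared directly, which is the comparison described above.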