◆ Greatly improved a model that predicts home clean times for scheduling and cost estimation. Took the model from conception to production, iterating through multiple versions. This involved working closely with engineering teams to scope out how the model would interact with the scheduler and to get the model into a production-ready state. The result was a greatly decreased error rate with an ROI of $3.4M. This work was extended into a model that gives onboarding specialists data to better negotiate contractor rates.
◆ Led the feature engineering side of a data lake project to provide a central source of features for both the data science team and the rest of the company (mainly analysts). The main goal was to reduce duplicated effort across the team and increase knowledge sharing.
◆ As part of this project, wrote tickets and prioritized all work. Mentored two interns. Used Python/Dask to create features and write them to S3 buckets. Worked with engineering using Athena + AWS Glue so that features in S3 could be queried through Redshift Spectrum. Created over 30 features in 2 months. This was an ongoing effort that also provided experience with geographical data.
◆ Worked on a first-pass recommendation system for the website, mobile apps, and marketing. This involved an API and a substantial amount of software engineering work for deployment. The recommendation system suggested markets using s2spheres in order to generalize and better map to other location data at Johnson & Johnson.
◆ Continued work on the owner churn model and explored different methods. Part of this involved identifying gaps in owner data and engagement, along with possible ways to increase and capture both. Also identified additional data that could be acquired to further improve the accuracy of the model.