Hamel Husain

Staff Machine Learning Engineer @ GitHub

Beaverton, OR



I enjoy building tools and platforms to make data scientists more successful. I have authored and am a core maintainer of many popular open source data science tools and infrastructure. I also have extensive experience as a machine learning engineer across a wide variety of industries.

Professional Page: https://hamel.dev/

Ask me about
Weights & Biases
Machine Learning
Developer Tools
Data Science and Machine Learning Platforms
Data science

Work experience

Sep 2020 - Present


Creating Amazing Tools For Data Scientists

Work with Jeremy Howard on various open-source projects. My focus is on infrastructure, developer tools, DevOps, and software engineering. Some products I've been involved with so far:
- nbdev: A notebook-enabled software development environment (github.com/fastai/nbdev)
- fastcore: Extensions to the Python programming language (github.com/fastai/fastcore)
- ghapi: A Python client for the GitHub API with support for GitHub Actions (github.com/fastai/ghapi)
- fastpages: A blogging platform with support for Jupyter notebooks (github.com/fastai/fastpages)

I've also created and help maintain many of the CI, integration tests, and build tools for the fastai ecosystem. Examples include Conda build tooling (https://github.com/fastai/fastconda) and Docker image maintenance (https://github.com/fastai/docker-containers), among many others.


2017 - Present


Staff Machine Learning Engineer

- Creating systems and conducting research that enable representation learning of code, issues, repos, and users.
- Led launch of CodeSearchNet (www.github.com/github/CodeSearchNet)
- Building a new class of Machine Learning Ops products: https://www.mlops-github.com

Open sourced several examples of how to build data products and use deep learning with public GitHub data. These examples can be viewed at http://hamel.dev and include:
- Summarizing GitHub Issues using sequence to sequence models
- Natural Language Semantic Code Search
- How to create machine learning enabled GitHub Apps to automate the developer workflow
- How to achieve Machine Learning Ops with GitHub Actions and Kubeflow


2016 - 2017


Senior Data Scientist - Machine Learning

- Led the creation of a company-wide machine learning infrastructure roadmap.
- Optimized marketing spend of $500M annually using machine learning.
- Built customer lifetime value models that predict spending patterns for every user at an individual level. These models are used across the entire company in marketing, customer service, risk, and finance.
- Created image classification models to recognize objects in listing photos and for image re-ordering.
- Built end-to-end pipelines for data products using Airflow.
- Created reusable frameworks and infrastructure that enable other data scientists to iterate faster.
- Used a large suite of tools for modeling: Spark, Python, Keras / TensorFlow, DataRobot, H2O, Vowpal Wabbit.

http://nerds.airbnb.com/data/


2015 - 2016


Senior Data Scientist

Data scientist and product manager for a machine-learning software company.
- Deployed a wide variety of modern algorithms such as xgboost, random forest, gradient boosted trees, support vector machines, and elastic net on operational systems.
- Regularly used Python, Spark, and H2O for data wrangling, model building, and visualizations.
- Built data science workflows to ensure that data was aggregated, collected, and cleaned in ways that maximized accuracy and transparency.
- Contributed significantly to the product design and user experience for the data ingest and Python API components of the product.
- Benchmarked and studied competing and complementary machine learning software for accuracy, speed, and usability (H2O, Graphlab/Turi, Amazon, Azure, etc.).
- Contributed to product strategy and roadmap.
- Presently an advisor on product strategy and data science.

2011 - 2015


VP, Applied Analytics

Led data science efforts on consulting engagements for a wide variety of industries. Some concrete outcomes:
- Founding member of the advanced analytics consulting practice; grew a team from 5 to 50 machine learning and analytics practitioners.
- Built data-driven tools that enabled a large restaurant chain to reduce food waste by 15% through demand forecasting.
- Created models and interactive data visualizations for a large casino to increase the detection of fraud by 50%.
- Performed market basket analysis and built recommendation systems that increased conversion rate by 5% for a large clothing retailer.
- Launched a professional development curriculum for junior data scientists that built skills in data wrangling, predictive modeling, and visualization using Python and the command line.


2004 - 2008


Consultant - Management Consulting

Specialized in CRM data analysis for Fortune 500 telecommunications companies. Led teams of 3-5 people on multiple engagements to implement data analytics solutions for clients in the telecom industry. Saved clients over $85M in the areas of supply chain, call center optimization, and churn reduction. Some examples include:
- Provided early detection of product defects and network failures through use of call center analytics, saving over $10M in contact center overhead.
- Built predictive models to provide an early warning system for customer churn, allowing the client to intervene and reduce churn rate by 7%.
- Developed scorecards, metrics, and dashboards that enabled clients to increase customer satisfaction while simultaneously reducing costs by changing employee and partner incentive structures.


May 2003 - Apr 2004

Washington Mutual

Credit Risk Analyst

Developed and implemented models to evaluate credit risk on loan portfolios.


Georgia Institute of Technology

Master of Science (M.S.), Computer Science, Machine Learning

University of Michigan

Doctor of Law (J.D.), Cum Laude

Southern Methodist University

B.S., Mathematics and Industrial Engineering


@ Copyright 2020 OfficeHours Technologies Co.