Open Source
My open-source work has been focused on developer tools and infrastructure. I’ve contributed to projects such as fastai, Metaflow, Kubeflow, Jupyter, and Great Expectations, as well as many others. I list some of these below:
Axolotl
I am a core contributor to Axolotl, a library for efficient fine-tuning of large language models. I also wrote an in-depth debugging guide for Axolotl.
fastai
I maintain and contribute to a variety of fastai projects. These are the ones I’ve been most involved in:
Project | Description | Role | Other References |
---|---|---|---|
fastpages | An easy-to-use blogging platform for Jupyter Notebooks. | Creator | Blog, Talk |
nbdev | Write, test, document, and distribute software packages and technical articles all in one place, your notebook. | Core Contributor | Blog, Talk |
fastcore | A Python language extension for exploratory and literate programming. | Core Contributor | Blog |
ghapi | A Python client for the GitHub API. | Core Contributor | Blog |
Metaflow
I created notebook cards, a tool that lets you use notebooks to generate reports, visualizations, and diagnostics in Metaflow production workflows. Blog
Kubeflow
I’ve worked on several projects related to Kubeflow, mainly around examples and documentation:
Project | Description | Role | Other References |
---|---|---|---|
GitHub Issue Summarization | An end-to-end example of using Kubeflow to summarize GitHub Issues. It became one of the most popular Kubeflow tutorials. | Author | Interview with Jeremy Lewi |
kubeflow/code-intelligence | Various tutorials and applied examples of Kubeflow. | Core Contributor | Talk |
The Kubeflow Blog | I used fastpages to create the official Kubeflow blog. | Core Contributor | Site |
Jupyter
I created the Repo2Docker GitHub Action, which lets you trigger repo2docker to build a Jupyter-enabled Docker image from your GitHub repository. This Action allows you to pre-cache images for your own BinderHub cluster or for mybinder.org.
This project was accepted into the official JupyterHub GitHub org.
Great Expectations
I developed the Great Expectations GitHub Action that allows you to use Great Expectations in CI/CD Workflows. Blog.
Other
I worked as a staff machine learning engineer at GitHub from 2017 to 2022. I led or created the following open-source projects that explored the intersection of machine learning, data, and the developer workflow:
Project | Description | Role | Other References |
---|---|---|---|
Code Search Net | Datasets, tools, and benchmarks for representation learning of code. This was a big part of the inspiration for GitHub’s eventual work on Copilot. | Lead | Blog, Paper |
Machine Learning Ops | A collection of resources on how to facilitate Machine Learning Ops with GitHub. This project explored integrating a wide variety of data science tools with GitHub Actions. | Creator | Blog |
Issue Label Bot | A GitHub App powered by machine learning that auto-labels issues. | Creator | Blog, Talk |
Covid19-dashboard | A demonstration of how to use GitHub Actions, Jupyter Notebooks and fastpages to create interactive dashboards that update daily. | Creator | News Article |