Open Source

A collection of open source projects I’ve been involved with.

My open-source work has been focused on developer tools and infrastructure. I’ve contributed to projects such as fastai, Metaflow, Kubeflow, Jupyter, and Great Expectations, as well as many others. I list some of these below:


Axolotl

I am a core contributor to Axolotl, a library for efficient fine-tuning of large language models. I also wrote an in-depth debugging guide for Axolotl.

fastai

I maintain and contribute to a variety of fastai projects. Below are the projects I’ve been most involved in:

Project | Description | Role | Other References
fastpages | An easy to use blogging platform for Jupyter Notebooks. | Creator | Blog, Talk
nbdev | Write, test, document, and distribute software packages and technical articles all in one place, your notebook. | Core Contributor | Blog, Talk
fastcore | A Python language extension for exploratory and literate programming. | Core Contributor | Blog
ghapi | A Python client for the GitHub API. | Core Contributor | Blog

Metaflow

I created notebook cards, a tool that lets you use notebooks to generate reports, visualizations, and diagnostics in Metaflow production workflows. Blog

Kubeflow

I’ve worked on several projects related to Kubeflow, mainly around examples and documentation:

Project | Description | Role | Other References
GitHub Issue Summarization | An end-to-end example of using Kubeflow to summarize GitHub Issues. It became one of the most popular Kubeflow tutorials. | Author | Interview with Jeremy Lewi
kubeflow/code-intelligence | Various tutorials and applied examples of Kubeflow. | Core Contributor | Talk
The Kubeflow Blog | I used fastpages to create the official Kubeflow blog. | Core Contributor | Site

Jupyter

I created the Repo2Docker GitHub Action, which lets you trigger repo2docker to build Jupyter-enabled Docker images from your GitHub repository. You can use this Action to pre-cache images for your own BinderHub cluster or for mybinder.org.

This project was accepted into the official JupyterHub GitHub org.

Great Expectations

I developed the Great Expectations GitHub Action, which lets you use Great Expectations in CI/CD workflows. Blog.

Other

I worked as a staff machine learning engineer at GitHub from 2017 to 2022. I led or created the following open source projects, which explored the intersection of machine learning, data, and the developer workflow:

Project | Description | Role | Other References
Code Search Net | Datasets, tools, and benchmarks for representation learning of code. This was a big part of the inspiration for GitHub’s eventual work on Copilot. | Lead | Blog, Paper
Machine Learning Ops | A collection of resources on how to facilitate Machine Learning Ops with GitHub. This project explored integrating a wide variety of data science tools with GitHub Actions. | Creator | Blog
Issue Label Bot | A GitHub App powered by machine learning that auto-labels issues. | Creator | Blog, Talk
Covid19-dashboard | A demonstration of how to use GitHub Actions, Jupyter Notebooks, and fastpages to create interactive dashboards that update daily. | Creator | News Article