Get in touch: email@example.com
I founded MonkeySee to make it easier to test and automate web apps. We're using the latest language modeling and computer vision techniques to understand HTML and act on it.
Software Engineer ↣ Vice President, Machine Learning
I was at Globality for just over five years. I built up most of their initial ML platform (years 1-3) and then focused on growing their team into the organization it is today (years 3-5). Our focus was building the world's most extensive network of professional service suppliers using web crawling, information extraction, and NLP.
Investor & Advisor
I make small seed-sized investments in promising deep tech startups. I also advise companies on engineering architecture and AI strategy.
Engineer in Residence (Intern)
At Founders Circle Capital, I wrote an ML prioritization pipeline for their prospects so they could focus on the highest-likelihood leads.
I studied at Stanford University. My core research was on predictive modeling of time series data (for wearable sensors) and multimodal models for language modeling and vision understanding.
gpt-json provides schema definition and validation for the GPT line of models, for use in data pipelines and structured reasoning.
vectordb-orm is a small ORM wrapper on top of vector databases. It allows for easier model definition as Python objects and abstracts the backend details. It currently supports Milvus and Pinecone.
Dagorama is a simple daemon library for Python. It's a hybrid between Celery and Dask, where tasks can be chained together but run on different machines in parallel. Work in Progress.
Groove is a MITM proxy specifically optimized for web crawling and unit test construction. The core logic is written in Go, with an API client in Python. It allows customization of cache handling, request recording, third-party routing, and TLS fingerprinting.
Headfull Chrome is a Docker image for easier web crawling that integrates font packages, display virtualization, and remote control. It makes it easier to develop web automation that mirrors how you use a browser. For more background, see the post