Jason Carlson

Data / ML Engineer

Machine learning and data engineer with experience building large-scale production systems that handle petabytes of data and have delivered tens of millions of dollars in business impact.

Book a 30-minute intro call: Calendly Link

What I help with

  • Data pipeline architecture and debugging

  • Machine learning model prototyping and deployment

  • Scaling Spark / Ray / distributed data workflows

  • ML experimentation infrastructure

  • Performance optimization for large data systems

Who I help

  • Early-stage startups

  • Small businesses with messy data

  • Teams that need a first ML/data system

  • Founders who need part-time technical help

Selected Experience

Large-scale search ML pipeline (CloseMatch)
Built a petabyte-scale Apache Spark ML pipeline from scratch to reduce undifferentiated search results. The system became the foundation for an entire engineering team's workflow. Annual profit: ~$15M.
Stack: AWS, EMR, Spark, Random Forest, BERT
Identity Graph System
Built a petabyte-scale graph-processing pipeline for identity resolution. Reduced refresh latency by 50% and cold-start processing time by 80%, which saved ~$1.5M per year and sped up iteration from cold starts.
Stack: Google Cloud, Dataproc, Spark
Advertiser Risk ML Model
Built and deployed a supervised ML system that reduced advertiser campaign suspensions, increasing annual profit from $15M to $43M.
Stack: Random Forest, AWS Lambda, Glue, S3
Distributed ML Experimentation Platform
Built a Ray-based experimentation platform on Kubernetes for large-scale reinforcement learning research and model evaluation.
Stack: Ray, EKS

Engagements

  • Intro calls

  • Architecture reviews

  • Project-based consulting

  • Fractional technical advising


Contact

Best way to reach me for general inquiries: [email protected]

To schedule a free intro call: Calendly Link

To schedule a paid consulting call ($300 / hour): Calendly Link