Caleb N. Ellington

Ph.D. Student @ SAILING Lab in CMU-Pitt Computational Biology, advised by Eric P. Xing
Previously: Genesis Therapeutics / UW IPD / Indeed / Klavins Lab / Amazon


GHC 7405

Gates Hillman Complex, 4902 Forbes Ave

Pittsburgh, PA 15213

I work on AI/ML for biology to enable automated biomarker discovery, drug repurposing, and therapeutic target identification for heterogeneous and rare diseases. My research interests span foundation models, probabilistic graphical models, biological modeling, and multi-task learning.

I co-created Contextualized ML, an sklearn-style toolbox for personalized modeling with heterogeneous observational data. Contextualized Machine Learning is a novel machine learning paradigm that estimates models, functions, and distributions at sample-specific resolution, allowing per-patient or per-cell analysis of biomedical data and enabling truly personalized modeling for healthcare, precision diagnostics, and drug repurposing.

I also enjoy rock climbing, running, backpacking, guitar, learning languages, and wild speculation about Spotify’s recommender system. I’ve lived in Austin TX, Seattle WA, and Pittsburgh PA, and I call them all home.


Dec 4, 2023 Contextualized Networks Reveal Heterogeneous Transcriptomic Regulation in Tumors at Sample-Specific Resolution is online now on bioRxiv. We introduce a new machine learning paradigm for sample-specific inference of probabilistic graphical models, and show that sample-specific models of transcriptional regulation are highly accurate, interpretable, and prognostic across 25 tumor types and nearly 8000 patients. I’ll be sharing significant updates to the method and results at the NeurIPS 2023 GenBio Workshop.
Oct 17, 2023 The Contextualized Machine Learning White Paper is online on arXiv. We explore applications, algorithms, and extensions for contextualized models: models that understand heterogeneity in real data, adapt to new environments, and are explainable by design. This work outlines the vision for the Contextualized ML project, which I co-created with Ben Lengerich and which has transformed into a fast-growing research community.
Oct 11, 2023 Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning is online on arXiv. This work is co-first-authored with my extremely talented MS student Jannik Deuschel.
Apr 26, 2023 Contextualized ML is one year old today. Over the past year we’ve gained 50 GitHub stars and over 6000 downloads.
Nov 16, 2022 Check out my talk from CSHL Biological Data Science this past week: Contextualized Graphical Models Reveal Sample-Specific Transcriptional Networks for 7000 Tumors

selected publications

  1. arXiv
    Contextualized Machine Learning
    Lengerich, Benjamin, Ellington, Caleb N., Rubbi, Andrea and 2 more authors
  2. bioRxiv
    Contextualized Networks Reveal Heterogeneous Transcriptomic Regulation in Tumors at Sample-Specific Resolution
    Ellington, Caleb N., Lengerich, Benjamin J., Watkins, Thomas BK and 4 more authors