Research

LLM Research

I am currently working on LLM pretraining and finetuning on the MosaicML team at Databricks.

As a Research Scientist at MosaicML, I was part of the team that pretrained and finetuned the open-source large language models MPT-7B and MPT-30B.

I recently had the privilege of working with Aditi Jha on finetuning LLMs: “LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms” (NeurIPS 2023 Workshop). In this paper we explored how many instruction tuning samples are necessary to achieve good performance on both traditional NLP benchmarks and model-based evaluation paradigms.
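
The core experiment is easy to sketch: draw nested subsets of an instruction-tuning dataset at several sizes, finetune on each, and evaluate under both paradigms. Below is a minimal illustration of that setup; the dataset and subset sizes are placeholders, not the ones used in the paper.

```python
from datasets import load_dataset

# Draw nested random subsets of an instruction-tuning dataset.
# The dataset and subset sizes here are illustrative placeholders.
ds = load_dataset("tatsu-lab/alpaca", split="train")

for n in (1_000, 5_000, 25_000):
    subset = ds.shuffle(seed=42).select(range(n))  # same seed, so subsets are nested
    subset.save_to_disk(f"instruction_subset_{n}")
    # ...then finetune a base model on each subset and evaluate it on both
    # traditional NLP benchmarks and model-based (LLM-as-a-judge) evals
```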

Back when the MosaicML NLP team consisted of only 9 researchers, we worked on optimizing BERT pretraining. Here is our detailed blog post and report: “MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining” (NeurIPS 2023). Many of the insights from this work went into building MPT-7B and MPT-30B.

As an ML Research Intern at MosaicML, I worked on using cyclic learning rate schedules to efficiently estimate accuracy vs. training time tradeoffs (Pareto frontiers). Our work is summarized in this blogpost “Efficiently Estimating Pareto Frontiers with Cyclic Learning Rate Schedules” and this workshop paper “Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates”.
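
The trick is that a cyclic schedule anneals the learning rate to a small value at the end of every cycle, so a single training run yields a sequence of checkpoints that each approximate a fully decayed run of that length. Here is a minimal sketch of such a schedule; the cycle length and learning rate bounds are illustrative, not values from the paper.

```python
import math

def cyclic_lr(step, cycle_len, lr_max=1e-3, lr_min=1e-5):
    """Cosine schedule within each cycle: the learning rate restarts at
    lr_max and decays toward lr_min by the end of every cycle."""
    t = (step % cycle_len) / cycle_len  # position within the current cycle, in [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

# A single run with 10k-step cycles produces checkpoints at 10k, 20k, 30k, ...
# steps, giving one point each on the accuracy vs. training time frontier.
for step in (0, 5_000, 9_999, 10_000):
    print(step, f"{cyclic_lr(step, cycle_len=10_000):.2e}")
```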

This talk by Jonathan Frankle gives an overview of some of MosaicML’s early research.

Brain Machine Interfaces and Biological Learning Rules

During my PhD I worked on biologically plausible learning in recurrent neural networks (RNNs), reinforcement learning (RL), and motor control with James M. Murray.

How does neural activity change during motor learning, and what does this reveal about the underlying learning mechanisms? In our NeurIPS 2022 paper “Distinguishing Learning Rules with Brain Machine Interfaces”, we derive a metric for distinguishing between learning rules by observing changes in neural activity during learning, given that the mapping from brain to behavior is known to the experimenter. Because brain-machine interface (BMI) experiments give the experimenter perfect knowledge of this mapping, we model a cursor-control BMI task using recurrent neural networks and show that learning rules can be distinguished in simulated experiments using only observations that a neuroscience experimenter would plausibly have access to.
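
As a toy illustration of the general logic (this is not the metric derived in the paper; the linear readout, noise scales, and rule implementations below are simplifying assumptions): candidate learning rules predict differently structured updates to a known readout, and one can ask which prediction the observed changes align with.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_out = 50, 2
W = rng.normal(scale=0.1, size=(n_out, n_neurons))  # readout; known in a BMI experiment
y_target = np.array([1.0, 0.0])                     # desired cursor velocity

def alignment(a, b):
    """Cosine similarity between two flattened weight updates."""
    a, b = a.ravel(), b.ravel()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores_sup, scores_rl = [], []
for _ in range(1000):
    x = rng.normal(size=n_neurons)            # neural activity on one trial
    err = y_target - W @ x
    dW_supervised = np.outer(err, x)          # candidate 1: gradient-following update
    noise = 0.1 * rng.normal(size=n_out)      # candidate 2: node perturbation, a noisy
    reward = -np.linalg.norm(err - noise)     # scalar-reward-weighted update
    dW_reinforcement = reward * np.outer(noise, x)
    # simulate the "observed" update with the supervised rule plus measurement noise
    dW_observed = dW_supervised + 0.1 * rng.normal(size=W.shape)
    scores_sup.append(alignment(dW_observed, dW_supervised))
    scores_rl.append(alignment(dW_observed, dW_reinforcement))

print(np.mean(scores_sup), np.mean(scores_rl))  # high vs. near zero on average
```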

The Fly Brain

For a large part of my PhD, I worked on a project with Rudy Behnia, Larry Abbott and Jessica Kohn on the neural computation of motion in the Drosophila visual system. Our paper “Flexible filtering by neural inputs supports motion computation across states and stimuli” was published in Current Biology. Here is a Current Biology “Dispatch” that summarizes this work: “Motion vision: Pinning down motion computation in an ever-changing circuit”.
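
For background, the canonical starting point for this computation is the Hassenstein-Reichardt correlator, which detects motion by correlating a delayed copy of one input with its neighbor and subtracting the mirror-image arm. The model in our paper is anatomically constrained and state-dependent, but a textbook correlator illustrates the basic operation; the delay and stimulus below are arbitrary illustrative choices.

```python
import numpy as np

def hrc(s1, s2, delay=5):
    """Hassenstein-Reichardt correlator: correlate a delayed copy of each
    input with the other input, then subtract the two mirror-image arms."""
    return np.mean(np.roll(s1, delay) * s2 - s1 * np.roll(s2, delay))

t = np.arange(1000)
s = np.sin(2 * np.pi * t / 50)        # sinusoidal grating sampled at one point

print(hrc(s, np.roll(s, 5)))          # stimulus arrives at input 2 later: positive
print(hrc(s, np.roll(s, -5)))         # opposite direction of motion: negative
```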

Our work is also summarized in this research talk.


Some of my pre-PhD work in the Hillman Lab investigated patterns of neural activation and blood flow (i.e. neurovascular coupling) in the rodent cortex.

In a previous life, I wrote a review-style master’s thesis on superconducting qubits for quantum computing.


Blog Posts

Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs

MPT-30B: Raising the bar for open-source foundation models

LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms

Publications

  1. “LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms” Aditi Jha, Sam Havens, Jeremy Dohmann, Alex Trott, Jacob Portes (NeurIPS 2023 Workshop) [preprint] [website] [code]

  2. “MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining” Jacob Portes, Alexander R. Trott, Sam Havens, Daniel King, Abhinav Venigalla, Moin Nadeem, Nikhil Sardana, Daya Khudia, Jonathan Frankle (NeurIPS 2023)

  3. “Distinguishing Learning Rules with Brain Machine Interfaces” Jacob P. Portes, Christian Schmid, James M. Murray (NeurIPS 2022) [preprint] [code]

  4. “Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates” Jacob Portes, Davis Blalock, Cory Stephenson, Jonathan Frankle (“Has it Trained Yet?” NeurIPS 2022 Workshop) [paper] [preprint] [blogpost]

  5. “Flexible Computation in Neural Circuits” Jacob P. Portes (PhD Thesis, 2022) [dissertation]

  6. “Flexible filtering by neural inputs supports motion computation across states and stimuli” Jessica R. Kohn*, Jacob P. Portes*, Matthias P. Christenson, L.F. Abbott, Rudy Behnia (Current Biology, 2021) (*equal contribution) [article] [preprint] [code]

  7. “Resting-state hemodynamics are spatiotemporally coupled to synchronized and symmetric neural activity in excitatory neurons” Ying Ma, Mohammed A. Shaik, Mariel G. Kozberg, Sharon H. Kim, Jacob P. Portes, Dmitriy Timmerman, Elizabeth M.C. Hillman (PNAS, 2016) [article]

Posters

  1. “Distinguishing Learning Rules with Brain Machine Interfaces” NeurIPS (November 2022) Jacob P. Portes, Christian Schmid, James M. Murray

  2. “Flexible filtering by neural inputs supports motion computation across stimuli and states” CSHL Drosophila Neurobiology (October 2021) Jacob Portes*, Jessica Kohn*, Matthias Christenson, Larry Abbott, Rudy Behnia (*equal contribution)

  3. “Neural signatures of supervised learning vs. reinforcement learning in brain-machine interface tasks” Cosyne (February 2021) Jacob Portes, James Murray

  4. “An Anatomically and Functionally Constrained Model of Direction Selectivity in Drosophila” Cosyne (February 2020) Jacob Portes*, Jessica Kohn*, Larry Abbott, Rudy Behnia (*equal contribution)

  5. “The effect of locomotion-induced octopamine release on motion detection circuits in Drosophila” Society for Neuroscience (November 2017) Jessica Kohn, Jacob Portes, Rudy Behnia

  6. “Historical analysis of the role of theory in the development of neuroscience” Society for Neuroscience Theme J - History Poster (November 2016) J. Portes

  7. “The effects of endothelial dysfunction on neuronal activity, hemodynamics and neurovascular coupling” Society for Neuroscience (November 2015) Mohammed A. Shaik, Jacob P. Portes, Sharon H. Kim, Elizabeth M. C. Hillman

  8. “A new nonlinear model of the fMRI BOLD response” Society for Neuroscience (November 2014) Jacob P. Portes, Cyrus B. Amoozegar, Brenda R. Chen, Mariel G. Kozberg, Mohammed A. Shaik, Elizabeth M.C. Hillman

  9. “A new nonlinear model of the fMRI BOLD response” General Electric (GE) Student Research Summit Talk and Poster (August 2014) Jacob P. Portes, Cyrus B. Amoozegar, Brenda R. Chen, Mariel G. Kozberg, Mohammed A. Shaik, Elizabeth M.C. Hillman

Master’s Thesis

  1. “Decoherence, Superconducting Qubits, and the Possibility of Scalable Quantum Computing” supervised by Allan Blaer in the Columbia Physics Department, with the kind encouragement of Anargyros Papageorgiou in the Computer Science Department
    • Abstract: Is it possible to implement a fully controllable, unambiguously quantum computer? While most in the field believe that the answer is in the affirmative, uncertainty and skepticism still exist among academics and industry professionals. In particular, decoherence is often spoken of as an insurmountable challenge. This thesis argues that there are no fundamental mathematical or physical properties that would preclude the possibility of implementing a fully controllable quantum computer using superconducting qubits. The proof lies in key results from the past 30 years in math, physics and computer science; this thesis is a sketch of these results. It begins with the well-known theoretical results that have motivated the field, namely quantum algorithmic speedup and efficient error correction, and continues with an overview of the well-developed theory of decoherence, arguing that decoherence has been and can still be significantly reduced. These theoretical results are related to superconducting qubits throughout. The thesis concludes with a summary of recent experimental progress with superconducting qubit circuits.