Genomics analysis platform

Picture of Sophia Petrillo's frowning face from The Golden Girls.
Picture it! South San Francisco, 2017
Dropbox has drag-and-drop interaction. Suddenly, it makes sense for science too.

Calico generated more data than we could manage by spreadsheet and bespoke analysis. This platform became our first home for routine, repeatable genomics analysis.

Data in the following screenshots are simulated.


Calico's computing team was small when I joined. Genomics was our most common data modality at the time, yet we had no standard process for handling routine genomics samples and returning usable data to scientists. This platform helped meet the needs of our biologists (get useful data fast), computational scientists (reduce manual analysis time), and genomics lab (routinize and track submissions).


The platform comprises several key parts:

  • Project sample & metadata management
  • Genomics submissions
  • Pipelines (basecalling, RNA-seq, and others)
  • Visual, easy-to-use differential expression analysis tools

My colleagues Kiran, Archa, and Ken helped build the underlying computing infrastructure and prototyped pipelines. Matt Sooknah built APIs, production analysis pipelines, and key infrastructure. I developed the whole client and user interface, sample & metadata management, and some minor features and APIs in Python.

Many of the user interface components and React infrastructure were extracted into a UI library or re-used in other apps I built.

Tens of thousands of samples and hundreds of analyses have been handled in point-and-click fashion with this platform, which is an essential part of Calico's scientific activity.