Current Projects

RECENTLY GRADUATED!

The NIMBLE Environment for Statistical Computing

Many challenges in data science benefit from increasingly sophisticated statistical models. NIMBLE is growing in popularity but needs open-source software leadership to drive its adaptation to parallel and distributed infrastructures as well as in-storage computing environments.

Start Date: Summer ’17

ON-GOING

Skyhook: Elastic Databases for the Cloud

The cloud business model requires flexible resource usage, but traditional relational databases strongly couple data to physical resources, making it difficult to add and remove database nodes. The Skyhook project extends PostgreSQL with a data/resource decoupling that allows database clusters to grow and shrink dynamically and enables the query optimizer to leverage this functionality.

Start Date: Late Fall ’16

STARTING IN EARLY 2019

Tracery2 and Chancery

Tracery is a generative-text library and language implemented in JavaScript. Its goal was to enable casual users (novice coders, but also those who do not ‘code’) to write simple JSON files that encode grammar rules, which in turn produce complex, recursively expanded text. It was initially created as a class project at UCSC, then open-sourced. Tracery has been one of the biggest success stories in using open source software to support artists and poets. After the initial version was released in 2014, a British artist made a website, CheapBotsDoneQuick, to host bots written in the language. CheapBotsDoneQuick in turn created an artbot boom, with more than ten thousand bots currently hosted.
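To illustrate the idea of grammar rules that expand recursively into text, here is a minimal sketch in Python. It is not the Tracery library or its API; it only mimics the general approach, assuming a dictionary shaped like a small JSON grammar file in which "#name#" marks a symbol to expand and "origin" is the starting symbol. The grammar contents and the function names expand and generate are hypothetical, chosen for illustration.

```python
import random
import re

# Hypothetical grammar in the spirit of a small JSON grammar file: each key
# is a symbol and each value is a list of alternative expansions.
# "#name#" marks a symbol that is expanded recursively.
GRAMMAR = {
    "origin": ["The #adjective# #animal# #verb#."],
    "adjective": ["quiet", "restless"],
    "animal": ["heron", "fox"],
    "verb": ["waits", "#adverb# wanders"],
    "adverb": ["slowly", "often"],
}

# Matches a symbol reference such as "#animal#".
SYMBOL = re.compile(r"#(\w+)#")


def expand(text, grammar, rng):
    """Replace every #symbol# in text with a randomly chosen rule,
    expanding any symbols inside that rule as well."""

    def substitute(match):
        rule = rng.choice(grammar[match.group(1)])
        return expand(rule, grammar, rng)

    return SYMBOL.sub(substitute, text)


def generate(grammar, seed=None):
    """Produce one text by expanding the assumed start symbol "origin"."""
    rng = random.Random(seed)
    return expand("#origin#", grammar, rng)


if __name__ == "__main__":
    print(generate(GRAMMAR, seed=1))
```

Because a rule such as "#adverb# wanders" can itself contain symbols, a handful of short rules can yield many distinct sentences, which is what makes this style of grammar approachable for casual authors.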

Black Swan: The Popper Reproducibility Platform

Synopsis: Reproducibility is the cornerstone of the scientific method. Yet, in computational and data science domains, a gap exists between current practices and the ideal of having every new scientific discovery be easily reproducible. Advances in computer science (CS) and software engineering make their way into these domains slowly and painfully, even, paradoxically, into CS research itself. Popper (http://falsifiable.us) is an experimentation protocol and CLI tool for implementing scientific exploration pipelines following a DevOps (https://en.wikipedia.org/wiki/Devops) approach. The goal of Popper is to bring the methods and tools used for the agile delivery of software (DevOps) to scientists and industry researchers.