I’m a research scientist in the demand forecasting group at Amazon in New York City.
My research interests (past and present) include hierarchical Bayesian modeling, MCMC methods,
data and model visualization, text mining, and other topics related to applied statistics.
This is my personal site, which is a mix of statistics research, side
projects (mostly sports-related) and other stuff.
- R package summarytrees:
This is an R package (co-written with Howard Karloff) to compute and visualize
maximum entropy summary trees. If you have a large, node-weighted
tree (thousands of nodes or more), our package aggregates the nodes in an
optimal way and interactively
visualizes the summary tree (typically hundreds of nodes) using d3.js.
 Here is a link to the
 Here are two vignettes describing a data analysis using the package:
 Here are a few examples of the interactive visualization:
 Here is the original
paper describing the work (from EuroVis 2013).
- R package LDAvis:
The LDAvis package (co-written with Carson Sievert) provides an interactive
visualization of the topics learned by a latent Dirichlet allocation (LDA)
model. It is designed to make topics easy to interpret by showing, for each
topic, a ranked list of the most relevant
terms, where relevance is a combination of a term's frequency within a topic
and its exclusivity to that topic.
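
To make the idea of relevance concrete, here is a minimal sketch in Python (not the package's R code or API). It assumes relevance is a weighted sum of the log probability of a term within a topic and the log of its lift (the within-topic probability divided by the term's overall probability), with a weight `lam` between 0 and 1. The function names, the toy probabilities, and the default `lam=0.6` are all illustrative assumptions.

```python
import math


def relevance(p_term_given_topic, p_term, lam=0.6):
    """Weighted sum of log within-topic probability and log lift.

    lam = 1 ranks purely by frequency within the topic;
    lam = 0 ranks purely by exclusivity (lift) to the topic.
    """
    log_prob = math.log(p_term_given_topic)
    log_lift = math.log(p_term_given_topic / p_term)
    return lam * log_prob + (1 - lam) * log_lift


def rank_terms(topic_probs, marginal_probs, lam=0.6):
    """Return the terms of one topic, sorted from most to least relevant."""
    scores = {
        term: relevance(p, marginal_probs[term], lam)
        for term, p in topic_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy numbers (made up): "the" is frequent everywhere, while
# "pitcher" and "inning" are concentrated in this one topic.
topic_probs = {"the": 0.10, "pitcher": 0.05, "inning": 0.04}
marginal_probs = {"the": 0.10, "pitcher": 0.002, "inning": 0.001}

by_frequency = rank_terms(topic_probs, marginal_probs, lam=1.0)
by_exclusivity = rank_terms(topic_probs, marginal_probs, lam=0.0)
```

With these toy numbers, ranking by frequency alone puts the generic term "the" first, while ranking by exclusivity alone puts it last; intermediate values of `lam` trade off the two.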