My research focuses primarily on algorithms for "big data": specifically, data that grows exponentially or exists in an extremely high-dimensional vector space. Natural application domains include astronomy and biology (particularly genomics and metagenomics), and I am especially interested in computational biology.
My approach to algorithms for big data focuses on the "manifold hypothesis," which posits that many examples of real-world data, while embedded in very high-dimensional spaces, are nonetheless constrained to much lower-dimensional manifolds. By constraining search and other computations to these manifolds, we can achieve time complexity that grows not with the size of a dataset, but rather with geometric and topological properties of the data.
This general approach guides my research group, Algorithms for Big Data at URI.
I am also involved in an ongoing collaboration related to coral. This collaboration stems from an NSF Ideas Lab project and has led to a number of bioinformatics publications. With collaborators at Tufts University, I lead the MEDFORD (Metadata Format for Open Reef Data) project, which is developing a simple, easy-to-use markup language for describing research data.
Please note that I do not currently have any funded research assistant (RA) positions open. When this changes, I will update this page. I am available to supervise computer science capstone projects (CSC 499) and graduate work for which funding has already been arranged.
Here is a list of recent or impactful papers, with links to either open-access or preprint versions: