Princeton University

School of Engineering & Applied Science

Efficient Redundancy Techniques to Reduce Delay in Cloud Systems

Gauri Joshi, MIT
E-Quad, B205
Thursday, February 18, 2016 - 4:30pm

Ensuring fast and seamless service to users is critical for cloud services. However, guaranteeing fast response is challenging because of the random service delays that are common in today's data centers. In this talk, I describe my work on using redundancy to combat service variability in cloud computing and storage systems. For example, replicating a computing task at multiple servers and waiting for the earliest copy to finish can reduce service delay. However, replication can consume additional resources and can also delay other tasks waiting in the queue to access the servers. I present a framework to analyze such queues with redundancy and answer fundamental design questions such as: 1) how many replicas to launch, 2) which servers to assign the replicas to, and 3) when to issue and cancel the replicas. Our analysis reveals surprising regimes where replication reduces both delay and cost. More broadly, this work forges new connections between queuing and coding theory, uncovering many interesting future directions in cloud infrastructure, crowdsourcing and beyond.
Gauri Joshi is a Ph.D. candidate in Electrical Engineering and Computer Science at MIT, where she completed an S.M. in 2012. Her research interests include probabilistic modeling, coding theory and statistical inference. Before coming to MIT, she completed a B.Tech. and M.Tech. in Electrical Engineering at the Indian Institute of Technology (IIT) Bombay in 2010. She has held summer internships at Google, Bell Labs and Qualcomm. Gauri's awards and honors include the Best Thesis Prize in Computer Science at MIT (2012), the Institute Gold Medal of IIT Bombay (2010), the Claude Shannon Research Assistantship (2015-16), and the Schlumberger Faculty for the Future fellowship (2011-2015).