Codes for distributed storage systems: reliability, repair and privacy

Seminar
Thursday, March 25, 2010
11:30am
ENS 637

Abstract:

The proliferation of large-scale distributed storage systems has created new challenges in guaranteeing system reliability in the face of individual node failures. With increased deployment of low end component units to scale such systems, node failures are no longer rare events. When faced with such continual node failures, each time a node fails, the system needs to be repaired, with a new node replacing the failed one by downloading the needed information from the rest of the network. This helps restore the system to its targeted reliability level. The challenge is to do this efficiently under the network constraints of storage and bandwidth costs. Popular choices such as data duplication (repetition codes) and MDS (Reed-Solomon) codes turn out to be suboptimal. We will describe our recent work uncovering a fundamental tradeoff between the storage and repair bandwidth cost, and how network coding techniques can attain the entire spectrum of optimal tradeoff points. We will then motivate and address the problem of exact repair of the lost data. This turns out to be an open problem in general. As random network codes are inadequate for addressing this, we will describe new classes of deterministic exact repair codes that are simple to implement and are optimal for some important cases. We will finally address the problem of securing data privacy in the distributed storage network under repair dynamics, when a passive eavesdropper can hack a certain fraction of the network nodes, and can access the stored content and all the incoming messages into these hacked nodes. What is the distributed storage capacity under this model? We derive an upper bound, and show the optimality of a class of simple-to-implement exact repair codes for an important special case.

Biography:

Kannan Ramchandran (Ph.D.: Columbia University, 1993) is a Professor of Electrical Engineering and Computer Science at UC Berkeley, where he has been since 1999. He was on the faculty at UIUC from 1993 to 1999, and with AT&T Bell Labs from 1984 to 1990. Prof. Ramchandran is a Fellow of the IEEE. He has published extensively in his field, holds 10 patents, and has received several awards for his research including two Best Paper awards from the IEEE Signal Processing Society, an Okawa Foundation Prize and an Outstanding Teaching Award at Berkeley, and a Hank Magnuski Scholar award at Illinois. His current research interests include distributed signal processing and coding for networks, video communications and peer-to-peer content delivery, multi-user information theory, security, and multi-scale image processing and wavelets.

Speaker

Professor
UC Berkeley

Kannan Ramchandran (Ph.D.: Columbia University, 1993) is a Professor of Electrical Engineering and Computer Science at UC Berkeley, where he has been since 1999. He was on the faculty at UIUC from 1993 to 1999, and with AT&T Bell Labs from 1984 to 1990. Prof. Ramchandran is a Fellow of the IEEE. He has published extensively in his field, holds 10 patents, and has received several awards for his research including two Best Paper awards from the IEEE Signal Processing Society, an Okawa Foundation Prize and an Outstanding Teaching Award at Berkeley, and a Hank Magnuski Scholar award at Illinois. His current research interests include distributed signal processing and coding for networks, video communications and peer-to-peer content delivery, multi-user information theory, security, and multi-scale image processing and wavelets.