Viswanath, Milenkovic win IEEE best paper awards
CSL Research Professor Pramod Viswanath and CSL Associate Research Professor Olgica Milenkovic were recipients of the 2010 Information Theory Society Student Paper Award at the IEEE International Symposium on Information Theory, the largest symposium in the field. Out of an estimated 500 papers submitted and eligible for the award, only five were selected, on the basis of both the technical content of the paper and the quality of its presentation.
“It is certainly an honor that both Illinois and the Coordinated Science Lab were represented among two of the five winning papers,” Milenkovic said.
Viswanath co-authored “Universal Hypothesis Testing in the Learning-Limited Regime” with Benjamin Kelly (lead author), a Cornell graduate student; Thitidej Tularak, a Cornell alumnus; and Aaron Wagner, a Cornell assistant professor and former CSL postdoctoral student.
The group’s research developed an algorithm that would enable a computer to make a good choice when faced with a classification problem. For example, when an e-mail arrives, the computer could be asked to decide whether to label the message as belonging to a specific category (work, home, etc). The algorithm could also be used to scan e-mails and automatically label them based on the content.
“We didn’t want the algorithm to look for a pattern or keywords to each category and learn it (this is the state of the art) because a simple learnable pattern may not exist,” Viswanath said. “Instead, we proposed an algorithm that can still do very good classification but never really learn the underlying pattern or set of keywords that characterizes each category.”
The research is presently more mathematical and less empirical, but the researchers hope to get their algorithm into practice soon.
“We hope to relax some of the assumptions and see whether we can build algorithms which work well on real world data. One way of measuring the performance will be to compare our algorithm with those currently in use,” Kelly said.
Milenkovic co-authored “On Reconstructing a Sequence from its Subsequence Compositions” with UC-San Diego Graduate Students Jayadev Acharya (lead author), Hirakendu Das and Shenjun Pan; and UC-San Diego Professor and former CSL Visiting Professor Alon Orlitsky.
This research group attempted to quantify the performance limits of protein sequence reconstruction based on mass-charge ratios of subsequences of proteins. Milenkovic explained that when identifying proteins, traditionally biologists would use tandem mass spectrometers, devices consisting of two mass spectrometers in series connected by a chamber, which break long peptide chains and into shorter pieces.
“The task at hand is to identify the protein based on the masses of the pieces,” Milenkovic said. “A large number of copies of the same protein is ‘cut’, allowing the fragments to overlap, which is of crucial importance for identifying the protein as a whole.”
The group developed a new mathematical framework for studying when proteins can be uniquely identified based solely on the protein’s subsequence masses. The approach used by the group is based on representing the information about protein subsequence masses/composition in the form of polynomials. In this case, the problem becomes a special instance of the well known turnpike problem. The turnpike problem deals with reconstructing the exact locations of proteins using only the intra-distances between them.
“We found that if you have only the masses (composition) of the pieces, you can still determine what the protein is, provided that the length of the sequence has a special form,” Milenkovic said. “It is very interesting to point out that unique reconstruction is dependent on the length of the piece. In our paper we describe how we found numbers that tell which lengths are uniquely reconstructable based on the masses of the subsequences.”