Oh’s crowdsourcing work improves efficiency of new market

2/10/2014 Katie Carr, CSL

One might think that only someone at NASA would be working to classify and identify hundreds of thousands of galaxies and, in the process, help scientists understand how those galaxies evolve. But now, with the help of a process called crowdsourcing, which Merriam-Webster defines as “the practice of obtaining needed services, ideas or content by soliciting contributions from a large group of people,” people across the globe can take part in work well outside their usual expertise. Crowdsourcing platforms such as Galaxy Zoo, Foldit and Amazon Mechanical Turk break large-scale problems into small tasks that can be distributed electronically to on-demand human contributors.

New CSL professor Sewoong Oh is working in this area to design reliable and cost-efficient crowdsourcing systems using tools from applied probability, graph theory and related areas of mathematics.

CSL professor Sewoong Oh
Oh graduated from Stanford University with a Ph.D. in Electrical Engineering in 2011 and worked as a postdoctoral researcher at MIT before becoming an assistant professor of industrial and enterprise systems engineering in 2012. He joined CSL in August 2013 and is continuing his work on crowdsourcing, as well as ranking and recommendation systems.

Crowdsourcing breaks large tasks into small ones that can be distributed to a crowd of people for jobs such as image labeling, character recognition, translation and transcription, tasks that computers aren't yet good at. While this is an efficient way to solve large-scale problems, relying on large numbers of people brings its own difficulties.

“People are noisy because they are humans,” Oh said. “They might try to scam you or are just lazy. We need algorithms to detect who is lazy or spamming to be able to use crowdsourcing more efficiently. People are also participating in this because it’s fun or to be part of something bigger and to help people. We’re also trying to figure out what motivates people, whether it’s monetary or being part of a grand challenge.”
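One simple way to detect unreliable contributors, sketched here purely for illustration (this is not Oh's published algorithm), is to estimate answers and worker reliability together: take a weighted majority vote over redundant answers, then re-weight each worker by how often they agree with the consensus, so spammers and lazy workers lose influence.

```python
# Illustrative sketch only: iteratively estimate worker reliability from
# redundant binary labels. Answers are +1/-1; workers who agree with the
# weighted consensus gain weight, so spammers are down-weighted.

def aggregate(answers, n_iters=10):
    """answers: dict mapping (worker, task) -> +1 or -1."""
    workers = {w for w, _ in answers}
    tasks = {t for _, t in answers}
    weight = {w: 1.0 for w in workers}  # start by trusting everyone equally
    estimate = {}
    for _ in range(n_iters):
        # 1) Estimate each task's label by weighted majority vote.
        for t in tasks:
            s = sum(weight[w] * a for (w, t2), a in answers.items() if t2 == t)
            estimate[t] = 1 if s >= 0 else -1
        # 2) Re-score each worker by agreement with the current estimates.
        for w in workers:
            votes = [(a, estimate[t]) for (w2, t), a in answers.items() if w2 == w]
            agree = sum(1 for a, e in votes if a == e) / len(votes)
            weight[w] = max(2 * agree - 1, 0.0)  # 50% accuracy (random) -> weight 0
    return estimate, weight
```

With two reliable workers and one who always answers the opposite, the adversary's weight drops to zero after a single round and the consensus matches the reliable answers.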

According to Oh, crowdsourcing in the real world is a fast-growing industry, with a lot of room for improving how information is aggregated, jobs are allocated and incentives are determined.

“There are many aspects to crowdsourcing, such as designing better incentives or better algorithms, that leave a lot of room to improve how it’s done in practice,” Oh said. “The application area is in its infancy and new problems are emerging with new intellectual challenges for researchers.”

In addition to his crowdsourcing work, Oh's research at CSL relates to ranking and recommendation systems. He is most interested in the data generated by people's everyday activities, and in extracting meaningful information from it to improve their lives, such as a search engine ranking results based on what it believes it knows about a particular person.

“If you Google something, they will provide you with a list and try to rank it in a way that’s sorted best to suit your goal,” Oh said.

By asking simple comparison questions between pairs of items, Oh is attempting to make such rankings more accurately reflect people's collective preferences. He is working to incorporate this into recommendation systems that ask users not to rate items, but to rank them.

Based on this idea, he's looking into new recommendation systems for companies such as Netflix. When Netflix recommends movies or shows, it asks users to rate titles with stars according to how much they liked them. Oh is looking for more efficient and accurate ways to recommend products and services to users.

“When using people’s input in this way, you want to ask people the simplest questions possible,” Oh said. “What people typically do is rate it using stars, but that’s not really consistent and the scale varies. My three stars means something different from your three stars.”

Pairwise comparisons, in which each of many questions compares just two items, yield results that are much more reliable and less likely to change over time. Oh proposes to combine what he learns from ranking and choice models, which he believes will lead to more efficient and reliable recommendations.
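A standard way to turn pairwise comparisons into a global ranking (not necessarily the exact model Oh uses) is the Bradley-Terry model: each item gets a score, item i is assumed to beat item j with probability proportional to i's score, and the scores are fitted from the observed comparisons.

```python
# Minimal Bradley-Terry fit from pairwise comparisons, shown for
# illustration. Item i beats j with assumed probability w[i]/(w[i]+w[j]);
# scores are fitted by the classic minorization-maximization updates.

def bradley_terry(comparisons, n_items, n_iters=100):
    """comparisons: list of (winner, loser) index pairs."""
    wins = [0] * n_items
    pairs = [[0] * n_items for _ in range(n_items)]  # times i and j were compared
    for i, j in comparisons:
        wins[i] += 1
        pairs[i][j] += 1
        pairs[j][i] += 1
    w = [1.0] * n_items  # initial scores
    for _ in range(n_iters):
        w = [
            wins[i] / sum(pairs[i][j] / (w[i] + w[j])
                          for j in range(n_items) if pairs[i][j])
            for i in range(n_items)
        ]
        total = sum(w)
        w = [x / total for x in w]  # normalize: scores are defined up to scale
    return w  # higher score means higher rank
```

Given noisy comparisons in which item 0 usually beats item 1 and item 1 usually beats item 2, the fitted scores recover the ordering 0 > 1 > 2 even though no single voter ranked all three items.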

“When we interact, we think we can ask people complex questions and expect them to be very rational and very smart in giving you complex answers and then learn a lot from those answers,” Oh said. “But in truth, people are not good at assessing or judging themselves and other things. We believe we can learn more about people reliably by actually having them do simple things and asking them simple questions, but asking them many questions.”

Beyond retail shopping experiences and movie reviews, Oh believes this research can be applied in many areas. For example, a city planning new train stations could learn where people are willing to board and get off while adjusting its traffic systems, or a government could gather opinions on policies it may implement.

“If you just ask people those questions, you don’t get the information quickly,” Oh said. “People don’t always know why they like or dislike things. They may know if they like it or not, but not why or how much. I’m working on solving those problems.”


This story was published February 10, 2014.