Researchers work to make real-time analytics scalable and affordable
Analyzing data in real time is often a struggle for researchers. To address that challenge, a team at the Advanced Digital Science Center, a University of Illinois research center in Singapore, has built a platform for analyzing text, video, audio and other types of data quickly and accurately.
“We hope that our analytics can help the decision makers to make decisions as quickly as possible when some special event happens,” said Zhenjie Zhang, an ADSC senior research scientist. “For example, when analyzing video, we want to detect events or faces as quickly as possible. Or, we may hope to monitor social media networks and identify the new trending topics. For sensor networks, such as the power grid, we want to find problems and notify engineers immediately, so they can stop the problem before it gets worse.”
The system relies on simple programming, so programmers in other fields can easily integrate their own code into the platform.
“There are many fancy computer vision algorithms out there,” Zhang said. “The problem is that the algorithms are too complex and cannot be run in real time. They can only analyze about one to two frames per second, but we can do real-time tracking.”
Another unique aspect of ADSC’s solution is its use of elasticity, which enables automatic scaling as the volume of data increases or decreases throughout the day. Zhang, along with ADSC Research Scientist Richard Ma and Illinois Computer Science Professor Emerita Marianne Winslett, has adopted an elastic cloud-based scheme, in which users pay for service only when they need it.
For example, the researchers ran a demonstration in which they took real-time transportation data produced by Shanghai’s metro, bus, and taxi systems and were able to predict what traffic congestion would be like one hour later. The capability could help commuters know how long their trips might take. The cost of using the tool would also rise and fall with the volume of data, increasing at times such as rush hour, when more machines are needed.
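The pay-as-you-go idea can be illustrated with a minimal sketch. This is not ADSC's implementation: the function names, the per-machine capacity, the price, and the traffic figures are all hypothetical, chosen only to show how an elastic scheme that sizes the cluster hour by hour compares with one that keeps enough machines running for the peak.

```python
import math

def machines_needed(events_per_second, capacity_per_machine, min_machines=1):
    """Smallest number of machines able to keep up with the incoming stream."""
    return max(min_machines, math.ceil(events_per_second / capacity_per_machine))

def elastic_cost(hourly_rates, capacity_per_machine, price_per_machine_hour):
    """Cost when the cluster is resized every hour to match the load."""
    return sum(
        machines_needed(rate, capacity_per_machine) * price_per_machine_hour
        for rate in hourly_rates
    )

def fixed_cost(hourly_rates, capacity_per_machine, price_per_machine_hour):
    """Cost when the cluster is sized once, for the busiest hour."""
    peak = machines_needed(max(hourly_rates), capacity_per_machine)
    return peak * price_per_machine_hour * len(hourly_rates)

# Hypothetical load: quiet hour, rush hour, quiet hour, rush hour (events/second).
rates = [200, 1500, 400, 1800]
capacity = 500   # assumed events/second one machine can process
price = 2        # assumed price per machine-hour, arbitrary units

plan = [machines_needed(r, capacity) for r in rates]
print(plan)                                  # machines used in each hour
print(elastic_cost(rates, capacity, price))  # elastic bill
print(fixed_cost(rates, capacity, price))    # peak-provisioned bill
```

With these made-up numbers, the elastic plan uses extra machines only during the two busy hours, so its bill is lower than that of a cluster provisioned for the peak all day. The sketch leaves out the harder part Zhang describes, which is moving in-progress work to new machines without disrupting the computation.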
“This elasticity is the most challenging aspect of the research because when a very fast data stream comes to your system, you need to move the results to a new machine and you want to try to make this migration as quick as possible so it doesn’t affect the current computations,” Zhang said. “We’ve developed our solution, but are always working to improve it.”
The team, composed of experts in data management, machine learning, and data mining, has almost finished its platform prototype and is working to add new features and additional applications on top of the platform. The researchers are also working closely with the open source community to add functionality to current platforms and address their shortcomings.
“As we’ve been developing our platform, we’ve talked with many computer vision researchers and there’s a huge gap between the research and practice,” Zhang said. “They usually focus on the accuracy of the model, but don’t care if they can deliver the prediction on time. We can help supplement the research to help them bridge that gap.”