Newsroom: Highlights


08.22.18 - Focus on Services for Data Scientists

Prof. Chen Li's SURF-IoT presentation,
"Data Analytics as a Service for Data Scientists"

 

SURF-IoT mentor and UCI Prof. Chen Li was the featured speaker at the SURF-IoT Summer Seminar Series, Wednesday, Aug. 22, at CALIT2.

His presentation, “Data Analytics as a Service for Data Scientists,” described the challenges that data scientists and domain experts often face when dealing with large amounts of data, especially the scale of the data and their limited IT knowledge and infrastructure maintenance skills.

Li discussed several software solutions being developed to offer data analytics as a service to users. They include Apache AsterixDB, an open source parallel database; Cloudberry, a middleware system that supports data visualization; and Texera, a system that enables browser-based text analytics using declarative workflows. These solutions are integrated to support data ingestion, storage, indexing, querying, visualization, and analytics. Li shared his experiences in using them to manage large-scale social media data (e.g., billions of tweets amounting to terabytes) as a service for researchers in disciplines such as social science and public health at several schools and universities.

Li demonstrated how an individual would use Texera, a current SURF-IoT faculty-mentored research project designed to allow users with little IT experience to easily analyze social media.

He was motivated to develop the system because many text analysts spend significant effort on low-level computation such as keyword search, regular expressions, dictionary-based matching, and natural language processing. They also face long running times, a lack of debugging tools, and the need to re-run a program after making minor changes. At the same time, cloud-based services and technologies have emerged and advanced significantly in the past decade, he said.
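To make the "low-level computation" concrete, here is a minimal Python sketch of three of the operations named above: keyword search, regular expressions, and dictionary-based matching. It is only an illustration of the kind of hand-written code an analyst might otherwise maintain, not Texera's implementation; the sample tweets, function names, and term dictionary are all hypothetical.

```python
import re

# Hypothetical sample data; not taken from the talk.
tweets = [
    "Flu season is here, get your flu shot today!",
    "Traffic on the 405 is terrible again.",
    "Call 949-555-0100 for vaccine clinic hours.",
]

def keyword_search(texts, keyword):
    """Return the texts that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [t for t in texts if needle in t.lower()]

def regex_match(texts, pattern):
    """Return every substring that matches the regular expression."""
    compiled = re.compile(pattern)
    return [m.group(0) for t in texts for m in compiled.finditer(t)]

def dictionary_match(texts, terms):
    """Map each text index to the dictionary terms it contains as whole words."""
    hits = {}
    for i, t in enumerate(texts):
        words = set(re.findall(r"[a-z0-9]+", t.lower()))
        found = sorted(words & terms)
        if found:
            hits[i] = found
    return hits

flu_tweets = keyword_search(tweets, "flu")
phone_numbers = regex_match(tweets, r"\d{3}-\d{3}-\d{4}")
health_hits = dictionary_match(tweets, {"flu", "vaccine", "clinic"})
```

Even in this small form, each change to a keyword or pattern means editing and re-running the script, which is the kind of overhead Li described.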

Li explained that the goals of the Texera project are to:
    •    Provide text analytics as cloud services so users do not need to download software or apply periodic updates and patches; sharing also becomes much easier
    •    Provide a browser-based GUI for developers to form a workflow plan declaratively without writing code
    •    Allow non-IT people to do text analytics
    •    Increase productivity of text analytics
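
As a rough illustration of what forming a workflow "declaratively" can mean, the Python sketch below describes an analysis as a list of operator settings and leaves execution to a small generic runner. The operator names, the plan format, and the sample tweets are assumptions made for this example; they do not reflect Texera's actual workflow format or GUI.

```python
import re

# Hypothetical declarative plan: each step names an operator and its settings.
workflow = [
    {"operator": "keyword_filter", "keyword": "flu"},
    {"operator": "regex_extract", "pattern": r"#\w+"},
]

def keyword_filter(records, keyword):
    """Keep only records that contain the keyword, ignoring case."""
    return [r for r in records if keyword.lower() in r.lower()]

def regex_extract(records, pattern):
    """Replace the records with every substring that matches the pattern."""
    return [m for r in records for m in re.findall(pattern, r)]

# Registry that maps operator names in the plan to their implementations.
OPERATORS = {"keyword_filter": keyword_filter, "regex_extract": regex_extract}

def run_workflow(plan, records):
    """Apply each step of the plan in order, feeding its output to the next step."""
    for step in plan:
        params = {k: v for k, v in step.items() if k != "operator"}
        records = OPERATORS[step["operator"]](records, **params)
    return records

tweets = [
    "Got my #flu shot #health",
    "Nice weather #sunny",
    "Flu cases rising #publichealth",
]
hashtags = run_workflow(workflow, tweets)
```

In this sketch, changing the analysis means editing the plan rather than the operator code, which is the idea a graphical workflow editor exposes without requiring any code at all.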

SURF-IoT Fellows attend a series of seminars during the summer program to gain a deeper understanding of ongoing research projects and to enhance their knowledge of telecommunications and information technology systems and applications. Students will present their research findings Aug. 30 at the SURF-IoT Symposium at CALIT2.

The program, co-sponsored by UCI’s Undergraduate Research Opportunities Program (UROP) and CALIT2, provides students with a unique experience. Each student has the guidance of a UCI faculty mentor, along with the opportunity to gain experience and advanced training in state-of-the-art facilities and techniques.

To learn more about SURF-IoT, visit here.

-- Sharon Henry