A group of School of Engineering and Applied Science faculty and students is working behind the scenes to lead the way in information and data management on the internet.

Penn’s Database Group — which was created in the early 1980s by former computer and information science professor Peter Buneman and current CIS professors Susan Davidson and Val Tannen — aims to effectively tackle the seemingly simple yet challenging task of storing and manipulating vast quantities of data.

Despite its general lack of visibility around campus, the group has had far-reaching impact in a variety of fields ranging from medicine to internet and web technologies.

For Davidson, who is also chair of the CIS Department, the Database Group is largely responsible for making Penn one of today’s world leaders in database theory and its associated programming languages.

In the 1990s, the group became famous not only for database theory but also for its work in bioinformatics, a branch of biology that deals with the storage and analysis of biological data.

For example, “the Kleisli integration system developed by us allowed biologists and medical researchers to easily collaborate with each other over projects like the human genome, which required effective control over immense unstructured data,” Davidson said.

She added that this interdisciplinary collaboration between the group and faculty in genetics and biology ultimately led to the formation of the Penn Center for Bioinformatics.

The database research in the field of bioinformatics was of great value for later internet technologies.

“Many of the problems tackled there 10 years ago, like storing vast sequences of DNA and chromosomal data, are now relevant in the current internet scenario, where vast quantities of similar unstructured data are created dynamically on the fly, for example, like generating tweets and uploading photos on social networks,” Davidson said.

Over the years, the group’s work has evolved and diversified to encompass multiple disciplines, systems and applications. Today, it tackles problems with a multidisciplinary approach, according to assistant CIS professor Boon Thau Loo, who also heads the NetDB@Penn group. As an example of this collaboration, Loo cited a network routing project he is working on, which aims to make telecommunication networks and tools easier to create and maintain by combining database and network technology.

Today, the group is funded by a variety of governmental entities and corporate organizations, including the Department of Energy, Defense Advanced Research Projects Agency, National Institutes of Health, National Science Foundation and GlaxoSmithKline.

It also boasts an impressive list of alumni, which includes professors and heads of computer science departments at universities along with researchers in leading technology companies like Google and IBM.

Fourth-year engineering doctoral student Xiaocheng Huang, who is at Penn on a two-year research tenure, said she has been impressed by the group’s work and has found herself at home in the Database Group given her similar research interests.

Davidson is optimistic about the group’s future, which she predicts will be based on big data platforms and cloud computing, in which data is stored in central databases away from users’ systems.

Loo agreed.

“The large explosion of data today via social networks and scientific disciplines like astronomy and medicine needs further research into effectively managing such dynamic and dispersed data,” he said, adding that Penn is just the place to make this happen.

