Privacy, data and universe-wide AI: Dr Soumen Chakrabarti on the state of the internet

Dr Soumen Chakrabarti is one of India’s foremost AI researchers, working in the field of knowledge graphs and search. His work has also contributed greatly to the development of the Open Web. Dr Chakrabarti recently addressed Flipkart engineers in a Blue Sky talk about his favorite area of research. We caught up with him to pick his brains about some of the most pressing issues surrounding the state of the information highway.

Artificial Intelligence

When Soumen Chakrabarti started his journey as an academic at IIT Kharagpur in 1987, the internet was still in its infancy. Since then, it has come a long way, revolutionizing the way we consume information and think about it. Once a barren terrain populated by sparse HTML pages, it has grown from a platform where only computer geeks interacted with each other to the largest network of people in the history of human civilization, with topics like privacy and artificial intelligence taking center stage. By 2025, it is predicted that more than 463 billion GB of new data will be created on the internet every day. Simply put, there is more information available to humankind than ever before in history.

At the core of the internet user experience are search engines. Search engines like Google are often the easiest way for people to discover the internet, and search queries are one of the chief areas of Dr Chakrabarti’s work. Discoverability on search engines is one of the most important parameters for the survival of any website. This is even more true for an online marketplace like Flipkart, a fact not lost on Flipsters, as evidenced by the massive turnout at his talk.

In fact, one of the papers that Dr Chakrabarti co-authored in 2002 is a highly cited work in the field. It won him the ICDE 2012 influential paper award. In 2014, he won the Shanti Swarup Bhatnagar Prize for Science and Technology, widely regarded as the country’s most prestigious science award. He currently works as a faculty member in the department of Computer Science at IIT Bombay.

After his talk at Flipkart on June 12, Dr Chakrabarti spoke to Flipkart Stories about the evolution of the internet, artificial intelligence, the importance of indexing beyond the open web, its relevance to e-commerce, and many of the problems that have arisen as a result.

Edited excerpts:

On universe-wide AI and repeating evolution…

“I have this feeling that if the overall purpose of artificial intelligence is to populate fresh planets with intelligence, we should go about it a bit differently and first think of intelligent butterflies and bees and what happened out of the ecosystem of the planet from scratch. So let’s not teach AI how to answer queries. Instead, can we start with self-sustaining artificial intelligence from the raw materials of a new planet? If we can repeat evolution on a rapid time scale, we can figure out a shortcut for that process so that intelligence happens faster on other planets. That’s my definition of universe-wide AI.”

On data-driven vs model-driven approaches…

“It’s important to be both data-driven and model-driven. Data keeps you on track to correct outcomes, but models give you insight. In this respect, deep learning is facing a lot of criticism of late. At all the academic conferences that I’m involved with, there’s a very steep premium on building models which you can explain. And deep models without any interpretation are kind of losing favor at present, no matter how they perform. In areas like medicine, like critical control systems, you don’t want opaque systems, no matter how good they are.”

On people’s expectations from web search in the present…

“People expect more. People are asking a web query about something, and they get an organic hit, and they have a certain level of expectation that is still lower than if they’re making a very pointed query about a product that they’d like to buy with attributes from both specifications and people’s experiences with the product.”

On where e-commerce is probably lagging behind…

“In open web queries, at least since 2008, if not earlier, knowledge graphs and catalogs like Wikipedia and Freebase have been collaboratively and publicly collected, cleaned, and tailored to search. A lot more people have been working on structuring the public web compared to structuring products, their attributes, their relationships to other products, to companies, to vendors, to sellers. That is at a more nascent stage.”

On privacy…

“There seems to be no personalization of privacy needs. There are a lot of people who don’t care about giving out their Aadhaar number to get rice. But at the same time there are a lot of people whose fingerprints are very, very important to them. There has to be some sort of a design continuum where you say: ‘My fingerprint is worth this much to me, therefore I will sign up to a different modality of government benefits.’ If the same one-size-fits-all law is going to be imposed on the population, there are going to be stresses in the system. But at the same time it’s important to educate people who are not very familiar with what can be done in their names.”
