
Data Science and AI for Good: A Fireside Chat with Columbia University Professors
Insights
- Data for good goes beyond responsible AI by ensuring that not only the methods but also the end goals of technology align with societal values.
- The evolution of Cricinfo illustrates how AI and innovations like retrieval-augmented generation (RAG) can simplify complex user experiences and drive broader adoption.
- Next-generation AI research demands an interdisciplinary approach, bringing together engineering, business, and social sciences to build trusted, agentic systems for real-world impact.
In this fireside chat, Ramachandran S, Lead for Engineering and Manufacturing at Infosys Knowledge Institute, speaks with Professor Garud Iyengar, Avanessians Director of the Columbia Data Science Institute, and Professor Vishal Misra, Vice Dean for Computing and AI at Columbia Engineering. Together, they explore how data science and AI are being applied for societal good, the role of responsible AI, the evolution of net neutrality, and the Infosys–Columbia partnership shaping the future of agentic and interdisciplinary AI research.
Ramachandran S:
Hello everyone. Welcome to this fireside chat with two global experts in data science. My name is Ramachandran. I'm the lead for the Engineering and Manufacturing domains in the thought leadership team at the Infosys Knowledge Institute. We are very privileged to have with us today Professor Garud Iyengar and Professor Vishal Misra from Columbia University.
Professor Garud is Avanessians Director at the Data Science Institute. He's also a professor of industrial engineering and operations research at Columbia Engineering. He co-leads the artificial intelligence initiatives at the university.
Professor Vishal is the RKS family professor of computer science. He is also the Vice Dean for Computing and Artificial Intelligence. As a graduate student, he co-founded cricinfo.com, a very popular online cricket platform. Welcome to this chat, professors. Thank you so much for taking the time in your busy schedule.
Ramachandran S:
So to begin with, Professor Garud, can you talk to us about the Data Science Institute? How do you help students realize value from data?
Garud Iyengar:
Avanessians Director, Columbia Data Science Institute
The Data Science Institute at Columbia was started as a university-level initiative to harness the power of data science to make positive changes in society. That has been our mission. We do this by supporting research in the foundations of data science, in the applications of data science to various domains, and also educating the next generation of data scientists.
We have a master's program, we have a PhD program, and we run DSI Scholars, which is a research program targeted towards master's students and undergraduates to get them involved early in data science. It's always been the mission of the Institute to make sure that the next generation of data scientists, AI scientists, get the best possible education.
Ramachandran S:
So, Professor Vishal, going back to your setup of Cricinfo, I was curious to know, has artificial intelligence been used in the platform? If so, how has it been used?
Vishal Misra:
Vice Dean, Computing and AI, Columbia Engineering
So yeah, AI is being used, but this is more recent. Cricinfo was founded in the 90s. And cricket, as you know, is a very stats-rich sport. So we had created an online free searchable database called Statsguru on Cricinfo. But because you can search for anything, everything was made available.
So if you go to the Statsguru page, it has like 15 drop downs, 20 check boxes, 18 different text fields. It's a very daunting, confusing interface. So as a result, except for the real cricket nerds, people didn’t really use Statsguru that much.
Now ESPN bought Cricinfo, so it's now called ESPNcricinfo. But I've always felt that that had to be fixed. That interface was not good.
I know the people who run Cricinfo, ESPNcricinfo, quite well. The editor-in-chief, Sambit Bal, whenever he comes to New York, we meet. So he was in New York in January 2020, right before the pandemic. We had gone out, and we were talking and I told him again, why don’t we do something about Statsguru? And he looked at me and said, why don't you do something about Statsguru?
It was kind of a joke, but not really. He really thought maybe I should be doing something. So that summer, GPT-3 was released for the first time. And I saw someone use GPT-3 to write a SQL query for their own database by giving natural language commands to GPT-3.
And I thought maybe I could use that to simplify the interface for Statsguru. So I did get early access to GPT-3. Soon I realized that I could not use what existed to build the interface, because GPT-3, as people may remember, had only a 2048-token context limit. There's no way you could fit the complexities of a database like Statsguru in that context window.
So to solve that problem, I accidentally invented what is now known as RAG.
Using RAG, we were able to build this interface where now you can ask queries in natural language. You can say, “What is Kohli’s record against leg spinners in power plays?” Instead of going through that complicated interface, now you can ask questions in English.
This was put in production on Cricinfo in September of 2021, about 15 months before ChatGPT arrived. So I've been playing with these models for quite a while.
After we built this, I didn’t understand why it worked. So then I got into researching these models and building a mathematical model, trying to understand why these models work, how they can learn on demand.
So my journey in AI is actually related to Cricinfo. And Cricinfo is using AI in the form of Ask Cricinfo. There’s a new version coming soon. We’ll see. Before that, my area of research was modeling and performance evaluation, networking. I wasn’t really into AI. But thanks to Cricinfo, very interestingly, I started this journey. And now I’m in the space for computing and AI.
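The retrieval idea described above, fetching only the pieces of database knowledge relevant to a question so that the prompt fits a small context window, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how Ask Cricinfo is actually built: the snippets, table names, and column names are hypothetical, the keyword-overlap scoring stands in for whatever retrieval method was really used, and counting whitespace-separated words is only a crude proxy for model tokens. No model is called; the sketch stops at assembling the prompt.

```python
import re

# Hypothetical notes about a cricket statistics schema.
# None of these names come from the real Statsguru database.
SNIPPETS = [
    "batting table: one row per player, bowler_type and phase, "
    "holding the batting record (runs, balls, dismissals)",
    "bowler_type values: leg spin, off spin, left-arm pace, right-arm pace",
    "phase values: power play, middle overs, death overs",
    "bowling table: one row per player and venue, "
    "holding overs, wickets and runs conceded",
    "matches table: match id, venue, date, format and winner",
]


def tokenize(text):
    """Lowercase alphabetic words of three or more letters, as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= 3}


def retrieve(question, snippets, k=3):
    """Return the k snippets sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(snippets, key=lambda s: len(q & tokenize(s)), reverse=True)
    return ranked[:k]


def count_tokens(text):
    """Crude stand-in for a real tokenizer: whitespace-separated words."""
    return len(text.split())


def build_prompt(question, snippets, budget=2048):
    """Assemble a text-to-SQL prompt, adding retrieved snippets until the
    (approximate) token budget would be exceeded."""
    header = "Translate the question into SQL using only the context below."
    footer = f"Question: {question}\nSQL:"
    used = count_tokens(header) + count_tokens(footer)
    chosen = []
    for snippet in retrieve(question, snippets):
        cost = count_tokens(snippet)
        if used + cost > budget:
            break
        chosen.append(snippet)
        used += cost
    context = "\n".join(f"- {s}" for s in chosen)
    return f"{header}\nContext:\n{context}\n{footer}"


if __name__ == "__main__":
    question = "What is Kohli's record against leg spinners in power plays?"
    print(build_prompt(question, SNIPPETS))
```

The resulting prompt would then be sent to a language model, which is outside the scope of this sketch. The point of the design is that only the retrieved snippets, not the whole schema, compete for the limited context window.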
Ramachandran S:
Very interesting. Thank you, professor. So, Professor Garud, you talk about data for good for the betterment of society. Can you talk more about that? How unique, how different is it from responsible AI beyond being just ethical or transparent and accountable?
Garud Iyengar:
So in order to understand that, I want to take you back to 2012. That was when the Data Science Institute started forming. And we were thinking about this mission of data for good right at that time, a time when AI hadn’t taken off, though machine learning was already there.
In some sense, what we were trying to do by putting that message up is that whatever research that we do, whatever research we support as seed grants, whatever our faculty write, we wanted them to make a declaration that whatever data they have used is for a positive purpose.
Later on, many conferences in machine learning started to include this effect as well.
In terms of trying to convert that into an actionable statement, we’ve been working with our policy school. More recently, once AI came into the picture, trying to understand what it means to use data for societal good. Responsible AI definitely is a part of it.
But I could responsibly do something bad, meaning the technology is faithful to my goals even though the goals themselves are bad. With Data for Good, we are actually asking for something more. Not just verifying that the technology does what it is supposed to do, but ensuring the end goal that the technology is being used for keeps societal values in mind.
What Infosys has done in its responsible AI initiative—identifying all the measurable metrics that one could attach to systems to understand what is responsible, equitable, fair—that’s part of it.
And here, our efforts at the Data Science Institute were really to make people aware: it’s not that whatever AI can do, we must do. We should also think about what AI should do. Once you have immense amounts of data, you can extract things for positive purposes or for negative purposes. And we would like to support positive purposes.
Ramachandran S:
Very profound thoughts, professor, very deep. Thank you so much. Coming back to you, Professor Vishal, you have been a proponent of net neutrality. But do you think post-pandemic it has taken a step back? Has there been an impact on net neutrality because broadband has become so essential, not just in corporate life, but even for education and for the common man across all countries? What is your take on net neutrality as we see it today?
Vishal Misra:
So, talking about India: the aspect of net neutrality that I worked on was about 10 years ago. I came and testified in parliament. I also worked with the TRAI chief at the time, R.S. Sharma.
And the law that was passed on net neutrality, which banned differential pricing, actually has played a very big role in the way the internet has evolved in India. That was right before Jio was about to enter the market.
And there are two kinds of regulations. There’s ex post and ex ante. Ex post is after you’ve seen some harm, then you bring in regulation to try and undo the harm or stop what’s done. Ex ante is where you anticipate something might happen and then design a regulation to prevent it from happening.
So when we worked on that regulation banning differential pricing, its impact was that Jio, being vertically integrated as both an ISP and a content provider, could not lower the price of its own content, whether Jio Cinema, Jio TV, or whatever other Jio services they were going to offer. They had to offer the same price for every byte that flows on the network.
So as a result, what happened was they came into the market initially, if you remember those days, they could not make that part free. So they made everything free, even Netflix or Amazon Prime or whatever. All sorts of content providers got the benefit of lower pricing because of the way Jio approached the market after the differential pricing ruling.
As a result, the other ISPs in India had to lower their prices, had to match, and had to compete with better services. Now, if you look at internet service in India, worldwide it’s probably amongst the cheapest and the best. People don’t realize that, but that net neutrality ruling had a silent role to play in that.
So instead of thinking it has taken a backseat, silently it has really helped with work from home and providing really good internet services to Indian citizens.
Ramachandran S:
Is there any global context to it, professor? You spoke about the Indian context.
Vishal Misra:
India was amongst the leaders globally in enacting really strong differential pricing regulations. I think Canada followed suit. Places in Europe also had similar net neutrality rulings.
The USA has been different—they’ve had some rulings, then they stepped back, then they had another ruling. The problem in the US is not the same. The US has islands of monopolies. Although there are several ISPs in the US, in different regions you don’t have lots of choices. That has been a problem.
So even though net neutrality as defined here is not in place there, a different problem is playing out in the US. But several other countries have enacted net neutrality regulations, following the lead of TRAI and R.S. Sharma. That was a fantastic piece of regulation they enacted.
Ramachandran S:
So, Professor Garud, one of the strategic ongoing partnerships between Infosys and Columbia University is to set up a cutting-edge artificial intelligence center for next-gen research. Can you talk more about it?
Garud Iyengar:
Absolutely. I’m really excited about this center. The visit to Infosys with regard to the center has really opened my eyes to what this company can do, and is doing.
We came in with one vision, but I am walking away thinking about many, many more things that we could be doing at that center.
The three main areas we are thinking about right now are:
- Agentic AI — understanding how an ecosystem of agents would evolve. Think about a world where Infosys puts out some agents, SAP puts out some agents, Adobe puts out some agents, and they all work together for your application. How would you provide security for it? How would you prevent adversarial attacks? What kind of back-end infrastructure is needed? What kind of controls, guardrails?
- Responsible AI — not just in the agentic or GenAI sense, but broader. Looking at all kinds of critical applications, understanding how to control them. LLMs are part of that topic as well.
- AI for Marketing Applications — envisioning a world where a CMO can have all the data, play with it, visualize it, run “what-if” scenarios without needing data scientists. Everything available to them directly. How would marketing operations change if it’s AI-first? Instead of just modifying existing operations, how do we redesign them from scratch?
The other feature I really like is that this relationship is between Infosys on one end and Columbia University on the other. It’s going to be housed in the School of Engineering, but we’re going to bring faculty from across the university.
Take agentic AI—you could technologically produce it. But then there’s the question of adoption, trust, making sure people don’t feel threatened by this technology. That’s not typically something engineers worry about. It’s something our colleagues in the business school, psychology department, and others will contribute to.
Similarly for marketing—the lead will come from the marketing department, but we’ll identify people who understand both technology and domain.
I’m very excited about what we could do. And as I said earlier, the visit here has opened my eyes to other things. Infosys has platforms we could be using directly for our students. Going back, I’ll rethink how this center is going to work.
Ramachandran S:
That sounds very exciting, professor. It shows how interdisciplinary AI is. Thank you so much for your time and for a very interesting conversation.
So thank you for watching. For more such video sessions, please visit infosys.com/IKI and the video section. Until next time, keep learning, keep sharing.