Past Experience
My research so far has centered on learning from imperfect data. My PhD began with trying to understand how pretraining helps downstream model performance even when the pretraining is done on extremely noisy, sparse labels (arXiv). This was when Self-Supervised Learning methods were becoming mainstream and showing immense promise.
Towards the end of that project, I decided to transition to more theoretical topics and began looking into Fairness in Machine Learning. Then, as now, Fair ML offers a plethora of open problems with significant implications for Machine Learning. Our paper published at ICML 2024 tackles exactly this question: when does imposing fairness on a model also yield the most accurate classifier, especially when the data is imperfect? The tools developed in that paper can be leveraged for many cost-sensitive applications and label-bias models. This work also sparked my interest in Bayes optimality (the best achievable model for a given data distribution).
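Concretely, for the standard 0-1 loss, the Bayes-optimal classifier is simply the rule that predicts the most probable label under the true data distribution,

$$h^*(x) = \arg\max_{y} \Pr(Y = y \mid X = x),$$

and its error, the Bayes error, is the lowest error any classifier can achieve on that distribution.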
In the real world, our data will never be free of noise and the spurious correlations of the past. Hence, we need some assurance that the models we train on such noisy and biased data will get the job done and comply with safety and regulatory guidelines. My current research therefore studies many such scenarios from the perspective of what is theoretically possible (Bayes optimality). I am also working on generalizing these results to other imperfect-data settings, especially those requiring cost-sensitive learning, where different errors carry unequal weights. Fairness in ML is just one example: there, the weights must differ between data groups to satisfy a given fairness constraint, as sketched below.
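As a standard illustration (a textbook fact, not specific to our papers), consider binary classification with a false-positive cost $c_{FP}$ and a false-negative cost $c_{FN}$. The cost-sensitive risk of a classifier $h$ is

$$R(h) = \mathbb{E}\big[\, c_{FP}\,\mathbf{1}\{h(X)=1, Y=0\} + c_{FN}\,\mathbf{1}\{h(X)=0, Y=1\} \,\big],$$

and the Bayes-optimal rule thresholds the posterior $\eta(x) = \Pr(Y = 1 \mid X = x)$ at $c_{FP}/(c_{FP}+c_{FN})$. With group-dependent costs, the optimal classifier applies a different threshold to each group, which is the structure that many group-fairness constraints induce.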
Building on this work, we recently also studied whether a given data distribution can be steered towards an 'ideal' distribution in which there is no trade-off between fairness and accuracy. In our recent work (arXiv, blog), accepted for presentation and publication at NeurIPS 2025, we characterize such conditions under a parametric assumption on the data distribution. We also observe that the most general steering condition results in non-convex optimization programs, so we propose an affirmative action for the underprivileged group to make the optimization tractable. Since there is some evidence that the internal representations of large models are approximately Gaussian, as a proof of concept we leveraged our results to steer LLMs away from toxic generation and to reduce bias in multi-class classification over embeddings extracted from LLMs. We also hope that some of these ideas can be used for principled training of fair generative models.
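To give a flavour of why the Gaussian view makes steering tractable (a generic fact, not the exact construction in our paper): if embeddings follow $X \sim \mathcal{N}(\mu_s, \Sigma_s)$ and the target distribution is $\mathcal{N}(\mu_t, \Sigma_t)$, then any affine map

$$T(x) = \mu_t + A\,(x - \mu_s), \qquad A\,\Sigma_s\,A^\top = \Sigma_t,$$

pushes the source Gaussian exactly onto the target, so steering reduces to estimating means and covariances and solving for a single linear map.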
Current Projects
I am currently on track to submit my thesis early next year. My remaining projects explore Bayes error estimation from soft labels (extending ideas from Ishida et al.) and the role of randomization in optimal fair classification (extending ideas from Agarwal et al.).
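The starting point for the first project is the direct estimator of Ishida et al.: in binary classification the Bayes error equals $\mathbb{E}\big[\min(\eta(X), 1 - \eta(X))\big]$ for the posterior $\eta(x) = \Pr(Y = 1 \mid X = x)$, so given soft labels $c_1, \dots, c_n$ that approximate this posterior, it can be estimated as

$$\widehat{\mathrm{BER}} = \frac{1}{n} \sum_{i=1}^{n} \min(c_i, 1 - c_i),$$

without ever training a classifier.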