Publications
Conference Publications
Equitable Core Imputations for Max-Flow, MST and b-Matching Games
- With Rohith R. Gangam, Naveen Garg, and Vijay V. Vazirani
- AAMAS 2026: In Proceedings of the 25th International Conference on Autonomous Agents and Multiagent Systems
Abstract
We study fair allocation of profit (or cost) for three central problems from combinatorial optimization: Max-Flow, MST and b-matching. The essentially unequivocal choice of solution concept for this purpose would be the core, because of its highly desirable properties. However, recent work [Vaz24] observed that for the assignment game, an arbitrary core imputation makes no fairness guarantee at the level of individual agents. To rectify this deficiency, special core imputations, called equitable core imputations, were defined - there are two such imputations, leximin and leximax - and efficient algorithms were given for finding them. For all three games, we start by giving examples to show that an arbitrary core imputation can be excessively unfair to certain agents. This leads us to seek equitable core imputations for our three games as well. However, the ubiquitous tractable vs. intractable schism separates the assignment game from our three games, making our task different from that of [Vaz24]. As is usual in the presence of intractability, we resort to defining the Owen set for each game and algorithmically relating it to the set of optimal dual solutions of the underlying combinatorial problem. We then give polynomial-time algorithms for finding equitable imputations in the Owen set. The motivation for this work is two-fold: the emergence of automated decision-making, with a special emphasis on fairness, and the plethora of industrial applications of our three games.
On the Existence and Complexity of Core-Stable Data Exchanges
- Jiaxin Song, Pooja Kulkarni, Parnian Shahkar, and Bhaskar Ray Chaudhury
- NeurIPS 2025: The Thirty-Ninth Annual Conference on Neural Information Processing Systems
Abstract
The rapid growth of data-driven technologies, coupled with the widespread distribution of high-quality data across various organizations, has made the development of efficient data exchange protocols increasingly important. However, agents must navigate the delicate balance between acquiring valuable data and bearing the costs of sharing their own. Ensuring stability in these exchanges is essential to prevent agents — or groups of agents — from departing and conducting local exchanges independently. To address this, we study a model where n agents participate in a data exchange. Each agent has an associated concave payoff function and convex cost function — a setting typical in domains such as PAC learning and random discovery models. The net utility of an agent is its payoff minus its cost. We study the classical notion of core-stability. An exchange is core-stable if no subset of agents has any incentive to deviate to a different exchange. This notion also guarantees individual rationality and Pareto optimality. Our key contributions are: 1- Existence and Computation. Modeling the data exchange as an n-person game, we prove the game is balanced, thereby guaranteeing the existence of core-stable exchanges. This approach also naturally gives us a pivoting algorithm via Scarf's theorem (Scarf, 1967). Further, we show that the algorithm works well in practice through our empirical results. 2- Complexity. We prove that computing a core-stable exchange is PPAD-hard, even when the potential blocking coalitions are restricted to a constant number of agents. To the best of our knowledge, this is the first PPAD-hardness result for problems seeking core-like guarantees in data economies. We further show that relaxing either the concavity of the payoff function or the convexity of the cost function can lead to settings where core-stable exchanges may not exist. Additionally, given an arbitrary instance, determining whether a core-stable data exchange exists is NP-hard. Together, these findings delineate the existential and computational boundaries of achieving core stability in data exchange economies.
The Complexity of Finding Local Optima in Contrastive Learning
- Jingming Yan, Yiyuan Luo, Vaggos Chatziafratis, Ioannis Panageas, Parnian Shahkar, and Stelios Stavroulakis
- NeurIPS 2025: The Thirty-Ninth Annual Conference on Neural Information Processing Systems
Abstract
Contrastive learning is a powerful technique for discovering meaningful data representations by optimizing objectives based on contrastive information, often given as a set of weighted triplets indicating that an "anchor" is more similar to a "positive" example than to a "negative" example. The goal is to find representations (e.g., embeddings or a tree metric) where anchors are placed closer to positive than to negative examples. While finding global optima of contrastive objectives is NP-hard, the complexity of finding local optima -- representations that cannot be improved by local search algorithms such as gradient-based methods -- remains open. Our work settles the complexity of finding local optima in various contrastive learning problems by proving PLS-hardness in discrete settings (e.g., maximizing satisfied triplets) and CLS-hardness in continuous settings (e.g., minimizing Triplet Loss), where PLS (Polynomial Local Search) and CLS (Continuous Local Search) are well-studied complexity classes capturing local search dynamics in discrete and continuous optimization, respectively. Our results imply that no polynomial time algorithm (local search or otherwise) can find a local optimum for various contrastive learning problems, unless PLS is in P (or CLS is in P for continuous problems). Even in the unlikely scenario that PLS is in P (or CLS is in P), our reductions imply that there exist instances where local search algorithms need exponential time to reach a local optimum, even for d=1 (embeddings on a line).
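As a small, self-contained illustration of the continuous setting the abstract refers to (d = 1, i.e. embeddings on a line), the sketch below runs subgradient descent on a standard margin-based triplet loss until it reaches a point that gradient steps no longer improve. The loss form, margin, and step size are illustrative choices, not the paper's construction.

```python
import numpy as np

def triplet_loss(x, triplets, margin=1.0):
    """Sum over triplets (anchor, positive, negative) of the hinge
    loss max(0, |x[a]-x[p]| - |x[a]-x[n]| + margin)."""
    return sum(
        max(0.0, abs(x[a] - x[p]) - abs(x[a] - x[n]) + margin)
        for a, p, n in triplets
    )

def local_search(x, triplets, margin=1.0, lr=0.01, steps=2000):
    """Plain (sub)gradient descent on 1-D embeddings; it settles at a
    point gradient steps no longer improve, i.e. a local optimum of
    the triplet loss."""
    x = np.array(x, dtype=float)
    for _ in range(steps):
        g = np.zeros_like(x)
        for a, p, n in triplets:
            # only violated (active) triplets contribute a subgradient
            if abs(x[a] - x[p]) - abs(x[a] - x[n]) + margin > 0:
                s_ap = np.sign(x[a] - x[p])  # subgradient of |x[a]-x[p]|
                s_an = np.sign(x[a] - x[n])  # subgradient of |x[a]-x[n]|
                g[a] += s_ap - s_an
                g[p] += -s_ap
                g[n] += s_an
        x -= lr * g
    return x
```

On a single-triplet toy instance this pulls the positive toward the anchor and pushes the negative away until the margin constraint is slack; the hardness results above say that in general no such procedure (or any other) finds a local optimum in polynomial time unless CLS is in P.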
Online Fair Division: Towards Ex-Post Constant MMS Guarantees
- With Pooja Kulkarni and Ruta Mehta
- EC 2025: The 26th ACM Conference on Economics and Computation
Abstract
We investigate the problem of fairly allocating m indivisible items among n sequentially arriving agents with additive valuations, under the sought-after fairness notion of maximin share (MMS). We first observe a strong impossibility: without appropriate knowledge about the valuation functions of the incoming agents, no online algorithm can ensure any non-trivial MMS approximation, even when there are only two agents. Motivated by this impossibility, we introduce OnlineKTypeFD (online k-type fair division), a model that balances theoretical tractability with real-world applicability. In this model, each arriving agent belongs to one of k types, with all agents of a given type sharing the same known valuation function. We do not constrain k to be a constant. Upon arrival, an agent reveals her type, receives an irrevocable allocation, and departs. We study the ex-post MMS guarantees of online algorithms under two arrival models: 1- Adversarial arrivals: In this model, an adversary determines the type of each arriving agent. We design a 1/k-MMS-competitive algorithm and complement it with a lower bound, ruling out any Ω(1/√k)-MMS-competitive algorithm, even for binary valuations. 2- Stochastic arrivals: In this model, the type of each arriving agent is independently drawn from an underlying, possibly unknown distribution. Unlike the adversarial setting where the dependence on k is unavoidable, we surprisingly show that in the stochastic setting, an asymptotic, arbitrarily close-to-0.5-MMS-competitive guarantee is achievable under mild distributional assumptions. Our results extend naturally to a learning-augmented framework; when given access to predictions about valuation functions, we show that the competitive ratios of our algorithms degrade gracefully with multiplicative prediction errors.
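For readers unfamiliar with the maximin share, the brute-force sketch below computes an agent's MMS value under additive valuations by enumerating every way to split the goods into n bundles. It is a definition-checker for small instances only, not the online algorithm studied in the paper.

```python
from itertools import product

def mms_value(values, n):
    """Maximin share (MMS) of an agent with additive values over the
    goods: the best, over all partitions of the goods into n bundles,
    of the value of the least-valued bundle. Exhaustive enumeration
    of bundle assignments, so feasible only for small instances."""
    best = float("-inf")
    for assignment in product(range(n), repeat=len(values)):
        bundles = [0] * n
        for value, bundle in zip(values, assignment):
            bundles[bundle] += value
        best = max(best, min(bundles))
    return best
```

An α-MMS allocation then gives every agent a bundle worth at least α times this benchmark, computed with that agent's own valuation.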
Reciprocally Fair Federated Learning
- With Aniket Murhekar, Jiaxin Song, Bhaskar Ray Chaudhury, and Ruta Mehta
- ICML 2025: Forty-Second International Conference on Machine Learning
Abstract
Federated learning (FL) is a popular collaborative learning paradigm, where agents with individual datasets can jointly train a machine learning (ML) model. While higher data sharing improves model accuracy and leads to higher payoffs, it also raises costs associated with data acquisition or loss of privacy, causing agents to be strategic about their data contribution. This leads to undesirable behavior at a Nash equilibrium (NE), such as free-riding, resulting in sub-optimal fairness, data sharing, and welfare. To address this, we design M_Shap, a budget-balanced payment mechanism for FL that admits Nash equilibria under mild conditions and achieves reciprocal fairness, where each agent's payoff equals their contribution to the collaboration, as measured by the Shapley share. In addition to fairness, we show that the Nash equilibrium under M_Shap has desirable guarantees in terms of accuracy, welfare, and total data collected. We validate our theoretical results through experiments, demonstrating that M_Shap outperforms baselines in terms of fairness and efficiency.
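The Shapley share underlying M_Shap can be made concrete via the standard permutation-based formula: average each agent's marginal contribution over all arrival orders. The sketch below computes exact Shapley values for a toy coalition value function; the square-root payoff on total contributed data is an illustrative assumption, not the paper's model.

```python
from itertools import permutations

def shapley_values(agents, value):
    """Exact Shapley values: average each agent's marginal
    contribution value(S + {a}) - value(S) over all arrival orders.
    Enumerates all n! permutations, so only for small agent sets."""
    agents = list(agents)
    shap = {a: 0.0 for a in agents}
    perms = list(permutations(agents))
    for order in perms:
        coalition = set()
        prev = value(coalition)
        for a in order:
            coalition.add(a)
            cur = value(coalition)
            shap[a] += cur - prev  # marginal contribution of a
            prev = cur
    return {a: s / len(perms) for a, s in shap.items()}
```

By construction the shares are budget-balanced (they sum to the grand coalition's value) and symmetric agents receive equal shares, the two properties that make the Shapley share a natural yardstick for reciprocal fairness.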
Improved MMS Approximations for Few Agent Types
- With Jugal Garg
- IJCAI 2025: The 34th International Joint Conference on Artificial Intelligence
Abstract
We study fair division of indivisible goods under the maximin share (MMS) fairness criterion in settings where agents are grouped into a small number of types, with agents within each type having identical valuations. For the special case of a single type, an exact MMS allocation is always guaranteed to exist. However, for two or more distinct agent types, exact MMS allocations do not always exist, shifting the focus to establishing the existence of approximate-MMS allocations. A series of works over the last decade has resulted in the best-known approximation guarantee of 3/4+3/3836. In this paper, we improve the approximation guarantees for settings where agents are grouped into two or three types, a scenario that arises in many practical settings. Specifically, we present novel algorithms that guarantee a 4/5-MMS allocation for two agent types and a 16/21-MMS allocation for three agent types. Our approach leverages the MMS partition of the majority type and adapts it to provide improved fairness guarantees for all types.
Human-AI Collaboration with Misaligned Preferences
- With Jiaxin Song, Kate Donahue, and Bhaskar Ray Chaudhury
- EAMMO 2025: Proceedings of the 5th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization
Abstract
In many real-life settings, algorithms play the role of assistants, while humans ultimately make the final decision. Often, algorithms specifically act as curators, narrowing down a wide range of options into a smaller subset that the human picks between: consider content recommendation or chatbot responses to questions with multiple valid answers. Crucially, humans may not know their own preferences perfectly either, but instead may only have access to a noisy sampling over preferences. Algorithms can assist humans by curating a smaller subset of items, but must also face the challenge of misalignment: humans may have different preferences from each other (and from the algorithm), and the algorithm may not know the exact preferences of the human it is facing at any point in time. In this paper, we model and theoretically study such a setting. Surprisingly, we show instances where humans gain more utility from a misaligned algorithm (which makes different mistakes) than from an aligned one. Next, we build on this result by studying what properties of algorithms maximize human welfare, when the goals could be either utilitarian welfare or ensuring all humans benefit. We conclude by discussing implications for designers of algorithmic tools and policymakers.
Welfare Approximation in Additively Separable Hedonic Games
- With Martin Bullinger and Vaggos Chatziafratis
- AAMAS 2025: In Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems
Abstract
Partitioning a set of n items or agents while maximizing the value of the partition is a fundamental algorithmic task. We study this problem in the specific setting of maximizing social welfare in additively separable hedonic games. Unfortunately, this task faces strong computational boundaries: extending previous results, we show that approximating welfare by a factor of n^(1 - epsilon) is NP-hard, even for severely restricted weights. However, we can obtain a randomized O(log n)-approximation on instances for which the sum of input valuations is nonnegative. Finally, we study two stochastic models of aversion-to-enemies games, where the weights are derived from Erdős-Rényi or multipartite graphs. We obtain constant-factor and logarithmic-factor approximations with high probability.
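The objective in question is easy to state in code: the sketch below evaluates the social welfare of a coalition structure in an additively separable hedonic game, with the (possibly asymmetric) weights stored in a hypothetical dict keyed by ordered agent pairs; missing pairs default to weight zero.

```python
def social_welfare(weights, partition):
    """Social welfare of a coalition structure in an additively
    separable hedonic game: each agent's utility is the sum of its
    weights toward the other members of its own coalition, and
    welfare is the sum of all agents' utilities."""
    total = 0
    for coalition in partition:
        for i in coalition:
            for j in coalition:
                if i != j:
                    total += weights.get((i, j), 0)  # 0 if unspecified
    return total
```

The hardness result above concerns finding the partition maximizing this quantity; evaluating it for a given partition, as here, is straightforward.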
Robust Popular Matchings
- With Martin Bullinger and Rohith Reddy Gangam
- AAMAS 2024: In Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
Abstract
We study popularity for matchings under preferences. This solution concept captures matchings that do not lose against any other matching in a majority vote by the agents. A popular matching is said to be robust if it is popular among multiple instances. We present a polynomial-time algorithm for deciding whether a robust popular matching exists when instances differ only in the preferences of a single agent, and we obtain NP-completeness when two instances differ only by a downward shift of one alternative by four agents. Moreover, we find a complexity dichotomy based on preference completeness for the case where instances differ by making some options unavailable.
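The majority vote defining popularity can be sketched directly: each agent compares its partners under two matchings and votes for the one it prefers. The helper below uses an illustrative dict encoding (preference lists per agent, matchings mapping agents to partners) and the standard convention that being unmatched is worse than any acceptable partner.

```python
def vote(pref, matching_a, matching_b):
    """Head-to-head vote between two matchings. pref maps each agent
    to its preference list (most preferred first); each matching maps
    agents to partners, with absent agents unmatched. Returns
    (votes for A, votes for B); A is popular against B iff it does
    not lose, i.e. votes for A >= votes for B."""
    def rank(agent, partner):
        # position in the agent's list; unmatched ranks below all
        if partner is None:
            return len(pref[agent])
        return pref[agent].index(partner)

    a_votes = b_votes = 0
    for agent in pref:
        ra = rank(agent, matching_a.get(agent))
        rb = rank(agent, matching_b.get(agent))
        if ra < rb:
            a_votes += 1
        elif rb < ra:
            b_votes += 1
    return a_votes, b_votes
```

A matching is popular if it wins or ties this vote against every alternative matching; robustness asks for a single matching that is popular in each of several such instances.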
Working Papers
How to Price Data: A Market Equilibrium Based Approach
- With Pooja Kulkarni and Ruta Mehta
Abstract
High-quality data is a key input to modern machine learning models, leading to the emergence of platforms that facilitate the buying and selling of data. A central challenge in these platforms is how the data is priced to balance the interests of both buyers and sellers. Traditional market equilibrium notions, where demand meets supply, are commonly used to price goods but do not extend naturally to data due to its non-rivalrous nature, whereby multiple buyers can simultaneously benefit from the same dataset. We therefore introduce a new notion of equilibrium for data pricing based on Nash equilibrium and study it in settings where data may be complementary or substitutable, focusing on the canonical utility models for each, namely Leontief and linear, respectively. We show that equilibrium prices fail to exist for linear utilities even with homogeneous buyers and two sellers, while establishing strong existence, efficiency, and polynomial-time computation guarantees for Leontief utilities in general markets with n homogeneous buyers and m sellers. We further examine the role of platform mediation and price discrimination in enabling optimal equilibrium outcomes efficiently. On the technical front, we develop a novel proof technique based on systematically reducing the space of candidate equilibria through the graph-of-deviations, which may be of independent interest.
Online MMS with Worst-Case Guarantees: Randomization Beats the Adversary
- With Shuchi Chawla, Zhiyi Huang, Ruta Mehta, and Pooja Kulkarni
Abstract
We study the problem of fairly allocating m indivisible goods among n agents who arrive online, under the notion of maximin share (MMS) fairness. Fair allocation with online agent arrivals is known to be extremely challenging: prior work shows that even with two agents, no deterministic online algorithm can simultaneously guarantee for both agents any non-trivial approximation to MMS, or to any standard fairness notion, without knowledge of future arrivals.
In this work, we show that randomization can circumvent these strong impossibility results. We first present a randomized allocation algorithm parameterized by c ∈ Ω(log n) that allocates goods independently in each round and, with probability at least 1 − (O(log c))^3/c^2, guarantees every agent an O(1/c)-approximation to MMS. We extend this guarantee all the way to submodular valuation functions, providing a strong counter to known hardness results.
Our second main result targets the stronger notion of constant-MMS guarantees. We propose an allocation scheme based on dependent rounding that achieves a 1/256-approximation to MMS with constant probability under binary additive valuations. The analysis proceeds by carefully interleaving our algorithm, round by round, with an auxiliary algorithm. This coupling enables us to apply concentration bounds for exchangeable random variables, which in turn allows us to control the allocation probabilities.
We complement these results with almost matching limitations: for any c ≥ 1, no online algorithm can achieve a 1/c-approximation to MMS with probability greater than 1 − 1/c^2, even for the case of binary additive valuations.
Benefits to Agentic Heterogeneity in Compositional Decision Making (LLM Collaboration)
- With Jiaxin Song, Kate Donahue, and Bhaskar Ray Chaudhury
Abstract
In many real-life settings, decisions are made not by a single agent acting in isolation, but by teams of imperfect agents. Specifically, teammates often build on each other's work, reflecting a composition setting. For example, consider a setting where an algorithm presents a partial solution to a human, who makes the final decision herself, or an agentic team where the final output is a product of iterative improvements by multiple agents. In this case, we may wonder about the benefits of a diverse team - when agents are composed with each other, is it ever helpful to have agents with different strengths and weaknesses, rather than a homogeneous team of the strongest agent in isolation? In this work, we study a specific setting, where the task is to recover a single item (answer) from a discrete set, where items differ in quality. One agent (the curator) narrows down a set of items to a smaller set, while another agent (the decider) makes the final pick. In this setting, we show provable benefits to heterogeneity between the agents: that is, we identify settings where a heterogeneous team is stronger than a homogeneous one. We precisely characterize this heterogeneity, which relates to a form of “misalignment” between the agents that leads them to make different kinds of mistakes. Finally, we explore the performance of LLMs on a benchmark dataset (MMLU), demonstrating empirically that there exist scenarios where a mixed team can outperform a single agent in isolation or a homogeneous team. We conclude by discussing implications for the design of agentic teams, as well as desirable properties for the design of algorithmic tools.
Ongoing Research
Equilibrium Pricing in API Markets with Covering Constraints
- With Kshipra Bhawalkar, Christopher Liaw, Pooja Kulkarni, and Ruta Mehta
Modalities as Strategic Agents in Multimodal ML
- With Babak Shahbaba, Ioannis Panageas, Ziyi Liang, and Zahra Moslemi
