Publications
Upcoming Work
Equilibrium Pricing for Data Markets
- With Pooja Kulkarni and Ruta Mehta
Abstract
Data is increasingly viewed as a strategic asset, yet standardized market mechanisms for its pricing and exchange remain underdeveloped. Unlike traditional goods, data is non-rivalrous and infinitely replicable, rendering classical notions of competitive equilibrium—where prices balance supply and demand—inapplicable. We propose a new model for competitive pricing in data markets, grounded in game-theoretic principles. We model the interaction between sellers and buyers as a Stackelberg-type game: sellers first set prices, and buyers respond by purchasing their most preferred bundles subject to budget constraints. Since data is non-rivalrous, buyers can always access their optimal bundles. Sellers, aiming to maximize revenue, engage in a strategic pricing game. We define equilibrium prices as a Nash equilibrium in this game, where no seller can unilaterally improve her revenue. For a natural class of buyer utility functions arising from vertically distributed data, we show the following: (1) There exists a rational Nash equilibrium that is computable in polynomial time. (2) Among all equilibria, the aforementioned equilibrium is simultaneously revenue-maximizing for sellers, welfare-maximizing for buyers, and fair to all sellers. Our results parallel classical insights: competitive pricing promotes both efficiency and fairness. Additionally, we show that the natural best-response dynamics converge to a (1 + ε)-approximate equilibrium. We empirically compare these equilibria and analyze the role of online-platform mediation in improving market outcomes. On the technical side, the set of equilibria is non-convex, so standard fixed-point techniques (e.g., Kakutani’s theorem) are inapplicable. Instead, we develop a novel “graph-of-deviations” framework and show acyclicity to establish existence and computation of rational equilibria.
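For intuition, here is a minimal, self-contained sketch of best-response price dynamics in a toy non-rivalrous data market; the valuations, budgets, price grid, and all helper names are illustrative assumptions, not the paper's construction or its equilibrium-computation algorithm.

```python
# Toy sketch (assumed instance, not the paper's algorithm): round-robin
# best-response pricing in a non-rivalrous data market. Each seller owns one
# dataset; each buyer purchases her utility-maximizing affordable bundle.
from itertools import combinations

values = [            # values[b][s] = buyer b's value for seller s's dataset
    [6.0, 4.0, 3.0],
    [5.0, 7.0, 2.0],
]
budgets = [8.0, 9.0]
n_sellers = 3
price_grid = [0.5 * k for k in range(1, 21)]   # candidate prices per seller

def best_bundle(b, prices):
    """Buyer b's utility-maximizing bundle subject to her budget (brute force)."""
    best, best_util = (), 0.0
    for r in range(1, n_sellers + 1):
        for bundle in combinations(range(n_sellers), r):
            cost = sum(prices[s] for s in bundle)
            util = sum(values[b][s] for s in bundle) - cost
            if cost <= budgets[b] and util > best_util:
                best, best_util = bundle, util
    return best

def revenue(s, prices):
    """Seller s's revenue; data is non-rivalrous, so every buyer can buy it."""
    return sum(prices[s] for b in range(len(values)) if s in best_bundle(b, prices))

prices = [1.0] * n_sellers
for _ in range(50):                     # round-robin best responses
    changed = False
    for s in range(n_sellers):
        best_p = max(price_grid,
                     key=lambda p: revenue(s, prices[:s] + [p] + prices[s + 1:]))
        if best_p != prices[s]:
            prices[s], changed = best_p, True
    if not changed:                     # no seller has a profitable deviation
        break
print("prices after best-response dynamics:", prices)
```

On a grid the dynamics may cycle, which is why the loop is capped; the paper's convergence guarantee is to a (1 + ε)-approximate equilibrium under its actual utility class.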
Human-AI Collaboration with Misaligned Preferences
- With Jiaxin Song, Kate Donahue, and Bhaskar Ray Chaudhury
Abstract
In many real-life settings, algorithms play the role of assistants, while humans ultimately make the final decision. Often, algorithms specifically act as curators, narrowing down a wide range of options into a smaller subset that the human picks from: consider content recommendation or chatbot responses to questions with multiple valid answers. Crucially, humans may not know their own preferences perfectly either, but may only have access to a noisy sampling over preferences. Algorithms can assist humans by curating a smaller subset of items, but must also face the challenge of misalignment: humans may have different preferences from each other (and from the algorithm), and the algorithm may not know the exact preferences of the human it is facing at any point in time. In this paper, we model and theoretically study such a setting. Specifically, we exhibit instances where humans benefit by collaborating with a misaligned algorithm; surprisingly, humans can gain more utility from a misaligned algorithm (which makes different mistakes) than from an aligned one. Next, we build on this result by studying which properties of algorithms maximize human welfare, whether the goal is utilitarian welfare or ensuring that every human benefits. We conclude by discussing implications for designers of algorithmic tools and policymakers.
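The flavor of the curation model can be seen in a toy Monte Carlo sketch; the Gaussian utilities, the noise model, and all parameters below are assumptions made for illustration and do not reproduce the paper's formal analysis.

```python
# Toy Monte Carlo sketch of curation under noisy preferences (illustrative
# assumptions only). An algorithm curates k of m items using its own noisy
# utility estimates; the human then picks from the curated set using her own
# noisy estimates. "Aligned" here means the algorithm shares the human's
# estimates exactly (same mistakes); "misaligned" means independent noise.
import random

def run(trials=20000, m=10, k=3, noise=1.0, aligned=False):
    total = 0.0
    for _ in range(trials):
        u = [random.gauss(0, 1) for _ in range(m)]           # true utilities
        human = [ui + random.gauss(0, noise) for ui in u]    # human's estimates
        algo = human if aligned else [ui + random.gauss(0, noise) for ui in u]
        curated = sorted(range(m), key=lambda i: algo[i])[-k:]  # algorithm's top k
        pick = max(curated, key=lambda i: human[i])             # human's choice
        total += u[pick]
    return total / trials

random.seed(0)
print("aligned curator   :", round(run(aligned=True), 3))
print("misaligned curator:", round(run(aligned=False), 3))
```

In this toy version the independently-noisy curator effectively gives the human a second opinion, so her realized utility tends to exceed that with a curator making exactly her mistakes, matching the qualitative message above.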
Equitable Core Imputations for Max-Flow, MST and b-Matching Games
- With Rohith R. Gangam, Naveen Garg, and Vijay V. Vazirani
Abstract
We study fair allocation of profit (or cost) for three central problems from combinatorial optimization: Max-Flow, MST, and b-matching. The essentially unequivocal choice of solution concept for this purpose is the core, because of its highly desirable properties. However, recent work [Vaz24] observed that for the assignment game, an arbitrary core imputation makes no fairness guarantee at the level of individual agents. To rectify this deficiency, special core imputations, called equitable core imputations, were defined (there are two such imputations, leximin and leximax), and efficient algorithms were given for finding them. For all three games, we start by giving examples showing that an arbitrary core imputation can be excessively unfair to certain agents. This leads us to seek equitable core imputations for our three games as well. However, the ubiquitous tractable vs. intractable schism separates the assignment game from our three games, making our task different from that of [Vaz24]. As is usual in the presence of intractability, we resort to defining the Owen set for each game and algorithmically relating it to the set of optimal dual solutions of the underlying combinatorial problem. We then give polynomial-time algorithms for finding equitable imputations in the Owen set. The motivation for this work is twofold: the emergence of automated decision-making, with a special emphasis on fairness, and the plethora of industrial applications of our three games.
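For readers unfamiliar with the solution concept, the standard definition of the core of a profit-sharing game is the following (notation ours, not necessarily the paper's):

```latex
% Given an agent set $N$ and a profit function $v : 2^N \to \mathbb{R}_{\ge 0}$,
% an imputation $y \in \mathbb{R}^N$ is in the core if
\[
\sum_{i \in N} y_i = v(N)
\qquad \text{and} \qquad
\sum_{i \in S} y_i \ge v(S) \quad \text{for all } S \subseteq N,
\]
% i.e., the total profit is fully distributed, and no coalition $S$ can gain
% by seceding and keeping the profit it can generate on its own.
```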
Publications
On the Existence and Complexity of Core-Stable Data Exchanges
- Jiaxin Song, Pooja Kulkarni, Parnian Shahkar, Bhaskar Ray Chaudhury
- NeurIPS 2025: The Thirty-Ninth Annual Conference on Neural Information Processing Systems
Abstract
The rapid growth of data-driven technologies, coupled with the widespread distribution of high-quality data across various organizations, has made the development of efficient data exchange protocols increasingly important. However, agents must navigate the delicate balance between acquiring valuable data and bearing the costs of sharing their own. Ensuring stability in these exchanges is essential to prevent agents — or groups of agents — from departing and conducting local exchanges independently. To address this, we study a model where n agents participate in a data exchange. Each agent has an associated concave payoff function and convex cost function — a setting typical in domains such as PAC learning and random discovery models. The net utility of an agent is its payoff minus its cost. We study the classical notion of core-stability. An exchange is core-stable if no subset of agents has any incentive to deviate to a different exchange. This notion also guarantees individual rationality and Pareto optimality. Our key contributions are: 1- Existence and Computation. Modeling the data exchange as an n-person game, we prove the game is balanced, thereby guaranteeing the existence of core-stable exchanges. This approach also naturally gives us a pivoting algorithm via Scarf's theorem (Scarf, 1967). Further, our empirical results show that the algorithm works well in practice. 2- Complexity. We prove that computing a core-stable exchange is PPAD-hard, even when the potential blocking coalitions are restricted to a constant number of agents. To the best of our knowledge, this is the first PPAD-hardness result for problems seeking core-like guarantees in data economies. We further show that relaxing either the concavity of the payoff function or the convexity of the cost function can lead to settings where core-stable exchanges may not exist. Additionally, given an arbitrary instance, determining whether a core-stable data exchange exists is NP-hard. Together, these findings delineate the existential and computational boundaries of achieving core stability in data exchange economies.
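As a concrete illustration of the stability notion (not the paper's pivoting algorithm), the following sketch brute-forces core-stable exchanges on a tiny discretized instance; the square-root payoff, quadratic cost, grid, and coefficients are all assumptions.

```python
# Toy brute-force core-stability check (assumed instance). Agent i shares a
# data level x_i from a grid; within a coalition S her net utility is a
# concave payoff on data received from other members minus a convex cost:
#   u_i(x, S) = sqrt(sum_{j in S, j != i} x_j) - c_i * x_i**2
from itertools import product, combinations
from math import sqrt

n = 3
grid = [0.0, 0.5, 1.0, 1.5, 2.0]     # discretized sharing levels
c = [0.20, 0.25, 0.30]               # per-agent cost coefficients (assumed)

def utility(i, x, S):
    received = sum(x[j] for j in S if j != i)
    return sqrt(received) - c[i] * x[i] ** 2

def blocked(x):
    """Strict blocking: some coalition S has its own exchange in which every
    member is strictly better off than under the grand exchange x."""
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            for levels in product(grid, repeat=len(S)):
                dev = dict(zip(S, levels))
                if all(utility(i, dev, S) > utility(i, x, range(n)) for i in S):
                    return True
    return False

core_stable = [x for x in product(grid, repeat=n) if not blocked(x)]
print(len(core_stable), "core-stable exchanges on the grid; e.g.:",
      core_stable[0] if core_stable else None)
```

Blocking by singletons enforces individual rationality and blocking by the grand coalition enforces Pareto optimality, mirroring the guarantees noted in the abstract.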
The Complexity of Finding Local Optima in Contrastive Learning
- Jingming Yan, Yiyuan Luo, Vaggos Chatziafratis, Ioannis Panageas, Parnian Shahkar, Stelios Stavroulakis
- NeurIPS 2025: The Thirty-Ninth Annual Conference on Neural Information Processing Systems
Abstract
Contrastive learning is a powerful technique for discovering meaningful data representations by optimizing objectives based on contrastive information, often given as a set of weighted triplets indicating that an "anchor" is more similar to a "positive" example than to a "negative" example. The goal is to find representations (e.g., embeddings or a tree metric) in which anchors are placed closer to positive than to negative examples. While finding global optima of contrastive objectives is NP-hard, the complexity of finding local optima, i.e., representations that cannot be improved by local search algorithms such as gradient-based methods, remains open. Our work settles the complexity of finding local optima in various contrastive learning problems by proving PLS-hardness in discrete settings (e.g., maximizing the number of satisfied triplets) and CLS-hardness in continuous settings (e.g., minimizing the triplet loss), where PLS (Polynomial Local Search) and CLS (Continuous Local Search) are well-studied complexity classes capturing local search dynamics in discrete and continuous optimization, respectively. Our results imply that no polynomial-time algorithm (local search or otherwise) can find a local optimum for various contrastive learning problems, unless PLS is in P (or CLS is in P for continuous problems). Even in the unlikely scenario that PLS is in P (or CLS is in P), our reductions imply that there exist instances on which local search algorithms need exponential time to reach a local optimum, even for d = 1 (embeddings on a line).
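For reference, the continuous objective mentioned above has the following standard form (the notation and margin parameter are ours):

```latex
% Given weighted triplets $(a, p, n)$ demanding that anchor $a$ sit closer to
% the positive $p$ than to the negative $n$, the triplet loss of an embedding
% $f$ with margin $\alpha \ge 0$ is
\[
\mathcal{L}(f) \;=\; \sum_{(a,p,n)} w_{apn}\,
\max\bigl(0,\; \|f(a) - f(p)\|^2 - \|f(a) - f(n)\|^2 + \alpha\bigr),
\]
% while the discrete variant asks to maximize the total weight of triplets
% satisfying $\|f(a) - f(p)\| < \|f(a) - f(n)\|$.
```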
Online Fair Division: Towards Ex-Post Constant MMS Guarantees
- With Pooja Kulkarni and Ruta Mehta
- EC 2025: the 26th ACM Conference on Economics and Computation
Abstract
We investigate the problem of fairly allocating m indivisible items among n sequentially arriving agents with additive valuations, under the sought-after fairness notion of maximin share (MMS). We first observe a strong impossibility: without appropriate knowledge about the valuation functions of the incoming agents, no online algorithm can ensure any non-trivial MMS approximation, even when there are only two agents. Motivated by this impossibility, we introduce OnlineKTypeFD (online k-type fair division), a model that balances theoretical tractability with real-world applicability. In this model, each arriving agent belongs to one of k types, with all agents of a given type sharing the same known valuation function. We do not constrain k to be a constant. Upon arrival, an agent reveals her type, receives an irrevocable allocation, and departs. We study the ex-post MMS guarantees of online algorithms under two arrival models: 1- Adversarial arrivals: In this model, an adversary determines the type of each arriving agent. We design a 1/k-MMS competitive algorithm and complement it with a lower bound, ruling out any Ω(1 / √k)-MMS-competitive algorithm, even for binary valuations. 2- Stochastic arrivals: In this model, the type of each arriving agent is independently drawn from an underlying, possibly unknown distribution. Unlike the adversarial setting where the dependence on k is unavoidable, we surprisingly show that in the stochastic setting, an asymptotic, arbitrarily close-to-0.5-MMS competitive guarantee is achievable under mild distributional assumptions. Our results extend naturally to a learning-augmented framework; when given access to predictions about valuation functions, we show that the competitive ratios of our algorithms degrade gracefully with multiplicative prediction errors.
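For reference, the maximin share underlying these guarantees is defined as follows (standard definition, notation ours):

```latex
% With $n$ agents and a set $M$ of goods, agent $i$'s maximin share is the
% best worst-bundle value she can secure by partitioning the goods herself:
\[
\mathrm{MMS}_i \;=\; \max_{(A_1, \dots, A_n) \in \Pi_n(M)} \;\; \min_{j \in [n]} v_i(A_j),
\]
% where $\Pi_n(M)$ is the set of $n$-way partitions of $M$. An algorithm is
% $\alpha$-MMS competitive if every agent $i$ always receives a bundle worth
% at least $\alpha \cdot \mathrm{MMS}_i$ to her.
```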
Reciprocally Fair Federated Learning
- With Aniket Murhekar, Jiaxin Song, Bhaskar Ray Chaudhury, and Ruta Mehta
- ICML 2025: Forty-Second International Conference on Machine Learning
Abstract
Federated learning (FL) is a popular collaborative learning paradigm, where agents with individual datasets can jointly train a machine learning (ML) model. While higher data sharing improves model accuracy and leads to higher payoffs, it also raises costs associated with data acquisition or loss of privacy, causing agents to be strategic about their data contribution. This leads to undesirable behavior at a Nash equilibrium (NE), such as free-riding, resulting in sub-optimal fairness, data sharing, and welfare. To address this, we design M_Shap, a budget-balanced payment mechanism for FL that admits Nash equilibria under mild conditions and achieves reciprocal fairness, where each agent's payoff equals their contribution to the collaboration, as measured by the Shapley share. In addition to fairness, we show that the Nash equilibrium under M_Shap has desirable guarantees in terms of accuracy, welfare, and total data collected. We validate our theoretical results through experiments, demonstrating that M_Shap outperforms baselines in terms of fairness and efficiency.
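The Shapley share that M_Shap pays out can be computed exactly for small games; the sketch below shows only the standard permutation-based Shapley computation with an assumed value function, not the M_Shap mechanism itself.

```python
# Exact Shapley values for a small cooperative game (background sketch; the
# value function here is an assumed concave "accuracy" proxy, not the paper's).
from itertools import permutations
from math import factorial, sqrt

data = [10, 40, 90]                    # assumed per-agent data contributions

def value(S):
    """Assumed collaboration value: concave in the total data pooled."""
    return sqrt(sum(data[i] for i in S))

n = len(data)
shapley = [0.0] * n
for order in permutations(range(n)):   # average marginal contributions
    seen = []
    for i in order:
        shapley[i] += value(seen + [i]) - value(seen)
        seen.append(i)
shapley = [s / factorial(n) for s in shapley]

print("Shapley shares:", [round(s, 3) for s in shapley])
# Shapley values are budget-balanced: the shares sum to v(N).
print(round(sum(shapley), 3), "==", round(value(range(n)), 3))
```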
Improved MMS Approximations for Few Agent Types
- With Jugal Garg
- IJCAI 2025: The 34th International Joint Conference on Artificial Intelligence
Abstract
We study fair division of indivisible goods under the maximin share (MMS) fairness criterion in settings where agents are grouped into a small number of types, with agents within each type having identical valuations. For the special case of a single type, an exact MMS allocation is always guaranteed to exist. However, for two or more distinct agent types, exact MMS allocations do not always exist, shifting the focus to establishing the existence of approximate-MMS allocations. A series of works over the last decade has resulted in the best-known approximation guarantee of 3/4+3/3836. In this paper, we improve the approximation guarantees for settings where agents are grouped into two or three types, a scenario that arises in many practical settings. Specifically, we present novel algorithms that guarantee a 4/5-MMS allocation for two agent types and a 16/21-MMS allocation for three agent types. Our approach leverages the MMS partition of the majority type and adapts it to provide improved fairness guarantees for all types.
Welfare Approximation in Additively Separable Hedonic Games
- With Martin Bullinger and Vaggos Chatziafratis
- AAMAS 2025: In Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems
Abstract
Partitioning a set of n items or agents while maximizing the value of the partition is a fundamental algorithmic task. We study this problem in the specific setting of maximizing social welfare in additively separable hedonic games. Unfortunately, this task faces strong computational barriers: extending previous results, we show that approximating welfare within a factor of n^(1 - epsilon) is NP-hard, even for severely restricted weights. However, we obtain a randomized O(log n)-approximation on instances for which the sum of input valuations is nonnegative. Finally, we study two stochastic models of aversion-to-enemies games, where the weights are derived from Erdős-Rényi or multipartite graphs, and obtain constant-factor and logarithmic-factor approximations with high probability.
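For reference, the objective in question is the following (standard setup, notation ours):

```latex
% In an additively separable hedonic game with weights $w(i, j)$, agent $i$'s
% utility in the coalition $\pi(i)$ she is assigned by a partition $\pi$ is
\[
u_i(\pi) \;=\; \sum_{j \in \pi(i),\, j \neq i} w(i, j),
\qquad
\mathrm{SW}(\pi) \;=\; \sum_{i} u_i(\pi),
\]
% and the task is to find a partition $\pi$ maximizing the social welfare
% $\mathrm{SW}(\pi)$.
```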
Robust Popular Matchings
- With Martin Bullinger and Rohith Reddy Gangam
- AAMAS 2024: In Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
Abstract
We study popularity for matchings under preferences. This solution concept captures matchings that do not lose against any other matching in a majority vote by the agents. A popular matching is said to be robust if it is popular in every one of a given set of instances. We present a polynomial-time algorithm for deciding whether a robust popular matching exists when the instances differ only in the preferences of a single agent, and we show NP-completeness when two instances differ only by a downward shift of one alternative by four agents. Moreover, we establish a complexity dichotomy based on preference completeness for the case where instances differ by making some options unavailable.
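A minimal sketch of the pairwise majority comparison underlying popularity, on a toy instance with assumed preference lists (the paper's algorithms address the general decision problems, not this brute-force check):

```python
# Toy popularity comparison (assumed instance). prefs[a] lists agent a's
# acceptable partners from most to least preferred; any listed partner beats
# being unmatched (None). A matching M is popular iff no other matching M'
# is preferred by a strict majority of agents in this head-to-head vote.
prefs = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

def rank(a, partner):
    """Smaller is better; being unmatched ranks worst."""
    return prefs[a].index(partner) if partner in prefs[a] else len(prefs[a])

def vote(M1, M2):
    """Head-to-head counts: (# agents preferring M1, # preferring M2)."""
    pro1 = sum(1 for a in prefs if rank(a, M1.get(a)) < rank(a, M2.get(a)))
    pro2 = sum(1 for a in prefs if rank(a, M2.get(a)) < rank(a, M1.get(a)))
    return pro1, pro2

M1 = {1: 2, 2: 1}            # match agents 1 and 2; agent 3 unmatched
M2 = {1: 3, 3: 1}            # match agents 1 and 3; agent 2 unmatched
print(vote(M1, M2))          # -> (2, 1): M1 wins the majority vote here
```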
