Selling Joint Ads: A Regret Minimization Perspective

1. Research Problem

The core issue explored in this paper is the design of a revenue-maximizing incentive-compatible mechanism for selling a non-excludable good, such as an advertisement slot, to two cooperative bidders (e.g., a merchant and a brand). This problem captures scenarios where two parties jointly bid in an auction and both benefit from the ad being displayed. The mechanism collects bids from both parties and decides on the allocation and payments. This gives rise to intricate incentive compatibility constraints, such as how to split payments between the two parties. The paper tackles the problem of finding a revenue-maximizing incentive-compatible mechanism from an online learning perspective, which poses significant technical challenges due to the large action space and the highly irregular function mapping mechanisms to revenue.
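To make the interaction concrete, here is a minimal sketch of one simple mechanism with this interface: it quotes a personal price to each party, allocates the slot only if both bids clear their prices, and splits the payment accordingly. This is only an illustration of the bids-in, allocation-and-payments-out interface, not the mechanism class analyzed in the paper, and the names (`JointAdMechanism`, `p1`, `p2`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    allocated: bool          # is the ad slot displayed?
    payment_merchant: float  # how the total payment is split
    payment_brand: float

@dataclass
class JointAdMechanism:
    """Toy posted-price mechanism for a non-excludable ad slot.

    Each party is quoted a personal price; the slot is allocated only if
    both bids clear their prices, and then each party pays its own quote.
    Bidding truthfully is a dominant strategy here, and truthful bidders
    never pay more than their value (individual rationality)."""
    p1: float  # price quoted to the merchant
    p2: float  # price quoted to the brand

    def run(self, bid_merchant: float, bid_brand: float) -> Outcome:
        if bid_merchant >= self.p1 and bid_brand >= self.p2:
            return Outcome(True, self.p1, self.p2)
        return Outcome(False, 0.0, 0.0)

# One round: values 0.8 and 0.5 against prices 0.6 and 0.4 -> allocated, revenue 1.0.
print(JointAdMechanism(p1=0.6, p2=0.4).run(0.8, 0.5))
```

In the online learning formulation discussed below, an object of this kind is what the learner commits to in each round, before that round's valuations are revealed.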

2. Object of Study

The object of study is mechanism design for selling joint advertisements in the online retail sector, focusing on scenarios in which multiple bidders jointly bid for a single, non-excludable item. The paper is particularly concerned with the complexities that arise in this auction-design problem, including how to divide payments among the bidders and how to ensure that the mechanism maximizes revenue while satisfying incentive compatibility and individual rationality.
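For reference, one standard way to formalize the two criteria just mentioned is shown below for a direct mechanism that maps reported values $(b_1, b_2)$ to an allocation probability $x(b_1, b_2)$ and payments $p_1(b_1, b_2)$ and $p_2(b_1, b_2)$; the notation is illustrative and not taken from the paper.

```latex
% Dominant-strategy incentive compatibility for bidder 1 (symmetric for bidder 2):
% reporting the true value v_1 is never worse than any misreport b_1.
\[
  v_1\, x(v_1, b_2) - p_1(v_1, b_2) \;\ge\; v_1\, x(b_1, b_2) - p_1(b_1, b_2)
  \quad \text{for all } v_1, b_1, b_2 .
\]
% Individual rationality: a truthful bidder never ends up with negative utility.
\[
  v_1\, x(v_1, b_2) - p_1(v_1, b_2) \;\ge\; 0
  \quad \text{for all } v_1, b_2 .
\]
```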

3. Techniques Used

The paper employs an online learning approach to the challenge of identifying a revenue-maximizing incentive-compatible mechanism. The authors propose an efficient learning algorithm based on an adaptive discretization scheme of the mechanism space, as any non-adaptive discretization fails to achieve sublinear regret. In the stochastic setting, where agents’ valuations are drawn from a fixed but unknown distribution, this algorithm achieves a regret bound of $\tilde{O}(n^{3/4})$.

In the adversarial setting, where the sequence of valuations is generated arbitrarily upfront, the authors exploit the non-Lipschitzness of the problem to prove a strong negative result: no learning algorithm can achieve more than half of the revenue of the best fixed mechanism in hindsight.

They also consider the $\pi$-smooth adversary setting, where valuations are randomly generated from smooth distributions but, unlike in the stochastic case, possibly in a non-stationary way. For this setting, they construct an efficient learning algorithm that achieves a regret bound of $\tilde{O}(n^{2/3})$ and is based on a succinct encoding of exponentially many experts. Finally, they prove that no learning algorithm can achieve less than $\Omega(\sqrt{n})$ regret in either the stochastic or the smooth setting, thus narrowing the range in which the minimax regret rates for these two problems lie.
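To pin down the benchmark, the regret after $n$ rounds can be written (in notation of our own, not the paper's) as $\max_{M} \sum_{t=1}^{n} \mathrm{Rev}(M, v_t) - \mathbb{E}\big[\sum_{t=1}^{n} \mathrm{Rev}(M_t, v_t)\big]$, where $M_t$ is the mechanism posted in round $t$ and the maximum ranges over fixed mechanisms. The sketch below illustrates only the generic experts building block: plain exponential weights (Hedge) over a small, fixed grid of two-price posted mechanisms with full-information feedback. It is deliberately simpler than the paper's algorithms, which rely on adaptive discretization (stochastic case) or a succinct encoding of exponentially many experts (smooth case) precisely because a fixed grid like this one does not suffice in general; all function and variable names are ours.

```python
import math
import random

def revenue(prices, values):
    """Revenue of a two-price posted mechanism under truthful bids:
    the slot is sold only if both values clear their prices."""
    (p1, p2), (v1, v2) = prices, values
    return p1 + p2 if v1 >= p1 and v2 >= p2 else 0.0

def hedge(mechanisms, valuations, eta=0.05):
    """Exponential weights (Hedge) over a finite grid of mechanisms,
    assuming full-information feedback on each round's valuations."""
    k = len(mechanisms)
    weights = [1.0] * k
    alg_revenue = 0.0
    expert_revenue = [0.0] * k
    for values in valuations:
        total = sum(weights)
        probs = [w / total for w in weights]
        i = random.choices(range(k), probs)[0]  # sample a mechanism to post this round
        alg_revenue += revenue(mechanisms[i], values)
        rewards = [revenue(m, values) for m in mechanisms]
        expert_revenue = [c + r for c, r in zip(expert_revenue, rewards)]
        weights = [w * math.exp(eta * r) for w, r in zip(weights, rewards)]
        top = max(weights)                      # renormalize to avoid overflow
        weights = [w / top for w in weights]
    return max(expert_revenue) - alg_revenue    # regret vs. best fixed mechanism in the grid

# A coarse price grid and i.i.d. uniform valuations, purely for illustration;
# the paper's point is that such non-adaptive grids are not sufficient in general.
grid = [(i / 5, j / 5) for i in range(1, 5) for j in range(1, 5)]
vals = [(random.random(), random.random()) for _ in range(2000)]
print("empirical regret:", hedge(grid, vals))
```

With $K$ experts and per-round rewards in $[0, R]$, Hedge with a suitable learning rate guarantees regret $O(R\sqrt{n \log K})$ against the best expert in the grid; the difficulty addressed in the paper is that the mechanism space is continuous and the revenue function is highly irregular, so a naive grid of experts like the one above is not enough.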