Abstract
This paper considers the distributed bandit convex optimization problem with time-varying inequality constraints over a network of agents, where the goal is to minimize network regret and cumulative constraint violation. Existing distributed online algorithms for this problem require each agent to broadcast its decision to its neighbors at every iteration, yet communication resources are often limited. To make better use of these resources, we propose a distributed event-triggered online primal-dual algorithm with two-point bandit feedback. Under several classes of appropriately chosen decreasing parameter sequences and non-increasing event-triggered threshold sequences, we establish dynamic network regret and network cumulative constraint violation bounds. These bounds are comparable to those achieved by distributed event-triggered online algorithms with full-information feedback. Finally, a numerical example is provided to verify the theoretical results.
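The abstract names two ingredients, two-point bandit feedback and event-triggered broadcasting, without stating their form. The sketch below illustrates one common version of each and is not taken from the paper: the symmetric two-point estimator, the norm-based trigger rule, and all names (`two_point_gradient`, `should_broadcast`, `delta`, `threshold`) are assumptions made for illustration only.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def two_point_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Assumed form (a standard symmetric two-point estimator, not
    necessarily the one used in the paper):
        g = dim / (2 * delta) * (f(x + delta*u) - f(x - delta*u)) * u
    Returns the estimate and the sampled direction u.
    """
    dim = len(x)
    u = sample_unit_sphere(dim, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = dim * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u], u


def should_broadcast(x, last_sent, threshold):
    """Hypothetical trigger rule: broadcast only when the current decision
    has drifted at least `threshold` (Euclidean norm) from the last one sent."""
    gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, last_sent)))
    return gap >= threshold
```

With a non-increasing threshold sequence, a rule of this kind lets an agent stay silent while its decision changes little, which is the communication saving the abstract refers to.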
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 2242-2253 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Control of Network Systems |
| Volume | 12 |
| Issue number | 3 |
| Early online date | 8-Apr-2025 |
| Publication status | Published - Sept-2025 |
Keywords
- Bandit convex optimization
- cumulative constraint violation
- distributed optimization
- event-triggered algorithm
- time-varying constraints