

Abstract
Intelligent behavior in both biological and artificial systems is adaptive: it flexibly responds to changing external conditions in ways that are sensitive to the system’s internal states and goals. Except for single-cell organisms, adaptive systems typically consist of multiple interacting components. This raises a central question: how can components coordinate their actions for the benefit of the system as a whole, especially in dynamic environments where optimal strategies may shift over time?
In learning contexts, this challenge is known as the structural credit assignment problem: the difficulty of identifying how individual actions contribute to collective outcomes. In this talk, I present ongoing work using a multi-agent multi-armed bandit (MAMAB) task in a dynamic environment with collective rewards. The aim is to explore how effective coordination can emerge from simple cognitive and social mechanisms, using both agent-based simulations and human experiments.
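To make the task structure concrete, here is a minimal sketch of what a MAMAB simulation with collective rewards might look like. It is an illustration under stated assumptions, not the setup used in the work presented: the drifting arm means, the averaging rule for the collective reward, the epsilon-greedy learners, and all names and parameter values (`DriftingBandit`, `EpsilonGreedyAgent`, `run`, `drift_sd`, and so on) are hypothetical choices made for this example.

```python
import random


class DriftingBandit:
    """K-armed bandit whose arm means follow a clipped random walk (assumed dynamics)."""

    def __init__(self, n_arms, drift_sd, rng):
        self.rng = rng
        self.drift_sd = drift_sd
        self.means = [rng.random() for _ in range(n_arms)]

    def step(self, choices):
        # Each agent's pull yields a noisy payoff from its chosen arm ...
        payoffs = [self.rng.gauss(self.means[arm], 0.1) for arm in choices]
        # ... but only the team average is revealed (assumed collective reward rule).
        collective = sum(payoffs) / len(payoffs)
        # Arm means drift after every round, so the best arm can change over time.
        self.means = [
            min(1.0, max(0.0, m + self.rng.gauss(0.0, self.drift_sd)))
            for m in self.means
        ]
        return collective


class EpsilonGreedyAgent:
    """Simple learner that credits its own arm with the full collective reward."""

    def __init__(self, n_arms, epsilon, step_size, rng):
        self.q = [0.0] * n_arms
        self.epsilon = epsilon
        self.step_size = step_size
        self.rng = rng

    def choose(self):
        # Explore with probability epsilon, otherwise pick a highest-valued arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        return self.rng.choice([i for i, v in enumerate(self.q) if v == best])

    def update(self, arm, reward):
        # A constant step size keeps tracking estimates as the environment drifts.
        self.q[arm] += self.step_size * (reward - self.q[arm])


def run(n_agents=4, n_arms=3, n_steps=500, seed=0):
    """Simulate the team and return the collective reward obtained in each round."""
    rng = random.Random(seed)
    env = DriftingBandit(n_arms, drift_sd=0.02, rng=rng)
    agents = [EpsilonGreedyAgent(n_arms, 0.1, 0.1, rng) for _ in range(n_agents)]
    history = []
    for _ in range(n_steps):
        choices = [agent.choose() for agent in agents]
        reward = env.step(choices)
        # Every agent sees the same team reward: the credit assignment problem.
        for agent, arm in zip(agents, choices):
            agent.update(arm, reward)
        history.append(reward)
    return history


if __name__ == "__main__":
    rewards = run()
    print(sum(rewards) / len(rewards))
```

Because each agent updates its value estimate with the shared team reward, an agent cannot tell whether a good or bad outcome was due to its own choice or to its teammates' choices, which is the structural credit assignment problem in miniature.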