Digital Library
 
 
  On-policy concurrent reinforcement learning
 
 
Title: On-policy concurrent reinforcement learning
Authors: Banerjee, Bikramjit; Sen, Sandip; Peng, Jing
Published in: Journal of experimental & theoretical artificial intelligence
Pagination: Volume 16 (2004), no. 4, pages 245-260
Year: 2004-10
Abstract: When an agent learns in a multi-agent environment, the payoff it receives depends on the behaviour of the other agents. If the other agents are also learning, its reward distribution becomes non-stationary. This makes learning in multi-agent systems more difficult than single-agent learning. Prior attempts at value-function based learning in such domains have used off-policy Q-learning, which does not scale well, as the cornerstone, with limited success. This paper studies on-policy modifications of such algorithms, with the promise of scalability and efficiency. In particular, it is proven that these hybrid techniques are guaranteed to converge to their desired fixed points under some restrictions. It is also shown experimentally that the new techniques can learn (from self-play) better policies than the previous algorithms (also in self-play) during some phases of the exploration.
Publisher: Taylor & Francis
Source database: Elektronische Wetenschappelijke Tijdschriften (Electronic Scientific Journals)
 
 

 
 Koninklijke Bibliotheek - National Library of the Netherlands