Search results
Filter
1328 results
Sort by:
SSRN
Multi-objective Reinforcement Learning
In: http://hdl.handle.net/10630/5909
In this talk we present PQ-learning, a new Reinforcement Learning (RL) algorithm that determines the rational behaviours of an agent in multi-objective domains. This work is partially funded by grant TIN2009-14179 (Spanish Government, Plan Nacional de I+D+i) and by Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. Manuela Ruiz-Montiel is funded by the Spanish Ministry of Education through the National F.P.U. Program.
BASE
Reinforcement learning in repeated portfolio decisions
How do people make investment decisions when they receive outcome feedback? We examined how well the standard mean-variance model and two reinforcement models predict people's portfolio decisions. The basic reinforcement model predicts a learning process that relies solely on the portfolio's overall return, whereas the proposed extended reinforcement model also takes the risk and covariance of the investments into account. The experimental results illustrate that people reacted sensitively to different correlation structures of the investment alternatives, which was best predicted by the extended reinforcement model. The results illustrate that simple reinforcement learning is sufficient to detect correlation between investments.
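The basic reinforcement model described in the abstract, which relies solely on the portfolio's overall return, can be sketched as a propensity-updating rule in the style of Erev-Roth reinforcement learning. This is an illustrative sketch only, not the paper's actual specification; the decay parameter and all values are assumptions:

```python
def update_propensities(propensities, chosen, payoff, phi=0.1):
    """Basic reinforcement: decay all propensities, then reinforce the
    chosen portfolio with its realized overall return."""
    return [
        (1 - phi) * p + (payoff if i == chosen else 0.0)
        for i, p in enumerate(propensities)
    ]

def choice_probabilities(propensities):
    """Choice probabilities proportional to propensities."""
    total = sum(propensities)
    return [p / total for p in propensities]

# Example: three portfolios, portfolio 1 just returned 5.0
props = update_propensities([2.0, 2.0, 2.0], chosen=1, payoff=5.0)
probs = choice_probabilities(props)
```

The extended model discussed in the paper would additionally condition the update on the risk and covariance of the investments; that refinement is not reproduced here.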
BASE
SSRN
Reinforcement Learning for CVA hedging
SSRN
Factor investing with reinforcement learning
SSRN
Working paper
Deep Reinforcement Learning Patents: An Empirical Survey
In: Brian Haney, Deep Reinforcement Learning Patents: An Empirical Survey, UCLA J. L. & Tech __ (2021).
SSRN
Working paper
Recent Advances in Reinforcement Learning in Finance
SSRN
SSRN
SSRN
Reinforcement learning for maritime communications
In: Wireless networks
This book demonstrates that the reliable and secure communication performance of maritime communications can be significantly improved by using intelligent reflecting surface (IRS) aided communication, privacy-aware Internet of Things (IoT) communications, intelligent resource management, and location privacy protection. In the IRS aided maritime communication system, the reflecting elements of the IRS can be intelligently controlled to change the phase of the signal, and so enhance the received signal strength of maritime ships (or sensors) or jam maritime eavesdroppers, as illustrated in this book. The power and spectrum resources in maritime communications can be jointly optimized to guarantee the quality of service (i.e., security and reliability requirements), and reinforcement learning is adopted to smartly choose the resource allocation strategy. Moreover, learning based privacy-aware offloading and location privacy protection are proposed to intelligently guarantee the privacy-preserving requirements of maritime ships (or sensors). Therefore, these communication schemes based on reinforcement learning algorithms can help maritime communication systems to improve information security, especially in dynamic and complex maritime environments. This timely book also provides broad coverage of maritime wireless communication issues such as reliability, security, resource management, and privacy protection, with reinforcement learning based methods applied to solve them. This book includes four rigorously refereed chapters from prominent international researchers working in this subject area. The material serves as a useful reference for researchers and graduate students. Practitioners seeking solutions to maritime wireless communication and security related issues will benefit from this book as well.
Reinforcement Learning in Repeated Interaction Games
In: The B.E. Journal of Theoretical Economics, Volume 1, Issue 1
ISSN: 1935-1704
Abstract
We study long run implications of reinforcement learning when two players repeatedly interact with one another over multiple rounds to play a finite action game. Within each round, the players play the game many successive times with a fixed set of aspirations used to evaluate payoff experiences as successes or failures. The probability weight on successful actions is increased, while failures result in players trying alternative actions in subsequent rounds. The learning rule is supplemented by small amounts of inertia and random perturbations to the states of players. Aspirations are adjusted across successive rounds on the basis of the discrepancy between the average payoff and aspirations in the most recently concluded round. We define and characterize pure steady states of this model, and establish convergence to these under appropriate conditions. Pure steady states are shown to be individually rational, and are either Pareto-efficient or a protected Nash equilibrium of the stage game. Conversely, any Pareto-efficient and strictly individually rational action pair, or any strict protected Nash equilibrium, constitutes a pure steady state, to which the process converges from non-negligible sets of initial aspirations. Applications to games of coordination, cooperation, oligopoly, and electoral competition are discussed.
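The learning rule described in the abstract — reinforcing actions whose payoffs exceed the current aspiration, and adjusting aspirations across rounds toward realized average payoffs — can be sketched schematically. This is a minimal illustration; the function names, step sizes, and update forms are assumptions, not taken from the paper:

```python
def reinforce(prob, success, step=0.1):
    """Increase the probability weight on an action judged a success;
    decrease it after a failure, shifting weight to alternative actions."""
    if success:
        return prob + step * (1.0 - prob)
    return prob * (1.0 - step)

def update_aspiration(aspiration, avg_payoff, weight=0.2):
    """Adjust the aspiration toward the average payoff realized in the
    most recently concluded round."""
    return aspiration + weight * (avg_payoff - aspiration)

# A payoff of 4.0 against an aspiration of 3.0 counts as a success:
p = reinforce(0.5, success=(4.0 > 3.0))
a = update_aspiration(3.0, avg_payoff=4.0)
```

The paper's full model additionally adds small amounts of inertia and random perturbations to the players' states, which the sketch omits.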
SSRN
Action Learning with Self-Organizing Maps and Reinforcement Learning
This doctoral thesis deals with the development of a function approximator and its application to methods for learning discrete and continuous actions: 1. A general function approximator – Locally Weighted Interpolating Growing Neural Gas (LWIGNG) – is developed from Growing Neural Gas (GNG).
The topological neighbourhood structure is used for calculating interpolations between neighbouring neurons and for applying a local weighting scheme. The capabilities of this method are shown in several experiments, with special considerations given to changing target functions and changing input distributions. 2. To learn discrete actions LWIGNG is combined with Q-Learning forming the Q-LWIGNG method. The underlying GNG-algorithm has to be changed to take care of the special order of the input data in action learning. Q-LWIGNG achieves very good results in experiments with the pole balancing and the mountain car problems, and good results with the acrobot problem. 3. To learn continuous actions a REINFORCE algorithm is combined with LWIGNG forming the ReinforceGNG method. An actor-critic architecture is used for learning from delayed rewards. LWIGNG approximates both the state-value function and the policy. The policy is given by the situation dependent parameters of a normal distribution. ReinforceGNG is applied successfully to learn continuous actions of a simulated 2-wheeled robot which has to intercept a rolling ball under certain conditions.
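The Q-LWIGNG combination rests on the standard Q-learning temporal-difference update, with LWIGNG supplying the Q-value estimates. A schematic sketch of that update (the LWIGNG approximator internals are not reproduced here, and all numeric values are assumed for illustration):

```python
def q_learning_target(reward, next_q_values, gamma=0.9):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(next_q_values)

def td_update(q, target, alpha=0.5):
    """Move the current Q-estimate toward the target; in Q-LWIGNG this
    correction would be applied through the LWIGNG approximator."""
    return q + alpha * (target - q)

# One update step with assumed values:
target = q_learning_target(reward=1.0, next_q_values=[0.0, 2.0])
new_q = td_update(q=1.0, target=target)
```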
BASE
Reinforcement Learning in Experimental Asset Markets
In: Eastern Economic Journal: EEJ, Volume 37, Issue 1, pp. 109-133
ISSN: 1939-4632