| No. | Title | Author | Journal | Year | Volume | Issue | Pages | Type |
|---|---|---|---|---|---|---|---|---|
| 1 | A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes | Michael Kearns | | 2002 | 49 | 2 | 193-208 (16 pp.) | article |
| 2 | Building a Basic Block Instruction Scheduler with Reinforcement Learning and Rollouts | Amy McGovern | | 2002 | 49 | 2 | 141-160 (20 pp.) | article |
| 3 | Continuous-Action Q-Learning | José del R. Millán | | 2002 | 49 | 2 | 247-265 (19 pp.) | article |
| 4 | Introduction | Satinder Singh | | 2002 | 49 | 2 | 107-109 (3 pp.) | article |
| 5 | Kernel-Based Reinforcement Learning | Dirk Ormoneit | | 2002 | 49 | 2 | 161-178 (18 pp.) | article |
| 6 | Near-Optimal Reinforcement Learning in Polynomial Time | Michael Kearns | | 2002 | 49 | 2 | 209-232 (24 pp.) | article |
| 7 | On Average Versus Discounted Reward Temporal-Difference Learning | John N. Tsitsiklis | | 2002 | 49 | 2 | 179-191 (13 pp.) | article |
| 8 | Reinforcement Learning for Call Admission Control and Routing under Quality of Service Constraints in Multimedia Networks | Hui Tong | | 2002 | 49 | 2 | 111-139 (29 pp.) | article |
| 9 | Risk-Sensitive Reinforcement Learning | Oliver Mihatsch | | 2002 | 49 | 2 | 267-290 (24 pp.) | article |
| 10 | Structure in the Space of Value Functions | David Foster | | 2002 | 49 | 2 | 325-346 (22 pp.) | article |
| 11 | Technical Update: Least-Squares Temporal Difference Learning | Justin A. Boyan | | 2002 | 49 | 2 | 233-246 (14 pp.) | article |
| 12 | Variable Resolution Discretization in Optimal Control | Rémi Munos | | 2002 | 49 | 2 | 291-323 (33 pp.) | article |