| No. | Title | Author | Journal | Year | Vol. | Issue | Page(s) | Type |
|---|---|---|---|---|---|---|---|---|
| 1 | Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results | Mahadevan, Sridhar | | 1996 | 22 | 1-3 | p. 159-195 | article |
| 2 | Creating Advice-Taking Reinforcement Learners | Maclin, Richard | | 1996 | 22 | 1-3 | p. 251-281 | article |
| 3 | Editorial | Dietterich, Thomas G. | | 1996 | 22 | 1-3 | p. 5-6 | article |
| 4 | Efficient Reinforcement Learning through Symbiotic Evolution | Moriarty, David E. | | 1996 | 22 | 1-3 | p. 11-32 | article |
| 5 | Feature-Based Methods for Large Scale Dynamic Programming | Tsitsiklis, John N. | | 1996 | 22 | 1-3 | p. 59-94 | article |
| 6 | Incremental Multi-Step Q-Learning | Peng, Jing | | 1996 | 22 | 1-3 | p. 283-290 | article |
| 7 | Introduction | Kaelbling, Leslie Pack | | 1996 | 22 | 1-3 | p. 7-9 | article |
| 8 | Linear Least-Squares Algorithms for Temporal Difference Learning | Bradtke, Steven J. | | 1996 | 22 | 1-3 | p. 33-57 | article |
| 9 | On the Worst-Case Analysis of Temporal-Difference Learning Algorithms | Schapire, Robert E. | | 1996 | 22 | 1-3 | p. 95-121 | article |
| 10 | Reinforcement Learning with Replacing Eligibility Traces | Singh, Satinder P. | | 1996 | 22 | 1-3 | p. 123-158 | article |
| 11 | The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms | Koenig, Sven | | 1996 | 22 | 1-3 | p. 227-250 | article |
| 12 | The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks | Heger, Matthias | | 1996 | 22 | 1-3 | p. 197-225 | article |