Document Summary

Report ID:05-06-09
Initial Submission Date:2005-06-24
Title:A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning
Summary:The traditional Kalman filter can be viewed as a recursive stochastic algorithm that approximates an unknown function via a linear combination of prespecified basis functions given a sequence of noisy samples. In this paper, we generalize the algorithm to one that approximates the fixed point of an operator that is known to be a Euclidean norm contraction. Instead of noisy samples of the desired fixed point, the algorithm updates parameters based on noisy samples of functions generated by application of the operator, in the spirit of Robbins-Monro stochastic approximation. The algorithm is motivated by temporal-difference learning, and our developments lead to a possibly more efficient variant of temporal-difference learning. We establish convergence of the algorithm and explore efficiency gains through computational experiments involving optimal stopping and queueing problems.
Authors:Choi, David; Van Roy, Benjamin
Contact email:bvr@stanford.edu

Versions:

Version  Date        Accessible?  Download
1        2005-06-24  y            download
