Trey Smith's Publications

Focused Real-Time Dynamic Programming for MDPs: Squeezing More Out of a Heuristic.

Trey Smith and Reid G. Simmons. In Proc. Nat. Conf. on Artificial Intelligence (AAAI), 2006.

Abstract

Real-time dynamic programming (RTDP) is a heuristic search algorithm for solving MDPs. We present a modified algorithm called Focused RTDP with several improvements. While RTDP maintains only an upper bound on the long-term reward function, FRTDP maintains two-sided bounds and bases the output policy on the lower bound. FRTDP guides search with a new rule for outcome selection, focusing on parts of the search graph that contribute most to uncertainty about the values of good policies. FRTDP has modified trial termination criteria that should allow it to solve some problems (within $\epsilon$) that RTDP cannot. Experiments show that for all the problems we studied, FRTDP significantly outperforms RTDP and LRTDP, and converges with up to six times fewer backups than the state-of-the-art HDP algorithm.
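The ideas in the abstract (two-sided bounds, outcome selection driven by uncertainty, and an output policy read from the lower bound) can be illustrated with a minimal Python sketch. This is not the authors' FRTDP implementation. The toy MDP `TOY`, the discount `GAMMA`, the helper names, the uninformed initial bounds, and the simple rules used here (pick the successor with the largest probability-weighted bound gap, and stop a trial at a fixed depth or once the gap falls below `eps`) are all assumptions chosen for illustration; the paper's actual priority and termination rules may differ.

```python
from typing import Dict, List, Tuple

Outcomes = List[Tuple[float, str]]               # (probability, next state)
MDP = Dict[str, Dict[str, Tuple[float, Outcomes]]]  # state -> action -> (reward, outcomes)
GAMMA = 0.95                                     # assumed discount factor

# Hypothetical toy problem; states with no actions are terminal (value 0).
TOY: MDP = {
    "s0": {"safe": (0.0, [(1.0, "s1")]),
           "risky": (0.0, [(0.5, "goal"), (0.5, "pit")])},
    "s1": {"go": (-1.0, [(1.0, "goal")])},
    "pit": {"climb": (-5.0, [(1.0, "s0")])},
    "goal": {},
}

def q(mdp: MDP, v: Dict[str, float], s: str, a: str) -> float:
    """One-step lookahead value of action a in state s under bound v."""
    r, outs = mdp[s][a]
    return r + GAMMA * sum(p * v[s2] for p, s2 in outs)

def backup(mdp: MDP, lo: Dict[str, float], hi: Dict[str, float], s: str) -> None:
    """Bellman backup applied to both the lower and the upper bound."""
    lo[s] = max(q(mdp, lo, s, a) for a in mdp[s])
    hi[s] = max(q(mdp, hi, s, a) for a in mdp[s])

def trial(mdp, lo, hi, start, eps, max_depth) -> None:
    s, visited = start, []
    for _ in range(max_depth):
        # Simplified termination: terminal state or bounds already tight.
        if not mdp[s] or hi[s] - lo[s] < eps:
            break
        backup(mdp, lo, hi, s)
        visited.append(s)
        # Action selection: greedy with respect to the upper bound.
        a = max(mdp[s], key=lambda act: q(mdp, hi, s, act))
        # Outcome selection (assumed rule): largest probability-weighted gap.
        _, outs = mdp[s][a]
        s = max(outs, key=lambda o: o[0] * (hi[o[1]] - lo[o[1]]))[1]
    for s in reversed(visited):                  # propagate improvements back
        backup(mdp, lo, hi, s)

def solve(mdp: MDP, start: str, eps=1e-3, max_trials=1000, max_depth=50):
    rewards = [r for acts in mdp.values() for r, _ in acts.values()]
    # Uninformed bounds: worst/best per-step reward repeated forever.
    lo = {s: min(0.0, min(rewards)) / (1 - GAMMA) if mdp[s] else 0.0 for s in mdp}
    hi = {s: max(0.0, max(rewards)) / (1 - GAMMA) if mdp[s] else 0.0 for s in mdp}
    for _ in range(max_trials):
        if hi[start] - lo[start] < eps:
            break
        trial(mdp, lo, hi, start, eps, max_depth)
    # Output policy is greedy with respect to the LOWER bound.
    policy = {s: max(mdp[s], key=lambda a: q(mdp, lo, s, a)) for s in mdp if mdp[s]}
    return policy, lo, hi

policy, lo, hi = solve(TOY, "s0")
print(policy["s0"], lo["s0"], hi["s0"])
```

Reading the policy from the lower bound means the value it reports at a state is a guarantee rather than an optimistic estimate, and the gap `hi[start] - lo[start]` gives a direct stopping test for convergence within `eps`.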

BibTeX Entry

@InProceedings{smith06:frtdp,
  author = 	 {Trey Smith and Reid G. Simmons},
  title = 	 {Focused Real-Time Dynamic Programming for {MDPs}: Squeezing More Out of a Heuristic},
  booktitle =	 {Proc. Nat. Conf. on Artificial Intelligence (AAAI)},
  year =	 2006,
}

Last modified: Fri May 17, 2013 12:44:57