Home   >   CSC-OpenAccess Library   >    Manuscript Information
Full Text Available

(753.37KB)
This is an Open Access publication published under CSC-OpenAccess Policy.
Publications from CSC-OpenAccess Library are being accessed from over 74 countries worldwide.
Planning in Markov Stochastic Task Domains
Yong Lin, Fillia Makedon
Pages - 54 - 64     |    Revised - 30-08-2010     |    Published - 30-10-2010
Volume - 1   Issue - 3    |    Publication Date - October  Table of Contents
MORE INFORMATION
KEYWORDS
Markov decision processes, POMDP, task planning, uncertainty, decision-making
ABSTRACT
In decision theoretic planning, a challenge for Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs) is, many problem domains contain big state spaces and complex tasks, which will result in poor solution performance. We develop a task analysis and modeling (TAM) approach, in which the (PO)MDP model is separated into a task view and an action view. In the task view, TAM models the problem domain using a task equivalence model, with task-dependent abstract states and observations. We provide a learning algorithm to obtain the parameter values of task equivalence models. We present three typical examples to explain the TAM approach. Experimental results indicate our approach can greatly improve the computational capacity of task planning in Markov stochastic domains.
1 Google Scholar 
2 Academic Index 
3 CiteSeerX 
4 refSeek 
5 Scribd 
6 SlideShare 
7 PDFCAST 
8 PdfSR 
1 Boutilier, C., Dearden, R., & Goldszmidt, M. (1995). Exploiting structure in policy construction. In Proceedings of IJCAI, pp. 1104-1113.
2 Chang, A., & Amir, E. (2006). Goal achievement in partially known, partially observable domains. In Proceedings of ICAPS, pp. 203-211. AAAI.
3 Deák, F., Kovács, A., Váncza, J., & Dobrowiecki, T. P. (2001). Hierarchical knowledge-based process planning in manufacturing. In Proceedings of the IFIP 11 International PROLAMAT Conference on Digital Enterprise, pp. 428-439.
4 Dean, T., & hong Lin, S. (1995). Decomposition techniques for planning in stochastic domains. In Proceedings of IJCAI, pp. 1121-1127. Morgan Kaufmann.
5 Dietterich, T. G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
6 Givan, R., Dean, T., & Grieg, M. (2003). Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence, 147 (1-2), 163-223.
7 Hansen, E. A., & Zhou, R. (2003). Synthesis of hierarchical finite-state controllers for POMDPs. In Proceedings of ICAPS, pp. 113-122. AAAI.
8 Hsiao, K., Kaelbling, L. P., & Lozano-Pérez, T. (2007). Grasping POMDPs. In Proceedings of ICRA, pp. 4685-4692.
9 Kurniawati, H., Hsu, D., & Lee, W. S. (2008). Sarsop: Efficient point-based pomdp planning by approximating optimally reachable belief spaces. In Proceedings of Robotics Science and Systems.
10 Lekavý, M., & Návrat, P. (2007). Expressivity of STRIPS-like and HTN-like planning. In Agent and Multi-Agent Systems: Technologies and Applications, First KES International Symposium,Vol. 4496, pp. 121-130. Springer.
11 Littman, M. L., Cassandra, A. R., & Kaelbling, L. P. (1995). Learning policies for partially observable environments: scaling up. In Proceedings of ICML, pp. 362-370.
12 Pineau, J., Roy, N., & Thrun, S. (2001). A hierarchical approach to pomdp planning and execution. In Workshop on Hierarchy and Memory in Reinforcement Learning (ICML).
13 Potts, D., & Hengst, B. (2004). Discovering multiple levels of a task hierarchy concurrently. Robotics and Autonomous Systems, 49 (1-2), 43-55.
14 Poupart, P., & Boutilier, C. (2002). Value-directed compression of POMDPs. In Proceedings of NIPS, pp. 1547-1554.
15 Singh, S. P., & Cohn, D. (1997). How to dynamically merge markov decision processes. In Proceedings of NIPS.
16 Smith, T., & Simmons, R. G. (2004). Heuristic search value iteration for POMDPs. In Proceedings of UAI.
17 Smith, T., & Simmons, R. G. (2005). Point-based POMDP algorithms: Improved analysis and implementation. In Proceedings of UAI, pp. 542-547.
Mr. Yong Lin
University of Texas at Arlington - United States of America
bracelyn@gmail.com
Professor Fillia Makedon
- United States of America