[1]JIN Zhuo-jun, QIAN Hui, CHEN Shen-yi, et al. Survey of apprenticeship learning based on reward function learning[J]. CAAI Transactions on Intelligent Systems, 2009, 4(3): 208-212.

Survey of apprenticeship learning based on reward function learning

References:
[1]ATKESON C G, SCHAAL S. Robot learning from demonstration[C]//Proceedings of the Fourteenth International Conference on Machine Learning. Nashville, USA, 1997: 12-20.
[2]RATLIFF N D, BAGNELL J A, ZINKEVICH M A. Maximum margin planning[C]//Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, USA, 2006: 729-736.
[3]JIN Zhuojun, QIAN Hui, CHEN Shenyi, et al. Survey of apprenticeship learning based on reward function approximating[J]. Journal of Huazhong University of Science and Technology: Nature Science, 2008, 36(S1): 288-290, 294.
[4]NG A Y, RUSSELL S J. Algorithms for inverse reinforcement learning[C]//Proceedings of the Seventeenth International Conference on Machine Learning. San Francisco, USA, 2000: 663-670.
[5]ABBEEL P, NG A Y. Apprenticeship learning via inverse reinforcement learning[C]//Proceedings of the Twenty-first International Conference on Machine Learning. Banff, Canada, 2004: 1-8.
[6]KOLTER J Z, ABBEEL P, NG A Y. Hierarchical apprenticeship learning with application to quadruped locomotion[C]//Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2008.
[7]RATLIFF N, BAGNELL J A, ZINKEVICH M A. Subgradient methods for maximum margin structured learning[C]//Workshop on Learning in Structured Outputs Spaces at ICML. Pittsburgh, USA, 2006.
[8]SYED U, BOWLING M, SCHAPIRE R E. Apprenticeship learning using linear programming[C]//Proceedings of the 25th International Conference on Machine Learning (ICML 2008). Helsinki, Finland, 2008: 1032-1039.
[9]SYED U, SCHAPIRE R E. A game-theoretic approach to apprenticeship learning[C]//Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2008.
[10]GRIMES D B, RASHID D R, RAO R P N. Learning nonparametric models for probabilistic imitation[C]//Proceedings of Neural Information Processing Systems. Cambridge, USA: MIT Press, 2007: 521-528.
[11]ABBEEL P, COATES A, QUIGLEY M, et al. An application of reinforcement learning to aerobatic helicopter flight[C]//Proceedings of Neural Information Processing Systems. Cambridge, USA: MIT Press, 2007: 1-8.
[12]KOLTER J Z, RODGERS M P, NG A Y. A complete control architecture for quadruped locomotion over rough terrain[C]//IEEE International Conference on Robotics and Automation. Pasadena, USA, 2008: 811-818.
[13]REBULA J R, NEUHAUS P D, BONNLANDER B V, et al. A controller for the LittleDog quadruped walking on rough terrain[C]//2007 IEEE International Conference on Robotics and Automation. Roma, Italy, 2007: 1467-1473.
[14]KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
[15]SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, USA: MIT Press, 1998.
[16]COATES A, ABBEEL P, NG A Y. Reinforcement learning with multiple demonstrations[C]//The Twenty-first Annual Conference on Neural Information Processing Systems (NIPS 2007). Vancouver, Canada, 2007.
[17]TASKAR B, CHATALBASHEV V, KOLLER D, et al. Learning structured prediction models: a large margin approach[C]//Proceedings of the 22nd International Conference on Machine Learning. New York, USA: ACM, 2005: 896-903.
[18]TASKAR B, LACOSTE-JULIEN S, JORDAN M. Structured prediction via the extragradient method[C]//Proceedings of Neural Information Processing Systems. Vancouver, Canada, 2005: 1345-1352.
[19]SHOR N Z, KIWIEL K C, RUSZCZYNSKI A. Minimization methods for nondifferentiable functions[M]. New York, USA: Springer-Verlag, 1985.
[20]TSOCHANTARIDIS I, JOACHIMS T, HOFMANN T, et al. Large margin methods for structured and interdependent output variables[J]. The Journal of Machine Learning Research, 2005, 6: 1453-1484.
[21]CHECHIK G, HEITZ G, ELIDAN G, et al. Max-margin classification of incomplete data[C]//Advances in Neural Information Processing Systems: Proceedings of the 2006 Conference. Cambridge, USA: MIT Press, 2007: 233-240.
[22]NEU G, SZEPESVARI C. Apprenticeship learning using inverse reinforcement learning and gradient methods[C]//Proceedings of Uncertainty in Artificial Intelligence. Vancouver, Canada, 2007: 295-302.

Last Update: 2009-08-31

Copyright © CAAI Transactions on Intelligent Systems