[1] SHEN Tianxiao, HAN Yiyuan, HAN Bing, et al. Recognition of driver’s eye movement based on the human visual cortex two-stream model[J]. CAAI Transactions on Intelligent Systems, 2022, 17(1): 41–49. [doi: 10.11992/tis.202106051]

Recognition of driver’s eye movement based on the human visual cortex two-stream model

References:
[1] National Bureau of Statistics. Statistical communiqué of the People’s Republic of China on the 2019 national economic and social development[N]. People’s Daily, 2020-02-29(5).
[2] JAIN D K, JAIN R, LAN Xiangyuan, et al. Driver distraction detection using capsule network[J]. Neural computing and applications, 2021, 33(11): 6183–6196.
[3] LE T H N, ZHENG Yutong, ZHU Chenchen, et al. Multiple scale faster-RCNN approach to driver’s cell-phone usage and hands on steering wheel detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops. New York, USA: IEEE, 2016: 46–53.
[4] WANG Rongben, GUO Keyou, CHU Jiangwei, et al. Study on the eye location method in driver fatigue state surveillance[J]. Journal of highway and transportation research and development, 2003(5): 111–114.
[5] ZHANG Jie. Driver’s viewpoint distribution based on the eye tracker[J]. Hunan communication science and technology, 2012, 38(4): 153–155, 170.
[6] YUAN Wei, XU Yuanxin, GUO Yingshi, et al. Fixation transfer characteristics of drivers during lane change and straight drive[J]. Journal of Chang’an university (natural science edition), 2015, 35(5): 124–130.
[7] MISHKIN M, UNGERLEIDER L G, MACKO K A. Object vision and spatial vision: two cortical pathways[J]. Trends in neurosciences, 1983, 6: 414–417.
[8] KOOTSTRA G, DE BOER B, SCHOMAKER L R B. Predicting eye fixations on complex visual stimuli using local symmetry[J]. Cognitive computation, 2011, 3(1): 223–240.
[9] SOOMRO K, ZAMIR A R, SHAH M. UCF101: a dataset of 101 human actions classes from videos in the wild[EB/OL]. (2012-12-01) [2021-05-30]. https://arxiv.org/abs/1212.0402.
[10] KAY W, CARREIRA J, SIMONYAN K, et al. The kinetics human action video dataset[EB/OL]. (2017-05-19) [2021-05-30]. https://arxiv.org/abs/1705.06950.
[11] SIGURDSSON G A, GUPTA A, SCHMID C, et al. Charades-ego: a large-scale dataset of paired third and first person videos[EB/OL]. (2018-04-30) [2021-05-30]. https://arxiv.org/abs/1804.09626.
[12] DAMEN Dima, DOUGHTY H, FARINELLA G M, et al. Scaling egocentric vision: the dataset[M]//Computer vision–ECCV 2018. Cham: Springer International Publishing, 2018: 753–771.
[13] JIANG Lai, XU Mai, WANG Zulin. Predicting video saliency with object-to-motion CNN and two-layer convolutional LSTM[EB/OL]. (2017-09-19) [2021-06-30]. https://arxiv.org/abs/1709.06316.
[14] LI Yin, LIU Miao, REHG J M. In the eye of beholder: joint learning of gaze and actions in first person video[C]//2018 European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 619–635.
[15] LI Yin, YE Zhefan, REHG J M. Delving into egocentric actions[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2015: 287–295.
[16] MATHE S, SMINCHISESCU C. Actions in the eye: dynamic gaze datasets and learnt saliency models for visual recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(7): 1408–1424.
[17] MARSZALEK M, LAPTEV I, SCHMID C. Actions in context[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2009: 2929–2936.
[18] RODRIGUEZ M. Spatio-temporal maximum average correlation height templates in action recognition and video summarization[EB/OL]. (2013-12-10) [2021-06-30]. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.5006.
[19] JUDD T, EHINGER K, DURAND F, et al. Learning to predict where humans look[C]//2009 IEEE 12th International Conference on Computer Vision. New York, USA: IEEE, 2009: 2106–2113.
[20] PAPADOPOULOS D P, CLARKE A D F, KELLER F, et al. Training object class detectors from eye tracking data[C]//Computer vision–ECCV 2014. Berlin, Germany: Springer, 2014: 361–376.
[21] EVERINGHAM M, GOOL L V, WILLIAMS C K I, et al. The PASCAL visual object classes (VOC) challenge[J]. International journal of computer vision, 2010, 88(2): 303–338.
[22] JI Shuiwang, XU Wei, YANG Ming, et al. 3D convolutional neural networks for human action recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 35(1): 221–231.
[23] SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[EB/OL]. (2014-11-12) [2021-06-30]. https://arxiv.org/abs/1406.2199.
[24] NG Joey H, HAUSKNECHT M, VIJAYANARASIMHAN S, et al. Beyond short snippets: deep networks for video classification[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2015: 4694–4702.
[25] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735–1780.
[26] WANG Limin, XIONG Yuanjun, WANG Zhe, et al. Temporal segment networks: towards good practices for deep action recognition[C]//Computer vision–ECCV 2016. Berlin, Germany: Springer, 2016: 20–36.
[27] LIN Ji, GAN Chuang, HAN Song. TSM: temporal shift module for efficient video understanding[C]//2019 IEEE/CVF International Conference on Computer Vision. New York, USA: IEEE, 2019: 7082–7092.
[28] TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]//2015 IEEE International Conference on Computer Vision. New York, USA: IEEE, 2015: 4489–4497.
[29] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2016: 770–778.
[30] TRAN D, RAY J, SHOU Zheng, et al. ConvNet architecture search for spatiotemporal feature learning[EB/OL]. (2017-08-16) [2021-06-30]. https://arxiv.org/abs/1708.05038.
[31] TRAN D, WANG Heng, TORRESANI L, et al. A closer look at spatiotemporal convolutions for action recognition[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2018: 6450–6459.
[32] FEICHTENHOFER C, FAN Haoqi, MALIK J, et al. SlowFast networks for video recognition[C]//2019 IEEE/CVF International Conference on Computer Vision. New York, USA: IEEE, 2019: 6201–6210.
[33] PENG Jinshuan, GAO Cuicui, GUO Yingshi. Drivers’ visual characteristics and mental load based on entropy rates[J]. Journal of Chongqing Jiaotong University (natural science edition), 2014, 33(2): 118–121.
[34] YUAN Wei, FU Rui, MA Yong, et al. Effects of vehicle speed and traffic sign text height on drivers’ visual search patterns[J]. Journal of traffic and transportation engineering, 2011, 11(1): 119–126.