In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.
Rebeka SULTANA
Shizuoka University
Gosuke OHASHI
Shizuoka University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Rebeka SULTANA, Gosuke OHASHI, "Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 1018-1026, May 2023, doi: 10.1587/transinf.2022EDP7146.
Abstract: In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7146/_p
Copy
@ARTICLE{e106-d_5_1018,
author={Rebeka SULTANA, Gosuke OHASHI, },
journal={IEICE TRANSACTIONS on Information},
title={Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow},
year={2023},
volume={E106-D},
number={5},
pages={1018-1026},
abstract={In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.},
keywords={},
doi={10.1587/transinf.2022EDP7146},
ISSN={1745-1361},
month={May},}
Copy
TY - JOUR
TI - Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow
T2 - IEICE TRANSACTIONS on Information
SP - 1018
EP - 1026
AU - Rebeka SULTANA
AU - Gosuke OHASHI
PY - 2023
DO - 10.1587/transinf.2022EDP7146
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.
ER -