Kenta NISHIYUKI Jia-Yau SHIAU Shigenori NAGAE Tomohiro YABUUCHI Koichi KINOSHITA Yuki HASEGAWA Takayoshi YAMASHITA Hironobu FUJIYOSHI
Driver drowsiness estimation is an important task for preventing car accidents. Most approaches perform binary classification, deciding whether or not a driver is significantly drowsy. Multi-level drowsiness estimation, which detects moderate as well as significant drowsiness, contributes to a safer and more comfortable car system. Existing approaches are mostly based on conventional temporal measures that extract temporal information related to eye states, and these measures mainly focus on detecting significant drowsiness for binary classification. For multi-level drowsiness estimation, we propose two temporal measures: average eye closed time (AECT) and soft percentage of eyelid closure (Soft PERCLOS). Existing approaches also use a time-domain convolutional neural network (CNN) whose layers are linked sequentially as the deep neural network model. Such a network extracts features mainly at a single temporal resolution. We found that features at multiple temporal resolutions are effective for multi-level drowsiness estimation, and we propose a parallel-linked time-domain CNN to extract these multi-temporal features. We collected our own dataset in a real environment and evaluated the proposed methods on it. Our system outperforms existing temporal measures and network models on this dataset.
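As a rough illustration of what an eye-state temporal measure looks like, the sketch below computes the conventional PERCLOS (fraction of frames with the eye closed) and a mean closure duration over a window of per-frame eye states. It does not reproduce the paper's AECT or Soft PERCLOS definitions, which the abstract does not give; the function names, the boolean input format, and the frame rate are assumptions made for this example.

```python
from typing import List


def perclos(closed: List[bool]) -> float:
    """Conventional PERCLOS: fraction of frames in the window with the eye closed."""
    if not closed:
        return 0.0
    return sum(closed) / len(closed)


def closure_durations(closed: List[bool], fps: float) -> List[float]:
    """Duration in seconds of each consecutive run of closed-eye frames."""
    durations = []
    run = 0
    for is_closed in closed:
        if is_closed:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:  # window ended while the eye was still closed
        durations.append(run / fps)
    return durations


def mean_closure_time(closed: List[bool], fps: float) -> float:
    """Mean duration of closed-eye runs (an illustrative measure, not the paper's AECT)."""
    durations = closure_durations(closed, fps)
    return sum(durations) / len(durations) if durations else 0.0


# Hypothetical 1-second window sampled at 10 frames per second.
window = [False, True, True, False, True, False, False, False, True, True]
print(perclos(window))
print(closure_durations(window, fps=10.0))
```

A binary measure such as PERCLOS saturates only under heavy drowsiness, which is one way to read the abstract's motivation for measures aimed at moderate drowsiness.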
Toan H. VU An DANG Jia-Ching WANG
We develop a deep neural network (DNN) for detecting driver drowsiness in videos. The proposed DNN model, which receives the driver's face extracted from video frames as input, consists of three components: a convolutional neural network (CNN), a convolutional control gate-based recurrent neural network (ConvCGRNN), and a voting layer. The CNN learns facial representations from global faces, which are then fed to the ConvCGRNN to learn their temporal dependencies. The voting layer works like an ensemble of many sub-classifiers to predict the drowsiness state. Experimental results on the NTHU-DDD dataset show that our model not only achieves a competitive accuracy of 84.81% without any post-processing but also runs in real time at about 100 fps.
Yukihiro TAGAMI Hayato KOBAYASHI Shingo ONO Akira TAJIMA
Modeling user activities on the Web is a key problem for various Web services, such as news article recommendation and ad click prediction. In our work-in-progress paper[1], we introduced an approach that summarizes each sequence of user Web page visits using Paragraph Vector[3], treating users as paragraphs and URLs as words. The learned user representations are shared across user-related prediction tasks. In this paper, based on an analysis of our Web page visit data, we propose Backward PV-DM, a modified version of Paragraph Vector. We show experimental results on two ad-related data sets based on logs from Web services of Yahoo! JAPAN. Our proposed method achieved better results than existing vector models.
Yutaro ONO Yuhei MORIMOTO Reiji HATTORI Masayuki WATANABE Nanae MICHIDA Kazuo NISHIKAWA
We present a smart steering wheel that detects the gripping position and area, as well as the distance to the driver's approaching hands, by measuring the resonant frequency and its resistance value in an LCR circuit composed of the floating capacitance between the gripping hand and the electrode of the steering wheel, and the body resistance. The resonant frequency measurement provides a high sensitivity that enables estimation of the distance to an approaching hand and of the gripping area of a gloved hand, and allows the steering surface to be covered with any type of insulating material. This system can be applied to drowsiness detection, driving technique improvement, and customization of driving settings.
Toshiko TOMINAGA Kanako SATO Noriko YOSHIMURA Masataka MASUDA Hitoshi AOKI Takanori HAYASHI
Web browsing services are expanding as smartphones are becoming increasingly popular worldwide. To provide customers with appropriate quality of web-browsing services, quality design and in-service quality management on the basis of quality of experience (QoE) is important. We propose a web-browsing QoE estimation model. The most important QoE factor for web-browsing is the waiting time for a web page to load. Next, the variation in the communication quality based on a mobile network should be considered. We developed a subjective quality assessment test to clarify QoE characteristics in terms of waiting time using 20 different types of web pages and constructed a web-page QoE estimation model. We then conducted a subjective quality assessment test of web-browsing to clarify the relationship between web-page QoE and web-browsing QoE for three web sites. We obtained the following two QoE characteristics. First, the main factor influencing web-browsing QoE is the average web-page QoE. Second, when web-page QoE variation occurs, a decrease in web-page QoE with a huge amplitude causes the web-browsing QoE to decrease. We used these characteristics in constructing our web-browsing QoE estimation model. The verification test results using non-training data indicate the accuracy of the model. We also show that our findings are applicable to web-browsing quality design and solving management issues on the basis of QoE.
Due to the depletion of the public IPv4 address pool, Internet service providers will not be able to supply their new customers with public IPv4 addresses in the near future. They must either hand out private IPv4 addresses and use carrier grade NAT (CGN), or move towards IPv6 and provide a NAT64 service to IPv6-only clients that want to reach IPv4-only servers. In both cases they must use a stateful NAT/NAT64 solution. When dimensioning a NAT/NAT64 gateway, the port number consumption of the clients is a key factor: port numbers are 16 bits long, and a unique one has to be provided for every session when using traditional NAPT, which does not include the destination IP address and port number in the tuple that identifies TCP sessions. According to the literature, a single web client may use several hundred sessions and an equal number of port numbers. In this paper, we present a method for estimating the port number consumption of web browsing. The method is based on measurements of the port number consumption of the most popular web sites, combined using the number of visitors of each web site as weight factors. We propose the resulting curve as an approximation of a general profile of the average port number consumption of web browsers after the first click, without taking into consideration the effect of the web users' browsing behavior. We also discuss the case of extended NAPT, which can reuse source port numbers towards different destination IP addresses and/or destination port numbers, and we propose a formula and give measurement results for extended NAPT gateways as well. We describe the measurement method in detail and also provide the measurement scripts for Linux.
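The visitor-weighted combination described above can be sketched as a weighted average of per-site curves. This is only an illustration under stated assumptions: each site's curve is taken to be a list of port counts sampled at the same time offsets after the first click, and the site names, sample values, and the function name `weighted_port_profile` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List


def weighted_port_profile(
    curves: Dict[str, List[float]], visitors: Dict[str, float]
) -> List[float]:
    """Combine per-site port-consumption curves, weighting each site by its visitors."""
    if not curves:
        raise ValueError("at least one site curve is required")
    total = sum(visitors[site] for site in curves)
    if total <= 0:
        raise ValueError("total visitor weight must be positive")
    length = len(next(iter(curves.values())))
    if any(len(curve) != length for curve in curves.values()):
        raise ValueError("all curves must be sampled at the same time offsets")
    # Weighted mean of the port counts at each sampling instant.
    return [
        sum(visitors[site] * curves[site][t] for site in curves) / total
        for t in range(length)
    ]


# Hypothetical measurements: ports in use at two instants after the first click.
curves = {"site-a": [10.0, 20.0], "site-b": [30.0, 40.0]}
visitors = {"site-a": 3.0, "site-b": 1.0}
print(weighted_port_profile(curves, visitors))  # [15.0, 25.0]
```

With a 16-bit port space, a profile like this gives a first estimate of how many simultaneous clients one public IPv4 address can serve under traditional NAPT.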
Erina ISHIKAWA Hiroaki KAWASHIMA Takashi MATSUYAMA
Studies on gaze analysis have revealed some of the relationships between viewers' gaze and their internal states (e.g., interests and intentions). However, understanding content browsing behavior in uncontrolled environments is still challenging because human gaze can be very complex; it is affected not only by viewers' states but also by the spatio-semantic structures of visual content. This study proposes a novel gaze analysis framework that introduces the content creators' point of view to understand the meaning of browsing behavior. Visual content such as web pages, digital articles, and catalogs is composed of structures intentionally designed by content creators, which we refer to as designed structure. This paper focuses on two factors of designed structure: the spatial structure of content elements (content layout), and their relationships, such as “being in the same group”. The framework was evaluated in an experiment involving 12 participants, wherein each participant's state was estimated from their gaze behavior. The results show that the use of designed structure improved the estimation accuracy of user states compared with baseline methods.
Music-similarity computation is an essential building block for the browsing, retrieval, and indexing of digital music archives. This paper proposes a music similarity function based on the centroid model, which divides the feature space into non-overlapping clusters for the efficient computation of the timbre distance between two songs. We place particular emphasis on the centroid deviation as a feature for music-similarity computation. Experiments show that the centroid-model representation of the auditory features is promising for music-similarity computation.
Xin FAN Hisashi MIYAMORI Katsumi TANAKA Mingjing LI
As the amount of recorded TV content is increasing rapidly, people need active and interactive browsing methods. In this paper, we use both text information from closed captions and visual information from video frames to generate links to enable users to easily explore not only the original video content but also augmented information from the Web. This solution especially shows its superiority when the video content cannot be fully represented by closed captions. A prototype system was implemented and some experiments were carried out to prove its effectiveness and efficiency.
Jason J. JUNG Kee-Sung LEE Seung-Bo PARK Geun-Sik JO
Web browsing is based on a depth-first search scheme, so searching for relevant information on the Web can be very tedious. In this paper, we propose a personal browsing assistant system based on user intention modeling. Before a page is explicitly requested by the user, the system analyzes the prefetched resources from the hyperlinked Web pages and compares them with the estimated user intention, so that it can help the user decide which Web page to request next. A more important problem is the semantic heterogeneity between Web spaces, which makes locally annotated resources harder to understand. We apply semantic annotation, which is a transcoding procedure based on a global ontology. As a result, each piece of local metadata can be semantically enriched and efficiently compared. As a test bed for our experiment, we set up three different online clothing stores whose images are annotated with semantically heterogeneous metadata. We simulated virtual customers navigating these cyberspaces and conducting comparison shopping according to the predefined preferences of the customer models. We show that supporting Web browsing in this way is reasonable, and we evaluated its performance by measuring the total size of the browsed hyperspace.
Juan D. VELASQUEZ Hiroshi YASUDA Terumasa AOKI Richard WEBER
The behavior of visitors browsing a web site offers a lot of information about their requirements and the way they use the site. Analyzing such behavior can provide the information needed to improve the web site's structure. The literature already contains several suggestions on how to characterize web site usage and identify visitor requirements based on clustering of visitor sessions. Here we propose to combine visitor behavior with the content of the visited web pages and the similarity between different page sequences in order to define a similarity measure between different visits. This similarity serves as input for clustering of visitor sessions. The application of our approach to a bank's web site and its visitor sessions shows its potential for internet-based businesses.
Takahiro HAMADA Kazumasa ADACHI Tomoaki NAKANO Shin YAMAMOTO
Driver assist and warning systems need to take the driver's state of consciousness into account, and drowsiness is one of the important factors in estimating that state. A method for detecting the initial stage of a driver's drowsiness was developed; it measures eyelid opening, allowing for the differing characteristics of individual subjects, by processing motion pictures captured in an actual driving environment. The results show that an increase in long eyelid closure time is the key factor in estimating the initial stage of a driver's drowsiness while driving, and that the state of drowsiness can be inferred from the frequency of long eyelid closures per unit period.
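Counting long eyelid closures per unit period can be sketched as follows. The abstract gives neither the duration threshold nor the period length, so `min_duration` and `period` are hypothetical parameters, and the input format (a list of closure start times and durations in seconds) is an assumption for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def long_closure_counts(
    events: List[Tuple[float, float]], min_duration: float, period: float
) -> Dict[int, int]:
    """Count eyelid closures lasting at least min_duration in each period.

    events: (start_time, duration) pairs in seconds.
    Returns a mapping from period index to the number of long closures
    that started in that period.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    counts: Counter = Counter()
    for start, duration in events:
        if duration >= min_duration:
            counts[int(start // period)] += 1
    return dict(counts)


# Hypothetical closure events: (start time in s, duration in s).
events = [(1.0, 0.2), (5.0, 0.8), (50.0, 1.0), (65.0, 0.6), (70.0, 0.1)]
print(long_closure_counts(events, min_duration=0.5, period=60.0))  # {0: 2, 1: 1}
```

A rising count across successive periods would be the kind of signal the abstract associates with the onset of drowsiness.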
A self-organizing map (SOM) is a widely used tool for high-dimensional data visualization. However, despite the benefit of plotting very high-dimensional data on a low-dimensional grid, browsing a trained map and understanding its meaning turn out to be difficult, especially when the number of nodes or the size of the data increases. Although there are some well-known techniques for visualizing SOMs, they mainly deal with cluster boundaries and fail to consider the raw information available in the original data when browsing SOMs. In this paper, we propose our Factor controlled Hierarchical SOM, which enables us to select the number of data items used to train and label a particular map based on a pre-defined factor and provides consistent hierarchical SOM browsing.
Yanhua QU Makoto NAKASHIMA Tetsuro ITO
An information retrieval model intended to deal conceptually with documents on the Internet is proposed. The first half of this model is a stage that selects documents that may meet the user's long-term interests by employing a filtering or retrieval system. The latter half is a stage for linear document arrangement and adjacent-cluster-based browsing. For the filtered document collection, similarity matrices are computed and the documents are then arranged so that highly similar ones are placed adjacently. Through this treatment the documents are considered to form clusters, some of which are placed adjacently when they include similar documents. A user can satisfy her/his needs by first browsing the clusters containing documents highly similar to a query, and then extending the browsing process into the clusters adjacent to the ones just examined. In the adjacent clusters, documents that share no keywords with the query but are conceptually related to it can be found. Computational and statistical evaluations were carried out on two standard test collections. A virtual space navigator was also designed in Java to assist a user in the browsing task.
Yukinobu TANIGUCHI Akihito AKUTSU Yoshinobu TONOMURA
Browsing is an important function supporting efficient access to relevant information in video archives. In this paper, we present PanoramaExcerpts -- a video browsing interface that shows a catalogue of two types of video icons: panoramic and keyframe icons. A panoramic icon is automatically synthesized from a video segment taken with camera pan or tilt using a camera parameter estimation technique. One keyframe icon is extracted for each shot to supplement the panoramic icons. A panoramic icon represents the entire visible contents of a scene extended with a camera pan or tilt, which is difficult to represent using a single keyframe. A graphical representation, called camera-work trajectory, is also proposed to show the direction and the speed of camera operation. For the automatic generation of PanoramaExcerpts, we propose an approach to integrate the following: (a) a shot-change detection method; (b) a method for locating segments that contain smooth camera operations; (c) a layout method for packing icons in a space-efficient manner. In this paper, we mainly describe (b) and (c) with experimental results.
Masami TOKUMITSU Kazumi NISHIMURA Makoto HIRANO Kimiyoshi YAMASAKI
A 0.1-µm gate-length GaAs MESFET technology is reported. A 48.3-GHz dynamic frequency divider and an amplifier with 20-dB gain and 17.5-GHz bandwidth are successfully fabricated by integrating MESFETs with cut-off frequencies above 100 GHz that use a new device structure, a lightly-doped drain with a buried p-layer (BP-LDD).
Kazumi NISHIMURA Kiyomitsu ONODERA Kou INOUE Masami TOKUMITSU Fumiaki HYUGA Kimiyoshi YAMASAKI
We have developed a planar device technology consisting of 0.15-µm Au/WSiN-gate GaAs-heterostructure MESFETs (HMESFETs) fabricated by self-aligned ion-implantation. The gate-drain breakdown voltage has been improved to 10 V by using an asymmetric LDD structure, and the maximum oscillation frequency is 190 GHz. Because asymmetric and symmetric FETs can be fabricated simultaneously, this technology is suitable for use in making multi-functional millimeter-wave MMICs.
Video compression technologies such as MPEG have enabled the efficient use of video data in the computer environment. However, compressed video still involves a huge amount of data compared with other media such as text, audio, and graphics. It is therefore very important to handle video information in a networked database so that resources such as storage media are used efficiently. Furthermore, in a networked database, retrieval methods, including search and delivery, become key issues, especially for video information, which requires a large network bandwidth. In this paper, a video browsing method using automatic fast scene cut detection for networked video database access is described. A scene cut is defined as a scene change frame and is detected from the temporal change in interframe luminance difference and chrominance correlation, which are obtained from a spatio-temporally scaled image extracted directly from the MPEG compressed video without any complex video decoding. The detected scene change frames are further examined to exploit the relationships between scene cuts and are classified in order to build a hierarchical index. The detection results are stored as a scene index file using the MPEG format. Simulation results for several test video sequences are also presented to show that these methods enable efficient construction of and access to video databases.
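The luminance part of this detection can be illustrated with a minimal sketch that flags a cut wherever the mean absolute luminance difference between consecutive reduced-size frames exceeds a threshold. This is a simplification under stated assumptions, not the paper's method: it omits the chrominance correlation and the temporal-change analysis, takes each frame as a flat list of luminance values already extracted from the stream, and uses a hypothetical fixed threshold.

```python
from typing import List


def luminance_difference(prev: List[float], curr: List[float]) -> float:
    """Mean absolute luminance difference between two equally sized frames."""
    if len(prev) != len(curr) or not prev:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def detect_cuts(frames: List[List[float]], threshold: float) -> List[int]:
    """Return indices of frames whose difference from the previous frame exceeds threshold."""
    cuts = []
    for i in range(1, len(frames)):
        if luminance_difference(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts


# Hypothetical 2x2 reduced frames: a dark scene followed by a bright one.
frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],
    [200, 190, 210, 205],
    [201, 191, 209, 204],
]
print(detect_cuts(frames, threshold=50.0))  # [2]
```

Working on heavily reduced frames keeps the per-frame cost low, which matches the abstract's emphasis on avoiding full video decoding.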