Zhenhai TAN Yun YANG Xiaoman WANG Fayez ALQAHTANI
Chenrui CHANG Tongwei LU Feng YAO
Takuma TSUCHIDA Rikuho MIYATA Hironori WASHIZAKI Kensuke SUMOTO Nobukazu YOSHIOKA Yoshiaki FUKAZAWA
Shoichi HIROSE Kazuhiko MINEMATSU
Toshimitsu USHIO
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Qingping YU Yuan SUN You ZHANG Longye WANG Xingwang LI
Qiuyu XU Kanghui ZHAO Tao LU Zhongyuan WANG Ruimin HU
Lei Zhang Xi-Lin Guo Guang Han Di-Hui Zeng
Meng HUANG Honglei WEI
Yang LIU Jialong WEI Shujian ZHAO Wenhua XIE Niankuan CHEN Jie LI Xin CHEN Kaixuan YANG Yongwei LI Zhen ZHAO
Ngoc-Son DUONG Lan-Nhi VU THI Sinh-Cong LAM Phuong-Dung CHU THI Thai-Mai DINH THI
Lan XIE Qiang WANG Yongqiang JI Yu GU Gaozheng XU Zheng ZHU Yuxing WANG Yuwei LI
Jihui LIU Hui ZHANG Wei SU Rong LUO
Shota NAKAYAMA Koichi KOBAYASHI Yuh YAMASHITA
Wataru NAKAMURA Kenta TAKAHASHI
Chunfeng FU Renjie JIN Longjiang QU Zijian ZHOU
Masaki KOBAYASHI
Shinichi NISHIZAWA Masahiro MATSUDA Shinji KIMURA
Keisuke FUKADA Tatsuhiko SHIRAI Nozomu TOGAWA
Yuta NAGAHAMA Tetsuya MANABE
Baoxian Wang Ze Gao Hongbin Xu Shoupeng Qin Zhao Tan Xuchao Shi
Maki TSUKAHARA Yusaku HARADA Haruka HIRATA Daiki MIYAHARA Yang LI Yuko HARA-AZUMI Kazuo SAKIYAMA
Guijie LIN Jianxiao XIE Zejun ZHANG
Hiroki FURUE Yasuhiko IKEMATSU
Longye WANG Lingguo KONG Xiaoli ZENG Qingping YU
Ayaka FUJITA Mashiho MUKAIDA Tadahiro AZETSU Noriaki SUETAKE
Xingan SHA Masao YANAGISAWA Youhua SHI
Jiqian XU Lijin FANG Qiankun ZHAO Yingcai WAN Yue GAO Huaizhen WANG
Sei TAKANO Mitsuji MUNEYASU Soh YOSHIDA Akira ASANO Nanae DEWAKE Nobuo YOSHINARI Keiichi UCHIDA
Kohei DOI Takeshi SUGAWARA
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Mingjie LIU Chunyang WANG Jian GONG Ming TAN Changlin ZHOU
Hironori UCHIKAWA Manabu HAGIWARA
Atsuko MIYAJI Tatsuhiro YAMATSUKI Tomoka TAKAHASHI Ping-Lun WANG Tomoaki MIMOTO
Kazuya TANIGUCHI Satoshi TAYU Atsushi TAKAHASHI Mathieu MOLONGO Makoto MINAMI Katsuya NISHIOKA
Masayuki SHIMODA Atsushi TAKAHASHI
Yuya Ichikawa Naoko Misawa Chihiro Matsui Ken Takeuchi
Katsutoshi OTSUKA Kazuhito ITO
Rei UEDA Tsunato NAKAI Kota YOSHIDA Takeshi FUJINO
Motonari OHTSUKA Takahiro ISHIMARU Yuta TSUKIE Shingo KUKITA Kohtaro WATANABE
Iori KODAMA Tetsuya KOJIMA
Yusuke MATSUOKA
Yosuke SUGIURA Ryota NOGUCHI Tetsuya SHIMAMURA
Tadashi WADAYAMA Ayano NAKAI-KASAI
Li Cheng Huaixing Wang
Beining ZHANG Xile ZHANG Qin WANG Guan GUI Lin SHAN
Sicheng LIU Kaiyu WANG Haichuan YANG Tao ZHENG Zhenyu LEI Meng JIA Shangce GAO
Kun ZHOU Zejun ZHANG Xu TANG Wen XU Jianxiao XIE Changbing TANG
Soh YOSHIDA Nozomi YATOH Mitsuji MUNEYASU
Ryo YOSHIDA Soh YOSHIDA Mitsuji MUNEYASU
Nichika YUGE Hiroyuki ISHIHARA Morikazu NAKAMURA Takayuki NAKACHI
Ling ZHU Takayuki NAKACHI Bai ZHANG Yitu WANG
Toshiyuki MIYAMOTO Hiroki AKAMATSU
Yanchao LIU Xina CHENG Takeshi IKENAGA
Kengo HASHIMOTO Ken-ichi IWATA
Shota TOYOOKA Yoshinobu KAJIKAWA
Kyohei SUDO Keisuke HARA Masayuki TEZUKA Yusuke YOSHIDA
Hiroshi FUJISAKI
Tota SUKO Manabu KOBAYASHI
Akira KAMATSUKA Koki KAZAMA Takahiro YOSHIDA
Tingyuan NIE Jingjing NIE Kun ZHAO
Xinyu TIAN Hongyu HAN Limengnan ZHOU Hanzhou WU
Shibo DONG Haotian LI Yifei YANG Jiatianyi YU Zhenyu LEI Shangce GAO
Kengo NAKATA Daisuke MIYASHITA Jun DEGUCHI Ryuichi FUJIMOTO
Jie REN Minglin LIU Lisheng LI Shuai LI Mu FANG Wenbin LIU Yang LIU Haidong YU Shidong ZHANG
Ken NAKAMURA Takayuki NOZAKI
Yun LIANG Degui YAO Yang GAO Kaihua JIANG
Guanqun SHEN Kaikai CHI Osama ALFARRAJ Amr TOLBA
Zewei HE Zixuan CHEN Guizhong FU Yangming ZHENG Zhe-Ming LU
Bowen ZHANG Chang ZHANG Di YAO Xin ZHANG
Zhihao LI Ruihu LI Chaofeng GUAN Liangdong LU Hao SONG Qiang FU
Kenji UEHARA Kunihiko HIRAISHI
David CLARINO Shohei KURODA Shigeru YAMASHITA
Qi QI Zi TENG Hongmei HUO Ming XU Bing BAI
Ling Wang Zhongqiang Luo
Zongxiang YI Qiuxia XU
Donghoon CHANG Deukjo HONG Jinkeon KANG
Xiaowu LI Wei CUI Runxin LI Lianyin JIA Jinguo YOU
Zhang HUAGUO Xu WENJIE Li LIANGLIANG Liao HONGSHU
Seonkyu KIM Myoungsu SHIN Hanbeom SHIN Insung KIM Sunyeop KIM Donggeun KWON Deukjo HONG Jaechul SUNG Seokhie HONG
Manabu HAGIWARA
Yang LIU Yuqi XIA Haoqin SUN Xiaolei MENG Jianxiong BAI Wenbo GUAN Zhen ZHAO Yongwei LI
Speech emotion recognition (SER) has long been a complex and difficult task due to the complexity of emotion. In this paper, we propose a multitask deep learning approach for SER based on a cascaded attention network and a self-adaption loss. First, non-personalized features are extracted to represent the process of emotion change while reducing the influence of external variables. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, in which spatial-temporal attention effectively locates the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence of gender differences and of human perception of external information is alleviated by a multitask learning strategy, in which a self-adaption loss is introduced to determine the weights of the different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains absolute improvements of 1.97% and 0.91% over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively.
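The abstract does not spell out the exact form of the self-adaption loss. As a rough sketch of one common way to weight task losses dynamically, the snippet below uses learnable uncertainty-style weights for two hypothetical tasks (emotion and gender classification); the names and the weighting scheme are assumptions, not the paper's definition.

```python
import torch
import torch.nn as nn

class SelfAdaptionLoss(nn.Module):
    """Dynamically weighted sum of task losses via learnable log-variances.
    This is a generic multitask weighting scheme, not the paper's exact loss."""
    def __init__(self, num_tasks=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))  # one learnable weight per task

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])           # acts like 1 / sigma_i^2
            total = total + precision * loss + self.log_vars[i]
        return total

# Hypothetical usage: emotion loss and auxiliary gender loss from a shared encoder.
criterion = SelfAdaptionLoss(num_tasks=2)
emotion_loss = torch.tensor(1.3, requires_grad=True)
gender_loss = torch.tensor(0.4, requires_grad=True)
total_loss = criterion([emotion_loss, gender_loss])
total_loss.backward()
```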
Time-series prediction has extensive applications and is an important problem in many fields, such as stock prediction, sales prediction, and loan prediction, with great value in production and everyday life. It requires a model that can effectively capture long-term feature dependence between the output and the input. Recent studies show that the Transformer can improve time-series prediction. However, the Transformer has problems that prevent it from being applied directly to time-series prediction: (1) local agnosticism: self-attention in the Transformer is not sensitive to short-term feature dependence, which leads to model anomalies in time series; (2) memory bottleneck: the space complexity of the standard Transformer grows quadratically with the sequence length, making direct modeling of long time series infeasible. To solve these problems, this paper designs an efficient model for long time-series prediction: a double-pyramid bidirectional feature fusion network with parallel Temporal Convolutional Network (TCN) and FastFormer branches. This structure combines the fine-grained temporal information captured by the TCN with the global interaction information captured by FastFormer, and it handles the time-series prediction problem well.
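The exact double-pyramid fusion is not detailed in the abstract; the toy module below only illustrates the parallel local/global idea, with a single dilated causal convolution standing in for the TCN branch and standard multi-head self-attention standing in for FastFormer. All layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParallelTCNAttention(nn.Module):
    """Toy two-branch block: a dilated causal convolution (local, TCN-style)
    in parallel with self-attention (global), fused by concatenation.
    Standard multi-head attention stands in for FastFormer here."""
    def __init__(self, d_model=64, kernel_size=3, dilation=2, num_heads=4):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation            # left padding keeps causality
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, dilation=dilation)
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, x):                                   # x: (batch, seq_len, d_model)
        local = self.conv(F.pad(x.transpose(1, 2), (self.pad, 0)))
        local = local.transpose(1, 2)                       # back to (batch, seq_len, d_model)
        global_, _ = self.attn(x, x, x)
        return self.fuse(torch.cat([local, global_], dim=-1))

x = torch.randn(8, 96, 64)                                  # a 96-step input window
print(ParallelTCNAttention()(x).shape)                      # torch.Size([8, 96, 64])
```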
Takayoshi SHOUDAI Satoshi MATSUMOTO Yusuke SUZUKI Tomoyuki UCHIDA Tetsuhiro MIYAHARA
A formal graph system (FGS for short) is a logic program consisting of definite clauses whose arguments are graph patterns instead of first-order terms. The definite clauses are referred to as graph rewriting rules. An FGS has been shown to be a useful unifying framework for learning graph languages. In this paper, we show the polynomial-time PAC learnability of a subclass of FGS languages defined by parameterized hereditary FGSs with bounded degree, from the viewpoint of computational learning theory. That is, we consider VH-FGSL_{k,Δ}(m, s, t, r, w, d) as the class of FGS languages consisting of graphs of treewidth at most k and of maximum degree at most Δ, which is defined by variable-hereditary FGSs consisting of m graph rewriting rules having TGP patterns as arguments. The parameters s, t, and r denote the maximum numbers of variables, atoms in the body, and arguments of each predicate symbol of each graph rewriting rule in an FGS, respectively. The parameters w and d denote the maximum number of vertices of each hyperedge and the maximum degree of each vertex of the TGP patterns in each graph rewriting rule in an FGS, respectively. VH-FGSL_{k,Δ}(m, s, t, r, w, d) has infinitely many languages even if all the parameters are bounded by constants. We then prove that the class VH-FGSL_{k,Δ}(m, s, t, r, w, d) is polynomial-time PAC learnable if all of m, s, t, r, w, d, and Δ are constants, with only k allowed to vary.
Tao LIU Meiyue WANG Dongyan JIA Yubo LI
In the massive machine-type communication scenario, aiming at the problems of active user detection and channel estimation in the grant-free non-orthogonal multiple access (NOMA) system, new sets of non-orthogonal spreading sequences are proposed by using zero/low correlation zone sequence sets with low correlation among multiple sets. Simulation results show that the resulting sequence set has low coherence, which provides reliable performance for channel estimation and active user detection based on compressed sensing. Compared with the traditional Zadoff-Chu (ZC) sequences, the new non-orthogonal spreading sequences have more flexible lengths, a lower peak-to-average power ratio (PAPR), and a smaller alphabet size. Consequently, these sequences effectively mitigate the high PAPR of time-domain signals and are more suitable for low-cost devices in massive machine-type communication.
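Low coherence of the spreading matrix is what makes compressed-sensing-based detection reliable, and it can be checked directly. The sketch below computes mutual coherence for a random complex matrix used purely as a stand-in for the proposed sequence sets; the actual sequence construction is not reproduced here.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct normalized columns.
    Lower coherence gives better compressed-sensing recovery guarantees."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)   # unit-norm columns
    G = np.abs(A.conj().T @ A)                          # Gram matrix magnitudes
    np.fill_diagonal(G, 0.0)                            # ignore self-correlations
    return G.max()

# Hypothetical example: 64 random complex spreading sequences of length 32,
# standing in for the proposed ZCZ/LCZ-based sequence sets.
rng = np.random.default_rng(0)
A = (rng.standard_normal((32, 64)) + 1j * rng.standard_normal((32, 64))) / np.sqrt(2)
print(f"mutual coherence: {mutual_coherence(A):.3f}")
```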
For dichromats to receive the information represented in color images, it is important to study contrast improvement methods and quantitative evaluation indices for color conversion results. An existing index evaluates the degree of contrast improvement, and in this index the contrast for dichromacy caused by the lightness component is given importance. In addition, random sampling was introduced in the computation of this index. Although the validity of the index has been shown through comparison with a subjective evaluation, two points should be examined. First, whether the contrast for normal trichromacy caused by the lightness component should also be given importance. Second, the influence of random sampling should be examined in detail. In this paper, a new index is proposed and the above points are examined. For the first point, experiments reveal that considering the contrast for normal trichromacy caused by the lightness component, in the same way as for dichromacy, may or may not lead to a better outcome. The evaluation performance of the proposed index is equivalent to that of the previous index overall, and the proposed index is superior to the previous one in the unity with which contrast is evaluated. For the second point, the computation time and the number of significant digits are evaluated. In this paper, a sampling number for which the number of significant digits can be considered to be three is used. In this case, the variation caused by random sampling is negligible compared with the range of the proposed index, whereas the computation time is about one-seventh of that without sampling.
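The precise definition of the index is not given in the abstract. As a loose illustration of how random sampling keeps such a contrast measure tractable, the sketch below averages absolute lightness differences over sampled pixel pairs instead of all pairs; the formula and parameters are assumptions, not the proposed index.

```python
import numpy as np

def sampled_lightness_contrast(L, num_pairs=100_000, seed=0):
    """Rough stand-in for a sampled contrast measure: mean absolute
    lightness difference over randomly sampled pixel pairs, instead of
    evaluating all O(N^2) pairs of an N-pixel image."""
    rng = np.random.default_rng(seed)
    flat = L.reshape(-1)
    i = rng.integers(0, flat.size, num_pairs)
    j = rng.integers(0, flat.size, num_pairs)
    return np.abs(flat[i] - flat[j]).mean()

# Hypothetical lightness channel (e.g., CIELAB L*) of a 512x512 image.
L = np.random.default_rng(1).uniform(0, 100, (512, 512))
print(sampled_lightness_contrast(L))
```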
Yuxiang ZHANG Dehua LIU Chuanpeng SU Juncheng LIU
Uncovered muck truck detection aims to detect muck trucks and to distinguish whether they are covered by a dust-proof net, in order to trace the source of pollution. Unlike traditional detection problems, recalling all uncovered trucks is more important for pollution traceability than accurate localization. When two objects are very close in an image, the occluded object may not be recalled because the non-maximum suppression (NMS) algorithm can remove the overlapping proposal. To address this issue, we propose a Location First NMS method that matches ground-truth boxes and predicted boxes by position rather than by class identifier (ID) in the training stage. Firstly, a box-matching method is introduced to re-assign the predicted box ID using the closest ground-truth box, which avoids missing objects when the IoU of two proposals is greater than the threshold. Secondly, we design a loss function adapted to the proposed algorithm. Thirdly, an uncovered muck truck detection system based on the method is deployed in a real scene. Experimental results show the effectiveness of the proposed method.
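A minimal sketch of the position-based ID re-assignment idea is given below: each predicted box inherits the ID of its best-overlapping ground-truth box, so matching is driven by location rather than by the predicted class ID. The IoU-based matching rule and the example boxes are assumptions for illustration.

```python
import numpy as np

def iou(a, b):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def reassign_ids_by_position(pred_boxes, gt_boxes, gt_ids):
    """Give each predicted box the ID of its best-overlapping ground-truth box,
    so matching is driven by location rather than by the predicted class ID."""
    return np.array([gt_ids[np.argmax(iou(p, gt_boxes))] for p in pred_boxes])

# Hypothetical boxes: two nearly overlapping trucks, one covered (0), one uncovered (1).
gt = np.array([[10, 10, 110, 80], [20, 15, 120, 85]], dtype=float)
preds = np.array([[12, 11, 108, 79], [22, 16, 118, 84]], dtype=float)
print(reassign_ids_by_position(preds, gt, gt_ids=np.array([0, 1])))  # -> [0 1]
```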
Wentao LYU Di ZHOU Chengqun WANG Lu ZHANG
In this paper, we present a novel discriminative dictionary learning (DDL) method for image classification. The local structural relationship between samples is first built by Laplacian eigenmaps (LE) and then integrated into the basic DDL framework to suppress inter-class ambiguity in the feature space. Moreover, to improve the discriminative ability of the dictionary, the category label information of the training samples is formulated into the objective function of dictionary learning through a discriminative promotion term. Thus, the data points of the original samples are transformed into a new feature space in which points from different categories are expected to be far apart. Test results on a real dataset indicate the effectiveness of this method.
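As a small sketch of the Laplacian-eigenmaps ingredient, the code below builds a k-NN heat-kernel graph Laplacian L from training samples; a locality term such as trace(A L A^T) on the coding coefficients A could then be added to a dictionary-learning objective. The neighborhood size and kernel width are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Build a k-NN heat-kernel affinity matrix and its graph Laplacian L = D - W.
    A penalty like trace(A @ L @ A.T) on coding coefficients A then discourages
    codes that break the local neighborhood structure of the samples."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.zeros_like(d2)
    for i in range(X.shape[0]):
        nn_idx = np.argsort(d2[i])[1:k + 1]               # k nearest neighbors, skipping self
        W[i, nn_idx] = np.exp(-d2[i, nn_idx] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                # symmetrize the graph
    return np.diag(W.sum(1)) - W

X = np.random.default_rng(0).standard_normal((100, 20))   # 100 samples, 20-dim features
L = knn_graph_laplacian(X)
print(L.shape)                                            # (100, 100)
```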
Jingzhao DAI Ming LI Xuejiao HU Yang LI Sidan DU
Gaze following is the task of estimating where an observer is looking inside a scene. Both observer and scene information must be learned to determine gaze directions and gaze points. Many existing works focus only on the scene or only on the observer, and the frameworks proposed for gaze following remain limited. In this paper, a gaze following method using a hybrid transformer is proposed. Based on the conventional method (GazeFollow), we make three developments. First, a hybrid transformer is applied to learn head images and gaze positions. Second, the pinball loss function is utilized to control the gaze point error. Finally, a novel ReLU layer with a reborn mechanism (reborn ReLU) replaces the traditional ReLU layers in different network stages. To test these developments, we train the framework on the DL Gaze dataset and evaluate the model on our collected set. The experimental results show that our framework outperforms the referenced methods.
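The pinball (quantile) loss mentioned above has a standard form, sketched below; the quantile level tau and the example coordinates are illustrative assumptions, not values from the paper.

```python
import torch

def pinball_loss(pred, target, tau=0.9):
    """Quantile (pinball) loss: penalizes under- and over-estimates asymmetrically,
    which gives direct control over how far predicted gaze points may deviate."""
    err = target - pred
    return torch.maximum(tau * err, (tau - 1) * err).mean()

# Hypothetical normalized gaze-point coordinates in [0, 1].
pred = torch.tensor([[0.42, 0.61], [0.15, 0.80]])
target = torch.tensor([[0.45, 0.58], [0.10, 0.85]])
print(pinball_loss(pred, target))
```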
Tian FANG Feng LIU Conggai LI Fangjiong CHEN Yanli XU
Underwater acoustic (UWA) channels are usually sparse, which can be exploited in adaptive equalization to improve system performance. For shallow UWA channels, the adaptive equalization framework based on the proportional minimum symbol error rate (PMSER) criterion requires a choice of sparsity penalty. Since the L0 norm promotes sparsity more strongly than the L1 norm, we choose it to achieve better convergence. However, because the L0 norm leads to an NP-hard problem, an efficient solution is difficult to find. To solve this problem, we approximate the L0 norm by a Gaussian function. Simulation results show that the proposed scheme obtains better performance than the L1-based counterpart.
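A widely used Gaussian-type smooth surrogate for the L0 norm is sketched below; the exact function and parameter used in the paper are not given in the abstract, so sigma here is purely illustrative.

```python
import numpy as np

def gaussian_l0(w, sigma=0.1):
    """Smooth surrogate for the L0 norm: sum_i (1 - exp(-w_i^2 / (2 sigma^2))).
    Each term approaches 1 for |w_i| >> sigma and 0 for w_i near 0,
    so the sum approximates the number of nonzero taps while staying differentiable."""
    return np.sum(1.0 - np.exp(-(w ** 2) / (2.0 * sigma ** 2)))

w = np.array([0.0, 0.0, 0.8, 0.0, -0.5, 0.0, 0.02])        # a sparse equalizer tap vector
print(gaussian_l0(w), np.count_nonzero(np.abs(w) > 0.1))    # ~2.02 vs 2 significant taps
```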
Sakyo HASHIMOTO Keigo TAKEUCHI
This letter simplifies and analyzes existing state evolution recursions for conjugate gradient. The proposed simplification reduces the complexity of solving the recursions from cubic to quadratic order in the total number of iterations. The simplified recursions are still catastrophically sensitive to numerical errors, so arbitrary-precision arithmetic is used to evaluate them accurately.
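The recursions themselves are not reproduced in the abstract. The snippet below only illustrates the kind of arbitrary-precision evaluation mentioned, using mpmath on a classic toy recursion (Muller's recurrence) that is likewise catastrophically sensitive to rounding; it is not the paper's state evolution.

```python
from mpmath import mp, mpf

def muller_recurrence(steps, to_number=float):
    """Classic recursion that is catastrophically sensitive to rounding:
    in exact arithmetic it converges to 6, in double precision it drifts to 100."""
    u_prev, u = to_number(2), to_number(-4)
    for _ in range(steps):
        u_prev, u = u, 111 - 1130 / u + 3000 / (u * u_prev)
    return u

print(muller_recurrence(30, float))   # double precision: near 100 (wrong fixed point)
mp.dps = 60                           # 60 significant decimal digits
print(muller_recurrence(30, mpf))     # arbitrary precision: still near 6 (correct)
```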
We propose a non-photorealistic rendering method that automatically generates reaction-diffusion-pattern-like images from photographic images. The proposed method uses a smoothing filter with a circular window and changes the size of the circular window depending on the position in the photographic image. By partially changing the size of the circular window, the size of the reaction-diffusion patterns can be changed locally. To verify the effectiveness of the proposed method, experiments were conducted in which the method was applied to various photographic images.
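A brute-force sketch of a position-dependent circular-window mean filter is given below; the rule that maps image position to window radius is an illustrative assumption (here the radius simply grows from left to right), not the paper's rule.

```python
import numpy as np

def variable_circular_mean(img, radius_map):
    """Mean filter with a circular window whose radius varies per pixel.
    Larger radii produce coarser, larger pattern elements; a brute-force
    double loop is used here for clarity rather than speed."""
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            r = int(radius_map[y, x])
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            mask = (yy - y) ** 2 + (xx - x) ** 2 <= r * r   # circular window
            out[y, x] = img[y0:y1, x0:x1][mask].mean()
    return out

# Hypothetical example: window radius grows from 2 on the left to 6 on the right.
rng = np.random.default_rng(0)
img = rng.uniform(0, 1, (64, 64))
radius_map = np.tile(np.linspace(2, 6, 64), (64, 1))
print(variable_circular_mean(img, radius_map).shape)        # (64, 64)
```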