Promoting the use of public transport (PT) is considered an effective way to reduce the number of passenger cars. The concept of Mobility-as-a-Service (MaaS), which began in Europe and is now spreading rapidly around the world, is expected to improve the convenience of PT from the viewpoint of users by using the latest information and communication technology and Internet of Things technologies. This paper outlines the concept of MaaS in Europe and the efforts made at the policy level. It also focuses on the development of MaaS from the viewpoint of promoting the use of PT in Japan.
Mahmud Dwi SULISTIYO Yasutomo KAWANISHI Daisuke DEGUCHI Ichiro IDE Takatsugu HIRAYAMA Jiang-Yu ZHENG Hiroshi MURASE
Numerous applications such as autonomous driving, satellite imagery sensing, and biomedical imaging use computer vision as an important tool for perception tasks. Intelligent Transportation Systems (ITS) need to precisely recognize and locate scenes in sensor data. Semantic segmentation is one of the computer vision methods intended to perform such tasks. However, existing semantic segmentation tasks label each pixel with only a single object class. Recognizing object attributes, e.g., pedestrian orientation, is more informative and helps achieve better scene understanding. Thus, we propose a method that performs semantic segmentation and pedestrian attribute recognition simultaneously. We introduce an attribute-aware loss function that can be applied to an arbitrary base model. Furthermore, a re-annotation of the existing Cityscapes dataset enriches the ground-truth labels by annotating the attributes of pedestrian orientation. We implement the proposed method and compare the experimental results with those of other methods. The attribute-aware semantic segmentation outperforms baseline methods both in the traditional object segmentation task and in the expanded attribute detection task.
Akira John SUZUKI Masahiro YAMAMOTO Kiyoshi MIZUI
There is currently much interest in the development of Optical Wireless and Visible Light Communication (VLC) systems in the ITS field. Research on VLC, and on boomerang systems in particular, often remains at a theoretical or computer-simulated level. This paper reports the 3-stage development of a boomerang prototype communication and ranging system using visible light V2V communication via LEDs and photodiodes, with direct-sequence spread spectrum techniques. The system uses simple and widely available components, aiming for a low-cost frugal innovation approach. Results show that, while the prototype distance measurement unit must be improved because of its margin of error, simultaneous communication and ranging is possible with our newly designed prototype. The benefits of further research and development of boomerang technology prototypes are confirmed.
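As a rough illustration of how direct-sequence spread spectrum can carry data and support ranging at the same time, the following minimal Python sketch spreads bits with a chip code, recovers them by correlation, and converts a round-trip delay into a distance. The Barker-7 code, the function names, and the chip period are assumptions chosen for illustration; they are not taken from the prototype described above.

```python
# Barker-7 spreading code, chosen here only as an illustrative assumption.
CHIPS = [1, 1, 1, -1, -1, 1, -1]
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def spread(bits, chips=CHIPS):
    """Map each bit to +1/-1 and multiply it by the whole chip sequence."""
    out = []
    for b in bits:
        symbol = 1 if b else -1
        out.extend(symbol * c for c in chips)
    return out


def despread(signal, chips=CHIPS):
    """Correlate each chip-length block with the code and take the sign."""
    n = len(chips)
    bits = []
    for i in range(0, len(signal) - n + 1, n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], chips))
        bits.append(1 if corr > 0 else 0)
    return bits


def estimate_delay(received, chips=CHIPS):
    """Return the first chip offset at which the code correlation peaks."""
    n = len(chips)
    best_offset, best_corr = 0, float("-inf")
    for offset in range(len(received) - n + 1):
        corr = sum(received[offset + k] * chips[k] for k in range(n))
        if corr > best_corr:
            best_offset, best_corr = offset, corr
    return best_offset


def round_trip_distance(delay_chips, chip_period_s):
    """Convert a round-trip delay (in chips) to a one-way distance in metres."""
    return SPEED_OF_LIGHT * delay_chips * chip_period_s / 2.0
```

Because each bit is decided from a correlation over seven chips, a single corrupted chip per bit does not change the decoded data, and the same correlation peak gives the delay used for ranging.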
Shu FUJITA Keita TAKAHASHI Toshiaki FUJII
A light field (LF), which is equivalent to a dense set of multi-view images, has various applications such as depth estimation and 3D display. One of the essential problems in light field applications is light field interpolation, i.e., view interpolation. The interpolation accuracy is enhanced by exploiting an inherent property of a light field. One example is that an epipolar plane image (EPI), which is a 2D subset of the 4D light field, consists of many lines, and these lines have almost the same slope in a local region. This structure induces a sparse representation in the frequency domain, where most of the energy resides on a line passing through the origin. On the basis of this observation, we propose a group sparsity prior suitable for light fields to exploit their line structure fully for interpolation. Specifically, we designed directional groups in the discrete Fourier transform (DFT) domain so that the groups can represent the concentration of the energy, and we thereby formulated the LF interpolation problem as an overlapping group lasso. We also introduce several techniques to improve the interpolation accuracy, such as applying a window function, determining group weights, expanding processing blocks, and merging blocks. Our experimental results show that the proposed method achieves quality better than or comparable to that of state-of-the-art LF interpolation methods such as convolutional neural network (CNN)-based methods.
Huan-Bang LI Kenichi TAKIZAWA Fumihide KOJIMA
Because of its potential for high throughput in short-range communications and its inherently high precision in ranging and localization, ultra-wideband (UWB) technology has continuously attracted attention in research and development (R&D) as well as in commercialization. The first domestic regulation admitting indoor UWB in Japan was released by the Ministry of Internal Affairs and Communications (MIC) in 2006. Since then, several revisions have been made in conjunction with the commercial penetration of UWB, emerging trends in industrial demand, and coexistence evaluations with other wireless systems. However, it was not until May 2019 that MIC released a new revision to admit outdoor UWB. Meanwhile, the IEEE 802 LAN/MAN Standards Committee has been developing several UWB-related standards and amendments to support different use cases. At the time of submission of this paper, a new amendment known as IEEE 802.15.4z is being drafted, which is expected to enhance the ranging ability of impulse radio UWB (IR-UWB). In this paper, we first review the domestic UWB regulation and some of its revisions to get a picture of the domestic regulatory transition from indoor to outdoor use. We also anticipate some changes in future revisions. Then, we overview several published IEEE 802 standards and amendments that are related to IR-UWB. Some features of the IEEE 802.15.4z draft are also extracted from open materials. Finally, we show with our recent research results that the time bias internal to a transceiver becomes important for increasing localization accuracy.
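To give a feel for why a transceiver's internal time bias matters for ranging, the following sketch uses the textbook single-sided two-way ranging relation, in which the time of flight is half of the round-trip time minus the responder's reply time. The function name, the way the bias is modelled (as one lumped delay added to the measured round-trip time), and the numbers are assumptions for illustration only, not results from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def twr_distance(t_round, t_reply, internal_bias=0.0):
    """Single-sided two-way ranging distance estimate in metres.

    t_round:       measured round-trip time at the initiator (s)
    t_reply:       reply (turnaround) time reported by the responder (s)
    internal_bias: assumed lumped internal delay to subtract (s); 0 means
                   the bias is left uncorrected
    """
    time_of_flight = (t_round - t_reply - internal_bias) / 2.0
    return SPEED_OF_LIGHT * time_of_flight


# Hypothetical scenario: 10 m true distance, 1 us reply time, 1 ns bias.
true_distance = 10.0
reply_time = 1e-6
bias = 1e-9
measured_round = 2.0 * true_distance / SPEED_OF_LIGHT + reply_time + bias

uncorrected = twr_distance(measured_round, reply_time)
corrected = twr_distance(measured_round, reply_time, internal_bias=bias)
```

Under these assumptions, a 1 ns uncorrected bias shifts the estimate by about 0.15 m, which is already comparable to the accuracy typically targeted by UWB localization.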
Takamasa SHIMADA Noriko KONNO Atsuya YOKOI Noriharu MIYAHO
Visible light communication (VLC) will play a wide variety of important roles in future communication services. This paper deals with color shift keying (CSK) for the modulation of visible light communications. There are some previous studies on the psychological and physiological effects of colors. These studies implied that color offset CSKs have psychological and physiological effects that normal CSK does not have. This paper evaluates the psychological and physiological effects of color offset CSKs compared with normal CSK based on interviews and electroencephalogram (alpha wave, beta wave, and P300) measurements. This study evaluates the feasibility of visible light communication providing added value by measuring arousal, rest, visual attraction, task performance, capacity of working memory, and response for the CSK codes. The results showed that red-, green- and blue-offset CSK have specific features. Red-offset CSK induces excitement and increased wakefulness levels, attracts attention, enlarges the capacity of working memory, raises task performance, and induces fast responses. Green-offset CSK maintains rest levels, elevates relaxation levels, reduces stress, raises task performance, and induces fast responses. Blue-offset CSK maintains rest levels and induces fast responses. It is thought that color offset CSK can be used appropriately and can provide added value to its applications by considering the results of psychological and physiological investigations. Red-offset CSK is thought to be suitable for commercial advertisements. Green- and blue-offset CSK are thought to be suitable for wireless communication environments in hospitals. Red- and green-offset CSK are thought to be suitable for wireless communication environments in business. Red-, green- and blue-offset CSK are thought to be suitable for use in intelligent transportation systems (ITS).
Takanori ISOBE Kazuhiko MINEMATSU
In this paper, we analyze the security of an end-to-end encryption (E2EE) scheme of LINE, a.k.a. Letter Sealing. LINE is one of the most widely deployed instant messaging applications, especially in East Asia. By a close inspection of its protocols, we give several attacks against the message integrity of Letter Sealing. Specifically, we propose forgery and impersonation attacks on the one-to-one message encryption and the group message encryption. All of our attacks are feasible with the help of an end-to-end (E2E) adversary, who has access to the inside of the LINE server (e.g., the service provider LINE itself). We stress that the main purpose of E2EE is to provide protection against the E2E adversary. In addition, we found some attacks that do not even need the help of an E2E adversary, which shows a critical security flaw of the protocol. Our results reveal that the E2EE scheme of LINE does not sufficiently guarantee the integrity of messages compared to state-of-the-art E2EE schemes such as Signal, which is used by WhatsApp and Facebook Messenger. We also provide some countermeasures against our attacks. We have shared our findings with LINE Corporation in advance. LINE Corporation has confirmed that our attacks are valid as long as the E2E adversary is involved, and officially recognizes our results as a vulnerability of encryption break.
Shengnan YAN Mingxin LIU Jingjing SI
In cognitive radio (CR) networks, spectrum sensing is an essential task for enabling dynamic spectrum sharing. However, the problem becomes quite challenging in wideband spectrum sensing due to high sampling pressure, limited power and computing resources, and serious channel fading. To overcome these challenges, this paper proposes a distributed collaborative spectrum sensing scheme based on 1-bit compressive sensing (CS). Each secondary user (SU) performs local 1-bit CS and obtains support estimate information from the signal reconstruction. To utilize joint sparsity and achieve spatial diversity, the support estimate information across the network is fused via the average consensus technique based on distributed computation and one-hop communications. Then the fused support estimate is used as prior information to guide the next local signal reconstruction, which is implemented via our proposed weighted binary iterative hard thresholding (BIHT) algorithm. The local signal reconstruction and the distributed fusion of support information are alternately carried out until reliable spectrum detection is achieved. Simulations verify the effectiveness of our proposed scheme in distributed CR networks.
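The fusion step described above relies on average consensus, which can be sketched in a few lines of Python. Each node repeatedly moves its value toward those of its one-hop neighbors, and all nodes converge to the network-wide average without a fusion center. The ring topology, the step size, and the per-node support votes are hypothetical values; the weighted BIHT reconstruction itself is not shown.

```python
def average_consensus(values, neighbors, step=0.3, iterations=200):
    """Iterate x_i <- x_i + step * sum_j (x_j - x_i) over one-hop neighbors.

    values:    initial scalar held by each node (e.g. a support vote for
               one spectrum bin)
    neighbors: mapping from node index to the indices of its neighbors
    step:      update gain; choosing it below 1 / (maximum node degree)
               is a sufficient condition for convergence
    """
    x = list(values)
    for _ in range(iterations):
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical 4-node ring; each SU votes 1.0 if it believes the bin is occupied.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
local_votes = [1.0, 0.0, 1.0, 1.0]
fused = average_consensus(local_votes, ring)
```

In this toy run every node ends up holding the same fused value, the mean of the local votes (0.75), which could then act as a soft prior on that bin for the next local reconstruction.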
Yuta UKON Koji YAMAZAKI Koyo NITTA
Advanced information-processing services based on cloud computing are in great demand. However, users want to be able to customize cloud services for their own purposes. To provide image-processing services that can be optimized for the purpose of each user, we propose a technique for chaining image-processing functions in a CPU-field programmable gate array (FPGA) coupled server architecture. One of the most important requirements for combining multiple image-processing functions on a network is low latency in server nodes. However, a large delay occurs in the conventional CPU-FPGA architecture due to the overheads of packet reordering for ensuring the correctness of image processing and of data transfer between the CPU and FPGA at the application level. This paper presents a CPU-FPGA server architecture with a real-time packet reordering circuit for low-latency image processing. To confirm the efficiency of our idea, we evaluated the latency of histogram of oriented gradients (HOG) feature calculation as an offloaded image-processing function. The results show that the latency is about 26 times lower than that of the conventional CPU-FPGA architecture. Moreover, the throughput decreased by less than 3.7% under the worst-case condition where 90 percent of the packets are randomly swapped at a 40-Gbps input rate. Finally, we demonstrated that a real-time video monitoring service can be provided by combining image-processing functions using our architecture.
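To make the reordering requirement concrete, the following software sketch restores sequence order with a small buffer before packets reach an image-processing function. It only illustrates the behavior that the hardware circuit has to provide; the packet format (a sequence number plus a payload), the start at sequence number 0, and the absence of loss or duplication are all assumptions.

```python
import heapq


def reorder(packets):
    """Yield (seq, payload) tuples in sequence order.

    Packets that arrive early are held in a min-heap keyed by sequence
    number and released as soon as the next expected number is present.
    Assumes sequence numbers start at 0 with no loss or duplicates.
    """
    heap = []
    next_seq = 0
    for seq, payload in packets:
        heapq.heappush(heap, (seq, payload))
        while heap and heap[0][0] == next_seq:
            yield heapq.heappop(heap)
            next_seq += 1


# Hypothetical stream in which adjacent packets were swapped in the network.
arrived = [(1, "row-1"), (0, "row-0"), (3, "row-3"), (2, "row-2")]
in_order = list(reorder(arrived))
```

In software, this buffering adds per-packet overhead on the CPU, which is the kind of cost the abstract attributes to the conventional architecture.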
Takehiro NAGATO Takumi TSUTANO Tomio KAMADA Yumi TAKAKI Chikara OHTA
In this article, we propose a data framework for edge computing that allows developers to easily attain efficient data transfer between mobile devices or users. We propose a distributed key-value storage platform for edge computing and its explicit data distribution management method that follows the publish/subscribe relationships specific to applications. In this platform, edge servers organize the distributed key-value storage in a uniform namespace. To enable fast data access to a record in edge computing, the allocation strategy of the record and its cache on the edge servers is important. Our platform offers distributed objects that can dynamically change their home server and allocate cache objects proactively following user-defined rules. A rule is defined in a declarative manner and specifies where to place cache objects depending on the status of the target record and its associated records. The system can reflect record modification to the cached records immediately. We also integrate a push notification system using WebSocket to notify events on a specified table. We introduce a messaging service application between mobile appliances and several other applications to show how cache rules apply to them. We evaluate the performance of our system using some sample applications.
Yuhuan WANG Hang YIN Zhanxin YANG Yansong LV Lu SI Xinle YU
In this paper, we propose an adaptive fusion successive cancellation list decoder (ADF-SCL) for polar codes with single cyclic redundancy check. The proposed ADF-SCL decoder reasonably avoids unnecessary calculations by selecting the successive cancellation (SC) decoder or the adaptive successive cancellation list (AD-SCL) decoder depending on a log-likelihood ratio (LLR) threshold in the decoding process. Simulation results show that compared to the AD-SCL decoder, the proposed decoder can achieve significant reduction of the average complexity in the low signal-to-noise ratio (SNR) region without degradation of the performance. When Lmax=32 and Eb/N0=0.5dB, the average complexity of the proposed decoder is 14.23% lower than that of the AD-SCL decoder.
Takashi YOKOTA Kanemitsu OOTSU Takeshi OHKAWA
Inter-node communication is essential in parallel computation. The performance of parallel processing depends on the efficiency of both computation and communication; thus, the communication cost is not negligible. A parallel application program involves a logical communication structure that is determined by the interchange of data between computation nodes. Sometimes the logical communication structure does not match that of the real parallel machine. This mismatch results in large communication costs. This paper addresses the node-mapping problem, which rearranges the logical positions of nodes so that the degree of mismatch is decreased. This paper assumes that parallel programs execute one or more collective communications that follow specific traffic patterns. An appropriate node mapping achieves high communication performance. This paper proposes a strong heuristic method for solving the node-mapping problem and adapts the method to a genetic algorithm. Evaluation results reveal that the proposed method achieves considerably high performance; it achieves 8.9 (4.9) times speed-up on average in single-(two-)traffic-pattern cases in 32×32 torus networks. Specifically, for some traffic patterns in small-scale networks, the proposed method finds theoretically optimal solutions. Furthermore, this paper discusses in depth various issues in the proposed method that arise from employing a genetic algorithm, such as the population of genes, the number of generations, and traffic patterns. This paper also discusses applicability to large-scale systems for future practical use.
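The mismatch between the logical and physical structures can be illustrated with a simple cost function: the total number of hops that a traffic pattern needs on a 2D torus under a given node mapping. The hop-count metric, the 4×4 torus, and both mappings are assumptions for illustration; the paper's own evaluation metric and its heuristic and genetic-algorithm search are not reproduced here.

```python
def torus_hops(a, b, width, height):
    """Minimal hop count between coordinates a and b on a 2D torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # Wrap-around links let a packet take the shorter way around each ring.
    return min(dx, width - dx) + min(dy, height - dy)


def mapping_cost(traffic, mapping, width, height):
    """Total hops over all (src, dst) logical pairs under a node mapping."""
    return sum(
        torus_hops(mapping[src], mapping[dst], width, height)
        for src, dst in traffic
    )


# Hypothetical ring-shaped traffic pattern among four logical nodes.
traffic = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Two candidate placements of the logical nodes on a 4x4 torus.
scattered = {0: (0, 0), 1: (2, 0), 2: (0, 2), 3: (2, 2)}
compact = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}

scattered_cost = mapping_cost(traffic, scattered, 4, 4)
compact_cost = mapping_cost(traffic, compact, 4, 4)
```

A search method such as a genetic algorithm could use a cost of this kind as its fitness function, preferring mappings like `compact` over `scattered`.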
The spectrum sensing of the orthogonal frequency division multiplexing (OFDM) system in cognitive radio (CR) has always been challenging, especially for user terminals that utilize the full-duplex (FD) mode. We herein propose an advanced FD spectrum-sensing scheme that can be successfully performed even when severe self-interference is encountered from the user terminal. Based on the “classification-converted sensing” framework, the cyclostationary periodogram generated by OFDM pilots is exhibited in the form of images. These images are subsequently plugged into convolutional neural networks (CNNs) for classifications owing to the CNN's strength in image recognition. More importantly, to realize spectrum sensing against residual self-interference, noise pollution, and channel fading, we used adversarial training, where a CR-specific, modified training database was proposed. We analyzed the performances exhibited by the different architectures of the CNN and the different resolutions of the input image to balance the detection performance with computing capability. We proposed a design plan of the signal structure for the CR transmitting terminal that can fit into the proposed spectrum-sensing scheme while benefiting from its own transmission. The simulation results prove that our method has excellent sensing capability for the FD system; furthermore, our method achieves a higher detection accuracy than the conventional method.
Koichi HIRAYAMA Jun-ichiro SUGISAKA Takashi YASUI
We propose a design method for a compact long-wavelength-pass filter implemented in a two-dimensional metal-dielectric-metal (MDM) waveguide with three stubs, using a transmission line model based on a low-pass prototype filter, and present the wavelength characteristics of filters in an MDM waveguide based on 0.5- and 3.0-dB equal-ripple low-pass prototype filters.
KokSheik WONG ChuanSheng CHAN AprilPyone MAUNGMAUNG
With the massive utilization of video in every aspect of our daily lives, managing videos is crucial and demanding. The rich literature on data embedding has proven its viability in managing as well as enriching videos and other multimedia contents, but conventional methods are designed to operate in the media/compression layer. In this work, the synchronization between the audio-video and subtitle tracks within an MP4 format container is manipulated to insert data. Specifically, the data are derived from the statistics of the audio samples and video frames, and they serve as the authentication data for verification purposes. When needed, the inserted authentication data can be extracted and compared against the information computed from the received audio samples and video frames. The proposed method is lightweight because simple statistics, i.e., ‘0’ and ‘1’ at the bit stream level, are treated as the authentication data. Furthermore, unlike conventional MP4 container format-based data insertion techniques, the bit stream size remains unchanged before and after data insertion using the proposed method. The proposed authentication method can also be deployed jointly with any existing authentication technique for audio/video as long as these media can be multiplexed into a single bit stream and contained within an MP4 container. Experiments are carried out to verify the basic functionality of the proposed technique as an authentication method.
A (k,n)-visual secret sharing scheme ((k,n)-VSSS) is a method to divide a secret image into n images called shares, which enable us to restore the original image simply by stacking at least k of them without any complicated computations. In this paper, we consider a (2,2)-VSSS that shares two secret images at the same time using only two shares, and investigate methods to improve the quality of decoded images. More precisely, we consider a (2,2)-VSSS in which the first secret image is decoded by stacking the two shares in the usual way, while the second one is decoded by stacking the two shares in such a way that one of them is used reversibly. Since the shares must have some subpixels that inconsistently correspond to pixels of the secret images, the decoded pixels do not agree with the corresponding pixels of the secret images, which causes serious degradation of the quality of decoded images. To reduce such degradation, we propose several methods to construct shares that utilize an 8-neighbor Laplacian filter and halftoning. Then we show that the proposed methods can effectively improve the quality of decoded images. Moreover, we demonstrate that the proposed methods can be naturally extended to (2,2)-VSSS for RGB images.
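For readers unfamiliar with visual secret sharing, the following sketch shows the classic basic (2,2) construction for a single binary secret, in which each secret pixel becomes two subpixels per share and stacking is a pixel-wise OR. It is only the textbook baseline, not the two-secret scheme or the Laplacian-filter and halftoning methods of the paper; the function names and the two-subpixel expansion are assumptions for illustration.

```python
import random

# Subpixel patterns: 1 = black (opaque), 0 = transparent.
PATTERNS = [(0, 1), (1, 0)]


def make_shares(secret_row, rng=random):
    """Split one row of binary pixels (1 = black, 0 = white) into two shares."""
    share1, share2 = [], []
    for pixel in secret_row:
        p = rng.choice(PATTERNS)
        # White pixel: both shares get the same pattern.
        # Black pixel: the second share gets the complementary pattern.
        q = (1 - p[0], 1 - p[1]) if pixel else p
        share1.extend(p)
        share2.extend(q)
    return share1, share2


def stack(share1, share2):
    """Stacking transparencies: a subpixel is black if either share is black."""
    return [a | b for a, b in zip(share1, share2)]


secret = [1, 0, 1, 1, 0]
s1, s2 = make_shares(secret, random.Random(0))
decoded = stack(s1, s2)
```

After stacking, a black secret pixel appears as two black subpixels and a white one as a single black subpixel, so the image is visible at reduced contrast, while each share alone is a random pattern that reveals nothing about the secret.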
Ryota KAMINISHI Haruna MIYAMOTO Sayaka SHIOTA Hitoshi KIYA
This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and lightweight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We also investigated the relationship between objective measurements and EERs. The N-BWE method produced the lowest EERs on both ASV systems and obtained a lower RMS-LSD value and a higher STOI score.
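Since the results above are reported as equal error rates, a short sketch of how an EER can be estimated from verification scores may be useful. It sweeps a decision threshold over the observed scores and reports the point where the false acceptance and false rejection rates are closest. The function name and the toy scores are hypothetical, and practical toolkits usually interpolate between thresholds rather than taking the nearest one.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping the threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    FRR: fraction of target (same-speaker) trials rejected.
    FAR: fraction of non-target (impostor) trials accepted.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


# Hypothetical similarity scores from a speaker verification back end.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.5, 0.3, 0.2]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores, one target trial scores below one impostor trial, and the estimate comes out at 25%.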
Weiqing TONG Haisheng LI Guoyue CHEN
Blob detection is an important part of computer vision and a special case of region detection with important applications in image analysis. In this paper, the dilation operator of standard mathematical morphology is first extended to the order dilation operator of soft morphology, three soft morphological filters are designed by using the operator, and a novel blob detection algorithm called SMBD is proposed on that basis. SMBD is shown, both theoretically and experimentally, to have better noise robustness and blob shape detection performance than similar blob filters based on mathematical morphology such as Quoit and N-Quoit. Additionally, SMBD was compared with LoG and DoH, the most commonly used blob detectors of a different class, and it also achieved significantly better results.
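One common way to soften dilation is to replace the maximum over a window with the k-th largest value, which the following sketch implements for a flat square window. This is a simplified assumption about what an order dilation operator looks like; the exact operator, structuring elements, and filters used by SMBD are not specified above and are not reproduced here.

```python
def order_dilation(image, k=1, radius=1):
    """Replace each pixel by the k-th largest value in its square window.

    image:  2D list of grey levels
    k:      rank to select; k = 1 gives standard flat dilation (the maximum)
    radius: window half-width; the window is clipped at the image border
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            window.sort(reverse=True)
            # Clamp k so that small border windows remain valid.
            out[y][x] = window[min(k, len(window)) - 1]
    return out


# A single bright noise pixel on a dark background.
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
dilated = order_dilation(noisy, k=1)   # ordinary dilation spreads the impulse
softened = order_dilation(noisy, k=2)  # rank 2 ignores the isolated impulse
```

The example suggests why rank-based operators tend to be less sensitive to impulse noise than ordinary dilation: the isolated bright pixel fills the whole image for k = 1 but disappears for k = 2.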
To enhance cover song identification accuracy on a large music archive, a song-level feature summarization method using a multi-scale representation is proposed. Chroma n-grams are extracted at multiple scales to cope with both global and local tempo changes. We derive an index from the extracted n-grams by clustering to reduce the storage and computation required for database search. Experiments on widely used music datasets confirmed that the proposed method achieves state-of-the-art accuracy while reducing the cost of cover song search.
Huyen T. T. TRAN Trang H. HOANG Phu N. MINH Nam PHAM NGOC Truong CONG THANG
Thanks to their ability to bring immersive experiences to users, Virtual Reality (VR) technologies have been gaining popularity in recent years. A key component of VR systems is omnidirectional content, which can provide 360-degree views of scenes. However, at a given time, only a portion of the full omnidirectional content, called the viewport, is displayed according to the user's current viewing direction. In this work, we first develop Weighted-Viewport PSNR (W-VPSNR), an objective quality metric for quality assessment of omnidirectional content. The proposed metric takes into account the foveation feature of the human visual system. Then, we build a subjective database consisting of 72 stimuli with spatially varying viewport quality. By using this database, an evaluation of the proposed metric and four conventional metrics is conducted. Experimental results show that the W-VPSNR metric correlates well with the mean opinion scores (MOS) and outperforms the conventional metrics. Also, it is found that the conventional metrics do not perform well for omnidirectional content.
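The general idea of weighting pixel errors before computing PSNR can be sketched as follows. This is a generic weighted PSNR under assumed weights, not the W-VPSNR definition from the paper; the function name, the flat pixel lists, and the example weights (larger near an assumed gaze point) are all hypothetical.

```python
import math


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR in dB computed from a weighted mean squared error.

    reference, distorted: flat sequences of pixel values of equal length
    weights: non-negative per-pixel weights, e.g. larger near the gaze point
    """
    weighted_error = sum(
        w * (r - d) ** 2 for r, d, w in zip(reference, distorted, weights)
    )
    wmse = weighted_error / sum(weights)
    if wmse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / wmse)


# Hypothetical viewport row: distortion only affects the last two pixels.
ref = [100, 100, 100, 100]
dist = [100, 100, 110, 90]

uniform = weighted_psnr(ref, dist, [1, 1, 1, 1])
foveated = weighted_psnr(ref, dist, [4, 4, 1, 1])  # gaze assumed on the left
```

With the assumed gaze on the undistorted left side, the foveated score is higher than the uniform one, reflecting that errors far from the point of fixation are penalized less.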