Yoshinori TANAKA Takashi DATEKI
Efficient multiplexing of ultra-reliable and low-latency communications (URLLC) and enhanced mobile broadband (eMBB) traffic, as well as ensuring the various reliability requirements of these traffic types in 5G wireless communications, is becoming increasingly important, particularly for vertical services. Interference management techniques, such as coordinated inter-cell scheduling, can enhance reliability in dense cell deployments. However, tight inter-cell coordination necessitates frequent information exchange between cells, which limits its implementation. This paper introduces a novel RAN slicing framework based on centralized frequency-domain interference control per slice and link adaptation optimized for URLLC. The proposed framework does not require tight inter-cell coordination but can fulfill the requirements on both the decoding error probability and the delay violation probability of each packet flow. These controls are based on a power-law estimation of the lower tail of the distribution of a measured data set with a small number of discrete samples. As a design guideline, we derive the theoretical minimum radio resource size of a slice that guarantees the delay violation probability requirement. Simulation results demonstrate that the proposed RAN slicing framework can achieve the reliability targets of the URLLC slice while improving the spectrum efficiency of the eMBB slice in a well-balanced manner compared to other evaluated benchmarks.
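As an illustration of the kind of lower-tail extrapolation described above (the paper's exact estimator is not reproduced here), the sketch below fits a power law P(X <= x) ~ c * x^alpha to the smallest fraction of a positive-valued sample set in log-log space; the function names, the tail fraction, and the regression approach are assumptions made for the example.

```python
import numpy as np

def lower_tail_powerlaw(samples, tail_fraction=0.1):
    """Fit P(X <= x) ~ c * x**alpha to the lower tail of positive samples.

    Illustrative sketch only; the paper's estimator and parameters are not
    reproduced here. Samples must be strictly positive (log is taken).
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    k = max(2, int(tail_fraction * n))      # number of lower-tail samples used
    tail_x = x[:k]
    ecdf = np.arange(1, k + 1) / n          # empirical CDF at those samples
    # Linear regression in log-log space: log F = alpha * log x + log c
    alpha, log_c = np.polyfit(np.log(tail_x), np.log(ecdf), 1)
    return alpha, np.exp(log_c)

def tail_probability(x, alpha, c):
    """Extrapolated probability that a new sample falls below x."""
    return c * x ** alpha
```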
Kairi TOKUDA Takehiro SATO Eiji OKI
Mobile edge computing (MEC) is a key technology for providing services that require low latency by migrating cloud functions to the network edge. The potentially low quality of the wireless channel should be taken into account when mobile users with limited computing resources offload tasks to an MEC server. To improve transmission reliability, it is necessary to perform resource allocation in the MEC server while taking into account the current channel quality and the resource contention. Several works take a deep reinforcement learning (DRL) approach to such resource allocation. However, these approaches consider a fixed number of users offloading their tasks and do not assume a situation where the number of users varies due to user mobility. This paper proposes Deep reinforcement learning model for MEC Resource Allocation with Dummy (DMRA-D), an online learning model that addresses resource allocation in an MEC server when the number of users varies. By adopting dummy states/actions, DMRA-D keeps the state/action representation fixed. Therefore, DMRA-D can continue to learn a single model regardless of variation in the number of users during operation. Numerical results show that DMRA-D improves the success rate of task submission while continuing to learn under a varying number of users.
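A minimal sketch of the dummy state/action idea is given below: per-user features are padded to a fixed maximum number of users so that the model's input and output dimensions never change, and actions corresponding to dummy users are masked out. MAX_USERS, FEATURES_PER_USER, and the helper names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

MAX_USERS = 8           # assumed upper bound on users handled by one model
FEATURES_PER_USER = 3   # e.g., channel quality, task size, deadline (illustrative)
DUMMY_FEATURE = 0.0     # placeholder value for absent users

def build_state(active_user_features):
    """Pad per-user features with dummy entries so the state always has shape
    (MAX_USERS, FEATURES_PER_USER), regardless of how many users are currently
    offloading. Returns the flattened state and a validity mask."""
    n = len(active_user_features)
    state = np.full((MAX_USERS, FEATURES_PER_USER), DUMMY_FEATURE)
    state[:n] = active_user_features
    mask = np.zeros(MAX_USERS, dtype=bool)
    mask[:n] = True                      # True for real users, False for dummies
    return state.flatten(), mask

def mask_dummy_actions(q_values, mask):
    """Invalidate actions that would allocate resources to dummy users so the
    agent never selects them."""
    q = np.array(q_values, dtype=float)
    q[~mask] = -np.inf
    return q
```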
Daisuke KOBAYASHI Ken NAKAMURA Masaki KITAHARA Tatsuya OSAWA Yuya OMORI Takayuki ONISHI Hiroe IWASAKI
This paper describes a novel low-latency 4K 60 fps HEVC (high efficiency video coding)/H.265 multi-channel encoding system with content-aware bitrate control for live streaming. Adaptive bitrate (ABR) streaming techniques, such as MPEG-DASH (dynamic adaptive streaming over HTTP) and HLS (HTTP live streaming), are widely used for Internet video streaming. Live content has increased with the expansion of streaming services, which has led to demands for traffic reduction and low latency. To reduce network traffic, we propose content-aware, dynamic, and seamless bitrate control that supports multi-channel real-time encoding for ABR, including 4K 60 fps video. Our method further supports chunked packaging transfer to provide low-latency streaming. We adopt a hybrid architecture consisting of hardware and software processing. The system consists of multiple 4K HEVC encoder LSIs, each of which can efficiently encode one 4K 60 fps video or up to four high-definition (HD) videos with the proposed bitrate control method. The software handles the packaging process according to the various streaming protocols. Experimental results indicate that our method reduces the encoding bitrate by as much as 56.7% compared with constant bitrate encoding, and the streaming latency over MPEG-DASH is 1.77 seconds.
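The following sketch illustrates one simple form of content-aware bitrate selection: the target rate is reduced for low-complexity scenes while never exceeding the nominal CBR ladder rate. The complexity metric, floor ratio, and function name are assumptions and do not reproduce the paper's control algorithm.

```python
def content_aware_bitrate(nominal_bps, complexity, floor_ratio=0.4):
    """Illustrative content-aware rate selection (not the paper's algorithm).

    complexity: normalized scene complexity in [0, 1], e.g. derived from
    motion/texture statistics reported by the encoder.
    Returns a target bitrate that never exceeds the nominal CBR rate and is
    reduced for easy content, down to floor_ratio * nominal_bps.
    """
    complexity = min(max(complexity, 0.0), 1.0)
    ratio = floor_ratio + (1.0 - floor_ratio) * complexity
    return int(nominal_bps * ratio)

# Example: a low-complexity 4K scene encoded below the 20 Mbps ladder rate
print(content_aware_bitrate(20_000_000, complexity=0.3))  # -> 11600000 (11.6 Mbps)
```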
Koji ISHIBASHI Takanori HARA Sota UCHIMURA Tetsuya IYE Yoshimi FUJII Takahide MURAKAMI Hiroyuki SHINBO
In this paper, we propose a new radio access network (RAN) architecture for reliable millimeter-wave (mmWave) communications, which has the flexibility to meet users' diverse and fluctuating requirements in terms of communication quality. This architecture is composed of multiple radio units (RUs) connected to a common distributed unit (DU) via fronthaul links to virtually enlarge its coverage. We further present grant-free non-orthogonal multiple access (GF-NOMA) for low-latency uplink communications with a massive number of users, and robust coordinated multi-point (CoMP) transmission using blockage prediction for uplink/downlink communications with a high data rate and a guaranteed minimum data rate, as the technical pillars of the proposed RAN. The numerical results indicate that the proposed architecture can meet completely different user requirements and realize a user-centric design of the RAN for beyond 5G/6G.
Taiki YAMAE Naoki TAKEUCHI Nobuyuki YOSHIKAWA
The adiabatic quantum-flux-parametron (AQFP) is an energy-efficient superconductor logic device. In a previous study, we proposed a low-latency clocking scheme called delay-line clocking, and several low-latency AQFP logic gates have been demonstrated. In delay-line clocking, the latency between adjacent excitation phases is determined by the propagation delay of the excitation currents, and thus the rising time of the excitation currents should be sufficiently small; otherwise, an AQFP gate can switch before the previous gate is fully excited. This means that delay-line clocking requires high clock frequencies, because typical excitation currents are sinusoidal and their rising time depends on the frequency. However, AQFP circuits need to be tested experimentally over a wide frequency range. Hence, in the present study, we investigate AQFP circuits adopting delay-line clocking with square excitation currents so that delay-line clocking can be applied in a low frequency range. Square excitation currents have a shorter rising time than sinusoidal excitation currents and thus enable low-frequency operation. We demonstrate an AQFP buffer chain with delay-line clocking using square excitation currents, in which the latency is approximately 20 ps per gate, and confirm that the operating margin of the buffer chain remains sufficiently wide at clock frequencies below 1 GHz, whereas in the sinusoidal case the operating margin shrinks below 500 MHz. These results indicate that AQFP circuits adopting delay-line clocking can operate in a low frequency range by using square excitation currents.
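A rough calculation (not taken from the paper) shows why sinusoidal excitation ties the rising time to the clock frequency: for an excitation current $I(t) = I_0 \sin(2\pi f t)$, the current climbs from zero to its peak in a quarter period, so

$$ t_\mathrm{rise} \approx \frac{1}{4f}, $$

which gives roughly 0.5 ns at 500 MHz but 2.5 ns at 100 MHz, far larger than the approximately 20 ps per-gate propagation delay, whereas a square excitation keeps the edge rate independent of the clock frequency.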
Hideya SO Kazuhiko FUKAWA Hayato SOYA Yuyuan CHANG
In unlicensed spectrum, wireless communications employing carrier sense multiple access with collision avoidance (CSMA/CA) suffer from longer transmission delay times as the number of user terminals (UTs) increases, because packet collisions become more likely. To cope with this problem, this paper proposes a new multiuser detection (MUD) scheme that uses both request-to-send (RTS) and enhanced clear-to-send (eCTS) for highly reliable and low-latency wireless communications. As in conventional MUD schemes, metric-combining MUD (MC-MUD) calculates log-likelihood functions called metrics and accumulates them for maximum likelihood detection (MLD). To avoid increasing the number of states for MLD, MC-MUD forces the relevant UTs to retransmit their packets until all the collided packets are correctly detected, which requires a kind of central control and reduces the system throughput. To overcome these drawbacks, the proposed scheme, referred to as cancelling MC-MUD (CMC-MUD), deletes replicas of some of the collided packets from the received signals once those packets are correctly detected during retransmission. This cancellation enables new UTs to transmit their packets while MLD is performed without increasing the number of states, which improves the system throughput without increasing the complexity. In addition, the proposed scheme adopts RTS and eCTS. A UT that suffers from a packet collision transmits RTS before retransmission. Then, the corresponding access point (AP) transmits eCTS including the addresses of the other UTs that experienced the same packet collision. To reproduce the same packet collision, these other UTs transmit their packets once they receive the eCTS. Computer simulations under one-AP conditions evaluate the average carrier-to-interference ratio (CIR) range in which the proposed scheme is effective, and clarify that the transmission delay time of the proposed scheme is shorter than that of the conventional schemes. In two-AP environments, which can cause the hidden terminal problem, it is demonstrated that the proposed scheme achieves shorter transmission delay times than the conventional scheme with RTS and conventional CTS.
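The cancellation step can be pictured with the minimal numpy sketch below, which simply subtracts the replica of a correctly detected packet from the stored collided reception; channel estimation, metric combining, and MLD itself are omitted, and the function signature is an assumption made for illustration.

```python
import numpy as np

def cancel_detected_packet(received, channel, detected_symbols):
    """Subtract the replica of a correctly detected packet from the stored
    received signal, in the spirit of the cancellation step of CMC-MUD
    (illustrative sketch only).

    received:          complex baseband samples of the collided reception
    channel:           estimated complex channel gain of the detected user
    detected_symbols:  the user's detected (and verified) modulated symbols
    """
    replica = channel * np.asarray(detected_symbols)
    return np.asarray(received) - replica

# After cancellation, MLD is run only over the remaining undetected packets,
# so the number of MLD states does not grow when new UTs start transmitting.
```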
Krittin INTHARAWIJITR Katsuyoshi IIDA Hiroyuki KOGA Katsunori YAMAOKA
The Internet of Things (IoT), with its support for cyber-physical systems (CPS), will provide many latency-sensitive services that require very fast responses from network services. Mobile edge computing (MEC), one of the distributed computing models, is a promising component of a low-latency network architecture. In network architectures with MEC, mobile devices offload heavy computing tasks to edge servers. There are a number of studies on low-latency network architectures with MEC. However, none of the existing studies simultaneously satisfies the following: (1) guaranteeing the latency of computing tasks and (2) implementing a real system. In this paper, we design and implement an MEC-based network architecture that guarantees the latency of offloading tasks. More specifically, we first estimate the total latency, including both computing and communication latencies, at a centralized node called the orchestrator. If the estimated value exceeds the latency requirement, the task is rejected. We then evaluate its performance in terms of the blocking probability of the tasks. To analyze the results, we compare the performance obtained from experiments with that obtained from simulations. Based on the comparisons, we clarify that the accuracy of the computing latency estimation is a significant factor for this system.
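The admission rule at the orchestrator can be sketched as follows; this is an illustration of the rejection logic described in the abstract, not the authors' implementation, and the function name and example numbers are assumptions.

```python
def admit_task(comm_latency_est, comp_latency_est, latency_requirement):
    """Illustrative admission rule at the orchestrator: estimate the total
    latency of an offloaded task and reject it if the requirement cannot
    be met."""
    total = comm_latency_est + comp_latency_est
    return total <= latency_requirement

# Example: 4 ms network round trip + 7 ms estimated computing time
# against a 10 ms requirement -> the task is rejected (blocked).
print(admit_task(4e-3, 7e-3, 10e-3))  # False
```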
Krittin INTHARAWIJITR Katsuyoshi IIDA Hiroyuki KOGA Katsunori YAMAOKA
Most latency-sensitive mobile applications depend on computational resources provided by a cloud computing service. The problem with relying on cloud computing is that the physical locations of cloud servers are sometimes distant from mobile users, so the communication latency is long. As a result, the concept of a distributed cloud service, called mobile edge computing (MEC), is being introduced in the 5G network. However, MEC can reduce only the communication latency. The computing latency in MEC must also be considered to satisfy the total latency required by services. In this research, we study the impact of both latencies in an MEC architecture with regard to latency-sensitive services. We also consider a centralized model, in which a controller manages flows between users and mobile edge resources, to analyze MEC in a practical architecture. Simulations show that the interval and the controller latency cause some blocking and errors in the system. However, a permissive system that relaxes the latency constraints and chooses an edge server by the lowest total latency can improve system performance considerably.
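A sketch of the permissive selection policy described at the end of the abstract is shown below; the per-server latency estimates and the function name are assumptions made for illustration.

```python
def select_edge_server(comm_latency, comp_latency):
    """Sketch of the permissive policy in the abstract: rank candidate edge
    servers by estimated total (communication + computing) latency and pick
    the lowest, rather than rejecting on a strict per-component constraint."""
    totals = {s: comm_latency[s] + comp_latency[s] for s in comm_latency}
    return min(totals, key=totals.get)

# Example with hypothetical per-server estimates in milliseconds
print(select_edge_server({"edge1": 3, "edge2": 6},
                         {"edge1": 9, "edge2": 4}))  # -> "edge2" (total 10 ms)
```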
Satoshi IMAMURA Eiji YOSHIDA Kazuichi OE
Emerging solid state drives (SSDs) based on a next-generation memory technology have recently been released on the market. In this work, we call them low-latency SSDs because their device latency is an order of magnitude lower than that of conventional NAND flash SSDs. Although low-latency SSDs can drastically reduce the I/O latency perceived by an application, the overhead of OS processing included in the I/O latency has become noticeable because of the very low device latency. Since the OS processing is executed on a CPU core, its operating frequency should be maximized to reduce the OS overhead. However, a higher core frequency causes higher CPU power consumption during I/O accesses to low-latency SSDs. Therefore, we propose the device utilization-aware DVFS (DU-DVFS) technique, which periodically monitors the utilization of a target block device and applies dynamic voltage and frequency scaling (DVFS) to the CPU cores executing I/O-intensive processes only when the block device is fully utilized. In this case, DU-DVFS can reduce the CPU power consumption without hurting performance, because the delay of OS processing incurred by decreasing the core frequency can be hidden. Our evaluation with 28 I/O-intensive workloads on a real server containing an Intel® Optane™ SSD demonstrates that DU-DVFS reduces the CPU power consumption by 41.4% on average (up to 53.8%) with negligible performance degradation, compared to a standard DVFS governor on Linux. Moreover, the evaluation with multiprogrammed workloads composed of I/O-intensive and non-I/O-intensive programs shows that DU-DVFS is also effective for them, because it can apply DVFS only to the CPU cores executing I/O-intensive processes.
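The idea behind DU-DVFS can be sketched with the loop below, which estimates block-device utilization from the "time spent doing I/Os" counter in /proc/diskstats (the same counter iostat uses for %util) and lowers the cpufreq maximum-frequency cap of selected cores only while the device is nearly saturated. The device name, core list, frequencies, threshold, and use of scaling_max_freq are assumptions; the authors' implementation may differ, and writing to cpufreq sysfs files requires root.

```python
import time

DEVICE = "nvme0n1"        # assumed target low-latency SSD device name
CPU_LIST = [0, 1]         # assumed cores running the I/O-intensive processes
LOW_FREQ_KHZ = 1200000    # assumed reduced frequency cap
HIGH_FREQ_KHZ = 3600000   # assumed nominal maximum frequency
INTERVAL_S = 1.0
UTIL_THRESHOLD = 0.95

def io_ticks_ms(device):
    """Read the 'time spent doing I/Os' counter (ms) for a block device."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[12])
    raise ValueError(f"device {device} not found")

def set_max_freq(cpu, khz):
    """Cap the frequency of one core via the cpufreq sysfs interface."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq"
    with open(path, "w") as f:
        f.write(str(khz))

prev = io_ticks_ms(DEVICE)
while True:
    time.sleep(INTERVAL_S)
    cur = io_ticks_ms(DEVICE)
    util = (cur - prev) / (INTERVAL_S * 1000.0)   # fraction of interval busy
    prev = cur
    target = LOW_FREQ_KHZ if util >= UTIL_THRESHOLD else HIGH_FREQ_KHZ
    for cpu in CPU_LIST:
        set_max_freq(cpu, target)
```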
Hong-Thu NGUYEN Xuan-Thuan NGUYEN Cong-Kha PHAM
COordinate Rotation DIgital Computer (CORDIC) is an efficient algorithm for computing elementary functions such as trigonometric, exponential, and logarithmic functions. However, the main drawback of the conventional CORDIC is that the number of iterations is equal to the number of angle constants. Among the considerable research to overcome this disadvantage, the angle recoding method is effective because it can reduce the number of iterations by 50%. Nevertheless, the hardware architecture of this algorithm is difficult to pipeline. Therefore, a low-latency parallel pipeline hybrid adaptive CORDIC (PP-CORDIC) architecture is proposed in this paper. In this design, a hybrid architecture is exploited together with pipelining and parallel techniques to achieve low latency. The design operates at a frequency of 122.6 MHz with a latency of 8, 12, and 15 clock cycles in the best, average, and worst cases, respectively. More significantly, the worst-case latency of PP-CORDIC is 1.1X lower than that of Altera's commercial floating-point sine and cosine IP cores.
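For reference, a textbook rotation-mode CORDIC for sine and cosine is sketched below; it shows the per-iteration shift-and-add structure whose iteration count PP-CORDIC reduces and pipelines, but it does not reproduce the angle recoding or the hybrid adaptive architecture of the paper.

```python
import math

def cordic_sin_cos(theta, iterations=16):
    """Textbook rotation-mode CORDIC for sin/cos, valid for theta in
    [-pi/2, pi/2]. Baseline sketch only."""
    # Angle constants atan(2^-i) and the cumulative gain correction K
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    K = 1.0
    for i in range(iterations):
        K *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotation direction
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y * K, x * K                      # (sin(theta), cos(theta))

print(cordic_sin_cos(math.pi / 6))  # approx (0.5, 0.866)
```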
Peter GUSEV Zhehao WANG Jeff BURKE Lixia ZHANG Takahiro YONEDA Ryota OHNISHI Eiichi MURAMOTO
Named Data Networking (NDN) is a proposed future Internet architecture that shifts the fundamental abstraction of the network from host-to-host communication to request-response for named, signed data, an approach focused on information dissemination. This paper describes a general design for receiver-driven, real-time streaming data (RTSD) applications over the current NDN implementation that aims to take advantage of the architecture's unique affordances. It is based on experimental development and testing of running code for real-time video conferencing, a positional tracking system for interactive multimedia, and a distributed control system for live performance. The design includes initial approaches to minimizing latency, managing buffer size and Interest retransmission, and adapting retrieval to maximize bandwidth and control congestion. Initial implementations of these approaches are evaluated for functionality and performance, and the paper discusses the potential for future research in this area and for improved performance as new features of the architecture become available.
Juha PETÄJÄJÄRVI Heikki KARVONEN Konstantin MIKHAYLOV Aarno PÄRSSINEN Matti HÄMÄLÄINEN Jari IINATTI
This paper discusses the perspectives of using a wake-up receiver (WUR) in wireless body area network (WBAN) applications with event-driven data transfers. First, we compare the energy efficiency of a WUR-based and a duty-cycled medium access control protocol based IEEE 802.15.6 compliant WBAN. Then, we review the architectures of state-of-the-art WURs and discuss their suitability for WBANs. The presented results clearly show that the architecture based on radio frequency envelope detection features the lowest power consumption at the cost of sensitivity. The other architectures are capable of providing better sensitivity but consume more power. Finally, we propose a design modification that enables a WUR to receive control commands in addition to wake-up signals. The presented results reveal that this feature does not require complex modifications of the current architectures, but improves energy efficiency and latency for small data block transfers.
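A toy energy model (purely illustrative; the parameter values are assumptions and not from the paper) conveys why an always-on WUR can outperform a duty-cycled main radio for event-driven traffic: the duty-cycled radio pays for every periodic listening slot whether or not an event occurs, while the WUR's idle power is orders of magnitude lower.

```python
def duty_cycled_energy(p_rx_w, on_time_s, period_s, horizon_s):
    """Energy spent by a duty-cycled main radio that wakes periodically to
    listen, whether or not an event occurs (illustrative model only)."""
    wakeups = horizon_s / period_s
    return wakeups * p_rx_w * on_time_s

def wur_energy(p_wur_w, horizon_s):
    """Energy of an always-on wake-up receiver over the same horizon."""
    return p_wur_w * horizon_s

# Hypothetical numbers: 10 mW main receiver on for 2 ms every 1 s,
# versus a 10 uW always-listening WUR, over one hour.
print(duty_cycled_energy(10e-3, 2e-3, 1.0, 3600))  # 0.072 J
print(wur_energy(10e-6, 3600))                     # 0.036 J
```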
Shinichi SUZUKI Takayuki NAKAGAWA Tetsuomi IKEDA
The Millimeter-wave Mobile Camera (MiMoCam) developed by NHK STRL uses the millimeter-wave band (42 GHz/55 GHz) to transmit Hi-Vision TV pictures with high quality and low latency. Multiple-input multiple-output (MIMO) technology, which uses a number of antennas at both the transmitter and the receiver, can be adopted to transmit higher-quality Hi-Vision TV pictures. The camera was intended to be used in a studio environment with a high degree of multipath; however, there are also many requests to use the MiMoCam outdoors. This presents different channel statistics, since the camera operates in a near line-of-sight (LOS) environment with few reflected waves. We conducted an outdoor transmission test and measured the outdoor transmission performance of the proposed MIMO system to clarify the possibility of using the MiMoCam in an outdoor environment. This paper introduces the features of the MiMoCam system and the MIMO transmission technique used in the MiMoCam, and presents the findings of this outdoor test. It was also confirmed that the channel correlation of the MIMO propagation channels was suppressed by using orthogonally polarized waves and that the bit error rate (BER) characteristics with respect to the average received carrier-to-noise ratio (CNR) were improved. Finally, these results confirm the feasibility of outdoor operation of the MiMoCam.
Yasuo SUGURE Seiji TAKEUCHI Yuichi ABE Hiromichi YAMADA Kazuya HIRAYANAGI Akihiko TOMITA Kesami HAGIWARA Takeshi KATAOKA Takanori SHIMURA
A 32-bit embedded RISC microcontroller core targeted for automotive, industrial, and PC-peripheral applications has been developed to offer the smaller code size and the lower-latency instruction and interrupt processing needed for next-generation microcontrollers. The 360 MIPS/400 MFLOPS/200 MHz core, based on the Harvard bus architecture, uses 0.13/0.15-µm CMOS technology and consists of a CPU, an FPU, and register banks. To reduce the size of control programs, new instructions have been added to the instruction set. These new instructions, as well as an enhanced C compiler, produce object files about 25% smaller than those for a previously designed core. A dual-issue superscalar structure consisting of three- or five-stage pipelines provides instruction processing with low latency. The cycle performance is thus an average of 1.8 times faster than that of the previously designed core. The superscalar structure is used to save 19 CPU registers in parallel when executing interrupt processing; that is, it saves the 19 CPU registers to the register bank by accessing four registers at a time. This structure significantly improves the interrupt response time from 37 cycles to 6 cycles.
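As a rough consistency check of the figures above (an arithmetic illustration, not taken from the paper): saving 19 registers four at a time requires

$$ \left\lceil \tfrac{19}{4} \right\rceil = 5 $$

transfer cycles, which, together with roughly one additional cycle of control, is consistent with the reported 6-cycle interrupt response, compared with saving registers essentially one at a time in the 37-cycle case.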