
Keyword Search Result

[Keyword] reinforcement learning (72 hits)

Showing 1-20 of 72 hits

  • Federated Deep Reinforcement Learning for Multimedia Task Offloading and Resource Allocation in MEC Networks Open Access

    Rongqi ZHANG  Chunyun PAN  Yafei WANG  Yuanyuan YAO  Xuehua LI  

     
    PAPER-Network | Vol. E107-B No.6 | pp. 446-457

    With the maturation of 5G technology in recent years, multimedia services such as live video streaming and online gaming have flourished on the Internet. These services frequently require low latency, which poses a significant challenge for executing latency-sensitive multimedia tasks. Mobile edge computing (MEC) is considered a key technology for addressing this challenge: it offloads computation-intensive tasks to edge servers deployed close to mobile nodes, which reduces task execution latency and relieves the computing pressure on multimedia devices. To use the MEC paradigm reasonably and efficiently, resource allocation becomes a new challenge. In this paper, we focus on multimedia tasks that need to be uploaded and processed in the network. We formulate an optimization problem whose goal is to minimize the latency and energy consumption required to perform tasks on multimedia devices. To solve this complex, non-convex problem, we cast it as a distributed deep reinforcement learning (DRL) problem and propose a federated Dueling deep Q-network (DDQN) based multimedia task offloading and resource allocation algorithm (FDRL-DDQN). In the algorithm, DRL is trained on each local device, while federated learning (FL) aggregates and updates the parameters of the trained local models. Further, to address the non-independent and identically distributed (non-IID) data problem of multimedia devices, we develop a method for selecting the participating federated devices. The simulation results show that the FDRL-DDQN algorithm reduces the total cost by 31.3% compared to the DQN algorithm when the task data size is 1000 kbit, and by up to 35.3% compared to traditional baseline algorithms.
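
    For illustration, the federated aggregation step described above (local DRL training plus server-side averaging) can be sketched as follows. This is a minimal FedAvg-style sketch: the flattened-weight representation, the function names, and the weighting by per-device sample count are illustrative assumptions, not the paper's exact procedure.

    ```python
    # Minimal federated-averaging sketch for FDRL-DDQN (assumption: each
    # device's locally trained Dueling-DQN weights are flattened to a vector).
    import numpy as np

    def fedavg(local_weights, sample_counts):
        """Average local parameters, weighting each device by its data volume."""
        total = sum(sample_counts)
        return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

    # Toy usage: three devices with different (non-IID) data volumes.
    local = [np.random.randn(10) for _ in range(3)]
    counts = [100, 300, 50]
    global_weights = fedavg(local, counts)  # broadcast back to the devices
    ```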

  • Resource Allocation for Mobile Edge Computing System Considering User Mobility with Deep Reinforcement Learning

    Kairi TOKUDA  Takehiro SATO  Eiji OKI  

     
    PAPER-Network | Publicized 2023/10/06 | Vol. E107-B No.1 | pp. 173-184

    Mobile edge computing (MEC) is a key technology for providing services that require low latency by migrating cloud functions to the network edge. The potentially low quality of the wireless channel must be considered when mobile users with limited computing resources offload tasks to an MEC server. To improve transmission reliability, resource allocation in the MEC server must take into account the current channel quality and the resource contention. Several works take a deep reinforcement learning (DRL) approach to such resource allocation. However, these approaches consider a fixed number of users offloading their tasks and do not assume a situation where the number of users varies due to user mobility. This paper proposes Deep reinforcement learning model for MEC Resource Allocation with Dummy (DMRA-D), an online learning model that addresses resource allocation in an MEC server when the number of users varies. By adopting dummy states/actions, DMRA-D keeps the state/action representation fixed, so it can continue to train a single model regardless of variation in the number of users during operation. Numerical results show that DMRA-D improves the success rate of task submission while continuing to learn as the number of users varies.
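
    A minimal sketch of the dummy state/action idea, assuming the state is one feature row per user padded up to a fixed maximum so the model's input/output sizes never change; the constants and shapes below are illustrative, not taken from the paper.

    ```python
    # Sketch of dummy state/action padding (MAX_USERS and FEATURES are
    # illustrative constants; the paper's exact state design may differ).
    import numpy as np

    MAX_USERS, FEATURES = 8, 4

    def pad_state(user_features):
        """Pad the active users' feature rows with dummy zeros to a fixed shape."""
        state = np.zeros((MAX_USERS, FEATURES))
        state[:len(user_features)] = user_features
        return state.flatten()            # fixed-length input for the model

    def mask_dummy_actions(q_values, n_active):
        """Make actions that target dummy user slots unselectable."""
        q = np.array(q_values, dtype=float)
        q[n_active:] = -np.inf            # never pick a dummy action
        return q
    ```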

  • Reinforcement Learning for Multi-Agent Systems with Temporal Logic Specifications

    Keita TERASHIMA  Koichi KOBAYASHI  Yuh YAMASHITA  

     
    PAPER | Publicized 2023/07/19 | Vol. E107-A No.1 | pp. 31-37

    In a multi-agent system, it is important to design cooperative actions that achieve a common goal. In this paper, we propose two novel multi-agent reinforcement learning methods in which the control specification representing the common goal is described by linear temporal logic formulas. First, we propose a simple method extended directly from the single-agent case; this method suffers from technical issues caused by the increase in the number of agents. Next, to overcome these issues, we propose a new method that introduces an aggregator. Finally, the two methods are compared through numerical simulations, with a surveillance problem as an example.

  • Minimization of Energy Consumption in TDMA-Based Wireless-Powered Multi-Access Edge Computing Networks

    Xi CHEN  Guodong JIANG  Kaikai CHI  Shubin ZHANG  Gang CHEN  Jiang LIU  

     
    PAPER-Communication Theory and Signals | Publicized 2023/06/19 | Vol. E106-A No.12 | pp. 1544-1554

    Many nodes in the Internet of Things (IoT) rely on batteries for power, while the demand for executing compute-intensive and latency-sensitive tasks on IoT nodes keeps increasing. In some practical scenarios, the computation tasks of wireless devices (WDs) are non-separable, so binary offloading strategies must be used. In this paper, we focus on designing an efficient binary offloading algorithm that minimizes system energy consumption (EC) for TDMA-based wireless-powered multi-access edge computing networks, where WDs either compute tasks locally or offload them to hybrid access points (H-APs). We formulate the EC minimization problem, which is non-convex, and decompose it into a master problem that optimizes the binary offloading decisions and a subproblem that optimizes the wireless power transfer (WPT) duration and the task offloading transmission durations. For the master problem, a deep reinforcement learning (DRL) based method is applied to obtain a near-optimal offloading decision. For the subproblem, we first consider the scenario in which the nodes have no completion-time constraints and obtain the optimal analytical solution. We then consider the scenario with such constraints; by jointly using the golden section method and the bisection method, the optimal solution can be obtained thanks to the convexity of the constraint function. Simulation results show that the proposed DRL-based offloading algorithm achieves near-minimal EC.
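
    For reference, a generic golden-section search of the kind the authors combine with bisection on the convex subproblem; this is a textbook routine, not the paper's full solution, and the example objective is made up.

    ```python
    # Golden-section search: minimize a unimodal function f on [a, b].
    import math

    def golden_section_min(f, a, b, tol=1e-6):
        inv_phi = (math.sqrt(5) - 1) / 2           # 1/phi, about 0.618
        c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
        while b - a > tol:
            if f(c) < f(d):
                b, d = d, c                        # minimum lies in [a, d]
                c = b - inv_phi * (b - a)
            else:
                a, c = c, d                        # minimum lies in [c, b]
                d = a + inv_phi * (b - a)
        return (a + b) / 2

    # Example: a convex energy-like curve with its minimum at t = 2.
    print(golden_section_min(lambda t: (t - 2.0) ** 2 + 1.0, 0.0, 5.0))
    ```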

  • Joint Virtual Network Function Deployment and Scheduling via Heuristics and Deep Reinforcement Learning

    Zixiao ZHANG  Eiji OKI  

     
    PAPER-Network | Publicized 2023/08/01 | Vol. E106-B No.12 | pp. 1424-1440

    This paper introduces heuristic approaches and a deep reinforcement learning approach to solve a joint virtual network function deployment and scheduling problem in a dynamic scenario. We formulate the problem as an optimization problem and, based on its mathematical description, introduce three heuristic approaches and a deep reinforcement learning approach to solve it. Our objective is to maximize the ratio of delay-satisfied requests while minimizing the average resource cost in a dynamic scenario. The two greedy approaches we introduce are named finish-time greedy and computational-resource greedy: the finish-time greedy approach makes each request finish as soon as possible regardless of its resource cost, while the computational-resource greedy approach makes each request occupy as few resources as possible regardless of its finish time (see the sketch below). Our simulated annealing approach generates feasible solutions randomly and converges to an approximate solution. In our learning-based approach, neural networks are trained to make decisions. We evaluate the introduced approaches in a simulated environment. Numerical results show that the introduced deep reinforcement learning approach performs best in terms of benefit in our examined cases.
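
    The finish-time greedy rule reduces to a one-line argmin; here is a hedged sketch in which estimate_finish and the placement representation are hypothetical stand-ins, not the paper's data structures.

    ```python
    # Finish-time greedy sketch: place each request where it finishes
    # earliest, ignoring resource cost (estimate_finish is a hypothetical
    # cost model supplied by the caller).
    def finish_time_greedy(request, placements, estimate_finish):
        return min(placements, key=lambda p: estimate_finish(request, p))

    # The computational-resource greedy variant swaps the key for a
    # resource-cost estimate, trading later finish times for fewer resources.
    ```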

  • A Lightweight Reinforcement Learning Based Packet Routing Method Using Online Sequential Learning

    Kenji NEMOTO  Hiroki MATSUTANI  

     
    PAPER-Computer System | Publicized 2023/08/15 | Vol. E106-D No.11 | pp. 1796-1807

    Existing simple routing protocols (e.g., OSPF, RIP) are inflexible and prone to congestion because packets concentrate on particular routers. To address these issues, packet routing methods using machine learning have been proposed recently. Compared to the simple protocols, machine-learning-based methods can choose routing paths intelligently by learning efficient routes, but they suffer from training-time overhead. We therefore focus on a lightweight machine learning algorithm, OS-ELM (Online Sequential Extreme Learning Machine), to reduce the training time. Although previous work on reinforcement learning using OS-ELM exists, it suffers from low learning accuracy. In this paper, we propose OS-ELM QN (Q-Network) with a prioritized experience replay buffer to improve the learning performance. It is compared to a deep reinforcement learning based packet routing method using a network simulator. Experimental results show that introducing the experience replay buffer improves the learning performance, and OS-ELM QN achieves a 2.33x speedup over a DQN (Deep Q-Network) in terms of learning speed. Regarding packet transfer latency, OS-ELM QN is comparable or slightly inferior to the DQN, while both outperform OSPF in most cases because they can distribute congestion.
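
    A simplified proportional prioritized replay buffer of the kind added to OS-ELM QN; this sketch omits the sum-tree and importance-sampling corrections of the full technique and is purely illustrative.

    ```python
    # Proportional prioritized experience replay (simplified sketch).
    import random

    class PrioritizedReplay:
        def __init__(self, capacity, alpha=0.6):
            self.capacity, self.alpha = capacity, alpha
            self.data, self.priorities = [], []

        def push(self, transition, td_error):
            if len(self.data) >= self.capacity:      # drop the oldest entry
                self.data.pop(0); self.priorities.pop(0)
            self.data.append(transition)
            self.priorities.append((abs(td_error) + 1e-5) ** self.alpha)

        def sample(self, batch_size):
            # Sample transitions proportionally to their priority.
            return random.choices(self.data, weights=self.priorities,
                                  k=batch_size)
    ```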

  • Performance Analysis and Optimization of Worst Case User in CoMP Ultra Dense Networks

    Sinh Cong LAM  

     
    PAPER-Wireless Communication Technologies | Publicized 2023/03/27 | Vol. E106-B No.10 | pp. 979-986

    In a cellular system, the Worst Case User (WCU), whose distances to the three nearest base stations (BSs) are similar, usually achieves the lowest performance. Improving user performance, especially for the WCU, is a major problem for both network designers and operators. This paper studies the WCU under the Stretched Pathloss Model (SPLM) in two respects: coverage probability analysis using stochastic geometry, and data rate optimization under a transmission power constraint using reinforcement learning. In the analysis, only fast fading from the WCU to the serving BSs is considered, which yields a lower bound on the coverage probability. Furthermore, the paper assumes that the Coordinated Multi-Point (CoMP) technique is employed only for the WCU, to enhance its downlink signal while avoiding an explosion of Intercell Interference (ICI). Through analysis and simulation, the paper shows that in poor wireless environments, increasing the transmission power can improve WCU performance, whereas in good environments, deploying advanced techniques such as Joint Transmission (JT), Joint Scheduling (JS), and reinforcement learning is the more suitable solution.

  • Dynamic VNF Scheduling: A Deep Reinforcement Learning Approach

    Zixiao ZHANG  Fujun HE  Eiji OKI  

     
    PAPER-Network | Publicized 2023/01/10 | Vol. E106-B No.7 | pp. 557-570

    This paper introduces a deep reinforcement learning approach to solve the virtual network function scheduling problem in dynamic scenarios. We formulate an integer linear programming model for the problem in static scenarios; for dynamic scenarios, we define the state, action, and reward to form the learning approach. The learning agents apply the asynchronous advantage actor-critic algorithm: we assign a master agent and several worker agents to each network function virtualization node, and the worker agents work in parallel to help the master agent make decisions. We compare the introduced approach with existing approaches, including three greedy approaches, a simulated annealing approach, and an integer linear programming approach, by applying them in simulated environments. The numerical results show that the introduced deep reinforcement learning approach improves the performance by 6-27% in our examined cases.
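
    The update the worker agents compute in parallel can be summarized by the one-step advantage actor-critic loss; this is a generic PyTorch sketch of that loss, not the authors' exact network or hyperparameters.

    ```python
    # One-step advantage actor-critic loss (generic sketch; the 0.5 critic
    # coefficient is illustrative).
    import torch

    def a2c_loss(log_prob, value, reward, next_value, gamma=0.99):
        target = reward + gamma * next_value.detach()   # bootstrapped TD target
        advantage = target - value
        policy_loss = -log_prob * advantage.detach()    # actor term
        value_loss = advantage.pow(2)                   # critic regression term
        return policy_loss + 0.5 * value_loss
    ```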

  • UE Set Selection for RR Scheduling in Distributed Antenna Transmission with Reinforcement Learning Open Access

    Go OTSURU  Yukitoshi SANADA  

     
    PAPER-Wireless Communication Technologies | Publicized 2023/01/13 | Vol. E106-B No.7 | pp. 586-594

    In this paper, user equipment (UE) set selection in the allocation sequences of round-robin (RR) scheduling is proposed for distributed antenna transmission with block diagonalization (BD) pre-coding. Prior research investigated the initial phase selection of UE allocation sequences in RR scheduling; the performance of such RR scheduling is inferior to that of proportional fair (PF) scheduling under severe intra-cell interference. In this paper, multiple-input multiple-output technology with BD pre-coding is applied, and UE sets in the allocation sequences are eliminated with reinforcement learning. After an RR allocation sequence is modified, no estimated-throughput calculation is required for UE set selection. Numerical results obtained through computer simulation show that maximum selection, one of the criteria for initial phase selection, outperforms weighted PF scheduling within a restricted region in terms of computational complexity, fairness, and throughput.

  • Semantic Path Planning for Indoor Navigation Tasks Using Multi-View Context and Prior Knowledge

    Jianbing WU  Weibo HUANG  Guoliang HUA  Wanruo ZHANG  Risheng KANG  Hong LIU  

     
    PAPER-Positioning and Navigation | Publicized 2022/01/20 | Vol. E106-D No.5 | pp. 756-764

    Recently, deep reinforcement learning (DRL) methods have significantly improved the performance of target-driven indoor navigation tasks. However, the rich semantic information of environments is still not fully exploited by previous approaches. In addition, existing methods tend to overfit to training scenes or target objects, making it hard to generalize to unseen environments. Human beings adapt easily to new scenes because they recognize the objects they see and reason about the likely locations of target objects from experience. Inspired by this, we propose a DRL-based target-driven navigation model, termed MVC-PK, that uses Multi-View Context information and Prior semantic Knowledge. It relies only on the semantic label of target objects and allows the robot to find the target without any geometric map. To perceive semantic contextual information in the environment, object detectors are leveraged to detect the objects present in multi-view observations. To enable the semantic reasoning ability of indoor mobile robots, a Graph Convolutional Network is employed to incorporate prior knowledge. The proposed MVC-PK model is evaluated in the AI2-THOR simulation environment. The results show that MVC-PK (1) significantly improves cross-scene and cross-target generalization, and (2) achieves state-of-the-art performance with 15.2% and 11.0% increases in Success Rate (SR) and Success weighted by Path Length (SPL), respectively.
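
    The prior-knowledge component is a Graph Convolutional Network; a single generic GCN layer looks like the following Kipf-and-Welling-style sketch, which does not reproduce the paper's graph construction.

    ```python
    # One graph-convolution layer (generic sketch). adj_norm is assumed to be
    # the symmetrically normalized adjacency D^-1/2 (A+I) D^-1/2 over an
    # object-relation graph; x holds per-node features.
    import torch
    import torch.nn as nn

    class GCNLayer(nn.Module):
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, x, adj_norm):
            # Aggregate neighbor features, then transform and activate.
            return torch.relu(self.linear(adj_norm @ x))
    ```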

  • Edge Computing Resource Allocation Algorithm for NB-IoT Based on Deep Reinforcement Learning

    Jiawen CHU  Chunyun PAN  Yafei WANG  Xiang YUN  Xuehua LI  

     
    PAPER-Network | Publicized 2022/11/04 | Vol. E106-B No.5 | pp. 439-447

    Mobile edge computing (MEC) technology helps guarantee the privacy and security of large-scale data in the Narrowband-IoT (NB-IoT) by deploying MEC servers near base stations, providing sufficient computing, storage, and data processing capacity to meet the delay and energy consumption requirements of NB-IoT terminal equipment. For the NB-IoT MEC system, this paper proposes a resource allocation algorithm based on deep reinforcement learning to optimize the total cost of task offloading and execution. Since the formulated problem is a mixed-integer non-linear program (MINLP), we cast it as a multi-agent distributed deep reinforcement learning (DRL) problem and address it using a dueling deep Q-network algorithm. Simulation results show that, compared with the deep Q-network and the all-local and all-offload baseline algorithms, the proposed algorithm can effectively guarantee the success rates of task offloading and execution. In addition, when the execution task volume is 200 kbit, the total system cost of the proposed algorithm is reduced by at least 1.3%, and when the execution task volume is 600 kbit, the total cost of executing system tasks is reduced by up to 16.7%.
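
    The dueling architecture at the heart of the algorithm splits the Q-value into state-value and advantage streams; below is a minimal generic PyTorch head, not the authors' exact network.

    ```python
    # Minimal dueling Q-network head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a).
    import torch
    import torch.nn as nn

    class DuelingHead(nn.Module):
        def __init__(self, in_dim, n_actions):
            super().__init__()
            self.value = nn.Linear(in_dim, 1)              # V(s) stream
            self.advantage = nn.Linear(in_dim, n_actions)  # A(s, a) stream

        def forward(self, features):
            v, a = self.value(features), self.advantage(features)
            # Subtracting the mean advantage keeps V and A identifiable.
            return v + a - a.mean(dim=-1, keepdim=True)
    ```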

  • Deep Reinforcement Learning Based Ontology Meta-Matching Technique

    Xingsi XUE  Yirui HUANG  Zeqing ZHANG  

     
    PAPER-Core Methods | Publicized 2022/03/04 | Vol. E106-D No.5 | pp. 635-643

    Ontologies are regarded as the solution to data heterogeneity on the Semantic Web (SW), but they suffer from a heterogeneity problem of their own, which leads to ambiguity in data information. The Ontology Meta-Matching (OMM) technique can solve the ontology heterogeneity problem by aggregating various similarity measures to find heterogeneous entities. Inspired by the success of Reinforcement Learning (RL) in solving complex optimization problems, this work proposes an RL-based OMM technique to address the ontology heterogeneity problem. First, we propose a novel RL-based OMM framework; then, a neural network called the evaluated network is proposed to replace the Q-table when choosing the agent's next action, which reduces memory consumption and computing time. After that, to better guide the training of the neural network and improve the accuracy of the RL agent, we establish a memory bank to mine in-depth information during the evaluated network's training procedure, and we use another neural network, called the target network, to store the historical parameters. The experiments use the well-known benchmark in the ontology matching domain to test our approach's performance, and comparisons with Deep Reinforcement Learning (DRL), RL, and state-of-the-art ontology matching systems show that our approach can effectively determine high-quality alignments.
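
    The evaluated/target network interplay described above is standard DQN-style machinery; a hedged sketch, assuming PyTorch modules, of the periodic synchronization that keeps TD targets stable.

    ```python
    # Periodically copy the evaluated network's parameters into the frozen
    # target network (generic DQN-style sketch, not the paper's exact code).
    def sync_target(evaluated_net, target_net):
        target_net.load_state_dict(evaluated_net.state_dict())

    # Typical loop (illustrative): every C learning steps, call
    # sync_target(...) so targets lag behind the fast-moving evaluated net.
    ```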

  • SPSD: Semantics and Deep Reinforcement Learning Based Motion Planning for Supermarket Robot

    Jialun CAI  Weibo HUANG  Yingxuan YOU  Zhan CHEN  Bin REN  Hong LIU  

     
    PAPER-Positioning and Navigation | Publicized 2022/09/15 | Vol. E106-D No.5 | pp. 765-772

    Robot motion planning is an important part of the unmanned supermarket. The challenges of motion planning in supermarkets lie in the diversity of the supermarket environment, the complexity of obstacle movement, and the vastness of the search space. This paper proposes an adaptive search and path planning method based on Semantic information and Deep reinforcement learning (SPSD), which effectively improves the autonomous decision-making ability of supermarket robots. First, built on a deep reinforcement learning (DRL) backbone, supermarket robots process real-time information from multi-modal sensors to realize high-speed, collision-free motion planning. Meanwhile, to address the uncertainty of the reward in deep reinforcement learning, common spatial semantic relationships between landmarks and target objects are exploited to define the reward function. Finally, dynamics randomization is introduced to improve the generalization performance of the algorithm during training. The experimental results show that the SPSD algorithm performs well on three indicators: generalization performance, training time, and path-planning length. Compared with other methods, the training time of SPSD is reduced by up to 27.42%, the path-planning length is reduced by up to 21.08%, and the trained network of SPSD can be applied to unfamiliar scenes safely and efficiently. The results are motivating enough to consider applying the proposed method in practical scenes. A video of the experimental results is available at https://www.youtube.com/watch?v=h1wLpm42NZk.
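
    Dynamics randomization typically means resampling simulator parameters each training episode so the policy does not overfit one dynamics model; the parameter names, ranges, and env API below are hypothetical illustrations, not the paper's setup.

    ```python
    # Dynamics randomization sketch: resample physics parameters per episode
    # (names and ranges are illustrative; env.reset(...) is a hypothetical API).
    import random

    def sample_dynamics():
        return {
            "friction":     random.uniform(0.5, 1.5),
            "robot_mass":   random.uniform(0.8, 1.2),   # relative to nominal
            "sensor_noise": random.uniform(0.0, 0.05),
        }

    # for episode in range(n_episodes):
    #     env.reset(**sample_dynamics())   # new dynamics every episode
    #     ... run the DRL training step ...
    ```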

  • Adversarial Reinforcement Learning-Based Coordinated Robust Spatial Reuse in Broadcast-Overlaid WLANs

    Yuto KIHIRA  Yusuke KODA  Koji YAMAMOTO  Takayuki NISHIO  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies | Publicized 2022/08/02 | Vol. E106-B No.2 | pp. 203-212

    Broadcast services for wireless local area networks (WLANs) are being standardized in the IEEE 802.11 task group bc. Envisaging the upcoming coexistence of broadcast access points (APs) with densely deployed legacy APs, this paper addresses learning-based spatial reuse with only partial receiver-awareness, meaning that the broadcast APs can leverage only a few acknowledgment frames (ACKs) from recipient stations (STAs). This reflects a specific concern of broadcast communications: with a very large number of STAs, ACK implosion occurs unless some STAs are prevented from responding with ACKs. Given this, the main contribution of this paper is to demonstrate the feasibility of improving the robustness of learning-based spatial reuse against hidden interferers using only partial receiver-awareness, while avoiding any re-training of broadcast APs. The core idea is to leverage robust adversarial reinforcement learning (RARL): before a hidden interferer is installed, a broadcast AP learns a rate adaptation policy in competition with a proxy interferer that provides jamming signals intelligently. Therein, the recipient STAs experience interference and the subset of reporting STAs provides feedback that overestimates the effect of interference, allowing the broadcast AP to select a data rate that avoids frame losses across a broad range of recipient STAs. Simulations demonstrate that throughput degradation is suppressed under the sudden installation of a hidden interferer, indicating the feasibility of acquiring robustness to hidden interferers.
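
    The RARL training alternation can be sketched as two policies optimized in turns; this skeleton only shows the control flow, and update_policy is a hypothetical stub supplied by the caller, not a real library call.

    ```python
    # RARL-style alternation sketch: the broadcast AP (protagonist) and a
    # proxy jammer (adversary) improve against each other in turns.
    def rarl_train(protagonist, adversary, env, update_policy,
                   iterations=10, steps=100):
        for _ in range(iterations):
            # 1) improve the AP's rate-adaptation policy vs. a fixed jammer
            for _ in range(steps):
                update_policy(protagonist, env, opponent=adversary,
                              maximize=True)
            # 2) improve the jammer vs. the fixed AP policy
            for _ in range(steps):
                update_policy(adversary, env, opponent=protagonist,
                              maximize=False)
    ```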

  • Particle Filter Design Based on Reinforcement Learning and Its Application to Mobile Robot Localization

    Ryota YOSHIMURA  Ichiro MARUTA  Kenji FUJIMOTO  Ken SATO  Yusuke KOBAYASHI  

     
    PAPER-Artificial Intelligence, Data Mining | Publicized 2022/01/28 | Vol. E105-D No.5 | pp. 1010-1023

    Particle filters have been widely used for state estimation in nonlinear and non-Gaussian systems. Their performance depends on the given system and measurement models, which must be designed by the user for each target system. This paper proposes a novel method to design these models for a particle filter. It is a numerical optimization method in which the particle filter design process is interpreted within the framework of reinforcement learning by assigning the randomness included in both models of the particle filter to the policy of reinforcement learning. In this method, estimation by the particle filter is performed repeatedly, and the parameters that determine both models are gradually updated according to the estimation results. The advantage is that the method can optimize various objective functions, such as the estimation accuracy of the particle filter, the variance of the particles, the likelihood of the parameters, and a regularization term on the parameters. We derive conditions that guarantee the optimization calculation converges with probability 1. Furthermore, to show that the proposed method applies to practical-scale problems, we design a particle filter for mobile robot localization, an essential technology for autonomous navigation. Numerical simulations demonstrate that the proposed method further improves localization accuracy compared to the conventional method.
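
    For context, one predict-weight-resample step of a bootstrap particle filter, the estimator whose two models the method tunes; propagate and likelihood are generic stand-ins for the designed system and measurement models.

    ```python
    # One bootstrap particle filter step (generic sketch). particles is an
    # (N, d) NumPy array; propagate samples the system model; likelihood
    # returns p(z | particle) under the measurement model.
    import numpy as np

    def pf_step(particles, weights, propagate, likelihood, z):
        particles = propagate(particles)                 # predict
        weights = weights * likelihood(z, particles)     # measurement update
        weights = weights / weights.sum()
        # Multinomial resampling to fight weight degeneracy.
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        return particles[idx], np.full(len(particles), 1.0 / len(particles))
    ```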

  • Deep Coalitional Q-Learning for Dynamic Coalition Formation in Edge Computing

    Shiyao DING  Donghui LIN  

     
    PAPER | Publicized 2021/12/14 | Vol. E105-D No.5 | pp. 864-872

    With the rapid growth of computation requirements in the Internet of Things, resource-limited edge servers usually need to cooperate to perform tasks. Most related studies assume a static cooperation approach, which might not suit the dynamic environment of edge computing. In this paper, we consider a dynamic cooperation approach that guides edge servers to form coalitions dynamically. This raises two issues: 1) how to guide the servers to form coalitions optimally, and 2) how to cope with the dynamics, in which server statuses change as tasks are performed. The coalitional Markov decision process (CMDP) model proposed in our previous work handles these issues well; however, its basic solution, coalitional Q-learning, cannot handle the large-scale problems that arise in edge computing when the number of tasks is large. Our response is a novel algorithm called deep coalitional Q-learning (DCQL). We first formulate the dynamic cooperation problem of edge servers as a CMDP: each edge server is regarded as an agent, and the dynamic process is modeled as an MDP in which the agents observe the current state to form several coalitions. Each coalition takes an action that affects the environment, which transitions to the next state, and the process repeats. We then propose DCQL, which incorporates a deep neural network and can therefore cope well with large-scale problems. DCQL guides the edge servers to form coalitions dynamically with the aim of optimizing a given objective. Furthermore, we run experiments to verify the proposed algorithm's effectiveness in different settings.

  • Multi-Agent Reinforcement Learning for Cooperative Task Offloading in Distributed Edge Cloud Computing

    Shiyao DING  Donghui LIN  

     
    PAPER | Publicized 2021/12/28 | Vol. E105-D No.5 | pp. 936-945

    Distributed edge cloud computing is an important computation infrastructure for the Internet of Things (IoT), and its task offloading problem has attracted much attention recently. Most existing work on task offloading in distributed edge cloud computing assumes that each self-interested user owns one edge server and chooses whether to execute its tasks locally or offload them to cloud servers; the goal of each edge server is to maximize its own interest, such as low delay cost, which corresponds to a non-cooperative setting. However, with the strong development of smart IoT communities such as smart hospitals and smart factories, all edge and cloud servers can belong to one organization, such as a technology company. This corresponds to a cooperative setting in which the goal of the organization is to maximize the team interest of the overall edge cloud computing system. In this paper, we consider a new problem called cooperative task offloading, in which all edge servers cooperate to make the entire system achieve good performance such as low delay cost and low energy cost. This problem is hard to solve due to two issues: 1) each edge server's status changes dynamically and task arrival is uncertain; 2) each edge server can observe only its own status, which makes it hard to optimize the team interest since global information is unavailable. To address these issues, we formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP), which handles the dynamic features under partial observations well. We then apply a multi-agent reinforcement learning algorithm called value decomposition network (VDN) and propose a VDN-based task offloading algorithm (VDN-TO). Specifically, a team value function evaluates the team interest and is decomposed into individual value functions, one per edge server; each edge server then updates its individual value function in the direction that maximizes the team interest. Finally, we evaluate our algorithm on part of a real dataset, and the results show its effectiveness in comparison with other existing methods.
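
    The core of a value decomposition network is that the team Q-value is the sum of per-agent Q-values, so every agent can be trained from a shared team reward; a minimal PyTorch sketch of that mixing, not the paper's full VDN-TO algorithm.

    ```python
    # VDN mixing sketch: team Q is the sum of each agent's chosen-action Q.
    import torch

    def team_q(individual_qs):
        """individual_qs: list of per-agent chosen-action Q tensors."""
        return torch.stack(individual_qs, dim=0).sum(dim=0)

    # TD target on the *team* value; gradients flow back to every agent's net:
    # loss = (team_reward + gamma * team_q(next_qs).detach()
    #         - team_q(qs)) ** 2
    ```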

  • Improved Metric Function for AlphaSeq Algorithm to Design Ideal Complementary Codes for Multi-Carrier CDMA Systems

    Shucong TIAN  Meng YANG  Jianpeng WANG  Rui WANG  Avik R. ADHIKARY  

     
    LETTER-Communication Theory and Signals | Publicized 2021/11/15 | Vol. E105-A No.5 | pp. 901-905

    AlphaSeq is a new paradigm for designing sequences with desired properties based on deep reinforcement learning (DRL). In this work, we propose a new metric function and a new reward function to design an improved version of AlphaSeq. We show analytically, and also through numerical simulations, that the proposed algorithm can discover sequence sets with preferable properties faster than the previous algorithm.

  • A Reinforcement Learning Method for Optical Thin-Film Design Open Access

    Anqing JIANG  Osamu YOSHIE  

     
    PAPER-Optoelectronics | Publicized 2021/08/24 | Vol. E105-C No.2 | pp. 95-101

    Machine learning, especially deep learning, is dramatically changing the methods associated with optical thin-film inverse design. The vast majority of this research has focused on parameter optimization (layer thickness and structure size) of optical thin-films; a challenging problem that remains is automated material search. In this work, we propose a new end-to-end algorithm for optical thin-film inverse design that combines unsupervised learning, reinforcement learning, and a genetic algorithm to design an optical thin-film without any human intervention. Furthermore, through several concrete examples, we show how this technique can be used to optimize the spectra of a multi-layer solar absorber device.

  • A Reinforcement Learning Approach for Self-Optimization of Coverage and Capacity in Heterogeneous Cellular Networks

    Junxuan WANG  Meng YU  Xuewei ZHANG  Fan JIANG  

     
    PAPER-Antennas and Propagation | Publicized 2021/04/13 | Vol. E104-B No.10 | pp. 1318-1327

    Heterogeneous networks (HetNets) are emerging as an inevitable means of tackling the capacity crunch of cellular networks. Due to the complicated network environment and the large number of configurable parameters, coverage and capacity optimization (CCO) is a challenging issue in heterogeneous cellular networks. By combining a self-optimizing algorithm for radio frequency (RF) parameters with a power control mechanism for small cells, this paper addresses the CCO problem of self-organizing networks. First, the optimization of RF parameters is solved with reinforcement learning (RL), where the base station is modeled as an agent that learns effective strategies for controlling the tunable parameters by interacting with the surrounding environment. Second, each small cell can autonomously change its wireless transmission state by comparing its distance from the user equipment with the virtual cell size. Simulation results show that the proposed algorithm achieves better user throughput than various conventional methods.
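
    The base-station agent's learning scheme reduces, in its simplest form, to the tabular Q-learning update; the state/action encoding in this sketch is purely illustrative and not the paper's parameterization.

    ```python
    # Tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a')
    # - Q(s,a)). Q is a dict of dicts mapping state -> action -> value.
    def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
        best_next = max(Q[s_next].values())
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

    # Toy usage with a two-state, two-action table:
    Q = {s: {a: 0.0 for a in ("up", "down")} for s in ("s0", "s1")}
    q_update(Q, "s0", "up", r=1.0, s_next="s1")
    ```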
