The search functionality is under construction.

Keyword Search Result

[Keyword] IT(16991hit)

21-40hit(16991hit)

  • Chinese Spelling Correction Based on Knowledge Enhancement and Contrastive Learning Open Access

    Hao WANG  Yao MA  Jianyong DUAN  Li HE  Xin LI  

     
    PAPER-Natural Language Processing

      Publicized:
    2024/05/17
      Vol:
    E107-D No:9
      Page(s):
    1264-1273

    Chinese Spelling Correction (CSC) is an important natural language processing task. Existing methods for CSC mostly utilize BERT models, which select a character from a candidate list to correct errors in the sentence. World knowledge refers to structured information and relationships spanning a wide range of domains and subjects, while definition knowledge pertains to textual explanations or descriptions of specific words or concepts. Both forms of knowledge have the potential to enhance a model’s ability to comprehend contextual nuances. Because BERT lacks sufficient guidance from world knowledge for error correction and existing models overlook the rich definition knowledge in Chinese dictionaries, the performance of spelling correction models is compromised. To address these issues, the world knowledge network in this study injects world knowledge from knowledge graphs into the model to assist in correcting spelling errors caused by a lack of world knowledge. Additionally, the definition knowledge network improves the error correction capability by utilizing definitions from a Chinese dictionary through a contrastive learning approach. Experimental results on the SIGHAN benchmark dataset validate the effectiveness of our approach.

  • TIG: A Multitask Temporal Interval Guided Framework for Key Frame Detection Open Access

    Shijie WANG  Xuejiao HU  Sheng LIU  Ming LI  Yang LI  Sidan DU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2024/05/17
      Vol:
    E107-D No:9
      Page(s):
    1253-1263

    Detecting key frames in videos has garnered substantial attention in recent years. It is a point-level task with deep research value and application prospects in daily life. For instance, video surveillance systems, video cover generation, and highlight moment flashback all demand key frame detection. However, the task is beset by challenges such as the sparsity of key frame instances, imbalances between target frames and background frames, and the absence of a post-processing method. In response to these problems, we introduce a novel and effective Temporal Interval Guided (TIG) framework to precisely localize specific frames. The framework incorporates a proposed Point-Level-Soft non-maximum suppression (PLS-NMS) post-processing algorithm, which is suitable for point-level tasks and is facilitated by a well-designed confidence score decay function. Furthermore, we propose a TIG-loss, which is sensitive to the temporal interval from the target frame, to optimize the two-stage framework. The proposed method can be broadly applied to key frame detection in video understanding, including action start detection and static video summarization. Extensive experiments validate the efficacy of our approach on the action start detection benchmark datasets THUMOS’14 and ActivityNet v1.3, where it reaches state-of-the-art performance. Competitive results are also demonstrated on the SumMe and TVSum datasets for deep-learning-based static video summarization.

  • Type-Enhanced Ensemble Triple Representation via Triple-Aware Attention for Cross-Lingual Entity Alignment Open Access

    Zhishuo ZHANG  Chengxiang TAN  Xueyan ZHAO  Min YANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2024/05/22
      Vol:
    E107-D No:9
      Page(s):
    1182-1191

    Entity alignment (EA) is a crucial task for integrating cross-lingual and cross-domain knowledge graphs (KGs), which aims to discover entities referring to the same real-world object from different KGs. Most existing embedding-based methods generate entity representations for alignment by mining the relevance of triple elements, paying little attention to triple indivisibility and entity role diversity. In this paper, a novel framework named TTEA (Type-enhanced Ensemble Triple Representation via Triple-aware Attention for Cross-lingual Entity Alignment) is proposed to overcome these shortcomings from the perspective of ensemble triple representation, considering triple specificity and the diversity of entity roles. Specifically, the ensemble triple representation is derived by regarding the relation as an information carrier between the semantic and type spaces, so that the influence of noise during spatial transformation and information propagation can be smoothly controlled via specificity-aware triple attention. Moreover, the role diversity of triple elements is modeled via triple-aware entity enhancement in TTEA for EA-oriented entity representation. Extensive experiments on three real-world cross-lingual datasets demonstrate that our framework achieves competitive results.

  • Using Genetic Algorithm and Mathematical Programming Model for Ambulance Location Problem in Emergency Medical Service Open Access

    Batnasan LUVAANJALBA  Elaine Yi-Ling WU  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2024/05/08
      Vol:
    E107-D No:9
      Page(s):
    1123-1132

    Emergency Medical Services (EMS) play a crucial role in healthcare systems, managing pre-hospital or out-of-hospital emergencies from the onset of an emergency call to the patient’s arrival at a healthcare facility. The design of an efficient ambulance location model is pivotal in enhancing survival rates, controlling morbidity, and preventing disability. Key factors in the classical models typically include travel time, demand zones, and the number of stations. While urban EMS systems have received extensive examination due to their centralized populations, rural areas pose distinct challenges. These include lower population density and longer response distances, contributing to a higher fatality rate due to sparse population distribution, limited EMS stations, and extended travel times. To address these challenges, we introduce a novel mathematical model that aims to optimize coverage and equity. A distinctive feature of our model is the integration of equity within the objective function, coupled with a focus on practical response time that includes the period required for personal protective equipment procedures, ensuring the model’s applicability and realism in emergency response scenarios. We tackle the proposed problem using a tailored genetic algorithm and propose a greedy algorithm for solution construction. The implementation of our tailored Genetic Algorithm promises efficient and effective EMS solutions, potentially enhancing emergency care and health outcomes in rural communities.

  • Permissionless Blockchain-Based Sybil-Resistant Self-Sovereign Identity Utilizing Attested Execution Secure Processors Open Access

    Koichi MORIYAMA  Akira OTSUKA  

     
    INVITED PAPER

      Publicized:
    2024/04/15
      Vol:
    E107-D No:9
      Page(s):
    1112-1122

    This article describes the idea of utilizing Attested Execution Secure Processors (AESPs) to build a secure Self-Sovereign Identity (SSI) system that satisfies Sybil-resistance under permissionless blockchains. Today’s circumstances, which require people to be online more, have encouraged us to address privacy-preserving digital identity. There is momentum in research addressing SSI, and many researchers approach blockchain technology as a foundation. SSI brings natural persons various benefits, such as owning control; on the other hand, digital identity systems in the real world require Sybil-resistance to comply with Anti-Money-Laundering (AML) and other needs. The main idea in our proposal is to utilize AESPs for three reasons: the first is the use of attested execution capability along with tamper-resistance, which is a strong assumption; the second is power and flexibility, allowing various open-source programs to be executed within a secure enclave; and the third is that equipping mobile devices with hardware-assisted security has become the norm. Rafael Pass et al.’s formal abstraction of AESPs and the ideal functionality $\mathcal{G}_\mathtt{att}$ enable us to formulate mathematically how hardware-assisted security works for privacy-preserving secure digital identity systems under permissionless blockchains. Our proposal of the AESP-based SSI architecture and system protocols, $\Pi^{\mathcal{G}_\mathtt{att}}$, demonstrates the advantages of building a proper SSI system that satisfies the Sybil-resistance requirement. Because they assume AESPs, the protocols may eliminate the online distributed committee assumed in other research, such as CanDID; thus, $\Pi^{\mathcal{G}_\mathtt{att}}$ does not need to rely on multi-party computation (MPC), bringing drastic flexibility and efficiency compared with existing SSI systems.

  • Computer-Aided Design of Cross-Voltage-Domain Energy-Optimized Tapered Buffers Open Access

    Zhibo CAO  Pengfei HAN  Hongming LYU  

     
    PAPER-Electronic Circuits

      Publicized:
    2024/04/09
      Vol:
    E107-C No:9
      Page(s):
    245-254

    This paper introduces a computer-aided low-power design method for tapered buffers that addresses given load capacitances, output transition times, and source impedances. Cross-voltage-domain tapered buffers, which involve a low-voltage domain in the front stages and a high-voltage domain in the later stages, are further discussed; they break the trade-off between energy dissipation and driving capability in conventional designs. A dedicated analytical model is proposed for the level shifter, an essential circuit block. The energy-optimized tapered buffer design is verified for different source and load conditions in a 180-nm CMOS process. The single-VDD buffer model achieves an average inaccuracy of 8.65% on the transition loss compared with SPICE simulation results. Cross-voltage tapered buffers can be optimized to reduce the energy consumption considerably further. The study finds wide application in energy-efficient switching-mode analog circuits.

  • Reduced Peripheral Leakage Current in Pin Photodetectors of Ge on n+-Si by P+ Implantation to Compensate Surface Holes Open Access

    Koji ABE  Mikiya KUZUTANI  Satoki FURUYA  Jose A. PIEDRA-LORENZANA  Takeshi HIZAWA  Yasuhiko ISHIKAWA  

     
    BRIEF PAPER

      Publicized:
    2024/05/15
      Vol:
    E107-C No:9
      Page(s):
    237-240

    A reduced dark leakage current, without degrading the near-infrared responsivity, is reported for a vertical pin structure of Ge photodiodes (PDs) on n+-Si substrate, which usually shows a leakage current higher than PDs on p+-Si. The peripheral/surface leakage, the dominant leakage in PDs on n+-Si, is significantly suppressed by globally implanting P+ in the i-Si cap layer protecting the fragile surface of i-Ge epitaxial layer before locally implanting B+/BF2+ for the top p+ region of the pin junction. The P+ implantation compensates free holes unintentionally induced due to the Fermi level pinning at the surface/interface of Ge. By preventing the hole conduction from the periphery to the top p+ region under a negative/reverse bias, a reduction in the leakage current of PDs on n+-Si is realized.

  • Digital/Analog-Operation of Hf-Based FeNOS Nonvolatile Memory Utilizing Ferroelectric Nondoped HfO2 Blocking Layer Open Access

    Shun-ichiro OHMI  

     
    PAPER

      Publicized:
    2024/06/03
      Vol:
    E107-C No:9
      Page(s):
    232-236

    In this research, we investigated digital/analog operation utilizing ferroelectric nondoped HfO2 (FeND-HfO2) as a blocking layer (BL) in the Hf-based metal/oxide/nitride/oxide/Si (MONOS) nonvolatile memory (NVM), the so-called FeNOS NVM. The Al/HfN0.5/HfN1.1/HfO2/p-Si(100) FeNOS diodes realized a small equivalent oxide thickness (EOT) of 4.5 nm with a density of interface states (Dit) of $5.3 \times 10^{10}$ eV$^{-1}$cm$^{-2}$, which are suitable for high-speed and low-voltage operation. The flat-band voltage (VFB) was well controlled in steps of 80-100 mV with input pulses of ±3 V/100 ms, controlled by the partial polarization of the FeND-HfO2 BL at each 2-bit state operated by charge injection with input pulses of +8 V/1-100 ms.

  • Modulation Recognition of Communication Signals Based on Cascade Network Open Access

    Yanli HOU  Chunxiao LIU  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E107-B No:9
      Page(s):
    620-626

    To improve the recognition rate of the end-to-end modulation recognition method based on deep learning, a modulation recognition method for communication signals based on a cascade network is proposed. The cascade is composed of two networks: a Stacked Denoising Auto Encoder (SDAE) network and a DCELDNN (Dilated Convolution, ECA Mechanism, Long Short-Term Memory, Deep Neural Networks) network. The SDAE network is used to denoise the data, reconstruct the input data through encoding and decoding, and extract deep information from the data. The DCELDNN network is constructed based on the CLDNN (Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks) network. In the DCELDNN network, dilated convolution is used instead of normal convolution to enlarge the receptive field and extract signal features, the Efficient Channel Attention (ECA) mechanism is introduced to enhance the expression ability of the features, and the feature vector information is integrated by a Global Average Pooling (GAP) layer, so that signal features are extracted efficiently. Finally, end-to-end classification and recognition of communication signals is realized. The test results on the RadioML2018.01a dataset show that the average recognition accuracy of the proposed method reaches 63.1% at SNRs of -10 to 15 dB. Compared with the CNN, LSTM, and CLDNN models, the recognition accuracy is improved by 25.8%, 12.3%, and 4.8%, respectively, at 10 dB SNR.

  • A Novel 3D Non-Stationary Vehicle-to-Vehicle Channel Model with Circular Arc Motions Open Access

    Zixv SU  Wei CHEN  Yuanyuan YANG  

     
    PAPER-Antennas and Propagation

      Vol:
    E107-B No:9
      Page(s):
    607-619

    In this paper, a cluster-based three-dimensional (3D) non-stationary vehicle-to-vehicle (V2V) channel model with circular arc motions and antenna rotations is proposed. The channel model simulates the complex urban communication scenario where clusters move with arbitrary velocities and directions. A novel cluster evolution algorithm with time-array consistency is developed to capture the non-stationarity. For time evolution, the birth-and-death (BD) property of clusters, including birth, death, and rebirth, is taken into account. Additionally, a visibility region (VR) method is proposed for array evolution, which is verified to be applicable to circular motions. Based on the Taylor expansion formula, a detailed derivation of the space-time correlation function (ST-CF) with circular arc motions is shown. Statistical properties including the ST-CF, Doppler power spectrum density (PSD), quasi-stationary interval, instantaneous Doppler frequency, root mean square delay spread (RMS-DS), delay PSD, and angular PSD are derived and analyzed. According to the simulated results, the non-stationarity in the time, space, delay, and angular domains is captured. The presented results show that the motion modes (linear as well as circular), the dynamic property of the scattering environment, and the velocity of the vehicle all have significant impacts on the statistical properties.

  • Stop-Probability-Based Network Topology Discovery Method Open Access

    Yuguang ZHANG  Zhiyong ZHANG  Wei ZHANG  Deming MAO  Zhihong RAO  

     
    PAPER-Network

      Vol:
    E107-B No:9
      Page(s):
    583-594

    Using a limited number of probes has always been a focus in interface-level network topology probing to discover complete network topologies. Stop-set-based network topology probing methods significantly reduce the number of probes sent but suffer from the side effect of incomplete topology information discovery. This study proposes an optimized probing method based on stop probabilities (SPs) that builds on existing stop-set-based network topology discovery methods to address the issue of incomplete topology information owing to multipath routing. The statistics of repeat nodes (RNs) and multipath routing on the Internet are analyzed and combined with the principles of stop-set-based probing methods, highlighting that stopping probing at the first RN compromises the completeness of topology discovery. To address this issue, SPs are introduced to adjust the stopping strategy upon encountering RNs during probing. A method is designed for generating SPs that achieves high completeness and low cost based on the distribution of the number of RNs. Simulation experiments demonstrate that the proposed stop-probability-based probing method almost completely discovers network nodes and links across different regions and times over a two-year period, while significantly reducing probing redundancy. In addition, the proposed approach balances and optimizes the trade-off between complete topology discovery and reduced probing costs compared with existing topology probing methods. Building on this, the factors influencing the probing cost of the proposed method and methods to further reduce the number of probes while ensuring completeness are analyzed. The proposed method yields universally applicable SPs in the current Internet environment.

  • A Novel Frequency Hopping Prediction Model Based on TCN-GRU Open Access

    Chen ZHONG  Chegnyu WU  Xiangyang LI  Ao ZHAN  Zhengqiang WANG  

     
    LETTER-Intelligent Transport System

      Publicized:
    2024/04/19
      Vol:
    E107-A No:9
      Page(s):
    1577-1581

    A novel temporal convolution network-gated recurrent unit (NTCN-GRU) algorithm is proposed for the greatest of constant false alarm rate (GO-CFAR) frequency hopping (FH) prediction, integrating GRU and Bayesian optimization (BO). GRU efficiently captures the semantic associations among long FH sequences and mitigates gradient vanishing and explosion. In addition, BO improves the extraction of data features by optimizing hyperparameters. Simulations demonstrate that the proposed algorithm effectively reduces the loss in the training process, greatly improves the FH prediction effect, and outperforms the existing FH sequence prediction model. The model runtime is also reduced by three-quarters compared with other FH sequence prediction models.

  • Adaptive Output Feedback Leader-Following in Networks of Linear Systems Using Switching Logic Open Access

    Sungryul LEE  

     
    LETTER-Systems and Control

      Publicized:
    2024/05/13
      Vol:
    E107-A No:9
      Page(s):
    1565-1569

    This study explores adaptive output feedback leader-following in networks of linear systems utilizing switching logic. A local state observer is employed to estimate the true state of each agent within the network. The proposed protocol is based on the estimated states obtained from neighboring agents and employs a switching logic to tune its adaptive gain by utilizing only local neighboring information. The proposed leader-following protocol is fully distributed because it has a distributed adaptive gain and relies on only local information from its neighbors. Consequently, compared to conventional adaptive protocols, the proposed design method provides the advantages of a very simple adaptive law and dynamics with a low dimension.

  • Enhanced Radar Emitter Recognition with Virtual Adversarial Training: A Semi-Supervised Framework Open Access

    Ziqin FENG  Hong WAN  Guan GUI  

     
    PAPER-Neural Networks and Bioengineering

      Publicized:
    2024/05/15
      Vol:
    E107-A No:9
      Page(s):
    1534-1541

    Radar emitter identification (REI) is a crucial function of electronic radar warfare support systems. The challenge emphasizes identifying and locating unique transmitters, avoiding potential threats, and preparing countermeasures. Due to the remarkable effectiveness of deep learning (DL) in uncovering latent features within data and performing classifications, deep neural networks (DNNs) have seen widespread application in REI. In many real-world scenarios, obtaining a large number of annotated radar transmitter samples for training identification models is essential yet challenging. Given the issues of insufficient labeled datasets and abundant unlabeled training datasets, we propose a novel REI method based on a semi-supervised learning (SSL) framework with virtual adversarial training (VAT). Specifically, two objective functions are designed to extract the semantic features of radar signals: a cross-entropy loss computed for labeled samples and a virtual adversarial training loss computed for all samples. Additionally, a pseudo-labeling approach is employed for unlabeled samples. The proposed VAT-based SS-REI method is evaluated on a radar dataset. Simulation results indicate that it outperforms the latest SS-REI method in recognition performance.

  • DETrack: Multi-Object Tracking Algorithm Based on Feature Decomposition and Feature Enhancement Open Access

    Feng WEN  Haixin HUANG  Xiangyang YIN  Junguang MA  Xiaojie HU  

     
    PAPER-Neural Networks and Bioengineering

      Publicized:
    2024/04/22
      Vol:
    E107-A No:9
      Page(s):
    1522-1533

    Multi-object tracking (MOT) algorithms are typically classified as one-shot or two-step algorithms. The one-shot MOT algorithm is widely studied and applied due to its fast inference speed. However, one-shot algorithms include the two sub-tasks of detection and re-ID, which have conflicting directions for model optimization, thus limiting tracking performance. Additionally, MOT algorithms often suffer from serious ID switching issues, which can degrade tracking. To address these challenges, this study proposes the DETrack algorithm, which consists of feature decomposition and feature enhancement modules. The feature decomposition module exploits the differences and correlations between the tasks to resolve the conflict: it mitigates the competition between the detection and re-ID tasks while enhancing their cooperation. The feature enhancement module improves feature quality and alleviates the problem of target ID switching. Experimental results demonstrate that DETrack improves multi-object tracking performance while reducing the number of ID switches. The designed method of feature decomposition and feature enhancement can significantly enhance target tracking effectiveness.

  • Outsider-Anonymous Broadcast Encryption with Keyword Search: Generic Construction, CCA Security, and with Sublinear Ciphertexts Open Access

    Keita EMURA  Kaisei KAJITA  Go OHTAKE  

     
    PAPER-Cryptography and Information Security

      Publicized:
    2024/02/26
      Vol:
    E107-A No:9
      Page(s):
    1465-1477

    As a multi-receiver variant of public key encryption with keyword search (PEKS), broadcast encryption with keyword search (BEKS) has been proposed (Attrapadung et al. at ASIACRYPT 2006/Chatterjee-Mukherjee at INDOCRYPT 2018). Unlike broadcast encryption, no receiver anonymity is considered because the test algorithm takes a set of receivers as input and thus a set of receivers needs to be contained in a ciphertext. In this paper, we propose a generic construction of BEKS from anonymous and weakly robust 3-level hierarchical identity-based encryption (HIBE). The proposed generic construction provides outsider anonymity, where an adversary is allowed to obtain secret keys of outsiders who do not belong to the challenge sets, and provides sublinear-size ciphertext in terms of the number of receivers. Moreover, the proposed construction considers security against chosen-ciphertext attack (CCA) where an adversary is allowed to access a test oracle in the searchable encryption context. The proposed generic construction can be seen as an extension to the Fazio-Perera generic construction of anonymous broadcast encryption (PKC 2012) from anonymous and weakly robust identity-based encryption (IBE) and the Boneh et al. generic construction of PEKS (EUROCRYPT 2004) from anonymous IBE. We run the Fazio-Perera construction on the first-level identity and the Boneh et al. generic construction on the second-level identity, i.e., a keyword is regarded as a second-level identity. The third-level identity is used for providing CCA security by employing one-time signatures. We also introduce weak robustness in the HIBE setting, and demonstrate that the Abdalla et al. generic transformation (TCC 2010/JoC 2018) for providing weak robustness to IBE works for HIBE with an appropriate parameter setting. We also explicitly introduce attractive concrete instantiations of the proposed generic construction from pairings and lattices, respectively.

  • Dispersion in a Polygon Open Access

    Tetsuya ARAKI  Shin-ichi NAKANO  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2024/03/11
      Vol:
    E107-A No:9
      Page(s):
    1458-1464

    The dispersion problem is a variant of facility location problems, which have been extensively studied. Given a polygon with n edges on a plane, we want to find k points in the polygon so that the minimum pairwise Euclidean distance of the k points is maximized. We call the problem the k-dispersion problem in a polygon. Intuitively, for an island, we want to locate k drone bases far away from each other in flying distance to avoid congestion in the sky. In this paper, we give a polynomial-time approximation scheme (PTAS) for this problem when k is a constant and ε < 1 (where ε is a positive real number). Our proposed algorithm runs in $O(((1/\varepsilon)^2 + n/\varepsilon)^k)$ time with 1/(1 + ε) approximation, the first PTAS developed for this problem. Additionally, we consider three variations of the dispersion problem and design a PTAS for each of them.
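
    The objective stated in this abstract can be illustrated with a small brute-force sketch. It is not the paper's PTAS: it assumes a finite list of candidate points that already lie inside the polygon (how they are sampled is left out), and the function names are hypothetical.

```python
from itertools import combinations
from math import dist


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two of the given points."""
    return min(dist(p, q) for p, q in combinations(points, 2))


def best_dispersion(candidates, k):
    """Try every k-subset of the candidates (k >= 2) and keep the one whose
    minimum pairwise distance is largest. Exponential in k; illustration only."""
    return max(combinations(candidates, k), key=min_pairwise_distance)


# Assumed candidates: the corners and the center of a unit square.
candidates = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
chosen = best_dispersion(candidates, 4)
```

    With these candidates and k = 4, the four corners are selected, since including the center would reduce the minimum pairwise distance.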

  • Rectangle-of-Influence Drawings of Five-Connected Plane Graphs Open Access

    Kazuyuki MIURA  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2024/02/09
      Vol:
    E107-A No:9
      Page(s):
    1452-1457

    A rectangle-of-influence drawing of a plane graph G is a straight-line planar drawing of G such that there is no vertex in the proper inside of the axis-parallel rectangle defined by the two ends of any edge. In this paper, we show that any given 5-connected plane graph G with five or more vertices on the outer face has a rectangle-of-influence drawing in an integer grid such that W + H ≤ n - 2, where n is the number of vertices in G, W is the width and H is the height of the grid.
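
    The emptiness condition in this definition can be checked mechanically. The sketch below is only a checker under assumptions: vertex positions are given as a dictionary, planarity of the drawing is not verified, and the function name is hypothetical. It does not construct a drawing and is not the paper's algorithm.

```python
def satisfies_rectangle_condition(pos, edges):
    """Return True if, for every edge, no other vertex lies strictly inside
    the axis-parallel rectangle spanned by the edge's two endpoints."""
    for u, v in edges:
        (x1, y1), (x2, y2) = pos[u], pos[v]
        lo_x, hi_x = min(x1, x2), max(x1, x2)
        lo_y, hi_y = min(y1, y2), max(y1, y2)
        for w, (x, y) in pos.items():
            if w in (u, v):
                continue
            # Strict inequalities: points on the boundary are allowed.
            if lo_x < x < hi_x and lo_y < y < hi_y:
                return False
    return True


# Vertex "c" sits strictly inside the rectangle spanned by edge ("a", "b").
bad = satisfies_rectangle_condition({"a": (0, 0), "b": (2, 2), "c": (1, 1)}, [("a", "b")])
# Moving "c" to the rectangle's corner leaves the interior empty.
good = satisfies_rectangle_condition({"a": (0, 0), "b": (2, 2), "c": (2, 0)}, [("a", "b")])
```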

  • International Competition on Graph Counting Algorithms 2023 Open Access

    Takeru INOUE  Norihito YASUDA  Hidetomo NABESHIMA  Masaaki NISHINO  Shuhei DENZUMI  Shin-ichi MINATO  

     
    INVITED PAPER-Algorithms and Data Structures

      Publicized:
    2024/01/15
      Vol:
    E107-A No:9
      Page(s):
    1441-1451

    This paper reports on the details of the International Competition on Graph Counting Algorithms (ICGCA) held in 2023. The graph counting problem is to count the subgraphs satisfying specified constraints on a given graph. The problem belongs to #P-complete, a computationally tough class. Since many essential systems in modern society, e.g., infrastructure networks, are often represented as graphs, graph counting algorithms are a key technology to efficiently scan all the subgraphs representing the feasible states of the system. In the ICGCA, contestants were asked to count the paths on a graph under a length constraint. The benchmark set included 150 challenging instances, emphasizing graphs resembling infrastructure networks. Eleven solvers were submitted and ranked by the number of benchmarks correctly solved within a time limit. The winning solver, TLDC, was designed based on three fundamental approaches: backtracking search, dynamic programming, and model counting or #SAT (a counting version of Boolean satisfiability). Detailed analyses show that each approach has its own strengths, and one approach is unlikely to dominate the others. The codes and papers of the participating solvers are available: https://afsa.jp/icgca/.
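
    Backtracking search, one of the three approaches named above, can be sketched on a toy instance. The sketch assumes the task is to count simple paths between two given vertices using at most a given number of edges; the exact competition rules and the winning solver's design are not reproduced here, and the function name is hypothetical.

```python
def count_paths(adj, source, target, max_len):
    """Count simple paths from source to target that use at most max_len edges,
    by exhaustive backtracking over an adjacency-list graph."""
    def dfs(v, visited, length):
        if v == target:
            return 1
        if length == max_len:
            return 0
        total = 0
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                total += dfs(w, visited, length + 1)
                visited.remove(w)  # backtrack
        return total

    return dfs(source, {source}, 0)


# Complete graph on four vertices as a toy instance.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
counts = [count_paths(k4, 0, 3, limit) for limit in (1, 2, 3)]
```

    The running time grows exponentially with the graph size, which is consistent with the hardness noted in the abstract and with the use of dynamic programming and model counting by competitive solvers.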

  • Improved Just Noticeable Difference Model Based Algorithm for Fast CU Partition in V-PCC Open Access

    Zhi LIU  Heng WANG  Yuan LI  Hongyun LU  Hongyuan JING  Mengmeng ZHANG  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2024/04/05
      Vol:
    E107-D No:8
      Page(s):
    1101-1104

    In video-based point cloud compression (V-PCC), the partitioning of the Coding Unit (CU) has ultra-high computational complexity. The Just Noticeable Difference (JND) model is an effective metric to guide this process. However, this paper finds that the performance of the traditional JND model is degraded in V-PCC. For the attribute video, the pixel-filling operation reduces the brightness perception capability of the JND model. For the geometric video, the depth-filling operation degrades the depth perception capability of depth-based JND models (JNDD) in the boundary area. In this paper, a joint JND model (J_JND) is proposed for the attribute video to improve the brightness perception capacity, and an occupancy-map-guided JNDD model (O_JNDD) is proposed for the geometric video to improve the depth difference estimation accuracy at the boundaries. Based on the two improved JND models, a fast V-PCC CU partitioning algorithm is proposed with adaptive CU depth prediction. The experimental results show that the proposed algorithm eliminates 27.46% of the total coding time at the cost of only 0.36% and 0.75% Bjontegaard Delta rate increments under the geometry Point-to-Point (D1) error and the attribute Luma Peak Signal-to-Noise Ratio (PSNR), respectively.
