IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Y(22735hit)

121-140hit(22735hit)

FSAMT: Face Shape Adaptive Makeup Transfer Open Access
Haoran LUO Tengfei SHAO Shenglei LI Reiko HISHIYAMA

PAPER-Image Recognition, Computer Vision

Publicized: 2024/04/02
Vol: E107-D No:8
Page(s): 1059-1069
Makeup transfer is the process of applying the makeup style from one picture (reference) to another (source), allowing for the modification of characters’ makeup styles. To meet the diverse makeup needs of individuals or samples, the makeup transfer framework should accurately handle various makeup degrees, ranging from subtle to bold, and exhibit intelligence in adapting to the source makeup. This paper introduces a “3-level” adaptive makeup transfer framework, addressing facial makeup through two sub-tasks: 1. Makeup adaptation, utilizing feature descriptors and eyelid curve algorithms to classify 135 organ-level face shapes; 2. Makeup transfer, achieved by learning the reference picture from three branches (color, highlight, pattern) and applying it to the source picture. The proposed framework, termed “Face Shape Adaptive Makeup Transfer” (FSAMT), demonstrates superior results in makeup transfer output quality, as confirmed by experimental results.
MDX-Mixer: Music Demixing by Leveraging Source Signals Separated by Existing Demixing Models Open Access
Tomoyasu NAKANO Masataka GOTO

PAPER-Music Information Processing

Publicized: 2024/04/05
Vol: E107-D No:8
Page(s): 1079-1088
This paper presents MDX-Mixer, which improves music demixing (MDX) performance by leveraging source signals separated by multiple existing MDX models. Deep-learning-based MDX models have improved their separation performances year by year for four kinds of sound sources: “vocals,” “drums,” “bass,” and “other”. Our research question is whether mixing (i.e., weighted sum) the signals separated by state-of-the-art MDX models can obtain either the best of everything or higher separation performance. Previously, in singing voice separation and MDX, there have been studies in which separated signals of the same sound source are mixed with each other using time-invariant or time-varying positive mixing weights. In contrast to those, this study is novel in that it allows for negative weights as well and performs time-varying mixing using all of the separated source signals and the music acoustic signal before separation. The time-varying weights are estimated by modeling the music acoustic signals and their separated signals by dividing them into short segments. In this paper we propose two new systems: one that estimates time-invariant weights using 1×1 convolution, and one that estimates time-varying weights by applying the MLP-Mixer layer proposed in the computer vision field to each segment. The latter model is called MDX-Mixer. Their performances were evaluated based on the source-to-distortion ratio (SDR) using the well-known MUSDB18-HQ dataset. The results show that the MDX-Mixer achieved higher SDR than the separated signals given by three state-of-the-art MDX models.
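The idea of mixing separated signals with weights that may be negative can be illustrated with a minimal sketch. This is not the authors' method: the weights here are time-invariant and fitted by ordinary least squares against a known reference (an oracle setting assumed purely for illustration), and the signals `vocals`, `drums`, `est_a`, `est_b`, and `est_drums` are hypothetical synthetic data.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Source-to-distortion ratio in dB: reference energy over error energy."""
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

def fit_mixing_weights(separated, reference):
    """Least-squares time-invariant weights (assumed stand-in for learned weights).

    separated has shape (num_signals, num_samples); weights may be negative.
    """
    weights, *_ = np.linalg.lstsq(separated.T, reference, rcond=None)
    return weights

rng = np.random.default_rng(0)
n = 4000
vocals = rng.standard_normal(n)   # hypothetical target source
drums = rng.standard_normal(n)    # hypothetical interfering source

# Hypothetical outputs of existing separation models.
est_a = vocals + 0.5 * drums + 0.1 * rng.standard_normal(n)  # "vocals" with drum leakage
est_b = 0.9 * vocals + 0.3 * rng.standard_normal(n)          # noisier "vocals"
est_drums = drums + 0.1 * rng.standard_normal(n)             # separated "drums"

separated = np.stack([est_a, est_b, est_drums])
weights = fit_mixing_weights(separated, vocals)
mixed = weights @ separated

print("weights:", np.round(weights, 3))
print("SDR est_a:", round(sdr_db(vocals, est_a), 2), "dB")
print("SDR est_b:", round(sdr_db(vocals, est_b), 2), "dB")
print("SDR mixed:", round(sdr_db(vocals, mixed), 2), "dB")
```

In this toy setup the weight on the separated "drums" signal comes out negative, because subtracting it cancels the drum leakage in `est_a`. That is the kind of gain a positive-only mixture cannot obtain.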
Tracking WebVR User Activities through Hand Motions: An Attack Perspective Open Access
Jiyeon LEE

LETTER-Human-computer Interaction

Publicized: 2024/04/16
Vol: E107-D No:8
Page(s): 1089-1092
With the rapid advancement of graphics processing units (GPUs), Virtual Reality (VR) experiences have significantly improved, enhancing immersion and realism. However, these advancements also raise security concerns in VR. In this paper, I introduce a new attack leveraging known WebVR vulnerabilities to track the activities of VR users. The proposed attack leverages the user’s hand motion information exposed to web attackers, demonstrating the capability to identify consumed content, such as 3D images and videos, and pilfer private drawings created in a 3D drawing app. To achieve this, I employed a machine learning approach to process controller sensor data and devised techniques to extract sensitive activities during the use of target apps. The experimental results demonstrate that the viewed content in the targeted content viewer can be identified with 90% accuracy. Furthermore, I successfully obtained drawing outlines that precisely match the user’s original drawings without performance degradation, validating the effectiveness of the attack.
A CNN-Based Feature Pyramid Segmentation Strategy for Acoustic Scene Classification Open Access
Ji XI Yue XIE Pengxu JIANG Wei JIANG

LETTER-Speech and Hearing

Publicized: 2024/03/26
Vol: E107-D No:8
Page(s): 1093-1096
Currently, a significant portion of acoustic scene categorization (ASC) research is centered around utilizing Convolutional Neural Network (CNN) models. This preference is primarily due to CNN’s ability to effectively extract time-frequency information from audio recordings of scenes by employing spectrum data as input. The expression of many dimensions can be achieved by utilizing 2D spectrum characteristics. Nevertheless, the diverse interpretations of the same object’s existence in different positions on the spectrum map can be attributed to the discrepancies between spectrum properties and picture qualities. The lack of distinction between different aspects of input information in ASC-based CNN networks may result in a decline in system performance. Considering this, a feature pyramid segmentation (FPS) approach based on CNN is proposed. The proposed approach involves utilizing spectrum features as the input for the model. These features are split based on a preset scale, and each segment-level feature is then fed into the CNN network for learning. The SoftMax classifier will receive the output of all feature scales, and these high-level features will be fused and fed to it to categorize different scenarios. The experiment provides evidence to support the efficacy of the FPS strategy and its potential to enhance the performance of the ASC system.
Cloud-Edge-Device Collaborative High Concurrency Access Management for Massive IoT Devices in Distribution Grid Open Access
Shuai LI Xinhong YOU Shidong ZHANG Mu FANG Pengping ZHANG

PAPER-Systems and Control

Publicized: 2023/10/26
Vol: E107-A No:7
Page(s): 946-957
Emerging data-intensive services in distribution grid impose requirements of high-concurrency access for massive internet of things (IoT) devices. However, the lack of effective high-concurrency access management results in severe performance degradation. To address this challenge, we propose a cloud-edge-device collaborative high-concurrency access management algorithm based on multi-timescale joint optimization of channel pre-allocation and load balancing degree. We formulate an optimization problem to minimize the weighted sum of edge-cloud load balancing degree and queuing delay under the constraint of access success rate. The problem is decomposed into a large-timescale channel pre-allocation subproblem solved by the device-edge collaborative access priority scoring mechanism, and a small-timescale data access control subproblem solved by the discounted empirical matching mechanism (DEM) with the perception of high-concurrency number and queue backlog. Particularly, information uncertainty caused by externalities is tackled by exploiting discounted empirical performance which accurately captures the performance influence of historical time points on present preference value. Simulation results demonstrate the effectiveness of the proposed algorithm in reducing edge-cloud load balancing degree and queuing delay.
More Efficient Two-Round Multi-Signature Scheme with Provably Secure Parameters for Standardized Elliptic Curves Open Access
Kaoru TAKEMURE Yusuke SAKAI Bagus SANTOSO Goichiro HANAOKA Kazuo OHTA

PAPER-Cryptography and Information Security

Publicized: 2023/10/05
Vol: E107-A No:7
Page(s): 966-988
The existing discrete-logarithm-based two-round multi-signature schemes without using the idealized model, i.e., the Algebraic Group Model (AGM), have quite large reduction loss. This means that an implementation of these schemes requires an elliptic curve (EC) with a very large order for the standard 128-bit security when we consider concrete security. Indeed, the existing standardized ECs have orders too small to ensure 128-bit security of such schemes. Recently, Pan and Wagner proposed two two-round schemes based on the Decisional Diffie-Hellman (DDH) assumption (EUROCRYPT 2023). For 128-bit security in concrete security, the first scheme can use the NIST-standardized EC P-256 and the second can use P-384. However, with these parameter choices, they do not improve the signature size and the communication complexity over the existing non-tight schemes. Therefore, there is no two-round scheme that (i) can use a standardized EC for 128-bit security and (ii) has high efficiency. In this paper, we construct a two-round multi-signature scheme achieving both of them from the DDH assumption. We prove that an EC with at least a 321-bit order is sufficient for our scheme to ensure 128-bit security. Thus, we can use the NIST-standardized EC P-384 for 128-bit security. Moreover, the signature size and the communication complexity per signer of our proposed scheme under P-384 are 1152 bits and 1535 bits, respectively. These are the most efficient among the existing two-round schemes without using the AGM, including Pan-Wagner’s schemes and non-tight schemes which do not use the AGM. Our experiment on an ordinary machine shows that signing and verification can each be completed in about 65 ms with 100 signers. This shows that our scheme has sufficiently reasonable running time in practice.
Novel Constructions of Cross Z-Complementary Pairs with New Lengths Open Access
Longye WANG Chunlin CHEN Xiaoli ZENG Houshan LIU Lingguo KONG Qingping YU Qingsong WANG

PAPER-Information Theory

Publicized: 2023/10/10
Vol: E107-A No:7
Page(s): 989-996
Spatial modulation (SM) is a type of multiple-input multiple-output (MIMO) technology that provides several benefits over traditional MIMO systems. SM-MIMO is characterized by its unique transmission principle, which results in lower costs, enhanced spectrum utilization, and reduced inter-channel interference. To optimize channel estimation performance over frequency-selective channels in the spatial modulation system, cross Z-complementary pairs (CZCPs) have been proposed as training sequences. The zero correlation zone (ZCZ) properties of CZCPs for auto-correlation sums and cross-correlation sums enable them to achieve optimal channel estimation performance. In this paper, we systematically construct CZCPs based on binary Golay complementary pairs and binary Golay complementary pairs via Turyn’s method. We employ a special matrix operation and concatenation method to obtain CZCPs with new lengths 2M + N and 2(M + L), where M and L are the lengths of binary GCP, and N is the length of binary GCP via Turyn’s method. Further, we obtain the perfect CZCP with new length 4N and extend the lengths of CZCPs.
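The building block named in this abstract, the binary Golay complementary pair (GCP), can be checked numerically. The sketch below is only an illustration of the GCP property (aperiodic autocorrelation sums vanish at every nonzero shift) and of the standard length-doubling concatenation; it does not implement Turyn's method or the CZCP constructions of the paper, and the function names are hypothetical.

```python
import numpy as np

def aperiodic_autocorr(seq):
    """Aperiodic autocorrelation of seq at shifts 0 .. len(seq)-1."""
    seq = np.asarray(seq)
    n = len(seq)
    return np.array([np.sum(seq[: n - k] * seq[k:]) for k in range(n)])

def is_golay_pair(a, b):
    """True if the autocorrelation sums of a and b are zero at every nonzero shift."""
    total = aperiodic_autocorr(a) + aperiodic_autocorr(b)
    return bool(np.all(total[1:] == 0))

def double_length(a, b):
    """Standard concatenation: (a, b) -> (a|b, a|-b), which is again a Golay pair."""
    a, b = np.asarray(a), np.asarray(b)
    return np.concatenate([a, b]), np.concatenate([a, -b])

a, b = np.array([1, 1]), np.array([1, -1])   # length-2 binary GCP
c, d = double_length(a, b)                   # length-4 pair
e, f = double_length(c, d)                   # length-8 pair

print(is_golay_pair(a, b), is_golay_pair(c, d), is_golay_pair(e, f))
print(is_golay_pair([1, 1, 1], [1, 1, -1]))  # a pair that is not complementary
```

A CZCP relaxes this requirement to zero correlation zones rather than all nonzero shifts, which is why lengths outside the GCP lengths become reachable.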
Real-Time Monitoring Systems That Provide M2M Communication between Machines Open Access
Ya ZHONG

PAPER-Language, Thought, Knowledge and Intelligence

Publicized: 2023/10/17
Vol: E107-A No:7
Page(s): 1019-1026
Artificial intelligence and the introduction of Internet of Things technologies have benefited from technological advances and new automated computer system technologies. Eventually, it is now possible to integrate them into a single offline industrial system. This is accomplished through machine-to-machine communication, which eliminates the human factor. The purpose of this article is to examine security systems for machine-to-machine communication systems that rely on identification and authentication algorithms for real-time monitoring. The article investigates security methods for quickly resolving data processing issues by using the Security operations Center’s main machine to identify and authenticate devices from 19 different machines. The results indicate that when machines are running offline and performing various tasks, they can be exposed to data leaks and malware attacks by both the individual machine and the system as a whole. The study looks at the operation of 19 computers, 7 of which were subjected to data leakage and malware attacks. AnyLogic software is used to create visual representations of the results using wireless networks and algorithms based on previously processed methods. The W76S is used as a protective element within intelligent sensors due to its built-in memory protection. For 4 machines, the data leakage time with malware attacks was 70 s. For 10 machines, the duration was 150 s with 3 attacks. Machine 15 had the longest attack duration, lasting 190 s and involving 6 malware attacks, while machine 19 had the shortest attack duration, lasting 200 s and involving 7 malware attacks. The highest numbers indicated that attempting to hack a system increased the risk of damaging a device, potentially resulting in the entire system with connected devices failing. Thus, illegal attacks by attackers using malware may be identified over time, and data processing effects can be prevented by intelligent control. 
The results reveal that applying identification and authentication methods using a protocol increases cyber-physical system security while also allowing real-time monitoring of offline system security.
Modeling and Analysis of Electromechanical Automatic Leveling Mechanism for High-Mobility Vehicle-Mounted Theodolites Open Access
Xiangyu LI Ping RUAN Wei HAO Meilin XIE Tao LV

PAPER-Measurement Technology

Publicized: 2023/09/26
Vol: E107-A No:7
Page(s): 1027-1039
To achieve precise measurement without landing, the high-mobility vehicle-mounted theodolite needs to be leveled quickly with high precision and ensure sufficient support stability before work. After the measurement, it is also necessary to ensure that the high-mobility vehicle-mounted theodolite can be quickly withdrawn. Therefore, this paper proposes a hierarchical automatic leveling strategy and establishes a two-stage electromechanical automatic leveling mechanism model. Using coarse leveling of the first-stage automatic leveling mechanism and fine leveling of the second-stage automatic leveling mechanism, the model realizes high-precision and fast leveling of the vehicle-mounted theodolites. Then, the leveling control method based on repeated positioning is proposed for the first-stage automatic leveling mechanism. To realize the rapid withdrawal for high-mobility vehicle-mounted theodolites, the method ensures the coincidence of spatial movement paths when the structural parts are unfolded and withdrawn. Next, the leg static balance equation is constructed in the leveling state, and the support force detection method is discussed in realizing the stable support for vehicle-mounted theodolites. Furthermore, a mathematical model for “false leg” detection is established, and a “false leg” detection scheme based on the support force detection method is analyzed to significantly improve the support stability of vehicle-mounted theodolites. Finally, an experimental platform is constructed to test the performance of the automatic leveling mechanisms. The experimental results show that the leveling accuracy of the established two-stage electromechanical automatic leveling mechanism can reach 3.6″, and the leveling time is no more than 2 min. The maximum support force error of the support force detection method is less than 15%, and the average support force error is less than 10%.
In contrast, the maximum support force error of the drive motor torque detection method reaches 80.12%, and its leg support stability is much lower than that of the support force detection method. The model and analysis method proposed in this paper can also be used for vehicle-mounted radar, vehicle-mounted laser measurement devices, vehicle-mounted artillery launchers and other types of vehicle-mounted equipment with high-precision and high-mobility working requirements.
Four Classes of Bivariate Permutation Polynomials over Finite Fields of Even Characteristic Open Access
Changhui CHEN Haibin KAN Jie PENG Li WANG

LETTER-Cryptography and Information Security

Publicized: 2023/10/17
Vol: E107-A No:7
Page(s): 1045-1048
Permutation polynomials have important applications in cryptography, coding theory and combinatorial designs. In this letter, we construct four classes of permutation polynomials over $\mathbb{F}_{2^n} \times \mathbb{F}_{2^n}$, where $\mathbb{F}_{2^n}$ is the finite field with $2^n$ elements.
Two Classes of Optimal Ternary Cyclic Codes with Minimum Distance Four Open Access
Chao HE Xiaoqiong RAN Rong LUO

LETTER-Information Theory

Publicized: 2023/10/16
Vol: E107-A No:7
Page(s): 1049-1052
Cyclic codes are a subclass of linear codes and have applications in consumer electronics, data storage systems, and communication systems as they have efficient encoding and decoding algorithms. Let $C_{(t,e)}$ denote the cyclic code with two nonzeros $\alpha^t$ and $\alpha^e$, where $\alpha$ is a generator of $\mathbb{F}_{3^m}^*$. In this letter, we investigate the ternary cyclic codes with parameters $[3^m - 1, 3^m - 1 - 2m, 4]$ based on some results proposed by Ding and Helleseth in 2013. Two new classes of optimal ternary cyclic codes $C_{(t,e)}$ are presented by choosing the proper $t$ and $e$ and determining the solutions of certain equations over $\mathbb{F}_{3^m}$.
Novel Constructions of Complementary Sets of Sequences of Lengths Non-Power-of-Two Open Access
Longye WANG Houshan LIU Xiaoli ZENG Qingping YU

LETTER-Coding Theory

Publicized: 2023/11/07
Vol: E107-A No:7
Page(s): 1053-1057
This letter presents several new constructions of complementary sets (CSs) with flexible sequence lengths using matrix transformations. The constructed CSs of size 4 have different lengths, namely N + L and 2N + L, where N and L are the lengths for which complementary pairs exist. Also, the presented CSs of size 8 have lengths N + P, P + Q and 2P + Q, where N is the length of a complementary pair, and P and Q are lengths for which CSs of size 4 exist. The achieved designs can be easily extended to a set size of $2^{n+2}$ by a recursive method. The proposed constructions generalize some previously reported constructions and generate CSs under fewer constraints.
A Frequency Estimation Algorithm for High Precision Monitoring of Significant Space Targets Open Access
Ze Fu GAO Wen Ge YANG Yi Wen JIAO

LETTER-Communication Theory and Signals

Publicized: 2023/09/26
Vol: E107-A No:7
Page(s): 1058-1061
Space is becoming increasingly congested and contested, which calls for effective means of monitoring high-value space assets, especially in Space Situational Awareness (SSA) missions, yet existing methods and their corresponding algorithms have shortcomings. To overcome this problem, this letter proposes an algorithm for accurate Connected Element Interferometry (CEI) in SSA based on more interpolation information and iterations. Simulation results show that: (i) after iterations, the estimated asymptotic variance of the proposed method can basically achieve uniform convergence, and its ratio to the ACRB is 1.00235 for δ0 ∈ [-0.5, 0.5], which is closer to 1 than the current best AM algorithms; (ii) in the interval SNR ∈ [-14 dB, 0 dB], the estimation error of the proposed algorithm decreases significantly and is basically comparable to the CRLB (remaining at 1.236 times). This research could play a significant role in effective monitoring and high-precision tracking and measurement of significant space targets during future SSA missions.
Joint CFO and DOA Estimation Based on MVDR Criterion in Interleaved OFDMA/SDMA Uplink Open Access
Chih-Chang SHEN Wei JHANG

LETTER-Spread Spectrum Technologies and Applications

Publicized: 2023/10/26
Vol: E107-A No:7
Page(s): 1066-1070
This letter deals with joint carrier frequency offset (CFO) and direction of arrival (DOA) estimation based on the minimum variance distortionless response (MVDR) criterion for interleaved orthogonal frequency division multiple access (OFDMA)/space division multiple access (SDMA) uplink systems. To reduce the computational load of methods based on two-dimensional searching, the proposed method requires only a single polynomial CFO rooting and no DOA pairing, which improves the search efficiency. Several simulation results are provided to illustrate the effectiveness of the proposed method.
Dither Signal Design for PAPR Reduction in OFDM-IM over a Rayleigh Fading Channel Open Access
Kee-Hoon KIM

PAPER-Wireless Communication Technologies

Vol: E107-B No:7
Page(s): 505-512
Orthogonal frequency division multiplexing with index modulation (OFDM-IM) is a novel scheme where the information bits are conveyed through the subcarrier activation pattern (SAP) and the symbols on the active subcarriers. Specifically, the subcarriers are partitioned into many subblocks and the subcarriers in each subblock can have two states, active or idle. Unfortunately, OFDM-IM inherits the high peak-to-average power ratio (PAPR) problem from the classical OFDM. The OFDM-IM signal with high PAPR induces in-band distortion and out-of-band radiation when it passes through a high power amplifier (HPA). Recently, there have been attempts to reduce PAPR by exploiting the unique structure of OFDM-IM, namely by adding dither signals in the idle subcarriers. The most recent work on dither signals uses dither signals with various amplitude constraints according to the characteristic of the corresponding OFDM-IM subblock. This is reasonable because OFDM subblocks have distinct levels of robustness against noise. However, the amplitude constraint in the recent work is efficient for only additive white Gaussian noise (AWGN) channels and cannot be used for maximum likelihood (ML) detection. Therefore, in this paper, based on pairwise error probability (PEP) analysis, a specific constraint for the dither signals is derived over a Rayleigh fading channel.
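The PAPR quantity discussed in this abstract can be computed directly from the IFFT of a frequency-domain symbol. The sketch below is an assumption-laden illustration, not the paper's dither design: it builds a random OFDM-IM symbol with hypothetical parameters (64 subcarriers, subblocks of 4, 2 active subcarriers per subblock, BPSK) and measures PAPR without oversampling. The idle subcarriers, which are left at zero here, are where dither signals would be added.

```python
import numpy as np

def papr_db(freq_symbols):
    """PAPR in dB of the time-domain signal obtained by an IFFT (no oversampling)."""
    time_signal = np.fft.ifft(freq_symbols)
    power = np.abs(time_signal) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

def ofdm_im_symbol(num_subcarriers, subblock_size, num_active, rng):
    """Random OFDM-IM symbol: each subblock activates num_active subcarriers with BPSK."""
    symbol = np.zeros(num_subcarriers, dtype=complex)
    for start in range(0, num_subcarriers, subblock_size):
        active = rng.choice(subblock_size, size=num_active, replace=False)
        symbol[start + active] = rng.choice([-1.0, 1.0], size=num_active)
    return symbol

rng = np.random.default_rng(1)
symbol = ofdm_im_symbol(64, 4, 2, rng)
print(f"PAPR of one random OFDM-IM symbol: {papr_db(symbol):.2f} dB")

# Two sanity checks with known answers:
single_tone = np.zeros(64, dtype=complex)
single_tone[5] = 1.0            # constant envelope, PAPR = 0 dB
all_ones = np.ones(64, dtype=complex)  # time-domain impulse, PAPR = 10*log10(64) dB
print(f"{papr_db(single_tone):.2f} dB, {papr_db(all_ones):.2f} dB")
```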
RAN Slicing with Inter-Cell Interference Control and Link Adaptation for Reliable Wireless Communications Open Access
Yoshinori TANAKA Takashi DATEKI

PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

Vol: E107-B No:7
Page(s): 513-528
Efficient multiplexing of ultra-reliable and low-latency communications (URLLC) and enhanced mobile broadband (eMBB) traffic, as well as ensuring the various reliability requirements of these traffic types in 5G wireless communications, is becoming increasingly important, particularly for vertical services. Interference management techniques, such as coordinated inter-cell scheduling, can enhance reliability in dense cell deployments. However, tight inter-cell coordination necessitates frequent information exchange between cells, which limits implementation. This paper introduces a novel RAN slicing framework based on centralized frequency-domain interference control per slice and link adaptation optimized for URLLC. The proposed framework does not require tight inter-cell coordination but can fulfill the requirements of both the decoding error probability and the delay violation probability of each packet flow. These controls are based on a power-law estimation of the lower tail distribution of a measured data set with a smaller number of discrete samples. As design guidelines, we derived a theoretical minimum radio resource size of a slice to guarantee the delay violation probability requirement. Simulation results demonstrate that the proposed RAN slicing framework can achieve the reliability targets of the URLLC slice while improving the spectrum efficiency of the eMBB slice in a well-balanced manner compared to other evaluated benchmarks.
Soft-Error Tolerance by Guard-Gate Structures on Flip-Flops in 22 and 65 nm FD-SOI Technologies Open Access
Ryuichi NAKAJIMA Takafumi ITO Shotaro SUGITANI Tomoya KII Mitsunori EBARA Jun FURUTA Kazutoshi KOBAYASHI Mathieu LOUVAT Francois JACQUET Jean-Christophe ELOY Olivier MONTFORT Lionel JURE Vincent HUARD

PAPER

Publicized: 2024/01/23
Vol: E107-C No:7
Page(s): 191-200
We evaluated soft-error tolerance by heavy-ion irradiation tests on three types of flip-flops (FFs), named the standard FF (STDFF), the dual feedback recovery FF (DFRFF), and the DFRFF with long delay (DFRFFLD), in 22 and 65 nm fully-depleted silicon on insulator (FD-SOI) technologies. The guard-gate (GG) structure in DFRFF mitigates soft errors. A single event transient (SET) pulse is removed by the C-element with the signal delayed by the GG structure. DFRFFLD increases the GG delay by adding two more inverters as delay elements. We investigated the effectiveness of the GG structure in 22 and 65 nm. In 22 nm, Kr (40.3 MeV·cm²/mg) and Xe (67.2 MeV·cm²/mg) irradiation tests revealed that DFRFFLD has sufficient soft-error tolerance in outer space. In 65 nm, the relationship between GG delay and CS reveals the GG delay time at which no error was observed under Kr irradiation.
Determination Method of Cascaded Number for Lumped Parameter Models Oriented to Transmission Lines Open Access
Risheng QIN Hua KUANG He JIANG Hui YU Hong LI Zhuan LI

PAPER-Electronic Circuits

Publicized: 2023/12/20
Vol: E107-C No:7
Page(s): 201-209
This paper proposes a method for determining the cascaded number for lumped parameter models (LPMs) of transmission lines. The LPM is used to simulate long-distance transmission lines, and the cascaded number significantly impacts the simulation results. Currently, there is no system-level method for determining the cascaded number for LPMs. Based on theoretical analysis and eigenvalue decomposition of the network matrix, this paper discusses the error in resonance characteristics between the distributed parameter model and LPMs. Moreover, it is deduced that the optimal cascaded numbers of the cascaded π-type and T-type LPMs are the same, and that the Γ-type LPM has the lowest analog accuracy. The principle that the maximum simulation frequency is less than the first resonance frequency of each segment is presented. According to this principle, the optimal cascaded numbers of cascaded π-type, T-type, and Γ-type LPMs are obtained. The effectiveness of the proposed determination method is verified by simulation.
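Why the cascaded number matters can be seen by comparing the ABCD (chain) matrix of n cascaded π-sections with that of the distributed parameter model. The sketch below is a textbook comparison, not the determination method of the paper; the per-kilometre line constants, the 300 km length, and the 50 Hz frequency are hypothetical values chosen for illustration.

```python
import numpy as np

def distributed_abcd(z, y, length):
    """ABCD matrix of a uniform line with series impedance z and shunt admittance y per unit length."""
    gamma = np.sqrt(z * y)   # propagation constant
    zc = np.sqrt(z / y)      # characteristic impedance
    gl = gamma * length
    return np.array([[np.cosh(gl), zc * np.sinh(gl)],
                     [np.sinh(gl) / zc, np.cosh(gl)]])

def cascaded_pi_abcd(z, y, length, n):
    """ABCD matrix of n identical pi-sections, each with series Z and shunt Y/2 at both ends."""
    Z = z * length / n
    Y = y * length / n
    section = np.array([[1 + Z * Y / 2, Z],
                        [Y * (1 + Z * Y / 4), 1 + Z * Y / 2]])
    return np.linalg.matrix_power(section, n)

def voltage_ratio_error(z, y, length, n):
    """Absolute error of the A element (open-circuit voltage ratio) versus the distributed model."""
    exact = distributed_abcd(z, y, length)[0, 0]
    lumped = cascaded_pi_abcd(z, y, length, n)[0, 0]
    return abs(lumped - exact)

# Hypothetical line constants per km at 50 Hz.
omega = 2 * np.pi * 50.0
z = complex(0.05, omega * 1e-3)   # ohm/km
y = complex(0.0, omega * 1e-8)    # siemens/km
length_km = 300.0

for n in (1, 10, 100):
    print(n, voltage_ratio_error(z, y, length_km, n))
```

The error shrinks as the cascaded number grows, and at higher frequencies more sections are needed for the same accuracy, which is consistent with the abstract's principle relating the maximum simulation frequency to the first resonance frequency of each segment.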
Understanding Characteristics of Phishing Reports from Experts and Non-Experts on Twitter Open Access
Hiroki NAKANO Daiki CHIBA Takashi KOIDE Naoki FUKUSHI Takeshi YAGI Takeo HARIU Katsunari YOSHIOKA Tsutomu MATSUMOTO

PAPER-Information Network

Publicized: 2024/03/01
Vol: E107-D No:7
Page(s): 807-824
The increase in phishing attacks through email and short message service (SMS) has shown no signs of deceleration. The first thing we need to do to combat the ever-increasing number of phishing attacks is to collect and characterize more phishing cases that reach end users. Without understanding these characteristics, anti-phishing countermeasures cannot evolve. In this study, we propose an approach using Twitter as a new observation point to immediately collect and characterize phishing cases via e-mail and SMS that evade countermeasures and reach users. Specifically, we propose CrowdCanary, a system capable of structurally and accurately extracting phishing information (e.g., URLs and domains) from tweets about phishing by users who have actually discovered or encountered it. In our three months of live operation, CrowdCanary identified 35,432 phishing URLs out of 38,935 phishing reports. We confirmed that 31,960 (90.2%) of these phishing URLs were later detected by the anti-virus engine, demonstrating that CrowdCanary is superior to existing systems in both accuracy and volume of threat extraction. We also analyzed users who shared phishing threats by utilizing the extracted phishing URLs and categorized them into two distinct groups - namely, experts and non-experts. As a result, we found that CrowdCanary could collect information that is specifically included in non-expert reports, such as information shared only by the company brand name in the tweet, information about phishing attacks that we find only in the image of the tweet, and information about the landing page before the redirect. Furthermore, we conducted a detailed analysis of the collected information on phishing sites and discovered that certain biases exist in the domain names and hosting servers of phishing sites, revealing new characteristics useful for unknown phishing site detection.
VH-YOLOv5s: Detecting the Skin Color of Plectropomus leopardus in Aquaculture Using Mobile Phones Open Access
Beibei LI Xun RAN Yiran LIU Wensheng LI Qingling DUAN

PAPER-Artificial Intelligence, Data Mining

Publicized: 2024/03/04
Vol: E107-D No:7
Page(s): 835-844
Fish skin color detection plays a critical role in aquaculture. However, challenges arise from image color cast and the limited dataset, impacting the accuracy of the skin color detection process. To address these issues, we proposed a novel fish skin color detection method, termed VH-YOLOv5s. Specifically, we constructed a dataset for fish skin color detection to tackle the limitation posed by the scarcity of available datasets. Additionally, we proposed a Variance Gray World Algorithm (VGWA) to correct the image color cast. Moreover, the designed Hybrid Spatial Pyramid Pooling (HSPP) module effectively performs multi-scale feature fusion, thereby enhancing the feature representation capability. Extensive experiments have demonstrated that VH-YOLOv5s achieves excellent detection results on the Plectropomus leopardus skin color dataset, with a precision of 91.7%, recall of 90.1%, mAP@0.5 of 95.2%, and mAP@0.5:0.95 of 57.5%. When compared to other models such as Centernet, AutoAssign, and YOLOX-s, VH-YOLOv5s exhibits superior detection performance, surpassing them by 2.5%, 1.8%, and 1.7%, respectively. Furthermore, our model can be deployed directly on mobile phones, making it highly suitable for practical applications.
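For context on color-cast correction, the classic Gray World algorithm scales each color channel so that all channel means match. The sketch below shows only that baseline; the Variance Gray World Algorithm (VGWA) proposed in the paper is not described in this abstract, so nothing here should be read as its implementation. The test image and its cast factors are hypothetical.

```python
import numpy as np

def gray_world(image):
    """Classic Gray World: scale each RGB channel so every channel mean equals the overall mean.

    image is an (H, W, 3) array; the result is float and is not clipped.
    """
    image = np.asarray(image, dtype=np.float64)
    channel_means = image.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / channel_means
    return image * gains

rng = np.random.default_rng(2)
scene = rng.uniform(0.2, 0.8, size=(32, 32, 3))   # hypothetical neutral scene
cast = scene * np.array([1.0, 0.8, 0.6])           # simulate a reddish color cast
corrected = gray_world(cast)

print("means before:", np.round(cast.reshape(-1, 3).mean(axis=0), 3))
print("means after: ", np.round(corrected.reshape(-1, 3).mean(axis=0), 3))
```

In practice the corrected values would be clipped to the valid pixel range before saving or feeding a detector.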
