Keyword Search Result

[Keyword] loss function (13 hits)

1-13 of 13 hits
  • Research on Mask-Wearing Detection Algorithm Based on Improved YOLOv7-Tiny Open Access

    Min GAO  Gaohua CHEN  Jiaxin GU  Chunmei ZHANG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2024/03/19
      Vol:
    E107-D No:7
      Page(s):
    878-889

    Wearing a mask correctly is an effective method to prevent respiratory infectious diseases. However, in some complex settings, the accuracy of mask-wearing detection still needs to be enhanced. This research improves the mask-wearing detection technique based on YOLOv7-Tiny. Distribution Shifting Convolutions (DSConv) are used instead of the 3×3 convolution in the original YOLOv7-Tiny model to simplify computation and increase detection precision. To decrease the loss of coordinate regression and enhance the detection performance, we adopt the loss function Intersection over Union with Minimum Points Distance (MPDIoU) instead of Complete Intersection over Union (CIoU) in the original model. The GSConv and VoVGSCSP modules are introduced into the model to keep it lightweight. A P6 detection layer is designed to increase detection precision for tiny targets in challenging environments and to decrease missed and false positive detection rates. The robustness of the model is increased further by creating and labeling a multi-environment mask-wearing dataset that uses Mixup and Mosaic for data augmentation. The efficiency of the model is validated using comparison and ablation experiments on the mask dataset. The results demonstrate that, compared to YOLOv7-Tiny, the enhanced detection algorithm improves precision by 5.4%, recall by 1.8%, mAP@.5 by 3%, and mAP@.5:.95 by 1.7%, while FLOPs are decreased by 8.5G. Therefore, the improved detection algorithm achieves faster and more accurate mask-wearing detection.
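    The MPDIoU loss mentioned in this abstract can be sketched as follows. This is a minimal illustration based on the commonly published MPDIoU definition (IoU minus the squared distances between the top-left corners and between the bottom-right corners of the two boxes, each normalized by the squared image diagonal), not the authors' implementation. The function name `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Illustrative MPDIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumed form: loss = 1 - (IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2)),
    where d1 and d2 are the top-left and bottom-right corner distances
    and w, h are the input image width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and IoU.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou - d1_sq / norm - d2_sq / norm)
```

    Unlike plain IoU, the corner-distance terms still give a gradient signal when the predicted and ground-truth boxes do not overlap at all.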

  • GazeFollowTR: A Method of Gaze Following with Reborn Mechanism

    Jingzhao DAI  Ming LI  Xuejiao HU  Yang LI  Sidan DU  

     
    PAPER-Vision

      Publicized:
    2022/11/30
      Vol:
    E106-A No:6
      Page(s):
    938-946

    Gaze following is the task of estimating where an observer is looking inside a scene. Both the observer and scene information must be learned to determine the gaze directions and gaze points. Recently, many existing works have focused only on scenes or only on observers. In contrast, published frameworks for gaze following are limited. In this paper, a gaze following method using a hybrid transformer is proposed. Based on the conventional method (GazeFollow), we make three improvements. First, a hybrid transformer is applied for learning head images and gaze positions. Second, the pinball loss function is utilized to control the gaze point error. Finally, a novel ReLU layer with the reborn mechanism (reborn ReLU) is introduced to replace traditional ReLU layers in different network stages. To test the performance of our improvements, we train our framework with the DL Gaze dataset and evaluate the model on our collected set. Our experimental results show that our framework outperforms the reference methods.
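    For reference, the standard pinball (quantile) loss named in this abstract can be written in a few lines. This is a generic sketch of the textbook definition; how the paper applies it to gaze point coordinates is not stated in the abstract, and the function name and the parameter `tau` are assumptions made for this example.

```python
def pinball_loss(y_true, y_pred, tau):
    """Standard pinball loss for a quantile level tau in (0, 1).

    Under-prediction (y_true > y_pred) is weighted by tau,
    over-prediction by (1 - tau).
    """
    error = y_true - y_pred
    return max(tau * error, (tau - 1.0) * error)
```

    With `tau = 0.5` the loss reduces to half the absolute error; other values of `tau` penalize errors on one side more heavily than on the other.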

  • CAMRI Loss: Improving the Recall of a Specific Class without Sacrificing Accuracy

    Daiki NISHIYAMA  Kazuto FUKUCHI  Youhei AKIMOTO  Jun SAKUMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/01/23
      Vol:
    E106-D No:4
      Page(s):
    523-537

    In real world applications of multiclass classification models, misclassification in an important class (e.g., stop sign) can be significantly more harmful than in other classes (e.g., no parking). Thus, it is crucial to improve the recall of an important class while maintaining overall accuracy. For this problem, we found that improving the separation of important classes relative to other classes in the feature space is effective. Existing methods that give a class-sensitive penalty for cross-entropy loss do not improve the separation. Moreover, the methods designed to improve separations between all classes are unsuitable for our purpose because they do not consider the important classes. To achieve the separation, we propose a loss function that explicitly gives loss for the feature space, called class-sensitive additive angular margin (CAMRI) loss. CAMRI loss is expected to reduce the variance of an important class due to the addition of a penalty to the angle between the important class features and the corresponding weight vectors in the feature space. In addition, concentrating the penalty on only the important class hardly sacrifices separating the other classes. Experiments on CIFAR-10, GTSRB, and AwA2 showed that CAMRI loss could improve the recall of a specific class without sacrificing accuracy. In particular, compared with GTSRB's second-worst class recall when trained with cross-entropy loss, CAMRI loss improved recall by 9%.
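    The idea of penalizing the angle only for the important class can be illustrated with an ArcFace-style additive angular margin. The sketch below is an assumption-laden illustration, not the authors' exact CAMRI formulation: the function name, the `margin` and `scale` parameters, and the use of a scaled-cosine softmax are all choices made for this example.

```python
import math


def _cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_margin_loss(feature, weights, label, important, margin=0.3, scale=10.0):
    """Cross-entropy over scaled cosines, with an angular margin added
    only when the sample belongs to the important class (illustrative)."""
    logits = []
    for j, w in enumerate(weights):
        cos_theta = _cosine(feature, w)
        if j == label and label == important:
            # Penalize the true-class angle: theta -> theta + margin.
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            cos_theta = math.cos(min(math.pi, theta + margin))
        logits.append(scale * cos_theta)

    # Numerically stable log-sum-exp for the softmax normalizer.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_z - logits[label]
```

    Samples from other classes see an ordinary cosine softmax loss, which matches the abstract's point that the penalty is concentrated on the important class only.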

  • A Night Image Enhancement Algorithm Based on MDIFE-Net Curve Estimation

    Jing ZHANG  Dan LI  Hong-an LI  Xuewen LI  Lizhi ZHANG  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2022/11/04
      Vol:
    E106-D No:2
      Page(s):
    229-239

    To address low-quality problems in night images such as low brightness, poor contrast, noise interference and color imbalance, a night image enhancement algorithm based on MDIFE-Net curve estimation is presented. The algorithm consists of three main parts. First, we design an illumination estimation curve (IEC), which adjusts the pixel level of the low-illumination image domain through a non-linear fitting function, maps it to the enhanced image domain, and effectively eliminates the effect of illumination loss. Second, the DCE-Net is improved by replacing the original ReLU activation function with a smoother Mish activation function, so that the parameters can be better updated. Finally, an illumination estimation loss function, which combines image attributes with fidelity, is designed to drive the no-reference image enhancement, preserving more image details while enhancing the night image. The experimental results show that our method not only effectively improves the image contrast, but also makes the details of the target more prominent and improves the visual quality of the image. Compared with four existing low-illumination image enhancement algorithms, our method achieves better NIQE and STD evaluation index values, which verifies the feasibility and validity of the algorithm. Ablation experiments verify the rationality and necessity of each component design.
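    The difference between ReLU and the smoother Mish activation mentioned above can be seen from their standard definitions. This is a generic scalar sketch, not code from the paper; the helper names are assumptions made for this example.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x):
    """ReLU activation: max(x, 0)."""
    return max(x, 0.0)
```

    ReLU outputs exactly zero for all negative inputs, while Mish is smooth and lets small negative values through, so its gradient does not vanish abruptly at zero.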

  • Learning from Noisy Complementary Labels with Robust Loss Functions

    Hiroki ISHIGURO  Takashi ISHIDA  Masashi SUGIYAMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/11/01
      Vol:
    E105-D No:2
      Page(s):
    364-376

    It has been demonstrated that large-scale labeled datasets facilitate the success of machine learning. However, collecting labeled data is often very costly and error-prone in practice. To cope with this problem, previous studies have considered the use of a complementary label, which specifies a class that an instance does not belong to and can be collected more easily than ordinary labels. However, complementary labels could also be error-prone and thus mitigating the influence of label noise is an important challenge to make complementary-label learning more useful in practice. In this paper, we derive conditions for the loss function such that the learning algorithm is not affected by noise in complementary labels. Experiments on benchmark datasets with noisy complementary labels demonstrate that the loss functions that satisfy our conditions significantly improve the classification performance.

  • Loss Function Considering Multiple Attributes of a Temporal Sequence for Feed-Forward Neural Networks

    Noriyuki MATSUNAGA  Yamato OHTANI  Tatsuya HIRAHARA  

     
    PAPER-Speech and Hearing

      Publicized:
    2020/08/31
      Vol:
    E103-D No:12
      Page(s):
    2659-2672

    Deep neural network (DNN)-based speech synthesis became popular in recent years and is expected to soon be widely used in embedded devices and environments with limited computing resources. The key intention of these systems in poor computing environments is to reduce the computational cost of generating speech parameter sequences while maintaining voice quality. However, reducing computational costs is challenging for two primary conventional DNN-based methods used for modeling speech parameter sequences. In feed-forward neural networks (FFNNs) with maximum likelihood parameter generation (MLPG), the MLPG reconstructs the temporal structure of the speech parameter sequences ignored by FFNNs but requires additional computational cost according to the sequence length. In recurrent neural networks, the recursive structure allows for the generation of speech parameter sequences while considering temporal structures without the MLPG, but increases the computational cost compared to FFNNs. We propose a new approach for DNNs to acquire parameters captured from the temporal structure by backpropagating the errors of multiple attributes of the temporal sequence via the loss function. This method enables FFNNs to generate speech parameter sequences by considering their temporal structure without the MLPG. We generated the fundamental frequency sequence and the mel-cepstrum sequence with our proposed method and conventional methods, and then synthesized and subjectively evaluated the speeches from these sequences. The proposed method enables even FFNNs that work on a frame-by-frame basis to generate speech parameter sequences by considering the temporal structure and to generate sequences perceptually superior to those from the conventional methods.

  • Generative Adversarial Network Using Weighted Loss Map and Regional Fusion Training for LDR-to-HDR Image Conversion

    Sung-Woon JUNG  Hyuk-Ju KWON  Dong-Min SON  Sung-Hak LEE  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2020/08/18
      Vol:
    E103-D No:11
      Page(s):
    2398-2402

    High dynamic range (HDR) imaging refers to digital image processing that modifies the range of color and contrast to enhance image visibility. To create an HDR image, two or more images that include various information are needed. In order to convert low dynamic range (LDR) images to HDR images, we consider the possibility of using a generative adversarial network (GAN) as an appropriate deep neural network. Deep learning requires a great deal of data in order to build a module, but once the module is created, it is convenient to use. In this paper, we propose a weight map for local luminance based on learning to reconstruct locally tone-mapped images.

  • Fresh Tea Shoot Maturity Estimation via Multispectral Imaging and Deep Label Distribution Learning

    Bin CHEN  JiLi YAN  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2020/06/01
      Vol:
    E103-D No:9
      Page(s):
    2019-2022

    Fresh Tea Shoot Maturity Estimation (FTSME) is the basis of automatic tea picking, as it determines whether a shoot can be picked. Unfortunately, the ambiguous information among single labels and uncontrollable imaging conditions lead to low FTSME accuracy. A novel Fresh Tea Shoot Maturity Estimating method via multispectral imaging and Deep Label Distribution Learning (FTSME-DLDL) is proposed to overcome these issues. The input is 25-band images, and the output is the corresponding tea shoot maturity label distribution. We utilize multiple VGG-16 networks and an auto-encoding network to obtain the multispectral features, and learn the label distribution by minimizing the Kullback-Leibler divergence using deep convolutional neural networks. The experimental results show that the proposed method performs better on FTSME than the state-of-the-art methods.
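    The Kullback-Leibler objective used in label distribution learning can be sketched as below. This is the textbook divergence between a target label distribution and a predicted one, not the authors' training code; the function name and the `eps` guard against a zero predicted probability are assumptions made for this example.

```python
import math


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions.

    Terms with zero target probability contribute nothing; predicted
    probabilities are floored at eps to avoid division by zero.
    """
    return sum(
        t * math.log(t / max(q, eps))
        for t, q in zip(target, predicted)
        if t > 0.0
    )
```

    The divergence is zero when the predicted distribution matches the target and grows as probability mass is placed on the wrong maturity labels.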

  • Tea Sprouts Segmentation via Improved Deep Convolutional Encoder-Decoder Network

    Chunhua QIAN  Mingyang LI  Yi REN  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/11/06
      Vol:
    E103-D No:2
      Page(s):
    476-479

    Tea sprouts segmentation via machine vision is the core technology of automatic tea picking. A novel method for Tea Sprouts Segmentation based on an improved deep convolutional encoder-decoder Network (TS-SegNet) is proposed in this paper. To increase the segmentation accuracy and stability, the network is improved with a contrastive-center loss function and skip connections. As a result, intra-class compactness and inter-class separability are both exploited, and the TS-SegNet can obtain more discriminative tea sprout features. The experimental results indicate that the proposed method leads to good segmentation results, and the segmented tea sprouts almost coincide with the ground truth.

  • Boosting Learning Algorithm for Pattern Recognition and Beyond Open Access

    Osamu KOMORI  Shinto EGUCHI  

     
    INVITED PAPER

      Vol:
    E94-D No:10
      Page(s):
    1863-1869

    This paper discusses recent developments in pattern recognition, focusing on the boosting approach in machine learning. Statistical properties such as Bayes risk consistency for several loss functions are discussed in a probabilistic framework. A number of loss functions have been proposed for different purposes and targets. A unified derivation is given by a generator function U, which naturally defines entropy, divergence and loss function. The class of U-loss functions is associated with boosting learning algorithms for loss minimization, which include AdaBoost and LogitBoost as a twin generated from Kullback-Leibler divergence, and the (partial) area under the ROC curve. We extend boosting to unsupervised learning, typically density estimation employing the U-loss function. Finally, a future perspective on machine learning is discussed.
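    As background for the two boosting algorithms named above, AdaBoost is usually described as minimizing the exponential loss and LogitBoost the logistic loss, both as functions of the margin y·f(x) with y in {-1, +1}. The sketch below shows only these two standard losses; it does not reproduce the paper's generator function U, and the function names are assumptions made for this example.

```python
import math


def exponential_loss(y, score):
    """Exponential loss exp(-y * f(x)), commonly associated with AdaBoost."""
    return math.exp(-y * score)


def logistic_loss(y, score):
    """Logistic loss log(1 + exp(-y * f(x))), commonly associated with
    LogitBoost, computed in a numerically stable way."""
    m = -y * score
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))
```

    For a badly misclassified example (large negative margin) the exponential loss grows exponentially while the logistic loss grows only linearly, which is one reason the two algorithms react differently to outliers.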

  • Least Absolute Policy Iteration--A Robust Approach to Value Function Approximation

    Masashi SUGIYAMA  Hirotaka HACHIYA  Hisashi KASHIMA  Tetsuro MORIMURA  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E93-D No:9
      Page(s):
    2555-2565

    Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss for enhancing robustness and reliability. The proposed method is formulated as a linear programming problem which can be solved efficiently by standard optimization software, so the computational advantage is not sacrificed for gaining robustness and reliability. We demonstrate the usefulness of the proposed approach through a simulated robot-control task.
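    The robustness argument for the absolute loss can be illustrated on the simplest possible case, fitting a single constant to observed rewards. This is only a scalar illustration of squared versus absolute loss, not the paper's linear-programming formulation of policy iteration; the function names and sample data are assumptions made for this example.

```python
import statistics


def squared_loss_fit(values):
    """The constant minimizing the sum of squared errors is the mean."""
    return statistics.mean(values)


def absolute_loss_fit(values):
    """The constant minimizing the sum of absolute errors is the median."""
    return statistics.median(values)


# Hypothetical rewards with one outlier.
rewards = [1.0, 1.0, 1.0, 1.0, 100.0]
least_squares_estimate = squared_loss_fit(rewards)
least_absolute_estimate = absolute_loss_fit(rewards)
```

    A single outlying reward pulls the least-squares estimate far from the typical value, while the least-absolute estimate stays at the value shared by most observations.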

  • Multiclass Boosting Algorithms for Shrinkage Estimators of Class Probability

    Takafumi KANAMORI  

     
    PAPER-Artificial Intelligence and Cognitive Science

      Vol:
    E90-D No:12
      Page(s):
    2033-2042

    Our purpose is to estimate conditional probabilities of output labels in multiclass classification problems. AdaBoost provides highly accurate classifiers and has the potential to estimate conditional probabilities. However, the conditional probability estimated by AdaBoost tends to overfit to training samples. We propose loss functions for boosting that provide a shrinkage estimator. The effect of regularization is realized by shrinkage of probabilities toward the uniform distribution. Numerical experiments indicate that boosting algorithms based on the proposed loss functions show significantly better results than existing boosting algorithms for estimation of conditional probabilities.
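    Shrinkage of class probabilities toward the uniform distribution can be pictured as a convex combination of the estimate and the uniform distribution. This form is an assumption chosen for illustration; the abstract does not give the paper's estimator, and the function name and the parameter `lam` are hypothetical.

```python
def shrink_toward_uniform(probs, lam):
    """Mix a probability vector with the uniform distribution.

    lam = 0 returns the input unchanged; lam = 1 returns the uniform
    distribution over len(probs) classes.
    """
    k = len(probs)
    return [(1.0 - lam) * p + lam / k for p in probs]
```

    Any `lam` greater than zero keeps every class probability strictly positive, which is the regularizing effect that counteracts overconfident estimates.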

  • Generating Category Hierarchy for Classifying Large Corpora

    Fumiyo FUKUMOTO  Yoshimi SUZUKI  

     
    PAPER-Natural Language Processing

      Vol:
    E89-D No:4
      Page(s):
    1543-1554

    We address the problem of dealing with large collections of data, and investigate the automatic construction of domain-specific category hierarchies to improve text classification. We use two well-known techniques, the partitioning clustering method called k-means and a loss function, to create the category hierarchy. The k-means method involves iterating through the data that the system is permitted to classify during each iteration and construction of a hierarchical structure. In general, the number of clusters k is not given beforehand. Therefore, we used a loss function that measures the degree of disappointment in any differences between the true distribution over inputs and the learner's prediction to select the appropriate number of clusters k. Once the optimal k is selected, the procedure is repeated for each cluster. Our evaluation using the 1996 Reuters corpus, which consists of 806,791 documents, showed that automatically constructing hierarchies improves classification accuracy.