The search functionality is under construction.

Keyword Search Result

[Keyword] SI(16314hit)

221-240hit(16314hit)

  • Single UAV-Based Wave Source Localization in NLOS Environments Open Access

    Shinichi MURATA  Takahiro MATSUDA  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2023/08/01
      Vol:
    E106-B No:12
      Page(s):
    1491-1500

    To localize an unknown wave source in non-line-of-sight environments, a wave source localization scheme using multiple unmanned aerial vehicles (UAVs) has been proposed. In this scheme, each UAV estimates the directions of arrival (DoAs) of received signals, and the wave source is localized from the estimated DoAs by means of maximum likelihood estimation. In this study, by extending the concept of this scheme, we propose a novel wave source localization scheme using a single UAV. In the proposed scheme, the UAV moves along a path comprising multiple measurement points, and the wave source is sequentially localized from the DoA distributions estimated at these measurement points. At each measurement point, a moving path planning algorithm determines the next measurement point from the estimated DoA distributions and the measurement points that the UAV has already visited. We consider two moving path planning algorithms and validate the proposed scheme through simulation experiments.

  • Fine Feature Analysis of Metal Plate Based on Two-Dimensional Imaging under Non-Ideal Scattering

    Xiaofan LI  Bin DENG  Qiang FU  Hongqiang WANG  

     
    PAPER-Electromagnetic Theory

      Publicized:
    2023/05/29
      Vol:
    E106-C No:12
      Page(s):
    789-798

    The ideal point scattering model requires that each scattering center is isotropic, that the position of the scattering center corresponding to the target remains unchanged, and that the backscattering amplitude and phase of the target do not change with the incident frequency and incident azimuth. In practice, these conditions are difficult to meet, and the scattering models are not ideal in most cases. To understand the difference between non-ideal and ideal scattering centers, this paper takes a metal plate as the research object, carries out two-dimensional imaging of the metal plate, compares the imaging position with the theoretical target position, and compares the shape of the scattering center obtained from two-dimensional imaging of the plate from different angles. The experimental results show that the offset between the scattering center position and the theoretical target position in the two-dimensional imaging of the plate under the non-ideal point scattering model is less than the range resolution and azimuth resolution. The deviation between the small-angle two-dimensional imaging position and the theoretical target position using the ideal point scattering model is small, so the ideal point scattering model is still suitable for the two-dimensional imaging of the plate. In the imaging process, the ratio of range resolution to azimuth resolution affects the shape of the scattering center. When the range resolution equals the azimuth resolution, the shape of the scattering center is circular; when they are not equal, the shape of the scattering center is elliptical. To obtain a more accurate two-dimensional image, appropriate range and azimuth resolutions should be chosen when using the ideal point scattering model for two-dimensional imaging. The two-dimensional imaging results of the plate at different azimuths and angles can be used as a reference for the study of the non-ideal point scattering model.

  • Transactional TF: Transform Library with Concurrency and Correctness

    Yushi OGIWARA  Ayanori YOROZU  Akihisa OHYA  Hideyuki KAWASHIMA  

     
    PAPER

      Publicized:
    2023/06/22
      Vol:
    E106-D No:12
      Page(s):
    1951-1959

    In the Robot Operating System (ROS), a major middleware for robots, the Transform Library (TF) is a mandatory package that manages transformation information between coordinate systems by using a directed forest data structure and providing methods for registering and computing the information. However, the structure has two fundamental problems. The first is its poor scalability: since it accepts only a single thread at a time due to using a single giant lock for mutual exclusion, the access to the tree is sequential. Second, there is a lack of data freshness: it retrieves non-latest synthetic data when computing coordinate transformations because it prioritizes temporal consistency over data freshness. In this paper, we propose methods based on transactional techniques. This will allow us to avoid anomalies, achieve high performance, and obtain fresh data. These transactional methods show a throughput of up to 429 times higher than the conventional method on a read-only workload and a freshness of up to 1276 times higher than the conventional one on a read-write combined workload.

  • A Fully-Parallel Annealing Algorithm with Autonomous Pinning Effect Control for Various Combinatorial Optimization Problems

    Daiki OKONOGI  Satoru JIMBO  Kota ANDO  Thiem Van CHU  Jaehoon YU  Masato MOTOMURA  Kazushi KAWAMURA  

     
    PAPER

      Publicized:
    2023/09/19
      Vol:
    E106-D No:12
      Page(s):
    1969-1978

    Annealing computation has recently attracted attention as it can efficiently solve combinatorial optimization problems using an Ising spin-glass model. Stochastic cellular automata annealing (SCA) is a promising algorithm that can realize fast spin-update by utilizing its parallel computing capability. However, in SCA, pinning effect control to suppress the spin-flip probability is essential, making escaping from local minima more difficult than serial spin-update algorithms, depending on the problem. This paper proposes a novel approach called APC-SCA (Autonomous Pinning effect Control SCA), where the pinning effect can be controlled autonomously by focusing on individual spin-flip. The evaluation results using max-cut, N-queen, and traveling salesman problems demonstrate that APC-SCA can obtain better solutions than the original SCA that uses pinning effect control pre-optimized by a grid search. Especially in solving traveling salesman problems, we confirm that the tour distance obtained by APC-SCA is up to 56.3% closer to the best-known compared to the conventional approach.

  • Optimization Algorithm with Automatic Adjustment of the Number of Switches in the Order/Radix Problem

    Masaki TSUKAMOTO  Yoshiko HANADA  Masahiro NAKAO  Keiji YAMAMOTO  

     
    PAPER

      Publicized:
    2023/06/12
      Vol:
    E106-D No:12
      Page(s):
    1979-1987

    The Order/Radix Problem (ORP) is an optimization problem that can be solved to find an optimal network topology in distributed memory systems. It is important to find the optimum number of switches in the ORP. In the case of a regular graph, a good estimation of the preferred number of switches has been proposed, and it has been shown that simulated annealing (SA) finds a good solution given a fixed number of switches. In general, however, the optimal graph does not necessarily satisfy the regular condition, which greatly increases the computational cost required to find a good solution with a suitable number of switches for each case. This study develops a new method based on SA to find a suitable number of switches. By introducing neighborhood searches in which the number of switches is increased or decreased, our method can optimize a graph by changing the number of switches adaptively during the search. In numerical experiments using instances presented by Graph Golf, an international ORP competition, we verified that our method gives a good approximation of the best setting for the number of switches and can simultaneously generate a graph with a small host-to-host average shortest path length.

  • Power Analysis and Power Modeling of Directly-Connected FPGA Clusters

    Kensuke IIZUKA  Haruna TAKAGI  Aika KAMEI  Kazuei HIRONAKA  Hideharu AMANO  

     
    PAPER

      Publicized:
    2023/07/20
      Vol:
    E106-D No:12
      Page(s):
    1997-2005

    An FPGA cluster is a promising platform for future computing, not only in the cloud but also in 5G wireless base stations with a limited power supply, because of its significant advantage in power efficiency. However, almost no power analyses with real systems have been reported. This work reports detailed power consumption analyses of two FPGA clusters, namely the FiC and M-KUBOS clusters, by introducing power measurement tools and running real applications. From the detailed analyses, we find that the number of activated links mainly determines the total power consumption of the systems, regardless of whether the links are used. To improve the performance of applications while reducing power consumption, we should increase the clock frequency of the applications, use the minimum number of links, and apply link aggregation. We also propose a power model for both clusters based on the results of the analyses; this model can estimate the total power consumption of both FPGA clusters at the design step with an error of at most 15%.

  • MITA: Multi-Input Adaptive Activation Function for Accurate Binary Neural Network Hardware

    Peiqi ZHANG  Shinya TAKAMAEDA-YAMAZAKI  

     
    PAPER

      Publicized:
    2023/05/24
      Vol:
    E106-D No:12
      Page(s):
    2006-2014

    Binary Neural Networks (BNNs) have binarized neuron and connection values so that their accelerators can be realized by extremely efficient hardware. However, there is a significant accuracy gap between BNNs and networks with wider bit-width. Conventional BNNs binarize feature maps by static globally-unified thresholds, which makes the produced bipolar image lose local details. This paper proposes a multi-input activation function to enable adaptive thresholding for binarizing feature maps: (a) At the algorithm level, instead of operating on each input pixel independently, adaptive thresholding dynamically changes the threshold according to the surrounding pixels of the target pixel. When optimizing weights, adaptive thresholding is equivalent to an accompanied depth-wise convolution between normal convolution and binarization. Accompanied weights in the depth-wise filters are ternarized and optimized end-to-end. (b) At the hardware level, adaptive thresholding is realized through a multi-input activation function, which is compatible with common accelerator architectures. Compact activation hardware with only one extra accumulator is devised. By implementing the proposed method on an FPGA, a 4.1% accuracy improvement is achieved over the original BNN with only 1.1% extra LUT resources. Compared with state-of-the-art methods, the proposed idea further increases network accuracy by 0.8% on the Cifar-10 dataset and 0.4% on the ImageNet dataset.
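
    The contrast between a static global threshold and a neighbourhood-dependent threshold can be sketched as follows. This is only a generic illustration of adaptive thresholding, not the MITA activation function itself: the function names, the use of the local mean as the threshold, and the `radius` parameter are all assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def binarize_global(fmap: Matrix, threshold: float = 0.0) -> List[List[int]]:
    """Binarize every pixel against one fixed threshold (+1 / -1 output)."""
    return [[1 if v >= threshold else -1 for v in row] for row in fmap]


def binarize_adaptive(fmap: Matrix, radius: int = 1) -> List[List[int]]:
    """Binarize each pixel against the mean of its own neighbourhood.

    Assumption for this sketch: the threshold is the plain mean of the
    (2*radius+1)^2 window around the pixel, clipped at the borders.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            neigh = [fmap[a][b]
                     for a in range(max(0, i - radius), min(h, i + radius + 1))
                     for b in range(max(0, j - radius), min(w, j + radius + 1))]
            local_mean = sum(neigh) / len(neigh)
            row.append(1 if fmap[i][j] >= local_mean else -1)
        out.append(row)
    return out


# A flat patch with one bright pixel in the middle.
fmap = [[5.0, 5.0, 5.0],
        [5.0, 9.0, 5.0],
        [5.0, 5.0, 5.0]]

global_out = binarize_global(fmap)      # every pixel becomes +1
adaptive_out = binarize_adaptive(fmap)  # only the bright centre stays +1
```

    With the global threshold the bright centre pixel is indistinguishable from its surroundings after binarization, whereas the neighbourhood-based threshold keeps it, which is the kind of local detail the abstract refers to.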

  • Adaptive Lossy Data Compression Extended Architecture for Memory Bandwidth Conservation in SpMV

    Siyi HU  Makiko ITO  Takahide YOSHIKAWA  Yuan HE  Hiroshi NAKAMURA  Masaaki KONDO  

     
    PAPER

      Publicized:
    2023/07/20
      Vol:
    E106-D No:12
      Page(s):
    2015-2025

    Widely adopted by machine learning and graph processing applications nowadays, sparse matrix-vector multiplication (SpMV) is a very popular algorithm in linear algebra. This is especially the case for fully-connected MLP layers, which dominate many SpMV computations and play a substantial role in diverse services. As a consequence, a large fraction of data center cycles is spent on SpMV kernels. Meanwhile, despite having efficient storage options for sparse data (such as CSR or CSC), SpMV kernels still suffer from limited memory bandwidth during data transfer because of the memory hierarchy of modern computing systems. In more detail, we find that both integer and floating-point data used in SpMV kernels are handled plainly without any pre-processing. Therefore, we believe bandwidth conservation techniques, such as data compression, may dramatically help SpMV kernels when data is transferred between the main memory and the Last Level Cache (LLC). Furthermore, we also observe that convergence conditions in some typical scientific computation benchmarks (based on SpMV kernels) are not degraded when adopting lower-precision floating-point data. Based on these findings, in this work, we propose a simple yet effective data compression scheme that can be extended to general-purpose computing architectures or, preferably, HPC systems. When it is adopted, a best-case speedup of 1.92x is achieved. Besides, evaluations with both the CG kernel and the PageRank algorithm indicate that our proposal introduces negligible overhead on both the convergence speed and the accuracy of final results.
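
    For readers unfamiliar with the CSR layout mentioned above, a minimal SpMV kernel over CSR arrays might look like the sketch below. The helper `to_half`, which rounds a value through IEEE half precision, is a hypothetical stand-in for "lower precision floating-point data"; it is not the compression scheme proposed in the paper.

```python
import struct
from typing import List, Sequence


def spmv_csr(values: Sequence[float], col_idx: Sequence[int],
             row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values  : non-zero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[r]..row_ptr[r+1] delimits the entries of row r
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def to_half(v: float) -> float:
    """Round a float through IEEE 754 half precision (illustrative only)."""
    return struct.unpack("<e", struct.pack("<e", v))[0]


# 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmv_csr(values, col_idx, row_ptr, x)

# Same product with the non-zeros stored at reduced precision.
y_lossy = spmv_csr([to_half(v) for v in values], col_idx, row_ptr, x)
```

    Every non-zero in `values` and every index in `col_idx` is read from memory exactly once per product, which is why shrinking those arrays translates directly into less memory traffic.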

  • I Never Trust My University for This! Investigating Student PII Leakage at Vietnamese Universities

    Ha DAO  Quoc-Huy VO  Tien-Huy PHAM  Kensuke FUKUDA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2023/09/06
      Vol:
    E106-D No:12
      Page(s):
    2048-2056

    Universities collect and process a massive amount of Personally Identifiable Information (PII) at registration and throughout interactions with individuals. However, student PII can be exposed to the public when documents are uploaded along with university notices without consent or awareness, which could put individuals at risk of a variety of scams, such as identity theft, fraud, or phishing. In this paper, we perform an in-depth analysis of student PII leakage at Vietnamese universities. To the best of our knowledge, we are the first to conduct a comprehensive study on student PII leakage in higher educational institutions. We find that 52.8% of Vietnamese universities leak student PII, including one or more types of personal data, in documents on their websites. It is important to note that the compromised PII includes sensitive types of data, such as student medical records and religion. Also, student PII leakage is not a new phenomenon; it has happened year after year since 2005. Finally, we present a study with 23 Vietnamese university employees who have worked on student PII to get a deeper understanding of this situation and envisage concrete solutions. The results are entirely surprising: the employees are highly aware of the concept of student PII. However, student PII leakage still happens due to their working habits or the lack of a management system and regulation. Therefore, Vietnamese universities should take a more active stand to protect student data in this situation.

  • Associating Colors with Mental States for Computer-Aided Drawing Therapy

    Satoshi MAEDA  Tadahiko KIMOTO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/09/14
      Vol:
    E106-D No:12
      Page(s):
    2057-2068

    The aim of a computer-aided drawing therapy system in this work is to associate drawings which a client makes with the client's mental state in quantitative terms. A case study is conducted on experimental data which contain both pastel drawings and mental state scores obtained from the same client in a psychotherapy program. To perform such association through colors, we translate a drawing to a color feature by measuring its representative colors as primary color rates. A primary color rate of a color is defined from a psychological primary color in such a way that it shows a rate of emotional properties of the psychological primary color which is supposed to affect the color. To obtain several informative colors as representative ones of a drawing, we define two kinds of color: approximate colors extracted by color reduction, and area-averaged colors calculated from the approximate colors. A color analysis method for extracting representative colors from each drawing in a drawing sequence under the same conditions is presented. To estimate how closely a color feature is associated with a concurrent mental state, we propose a method of utilizing machine-learning classification. A practical way of building a classification model through training and validation on a very small dataset is presented. The classification accuracy reached by the model is considered as the degree of association of the color feature with the mental state scores given in the dataset. Experiments were carried out on given clinical data. Several kinds of color feature were compared in terms of the association with the same mental state. As a result, we identified a good color feature with the highest degree of association. Also, primary color rates proved more effective in representing colors in psychological terms than RGB components. The experiments provide evidence that colors can be associated quantitatively with states of the human mind.

  • Shift Quality Classifier Using Deep Neural Networks on Small Data with Dropout and Semi-Supervised Learning

    Takefumi KAWAKAMI  Takanori IDE  Kunihito HOKI  Masakazu MURAMATSU  

     
    PAPER-Pattern Recognition

      Publicized:
    2023/09/05
      Vol:
    E106-D No:12
      Page(s):
    2078-2084

    In this paper, we apply two methods in machine learning, dropout and semi-supervised learning, to a recently proposed method called CSQ-SDL which uses deep neural networks for evaluating shift quality from time-series measurement data. When developing a new Automatic Transmission (AT), calibration takes place where many parameters of the AT are adjusted to realize pleasant driving experience in all situations that occur on all roads around the world. Calibration requires an expert to visually assess the shift quality from the time-series measurement data of the experiments each time the parameters are changed, which is iterative and time-consuming. The CSQ-SDL was developed to shorten time consumed by the visual assessment, and its effectiveness depends on acquiring a sufficient number of data points. In practice, however, data amounts are often insufficient. The methods proposed here can handle such cases. For the cases wherein only a small number of labeled data points is available, we propose a method that uses dropout. For those cases wherein the number of labeled data points is small but the number of unlabeled data is sufficient, we propose a method that uses semi-supervised learning. Experiments show that while the former gives moderate improvement, the latter offers a significant performance improvement.

  • Hierarchical Detailed Intermediate Supervision for Image-to-Image Translation

    Jianbo WANG  Haozhi HUANG  Li SHEN  Xuan WANG  Toshihiko YAMASAKI  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2023/09/14
      Vol:
    E106-D No:12
      Page(s):
    2085-2096

    The image-to-image translation aims to learn a mapping between the source and target domains. For improving visual quality, the majority of previous works adopt multi-stage techniques to refine coarse results in a progressive manner. In this work, we present a novel approach for generating plausible details by only introducing a group of intermediate supervisions without cascading multiple stages. Specifically, we propose a Laplacian Pyramid Transformation Generative Adversarial Network (LapTransGAN) to simultaneously transform components in different frequencies from the source domain to the target domain within only one stage. Hierarchical perceptual and gradient penalization are utilized for learning consistent semantic structures and details at each pyramid level. The proposed model is evaluated based on various metrics, including the similarity in feature maps, reconstruction quality, segmentation accuracy, similarity in details, and qualitative appearances. Our experiments show that LapTransGAN can achieve a much better quantitative performance than both the supervised pix2pix model and the unsupervised CycleGAN model. Comprehensive ablation experiments are conducted to study the contribution of each component.

  • Single-Line Text Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition

    Chee Siang LEOW  Hideaki YAJIMA  Tomoki KITAGAWA  Hiromitsu NISHIZAKI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/08/31
      Vol:
    E106-D No:12
      Page(s):
    2097-2106

    Text detection is a crucial pre-processing step in optical character recognition (OCR) for the accurate recognition of text, including both fonts and handwritten characters, in documents. While current deep learning-based text detection tools can detect text regions with high accuracy, they often treat multiple lines of text as a single region. To perform line-based character recognition, it is necessary to divide the text into individual lines, which requires a line detection technique. This paper focuses on the development of a new approach to single-line detection in OCR that is based on the existing Character Region Awareness For Text detection (CRAFT) model and incorporates a deep neural network specialized in line segmentation. However, this new method may still detect multiple lines as a single text region when multi-line text with narrow spacing is present. To address this, we also introduce a post-processing algorithm to detect single text regions using the output of the single-line segmentation. Our proposed method successfully detects single lines, even in multi-line text with narrow line spacing, and hence improves the accuracy of OCR.

  • Energy-Efficient One-to-One and Many-to-One Concurrent Transmission for Wireless Sensor Networks

    SenSong HE  Ying QIU  

     
    LETTER-Information Network

      Publicized:
    2023/09/19
      Vol:
    E106-D No:12
      Page(s):
    2107-2111

    Recent studies have shown that concurrent transmission with precise time synchronization enables reliable and efficient flooding for wireless networks. However, most of them require all nodes in the network to forward packets a fixed number of times to reach the destination, which leads to unnecessary energy consumption in both one-to-one and many-to-one communication scenarios. In this letter, we propose G1M to address this issue by reducing redundant packet forwarding in concurrent transmissions. The evaluation of G1M shows that, compared with LWB, the average energy consumption of one-to-one and many-to-one transmission is reduced by 37.89% and 25%, respectively.

  • User Verification Using Evoked EEG by Invisible Visual Stimulation

    Atikur RAHMAN  Nozomu KINJO  Isao NAKANISHI  

     
    PAPER-Biometrics

      Publicized:
    2023/06/19
      Vol:
    E106-A No:12
      Page(s):
    1569-1576

    Person authentication using biometric information has recently become popular among researchers. User management based on biometrics is more reliable than that using conventional methods. To secure private information, it is necessary to build continuous authentication-based user management systems. Brain waves are suitable biometric modalities for continuous authentication. This study is based on biometric authentication using brain waves evoked by invisible visual stimuli. Invisible visual stimulation is considered over visual stimulation to overcome the obstacles faced by a user when using a system. Invisible stimuli are confirmed by changing the intensity of the image and presenting high-speed stimulation. To ensure invisibility, stimuli of different intensities were tested, and the stimulus with an intensity of 5% was confirmed to be invisible. To improve the verification performance, a continuous wavelet transform was introduced over the Fourier transform because it extracts both time and frequency information from the brain wave. The scalogram obtained by the wavelet transform was used as an individual feature and for synchronizing the template and test data. Furthermore, to improve the synchronization performance, the waveband was split based on the power distribution of the scalogram. A performance evaluation using 20 subjects showed an equal error rate of 3.8%.

  • Comments on Quasi-Linear Support Vector Machine for Nonlinear Classification

    Sei-ichiro KAMATA  Tsunenori MINE  

     
    WRITTEN DISCUSSION-General Fundamentals and Boundaries

      Publicized:
    2023/05/08
      Vol:
    E106-A No:11
      Page(s):
    1444-1445

    In 2014, the above paper entitled ‘Quasi-Linear Support Vector Machine for Nonlinear Classification’ was published by Zhou, et al. [1]. They proposed a quasi-linear kernel function for support vector machine (SVM). However, in this letter, we point out that this proposed kernel function is one of the multiple kernel functions generated by the well-known multiple kernel learning, which was proposed by Bach, et al. [2] in 2004. Since then, there have been many related papers on multiple kernel learning with several applications [3]. This letter verifies that the main kernel function proposed by Zhou, et al. [1] can be derived using multiple kernel learning algorithms [3]. In the kernel construction, Zhou, et al. [1] used Gaussian kernels, but the locality of additive Gaussian kernels or other kernels had already been discussed in the multiple kernel learning framework [4], [5]. In particular, additive Gaussian or other kernels were discussed in a tutorial at the major international conference ECCV2012 [6]. The authors did not discuss these matters.

  • An In-Vehicle Auditory Signal Evaluation Platform based on a Driving Simulator

    Fuma SAWA  Yoshinori KAMIZONO  Wataru KOBAYASHI  Ittetsu TANIGUCHI  Hiroki NISHIKAWA  Takao ONOYE  

     
    PAPER-Acoustics

      Publicized:
    2023/05/22
      Vol:
    E106-A No:11
      Page(s):
    1368-1375

    Advanced driver-assistance systems (ADAS) generally play an important role in supporting safe driving by detecting potential risk factors beforehand and informing the driver of them. However, if too many services in ADAS rely on visual-based technologies, the driver becomes increasingly burdened and exhausted, especially in the eyes. Drivers should be relieved of monitoring tasks other than the most important ones in order to alleviate their burden as much as possible. In-vehicle auditory signals to assist safe driving have been attracting attention in recent years as an alternative to visual cues. In this paper, we developed an in-vehicle auditory signal evaluation platform in an existing driving simulator. In addition, using in-vehicle auditory signals, we have demonstrated with the developed platform the possibility of partially switching from only visual-based tasks to a mix with auditory-based ones to alleviate the burden on drivers.

  • i-MSE: A Fine Structure Imaging for Surface and Its Inside of Solid Material with Micro Slurry-Jet Erosion Test

    Shinji FUKUMA  Yoshiro IWAI  Shin-ichiro MORI  

     
    PAPER-Image

      Publicized:
    2023/05/22
      Vol:
    E106-A No:11
      Page(s):
    1376-1384

    We propose a fine-structure imaging method for the surface and interior of solid materials such as drill bits coated with TiN (Titanium Nitride). We call this method i-MSE (innovative MSE) since the fine structure is visualized with a local mechanical strength (the local erosion rate), which is obtained from a set of erosion depth profiles measured with the Micro Slurry-jet Erosion test (MSE). The local erosion rate at any sampling point is estimated from the depth profile using a sliding window regression, and for the rest of the 2-dimensional points it is interpolated with the mean value coordinate technique. The interpolated rate is converted to a 2D image (i-MSE image) with a color map. The i-MSE image can distinguish layers if the testing material surface is composed of coats which have different resistance to erosion (erosive wear), while microscopic images such as SEM (Scanning Electron Microscope) images and a calotest provide only appearance information, not physical characteristics. Experiments on some layered specimens show that i-MSE can be an effective tool to visualize the structure and to evaluate the mechanical characteristics of the surface and the interior of solid materials.
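
    The sliding window regression step can be sketched as below. The abstract does not state which regression is used, so this example assumes an ordinary least-squares line fitted in each window and takes its slope as the local erosion rate; the function name, the window length, and the sample profile are all made up for illustration.

```python
from typing import List, Sequence


def sliding_slopes(x: Sequence[float], y: Sequence[float],
                   window: int = 3) -> List[float]:
    """Least-squares slope of y against x in each sliding window.

    Assumes the x values inside every window are not all identical,
    otherwise the slope is undefined (division by zero).
    """
    slopes = []
    for start in range(len(x) - window + 1):
        xs = x[start:start + window]
        ys = y[start:start + window]
        mean_x = sum(xs) / window
        mean_y = sum(ys) / window
        sxx = sum((a - mean_x) ** 2 for a in xs)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
        slopes.append(sxy / sxx)
    return slopes


# Hypothetical depth profile: erosion depth versus amount of slurry.
# The first layer erodes at rate 2, the layer underneath at rate 1.
amount = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
depth = [0.0, 2.0, 4.0, 5.0, 6.0, 7.0]

rates = sliding_slopes(amount, depth, window=3)
```

    In this toy profile the estimated rate drops from about 2 to about 1 as the window crosses the layer boundary, which is the kind of change an erosion-rate image would render as a different colour.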

  • Deep Unrolling of Non-Linear Diffusion with Extended Morphological Laplacian

    Gouki OKADA  Makoto NAKASHIZUKA  

     
    PAPER-Image

      Publicized:
    2023/07/21
      Vol:
    E106-A No:11
      Page(s):
    1395-1405

    This paper presents a deep network based on unrolling the diffusion process with the morphological Laplacian. The diffusion process is an iterative algorithm that can solve the diffusion equation and represents time evolution with the Laplacian. The diffusion process is applied to the smoothing of images and has been extended with non-linear operators for various image processing tasks. In this study, we introduce the morphological Laplacian to the basic diffusion process and unroll it into deep networks. The morphological filters are non-linear operators with parameters that are referred to as structuring elements. The discrete Laplacian can be approximated with the morphological filters without multiplications. Owing to the non-linearity of the morphological filter with trainable structuring elements, the training uses error back propagation and the morphological network can be adapted to specific image processing applications. We introduce two extensions of the morphological Laplacian for deep networks. Since the morphological filters are realized with addition, max, and min, the error caused by the limited bit-length is not amplified. Consequently, the morphological parts of the network are implemented in unsigned 8-bit integers with single instruction, multiple data (SIMD) operations to achieve fast computation on small devices. We applied the proposed network to image completion and Gaussian denoising. The results and computational time are compared with those of other denoising algorithms and deep networks.
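
    As a rough illustration of how a Laplacian can be approximated without multiplications, the sketch below uses the standard morphological Laplacian (dilation plus erosion minus twice the signal) on a 1-D signal. It is not the extended operator proposed in the paper: the function names, the 1-D setting, the symmetric structuring element, and the border handling (clipping the window) are assumptions for the example.

```python
from typing import List, Sequence


def dilate(signal: Sequence[int], se: Sequence[int]) -> List[int]:
    """Grey-scale dilation: max of (signal + structuring element) per window.

    Assumes an odd-length, symmetric structuring element `se`.
    """
    r, n = len(se) // 2, len(signal)
    return [max(signal[i + k - r] + se[k]
                for k in range(len(se)) if 0 <= i + k - r < n)
            for i in range(n)]


def erode(signal: Sequence[int], se: Sequence[int]) -> List[int]:
    """Grey-scale erosion: min of (signal - structuring element) per window."""
    r, n = len(se) // 2, len(signal)
    return [min(signal[i + k - r] - se[k]
                for k in range(len(se)) if 0 <= i + k - r < n)
            for i in range(n)]


def morphological_laplacian(signal: Sequence[int],
                            se: Sequence[int]) -> List[int]:
    """(dilation - f) + (erosion - f), using only add, subtract, max, min."""
    d, e = dilate(signal, se), erode(signal, se)
    return [(dv - s) + (ev - s) for dv, ev, s in zip(d, e, signal)]


flat_se = [0, 0, 0]            # flat 3-sample structuring element
signal = [0, 1, 4, 9, 16]      # samples of x^2, second difference is 2

lap = morphological_laplacian(signal, flat_se)
```

    On this monotonic signal the max and min of each 3-sample window are simply the two neighbours, so the interior values coincide with the usual second difference; the last value differs only because the window is clipped at the border.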

  • U-Net Architecture for Ancient Handwritten Chinese Character Detection in Han Dynasty Wooden Slips

    Hojun SHIMOYAMA  Soh YOSHIDA  Takao FUJITA  Mitsuji MUNEYASU  

     
    PAPER-Image

      Publicized:
    2023/05/15
      Vol:
    E106-A No:11
      Page(s):
    1406-1415

    Recent character detectors have been modeled using deep neural networks and have achieved high performance in various tasks, such as text detection in natural scenes and character detection in historical documents. However, existing methods cannot achieve high detection accuracy for wooden slips because of their multi-scale character sizes and aspect ratios, high character density, and close character-to-character distance. In this study, we propose a new U-Net-based character detection and localization framework that learns character regions and boundaries between characters. The proposed method enhances the learning performance of character regions by simultaneously learning the vertical and horizontal boundaries between characters. Furthermore, by adding simple and low-cost post-processing using the learned regions of character boundaries, it is possible to more accurately detect the location of a group of characters in a close neighborhood. In this study, we construct a wooden slip dataset. Experiments demonstrated that the proposed method outperformed existing character detection methods, including state-of-the-art character detection methods for historical documents.
