Author Search Result

[Author] Feng XU(25hit)

1-20hit(25hit)

  • Multi Information Fusion Network for Saliency Quality Assessment

    Kai TAN  Qingbo WU  Fanman MENG  Linfeng XU  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/02/26
      Vol:
    E102-D No:5
      Page(s):
    1111-1114

    Saliency quality assessment aims at estimating the objective quality of a saliency map without access to the ground truth. Existing works typically evaluate saliency quality by using information from the saliency map to assess its compactness and closedness, while ignoring the information from image content, which can be used to assess the consistency and completeness of the foreground. In this letter, we propose a novel multi-information fusion network to capture the information from both the saliency map and image content. The key idea is to introduce a Siamese module to collect information from foreground and background, aiming to assess the consistency and completeness of the foreground and the difference between foreground and background. Experiments demonstrate that by incorporating image content information, the performance of the proposed method is significantly boosted. Furthermore, we validate our method on two applications: saliency detection and segmentation. Our method is used to choose the optimal saliency map from a set of candidate saliency maps, and the selected saliency map is fed into a segmentation algorithm to generate a segmentation map. Experimental results verify the effectiveness of our method.

  • Improved Intra Prediction Coding Scheme Based on Minimum Distance Prediction for H.264/AVC

    Qingbo WU  Linfeng XU  Zhengning WANG  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E96-D No:4
      Page(s):
    980-983

    In this letter, we propose a novel intra prediction coding scheme for H.264/AVC. Based on our proposed minimum distance prediction (MDP) scheme, the optimal reference samples for predicting the current pixel can be adaptively updated corresponding to different video contents. The experimental results show that up to 2 dB and 1 dB coding gains can be achieved with the proposed method for QCIF and CIF sequences respectively.

  • Estimation of Bridge Height over Water from Polarimetric SAR Image Data Using Mapping and Projection Algorithm and De-Orientation Theory

    Haipeng WANG  Feng XU  Ya-Qiu JIN  Kazuo OUCHI  

     
    PAPER-Sensing

      Vol:
    E92-B No:12
      Page(s):
    3875-3882

    An inversion method for the height of a bridge over water using polarimetric synthetic aperture radar (SAR) is developed. A geometric ray description illustrating the scattering mechanism of a bridge over a water surface is identified by polarimetric image analysis. Using the mapping and projection algorithm, a polarimetric SAR image of a bridge model is first simulated, showing that scattering from a bridge over water can be identified by three strip lines corresponding to single-, double-, and triple-order scattering, respectively. A set of polarimetric parameters based on the de-orientation theory is applied to the analysis of the three types of scattering, and the thinning-clustering algorithm and Hough transform are then employed to locate the image positions of these strip lines. These lines are used to invert the bridge height. Fully polarimetric image data of airborne Pi-SAR at X-band are applied to inversion of the height and width of the Naruto Bridge in Japan. Based on the same principle, this approach is also applicable to spaceborne ALOS PALSAR single-polarization data of the Eastern Ocean Bridge in China. The results show good feasibility of the bridge height inversion.

  • ConvNeXt-Haze: A Fog Image Classification Algorithm for Small and Imbalanced Sample Dataset Based on Convolutional Neural Network

    Fuxiang LIU  Chen ZANG  Lei LI  Chunfeng XU  Jingmin LUO  

     
    PAPER

      Publicized:
    2022/11/22
      Vol:
    E106-D No:4
      Page(s):
    488-494

    Because defogging algorithms perform differently at different fog concentrations, this paper proposes a fog image classification algorithm for a small and imbalanced sample dataset based on a convolutional neural network. It classifies fog images in advance so as to improve the effect and adaptive ability of image defogging algorithms in fog and haze weather. To address environmental interference, camera depth-of-field interference, and uneven feature distribution in fog images, the CutBlur-Gauss data augmentation method and focal loss and label smoothing strategies are used to improve classification accuracy. The algorithm is compared with the machine learning algorithm SVM and the classical convolutional neural network classifiers AlexNet, ResNet34, ResNet50, and ResNet101. It achieves 94.5% classification accuracy on the dataset in this paper, the best accuracy among the compared algorithms, which shows that the improved algorithm has better classification accuracy.
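
The abstract above names focal loss and label smoothing but does not say how they are combined. The sketch below is one plausible combination, not the authors' implementation: the function names, the smoothing factor `eps`, and the focusing parameter `gamma` are illustrative assumptions. It applies the focal modulating factor `(1 - p)^gamma` to each class term of a cross-entropy computed against smoothed labels.

```python
import math

def smooth_labels(target, num_classes, eps=0.1):
    """Smoothed one-hot vector: (1 - eps) on the target plus eps / K on every class."""
    return [(1.0 - eps) * (1.0 if c == target else 0.0) + eps / num_classes
            for c in range(num_classes)]

def focal_loss(probs, target, gamma=2.0, eps=0.1):
    """Focal loss against smoothed labels (an assumed combination).

    probs  : predicted class probabilities, each strictly positive
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 gives plain cross-entropy
    eps    : label-smoothing factor; eps = 0 gives hard labels
    """
    labels = smooth_labels(target, len(probs), eps)
    return -sum(y * (1.0 - p) ** gamma * math.log(p)
                for y, p in zip(labels, probs) if y > 0.0)

# With gamma = 0 and eps = 0 the loss reduces to ordinary cross-entropy.
cross_entropy = focal_loss([0.7, 0.2, 0.1], 0, gamma=0.0, eps=0.0)

# A confidently correct sample contributes far less than an uncertain one.
easy = focal_loss([0.9, 0.05, 0.05], 0, gamma=2.0, eps=0.0)
hard = focal_loss([0.3, 0.4, 0.3], 0, gamma=2.0, eps=0.0)
```

Down-weighting easy samples in this way is the usual motivation for focal loss on imbalanced data, where a few frequent classes would otherwise dominate the gradient.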

  • FSPose: A Heterogeneous Framework with Fast and Slow Networks for Human Pose Estimation in Videos

    Jianfeng XU  Satoshi KOMORITA  Kei KAWAMURA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/03/20
      Vol:
    E106-D No:6
      Page(s):
    1165-1174

    We propose a framework for the integration of heterogeneous networks in human pose estimation (HPE) with the aim of balancing accuracy and computational complexity. Although many existing methods can improve the accuracy of HPE using multiple frames in videos, they also increase the computational complexity. The key difference here is that the proposed heterogeneous framework has various networks for different types of frames, while existing methods use the same networks for all frames. In particular, we propose to divide the video frames into two types, including key frames and non-key frames, and adopt three networks including slow networks, fast networks, and transfer networks in our heterogeneous framework. For key frames, a slow network is used that has high accuracy but high computational complexity. For non-key frames that follow a key frame, we propose to warp the heatmap of a slow network from a key frame via a transfer network and fuse it with a fast network that has low accuracy but low computational complexity. Furthermore, when extending to the usage of long-term frames where a large number of non-key frames follow a key frame, the temporal correlation decreases. Therefore, when necessary, we use an additional transfer network that warps the heatmap from a neighboring non-key frame. The experimental results on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed FSPose achieves a better balance between accuracy and computational complexity than the competitor method. Our source code is available at https://github.com/Fenax79/fspose.

  • Efficient Generation of Dancing Animation Synchronizing with Music Based on Meta Motion Graphs

    Jianfeng XU  Koichi TAKAGI  Shigeyuki SAKAZAWA  

     
    PAPER-Computer Graphics

      Vol:
    E95-D No:6
      Page(s):
    1646-1655

    This paper presents a system for automatic generation of dancing animation that is synchronized with a piece of music by re-using motion capture data. Basically, the dancing motion is synthesized according to the rhythm and intensity features of music. For this purpose, we propose a novel meta motion graph structure to embed the necessary features including both rhythm and intensity, which is constructed on the motion capture database beforehand. In this paper, we consider two scenarios, non-streaming music and streaming music, where global search and local search are required, respectively. In the former case, once a piece of music is input, an efficient dynamic programming algorithm can be employed to globally search for the best path in the meta motion graph, where an objective function is properly designed by measuring the quality of beat synchronization, intensity matching, and motion smoothness. In the latter case, the input music is stored in a buffer in a streaming mode, and an efficient search method is presented for a certain amount of music data (called a segment) in the buffer with the same objective function, resulting in a segment-based search approach. For streaming applications, we define an additional property in the above meta motion graph to deal with the unpredictable future music, which guarantees that there is some motion to match the unknown remaining music. A user study with a total of 60 subjects demonstrates that our system outperforms the state-of-the-art techniques in both scenarios. Furthermore, our system improves the synthesis speed greatly (maximal speedup is more than 500 times), which is essential for mobile applications. We have implemented our system on commercially available smart phones and confirmed that it works well on these mobile phones.

  • A Propagation Method for Multi Object Tracklet Repair

    Nii L. SOWAH  Qingbo WU  Fanman MENG  Liangzhi TANG  Yinan LIU  Linfeng XU  

     
    LETTER-Pattern Recognition

      Publicized:
    2018/05/29
      Vol:
    E101-D No:9
      Page(s):
    2413-2416

    In this paper, we improve upon the accuracy of existing tracklet generation methods by repairing tracklets based on their quality evaluation and detection propagation. Starting from object detections, we generate tracklets using three existing methods. Then we perform co-tracklet quality evaluation to score each tracklet and filter out good tracklets based on their scores. A detection propagation method is designed to transfer the detections in the good tracklets to the bad ones so as to repair bad tracklets. The tracklet quality evaluation in our method is implemented by intra-tracklet detection consistency and inter-tracklet detection completeness. Two propagation methods, global propagation and local propagation, are defined to achieve more accurate tracklet propagation. We demonstrate the effectiveness of the proposed method on the MOT 15 dataset.

  • Texture Representation via Joint Statistics of Local Quantized Patterns

    Tiecheng SONG  Linfeng XU  Chao HUANG  Bing LUO  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E97-D No:1
      Page(s):
    155-159

    In this paper, a simple yet efficient texture representation is proposed for texture classification by exploring the joint statistics of local quantized patterns (jsLQP). In order to combine information of different domains, the Gaussian derivative filters are first employed to obtain the multi-scale gradient responses. Then, three feature maps are generated by encoding the local quantized binary and ternary patterns in the image space and the gradient space. Finally, these feature maps are hybridly encoded, and their joint histogram is used as the final texture representation. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art LBP based and even learning based methods for texture classification.

  • De-Blocking Artifacts in DCT Domain Using Projection onto Convex Sets Algorithm

    Hai-Feng XU  Song-Yu YU  Ci WANG  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E89-D No:8
      Page(s):
    2460-2463

    Based on the theory of block projection onto convex sets (BPOCS), a novel de-blocking algorithm is proposed. A new smoothness constraint set (SCS) is used to remove the unnecessary high frequencies. In addition, an adaptive quantization constraint set (AQCS) is employed to suppress error in the smoothing process. The proposed size and position of new SCS are different from traditional ones. Extensive experimental results are provided to demonstrate that the proposed method can achieve better image quality with fewer iterations.

  • Learning a Saliency Map for Fixation Prediction

    Linfeng XU  Liaoyuan ZENG  Zhengning WANG  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E96-D No:10
      Page(s):
    2294-2297

    In this letter, we use the saliency maps obtained by several bottom-up methods to learn a model to generate a bottom-up saliency map. In order to consider top-down image semantics, we use the high-level features of objectness and background probability to learn a top-down saliency map. The bottom-up map and top-down map are combined through a two-layer structure. Quantitative experiments demonstrate that the proposed method and features are effective for predicting human fixations.

  • Summarization of 3D Video by Rate-Distortion Trade-off

    Jianfeng XU  Toshihiko YAMASAKI  Kiyoharu AIZAWA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E90-D No:9
      Page(s):
    1430-1438

    3D video, which consists of a sequence of mesh models, can reproduce dynamic scenes containing 3D information. To summarize 3D video, a key frame extraction method is developed using rate-distortion (R-D) trade-off. For this purpose, an effective feature vector is extracted for each frame. Shot detection is performed using the feature vectors as a preprocessing step, followed by key frame extraction. Simple but reasonable definitions of rate and distortion are presented. Based on an assumption of linearity, an R-D curve is generated in each shot, where the locations of the key frames are optimized. Finally, R-D trade-off can be achieved by optimizing a cost function using a Lagrange multiplier, where the number of key frames is optimized in each shot. Therefore, our system will automatically determine the best locations and the number of key frames in the sense of R-D trade-off. Our experimental results show that the extracted key frames are compact and faithful to the original 3D video.
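
The Lagrangian trade-off mentioned in the abstract above can be illustrated with a generic sketch. It assumes the common cost form J = D + λR and treats the rate simply as the number of key frames; the paper's own definitions of rate and distortion are not given here, so the sample curve and the function name are hypothetical.

```python
def best_operating_point(points, lam):
    """Return the (rate, distortion) pair that minimizes D + lam * R."""
    return min(points, key=lambda rd: rd[1] + lam * rd[0])

# Hypothetical R-D curve for one shot: (number of key frames, distortion).
curve = [(1, 10.0), (2, 6.0), (3, 4.0), (4, 3.5), (5, 3.3)]

few_frames = best_operating_point(curve, lam=5.0)    # rate is expensive
balanced = best_operating_point(curve, lam=1.0)
many_frames = best_operating_point(curve, lam=0.1)   # rate is cheap
```

A larger multiplier penalizes rate more heavily and so selects fewer key frames, while a smaller one favors low distortion.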

  • Self-Supervised Learning of Video Representation for Anticipating Actions in Early Stage

    Yinan LIU  Qingbo WU  Liangzhi TANG  Linfeng XU  

     
    LETTER-Pattern Recognition

      Publicized:
    2018/02/21
      Vol:
    E101-D No:5
      Page(s):
    1449-1452

    In this paper, we propose a novel self-supervised learning method for video representation that is capable of anticipating the video category from only a short clip. The key idea is that we employ a Siamese convolutional network to model the self-supervised feature learning as two different image matching problems. By using frame encoding, the proposed video representation can be extracted from different temporal scales. We refine the training process via a motion-based temporal segmentation strategy. The learned video representations can be applied not only to action anticipation but also to action recognition. We verify the effectiveness of the proposed approach on both action anticipation and action recognition using two datasets, namely UCF101 and HMDB51. The experiments show that we can achieve results comparable with the state-of-the-art self-supervised learning methods on both tasks.

  • Capacitance Extraction of Three-Dimensional Interconnects Using Element-by-Element Finite Element Method (EBE-FEM) and Preconditioned Conjugate Gradient (PCG) Technique

    Jianfeng XU  Hong LI  Wen-Yan YIN  Junfa MAO  Le-Wei LI  

     
    PAPER-Integrated Electronics

      Vol:
    E90-C No:1
      Page(s):
    179-188

    The element-by-element finite element method (EBE-FEM) combined with the preconditioned conjugate gradient (PCG) technique is employed in this paper to calculate the coupling capacitances of multi-level high-density three-dimensional interconnects (3DIs). All capacitive coupling 3DIs can be captured, with the effects of all geometric and physical parameters taken into account. It is numerically demonstrated that with this hybrid method in the extraction of capacitances, an effective and accurate convergent solution to the Laplace equation can be obtained, with less memory and CPU time required, as compared to the results obtained by using the commercial FEM software of either MAXWELL 3D or ANSYS.
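
To make the EBE-FEM and PCG combination in the abstract above concrete, here is a deliberately reduced sketch: a one-dimensional Laplace problem on [0, 1] with fixed end values, rather than the paper's three-dimensional interconnects. The stiffness matrix is never assembled; each matrix-vector product is accumulated from 2x2 element contributions. The Jacobi (diagonal) preconditioner and all names are assumptions, since the abstract does not specify them.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ebe_matvec(v, n_elems, h):
    """Compute K @ v element by element; the global matrix K is never stored."""
    out = [0.0] * (n_elems + 1)
    for e in range(n_elems):
        # Element stiffness (1/h) * [[1, -1], [-1, 1]] applied to (v[e], v[e+1]).
        flux = (v[e] - v[e + 1]) / h
        out[e] += flux
        out[e + 1] -= flux
    return out

def solve_laplace_1d(n_elems, left, right, tol=1e-12, max_iter=1000):
    """Solve u'' = 0 on [0, 1] with u(0) = left, u(1) = right by Jacobi-PCG."""
    n = n_elems + 1
    h = 1.0 / n_elems
    # Lift the Dirichlet values and move their contribution to the right-hand side.
    g = [0.0] * n
    g[0], g[-1] = left, right
    kg = ebe_matvec(g, n_elems, h)
    b = [0.0] + [-kg[i] for i in range(1, n - 1)] + [0.0]

    def apply_k(v):
        out = ebe_matvec(v, n_elems, h)
        out[0] = out[-1] = 0.0  # boundary nodes are not unknowns
        return out

    inv_diag = h / 2.0  # inverse of the interior diagonal entry 2 / h
    x = [0.0] * n
    r = b[:]
    z = [inv_diag * ri for ri in r]
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) < tol:
            break
        kp = apply_k(p)
        alpha = rz / dot(p, kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, kp)]
        z = [inv_diag * ri for ri in r]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return [xi + gi for xi, gi in zip(x, g)]

# The exact solution is the straight line between the two boundary values.
u = solve_laplace_1d(8, 0.0, 1.0)
```

Because only element-level products are formed, memory use grows with the number of elements rather than with the size of an assembled matrix, which is consistent with the memory saving the abstract reports. On this uniform 1D mesh the Jacobi preconditioner is a constant scaling, so it only shows where a preconditioner enters the iteration.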

  • A Virtualization-Based Approach for Application Whitelisting

    Donghai TIAN  Jingfeng XUE  Changzhen HU  Xuanya LI  

     
    LETTER-Software System

      Vol:
    E97-D No:6
      Page(s):
    1648-1651

    A whitelisting approach is a promising solution to prevent unwanted processes (e.g., malware) from being executed. However, previous solutions suffer from two limitations: 1) most methods place the whitelist information in the kernel space, where it could be tampered with by attackers; 2) most methods cannot prevent the execution of kernel processes. In this paper, we present VAW, a novel application whitelisting system that uses virtualization technology. Our system is able to block the execution of unauthorized user and kernel processes. Compared with the previous solutions, our approach can achieve stronger security guarantees. The experiments show that VAW can effectively deny the execution of unwanted processes with little performance overhead.

  • Thermal Effect Simulation of GaN HFETs under CW and Pulsed Operation

    Jianfeng XU  Wen-Yan YIN  Junfa MAO  Le-Wei LI  

     
    LETTER-Electronic Components

      Vol:
    E90-C No:1
      Page(s):
    204-207

    In this paper, the thermal characteristic of the GaN HFETs has been analyzed using the hybrid finite element method (FEM). Both the steady and transient state thermal operations are quantitatively studied with the effects of temperature-dependent thermal conductivities of GaN and the substrate materials properly treated. The temperature distribution and the maximum temperatures of the HFETs operated under excitations of continuous-waves (CW) and pulsed-waves (PW) including double exponential shape PW such as electromagnetic pulse (EMP) and ultra-wideband (UWB) signal are studied and compared.

  • Small Group Detection in Crowds using Interaction Information

    Kai TAN  Linfeng XU  Yinan LIU  Bing LUO  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2017/04/17
      Vol:
    E100-D No:7
      Page(s):
    1542-1545

    Small group detection is still a challenging problem in crowds. Traditional methods use trajectory information to measure pairwise similarity, which is sensitive to variations in group density and interactive behaviors. In this paper, we propose to detect small groups in crowds by simultaneously incorporating two types of information: trajectory and interaction information. The trajectory information is used to describe the spatial proximity and motion information between trajectories. The interaction information is designed to capture the interactive behaviors from the video sequence. To achieve this goal, two classifiers are exploited to discover interpersonal relations. The assumption is that interactive behaviors often occur among group members while there are no interactions between individuals in different groups. The pairwise similarity is enhanced by combining the two types of information. Finally, an efficient clustering approach is used to achieve small group detection. Experiments show that a significant improvement is gained by exploiting the interaction information and that the proposed method outperforms the state-of-the-art methods.

  • On the Use of Shanks Transformation to Accelerate Capacitance Extraction for Periodic Structures

    Ye LIU  Zheng-Fan LI  Mei XUE  Rui-Feng XUE  

     
    LETTER-Electromagnetic Theory

      Vol:
    E87-C No:6
      Page(s):
    1078-1081

    In this paper, the integral equation method is used to compute the capacitance of three-dimensional structures. Since some multi-conductor structures are regularly periodic, a periodic cell with appropriate magnetic and electric walls is used to reduce the computational domain. The periodic Green's function in the integral equation method is represented as an infinite series with slow convergence. In this paper, the Shanks transformation is used to accelerate the convergence. Numerical examples show that the proposed method is accurate and much more efficient in capacitance extraction for 3-D periodic structures.
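
The Shanks transformation used in the abstract above maps a sequence of partial sums A_n to S(A_n) = (A_{n+1} A_{n-1} - A_n^2) / (A_{n+1} - 2 A_n + A_{n-1}). The sketch below applies it to the slowly converging Leibniz series for π, which stands in for the periodic Green's function series (the paper's actual series is not reproduced here); the function names are illustrative.

```python
import math

def partial_sums(terms):
    """Running partial sums of a sequence of series terms."""
    sums, total = [], 0.0
    for t in terms:
        total += t
        sums.append(total)
    return sums

def shanks(seq):
    """One Shanks transformation; the result is two entries shorter than seq."""
    out = []
    for n in range(1, len(seq) - 1):
        denom = seq[n + 1] - 2.0 * seq[n] + seq[n - 1]
        if denom == 0.0:
            out.append(seq[n])  # second difference vanished; keep the current value
        else:
            out.append((seq[n + 1] * seq[n - 1] - seq[n] ** 2) / denom)
    return out

# Leibniz series: pi = 4 * (1 - 1/3 + 1/5 - ...), using only 12 terms.
terms = [4.0 * (-1) ** k / (2 * k + 1) for k in range(12)]
raw = partial_sums(terms)
once = shanks(raw)
twice = shanks(once)

for name, seq in (("raw", raw), ("once", once), ("twice", twice)):
    print(name, abs(seq[-1] - math.pi))
```

The transformation can be applied repeatedly, and each pass uses only values already computed, so the extra cost is small compared with summing many more series terms.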

  • Content-Based Retrieval of Motion Capture Data Using Short-Term Feature Extraction

    Jianfeng XU  Haruhisa KATO  Akio YONEYAMA  

     
    PAPER-Contents Technology and Web Information Systems

      Vol:
    E92-D No:9
      Page(s):
    1657-1667

    This paper presents a content-based retrieval algorithm for motion capture data, which is required to re-use a large-scale database that has many variations in the same category of motions. The most challenging problem is that logically similar motions may not be numerically similar due to the motion variations in a category. Our algorithm can effectively retrieve logically similar motions to a query, where a distance metric between our novel short-term features is defined properly as a fundamental component in our system. We extract the features based on short-term analysis of joint velocities after dividing an entire motion capture sequence into many small overlapped clips. In each clip, we select not only the magnitude but also the dynamic pattern of the joint velocities as our features, which can discard the motion variations while keeping the significant motion information in a category. Simultaneously, the amount of data is reduced, alleviating the computational cost. Using the extracted features, we define a novel distance metric between two motion clips. By dynamic time warping, a motion dissimilarity measure is calculated between two motion capture sequences. Then, given a query, we rank all the motions in our dataset according to their motion dissimilarity measures. Our experiments, which are performed on a test dataset consisting of more than 190 motions, demonstrate that our algorithm greatly improves the performance compared to two conventional methods according to a popular evaluation measure P(NR).
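
The retrieval step described above (dynamic time warping over per-clip features, then ranking by dissimilarity) can be sketched generically. The version below uses scalar features and absolute difference as the clip distance; the paper's short-term features and its own distance metric are not reproduced, and the database contents are made up for illustration.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping cost between two feature sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]

# Hypothetical database: motion name -> sequence of per-clip feature values.
database = {
    "walk_slow": [1.0, 2.0, 2.0, 3.0, 3.0],
    "walk_fast": [1.0, 3.0],
    "jump": [5.0, 9.0, 5.0],
}
query = [1.0, 2.0, 3.0]

# Rank every motion by its DTW dissimilarity to the query, smallest first.
ranking = sorted(database, key=lambda name: dtw_distance(query, database[name]))
```

Because DTW allows one sequence to be stretched against the other, the slow walk matches the query with zero cost even though the two sequences have different lengths.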

  • Assessment of Building Damage in 2008 Wenchuan Earthquake from Multi-Temporal SAR Images Using Getis Statistic

    Haipeng WANG  Tianlin WANG  Feng XU  Kazuo OUCHI  

     
    LETTER

      Vol:
    E94-B No:11
      Page(s):
    2983-2986

    In this paper, the Getis statistic is applied to ALOS-PALSAR (Advanced Land Observing Satellite-Phased Array L-band Synthetic Aperture Radar) images to assess the building damage caused by the Wenchuan earthquake in 2008. To evaluate the proposed image analysis, a building image simulated with the mapping and projection algorithm is first analyzed with the Getis statistic. The results show that the proposed approach assesses damage with high accuracy. The Getis statistic is then applied to two ALOS-PALSAR images acquired before and after the Wenchuan earthquake to assess the level of building damage. Results of the Getis statistic show that the damage level is approximately 81%.

  • A Novel Joint Rate Distortion Optimization Scheme for Intra Prediction Coding in H.264/AVC

    Qingbo WU  Jian XIONG  Bing LUO  Chao HUANG  Linfeng XU  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E97-D No:4
      Page(s):
    989-992

    In this paper, we propose a novel joint rate distortion optimization (JRDO) model for intra prediction coding. The spatial prediction dependency is exploited by modeling the distortion propagation with a linear fitting function. A novel JRDO-based Lagrange multiplier (LM) is derived from this model. To adapt to different blocks' distortion propagation characteristics, we also introduce a generalized multiple Lagrange multiplier (MLM) framework where some candidate LMs are used in the RDO process. Experimental results show that our proposed JRDO-MLM scheme is superior to the H.264/AVC encoder.