The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] binary code(10hit)

1-10hit
  • SIBYL: A Method for Detecting Similar Binary Functions Using Machine Learning

    Yuma MASUBUCHI  Masaki HASHIMOTO  Akira OTSUKA  

     
    PAPER-Dependable Computing

      Pubricized:
    2021/12/28
      Vol:
    E105-D No:4
      Page(s):
    755-765

    Binary code similarity comparison methods are mainly used to find bugs in software, to detect software plagiarism, and to reduce the workload during malware analysis. In this paper, we propose a method to compare the binary code similarity of each function by using a combination of Control Flow Graphs (CFGs) and disassembled instruction sequences contained in each function, and to detect a function with high similarity to a specified function. One of the challenges in performing similarity comparisons is that different compile-time optimizations and different architectures produce different binary code. The main units for comparing code are instructions, basic blocks and functions. The challenge of functions is that they have a graph structure in which basic blocks are combined, making it relatively difficult to derive similarity. However, analysis tools such as IDA, display the disassembled instruction sequence in function units. Detecting similarity on a function basis has the advantage of facilitating simplified understanding by analysts. To solve the aforementioned challenges, we use machine learning methods in the field of natural language processing. In this field, there is a Transformer model, as of 2017, that updates each record for various language processing tasks, and as of 2021, Transformer is the basis for BERT, which updates each record for language processing tasks. There is also a method called node2vec, which uses machine learning techniques to capture the features of each node from the graph structure. In this paper, we propose SIBYL, a combination of Transformer and node2vec. In SIBYL, a method called Triplet-Loss is used during learning so that similar items are brought closer and dissimilar items are moved away. To evaluate SIBYL, we created a new dataset using open-source software widely used in the real world, and conducted training and evaluation experiments using the dataset. In the evaluation experiments, we evaluated the similarity of binary codes across different architectures using evaluation indices such as Rank1 and MRR. The experimental results showed that SIBYL outperforms existing research. We believe that this is due to the fact that machine learning has been able to capture the features of the graph structure and the order of instructions on a function-by-function basis. The results of these experiments are presented in detail, followed by a discussion and conclusion.

  • A Class of Binary Cyclic Codes and Their Weight Distributions

    Chao HE  Rong LUO  Mei YANG  

     
    LETTER-Coding Theory

      Vol:
    E103-A No:3
      Page(s):
    634-637

    Let m, k be positive integers with m=2k and k≥3. Let C(u, ν) is a class of cyclic codes of length 2m-1 whose parity-check polynomial is mu(x)mν(x), where mu(x) and mν(x) are the minimal polynomials of α-u and α-ν over GF(2). For the case $(u, u)=(1, rac{1}{3}(2^m-1))$, the weight distributions of binary cyclic codes C(u, ν) was determined in 2017. This paper determines the weight distributions of the binary cyclic codes C(u, ν) for the case of (u, ν)=(3, 2k-1+1). The application of these cyclic codes in secret sharing is also considered.

  • An Improvement of Non-Binary Single b-Burst of Insertion/Deletion Correcting Code

    Toyohiko SAEKI  Takayuki NOZAKI  

     
    PAPER-Coding Theory

      Vol:
    E102-A No:12
      Page(s):
    1591-1599

    This paper constructs non-binary codes correcting a single b-burst of insertions or deletions with large cardinalities. This paper also provides insertion and deletion correcting algorithms of the constructed codes and evaluates a lower bound of the cardinalities of the constructed codes. Moreover, we evaluate a non-asymptotic upper bound on the cardinalities of arbitrary codes which correct a single b-burst of insertions or deletions.

  • Vector Quantization of High-Dimensional Speech Spectra Using Deep Neural Network

    JianFeng WU  HuiBin QIN  YongZhu HUA  LiHuan SHAO  Ji HU  ShengYing YANG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/07/02
      Vol:
    E102-D No:10
      Page(s):
    2047-2050

    This paper proposes a deep neural network (DNN) based framework to address the problem of vector quantization (VQ) for high-dimensional data. The main challenge of applying DNN to VQ is how to reduce the binary coding error of the auto-encoder when the distribution of the coding units is far from binary. To address this problem, three fine-tuning methods have been adopted: 1) adding Gaussian noise to the input of the coding layer, 2) forcing the output of the coding layer to be binary, 3) adding a non-binary penalty term to the loss function. These fine-tuning methods have been extensively evaluated on quantizing speech magnitude spectra. The results demonstrated that each of the methods is useful for improving the coding performance. When implemented for quantizing 968-dimensional speech spectra using only 18-bit, the DNN-based VQ framework achieved an averaged PESQ of about 2.09, which is far beyond the capability of conventional VQ methods.

  • Sample-Adaptive Product Quantizers with Affine Index Assignments for Noisy Channels

    Dong Sik KIM  Youngcheol PARK  

     
    PAPER-Fundamental Theories for Communications

      Vol:
    E92-B No:10
      Page(s):
    3084-3093

    When we design a robust vector quantizer (VQ) for noisy channels, an appropriate index assignment function should be contrived to minimize the channel-error effect. For relatively high rates, the complexity for finding an optimal index assignment function is too high to be implemented. To overcome such a problem, we use a structurally constrained VQ, which is called the sample-adaptive product quantizer (SAPQ) [12], for low complexities of quantization and index assignment. The product quantizer (PQ) and its variation SAPQ [13], which are based on the scalar quantizer (SQ) and thus belong to a class of the binary lattice VQ [16], have inherent error resilience even though the conventional affine index assignment functions, such as the natural binary code, are employed. The error resilience of SAPQ is observed in a weak sense through worst-case bounds. Using SAPQ for noisy channels is useful especially for high rates, e.g., > 1 bit/sample, and it is numerically shown that the channel-limit performance of SAPQ is comparable to that of the best codebook permutation of binary switching algorithm (BSA) [23]. Further, the PQ or SAPQ codebook with an affine index assignment function is used for the initial guess of the conventional clustering algorithm, and it is shown that the performance of the best BSA can be easily achieved.

  • M-Ary Substitution/Deletion/Insertion/Adjacent-Symbol-Transposition Error Correcting Codes for Data Entry Systems

    Haruhiko KANEKO  Eiji FUJIWARA  

     
    PAPER-Coding Theory

      Vol:
    E92-A No:7
      Page(s):
    1668-1676

    Nonbinary M-ary data processed by data entry systems, such as keyboard devices and character recognition systems, often have various types of error, such as symbol-substitution errors, deletion errors, insertion errors, and adjacent-symbol-transposition errors. This paper proposes nonsystematic M-ary codes capable of correcting these errors. The code is defined as a set of codewords that satisfy three conditions required to correct substitution, deletion/insertion, and adjacent-symbol-transposition errors. Since symbol-substitution errors in data entry systems are usually asymmetric, this paper also presents asymmetric-symbol-substitution error correcting codes capable of correcting deletion, insertion, and adjacent-symbol-transposition errors. For asymmetric-symbol-substitution error correction, we employ a mapping derived from the vertex coloring in an error directionality graph. The evaluation shows that the asymmetric codes have three to five times larger number of codewords than the symmetric codes.

  • Selection of Test Patterns in an Iterative Erasure and Error Decoding Algorithm for Non-binary Block Codes

    Hitoshi TOKUSHIGE  Ippei HISADOMI  Tadao KASAMI  

     
    LETTER-Coding Theory

      Vol:
    E89-A No:11
      Page(s):
    3355-3359

    This letter considers an iterative decoding algorithm for non-binary linear block codes in which erasure and error decoding is performed for input words given by the sums of a hard-decision received sequence and given test patterns. We have proposed a new selection method of test patterns for the iterative decoding algorithm. Simulation results have shown that the decoding algorithm with test patterns by the proposed selection method provides better error performance than a conventional iterative decoding algorithm with the same number of the error and erasure decoding iterations over an additive white Gaussian noise channel using binary phase-shift keying modulation.

  • New Binary Constant Weight Codes Based on Cayley Graphs of Groups and Their Decoding Methods

    Jun IMAI  Yoshinao SHIRAKI  

     
    PAPER-Coding Theory

      Vol:
    E88-A No:10
      Page(s):
    2734-2744

    We propose a new class of binary nonlinear codes of constant weights derived from a permutation representation of a group that is given by a combinatorial definition such as Cayley graphs of a group. These codes are constructed by the following direct interpretation method from a group: (1) take one discrete group whose elements are defined by generators and their relations, such as those in the form of Cayley graphs; and (2) embedding the group into a binary space using some of their permutation representations by providing the generators with realization of permutations of some terms. The proposed codes are endowed with some good characteristics as follows: (a) we can easily learn information about the distances of the obtained codes, and moreover, (b) we can establish a decoding method for them that can correct random errors whose distances from code words are less than half of the minimum distances achieved using only parity checking procedures.

  • Phase Offset of Binary Code and Its Application to the CDMA Mobile Communications

    Young Yearl HAN  Young Joon SONG  

     
    PAPER-Universal Personal Communications

      Vol:
    E81-A No:6
      Page(s):
    1145-1151

    It is important to know phase offsets of a binary code in the field of mobile communications because different phase offsets of the same code are used to distinguish signals received at a mobile station from those of different base stations. When the period of the code is not very long, the relative phase offset between the code and its shifted code can be found by counting the number of bits delayed from the code of the same bit streams. But as the period of the code increases, it becomes difficult to find the phase offset. This paper proposes a new method to calculate the phase offset of a binary code. We define an accumulator function, which is used to calculate the phase offsets between the code and its shifted code. Also the properties of the accumulator function are investigated. This number theoretical approach and its results show that this method is very easy for the phase offset calculation. Its application to the code division multiple access (CDMA) system to define a reference code is given. The simple circuit realization of the accumulator function to calculate the phase offset between the received code and receiver stored replica code is described.

  • Some Optimal and Quasi-Optimal Binary Codes from Cyclic Codes over GF(2m)

    Katsumi SAKAKIBARA  Masao KASAHARA  Yoshiharu YUBA  

     
    LETTER-Information Theory and Coding Theory

      Vol:
    E79-A No:10
      Page(s):
    1737-1738

    It is shown that five optimal and one quasioptimal binary codes with respect to the Griesmer bound can be obtained from cyclic codes over GF(2fm). An [m(2em - 1), em, 2em-1m] code, a [3(22e - 1), 2e, 322e-1] code, a [2(22e - 1), 2, (22e+2 - 4)/3] code, a [3(22e - 1), 2, 22e+1 - 2] code, and a [3(22e - 1), 2(e+1), 322e-1 - 2] code are optimal and a [2(22e - 1), 2(e + 1), 22e - 2] code is quasi-optimal.