IEICE global.ieice.org Site

Keyword Search Result

[Keyword] protein(18hit)

1-18hit

Bicolored Path Embedding Problems Inspired by Protein Folding Models
Tianfeng FENG Ryuhei UEHARA Giovanni VIGLIETTA

PAPER-Fundamentals of Information Systems

Publicized:
2021/12/07
Vol:
E105-D No:3
Page(s):
623-633
In this paper, we introduce a path embedding problem inspired by the well-known hydrophobic-polar (HP) model of protein folding. A graph is said to be bicolored if each vertex is assigned a label in the set {red, blue}. For a given bicolored path P and a given bicolored graph G, our problem asks whether we can embed P into G in such a way as to match the colors of the vertices. In our model, G represents a protein's “blueprint,” and P is an amino acid sequence that has to be folded to form (part of) G. We first show that the bicolored path embedding problem is NP-complete even if G is a rectangular grid (a typical scenario in protein folding models) and P and G have the same number of vertices. By contrast, we prove that the problem becomes tractable if the height of the rectangular grid G is constant, even if the length of P is independent of G. Our proof is constructive: we give a polynomial-time algorithm that computes an embedding (or reports that no embedding exists), which implies that the problem is in XP when parameterized according to the height of G. Additionally, we show that the problem of embedding P into a rectangular grid G in such a way as to maximize the number of red-red contacts is NP-hard. (This problem is directly inspired by the HP model of protein folding; it was previously known to be NP-hard if G is not given, and P can be embedded in any way on a grid.) Finally, we show that, given a bicolored graph G, the problem of constructing a path P that embeds in G maximizing red-red contacts is Poly-APX-hard.
An Active Transfer Learning Framework for Protein-Protein Interaction Extraction
Lishuang LI Xinyu HE Jieqiong ZHENG Degen HUANG Fuji REN

PAPER-Natural Language Processing

Publicized:
2017/10/30
Vol:
E101-D No:2
Page(s):
504-511
Protein-Protein Interaction Extraction (PPIE) from biomedical literature is an important task in biomedical text mining and has achieved great success on public datasets. However, in real-world applications, existing PPI extraction methods are limited by the labeling effort they require. Therefore, a transfer learning method is applied to reduce the cost of manual labeling. Current transfer learning methods suffer from negative transfer and lower performance. To tackle this problem, an improved TrAdaBoost algorithm is proposed, in which relative distribution is introduced to initialize the weights of TrAdaBoost to overcome the negative transfer caused by domain differences. To further improve the performance of transfer learning, an approach combining active learning with the improved TrAdaBoost is presented. The experimental results on publicly available PPI corpora show that our method outperforms TrAdaBoost and SVM when the labeled data is insufficient, and results on document classification corpora show that the proposed approaches ultimately achieve better performance than TrAdaBoost and TPTSVM, which verifies the effectiveness of our methods.
A Multi-Channel Electrochemical Measurement System for Biomolecular Detection
Wei-Chiun LIU Bin-Da LIU Chia-Ling WEI

PAPER-Electronic Circuits

Vol:
E99-C No:11
Page(s):
1295-1303
A modularized, low-cost, and non-invasive electrochemical examination platform is proposed in this work. Melatonin has been found to be a possible significant indicator molecule in the detection of breast cancer. 3-hydroxyanthranilic acid and nuclear matrix protein 22 can be used as significant indices of potential bladder cancer risk. The proposed system was verified by measuring melatonin, 3-hydroxyanthranilic acid, and nuclear matrix protein 22. Cyclic voltammetry and molecularly imprinted polymers were used in the experiments. Screen-printed electrodes were coated with a film imprinted with target molecules. The measurement results of the proposed system were compared with those of a commercial potentiostat. The two sets of results were very similar. Moreover, the proposed system can be expanded to a four-channel system, which can perform four measurements simultaneously. The proposed system also provides a convenient graphical user interface for real-time monitoring and records the information of the redox reactions.
Novel Reconfigurable Hardware Accelerator for Protein Sequence Alignment Using Smith-Waterman Algorithm
Atef IBRAHIM Hamed ELSIMARY Abdullah ALJUMAH

PAPER-Digital Signal Processing

Vol:
E99-A No:3
Page(s):
683-690
This paper presents a novel reconfigurable semi-systolic array architecture for the Smith-Waterman algorithm with an affine gap penalty to align protein sequences, optimized for shorter database sequences. This architecture has been modified to enable hardware reuse rather than replicating processing elements of the semi-systolic array in multiple FPGAs. The proposed hardware architecture and the previously published conventional one are described at the Register Transfer Level (RTL) using VHDL and implemented using FPGA technology. The results show that the proposed design has a significantly higher normalized speedup (up to 125%) over the conventional one for query sequence lengths less than 512 residues. According to the UniProtKB/TrEMBL protein database (release 2015_05) statistics, the largest share of sequences (about 80%) have a sequence length less than 512 residues, so the proposed design outperforms the conventional one in terms of speed and area in this range of sequence lengths.
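For readers unfamiliar with the recurrence that such an accelerator evaluates, the following is a minimal software sketch of Smith-Waterman local alignment with an affine gap penalty (the three-matrix formulation usually attributed to Gotoh). It is not the paper's hardware architecture. The function name and the scoring parameters (`match`, `mismatch`, `gap_open`, `gap_extend`) are assumptions chosen for illustration; a real protein aligner would normally score residue pairs with a substitution matrix rather than a flat match/mismatch value.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of strings a and b with affine gaps.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    All scoring values are illustrative assumptions.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H[i][j]: best score of an alignment ending at a[i-1], b[j-1] (or 0 to restart)
    # E[i][j]: best score ending with a gap that consumes b[j-1]
    # F[i][j]: best score ending with a gap that consumes a[i-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap from H or extend the existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed in parallel; this is the property that systolic-array designs generally exploit.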
Protein Fold Classification Using Large Margin Combination of Distance Metrics
Chendra Hadi SURYANTO Kazuhiro FUKUI Hideitsu HINO

PAPER-Pattern Recognition

Publicized:
2015/12/14
Vol:
E99-D No:3
Page(s):
714-723
Many methods have been proposed for measuring the structural similarity between two protein folds. However, it is difficult to select the single best method among them for the classification task, as each method has its own strengths and weaknesses. Intuitively, combining multiple methods is one way to obtain optimal classification results. In this paper, by generalizing the concept of the large margin nearest neighbor (LMNN), a method for combining multiple distance metrics from different types of protein structure comparison methods for the protein fold classification task is proposed. While LMNN is limited to Mahalanobis-based distance metric learning from a set of feature vectors of training data, the proposed method learns an optimal combination of metrics from a set of distance metrics by minimizing the distances between intra-class data and enlarging the distances between data of different classes. The main advantage of the proposed method is its capability to find an optimal weight coefficient for the combination of many metrics, possibly including poor metrics, which avoids the difficulty of selecting which metrics to include in the combination. The effectiveness of the proposed method is demonstrated in classification experiments using two public protein datasets, namely, the Ding Dubchak dataset and the ENZYMES dataset.
Measuring the Similarity of Protein Structures Using Image Compression Algorithms
Morihiro HAYASHIDA Tatsuya AKUTSU

PAPER-Artificial Intelligence, Data Mining

Vol:
E94-D No:12
Page(s):
2468-2478
For measuring the similarity of biological sequences and structures such as DNA sequences, protein sequences, and tertiary structures, several compression-based methods have been developed. However, they are based on compression algorithms only for sequential data. For instance, protein structures can be represented by two-dimensional distance matrices. Therefore, it is expected that image compression is useful for measuring the similarity of protein structures because image compression algorithms compress data horizontally and vertically. This paper proposes a series of methods for measuring the similarity of protein structures. In the methods, an original protein structure is transformed into a distance matrix, which is regarded as a two-dimensional image. Then, the similarity of two protein structures is measured by a kind of compression ratio of the concatenated image. We employed several image compression algorithms: JPEG, GIF, PNG, IFS, and SPC. Since SPC often gave better results than the other image compression methods, and it is simple and easy to modify, we modified SPC and obtained MSPC. We applied the proposed methods to clustering of protein structures, and performed Receiver Operating Characteristic (ROC) analysis. The results of computational experiments suggest that MSPC has the best performance among existing compression-based methods. We also present some theoretical results on the time complexity and Kolmogorov complexity of image compression-based protein structure comparison.
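The first step described above, turning a structure into a distance matrix that can be treated as a two-dimensional image, can be sketched as follows. This is only an illustration of that step, not the authors' pipeline: the function name, the use of one 3D point per residue, the 8-bit gray-level quantization, and the clipping distance `max_dist` are all assumptions, and the compression stage itself is not shown.

```python
import math

def distance_image(coords, max_dist=20.0):
    """Pairwise distance matrix of 3D points, scaled to 0..255 gray levels.

    coords: list of (x, y, z) tuples, assumed to be one point per residue.
    max_dist: hypothetical clipping distance; larger distances saturate at 255.
    """
    image = []
    for p in coords:
        row = []
        for q in coords:
            d = min(math.dist(p, q), max_dist)   # clip long distances
            row.append(round(255 * d / max_dist))
        image.append(row)
    return image
```

The resulting matrix is symmetric with a zero diagonal, so each row and column carries the same information, which is one reason a compressor that exploits both horizontal and vertical redundancy is a natural fit.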
In Situ Observation of Time Dependent Electrochemical Activity of Cytochrome c at Bare Indium-Tin-Oxide Electrodes by Cyclic Voltammetry and Slab Optical Waveguide Spectroscopy
Yusuke AYATO Akiko TAKATSU Kenji KATO Naoki MATSUDA

PAPER-Bioelectronics

Vol:
E91-C No:12
Page(s):
1899-1904
In situ observation of the electrochemical activity and time dependent characteristics of cytochrome c (cyt c) was carried out in 0.01 M phosphate buffered saline (PBS, pH 7.4) containing 20 µM cyt c solutions at bare indium-tin-oxide (ITO) electrodes by using cyclic voltammetry (CV) and slab optical waveguide (SOWG) spectroscopy. The bare ITO electrodes could retain the electrochemical activity of cyt c in the PBS solutions, indicating the great advantage of using ITO electrodes over other electrode materials, such as gold (Au). The CV curves and the simultaneously observed time-resolved SOWG absorption spectra in consecutive cycles implied that the cyt c molecules could retain their electrochemical function for a long time.
Extracting Protein-Protein Interaction Information from Biomedical Text with SVM
Tomohiro MITSUMORI Masaki MURATA Yasushi FUKUDA Kouichi DOI Hirohumi DOI

LETTER-Natural Language Processing

Vol:
E89-D No:8
Page(s):
2464-2466
Automated information extraction systems for biomedical text have been reported. Some systems are based on manually developed rules or pattern matching. Manually developed rules are specific for analysis; however, new rules must be developed for each new domain. Although the corpus must be developed by human effort, a machine-learning approach automatically learns the rules from the corpus. In this article, we present a system for automatically extracting protein-protein interaction information from biomedical text with support vector machines (SVMs). We describe the performance of our system and compare its ability to extract protein-protein interaction information with that of other systems.
Dynamic Programming and Clique Based Approaches for Protein Threading with Profiles and Constraints
Tatsuya AKUTSU Morihiro HAYASHIDA Dukka Bahadur K.C. Etsuji TOMITA Jun'ichi SUZUKI Katsuhisa HORIMOTO

PAPER

Vol:
E89-A No:5
Page(s):
1215-1222
The protein threading problem with profiles is known to be efficiently solvable using dynamic programming. In this paper, we consider a variant of the protein threading problem with profiles in which constraints on distances between residues are given. We prove that protein threading with profiles and constraints is NP-hard. Moreover, we show a strong hardness result on the approximation of an optimal threading satisfying all the constraints. On the other hand, we develop two practical algorithms: CLIQUETHREAD and BBDPTHREAD. CLIQUETHREAD reduces the threading problem to the maximum edge-weight clique problem, whereas BBDPTHREAD combines dynamic programming and branch-and-bound techniques. We perform computational experiments using protein structure data in PDB (Protein Data Bank) using simulated distance constraints. The results show that constraints are useful to improve the alignment accuracy of the target sequence and the template structure. Moreover, these results also show that BBDPTHREAD is in general faster than CLIQUETHREAD for larger size proteins whereas CLIQUETHREAD is useful if there does not exist a feasible threading.
Mapping of Hierarchical Parallel Genetic Algorithms for Protein Folding onto Computational Grids
Weiguo LIU Bertil SCHMIDT

PAPER-Grid Computing

Vol:
E89-D No:2
Page(s):
589-596
Genetic algorithms are a general problem-solving technique that has been widely used in computational biology. In this paper, we present a framework to map hierarchical parallel genetic algorithms for protein folding problems onto computational grids. By using this framework, the two level communication parts of hierarchical parallel genetic algorithms are separated. Thus both parts of the algorithm can evolve independently. This permits users to experiment with alternative communication models on different levels conveniently. The underlying programming techniques are based on generic programming, a programming technique suited for the generic representation of abstract concepts. This allows the framework to be built in a generic way at application level and thus provides good extensibility and flexibility. Experiments show that it can lead to significant runtime savings on PC clusters and computational grids.
Multi-Modal Neural Networks for Symbolic Sequence Pattern Classification
Hanxi ZHU Ikuo YOSHIHARA Kunihito YAMAMORI Moritoshi YASUNAGA

PAPER-Biocybernetics, Neurocomputing

Vol:
E87-D No:7
Page(s):
1943-1952
We have developed Multi-modal Neural Networks (MNN) to improve the accuracy of symbolic sequence pattern classification. The basic structure of the MNN is composed of several sub-classifiers using neural networks and a decision unit. Two types of the MNN are proposed: a primary MNN and a twofold MNN. In the primary MNN, the sub-classifier is composed of a conventional three-layer neural network. The decision unit uses the majority decision to produce the final decisions from the outputs of the sub-classifiers. In the twofold MNN, the sub-classifier is composed of the primary MNN for partial classification. The decision unit uses a three-layer neural network to produce the final decisions. In the latter type of the MNN, since the structure of the primary MNN is folded into the sub-classifier, the basic structure of the MNN is used twice, which is the reason why we call the method twofold MNN. The MNN is validated with two benchmark tests: EPR (English Pronunciation Reasoning) and prediction of protein secondary structure. The reasoning accuracy of EPR is improved from 85.4% by using a three-layer neural network to 87.7% by using the primary MNN. In the prediction of protein secondary structure, the average accuracy is improved from 69.1% of a three-layer neural network to 74.6% by the primary MNN and 75.6% by the twofold MNN. The prediction test is based on a database of 126 non-homologous protein sequences.
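The decision unit of the primary MNN, which takes the majority decision over the sub-classifier outputs, can be sketched as below. This is an illustrative sketch only: the function names, the string encoding of labels, and the tie-breaking rule (the label seen first wins) are assumptions, since the abstract does not say how ties are resolved.

```python
from collections import Counter

def majority_decision(votes):
    """Return the label chosen by the most sub-classifiers.

    votes: non-empty sequence of labels, one per sub-classifier.
    Ties go to the label encountered first (an assumption of this sketch).
    """
    return Counter(votes).most_common(1)[0][0]

def vote_sequence(predictions):
    """Apply the majority decision independently at every sequence position.

    predictions: equal-length label strings, one per sub-classifier,
    e.g. secondary-structure strings over the hypothetical alphabet H/E/C.
    """
    return "".join(majority_decision(column) for column in zip(*predictions))
```

For example, `vote_sequence(["HHEC", "HEEC", "CHEC"])` combines three per-residue predictions into one string by voting column by column.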
Effect of Surface Hydrophilicity and Solution Chemistry on the Adsorption Behavior of Cytochrome c in Quartz Studied Using Slab Optical Waveguide (SOWG) Spectroscopy
Jose H. SANTOS Naoki MATSUDA Zhi-mei QI Akiko TAKATSU Kenji KATO

PAPER-Optoelectronics and Photonics

Vol:
E85-C No:6
Page(s):
1275-1281
The adsorption behavior of cytochrome c was investigated using slab optical waveguide (SOWG) absorption spectroscopy in the near ultraviolet region, utilizing thin quartz plates as planar waveguides. SOWG absorption spectra of cytochrome c measured at constant time intervals showed a significant influence of surface hydrophilicity and solution chemistry on the adsorption of this important heme protein on the quartz surface. Being polar and typically amphoteric, the protein preferred adsorption on a hydrophilic surface over a hydrophobic one, as implied by the lower absorbance data obtained on the latter than on the former. At lower ionic strength and in the absence of buffer, the protein molecules tend to adsorb on the quartz surface. Plots of near steady-state absorbance versus protein concentration follow a hyperbolic pattern in the absence of buffer or at low ionic strength and become more linear as the buffer concentration is increased. The results presented here are explained in terms of the general qualitative understanding of protein adsorption at solid-aqueous interfaces and further aid in elucidating the properties of protein monolayers and films.
Sulfate Binding Protein Modified Electrode as a Chemical Sensor
Izumi KUBO Hidenori NAGAI

PAPER-Sensor

Vol:
E83-C No:7
Page(s):
1035-1039
A novel chemical sensor for sulfate detection is proposed in this study, utilizing sulfate binding protein (SBP) derived from Escherichia coli as the sulfate recognition element. Purified SBP was immobilized on a gold electrode modified with cysteamine and glutaraldehyde. The surface potential change of the SBP modified electrode in response to sulfate and various ions was investigated. In order to evaluate nonspecific interaction with ionic species, proteins with various isoelectric points were immobilized on the surface of a gold electrode, and their responses to ions were measured and compared to those of the sulfate binding protein modified electrode. We found that the protein modified electrode shows a potential change in response to ions, that this potential change is affected by the isoelectric point of the protein molecule, and that BSA, whose isoelectric point is closest to that of SBP, showed a similar response to ions except sulfate. Using a BSA modified electrode as a reference electrode, this sensing system showed a selective response to sulfate, probably because of the selective binding of sulfate by SBP. The difference in potential change between the SBP modified electrode and the BSA modified electrode depended on the concentration of sulfate within the range of 5 - 150 mM.
Detection of Conserved Domains in Protein Sequences Using a Maximum-Density Subgraph Algorithm
Hideo MATSUDA

PAPER

Vol:
E83-A No:4
Page(s):
713-721
In this paper, we propose a method for detecting conserved domains from a set of amino acid sequences that belong to a protein family. This method detects the domains as follows: first, generate fixed-length subsequences from the sequences; second, construct a weighted graph that connects any two of the subsequences (vertices) having higher similarity than a pre-defined threshold; third, search for the maximum-density subgraph for each connected component of the graph; finally, explore conserved domains in the sequences by combining the results of the previous step. From the performance results obtained by applying the method to several protein families that have complex conserved domains, we found that our method was able to detect those domains even though some domains were weakly conserved.
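The first three steps listed above can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's implementation: the similarity measure (`identity`, a count of identical positions) is a stand-in, all function names are hypothetical, and the subgraph search uses greedy peeling, a well-known approximation for densest subgraph, in place of an exact maximum-density algorithm. Density is taken here as total edge weight divided by the number of vertices.

```python
from itertools import combinations

def windows(seqs, k):
    """Step 1: all fixed-length subsequences as (sequence index, offset, text)."""
    return [(si, off, s[off:off + k])
            for si, s in enumerate(seqs)
            for off in range(len(s) - k + 1)]

def identity(u, v):
    """Stand-in similarity: number of identical positions (an assumption)."""
    return sum(x == y for x, y in zip(u, v))

def build_graph(nodes, threshold):
    """Step 2: weighted edges between windows more similar than the threshold."""
    edges = {}
    for i, j in combinations(range(len(nodes)), 2):
        w = identity(nodes[i][2], nodes[j][2])
        if w > threshold:
            edges[(i, j)] = w
    return edges

def components(edges):
    """Connected components of the graph, ignoring isolated vertices."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, comps = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def densest_greedy(comp, edges):
    """Step 3 (approximate): peel the lowest-degree vertex, keep the densest stage."""
    alive = set(comp)
    sub = {e: w for e, w in edges.items() if e[0] in alive and e[1] in alive}
    deg = {v: 0 for v in alive}
    for (u, v), w in sub.items():
        deg[u] += w
        deg[v] += w
    total = sum(sub.values())
    best, best_set = total / len(alive), set(alive)
    while len(alive) > 1:
        v = min(alive, key=lambda x: (deg[x], x))
        alive.remove(v)
        for (a, b), w in sub.items():
            other = b if a == v else a if b == v else None
            if other is not None and other in alive:
                deg[other] -= w
                total -= w
        if total / len(alive) > best:
            best, best_set = total / len(alive), set(alive)
    return best_set, best
```

A usage sketch: build `nodes = windows(seqs, k)` and `edges = build_graph(nodes, threshold)`, then call `densest_greedy(c, edges)` for each `c` in `components(edges)` and map the returned vertex indices back to `(sequence index, offset)` pairs via `nodes`. Combining those per-component results into full domains, the final step in the text, is not shown.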
Protein Structure Alignment Using Dynamic Programing and Iterative Improvement
Tatsuya AKUTSU

PAPER-Algorithm and Computational Complexity

Vol:
E79-D No:12
Page(s):
1629-1636
In this paper, we consider the protein structure alignment problem, which is a very important problem in molecular biology. Since an outline of protein structure is represented by a sequence of points in three-dimensional space, this problem is defined as the following geometric pattern matching problem: given two point sequences P and Q in three dimensions and a real number δ > 0, find a maximum-cardinality set of point pairs such that the distance between each pair is at most δ under the condition that any translation and rotation can be applied to P. Since it is very difficult to solve this problem exactly, we consider algorithms that solve it approximately. We propose three algorithms: BASICALIGN, RANDALIGN and FRAGALIGN, whose worst case time complexities are O(n^8), O((n^7/k^3) polylog(n)) and O(n^4) respectively, where n denotes the size of the larger input structure and k denotes the minimum size of the alignment to be obtained. All of these have the following common framework: a series of initial superpositions are computed; for each such superposition, a rough alignment is first computed using a dynamic programming technique, and then it is refined through an iterative improvement procedure which also uses dynamic programming; the best alignment among them is selected as an output. The difference among the three algorithms lies in the methods of finding initial superpositions. BASICALIGN, RANDALIGN and FRAGALIGN use exhaustive search, random sampling technique and fragment-based search, respectively. We prove guaranteed approximation ratios (in the sense of distances between point pairs) for theoretical versions of BASICALIGN and RANDALIGN. Practical versions of RANDALIGN and FRAGALIGN were implemented and compared with a previous algorithm using real protein structure data. The experimental results show that FRAGALIGN is best among them and it outputs good alignments quickly.
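The dynamic programming step in the common framework can be illustrated with a small sketch. It assumes that P has already been superposed onto Q (the translation and rotation are taken as given) and that the alignment must preserve the order of both point sequences; the abstract does not spell out either detail, so both are assumptions, as is the function name. The recurrence is the longest-common-subsequence pattern with "distance at most δ" playing the role of a match.

```python
import math

def align_under_superposition(P, Q, delta):
    """Largest order-preserving set of index pairs (i, j) with |P[i] - Q[j]| <= delta.

    P, Q: lists of (x, y, z) tuples, with P already transformed onto Q.
    Returns the matched pairs in increasing order.
    """
    n, m = len(P), len(Q)
    # dp[i][j]: best number of pairs using the first i points of P and j of Q
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = math.dist(P[i - 1], Q[j - 1]) <= delta
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1],
                           dp[i - 1][j - 1] + (1 if close else 0))
    # Trace back from the corner to recover one optimal set of pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if math.dist(P[i - 1], Q[j - 1]) <= delta and dp[i][j] == dp[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

In an iterative improvement loop of the kind the text describes, the pairs returned here would typically be used to recompute a better superposition, after which the same routine is run again.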
Data Classification Component in a Deductive Database System and Its Application to Protein Structural Analysis
Akio NISHIKAWA Kenji SATOU Emiko FURUICHI Satoru KUHARA Kazuo USHIJIMA

PAPER-Advanced Applications

Vol:
E78-D No:11
Page(s):
1377-1387
Scientific database systems for the analysis of genes and proteins are becoming very important these days. We have developed a deductive database system PACADE for analyzing the three dimensional and secondary structures of proteins. In this paper, we describe the statistical data classification component of PACADE. We implemented the component for cluster analysis and discrimination analysis. In addition, we enhanced the aggregation function in order to calculate the characteristic values which are useful for data classification. By using the cluster analysis function, the proteins are thereby classified into different types of structural characteristics. The results of these structural analysis experiments are also described in this paper.
Penetration Characteristics of Submillimeter Waves in Tissues and Aqueous Solution of Protein
Tadashi FUSE Masao TAKI Osamu YOKORO

PAPER

Vol:
E77-B No:6
Page(s):
743-748
This paper presents an experimental study on the penetration characteristics of submillimeter waves in biological tissues and material. The measured values of the penetration depth in excised natural muscle, fat, and an aqueous solution of the protein bovine serum albumin (BSA), over wavelengths of 281 through 496 µm, are presented. Penetration depths at these wavelengths are 0.11-0.17 mm in natural pork muscle and 0.69-0.98 mm in natural pork fat, and are larger at longer wavelengths. The values vary considerably from sample to sample. Since the measurement of the penetration depth in this study is shown to be sufficiently reproducible, the variation of the measured penetration depth is attributed to the variation of natural tissues, such as that in water content. It is found that the penetration depth of submillimeter waves in an aqueous solution of BSA depends almost linearly on the amount of protein content in the solution, and that the typical values of the penetration depth in natural muscle roughly agree with those in the 35% aqueous solution of BSA in the submillimeter-wave region.
An Application of the Optimal Control Strategy for Artificial Production of Protein on Messenger RNA
Hirohumi HIRAYAMA Norio TAKEUCHI Yuzou FUKUYAMA

LETTER

Vol:
E76-A No:12
Page(s):
2076-2081
The regulatory mechanism of protein synthesis on a messenger RNA was analyzed from the viewpoint of optimal control, and its applicability to the artificial production of peptides and proteins is discussed. The description of the transient movement of a ribosome along a messenger RNA and its production of peptide was based on the theory proposed by Gordon (1968). The optimal state of the total process was defined as the state at which the time dependent change of each process of peptide synthesis is minimized during a given time interval. This biological problem was converted into a mathematical one by setting state variables and utilizing optimal control theory with the help of the Hamiltonian function. The first process of transition of a ribosome on a messenger RNA showed the largest change; as the state progressed, the magnitude of change of each process decreased and its pattern became simpler. The effect of the weighting coefficient associated with an individual process was not confined to that process but extended to all other processes. Each process was affected by all other processes. These were manifestations of effective and rational control strategies, particularly for regulation of the sequential reaction in peptide synthesis. These results originated in the operation of the optimal control. By simulating physiological experimental data, it is possible to predict at what process and to what degree the synthesis is regulated in order to achieve the optimal synthesis state. By analyzing the optimal synthesis process in combination with physiological experimental data, it would be possible to create artificial peptides and proteins.

