Masahiko HARUNO Satoru IKEHARA
This paper describes a new method for learning bilingual collocations from sentence-aligned parallel corpora. Our method comprises two steps: (1) extracting useful word chunks (n-grams) in each language by word-level sorting and (2) constructing bilingual collocations by combining the word chunks acquired in step (1). We apply the method to two kinds of Japanese-English texts: (1) scientific articles that comprise relatively literal translations and (2) more challenging texts, namely a stock market bulletin in Japanese and its abstract in English. In both cases, domain-specific collocations are well captured even when they are not contained in dictionaries of specialized terms.
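The two steps can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names (`frequent_chunks`, `contains`, `pair_chunks`), the parameters `max_n`, `min_count`, and `threshold`, the romanized toy sentences, and the use of the Dice coefficient to score chunk pairs are all assumptions made for illustration. Step (1) collects every n-gram, sorts the list at the word level so that identical n-grams become adjacent, and keeps those that recur. Step (2) pairs chunks from the two languages according to how often they occur in the same aligned sentence pair.

```python
from itertools import groupby


def frequent_chunks(sentences, max_n=3, min_count=2):
    """Step (1): return {n-gram: count} for n-grams seen at least min_count times."""
    grams = []
    for words in sentences:
        for i in range(len(words)):
            for n in range(1, max_n + 1):
                if i + n <= len(words):
                    grams.append(tuple(words[i:i + n]))
    grams.sort()  # word-level sort: identical n-grams end up adjacent
    counts = {}
    for gram, group in groupby(grams):
        c = sum(1 for _ in group)
        if c >= min_count:
            counts[gram] = c
    return counts


def contains(words, gram):
    """True if the token list `words` contains `gram` as a contiguous run."""
    n = len(gram)
    return any(tuple(words[i:i + n]) == gram for i in range(len(words) - n + 1))


def pair_chunks(src_sents, tgt_sents, threshold=0.8, max_n=3, min_count=2):
    """Step (2): pair source and target chunks by Dice score (an assumed measure)."""
    src_chunks = frequent_chunks(src_sents, max_n, min_count)
    tgt_chunks = frequent_chunks(tgt_sents, max_n, min_count)
    # For each chunk, the set of aligned sentence indices in which it occurs.
    src_occ = {g: {i for i, s in enumerate(src_sents) if contains(s, g)}
               for g in src_chunks}
    tgt_occ = {g: {i for i, s in enumerate(tgt_sents) if contains(s, g)}
               for g in tgt_chunks}
    pairs = []
    for sg, s_ids in src_occ.items():
        for tg, t_ids in tgt_occ.items():
            dice = 2 * len(s_ids & t_ids) / (len(s_ids) + len(t_ids))
            if dice >= threshold:
                pairs.append((sg, tg, dice))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical, pre-tokenized toy corpus (romanized Japanese / English).
ja = [["kabushiki", "shijou", "wa", "joushou"],
      ["kabushiki", "shijou", "wa", "geraku"],
      ["kinri", "wa", "joushou"]]
en = [["stock", "market", "rose"],
      ["stock", "market", "fell"],
      ["interest", "rates", "rose"]]

for src, tgt, score in pair_chunks(ja, en, threshold=1.0, max_n=2):
    print(" ".join(src), "<->", " ".join(tgt), round(score, 2))
```

On a corpus this small many spurious pairs also reach the threshold. A realistic setting would need far more data, proper Japanese word segmentation, and some way of preferring longer chunks over their sub-parts.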