This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.
Kento WATANABE
National Institute of Advanced Industrial Science and Technology (AIST)
Masataka GOTO
National Institute of Advanced Industrial Science and Technology (AIST)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Kento WATANABE, Masataka GOTO, "A Method to Detect Chorus Sections in Lyrics Text" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 9, pp. 1600-1609, September 2023, doi: 10.1587/transinf.2022EDP7139.
Abstract: This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7139/_p
Copy
@ARTICLE{e106-d_9_1600,
author={Kento WATANABE, Masataka GOTO, },
journal={IEICE TRANSACTIONS on Information},
title={A Method to Detect Chorus Sections in Lyrics Text},
year={2023},
volume={E106-D},
number={9},
pages={1600-1609},
abstract={This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.},
keywords={},
doi={10.1587/transinf.2022EDP7139},
ISSN={1745-1361},
month={September},}
Copy
TY - JOUR
TI - A Method to Detect Chorus Sections in Lyrics Text
T2 - IEICE TRANSACTIONS on Information
SP - 1600
EP - 1609
AU - Kento WATANABE
AU - Masataka GOTO
PY - 2023
DO - 10.1587/transinf.2022EDP7139
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2023
AB - This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.
ER -