China Journal of Oral and Maxillofacial Surgery ›› 2022, Vol. 20 ›› Issue (2): 151-157. doi: 10.19438/j.cjoms.2022.02.009

• Original Articles •

Establishment and preliminary evaluation of a Mandarin Chinese speech database of 369 patients with oral cancer

肖育栋, 郭凯欣, 杨乐, 邓威, 曾滨, 张思恩, 劳小媚, 廖贵清, 梁玉洁   

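  1. Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Hospital of Stomatology, Sun Yat-sen University; Guangdong Provincial Key Laboratory of Stomatology, Guangzhou 510055, Guangdong, China
  • Received: 2021-07-30 Revised: 2021-11-26 Publication date: 2022-03-20 Release date: 2022-03-20
  • Corresponding author: LIANG Yu-jie, E-mail: liangyj35@mail.sysu.edu.cn
  • About the author: XIAO Yu-dong (born 1993), male, doctoral candidate, E-mail: xiaoyud@mail2.sysu.edu.cn
  • Funding:
    Guangdong Provincial Finance Special Fund for High-Level Hospital Construction (174-2018-XMZC-0001-03-0125/A-03)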

Preliminary establishment of a Mandarin speech database of 369 Chinese oral cancer patients

XIAO Yu-dong, GUO Kai-xin, YANG Le, DENG Wei, ZENG Bin, ZHANG Si-en, LAO Xiao-mei, LIAO Gui-qing, LIANG Yu-jie   

  1. Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Hospital of Stomatology, Sun Yat-sen University; Guangdong Provincial Key Laboratory of Stomatology, Guangzhou 510055, Guangdong Province, China
  • Received: 2021-07-30 Revised: 2021-11-26 Online: 2022-03-20 Published: 2022-03-20

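Abstract: Objective: To collect speech samples from patients with oral cancer before and after surgery and to build a Mandarin Chinese speech database of oral cancer patients, providing a data platform for the clinical diagnosis, treatment, and rehabilitation of oral cancer. Methods: Patients with oral cancer who attended the Department of Oral and Maxillofacial Surgery, Hospital of Stomatology, Sun Yat-sen University, or who returned there for scheduled postoperative follow-up, were enrolled. Their speech was assessed with several speech test instruments and recorded in a quiet environment; patients with non-malignant lesions and healthy people served as controls. All audio samples underwent uniform preprocessing, segmentation, annotation, and anonymization to produce a standardized speech dataset. Results: From July 2017 to April 2021, 481 individual speakers were enrolled: 274 males (57.0%) and 207 females (43.0%), with a mean age of (46.98±16.34) years. There were 369 patients with oral cancer (76.7%), 79 healthy people (16.4%), and 33 patients with non-malignant lesions (6.9%). Among the oral cancer patients, the lesion was located in the tongue or floor of the mouth in 258, and 202 entered the assessment cohort before surgery, of whom 35 (17.3%), 68 (33.7%), 41 (20.3%), and 58 (28.7%) were classified as T1, T2, T3, and T4, respectively. The median follow-up (assessment) time for all speakers was 219 d after surgery (IQR: 87.5~587 d), for a total of 1 100 visits. In all, 73 008 individual corpus audio files were obtained, covering 461 kinds of corpus items: vowels (6, 1.30%), diadochokinesis syllables (7, 1.51%), single characters (238, 51.63%), words (169, 36.66%), and sentences (38, 8.24%). Conclusions: This study established the first speech database, in China or abroad, with oral cancer as its main disease category. It adds a functional dimension to the clinical diagnosis and treatment of oral cancer and provides important data support for in-depth research on clinical bioinformatic markers and for individualized speech rehabilitation.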

Keywords: Database, Speech, Oral cancer, Mandarin

Abstract: PURPOSE: To establish a Mandarin speech database of Chinese oral cancer patients by collecting speech samples before and after surgery, so as to provide a data platform for the clinical diagnosis, treatment and rehabilitation of oral cancer. METHODS: Patients from the Department of Oral and Maxillofacial Surgery, Hospital of Stomatology, Sun Yat-sen University were enrolled. A variety of speech corpora were used to assess the participants and collect speech samples, which then underwent uniform pre-processing, segmentation, annotation, and anonymization. Speech samples from relatively healthy controls were also collected. RESULTS: A total of 481 individual speakers were enrolled from July 2017 to April 2021. The mean age at first assessment was 46.98±16.34 years. The participants comprised 274 males (57.0%) and 207 females (43.0%). Of them, 369 were patients with oral cancer (76.7%), 79 were healthy subjects (16.4%) and 33 were subjects with non-malignant lesions (6.9%). Among the oral cancer patients, 258 had primary lesions of the tongue or floor of the mouth, and 202 were recruited before surgery, with T classifications of T1 in 35 cases (17.3%), T2 in 68 cases (33.7%), T3 in 41 cases (20.3%) and T4 in 58 cases (28.7%). The median follow-up (assessment) time was 219 d (IQR: 87.5~587 d) after surgery, with a total of 1 100 appointments. A total of 73 008 isolated audio samples were obtained from 461 kinds of corpus stimuli, consisting of vowels (6, 1.30%), diadochokinesis syllables (7, 1.51%), single words (238, 51.63%), phrases (169, 36.66%) and sentences (38, 8.24%). CONCLUSIONS: The present study established the first speech database whose samples come predominantly from patients with oral cancer, providing important data support for the in-depth study of clinical biomarkers and the development of individualized speech rehabilitation.

Key words: Database, Speech, Oral cancer, Mandarin Chinese
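
The cohort percentages reported in the abstract can be recomputed from the stated counts. The following is a minimal sketch for readers who want to check the arithmetic; it is not part of the authors' analysis. The function name `shares` and the dictionary labels are hypothetical, chosen for illustration; only the counts come from the abstract.

```python
def shares(counts):
    """Return each count as a percentage of the group total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts copied from the abstract; the keys are illustrative labels only.
sex = {"male": 274, "female": 207}
cohort = {"oral cancer": 369, "healthy": 79, "non-malignant": 33}
t_stage = {"T1": 35, "T2": 68, "T3": 41, "T4": 58}

for label, counts in [("sex", sex), ("cohort", cohort), ("T stage", t_stage)]:
    # Print the group total followed by each subgroup's share in percent.
    print(label, sum(counts.values()), shares(counts))
```

The sex and cohort groups each total 481 speakers and the T classifications total 202, matching the denominators given in the abstract.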
