%0 Journal Article
%A DAO Hu
%A GU Qi
%A HAN Xin-Liang
%A XIE Yong-Hong
%A YANG Shi-Bing
%T Traditional Chinese Medicine Symptom Normalization Approach Based on Pre-trained Language Models
%D 2022
%R
%J Journal of Beijing University of Posts and Telecommunications
%P 14-20
%V 45
%N 4
%X Symptom normalization plays a vital role in mining Traditional Chinese Medicine (TCM) knowledge and in promoting the modernization of TCM. The task is difficult because a single symptom can have several different literal descriptions, and descriptions can map one-to-many to symptoms. To address this problem, a two-stage framework based on pre-trained language models is proposed. First, following the definition and classification of symptoms, a multi-label text classification model semantically divides the symptom descriptions to obtain candidate normalized symptom words. Then a symptom word matching model scores and ranks the candidate words, and the highest-scoring candidate under each semantic label is taken as the normalization result for the symptom description. Finally, several strategies are designed to perform a second recall of the results and improve performance. Results obtained with different pre-trained models are analyzed on a constructed symptom normalization dataset. The experiments show that the method and strategies handle symptom normalization effectively; the ERNIE-based model performs best, with an F1 value of 0.894.
%U https://journal.bupt.edu.cn/EN/abstract/article_4935.shtml