Chinese Journal of Scientific and Technical Periodicals, 2025, Vol. 36, Issue (11): 1478-1486. doi: 10.11946/cjstp.202506240737

Research on intelligent early warning of academic paper retraction risks: Hybrid enhanced detection framework based on large language models

ZHAO Xiansong1), CHEN Xiaohui1), LU Ye2),*, YANG Ming2), LIN Yuan3)

  1) School of Marxism, Dalian University of Technology, 2 Linggong Road, Ganjingzi District, Dalian 116024, China
  2) School of Software, Dalian University of Technology, 321 Tuqiang Road, Jinpu New District, Dalian 116000, China
  3) School of Public Administration and Policy, Dalian University of Technology, 2 Linggong Road, Ganjingzi District, Dalian 116024, China
  • Received: 2025-06-24 Online: 2025-11-25 Published: 2025-12-10

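  • Corresponding author: LU Ye (ORCID: 0009-0001-1092-0295), master's student
  • About the authors:
    ZHAO Xiansong (ORCID: 0009-0000-0622-197X), doctoral student, associate professor;
    CHEN Xiaohui, PhD, professor;
    YANG Ming, master's student;
    LIN Yuan, PhD, associate professor.
    Author contributions: ZHAO Xiansong designed the paper framework and reviewed and revised the paper; CHEN Xiaohui proposed the research direction; LU Ye collected the data, ran the experiments, and drafted the paper; YANG Ming took part in writing and revising the paper; LIN Yuan reviewed and revised the paper.
  • Funding:
    National Natural Science Foundation of China project "Research on Academic Recommendation Integrating Multi-Source Information" (61976036)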

Abstract:

Purposes Delayed retraction of academic papers allows incorrect knowledge to keep spreading. This study uses AI techniques to shorten the time needed to identify papers with an abnormal status and to maintain the reliability and integrity of the academic communication system. Methods Metadata and peer reviews from 12,098 papers on authoritative platforms such as PubPeer and PubMed were integrated as data support, and a hybrid enhanced detection framework for paper retraction risk was developed. The framework makes its decision by combining the results of retrieval-augmented generation and expert-enhanced generation. Findings The framework identifies retracted papers accurately and effectively: in verifying retracted-paper status, accuracy reaches 91.91% and recall reaches 73.72%. Conclusions This study confirms the technical feasibility of large language models for academic early warning and for maintaining academic integrity. The hybrid enhanced detection framework can give publishing institutions a practical retraction early-warning plan. Openly sharing the framework will promote standardization of research on retraction risk detection with large language models, help build a research integrity management system based on active monitoring, improve supervision efficiency, and support the healthy development of the scientific research ecosystem.

Key words: Retracted papers, Retraction risk detection, Large language models, Research integrity
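
The accuracy and recall figures reported in the abstract can be read against the usual binary-classification definitions. The following is a minimal Python sketch, assuming that "retracted" is the positive class (label 1) and that the abstract's metrics follow these standard definitions; the function name and the toy labels are hypothetical and are not taken from the study's data.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall), treating label 1 (retracted) as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # retracted, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not retracted, not flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # retracted, missed

    accuracy = (tp + tn) / len(pairs)
    # Recall is undefined without positive examples; report 0.0 in that case.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, recall


# Toy labels for illustration only (hypothetical, not the study's data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.4f}, recall={rec:.4f}")
```

Under these definitions, a recall lower than accuracy means that a share of truly retracted papers goes unflagged even when most papers overall are classified correctly.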
