Chinese Journal of Scientific and Technical Periodicals, 2023, Vol. 34, Issue (2): 136-143. doi: 10.11946/cjstp.202209250730

• Forum •

Construction and application of a method for identifying and classifying journals with manipulated impact factors

JIANG Fenghui1), LIU Xiangpeng2), SHAO Wei3), CHEN Chunping1), YU Longzhen4),*

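  1. 1) Editorial Office of Journal of Qingdao University of Science and Technology (Natural Science Edition), 99 Songling Road, Laoshan District, Qingdao 266061, Shandong Province, China
    2) School of Mathematics and Physics, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong Province, China
    3) College of Automation and Electronic Engineering, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong Province, China
    4) College of Economics and Management, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, Shandong Province, China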
  • Received: 2022-09-25 Revised: 2023-01-08 Online: 2023-02-15 Published: 2023-03-20
  • Corresponding author:
    *YU Longzhen (ORCID: 0000-0002-6594-6679), Ph.D., associate professor
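  • About the authors:

    JIANG Fenghui (ORCID: 0000-0001-8655-2305), M.S., editor;
    LIU Xiangpeng, M.S., associate professor;
    SHAO Wei, Ph.D., professor;
    CHEN Chunping, Ph.D., senior editor.
    Author contributions: JIANG Fenghui proposed the research idea, surveyed the literature, collected and processed the data, selected the algorithms, wrote the programs, analyzed the experimental results, and wrote the paper; LIU Xiangpeng processed the data, selected the algorithms, wrote the programs, and revised the paper; SHAO Wei selected the algorithms, designed the programs, and revised the paper; CHEN Chunping analyzed the feasibility of the approach and revised the paper; YU Longzhen analyzed the feasibility of the approach, processed the data, selected the algorithms, wrote the programs, analyzed the experimental results, and revised the paper.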
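  • Funding:
    Project of the Society of China University Journals, "Identification of and countermeasures against journal impact factor manipulation patterns based on big data and artificial intelligence algorithms" (CUJS-CX-2021-029); Project of the Shandong Provincial Department of Education, "High-quality development project for journals of higher education institutions in Shandong Province" (JYTQKB202211)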

Identification and classification of journals of impact factor manipulation

JIANG Fenghui1), LIU Xiangpeng2), SHAO Wei3), CHEN Chunping1), YU Longzhen4)

  1. 1) Editorial Office of Journal of Qingdao University of Science and Technology (Natural Science Edition), 99 Songling Road, Laoshan District, Qingdao 266061, China
    2) School of Mathematics and Physics, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, China
    3) College of Automation and Electronic Engineering, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, China
    4) College of Economics and Management, Qingdao University of Science and Technology, 99 Songling Road, Laoshan District, Qingdao 266061, China
  • Received: 2022-09-25 Revised: 2023-01-08 Online: 2023-02-15 Published: 2023-03-20

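Abstract:

[Purposes] Severe manipulation of journal impact factors compromises the objectivity of the impact factor. Such misconduct should be strictly prohibited, and effective ways of identifying manipulated journals urgently need to be found. [Methods] Taking the JCR data released each year on the Web of Science platform as the research object, multi-year data on 14 bibliometric indicators were selected for normal journals and for abnormal journals (those suppressed because their impact factors were manipulated), forming two journal datasets, normal and abnormal. Machine learning programs were written with the Python Scikit-learn library, and the training, validation, and test sets generated by merging the normal and abnormal datasets were used for classification, training, validation, and testing. [Findings] The machine learning algorithms can effectively classify the normal and abnormal journal datasets: accuracy, precision, and recall on the validation set all exceed 98%, and the five features most important to the algorithm have a feature importance of 91.55%. For journals that returned to normal after suppression, the recognition performance of some algorithms starts to decline on data from the fifth year after suppression. All JCR editorial concern journals are classified as abnormal, and the journals suppressed or given a suppression warning in the 2021 edition of JCR are all correctly classified as abnormal. The support vector machine algorithm has the best predictive performance. [Conclusions] Machine learning algorithms have inherent advantages of speed and objectivity in identifying journals with manipulated impact factors. As both the techniques for manipulating impact factors and the number of bibliometric indicators keep growing, it becomes harder and harder to identify and judge manipulated journals by manually combining indicators, and the advantages of machine learning algorithms become increasingly prominent.

Key words: Impact factor manipulation, JCR suppression journal, JCR editorial concern journal, JCR indicator, Machine learning, Automatic identification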
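
The workflow outlined in the Methods (merge normal and abnormal journal data, split into training, validation, and test sets, then train a classifier and report accuracy, precision, and recall) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' program: the data are synthetic, and the 60/20/20 split, the RBF kernel, the standardization step, and the helper names `make_synthetic_journals`, `split_train_val_test`, and `evaluate` are all assumptions introduced here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_INDICATORS = 14  # number of bibliometric indicators stated in the abstract


def make_synthetic_journals(n_normal=400, n_abnormal=100, seed=0):
    """Return (X, y) of synthetic indicator values; y = 1 marks 'abnormal'.

    Hypothetical stand-in for the JCR data, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    normal = rng.normal(size=(n_normal, N_INDICATORS))
    abnormal = rng.normal(size=(n_abnormal, N_INDICATORS))
    # Assumption: abnormal journals are shifted on the first three indicators.
    abnormal[:, :3] += 3.0
    X = np.vstack([normal, abnormal])
    y = np.concatenate([np.zeros(n_normal, dtype=int),
                        np.ones(n_abnormal, dtype=int)])
    return X, y


def split_train_val_test(X, y, seed=0):
    """Stratified 60/20/20 split (the ratios are assumed, not from the paper)."""
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


def evaluate(model, X, y):
    """Compute the three metrics named in the abstract for one data split."""
    pred = model.predict(X)
    return {
        "accuracy": accuracy_score(y, pred),
        "precision": precision_score(y, pred),
        "recall": recall_score(y, pred),
    }


X, y = make_synthetic_journals()
train, val, test = split_train_val_test(X, y)

# Standardize the indicators, then fit a support vector machine classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(*train)

val_scores = evaluate(model, *val)
test_scores = evaluate(model, *test)
print("validation:", val_scores)
print("test:", test_scores)
```

Standardizing before the SVM is a common precaution when indicators live on very different scales, as bibliometric indicators typically do; whether the authors did so is not stated.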

Abstract:

[Purposes] Widespread manipulation of journal impact factors has seriously undermined the objectivity of the impact factor. Such misconduct should be strictly prohibited, and effective methods for identifying manipulated journals are urgently needed. [Methods] Using the annual JCR data published on the Web of Science platform, we selected multi-year data on 14 bibliometric indicators for normal journals and abnormal journals (those suppressed for impact factor manipulation) to form two datasets, normal and abnormal. Machine learning programs written with the Python Scikit-learn library were trained, validated, and tested on the training, validation, and test sets generated from the combined normal and abnormal datasets. [Findings] The machine learning algorithms classify the normal and abnormal journal datasets effectively, with accuracy, precision, and recall on the validation set all exceeding 98%. The feature importance of the five most important features of the algorithm is 91.55%. For journals that returned to normal after suppression, the recognition performance of some algorithms begins to decline on data from the fifth year after suppression. All JCR editorial concern journals are classified as abnormal. The journals suppressed, or given a suppression warning, in the 2021 edition of JCR are all correctly classified as abnormal. The support vector machine algorithm gives the best predictive performance. [Conclusions] Machine learning algorithms have inherent advantages of speed and objectivity in identifying journals with manipulated impact factors. As the number of manipulation techniques and of bibliometric indicators keeps growing, manually combining indicators to identify and judge manipulated journals becomes increasingly difficult, and the advantages of machine learning algorithms become increasingly evident.

Key words: Impact factor manipulation, JCR suppression journal, JCR editorial concern journal, JCR indicator, Machine learning, Automatic identification
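
The Findings report a feature importance figure for the five most important features. One way such a figure could be computed is to rank per-feature importances and sum the five largest as a share of the total. The sketch below is illustrative only: the abstract does not say which algorithm produced the importances, so a random forest is assumed, the data are synthetic, and `top_k_importance_share` is a hypothetical helper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def top_k_importance_share(importances, k=5):
    """Return the indices of the k largest importances and their summed share."""
    importances = np.asarray(importances, dtype=float)
    order = np.argsort(importances)[::-1][:k]  # indices, largest first
    share = float(importances[order].sum() / importances.sum())
    return order, share


# Synthetic stand-in for 14 bibliometric indicators (not real JCR data).
rng = np.random.default_rng(0)
n_samples, n_indicators = 500, 14
X = rng.normal(size=(n_samples, n_indicators))
# Assumption: the label depends only on the first two indicators.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Assumed model: a random forest, whose feature_importances_ sum to 1.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

top_idx, share = top_k_importance_share(forest.feature_importances_, k=5)
print("top-5 indicator indices:", top_idx)
print("combined importance share:", round(share, 4))
```

A high combined share for a handful of indicators would suggest that a classifier relies mainly on those indicators, which is how a figure like the one in the abstract is usually read.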