摘要:
【目的】基于当前大数据技术支撑的主流传播数据挖掘逻辑机制,即作者关联度机制及文献碎片化自然语言处理机制,围绕传播对象的相关性、时效性等维度进行实验设计和数据统计。【方法】在CNKI数据库中按照学科分类随机选取100篇文献作为传播样本,在各机制应用平台挖掘共计10000个传播对象进行德尔菲式的判定及数据统计分析。分析维度涉及时效相关性、匹配相关性、发文频率等一系列图情指标。【结果】分析显示,作者关联度机制存在内生性问题,难以继续优化;文献碎片化自然语言处理机制虽存在学科分类表象与匹配聚类实质不易弥合的客观问题,但可以通过优化数据挖掘逻辑提升数据挖掘效果。【结论】基于分析结果,通过改进算法映射及摒弃“超龄”数据来提出优化路径,并通过实验验证其有效性。
关键词:
大数据,
学术期刊,
精准传播,
传播对象,
数据挖掘
Abstract:
[Purposes] This study examines the two mainstream data mining mechanisms for identifying potential readers that current big data technology supports, namely the author relevance mechanism and the document fragmentation natural language processing mechanism, and designs experiments and collects statistics on dimensions such as the relevance and timeliness of the potential readers identified. [Methods] One hundred documents were randomly selected from the CNKI database by discipline classification as communication samples, and a total of 10,000 potential readers were mined on the platforms that apply each mechanism, then assessed by Delphi-style judgment and analyzed statistically. The analysis dimensions covered a range of library and information science indicators, including timeliness relevance, matching relevance, and publication frequency. [Findings] The analysis shows that the author relevance mechanism has endogenous problems and is difficult to optimize further. The document fragmentation natural language processing mechanism faces the objective problem that the surface discipline classification and the actual matching clusters are hard to reconcile, but its data mining performance can be improved by optimizing the data mining logic. [Conclusions] Based on these results, an optimization path is proposed that improves the algorithm mapping and discards "overage" (outdated) data, and its effectiveness is verified experimentally.
Keywords:
Big data,
Academic journal,
Precision communication,
Potential reader,
Data mining
田海江, 黄江华. 基于大数据的中文学术期刊传播对象数据精准挖掘逻辑优化[J]. 中国科技期刊研究, 2023, 34(3): 341-347.
TIAN Haijiang, HUANG Jianghua. Optimization of the mechanism for data mining of potential readers of Chinese academic journals based on big data[J]. Chinese Journal of Scientific and Technical Periodicals, 2023, 34(3): 341-347.