Student Theses

Thesis Search Result


Thesis ID: 11830
Author ID: 1120160839
Upload time: 2020/6/22 14:00:42
Chinese title: 基于非专利引文分析的科学与技术文本知识相关性研究
English title: Research on the Correlation of Scientific and Technical Text Knowledge Based on Non-patent Citation Analysis
Advisor: 王芳
Chinese keywords: 非专利引文;基于科学的技术创新;科学与技术关联;信息抽取;相关性计算
English keywords: non-patent citation; science-based technological innovation; science and technology correlation; information extraction; correlation measurement
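Chinese abstract (translated): Scientific and technological innovation is an important pillar of national economic and social development, and the positive effect of scientific development on technological innovation is widely recognized by governments and scholars. As the most active and important participants in the national technological innovation system, enterprises are the backbone of the innovation-driven development strategy. The promoting effect of scientific development on technological innovation revealed by existing research, together with the innovation results accumulated by diverse actors engaged in R&D, provides theoretical guidance and a knowledge base for technological innovation actors, represented by enterprises, to carry out science-based technological innovation.

As important evidence of the link between science and technology, the non-patent citations contained in patent documents offer a measurable, practical route to exploring the complex relationships and modes of interaction between the two. Existing studies mostly use the bibliographic information in technical patents and academic papers to build mappings among innovation actors, innovation fields, and innovation outputs, and then draw on theories such as innovation diffusion, social network relationships, and technology transfer to discuss the principles and mechanisms by which scientific development drives technological innovation. Although what a non-patent citation actually links is two kinds of text, academic papers and technical patents, its use as evidence linking science and technology rests on two assumptions: 1) patents and papers are important outputs of technological innovation and scientific research, respectively; 2) the patent and the paper in a non-patent citation relationship are related in knowledge.

At present, the academic community has reached a fairly unified consensus on the first assumption. Statistics on patents granted and papers published across countries and regions show that enterprises, and research institutions represented by universities, are the main applicants for patents and the main publishers of papers, respectively, so it is reasonable to treat patents and papers as the main innovation outputs of the two types of actor. The second assumption, however, still lacks systematic and comprehensive scientific validation, which directly affects the reliability of science-technology linkage research built on non-patent citations. Moreover, although major technological breakthroughs depend on scientific progress in related fields, not every academic result that is knowledge-related to a citing patent directly helps solve a technical problem, and excellent research results do not always have high application value that can be converted directly into technology. In enterprises' technological innovation practice, non-patent citations can provide concrete clues to the scientific knowledge that R&D requires, but over-reliance on them can lead enterprises to overlook the incompatibility between the developmental logics of scientific research and technological innovation. This in turn biases R&D strategy and resource investment, raising project investment risk and lengthening the development cycle. Therefore, testing the knowledge correlation of scientific and technical texts in non-patent citation relationships, revealing the types and characteristics of that correlation, and developing methods that can identify scientific research with potential for technological innovation from non-patent citations are of great significance for exploring the complex relationship between science and technology, for enriching and improving the paradigms and methods of correlation research, and for guiding enterprises in science-based technological innovation.

This dissertation uses text mining techniques and information analysis methods to study systematically the knowledge correlation between papers and patents, the two kinds of scientific and technical text in a non-patent citation relationship. The main research content includes:
I. Reviewing the theoretical basis, research methods, and technical means for establishing a link between science and technology, assessing the current state of science-technology correlation research based on non-patent citations, and pointing out its shortcomings.
II. Restricting the objects of study to papers and patents as the two kinds of scientific and technical text, and restricting non-patent citations to citations of scientific papers by technical patents. The vector space model is used to compute the knowledge correlation of the two kinds of text, and a large-sample data set is used to test the knowledge correlation of papers and patents in non-patent citation relationships in the 3D printing field, providing systematic and comprehensive evidence on whether non-patent citations can serve as a basis for judging that the two kinds of text are related in knowledge.
III. After four rounds of Delphi expert surveys, summarizing the types of knowledge correlation between citing patents and cited papers. Drawing on innovation diffusion theory and literature-based knowledge discovery theory, the study discusses how scientific research results with different types of correlation to a patent can support enterprises' technological innovation, especially how they can speed up the solution of specific technical problems. On this basis, it analyzes how well correlation scores based on the vector space model measure texts with different types of knowledge correlation, and points out the problems with treating high-scoring scientific research results as an important route to technological innovation.
IV. To address these shortcomings of traditional correlation calculation, proposing a method that can identify scientific research with potential for innovative application from non-patent citations, working from three aspects: information extraction, knowledge representation, and correlation measurement. The method comprises a keyword extraction algorithm for representing the knowledge content of patents and papers, a text knowledge representation method that incorporates semantic information between concepts, and a method for calculating the knowledge correlation between papers and patents.
V. To demonstrate the advantages of the proposed correlation calculation method, taking technical patents and scientific papers in the 3D printing field as the objects of analysis and describing how the method can be applied to the task of identifying enterprises' technological innovation partners. The focus is on how using the correlation result as a partner evaluation indicator affects the identification results, thereby demonstrating the reliability of the method in calculating the knowledge correlation of scientific and technical texts.

The main conclusions and results of this dissertation include:
I. It demonstrates that knowledge correlation does exist between technical patents and scientific papers linked by non-patent citations.
II. It proposes four categories of knowledge correlation between patents and papers in non-patent citation relationships: knowledge background correlation, innovation dependency correlation, technology-function correlation, and topic-concept correlation.
III. It proposes a keyword extraction algorithm for representing the important knowledge content in patent and paper abstracts. Evaluation on an open corpus shows that the method outperforms the other two baseline algorithms.
IV. It proposes a new method for calculating the knowledge correlation of patent and paper texts that reflects correlation along three dimensions: text content, "technology-function" association, and knowledge network distance.
V. Using the identification of technological innovation partners for enterprises in the 3D printing field as an example, it demonstrates the advantages of the method in calculating the correlation of scientific and technical texts and illustrates its wide range of application scenarios.

This dissertation contains 51 figures, 46 tables, and 281 references.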
English abstract: Scientific and technological innovation is the foundation of national economic and social development. The positive role of scientific development in technological innovation has also been widely recognized by scholars and governments in various countries. As the most active participants in the national technological innovation system, enterprises are the backbone of the innovation-driven development strategy. The promoting effect of scientific development on technological innovation, together with the innovation results accumulated by multiple actors engaged in R&D activities, provides theoretical guidance and a knowledge base for technological innovation entities, represented by enterprises, to carry out science-based technological innovation. As evidence of the association between science and technology, the non-patent citations contained in patents provide a measurable, practical way to explore the complex relationships and interactions between the two fields. Existing studies have mostly established mappings between different innovation subjects, innovation fields, and innovation achievements based on the bibliographic information in patents and academic papers. Theories including innovation diffusion, social network relationships, and technology transfer also provide perspectives for discussing how scientific development promotes technological innovation. However, the use of non-patent citations to associate academic papers with technical patents rests on two assumptions: 1) patents and papers are the main outputs of technological innovation and scientific research, respectively; 2) there is a knowledge correlation between the patent and the paper in a non-patent citation relationship. At present, the academic community has reached a relatively unified consensus on the first assumption.
Counts of patents and published papers in various countries and regions show that enterprises, and scientific research institutions represented by universities, account for most of the patents and most of the papers, respectively. Therefore, it is reasonable to treat patents and papers as the main innovation outputs of these two types of scientific and technological innovation entities. However, the second assumption still lacks systematic and comprehensive scientific validation, which directly affects the reliability of research on the linkage between science and technology based on non-patent citations. In addition, although major technological breakthroughs are inseparable from scientific progress in related fields, not all academic achievements directly promote the resolution of technical problems, and excellent scientific research achievements do not always have high application value for technology transfer. Non-patent citations can provide specific clues to the scientific knowledge required for R&D activities in enterprises’ technological innovation practice. However, excessive reliance on them tends to obscure the logical incompatibility between the development of scientific research and that of technological innovation. This can skew R&D strategies, which increases the investment risk of a project and lengthens its development cycle. Therefore, testing the knowledge correlation of scientific and technical texts in non-patent citations, revealing its types and characteristics, and developing methods that can identify scientific research with technological innovation potential from non-patent citations are of great significance for exploring the complex relationship between science and technology and for enriching and improving the paradigms and methods of research on the correlation between the two fields. Such work also helps enterprises engage in science-based technological innovation.
This dissertation adopts text mining and information analysis methods to test, on a large sample data set, the knowledge correlation between scientific papers and technical patents linked by non-patent citations. After four rounds of Delphi expert surveys, the author summarizes four types of knowledge correlation between the two kinds of text and proposes a knowledge correlation calculation method that reflects the potential of scientific research for technological innovation. Together, these results are intended to support enterprises’ science-based technological innovation and improve their R&D efficiency. The main research contents include:
I. Sorting out the theoretical basis, research methods, and techniques for exploring the relationship between science and technology, reviewing the studies on science and technology correlations based on non-patent citations, and pointing out their deficiencies.
II. The non-patent citations studied in this dissertation are limited to citations of papers by patents. Taking 3D printing technology as a case, the vector space model was used to calculate the knowledge correlation between the two kinds of text, and the results were used to test whether a non-patent citation is a reliable indicator of knowledge correlation between patents and papers.
III. Using the Delphi method, the author summarized the opinions of field experts and proposed four types of knowledge correlation between patents and the scientific papers they cite. The study also elaborates on how knowledge discovered in scientific work supports the technological innovation activities of enterprises from the perspectives of innovation diffusion and literature-based knowledge discovery, and then points out the weaknesses of current text mining technology in identifying key scientific research with application potential.
IV. By improving information extraction, knowledge representation, and correlation measurement, this study proposes a new method for identifying scientific research that has potential for technological innovation. It includes a graph-based keyword extraction algorithm for representing important knowledge in papers and patents, a textual knowledge representation method that integrates the semantic information of concepts, and a method for calculating the innovation and application value of scientific studies.
V. Taking partner identification for enterprises in the 3D printing field as an example, the proposed method of calculating knowledge correlation between patents and papers was used to establish the link between technical and scientific knowledge. Combining the technical characteristics of enterprises with the research capabilities of scientific research institutions, preferred R&D partners were selected for various types of enterprises to support their science-based technological innovation.

The main conclusions include:
I. The dissertation demonstrates the knowledge correlation between scientific papers and technical patents linked by non-patent citations.
II. It proposes four types of knowledge correlation between patents and the scientific papers they cite, namely knowledge background, innovation foundation, technology & function, and topic & concept correlations.
III. It proposes a keyword extraction algorithm and evaluates its performance on an open-access corpus. The algorithm outperforms the other two graph-based baseline methods.
IV. It proposes a new method for calculating the knowledge correlation between patents and papers. The method reflects the correlation of any two texts in terms of keyword co-occurrence, “technology-function” association, and knowledge distance in an external knowledge network.
V. The research results of this dissertation were applied to the task of identifying innovation partners for 3D printing enterprises engaged in science-based technological innovation, and the advantages of the methods were demonstrated through a domain case.

This dissertation contains 51 figures, 46 tables, and 281 references.
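
The abstract states that a vector space model was used to calculate the knowledge correlation between a patent and a paper, but it does not give the term weighting or the similarity measure. The following is a minimal sketch of one common form of that idea, assuming TF-IDF weights with a smoothed IDF and cosine similarity. The function names, the tokenizer, and the three sample texts are hypothetical illustrations and are not taken from the dissertation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Assumed weighting: relative term frequency times a smoothed IDF.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical texts: a citing patent, a cited paper, and an unrelated paper.
patent = "A nozzle for fused deposition 3D printing of polymer filament"
cited_paper = "Polymer filament extrusion behaviour in fused deposition printing"
unrelated_paper = "Survey of bird migration routes across coastal wetlands"

vecs = tfidf_vectors([patent, cited_paper, unrelated_paper])
print("patent vs cited paper:    ", round(cosine(vecs[0], vecs[1]), 3))
print("patent vs unrelated paper:", round(cosine(vecs[0], vecs[2]), 3))
```

A score of this kind captures only shared vocabulary. The abstract identifies that limitation as the motivation for the proposed method, which adds “technology-function” association and knowledge network distance.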