Student Theses

Thesis Search Result

Thesis ID: 11094
Author ID: 2120163255
Upload time: 2019/12/6 14:51:00
Chinese title: 基于FastText的微信公众号文章标题谣言识别研究
English title: Research on the Recognition of the Titles of WeChat Articles Based on FastText
Advisor: 王芳 (Wang Fang)
Chinese keywords: 深度学习;微信谣言;特征分析
English keywords: Deep learning; WeChat rumors; feature analysis
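Chinese abstract (translated): WeChat has a large user base and a rich, convenient feature set, and people increasingly depend on it. Analyzing and mining the rumors that circulate on it helps promote social harmony, stability, and progress. Rumor recognition for the title text of WeChat official account articles is also a focus of current rumor-recognition research and matters for both rumor detection and rumor monitoring. This thesis takes the titles of WeChat official account articles as its research object, extracts their key features for comparison, and uses a dictionary tree (trie) and a fuzzy matching algorithm to identify and clean the entities in the titles. It proposes a FastText-based rumor recognition model for WeChat official account articles, which consists of three modules: corpus preprocessing, text vectorization, and neural network training. To verify the practicality of the proposed algorithm, the thesis introduces traditional machine learning algorithms and uses them as baselines in the experiments. The experiments show that FastText has a clear advantage in training time and prediction time, the best of the three algorithms compared, while still classifying well. The thesis also improves the original algorithm by adding a modifier based on common words; experiments show that the improved algorithm outperforms the original in accuracy, efficiency, and other respects. The thesis concludes that FastText also performs well on rumor classification of WeChat official account article titles, and that the proposed model has considerable practical and research value for rumor recognition on WeChat official accounts. The rumor taxonomy and the FastText-based title rumor recognition model proposed here can support government departments in rumor control, give the public more timely rumor identification, and enable a faster, more efficient rumor early-warning mechanism. Existing rumor recognition methods suffer from low efficiency and late warnings, and this work can substantially improve on those problems. The proposed rumor taxonomy also supports further study of rumor characteristics.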
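The abstract names a dictionary tree (trie) and a fuzzy matching algorithm for identifying and cleaning entities in titles but gives no details. The following is only a minimal sketch of how the two techniques can be combined, not the thesis's implementation: the class `Trie`, the helpers `find_entities` and `fuzzy_lookup`, the similarity threshold, and the sample vocabulary are all assumptions made for illustration, and the standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the author used.

```python
from difflib import SequenceMatcher


class Trie:
    """Character-level prefix tree holding a small entity dictionary."""

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$end"] = word  # multi-character key, so it cannot clash with a single char

    def longest_match(self, text, start):
        """Return the longest dictionary entry that begins at text[start], or None."""
        node, found = self.root, None
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            found = node.get("$end", found)
        return found


def find_entities(text, trie):
    """Scan left to right, greedily taking the longest exact match at each position."""
    hits, i = [], 0
    while i < len(text):
        match = trie.longest_match(text, i)
        if match:
            hits.append(match)
            i += len(match)
        else:
            i += 1
    return hits


def fuzzy_lookup(candidate, vocabulary, threshold=0.8):
    """Map a possibly misspelled candidate to its closest vocabulary entry, or None."""
    def score(word):
        return SequenceMatcher(None, candidate, word).ratio()

    best = max(vocabulary, key=score)
    return best if score(best) >= threshold else None


# Hypothetical entity vocabulary, for illustration only.
VOCAB = ["WeChat", "FastText", "Weibo"]
trie = Trie()
for entity in VOCAB:
    trie.insert(entity)

print(find_entities("FastText on WeChat titles", trie))
print(fuzzy_lookup("WeChta", VOCAB))
```

The trie handles exact, longest-first matching in a single pass over the title, while the fuzzy step is a fallback for near misses such as typos; a real pipeline would need a domain-specific entity dictionary and a tuned threshold.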
English abstract: With the rapid development of Web 2.0 and mobile information technology, instant messaging software, Weibo, BBS forums, and many kinds of network apps have become important carriers of information and are now tightly integrated with people's work, study, and entertainment. In online social networks, members of the public, who were originally only recipients of information, now act as publishers and disseminators as well. Online social networks have greatly increased the frequency and speed of information exchange and improved the efficiency of communication. However, while they make communication convenient, they also lower the cost of spreading false information, which in turn accelerates the spread of rumors.

This paper takes the article titles of WeChat public accounts as its research object and extracts their key features for comparison. A dictionary tree and a fuzzy matching algorithm are used to identify and clean the entities in the article titles. The paper proposes a FastText-based rumor analysis and recognition model for WeChat public account articles. The model consists of three modules: corpus preprocessing, text vectorization, and neural network training. FastText is an open-source text classification and word-vector training tool released by Facebook's AI Research lab. Its main strength is that it trains faster and more efficiently than deep models while reaching the same or similar accuracy. FastText closely resembles the CBOW model of word2vec and can likewise be understood as a shallow neural network, but it differs from CBOW in several ways. First, FastText predicts the label of the article, while CBOW predicts the word in the middle of the context. Second, FastText classification is supervised learning, while CBOW is unsupervised. Finally, CBOW takes only the words inside the context window as input, while FastText takes all the words of the article. The model uses these properties of FastText to classify rumor text effectively.

To verify the utility of the proposed algorithm, the paper introduces traditional machine learning algorithms as baselines in the experiments. The experiments show that random forest classifies better than FastText and Naive Bayes, but its training and prediction times are slightly higher than those of the other two. Naive Bayes is faster than random forest and almost as fast as FastText, but its classification performance is the worst of the three. FastText has the best training and prediction times of the three algorithms while still classifying well. The paper concludes that FastText also performs well on rumor classification of WeChat public account article titles, and that the proposed title rumor recognition model has considerable practical and research value under certain conditions.
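To make the FastText-versus-CBOW description concrete, here is a small pure-Python sketch of a FastText-style supervised classifier: it averages the embeddings of all word and word-bigram features in a title and feeds the average to a softmax over labels. It is an illustration only, not the thesis's model and not the fastText library (which adds hashing, hierarchical softmax, and other optimizations). The class name `TinyTextClassifier`, the hyperparameters, and the toy English titles are all assumptions invented for this example.

```python
import math
import random


def features(title):
    """All words of the title plus word bigrams (a cheap way to keep some word order)."""
    words = title.lower().split()
    return words + [a + "_" + b for a, b in zip(words, words[1:])]


def softmax(logits):
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyTextClassifier:
    def __init__(self, dim=8, lr=0.3, seed=0):
        self.dim, self.lr = dim, lr
        self.rng = random.Random(seed)
        self.emb = {}     # feature -> embedding vector (hidden layer)
        self.labels = []  # class names
        self.out = []     # one output weight vector per class

    def _hidden(self, feats):
        """Average the embeddings of the features seen during training."""
        known = [f for f in feats if f in self.emb]
        if not known:
            return known, [0.0] * self.dim
        h = [sum(self.emb[f][k] for f in known) / len(known) for k in range(self.dim)]
        return known, h

    def _probs(self, h):
        return softmax([sum(w * x for w, x in zip(row, h)) for row in self.out])

    def predict_proba(self, title):
        _, h = self._hidden(features(title))
        return dict(zip(self.labels, self._probs(h)))

    def predict(self, title):
        probs = self.predict_proba(title)
        return max(probs, key=probs.get)

    def fit(self, data, epochs=500):
        """Plain SGD on cross-entropy; returns the mean loss per epoch."""
        for title, label in data:
            if label not in self.labels:
                self.labels.append(label)
                self.out.append([0.0] * self.dim)
            for f in features(title):
                if f not in self.emb:
                    self.emb[f] = [self.rng.uniform(-0.5, 0.5) for _ in range(self.dim)]
        history = []
        for _ in range(epochs):
            loss = 0.0
            for title, label in data:
                known, h = self._hidden(features(title))
                probs = self._probs(h)
                target = self.labels.index(label)
                loss -= math.log(max(probs[target], 1e-12))
                # Gradient of cross-entropy with respect to the logits.
                err = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
                grad_h = [sum(err[i] * self.out[i][k] for i in range(len(err)))
                          for k in range(self.dim)]
                for i, e in enumerate(err):
                    for k in range(self.dim):
                        self.out[i][k] -= self.lr * e * h[k]
                for f in known:
                    for k in range(self.dim):
                        self.emb[f][k] -= self.lr * grad_h[k] / len(known)
            history.append(loss / len(data))
        return history


# Made-up toy titles, for illustration only; not data from the thesis.
TRAIN = [
    ("shocking secret doctors hide this cure", "rumor"),
    ("forward now or your account will be deleted", "rumor"),
    ("miracle fruit cures cancer overnight", "rumor"),
    ("city council publishes annual budget report", "normal"),
    ("university announces new library opening hours", "normal"),
    ("weather service forecasts rain for the weekend", "normal"),
]

model = TinyTextClassifier()
history = model.fit(TRAIN)
print(model.predict("miracle fruit cures cancer overnight"))
```

The sketch mirrors the three differences listed in the abstract: the output layer predicts a label rather than a middle word, training uses labeled examples, and the input is every word of the title rather than a fixed context window. Chinese titles would first need word segmentation, which this example sidesteps by using whitespace-separated English.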