Student Theses

Thesis Search Results


Thesis ID: 6469
Author ID: 1120110797
Upload time: 2014/6/11 9:42:39
Chinese title: 特定事件情境下中文微博用户情感挖掘与传播研究
English title: Users’ Sentiment Mining and Spreading among Chinese Microblogs in the Context of Specific Events
Advisor: Wang Fang (王芳)
Chinese keywords: 情感分析, 情感词表, 情感传播, 社交媒体, 舆情分析
English keywords: Sentiment Analysis, Sentiment Thesaurus, Sentiment Spreading, Social Media, Public Opinion
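Chinese abstract (translated): Microblogs and other online social media play an increasingly visible role in the spread of public opinion. Users of social media can freely publish their views on an event and can vent their emotions through text, images, video and other forms. Because users belong to different social networks and information spreads very quickly, social media users are highly prone to group polarization in the context of a specific event, which can even lead to mass incidents online or in real life. The emotions users express not only affect how fast an event spreads; emotions are also contagious, and unfavorable or negative emotions can provoke negative behavior and push an event in an unfavorable direction. It is therefore necessary to analyze the sentiment of users of microblogs and other social media in the context of specific events, determine the type of sentiment and the intensity of its polarity, identify the factors that influence how users express and spread sentiment, and carry out targeted monitoring and guidance. Against this background, this thesis focuses on sentiment mining and sentiment spreading among Chinese microblog users in the context of specific events. The study addresses three main research problems: sentiment feature recognition, statistics and description of sentiment features, and sentiment spreading. The main work is as follows. First, Chinese sentiment classification thesauruses are constructed, comprising an emotional thesaurus and an evaluational thesaurus. The sentiment words come partly from three existing Chinese and English sentiment word lists and partly from the microblog text corpus of the events under analysis; they are classified and their polarity intensity is determined by annotation based on the HowNet knowledge base. The final emotional thesaurus contains 3773 emotion words in 12 major categories and 32 subcategories; the evaluational thesaurus contains 12844 evaluation words in 8 major categories and 100 subcategories. Second, sentiment words are visualized and sentiment features are classified and counted. Relationships between sentiment words are computed from their co-occurrence in the same microblog and then displayed graphically using a layout algorithm, with font size and color indicating each word's popularity and polarity intensity. The study finds that high-frequency central words reflect the dominant sentiment type of an event, and that the closer a word lies to the edge, the better it reflects the sentiment of the general public. Classified statistics of emotion words reveal the intensity of each emotion type expressed by users during a specific event. Time-series statistics of emoticons by positive and negative polarity can reveal otherwise hard-to-detect broadcasts by paid posters (the "water army"); once these are removed, positive and negative emoticon intensity follow similar trends. Third, the information dissemination network of a specific event is constructed, and social network analysis is used to analyze the key users, the information propagation distance and the degree of clustering of the network. User emotions are embedded in the event dissemination network to visualize user sentiment and show how it is distributed. Analysis of the relationship between users’ sentiment expression and their roles shows that decision makers should pay attention to users who express emotions such as "excited", "abuse", "agree" or "against", since such users are more likely to play a key role in information dissemination. The study makes three main contributions. First, it improves the construction of sentiment thesauruses. Although a few studies have addressed building Chinese sentiment lexicons, these lexicons are not publicly available, and the few usable ones merely divide sentiment words into positive and negative. Because of this limitation, current sentiment analysis mostly judges the positive or negative polarity of sentences or texts and cannot determine the specific sentiment type or intensity in a text. This study divides the sentiment thesaurus into an emotional thesaurus and an evaluational thesaurus, which supports both polarity calculation and analysis of specific emotion types. Second, it applies visualization to sentiment description, which helps improve sentiment analysis methods. Natural language visualization today most commonly displays text tags or keywords as tag clouds, which mainly visualizes text topics; research on and applications of sentiment word visualization are rare. This study visualizes not only sentiment words but also the relationships between them and their polarity intensity. It also describes a method for visualizing user sentiment within an event's information dissemination network; together, these visualization techniques and algorithms can give decision makers more intuitive information about user sentiment. Third, it advances research on sentiment spreading in Chinese social networks in the context of specific events. Existing English-language studies of sentiment spreading mostly concern emotional interaction in users’ everyday communication networks, while few examine users’ sentiment toward event information or how that sentiment spreads and is distributed in an event dissemination network. To explore factors related to sentiment spreading, this study also analyzes the relationship between users’ sentiment and the roles they play in event dissemination. Theoretically, the thesis builds a classification system for sentiment words based on psychological research and the HowNet ontology, which can serve as a reference for later studies. Methodologically, it provides a way to represent sentiment knowledge, which helps improve current sentiment analysis methods, and combines social network analysis with correlation analysis to study the spread of user sentiment, which helps improve methods for studying sentiment spreading. Practically, it helps government departments understand how public sentiment spreads during an event and provides targeted information for avoiding the concentration and polarization of public sentiment; it helps enterprises and individuals understand the public's emotional reactions to and evaluations of events on microblogs and develop targeted response strategies based on the patterns by which public sentiment diffuses; and it helps public administration departments and enterprises understand public emotions about and evaluations of their services or products so that they can improve them.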
English abstract: Microblogs and other online social media play an important role in the spread of public opinion. Users can express their views and perceptions of an event through text, images, video and other forms of information. Information about an event can be transmitted quickly from one user to another across different social networks within social media. In the context of a specific event, group polarization often appears in social media and can lead to mass incidents in real life. Emotions expressed by users not only influence how fast event information propagates but can also spread to other people in social media. Negative emotions can trigger negative behavior among users and push events in a negative direction. It is therefore necessary to analyze the sentiment of users in the context of a particular event, identify the types of emotion and the intensity of emotional polarity, and explore the factors that influence users’ emotional expression and transmission. This dissertation focuses on sentiment mining and sentiment spreading within Chinese microblogs when users face specific events. Three major research problems are addressed in three steps: sentiment feature recognition, sentiment description and sentiment spreading. Firstly, Chinese sentiment thesauruses, comprising an emotional thesaurus and an evaluational thesaurus, are constructed. The sentiment words are selected from three existing sentiment word lists and from the microblog text corpus about specific events. The extracted sentiment words are then classified and their polarity intensities labeled according to the HowNet knowledge base. The emotional thesaurus contains 3773 classified words in 12 categories and 32 secondary classes. The evaluational thesaurus contains 12844 classified words in 8 categories and 100 secondary classes. Secondly, sentiment visualization and sentiment feature statistics are carried out for two specific events.
The relationships between sentiment words are determined from their co-occurrence frequency in the same microblog text and are then displayed in a graph using a layout algorithm. The frequency and polarity of sentiment words are also shown in the graph through font size and color. It is found that the high-frequency words at the center of the graph reflect the dominant sentiment types of an event, while words closer to the edge of the graph reflect the emotions of the general public. Classification statistics for sentiment words reveal the strength of each type of emotion expressed by users about specific events. Time-series statistics of emoticons can identify some messages posted by paid posters (the "water army"). After these messages are removed, the trend in negative intensity is found to be similar to that in positive intensity. Thirdly, the dissemination network of specific event information is constructed. Key users, information propagation distance and the degree of clustering of the information dissemination network are analyzed using social network analysis. Users’ sentiments are embedded in the information dissemination network to visualize sentiment and help show how users’ emotions are distributed. The relationships between users’ emotions and the roles they play in dissemination networks are analyzed. It is suggested that policy makers should pay more attention to users who express emotions such as "excited", "abuse", "agree" or "against", because users with such emotions are more likely to play a key role in information dissemination. The innovation of this study lies in three aspects. Firstly, the construction of Chinese sentiment thesauruses is improved. A few existing studies address the building of Chinese sentiment lexicons, but most of these lexicons are not publicly available, and the few that are divide sentiment words into only two categories: positive and negative.
Because of this limitation of previous sentiment lexicons, current sentiment analysis mostly focuses on judging the polarity of sentences or texts and cannot analyze specific types of emotion or their intensities. This study divides the whole sentiment vocabulary into an emotional classification vocabulary and an evaluational classification vocabulary, so that it can analyze not only the polarity of emotions but also specific types of sentiment. Secondly, visualization technology is used to visualize sentiment words and the relationships between them. At present, tag clouds are commonly used to visualize the themes or keywords of texts but are very rarely applied to sentiment visualization. This study not only visualizes sentiment words but also shows their polarity intensity in the graph. In addition, this study visualizes users’ sentiment in the event information transmission network. Through a variety of visualization technologies and algorithms, it can provide decision makers with more intuitive sentiment information. Thirdly, research on sentiment spreading in Chinese online social networks is advanced. There are some related studies in English, but most of them focus on users’ emotional interaction in daily communication networks. In China there are few studies of sentiment spreading in the context of specific events within social media. To explore factors related to emotional communication, this study also analyzes the relationships between users’ emotions and their roles in event information transmission. The theoretical implication of this dissertation is that it constructs a classification system for sentiment words based on psychology research and the HowNet ontology. This system can serve as a reference for further research. The methodological implication of this dissertation is that it provides methods for emotional knowledge representation and sentiment spreading analysis.
This will help to improve current sentiment analysis and sentiment spreading research. The practical implication of this study is that it can help relevant government departments understand how public sentiment spreads during events and provide targeted information to avoid the agglomeration and polarization of public feelings. This study can also help enterprises or individuals understand public emotional responses to events in microblogs and develop targeted strategies based on the regular patterns of public sentiment spreading, and can help governments and enterprises improve the quality of their services or products by evaluating public mood in microblogs.
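
The co-occurrence step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `cooccurrence_counts`, the pre-tokenized posts and the toy English lexicon are all assumptions made for illustration. The sketch only counts how many posts contain each pair of lexicon words, which is the kind of pair weight a graph layout could then use.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(posts, lexicon):
    """Count, for each pair of lexicon words, the number of posts containing both.

    `posts` is an iterable of token lists (hypothetical, already segmented);
    `lexicon` is a set of sentiment words (hypothetical toy list).
    """
    pair_counts = Counter()
    for tokens in posts:
        # Keep each sentiment word once per post, in a stable sorted order,
        # so that (a, b) and (b, a) are never counted as different pairs.
        present = sorted(set(tokens) & lexicon)
        for a, b in combinations(present, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy data for illustration only; not taken from the thesis corpus.
posts = [
    ["angry", "sad", "event"],
    ["angry", "sad", "angry"],
    ["happy", "agree"],
]
lexicon = {"angry", "sad", "happy", "agree"}
counts = cooccurrence_counts(posts, lexicon)
```

Counting each word once per post is a design choice in this sketch: a word repeated inside one microblog does not inflate the pair weight. Whether the thesis counts repeats is not stated in the abstract.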