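The analysis covers:
- Distribution of danmaku length (character count)
- Distribution of the total number of danmaku sent per user
- Distribution of danmaku sentiment scores
- Distribution of the characters that danmaku focus on most
- Danmaku character profiles (code omitted for now)
First, import the data-processing and plotting libraries.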
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import jieba
import jieba.analyse
from snownlp import SnowNLP
from pyecharts.charts import Bar, Pie, Line, WordCloud, Page
from pyecharts import options as opts
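01 Danmaku overview
The scraped danmaku data looks like this. Each record contains the following fields: episode number, comment id, user name, VIP level, comment content, comment time point, and number of likes.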
# Read the data
df_1 = pd.read_csv('./data/古董局中局2腾讯弹幕.csv', encoding='utf-8', engine='python')
df_1.head()
There are 107,570 danmaku records in total.
df_1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 107570 entries, 0 to 107569
Data columns (total 7 columns):
episodes      107570 non-null int64
comment_id    107570 non-null int64
oper_name     25539 non-null object
vip_degree    107570 non-null int64
content       107570 non-null object
time_point    107570 non-null int64
up_count      107570 non-null int64
dtypes: int64(5), object(2)
memory usage: 5.7+ MB
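The summary above shows that the user-name column has only 25539 non-null values out of 107570, roughly 24%, while every other column is complete. A minimal sketch of how such a share could be computed directly; the `toy` frame and its values are made up for illustration and are not the article's data:

```python
import pandas as pd

# Hypothetical miniature frame mirroring two of the columns above.
toy = pd.DataFrame({
    "oper_name": ["用户A", None, None, "用户B"],
    "content": ["哈哈", "好看", "来了", "加油"],
})

# Share of non-null values per column; isna().sum() would give absolute counts.
non_null_share = toy.notna().mean()
print(non_null_share)
```

Any per-user statistic built on the user name would therefore only cover the danmaku that carry one.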
02 Danmaku length distribution
# Compute the character count of each danmaku
word_num = df_1['content'].apply(lambda x: len(x))
# Binning
bins = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 549]
word_num_cut = pd.cut(word_num, bins).value_counts().sort_index()
# Bar chart
bar1 = Bar(init_opts=opts.InitOpts(width='1350px', height='750px'))
bar1.add_xaxis(word_num_cut.index.astype('str').tolist())
bar1.add_yaxis("", word_num_cut.values.tolist(), category_gap='4%')
bar1.set_global_opts(title_opts=opts.TitleOpts(title="Danmaku length distribution"),
                     visualmap_opts=opts.VisualMapOpts(max_=38107))
bar1.render()
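The binning step above can be tried on a few made-up strings. This is a minimal sketch rather than the article's code: the `sample` series and the helper name `length_distribution` are hypothetical. It relies on `pd.cut` using right-closed intervals by default, so an edge list starting at 0 puts lengths 1 to 5 in the first bin, and it takes the top edge from the data instead of hard-coding a value such as 549.

```python
import pandas as pd

def length_distribution(texts: pd.Series, step: int = 5, cap: int = 50) -> pd.Series:
    """Count texts per length bin; lengths above `cap` share one final bin."""
    lengths = texts.str.len()
    edges = list(range(0, cap + 1, step))
    # Add a final edge at the longest observed length, if it exceeds the cap.
    longest = int(lengths.max())
    if longest > cap:
        edges.append(longest)
    # pd.cut uses right-closed intervals by default: (0, 5], (5, 10], ...
    return pd.cut(lengths, edges).value_counts().sort_index()

# Hypothetical sample data, for illustration only.
sample = pd.Series(["哈哈", "药不然好帅", "这集太精彩了吧", "a" * 60])
dist = length_distribution(sample)
print(dist)
```

Empty bins are kept with a count of zero, which keeps the x-axis of a bar chart evenly spaced.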
03 Danmaku sentiment analysis
def nlp_score(x):
    """
    Purpose: get the sentiment score of a text
    """
    sn = SnowNLP(x)
    return sn.sentiments
# Compute the sentiment score
df_1['nlp_score'] = df_1['content'].map(lambda x: nlp_score(x))
# Binning
score_bin = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
score_cut = pd.cut(df_1['nlp_score'], bins=score_bin)
score_cut = score_cut.value_counts().sort_index()
# Draw the line chart
line1 = Line(init_opts=opts.InitOpts(width='1350px', height='750px'))
line1.add_xaxis(score_cut.index.astype('str').tolist())
line1.add_yaxis('', score_cut.values.tolist(),
                areastyle_opts=opts.AreaStyleOpts(opacity=0.5),
                label_opts=opts.LabelOpts(is_show=False))
line1.set_global_opts(title_opts=opts.TitleOpts(title='Danmaku sentiment score distribution [0~1]'),
                      # toolbox_opts=opts.ToolboxOpts(),
                      visualmap_opts=opts.VisualMapOpts(max_=22288))
line1.set_series_opts(linestyle_opts=opts.LineStyleOpts(width=4))
line1.render()
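With the default right-closed bins of `pd.cut`, the first interval is (0, 0.1], so a score of exactly 0 would not land in any bin. Whether SnowNLP ever returns exactly 0 on this data is not something the article states; the sketch below, on made-up scores, only shows how `include_lowest=True` guards against that case.

```python
import pandas as pd

# Hypothetical sentiment scores in [0, 1], for illustration only.
scores = pd.Series([0.0, 0.05, 0.5, 0.95, 1.0])
edges = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

# Default: right-closed bins, so the first bin is (0.0, 0.1] and 0.0 is dropped.
default_cut = pd.cut(scores, bins=edges)
# include_lowest=True closes the first bin on the left so that 0.0 is kept.
inclusive_cut = pd.cut(scores, bins=edges, include_lowest=True)

print(default_cut.isna().sum())    # scores that fell outside every bin
print(inclusive_cut.isna().sum())
```

Comparing the two NaN counts is a quick way to check whether any scores are silently left out of the chart.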
04 Which characters danmaku mention most
def calculate_num(words_list):
    """
    Purpose: given a list of keywords, count the danmaku that mention any of them
    """
    pattern = '|'.join(words_list)
    num = int(df_1['content'].str.contains(pattern).sum())
    return num
# Keywords
xiayu = ['许愿', '夏雨', '下雨']
weichen = ['药不然', '魏晨', '药二爷', '晨晨', '老魏', '晨哥', '晨']
laochaofeng = ['老朝奉', '老朝凤', '老朝', '老嘲讽', '老巢']
Aliya = ['黄烟烟', '阿丽亚', '烟烟']
liangjing = ['梁静', '素姐']
liuyiming = ['一鸣', '毕彦君']
all_list = [xiayu, weichen, laochaofeng, Aliya, liangjing, liuyiming]
danmu_name = ['夏雨', '魏晨', '老朝奉', '阿丽亚', '梁静', '刘一鸣']
danmu_num = [calculate_num(i) for i in all_list]
# Bar chart
bar2 = Bar(init_opts=opts.InitOpts(width='1350px', height='750px'))
bar2.add_xaxis(danmu_name)
bar2.add_yaxis('Count', danmu_num)
bar2.set_global_opts(title_opts=opts.TitleOpts(title='Characters mentioned most in danmaku'),
                     visualmap_opts=opts.VisualMapOpts(max_=20000))
bar2.render()
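Because `calculate_num` joins the keywords into a single regular expression, a short keyword such as '晨' already matches every danmaku that contains '魏晨' or '晨哥', and any keyword containing regex metacharacters would be read as a pattern rather than as literal text. A minimal pure-Python sketch of the same counting idea with the keywords escaped; `count_mentions` and the sample danmaku are hypothetical and not taken from the data set:

```python
import re

def count_mentions(comments, keywords):
    """Count comments that mention at least one keyword (each comment counts once)."""
    # re.escape keeps keywords literal even if they contain characters like '.' or '+'.
    pattern = re.compile("|".join(re.escape(k) for k in keywords))
    return sum(1 for text in comments if pattern.search(text))

# Hypothetical sample danmaku, for illustration only.
comments = ["魏晨好帅", "晨哥来了", "早晨好", "老朝奉是谁", "药不然太惨了"]
weichen = ['药不然', '魏晨', '药二爷', '晨晨', '老魏', '晨哥', '晨']

total = count_mentions(comments, weichen)
without_short = count_mentions(comments, [k for k in weichen if k != '晨'])
print(total, without_short)
```

In this sample, '早晨好' ("good morning") is counted only because of the single-character keyword, which suggests that very short keywords may inflate a character's total.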
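Conclusion
《古董局中局2》 is still airing, so you can also wait until all episodes are out before watching; after all, isn't watching for free even better? C君 also highly recommends trying the 《古董局中局》 board game, which is genuinely a brain-teaser.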
Produced by: CDA 数据分析师 (ID: cdacdacda)
Original link: https://blog.csdn.net/yoggieCDA/article/details/106228954