[Help] Processing a JSON file with Python

奥拉夫 posted on 2017-7-24 22:39:59
Here is part of the raw data. When opened in Spyder, the JSON file looks like this:
{"msgid":"8280204259419051","msgpriority":0,"msgtext":"PTV-8698|8280204259419051|function () {}|1498722414739|257|!206!3041!0!0!1!CCTV-1 综合(高清)","receiverid":"data","senderid":"8280204259419051","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|599925|257|1!5!501!0!0!2!CCTV-2 财经","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8280203295899516","msgpriority":0,"msgtext":"OTS_4K_SC|8280203295899516|00-23-b8-d6-9d-f1|139169|257|1!5!500!0!0!1!CCTV-1 综合","receiverid":"data","senderid":"8280203295899516","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|747643|49|影视/剧场&logos=/poster/201705261441546372.jpg!index.html/second.html!剧场&logos=/poster/201705261441546372.jpg","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|751253|771|1!2591697!剧场/年度新剧!1!10","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|753289|772|_A1004457072!欢乐颂!01!0!0!1004457072!剧场/年度新剧!2582!1!0","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8280203295899516","msgpriority":0,"msgtext":"OTS_4K_SC|8280203295899516|00-23-b8-d6-9d-f1|182526|772|TVMA214976_A1003122726!回魂夜(香港 1995年!)!0!0!1003122726!!4734!1!0","receiverid":"data","senderid":"8280203295899516","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|835216|49|影视/!index.html/!","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8510010615009789","msgpriority":0,"msgtext":"|8510010615009789|18-99-f5-ea-ed-61|1498722859743|774|1005273535!楚乔传:末路逢生(6)!湖南卫视高清!20170628!22:59:24!0!1","receiverid":"data","senderid":"8510010615009789","subjectid":"data.stb.report"}
{"msgid":"8280204143293241","msgpriority":0,"msgtext":"DVC-7078|8280204143293241||1498723024614|257|1!206!3041!0!0!1!CCTV-1 综合(高清)","receiverid":"data","senderid":"8280204143293241","subjectid":"data.stb.report"}


How can I process this JSON file? I want to extract the msgtext field, split it on '|' and '!', and then export the result to an Excel or CSV file.



zxy posted on 2017-7-25 09:56:37
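    import json
    import re

    # utf-8 is assumed for the file encoding.
    with open('file.txt', encoding='utf-8') as f:
        for line in f:
            # Each line is one JSON object; json.loads is safer than eval here.
            d = json.loads(line)
            # Split msgtext on either '|' or '!'.
            print(re.split(r'[|!]', d['msgtext']))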
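To make the splitting step in this reply concrete, here is a minimal sketch run on one record copied from the sample data in the question. The helper name `split_msgtext` is made up for illustration. It assumes every line is a valid JSON object with a `msgtext` key, and it keeps empty fields (such as the one between `||`) so that column positions stay aligned across rows.

```python
import json
import re

# One record copied from the sample data in the original post.
sample = (
    '{"msgid":"0","msgpriority":0,'
    '"msgtext":"HC3100||ec-f4-bb-da-4c-e4|599925|257|1!5!501!0!0!2!CCTV-2 财经",'
    '"receiverid":"data","senderid":"0","subjectid":"data.stb.report"}'
)


def split_msgtext(line):
    """Parse one JSON line and split its msgtext on '|' and '!' (hypothetical helper)."""
    record = json.loads(line)
    # The character class [|!] matches either delimiter; empty fields are kept.
    return re.split(r'[|!]', record['msgtext'])


fields = split_msgtext(sample)
print(fields)
```

The second element of `fields` is an empty string, which corresponds to the missing value between the two `|` characters in this record.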
chenmengdan posted on 2017-7-25 10:03:28
True, you could also just save it as a different file type.
zps26 posted on 2017-7-25 11:34:17
Last edited by zps26 on 2017-7-25 15:37
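    import re

    # Read the whole file as one string; utf-8 is assumed because the data contains Chinese text.
    with open(r'C:\Users\zps\Desktop\data.json', 'r', encoding='utf-8') as f:
        s = f.read()
        print(type(s), 's:', s, sep='\n')
    # Collapse '||' into '|', then turn every '|' into '!' so only one delimiter remains.
    # Note: collapsing '||' drops empty fields, so columns may shift between rows.
    s2 = s.replace('||', '|')
    s3 = s2.replace('|', '!')
    print('s2:', s2, 's3:', s3, sep='\n')
    # Capture only the msgtext value, stopping at the quote before "receiverid".
    p = re.compile(r'"msgtext":"(.*?)","receiverid"')
    listtext = re.findall(p, s3)
    print(len(listtext), 'listtext:', listtext, sep='\n')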


Splitting the items in listtext and writing them to Excel is left for you to finish yourself.
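For the remaining step, one possible sketch of the CSV export uses only the standard library. The function names `msgtext_rows` and `export_csv` are made up for illustration. It assumes the input holds one JSON object per line and is encoded as UTF-8; the output is written as `utf-8-sig` so that Excel is more likely to display the Chinese text correctly. This parses each line with `json.loads` rather than the regex approach above, so it is an alternative, not the method from this reply.

```python
import csv
import json
import re


def msgtext_rows(lines):
    """Yield the split msgtext fields for each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Split on '|' or '!'; empty fields are kept so columns stay aligned.
        yield re.split(r'[|!]', record['msgtext'])


def export_csv(in_path, out_path):
    """Write one CSV row per record and return the number of rows written."""
    count = 0
    with open(in_path, encoding='utf-8') as src, \
            open(out_path, 'w', newline='', encoding='utf-8-sig') as dst:
        writer = csv.writer(dst)
        for row in msgtext_rows(src):
            writer.writerow(row)
            count += 1
    return count
```

A call such as `export_csv('data.json', 'msgtext.csv')` (both file names are placeholders) would write the rows to a CSV file that Excel can open. Rows will have different lengths because the records carry different numbers of fields.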
