Problem processing a large number of Excel sheets with Python

奶牛在寻找钳子 · posted on 2016-8-15 11:05:47

# -*- coding: utf-8 -*-
import xlrd
import xlwt
import glob
import os
import sys
reload(sys)
sys.setdefaultencoding( "utf-8" )
old_filename_list = []

def filename():
      global old_filename_list
      path = 'E:\\test'
      for root, dirs, files in os.walk(path):
            for i in range(len(files)):
                     old_filename = str(root) + '/' + files[i]
                     old_filename = old_filename.replace('\\','/')
                     old_filename_list.append(old_filename)
filename()
book = xlwt.Workbook(encoding='utf-8',style_compression=0)
sheet = book.add_sheet('1',cell_overwrite_ok=True)
name_list= []
list = [3,5,6,8,12,13] # columns 4, 6, 7, 9, 13 and 14 of the source Excel sheet
list_1 = [3,4,5,6,7,8] # columns 4, 5, 6, 7, 8 and 9 of the output Excel sheet
def read_list():
      global list
      global list_1
      global name_list
      for file in old_filename_list :
            #print file
            # filename = os.path.basename(file)
            if ('xls' in file):
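                     workbook = xlrd.open_workbook(file) # open the workbook
                     sheet2 = workbook.sheet_by_index(1) # take the second worksheet
                     ncols = sheet2.ncols
                     for i in list: # loop over the selected columns
                              cols = sheet2.col_values(i) # read all values of the column
                              for y in cols[1:]: # skip the first row and collect the rest
                                    a = str(y).encode("utf-8")
                                    b = a + '\n'
                                    name_list.append(b) # build the list of values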
                     s = 0
                     for x in range(len(name_list)): # write the values read above, one per iteration
                              sheet.write(x,list_1[s],name_list[x]) # row x, column list_1[s], value name_list[x]
                              if (s < 6):
                                    s = s + 1
                              else:
                                    s = 0
                              book.save('E:\\13.xls')
                     print name_list
                     print old_filename_list
            name_list = []
read_list()
Error:
Traceback (most recent call last):
  File "E:\test\test.py", line 67, in <module>
    read_list()
  File "E:\test\test.py", line 58, in read_list
    sheet.write(x,list_1[s],name_list[x]) # row x, column list_1[s], value name_list[x]
IndexError: list index out of range
[Finished in 0.2s]

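What I want is a script that can process a large number of Excel workbooks: read every value in the specified columns of the second sheet of each workbook, create a new Excel workbook, import the columns that were read in order, and save it. All the workbooks share the same format, and I would like the later workbooks to be appended in order to the new workbook as well, without overwriting anything.
The problem right now is that it raises this index out of range error, and it also keeps overwriting the data imported earlier.
Could someone more experienced give me a hint and help me solve this?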

落叶秋风 · posted on 2016-8-15 22:18:23

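I looked at your code and the error, and found the following mistakes.
First, your list_1 has only 6 elements, with indices running from 0 to 5, but in this part of your code
for x in range(len(name_list)): # write the values read above, one per iteration
                              sheet.write(x,list_1[s],name_list[x]) # row x, column list_1[s], value name_list[x]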
                              if (s < 6):
                                    s = s + 1
                              else:
                                    s = 0
                              book.save('E:\\13.xls')
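I can tell you exactly what happens: when s has reached 5, the check s < 6 still passes, so s is incremented to 6. On the next pass through the loop the code evaluates list_1[6], which is past the end of the list and raises the IndexError. That is the first mistake.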
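The off-by-one described above can be reproduced without any Excel files. The following is a minimal sketch, not the original poster's script: `buggy_indices` and `wrapped_indices` are hypothetical helper names introduced only for illustration, and the modulo wrap is one possible way to keep the counter in range.

```python
list_1 = [3, 4, 5, 6, 7, 8]  # destination columns, as in the original post


def buggy_indices(n):
    """Replay the counter from the question for n iterations."""
    s = 0
    seen = []
    for _ in range(n):
        seen.append(s)      # this is the index that list_1[s] would use
        if s < 6:           # still true when s == 5, so s becomes 6
            s = s + 1
        else:
            s = 0
    return seen


def wrapped_indices(n):
    """Wrap with modulo so the index always stays in 0..len(list_1)-1."""
    return [i % len(list_1) for i in range(n)]


print(buggy_indices(8))    # [0, 1, 2, 3, 4, 5, 6, 0]; index 6 is out of range
print(wrapped_indices(8))  # [0, 1, 2, 3, 4, 5, 0, 1]
```

Changing the test to `s < 5`, or to `s < len(list_1) - 1`, would also keep the counter in range. Note that an in-range counter only removes the IndexError; it does not by itself put each value in the intended row and column.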
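As for your second problem, the overwriting, the cause is in the same place. You save to E:\\13.xls on every pass of the for loop, so however many values you loop over, the same file name is written again and again and the previous file is replaced. The row index x also restarts at 0 for every workbook, so each workbook's values land on the same rows as the previous one's. The root of these problems is not so much the individual statements as the logic behind them. Sorry it took so long to answer; you may well have solved it by now, but if you have new questions, feel free to post them on the forum. Exchanging ideas is how we improve.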
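One way to restructure the logic is to keep a running row offset across workbooks and to save only once at the end. This is a sketch under assumptions, not a tested replacement for the original script: `plan_cells` and `merge_workbooks` are hypothetical names, the column lists are copied from the post, the code is written for Python 3, and it assumes every workbook has a second sheet with one header row.

```python
SRC_COLS = [3, 5, 6, 8, 12, 13]  # columns to read (0-based), from the post
DST_COLS = [3, 4, 5, 6, 7, 8]    # columns to write (0-based), from the post


def plan_cells(tables, dst_cols=DST_COLS):
    """Yield (row, col, value) for every cell of the merged sheet.

    `tables` holds one entry per workbook. Each entry is a list of columns,
    and each column is a list of values with the header row already removed.
    """
    row_offset = 0
    for columns in tables:
        for col, values in zip(dst_cols, columns):
            for r, value in enumerate(values):
                yield row_offset + r, col, value
        # The next workbook starts below the tallest column of this one,
        # so earlier rows are never written a second time.
        row_offset += max((len(v) for v in columns), default=0)


def merge_workbooks(paths, out_path):
    """Read sheet 2 of each .xls file in `paths` and write one merged file."""
    import xlrd  # third-party; imported here so plan_cells runs without it
    import xlwt  # third-party
    tables = []
    for path in paths:
        sheet = xlrd.open_workbook(path).sheet_by_index(1)
        # col_values(colx, start_rowx): start at row 1 to skip the header.
        tables.append([sheet.col_values(c, 1) for c in SRC_COLS])
    book = xlwt.Workbook(encoding='utf-8')
    out = book.add_sheet('1')
    for row, col, value in plan_cells(tables):
        out.write(row, col, value)
    book.save(out_path)  # save once, after every cell has been written
```

A call such as `merge_workbooks(old_filename_list, 'E:\\13.xls')` would tie this back to the script in the question, provided the list contains only .xls paths. Because no cell is written twice, `cell_overwrite_ok=True` is not needed. Keep in mind that the .xls format written by xlwt allows at most 65,536 rows per sheet, which matters when many workbooks are merged.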
