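Teacher, I have a question: when I use Python with the pdfplumber module to read electronic invoices (fapiao), the data on some invoices cannot be extracted, yet other PDFs are read completely. What could be causing this?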
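# -*- coding: UTF-8 -*-
import os
import re
import pdfplumber as pdf

filename = "fapiao001.pdf"

# Open the PDF in a with-block so the file is closed when we are done.
with pdf.open(filename) as file:
    for page in file.pages:
        print("+" * 25)
        # Plain text of the page.
        text = page.extract_text()
        print(text)
        print("+" * 25)
        # Individual words together with their coordinates.
        words = page.extract_words()
        print(words)
        print("+" * 25)
        # extract_table() returns None when no table is found, so guard the loop.
        for tb in page.extract_table() or []:
            print(tb)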
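
One way to narrow this down, offered as a guess rather than a confirmed diagnosis: pdfplumber can only extract text that is stored in the PDF as character objects. If an invoice is a scanned image, or its text has been converted to vector outlines, there is nothing for `extract_text()` to return. The sketch below counts the characters and images on each page using the `chars` and `images` attributes of a pdfplumber page. The function names `diagnose_page` and `diagnose_pdf` and the wording of the verdicts are made up for illustration.

```python
def diagnose_page(page):
    """Report what kind of content a pdfplumber page holds.

    Only the `chars` and `images` attributes of the page are used.
    Returns (number of characters, number of images, verdict string).
    """
    n_chars = len(page.chars)
    n_images = len(page.images)
    if n_chars > 0:
        verdict = "text layer present"
    elif n_images > 0:
        # Assumption: a page with images but no characters is probably a scan.
        verdict = "no text layer, images only (possibly scanned, OCR may be needed)"
    else:
        verdict = "no characters or images found (text may be drawn as outlines)"
    return n_chars, n_images, verdict


def diagnose_pdf(path):
    """Run diagnose_page on every page of the PDF at `path`."""
    # Imported here so that diagnose_page can be used without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(path) as doc:
        return [diagnose_page(page) for page in doc.pages]
```

Calling `diagnose_pdf("fapiao001.pdf")` returns one tuple per page. If the invoices that fail report zero characters, the problem most likely lies in how those PDFs were produced and not in the script. If they do report characters but the output is garbled or incomplete, the fonts embedded in the file may be the thing to look at next.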