Adding a watermark to PDFs with Python
Create the watermark template
Create it in WPS
Export it as a PDF
The watermark PDF
Implementation steps
Install the dependency
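pip install "PyPDF2<3"
(The code in this article uses the legacy PdfFileReader/PdfFileWriter API, which was removed in PyPDF2 3.0, so pin a version below 3.)
Code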
import os

# Legacy API: requires PyPDF2 < 3.0 (these names were removed in 3.0).
from PyPDF2 import PdfFileReader as pr
from PyPDF2 import PdfFileWriter as pw


def write_watermark(watermark_pdf_path: str, target_pdf_path: str):
    # Stamp the first page of the watermark PDF onto every page of the target PDF.
    result_pdf = pw()
    pdf_file_name = os.path.basename(target_pdf_path)
    os.makedirs("output", exist_ok=True)
    with open(target_pdf_path, 'rb') as f_target, \
            open(watermark_pdf_path, 'rb') as f_watermark:
        target_pdf = pr(f_target)
        watermark_page = pr(f_watermark).getPage(0)
        for page in range(target_pdf.getNumPages()):
            try:
                # This try/except handles a really troublesome bug
                # that took me a whole day to solve.
                target_pdf.getPage(page).mergePage(watermark_page)
                result_pdf.addPage(target_pdf.getPage(page))
            except Exception:
                # Merging failed: add the watermark page in place of this page.
                result_pdf.addPage(watermark_page)
        # Write while the inputs are still open: PyPDF2 reads page data lazily.
        with open("output/watermarked_" + pdf_file_name, 'wb') as f_out:
            result_pdf.write(f_out)


def folder_pdf_files(folder: str) -> list[str]:
    # Collect every PDF file under the folder, including subfolders.
    file_list = []
    for dirpath, _dirnames, filenames in os.walk(folder):
        for filename in filenames:
            if filename.lower().endswith('.pdf'):
                file_list.append(os.path.join(dirpath, filename))
    print(folder, ": contains", len(file_list), "PDF files")
    return file_list


def group_write_watermark(path_array: list[str], watermark_pdf_path: str):
    # Add the watermark to a batch of PDF files.
    for pdf_path in path_array:
        print(pdf_path, "adding watermark...")
        write_watermark(watermark_pdf_path, pdf_path)
        print("done")


if __name__ == '__main__':
    watermark_pdf_path = "watermark.pdf"
    folder_pdf = "pdf_folder"  # folder containing the PDFs to watermark
    pdf_list = folder_pdf_files(folder_pdf)
    group_write_watermark(pdf_list, watermark_pdf_path)
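The folder-scanning step can also be sketched on its own with pathlib. This is only an illustration: `find_pdfs` is a hypothetical name that does not appear in the script, and it assumes every PDF under the folder, at any depth, should be collected.

```python
from pathlib import Path


def find_pdfs(folder: str) -> list[str]:
    """Return the paths of all PDF files under `folder`, subfolders included."""
    return sorted(
        str(path)
        for path in Path(folder).rglob("*")
        # Compare the suffix case-insensitively so "B.PDF" is found as well.
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

Matching on the full ".pdf" suffix, rather than on the last three characters of the name, avoids picking up files that merely end in "pdf" without being PDF files.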
Problems
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
If you run into this error, the following may help.
Encoding problems when using PyPDF2
Error message
'latin-1' codec can't encode characters in position 8-11: ordinal not in range(256)
This usually means a Chinese character encoding problem has occurred.
Below is the traceback raised when copying a PDF with PyPDF2:
// Error message
<ipython-input-1-4f7e1b354328> in <module>()
     14     output.addPage(p)
     15 with open('D:\\Program Files\\2.pdf', 'wb') as f:
---> 16     output.write(f)

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\pdf.py in write(self, stream)
    499                 md5_hash = md5(key).digest()
    500                 key = md5_hash[:min(16, len(self._encrypt_key) + 5)]
--> 501             obj.writeToStream(stream, key)
    502             stream.write(b_("\nendobj\n"))
    503

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\generic.py in writeToStream(self, stream, encryption_key)
    547             key.writeToStream(stream, encryption_key)
    548             stream.write(b_(" "))
--> 549             value.writeToStream(stream, encryption_key)
    550             stream.write(b_("\n"))
    551         stream.write(b_(">>"))

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\generic.py in writeToStream(self, stream, encryption_key)
    470
    471     def writeToStream(self, stream, encryption_key):
--> 472         stream.write(b_(self))
    473
    474     def readFromStream(stream, pdf):

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\utils.py in b_(s)
    236         return s
    237     else:
--> 238         r = s.encode('latin-1')
    239         if len(s) < 2:
    240             bc[s] = r

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-11: ordinal not in range(256)
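The failure in the last frame can be reproduced without PyPDF2. The snippet below is a minimal sketch: the string is a hypothetical stand-in for whatever Chinese text the PDF contains, not a value taken from the traceback.

```python
# Latin-1 only covers code points 0-255, so Chinese characters cannot be
# encoded with it. "text" is a made-up example value.
text = "/中文"

try:
    text.encode("latin-1")
    failed = False
except UnicodeEncodeError as err:
    failed = True
    print(err.reason)  # ordinal not in range(256)
```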
Solution
1. Modify generic.py in the PyPDF2 package
I use Anaconda, so the path is anaconda\Lib\site-packages\PyPDF2\generic.py
Original code at line 488 of generic.py:
try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    # Name objects should represent irregular characters
    # with a '#' followed by the symbol's hex number
    if not pdf.strict:
        warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
        return NameObject(name)
    else:
        raise utils.PdfReadError("Illegal character in Name Object")
Change it to:
try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    try:
        return NameObject(name.decode('gbk'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")
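The idea behind this change, trying UTF-8 first and falling back to GBK, can be shown in isolation. The function below is a sketch with a hypothetical name (`decode_name`); it is not PyPDF2 code and leaves out the warning and strict-mode handling of the original.

```python
def decode_name(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to GBK if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Names written by Chinese-locale tools are often GBK-encoded.
        # This call can still raise if the bytes are not valid GBK either.
        return raw.decode("gbk")
```

UTF-8 is tried first because GBK-encoded Chinese text is rarely valid UTF-8, so a successful UTF-8 decode is a strong hint that the bytes really are UTF-8.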
2. Modify utils.py in the PyPDF2 package
Original code at line 238 of utils.py:
r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r
Change it to:
try:
    r = s.encode('latin-1')
    if len(s) < 2:
        bc[s] = r
    return r
except Exception as e:
    print(s)
    r = s.encode('utf-8')
    if len(s) < 2:
        bc[s] = r
    return r
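Stripped of the cache (`bc`) and the debug print, the patched logic comes down to an encode with a fallback. The standalone function below is a sketch under that simplification; `encode_text` is a hypothetical name, not a PyPDF2 function.

```python
def encode_text(s: str) -> bytes:
    """Encode as Latin-1 when possible, otherwise as UTF-8."""
    try:
        return s.encode("latin-1")
    except UnicodeEncodeError:
        # Reached for characters above code point 255, e.g. Chinese text.
        return s.encode("utf-8")
```

Catching only UnicodeEncodeError, instead of Exception as in the patch, keeps unrelated errors visible.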
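Problem solved.
Reflections
The novel part of this code is that it walks a folder and adds the watermark to every PDF in it.
Honestly, that first point is nothing special. What I am proudest of is the try/except block in the write_watermark function, which handles the error raised by blank pages in a PDF. It took me a whole day to solve, and I could not find a solution anywhere online, so I had to feel my way forward.
One remaining problem is that this code does not watermark image-based PDFs well: their pages are larger than those of ordinary text PDFs, so the position of the watermark is hard to control. My idea is to enlarge the page size when creating the watermark PDF.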
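A different approach from enlarging the watermark PDF by hand would be to compute, per page, how much the watermark needs to be scaled and shifted. The sketch below covers only that arithmetic; `fit_watermark` is a hypothetical helper, sizes are assumed to be (width, height) pairs in PDF points, and applying the result to a page is left out because it depends on the PyPDF2 version in use.

```python
def fit_watermark(wm_size, page_size):
    """Return (scale, dx, dy) that fits a watermark inside a page, centered.

    Both arguments are (width, height) pairs in the same unit.
    """
    wm_w, wm_h = wm_size
    page_w, page_h = page_size
    # Use the smaller ratio so the scaled watermark stays inside the page.
    scale = min(page_w / wm_w, page_h / wm_h)
    # Split the leftover space evenly to center the watermark.
    dx = (page_w - wm_w * scale) / 2
    dy = (page_h - wm_h * scale) / 2
    return scale, dx, dy
```

For example, a 595 x 842 watermark on a 1190 x 1684 page gives a scale of 2.0 with no offset, while a page with a different aspect ratio gets a nonzero offset on one axis.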