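Converting txt/Word/PDF files to images for online preview in Java
If you don't want the articles on your web pages to be copied (yes, I am looking at a certain site), or you want documents to be previewable online without downloading them first (common on paid document-download sites and in mailbox attachment previews), what can you do? A common approach is to convert the documents to images. The code below is based on aspose-words (for converting txt and Word files to images) and pdfbox (for converting PDF files to images), wrapped in a utility class that covers txt, Word and PDF conversion.
First, add the following two dependencies to the project's pom file: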
<dependency>
    <groupId>com.luhuiguo</groupId>
    <artifactId>aspose-words</artifactId>
    <version>23.1</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.4</version>
</dependency>
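I. Converting files to images and saving them locally
1. Converting a Word file to images
// Convert a Word document to one PNG per page inside the imagePath directory
public static void wordToImage(String wordPath, String imagePath) throws Exception {
    Document doc = new Document(wordPath);
    File file = new File(wordPath);
    String filename = file.getName();
    // Output prefix: <imagePath>/<file name without extension>
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    int pageCount = doc.getPageCount();
    for (int i = 0; i < pageCount; i++) {
        // Extract page i (0-based) as its own document and save it as PNG
        Document extractedPage = doc.extractPages(i, 1);
        String path = pathPre + (i + 1) + ".png";
        extractedPage.save(path, SaveFormat.PNG);
    }
}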
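The prefix logic above assumes that the file name contains a dot and that the output directory already exists. Here is a minimal sketch of a more defensive helper using only the JDK. The class and method names (`PageOutputPaths`, `baseName`, `pageFile`) are hypothetical and not part of the original utility class:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of FileConvertUtil.
final class PageOutputPaths {

    private PageOutputPaths() {
    }

    /** Returns the file name without its last extension; names without one are returned unchanged. */
    static String baseName(String sourcePath) {
        String name = Paths.get(sourcePath).getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    /** Builds <outputDir>/<baseName><pageNumber>.png and creates the output directory if it is missing. */
    static Path pageFile(String sourcePath, String outputDir, int pageIndex) throws IOException {
        Path dir = Files.createDirectories(Paths.get(outputDir));
        return dir.resolve(baseName(sourcePath) + (pageIndex + 1) + ".png");
    }
}
```

With such a helper, the loop body could pass `PageOutputPaths.pageFile(wordPath, imagePath, i).toString()` to `save` instead of concatenating strings by hand.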
Test:
public static void main(String[] args) throws Exception {
    FileConvertUtil.wordToImage("D:\\書籍\\電子書\\其它\\《山海經》異獸圖.doc", "D:\\test\\word");
}
Test result: each page of the document is saved as a numbered PNG file under D:\test\word.
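2. Converting a txt file to images (same as the Word conversion)
// Aspose.Words can load plain text directly, so this simply delegates to wordToImage
public static void txtToImage(String txtPath, String imagePath) throws Exception {
    wordToImage(txtPath, imagePath);
}
Test:
public static void main(String[] args) throws Exception {
    // txtToImage is called the same way as wordToImage;
    // pass the path of a .txt file to FileConvertUtil.txtToImage to convert plain text
    FileConvertUtil.wordToImage("D:\\書籍\\電子書\\其它\\《山海經》異獸圖.doc", "D:\\test\\word");
}
Test result: as with the Word conversion, one numbered PNG file per page is written to the output directory.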
3. Converting a PDF file to images
// Render every page of a PDF to a PNG file inside the imagePath directory
public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
    File file = new File(pdfPath);
    String filename = file.getName();
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    // try-with-resources closes the document even if rendering fails
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        for (int i = 0; i < doc.getNumberOfPages(); i++) {
            // 144 DPI is twice the PDF default of 72 units per inch
            BufferedImage image = renderer.renderImageWithDPI(i, 144);
            String pathname = pathPre + (i + 1) + ".png";
            ImageIO.write(image, "PNG", new File(pathname));
        }
    }
}
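The DPI argument decides how large each rendered image is. PDF page sizes are expressed in points, with 72 points per inch, so as far as I know `renderImageWithDPI` scales a page by `dpi / 72`. The sketch below just works through that arithmetic. The class name `DpiMath` is hypothetical, and the floor-based rounding is an assumption rather than a statement about PDFBox internals:

```java
// Hypothetical helper for estimating rendered image sizes; not part of PDFBox.
final class DpiMath {

    /** PDF user space defaults to 72 units (points) per inch. */
    static final float POINTS_PER_INCH = 72f;

    private DpiMath() {
    }

    /** Scale factor applied to page dimensions for a given DPI. */
    static float scaleFor(float dpi) {
        return dpi / POINTS_PER_INCH;
    }

    /** Approximate pixel length for a page length in points (rounding mode is an assumption). */
    static int pixels(float points, float dpi) {
        return Math.max((int) Math.floor(points * scaleFor(dpi)), 1);
    }

    public static void main(String[] args) {
        // A US Letter page is 612 x 792 points; at 144 DPI this prints "1224 x 1584".
        System.out.println(pixels(612f, 144f) + " x " + pixels(792f, 144f));
    }
}
```

Doubling the DPI quadruples the pixel count per page, which is worth keeping in mind when converting long documents.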
Test:
public static void main(String[] args) throws Exception {
    FileConvertUtil.pdfToImage("D:\\書籍\\電子書\\其它\\自然哲學的數學原理.pdf", "D:\\test\\pdf");
}
Test result: one numbered PNG file per page is written to D:\test\pdf.
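4. Supporting several file types at once
// Dispatch to the right converter based on the file extension
public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
    int dot = sourceFilePath.lastIndexOf(".");
    // Lower-case the extension so ".PDF" and ".pdf" are treated alike; no dot means no extension
    String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
    switch (ext) {
        case ".doc":
        case ".docx":
            wordToImage(sourceFilePath, imagePath);
            break;
        case ".pdf":
            pdfToImage(sourceFilePath, imagePath);
            break;
        case ".txt":
            txtToImage(sourceFilePath, imagePath);
            break;
        default:
            System.out.println("Unsupported file format");
    }
}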
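The dispatch above hinges on how the extension is extracted. Here is a small sketch of that step on its own, using an enum so that callers can tell an unsupported file apart from a supported one. `FileKinds` and `Kind` are hypothetical names introduced only for illustration:

```java
import java.util.Locale;

// Hypothetical classifier; the original utility switches on the raw extension string instead.
final class FileKinds {

    enum Kind { WORD, PDF, TXT, UNSUPPORTED }

    private FileKinds() {
    }

    /** Classifies a path by its extension, ignoring case and dots that belong to directory names. */
    static Kind of(String path) {
        int dot = path.lastIndexOf('.');
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        if (dot < 0 || dot < sep) {
            return Kind.UNSUPPORTED; // no extension in the file name itself
        }
        switch (path.substring(dot).toLowerCase(Locale.ROOT)) {
            case ".doc":
            case ".docx":
                return Kind.WORD;
            case ".pdf":
                return Kind.PDF;
            case ".txt":
                return Kind.TXT;
            default:
                return Kind.UNSUPPORTED;
        }
    }
}
```

Returning a value instead of printing a message lets the caller decide whether an unsupported file should be skipped, logged or reported as an error.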
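II. Using multithreading to speed up writing files locally
When converting Newton's 669-page masterpiece, the Mathematical Principles of Natural Philosophy (自然哲學的數學原理.pdf), I noticed that the run took a long time: 140,281 ms. IO-intensive work like this can be sped up with multiple threads, so I wrote a multithreaded version.
Elapsed time for the synchronous export of 自然哲學的數學原理.pdf: 140,281 ms.
The optimized code is as follows: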
// Convert a PDF to images using a thread pool (one task per page)
public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
    long old = System.currentTimeMillis();
    File file = new File(pdfPath);
    String filename = file.getName();
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    // try-with-resources closes the document once all tasks have finished
    try (PDDocument doc = PDDocument.load(file)) {
        // Caution: the PDFBox FAQ states that only one thread may access a single PDDocument
        // at a time, so sharing doc/renderer across workers as done here may cause
        // intermittent errors. A safer variant loads a separate PDDocument per worker.
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < pageCount; i++) {
            int finalI = i; // lambdas can only capture effectively final variables
            executorService.submit(() -> {
                try {
                    BufferedImage image = renderer.renderImageWithDPI(finalI, 144);
                    String pathname = pathPre + (finalI + 1) + ".png";
                    ImageIO.write(image, "PNG", new File(pathname));
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        // Stop accepting new tasks, then block until every page has been written
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }
    long now = System.currentTimeMillis();
    System.out.println("pdfToImageAsync (multithreaded) finished, elapsed: " + (now - old) + "ms");
}
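Elapsed time for the multithreaded export of 自然哲學的數學原理.pdf: 24,045 ms.
This run took only about one sixth of the original time, a large improvement. Besides PDF, the Word and txt conversions can be given the same multithreaded treatment: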
// Convert a Word document to images (multithreaded)
public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
    Document doc = new Document(wordPath);
    File file = new File(wordPath);
    String filename = file.getName();
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    int pageCount = doc.getPageCount();
    int numCores = Runtime.getRuntime().availableProcessors();
    ExecutorService executorService = Executors.newFixedThreadPool(numCores);
    for (int i = 0; i < pageCount; i++) {
        int finalI = i;
        executorService.submit(() -> {
            try {
                Document extractedPage = doc.extractPages(finalI, 1);
                String path = pathPre + (finalI + 1) + ".png";
                extractedPage.save(path, SaveFormat.PNG);
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        });
    }
    // Without shutdown/awaitTermination the method would return before the pages are written
    // and the pool's non-daemon threads would keep the JVM alive
    executorService.shutdown();
    executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}

// Convert a txt file to images (multithreaded)
public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
    wordToImageAsync(txtPath, imagePath);
}
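The multithreaded versions above submit one task per page against a single shared document, and they swallow exceptions inside the tasks. Since the PDFBox FAQ says a single `PDDocument` should only be accessed by one thread at a time, one alternative is to give each worker a contiguous page range and let it open its own copy of the document. Below is a minimal JDK-only sketch of that scheduling part. `PageRangeRunner`, `RangeTask`, `split` and `runAll` are hypothetical names, and the actual rendering is left to the callback:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical scheduling helper; it knows nothing about PDFBox or Aspose.
final class PageRangeRunner {

    /** Hypothetical callback: processes pages [from, to) with resources owned by the calling worker. */
    interface RangeTask {
        void run(int fromInclusive, int toExclusive) throws Exception;
    }

    private PageRangeRunner() {
    }

    /** Splits [0, pageCount) into at most `workers` contiguous ranges of near-equal size. */
    static List<int[]> split(int pageCount, int workers) {
        List<int[]> ranges = new ArrayList<>();
        int parts = Math.max(1, Math.min(workers, pageCount));
        int base = pageCount / parts;
        int extra = pageCount % parts;
        int start = 0;
        for (int p = 0; p < parts && start < pageCount; p++) {
            int size = base + (p < extra ? 1 : 0); // the first `extra` ranges get one more page
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    /** Runs the task once per range on its own thread and rethrows the first worker failure. */
    static void runAll(int pageCount, int workers, RangeTask task) throws Exception {
        List<int[]> ranges = split(pageCount, workers);
        if (ranges.isEmpty()) {
            return;
        }
        ExecutorService pool = Executors.newFixedThreadPool(ranges.size());
        try {
            List<Callable<Void>> jobs = new ArrayList<>();
            for (int[] r : ranges) {
                jobs.add(() -> {
                    task.run(r[0], r[1]);
                    return null;
                });
            }
            // invokeAll blocks until every job has completed
            for (Future<Void> f : pool.invokeAll(jobs)) {
                f.get(); // surfaces a worker exception as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Inside the callback, a worker would load the PDF itself, render pages `from` to `to - 1`, and close its document afterwards. Whether this is faster than the shared-document version depends on how expensive loading the file is, so it is worth measuring on your own documents.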
III. Converting files to image streams
Sometimes we don't need to write the images to local disk after conversion. Instead we want to return them to the caller or upload them to an image server, in which case it is more convenient to return the converted images as byte streams. Example code:
1. Converting a Word file to image streams
// Convert a Word document to a list of PNG byte arrays, one per page
public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
    Document doc = new Document(wordPath);
    List<byte[]> list = new ArrayList<>();
    for (int i = 0; i < doc.getPageCount(); i++) {
        try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
            Document extractedPage = doc.extractPages(i, 1);
            extractedPage.save(outputStream, SaveFormat.PNG);
            list.add(outputStream.toByteArray());
        }
    }
    return list;
}
2. Converting a txt file to image streams
public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
    return wordToImageStream(txtPath);
}
3. Converting a PDF to image streams
// Convert a PDF to a list of PNG byte arrays, one per page
public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
    File file = new File(pdfPath);
    List<byte[]> list = new ArrayList<>();
    // try-with-resources closes the document even if rendering fails
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        for (int i = 0; i < doc.getNumberOfPages(); i++) {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                // 144 DPI is twice the PDF default of 72 units per inch
                BufferedImage image = renderer.renderImageWithDPI(i, 144);
                ImageIO.write(image, "PNG", outputStream);
                list.add(outputStream.toByteArray());
            }
        }
    }
    return list;
}
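Each entry in the returned list is simply a PNG file held in memory. The following sketch, which uses only the JDK, shows the encode step in isolation along with two sanity checks a caller might run before uploading the bytes. `PngBytes` and its methods are hypothetical helpers, not part of the utility class:

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper illustrating the in-memory PNG round trip.
final class PngBytes {

    private PngBytes() {
    }

    /** Encodes an image as PNG into a byte array. */
    static byte[] encode(BufferedImage image) throws IOException {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            // ImageIO.write returns false when no suitable writer is found
            if (!ImageIO.write(image, "PNG", out)) {
                throw new IOException("No PNG writer available");
            }
            return out.toByteArray();
        }
    }

    /** Decodes image bytes back into a BufferedImage, failing on unreadable data. */
    static BufferedImage decode(byte[] data) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
        if (image == null) {
            throw new IOException("Not a readable image");
        }
        return image;
    }

    /** Checks for the fixed 8-byte signature that every PNG file starts with. */
    static boolean hasPngSignature(byte[] data) {
        byte[] sig = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        if (data == null || data.length < sig.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            if (data[i] != sig[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Note that the original code ignores the boolean returned by `ImageIO.write`; checking it, as `encode` does here, avoids silently adding an empty byte array to the list.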
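4. Supporting several file types for image streams
// Dispatch to the right stream converter based on the file extension
public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
    int dot = sourceFilePath.lastIndexOf(".");
    // Lower-case the extension so ".PDF" and ".pdf" are treated alike; no dot means no extension
    String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
    switch (ext) {
        case ".doc":
        case ".docx":
            return wordToImageStream(sourceFilePath);
        case ".pdf":
            return pdfToImageStream(sourceFilePath);
        case ".txt":
            return txtToImageStream(sourceFilePath);
        default:
            System.out.println("Unsupported file format");
    }
    // Callers must handle null for unsupported formats
    return null;
}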
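For the online-preview use case, one simple way to show these byte arrays in a browser is to embed them as Base64 data URIs. The sketch below assumes the list holds PNG bytes as produced above. `PreviewHtml`, its method names and the `preview` CSS class are hypothetical, chosen only for illustration:

```java
import java.util.Base64;
import java.util.List;

// Hypothetical helper that turns page images into an HTML fragment.
final class PreviewHtml {

    private PreviewHtml() {
    }

    /** Wraps PNG bytes in a data URI that an <img> tag can display directly. */
    static String toDataUri(byte[] png) {
        return "data:image/png;base64," + Base64.getEncoder().encodeToString(png);
    }

    /** Builds an HTML fragment with one <img> element per page, in page order. */
    static String toHtml(List<byte[]> pages) {
        StringBuilder html = new StringBuilder("<div class=\"preview\">\n");
        for (int i = 0; i < pages.size(); i++) {
            html.append("  <img alt=\"page ").append(i + 1)
                .append("\" src=\"").append(toDataUri(pages.get(i))).append("\">\n");
        }
        return html.append("</div>").toString();
    }
}
```

Base64 inflates the payload by roughly a third, so for long documents it is probably better to serve each page from its own URL and load pages lazily. Inline data URIs are mainly convenient for short documents.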
Finally, here is the complete utility class:
package com.fhey.service.common.utils.file;

import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FileConvertUtil {

    // Convert a file to images, choosing the converter by extension
    public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
        int dot = sourceFilePath.lastIndexOf(".");
        String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                wordToImage(sourceFilePath, imagePath);
                break;
            case ".pdf":
                pdfToImage(sourceFilePath, imagePath);
                break;
            case ".txt":
                txtToImage(sourceFilePath, imagePath);
                break;
            default:
                System.out.println("Unsupported file format");
        }
    }

    // Convert a PDF to images
    public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
        File file = new File(pdfPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                // 144 DPI is twice the PDF default of 72 units per inch
                BufferedImage image = renderer.renderImageWithDPI(i, 144);
                String pathname = pathPre + (i + 1) + ".png";
                ImageIO.write(image, "PNG", new File(pathname));
            }
        }
    }

    // Convert a txt file to images
    public static void txtToImage(String txtPath, String imagePath) throws Exception {
        wordToImage(txtPath, imagePath);
    }

    // Convert a Word document to images
    public static void wordToImage(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        int pageCount = doc.getPageCount();
        for (int i = 0; i < pageCount; i++) {
            Document extractedPage = doc.extractPages(i, 1);
            String path = pathPre + (i + 1) + ".png";
            extractedPage.save(path, SaveFormat.PNG);
        }
    }

    // Convert a PDF to images (multithreaded)
    public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
        long old = System.currentTimeMillis();
        File file = new File(pdfPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        try (PDDocument doc = PDDocument.load(file)) {
            // Caution: the PDFBox FAQ states that only one thread may access a single
            // PDDocument at a time, so sharing doc/renderer across workers may cause
            // intermittent errors. A safer variant loads a separate PDDocument per worker.
            PDFRenderer renderer = new PDFRenderer(doc);
            int pageCount = doc.getNumberOfPages();
            int numCores = Runtime.getRuntime().availableProcessors();
            ExecutorService executorService = Executors.newFixedThreadPool(numCores);
            for (int i = 0; i < pageCount; i++) {
                int finalI = i;
                executorService.submit(() -> {
                    try {
                        BufferedImage image = renderer.renderImageWithDPI(finalI, 144);
                        String pathname = pathPre + (finalI + 1) + ".png";
                        ImageIO.write(image, "PNG", new File(pathname));
                    } catch (Exception ex) {
                        ex.printStackTrace();
                    }
                });
            }
            executorService.shutdown();
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        }
        long now = System.currentTimeMillis();
        System.out.println("pdfToImageAsync (multithreaded) finished, elapsed: " + (now - old) + "ms");
    }

    // Convert a Word document to images (multithreaded)
    public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        int pageCount = doc.getPageCount();
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < pageCount; i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    Document extractedPage = doc.extractPages(finalI, 1);
                    String path = pathPre + (finalI + 1) + ".png";
                    extractedPage.save(path, SaveFormat.PNG);
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        // Wait for all pages; otherwise the method returns early and the pool keeps the JVM alive
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    // Convert a txt file to images (multithreaded)
    public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
        wordToImageAsync(txtPath, imagePath);
    }

    // Convert a file to image streams, choosing the converter by extension
    public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
        int dot = sourceFilePath.lastIndexOf(".");
        String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                return wordToImageStream(sourceFilePath);
            case ".pdf":
                return pdfToImageStream(sourceFilePath);
            case ".txt":
                return txtToImageStream(sourceFilePath);
            default:
                System.out.println("Unsupported file format");
        }
        // Callers must handle null for unsupported formats
        return null;
    }

    // Convert a PDF to image streams
    public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
        File file = new File(pdfPath);
        List<byte[]> list = new ArrayList<>();
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                    BufferedImage image = renderer.renderImageWithDPI(i, 144);
                    ImageIO.write(image, "PNG", outputStream);
                    list.add(outputStream.toByteArray());
                }
            }
        }
        return list;
    }

    // Convert a Word document to image streams
    public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
        Document doc = new Document(wordPath);
        List<byte[]> list = new ArrayList<>();
        for (int i = 0; i < doc.getPageCount(); i++) {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                Document extractedPage = doc.extractPages(i, 1);
                extractedPage.save(outputStream, SaveFormat.PNG);
                list.add(outputStream.toByteArray());
            }
        }
        return list;
    }

    // Convert a txt file to image streams
    public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
        return wordToImageStream(txtPath);
    }
}