Converting txt/Word/PDF files to images in Java for online preview

Updated: 2023-05-30 10:46:04 Author: 知北游z

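This article wraps aspose-words (for txt and Word to image conversion) and pdfbox (for PDF to image conversion) in a utility class that converts txt, Word, and PDF files to images so they can be previewed online. Use it as a reference if you need this.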

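What if you don't want the articles on your web page to be copied (yes, I'm talking about a certain site), or you want documents to be previewable online without being downloaded (common on paid document download sites and in email attachment previews)? A common approach is to convert them to images. The code below is based on aspose-words (for txt and Word to image conversion) and pdfbox (for PDF to image conversion), wrapped in a utility class that converts txt, Word, and PDF files to images.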

First, add the following two dependencies to the project's pom file:

<dependency>
    <groupId>com.luhuiguo</groupId>
    <artifactId>aspose-words</artifactId>
    <version>23.1</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.4</version>
</dependency>

I. Converting files to images and writing them to local disk

1. Converting a Word file to images

public static void wordToImage(String wordPath, String imagePath) throws Exception {
    Document doc = new Document(wordPath);
    File file = new File(wordPath);
    String filename = file.getName();
    // Output prefix: <imagePath>/<file name without extension>
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    for (int i = 0; i < doc.getPageCount(); i++) {
        // Extract one page at a time and save it as its own PNG
        Document extractedPage = doc.extractPages(i, 1);
        String path = pathPre + (i + 1) + ".png";
        extractedPage.save(path, SaveFormat.PNG);
    }
}
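
The method above assumes that the output directory already exists and that the file name contains a dot; if either is not true, the call may fail. As a minimal sketch of a more defensive naming step, the helper below builds the same `<name><page>.png` paths using only the JDK and creates the target directory first. The class name PageImagePaths and its methods are hypothetical and not part of the original utility.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PageImagePaths {

    /** Strips the last extension, if any: "report.docx" -> "report", "README" -> "README". */
    public static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Builds <imageDir>/<baseName><pageNumber>.png, the naming scheme used above. */
    public static Path pageImagePath(String sourcePath, String imageDir, int pageNumber) {
        String fileName = Paths.get(sourcePath).getFileName().toString();
        return Paths.get(imageDir, baseName(fileName) + pageNumber + ".png");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output location inside a temp directory
        Path dir = Files.createTempDirectory("preview-demo").resolve("word");
        Files.createDirectories(dir); // no-op if the directory already exists
        System.out.println(pageImagePath("docs/report.docx", dir.toString(), 1));
    }
}
```

Calling `Files.createDirectories` before the conversion loop is cheap and avoids a failure on the first page when the target folder is missing.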

Test:

public static void main(String[] args) throws Exception {
        FileConvertUtil.wordToImage("D:\\書籍\\電子書\\其它\\《山海經(jīng)》異獸圖.doc", "D:\\test\\word");
    }

Result: (screenshot omitted)

2. Converting a txt file to images (same as for a Word file)

public static void txtToImage(String txtPath, String imagePath) throws Exception {
        wordToImage(txtPath, imagePath);
    }

Test:

public static void main(String[] args) throws Exception {
        // Placeholder paths: point these at any .txt file and output folder on your machine.
        FileConvertUtil.txtToImage("D:\\test\\sample.txt", "D:\\test\\txt");
    }

Result: (screenshot omitted)

3. Converting a PDF file to images

public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
    File file = new File(pdfPath);
    String filename = file.getName();
    String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
    // try-with-resources closes the document even if rendering fails
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        for (int i = 0; i < doc.getNumberOfPages(); i++) {
            // 144 DPI is twice the 72 DPI that PDF uses as its default scale
            BufferedImage image = renderer.renderImageWithDPI(i, 144);
            String pathname = pathPre + (i + 1) + ".png";
            ImageIO.write(image, "PNG", new File(pathname));
        }
    }
}
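
The DPI argument decides how large each page image is. PDF page sizes are measured in points, with 72 points per inch, so the pixel size is roughly the size in points multiplied by dpi / 72. The sketch below only illustrates that arithmetic; the class DpiMath is hypothetical, and rounding down is an assumption about how a renderer would truncate fractional pixels.

```java
public class DpiMath {

    /** PDF user space uses 72 units (points) per inch. */
    public static final float POINTS_PER_INCH = 72f;

    /** Converts a length in points to whole pixels at the given DPI (rounded down, at least 1). */
    public static int pixelsAt(float lengthInPoints, float dpi) {
        float scale = dpi / POINTS_PER_INCH;
        return Math.max((int) Math.floor(lengthInPoints * scale), 1);
    }

    public static void main(String[] args) {
        // An A4 page is roughly 595 x 842 points.
        System.out.println(pixelsAt(595f, 144f) + " x " + pixelsAt(842f, 144f));
    }
}
```

At 144 DPI an A4 page therefore comes out at about 1190 x 1684 pixels. Lowering the DPI shrinks the images and speeds up conversion, at the cost of sharpness.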

Test:

public static void main(String[] args) throws Exception {
        FileConvertUtil.pdfToImage("D:\\書籍\\電子書\\其它\\自然哲學的數(shù)學原理.pdf", "D:\\test\\pdf");
    }

Result: (screenshot omitted)

4. Supporting multiple file types in one method

public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
    // Lower-case the extension so that ".PDF" and ".pdf" are treated the same
    String ext = sourceFilePath.substring(sourceFilePath.lastIndexOf(".")).toLowerCase();
    switch (ext) {
        case ".doc":
        case ".docx":
            wordToImage(sourceFilePath, imagePath);
            break;
        case ".pdf":
            pdfToImage(sourceFilePath, imagePath);
            break;
        case ".txt":
            txtToImage(sourceFilePath, imagePath);
            break;
        default:
            System.out.println("Unsupported file format");
    }
}
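
The dispatch above takes everything after the last dot as the extension, which throws if the path has no dot at all and can be misled by a dot in a directory name. As a hedged sketch, the helper below shows one way to extract the extension more carefully with only the JDK. The class FileExtensions and both of its methods are hypothetical names introduced for illustration.

```java
import java.util.Locale;

public class FileExtensions {

    /** Returns the lower-cased extension including the dot, or "" if the file name has none. */
    public static String extensionOf(String path) {
        // Only look at the file name, not at dots in directory names
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        if (dot <= slash + 1) { // no dot in the file name, or a leading-dot name such as ".gitignore"
            return "";
        }
        return path.substring(dot).toLowerCase(Locale.ROOT);
    }

    /** True for the four extensions handled by the dispatch method above. */
    public static boolean isSupported(String path) {
        switch (extensionOf(path)) {
            case ".doc":
            case ".docx":
            case ".pdf":
            case ".txt":
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("D:\\test\\Report.PDF")); // .pdf
        System.out.println(isSupported("notes.v2/readme"));      // false
    }
}
```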

II. Using multiple threads to write the files faster

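Converting Newton's 669-page Mathematical Principles of Natural Philosophy (自然哲學的數(shù)學原理.pdf) took a long time: 140,281 ms. Work like this, where each page is rendered and written to disk independently of the others, can be sped up with multiple threads, so I wrote a multithreaded version.

Time for the synchronous export of 自然哲學的數(shù)學原理.pdf: 140,281 ms.

The optimized code is below. One caveat: the PDFBox FAQ says that a single PDDocument should be accessed by only one thread at a time, so sharing one document and renderer across worker threads, as this version does, is not guaranteed to be safe even though it completed without errors in the run reported here.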

public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
        long old = System.currentTimeMillis();
        File file = new File(pdfPath);
        PDDocument doc = PDDocument.load(file);
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < pageCount; i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    BufferedImage image = renderer.renderImageWithDPI(finalI, 144); // 144 DPI = 2x the 72 DPI PDF default
                    String filename = file.getName();
                    filename = filename.substring(0, filename.lastIndexOf("."));
                    String pathname = imagePath + File.separator + filename + (finalI + 1) + ".png";
                    ImageIO.write(image, "PNG", new File(pathname));
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        doc.close();
        long now = System.currentTimeMillis();
        System.out.println("pdfToImage (multithreaded) finished in " + (now - old) + " ms");
    }

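Time for the multithreaded export of 自然哲學的數(shù)學原理.pdf: 24,045 ms.

This run took only 24,045 ms, roughly one sixth of the original 140,281 ms, which is a large improvement. Besides PDF, the Word and txt conversions can be made multithreaded in the same way: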

// Convert a Word file to images (multithreaded)
    public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < doc.getPageCount(); i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    Document extractedPage = doc.extractPages(finalI, 1);
                    String path = pathPre + (finalI + 1) + ".png";
                    extractedPage.save(path, SaveFormat.PNG);
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
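        }
        // Stop accepting new tasks and wait until every page has been written;
        // without this the method returns early and the pool threads keep the JVM alive.
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }
    // Convert a txt file to images (multithreaded)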
    public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
        wordToImageAsync(txtPath, imagePath);
    }
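
Both multithreaded methods above catch per-page exceptions inside the task and only print them, so the caller cannot tell whether every page was actually written. As a rough sketch of an alternative, the example below submits one task per page with `invokeAll`, waits for all of them, and rethrows the first failure. It uses only the JDK; the class ParallelPages, the method renderAll, and the `renderPage` callback are hypothetical stand-ins for the real rendering call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

public class ParallelPages {

    /** Runs renderPage for pages 0..pageCount-1 on a fixed pool and returns results in page order. */
    public static <T> List<T> renderAll(int pageCount, int threads, IntFunction<T> renderPage) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (int i = 0; i < pageCount; i++) {
                final int page = i;
                tasks.add(() -> renderPage.apply(page));
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has finished and keeps the submission order
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // throws if that page's task failed
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while rendering pages", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A page failed to render", e.getCause());
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Stand-in for real rendering: just produce the output file name for each page
        List<String> names = renderAll(5, cores, page -> "page" + (page + 1) + ".png");
        System.out.println(names);
    }
}
```

Returning the results in page order also makes it straightforward to reuse the same pattern when the pages are needed as byte arrays rather than files.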

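III. Converting files to image streams

Sometimes the converted images do not need to be written to local disk; instead they are returned to the caller or uploaded to an image server. In that case the images should be produced as byte streams so they are easy to transfer. Example code:

1. Converting a Word file to image streams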

public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
    Document doc = new Document(wordPath);
    List<byte[]> list = new ArrayList<>();
    for (int i = 0; i < doc.getPageCount(); i++) {
        try(ByteArrayOutputStream outputStream = new ByteArrayOutputStream()){
            Document extractedPage = doc.extractPages(i, 1);
            extractedPage.save(outputStream, SaveFormat.PNG);
            list.add(outputStream.toByteArray());
        }
    }
    return list;
}

2. Converting a txt file to image streams

public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
    return wordToImageStream(txtPath);
}

3. Converting a PDF file to image streams

public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
    File file = new File(pdfPath);
    List<byte[]> list = new ArrayList<>();
    // try-with-resources closes the document even if rendering fails
    try (PDDocument doc = PDDocument.load(file)) {
        PDFRenderer renderer = new PDFRenderer(doc);
        for (int i = 0; i < doc.getNumberOfPages(); i++) {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                // 144 DPI is twice the 72 DPI that PDF uses as its default scale
                BufferedImage image = renderer.renderImageWithDPI(i, 144);
                ImageIO.write(image, "PNG", outputStream);
                list.add(outputStream.toByteArray());
            }
        }
    }
    return list;
}

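4. Converting multiple file types to image streams

public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
    // Lower-case the extension so that ".PDF" and ".pdf" are treated the same
    String ext = sourceFilePath.substring(sourceFilePath.lastIndexOf(".")).toLowerCase();
    switch (ext) {
        case ".doc":
        case ".docx":
            return wordToImageStream(sourceFilePath);
        case ".pdf":
            return pdfToImageStream(sourceFilePath);
        case ".txt":
            return txtToImageStream(sourceFilePath);
        default:
            System.out.println("Unsupported file format");
    }
    // Callers must handle null for unsupported formats
    return null;
}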
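
Once the pages are available as PNG byte arrays, one simple way to preview them in a browser is to embed each page in an HTML page as a Base64 data URI. The sketch below shows that idea with only the JDK. The class PreviewHtml and its methods are hypothetical and not part of the original utility, and a real service would more likely serve each image from its own URL.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class PreviewHtml {

    /** Encodes one PNG image as a data URI that can be used directly in an img src attribute. */
    public static String toDataUri(byte[] png) {
        return "data:image/png;base64," + Base64.getEncoder().encodeToString(png);
    }

    /** Builds a minimal HTML page with one img element per page image, in order. */
    public static String toHtml(List<byte[]> pages) {
        StringBuilder html = new StringBuilder("<!DOCTYPE html>\n<html><body>\n");
        for (int i = 0; i < pages.size(); i++) {
            html.append("<img alt=\"page ").append(i + 1)
                .append("\" src=\"").append(toDataUri(pages.get(i))).append("\">\n");
        }
        return html.append("</body></html>\n").toString();
    }

    public static void main(String[] args) {
        // Stand-in bytes; in practice this list would come from fileToImageStream(...)
        List<byte[]> pages = new ArrayList<>();
        pages.add(new byte[] {1, 2, 3});
        System.out.println(toHtml(pages));
    }
}
```

Data URIs keep the example self-contained, but Base64 makes the payload about a third larger than the raw bytes, so for long documents separate image URLs are usually the better fit.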

Finally, here is the complete utility class:

package com.fhey.service.common.utils.file;

import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FileConvertUtil {
    // Convert a file to images, dispatching on its extension
    public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
        // Lower-case the extension so that ".PDF" and ".pdf" are treated the same
        String ext = sourceFilePath.substring(sourceFilePath.lastIndexOf(".")).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                wordToImage(sourceFilePath, imagePath);
                break;
            case ".pdf":
                pdfToImage(sourceFilePath, imagePath);
                break;
            case ".txt":
                txtToImage(sourceFilePath, imagePath);
                break;
            default:
                System.out.println("Unsupported file format");
        }
    }
    // Convert a PDF file to images
    public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
        File file = new File(pdfPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        // try-with-resources closes the document even if rendering fails
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                // 144 DPI is twice the 72 DPI that PDF uses as its default scale
                BufferedImage image = renderer.renderImageWithDPI(i, 144);
                String pathname = pathPre + (i + 1) + ".png";
                ImageIO.write(image, "PNG", new File(pathname));
            }
        }
    }
    // Convert a txt file to images
    public static void txtToImage(String txtPath, String imagePath) throws Exception {
        wordToImage(txtPath, imagePath);
    }
    // Convert a Word file to images
    public static void wordToImage(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        for (int i = 0; i < doc.getPageCount(); i++) {
            Document extractedPage = doc.extractPages(i, 1);
            String path = pathPre + (i + 1) + ".png";
            extractedPage.save(path, SaveFormat.PNG);
        }
    }
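    // Convert a PDF file to images (multithreaded).
    // Caveat: the PDFBox FAQ says a single PDDocument should be accessed by only one thread at a time,
    // so sharing one document and renderer across threads is not guaranteed to be safe.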
    public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
        long old = System.currentTimeMillis();
        File file = new File(pdfPath);
        PDDocument doc = PDDocument.load(file);
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < pageCount; i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    BufferedImage image = renderer.renderImageWithDPI(finalI, 144); // 144 DPI = 2x the 72 DPI PDF default
                    String filename = file.getName();
                    filename = filename.substring(0, filename.lastIndexOf("."));
                    String pathname = imagePath + File.separator + filename + (finalI + 1) + ".png";
                    ImageIO.write(image, "PNG", new File(pathname));
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        doc.close();
        long now = System.currentTimeMillis();
        System.out.println("pdfToImage (multithreaded) finished in " + (now - old) + " ms");
    }
    // Convert a Word file to images (multithreaded)
    public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < doc.getPageCount(); i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    Document extractedPage = doc.extractPages(finalI, 1);
                    String path = pathPre + (finalI + 1) + ".png";
                    extractedPage.save(path, SaveFormat.PNG);
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
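        }
        // Stop accepting new tasks and wait until every page has been written;
        // without this the method returns early and the pool threads keep the JVM alive.
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }
    // Convert a txt file to images (multithreaded)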
    public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
        wordToImageAsync(txtPath, imagePath);
    }
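    // Convert a file to image streams, dispatching on its extension
    public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
        // Lower-case the extension so that ".PDF" and ".pdf" are treated the same
        String ext = sourceFilePath.substring(sourceFilePath.lastIndexOf(".")).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                return wordToImageStream(sourceFilePath);
            case ".pdf":
                return pdfToImageStream(sourceFilePath);
            case ".txt":
                return txtToImageStream(sourceFilePath);
            default:
                System.out.println("Unsupported file format");
        }
        // Callers must handle null for unsupported formats
        return null;
    }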
    // Convert a PDF file to image streams
    public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
        File file = new File(pdfPath);
        List<byte[]> list = new ArrayList<>();
        // try-with-resources closes the document even if rendering fails
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                    // 144 DPI is twice the 72 DPI that PDF uses as its default scale
                    BufferedImage image = renderer.renderImageWithDPI(i, 144);
                    ImageIO.write(image, "PNG", outputStream);
                    list.add(outputStream.toByteArray());
                }
            }
        }
        return list;
    }
    // Convert a Word file to image streams
    public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
        Document doc = new Document(wordPath);
        List<byte[]> list = new ArrayList<>();
        for (int i = 0; i < doc.getPageCount(); i++) {
            try(ByteArrayOutputStream outputStream = new ByteArrayOutputStream()){
                Document extractedPage = doc.extractPages(i, 1);
                extractedPage.save(outputStream, SaveFormat.PNG);
                list.add(outputStream.toByteArray());
            }
        }
        return list;
    }
    // Convert a txt file to image streams
    public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
        return wordToImageStream(txtPath);
    }
}
