Converting Word Documents to Images and PDF with Aspose

硅谷探秘者

Published 2022-10-14 in Other

Aspose

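Aspose.Total is the most complete Office document management suite from Aspose. It mainly ships component packages for two development platforms, .NET and Java, and lets you programmatically manipulate some of the most popular business file formats: Word, Excel, PowerPoint, Project and other Office documents, as well as PDF. Beyond these file manipulation components, Aspose.Total also provides components for diagramming, composing email, spell checking, creating barcodes, generating ad hoc queries, reproducing formatting, and workflow, which together make up a complete document management solution.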

Main components

Aspose.Words

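Aspose.Words is an advanced class library for performing a wide range of document processing tasks directly inside your applications. It supports DOC, OOXML, RTF, HTML, OpenDocument, PDF, XPS, EPUB and other formats. With Aspose.Words you can generate, modify, convert, render and print documents without Microsoft Word.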

Aspose.Cells

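Aspose.Cells is a widely acclaimed spreadsheet component that supports operations on all Excel format types, so you can embed reading, writing and processing of Excel spreadsheets in your applications without relying on Microsoft Excel. Aspose.Cells can import and export data, tables and formatting at every level of detail, import images, apply complex formulas, and save Excel data in a variety of formats, all without Microsoft Excel or Microsoft Office Automation.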

Aspose.PDF

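Aspose.PDF is a PDF document creation component that lets you read, write and manipulate PDF files without Adobe Acrobat. Its features include PDF compression options, table creation and manipulation, chart support, image functions, rich hyperlink functionality, extended security components, and custom font handling.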

Aspose.BarCode

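Aspose.BarCode is a powerful and robust barcode generation and recognition component written in managed C#. It helps developers quickly and easily add barcode generation and recognition to their Microsoft applications (WinForms, ASP.NET and .NET Compact Framework). With Aspose.BarCode, developers have full control over every aspect of the barcode image: background color, bar color, image quality, rotation angle, X-dimension, caption, custom resolution, and so on. Aspose.BarCode can read and recognize common 1D and 2D barcodes from any image and at any angle.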

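Aspose.Slides

Aspose.Slides is a component for managing PowerPoint files. It lets applications read, write and manipulate Microsoft PowerPoint files without Microsoft PowerPoint. Aspose.Slides was the first component to offer PowerPoint document management inside a user's own application.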

Aspose.Tasks

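Aspose.Tasks is a non-graphical .NET project management component that lets .NET applications read, write and manage project documents without Microsoft Project. With Aspose.Tasks you can read and modify tasks, recurring tasks, resources, resource assignments, relationships and calendars.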

Aspose.OCR

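Aspose.OCR is a character recognition component that lets developers add OCR functionality to ASP.NET web applications, web services and Windows applications. It provides a simple set of classes for controlling character recognition. Aspose.OCR is aimed at developers who need to work with images (BMP and TIFF) in their own applications. It lets them extract text from images quickly and saves the time and effort of building an OCR solution from scratch.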

Converting Word to images

Adding the jar dependency

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>19.1</version>
    <scope>system</scope>
    <systemPath>${project.basedir}/lib/aspose-words-19.1.jar</systemPath>
</dependency>

Conversion utility class

import com.aspose.words.*;
import com.google.common.collect.ImmutableMap;
import lombok.extern.slf4j.Slf4j;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;
import java.awt.image.BufferedImage;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
/**
 * Utility class for Aspose.Words operations
 *
 * @author wuxianglong
 */
@Slf4j
public class WordUtils {
    private static final String OS_NAME_STR = "os.name";
    private static final String WINDOWS_STR = "windows";
    private static final String FORM_TEXT = "FORMTEXT";
    /**
     * On Linux, PDF operations need an explicit font directory.
     * This is the font directory on CentOS 8.
     */
    private static final String LINUX_FONTS_PATH = "/usr/share/fonts";

    public static void main(String[] args) throws Exception {
        checkLicense();
        String inPath = "D:\\tmp\\api_apply.doc";
        docToImage(inPath);
    }

    /**
     * Convert Word to HTML
     *
     * @param inPath  input file path
     * @param outPath output file path
     * @throws Exception if the conversion fails
     */
    public static void docToHtml(String inPath, String outPath) throws Exception {
        long start = System.currentTimeMillis();
        Document doc = new Document(inPath);
        HtmlSaveOptions opts = new HtmlSaveOptions(SaveFormat.HTML);
        opts.setHtmlVersion(HtmlVersion.XHTML);
        opts.setExportImagesAsBase64(true);
        opts.setExportPageMargins(true);
        opts.setExportXhtmlTransitional(true);
        opts.setExportDocumentProperties(true);
        doc.save(outPath, opts);
        log.info("Word to HTML succeeded, took {} ms", System.currentTimeMillis() - start);
    }

    /**
     * Convert Word to PDF
     *
     * @param inPath  input file path
     * @param outPath output file path
     * @throws Exception if the conversion fails
     */
    public static void docToPdf(String inPath, String outPath) throws Exception {
        long start = System.currentTimeMillis();
        log.info("Word to PDF output path: {}", outPath);
        // try-with-resources closes the stream even if the conversion throws
        try (FileOutputStream os = getFileOutputStream(outPath)) {
            Document doc = new Document(inPath);
            doc.save(os, SaveFormat.PDF);
        }
        log.info("Word to PDF succeeded, took {} ms", System.currentTimeMillis() - start);
    }

    /**
     * Convert Word to PDF
     *
     * @param inputStream input stream of the Word file
     * @param outPath     output file path
     * @throws Exception if the conversion fails
     */
    public static void docToPdf(InputStream inputStream, String outPath) throws Exception {
        long start = System.currentTimeMillis();
        // try-with-resources closes the stream even if the conversion throws
        try (FileOutputStream os = getFileOutputStream(outPath)) {
            Document doc = new Document(inputStream);
            doc.save(os, SaveFormat.PDF);
        }
        log.info("Word to PDF succeeded, took {} ms", System.currentTimeMillis() - start);
    }

    /**
     * Convert Word to images, one image per page
     *
     * @param inPath path of the Word file
     * @throws Exception if the conversion fails
     */
    public static void docToImage(String inPath) throws Exception {
        long start = System.currentTimeMillis();
        log.info("Converting Word to one image per page");
        File file = new File(inPath);
        String name = file.getName();
        String fileName = name.substring(0, name.lastIndexOf("."));
        // Parent directory of the source file
        String parent = file.getParent();
        log.info("parent:{}", parent);
        // Create the output directory, named after the document
        File outDir = new File(parent, fileName);
        boolean mkdir = outDir.mkdir();
        log.info("mkdir:{}", mkdir);
        // try-with-resources closes the stream even if rendering throws
        try (InputStream inputStream = Files.newInputStream(Paths.get(inPath))) {
            List<BufferedImage> bufferedImages = wordToImg(inputStream);
            for (int i = 0; i < bufferedImages.size(); i++) {
                // Write each page to its own file; page numbers in file names start at 1
                ImageIO.write(bufferedImages.get(i), "png", new File(outDir, "page" + (i + 1) + "_" + fileName + ".png"));
            }
        }
        log.info("Word to images succeeded, took {} ms", System.currentTimeMillis() - start);
    }

    /**
     * Convert Word to images and merge all pages into a single image
     *
     * @param inPath path of the Word file
     * @throws Exception if the conversion fails
     */
    public static void docToOneImage(String inPath) throws Exception {
        long start = System.currentTimeMillis();
        log.info("Converting Word to a single image");
        File file = new File(inPath);
        String name = file.getName();
        String fileName = name.substring(0, name.lastIndexOf("."));
        String parent = file.getParent();
        // try-with-resources closes the stream even if rendering throws
        try (InputStream inputStream = Files.newInputStream(Paths.get(inPath))) {
            List<BufferedImage> bufferedImages = wordToImg(inputStream);
            // Merge all pages vertically into one image
            BufferedImage image = MergeImage.mergeImage(false, bufferedImages);
            ImageIO.write(image, "png", new File(parent, fileName + ".png"));
        }
        log.info("Word to image succeeded, took {} ms", System.currentTimeMillis() - start);
    }

    /**
     * Convert HTML to Word
     *
     * @param inPath  input file path
     * @param outPath output file path
     * @throws Exception if the conversion fails
     */
    public static void htmlToWord(String inPath, String outPath) throws Exception {
        Document wordDoc = new Document(inPath);
        DocumentBuilder builder = new DocumentBuilder(wordDoc);
        for (Field field : wordDoc.getRange().getFields()) {
            if (field.getFieldCode().contains(FORM_TEXT)) {
                // Replace each text form field with its plain-text result
                builder.moveToField(field, true);
                builder.write(field.getResult());
                field.remove();
            }
        }
        wordDoc.save(outPath, SaveFormat.DOCX);
    }

    /**
     * Convert HTML to Word and replace the specified text
     *
     * @param inPath  input file path
     * @param outPath output file path
     * @throws Exception if the conversion fails
     */
    public static void htmlToWordAndReplaceField(String inPath, String outPath) throws Exception {
        Document wordDoc = new Document(inPath);
        Range range = wordDoc.getRange();
        // Replace 张三 with 李四, and 20 with 40
        ImmutableMap<String, String> map = ImmutableMap.of("张三", "李四", "20", "40");
        for (Map.Entry<String, String> str : map.entrySet()) {
            range.replace(str.getKey(), str.getValue(), new FindReplaceOptions());
        }
        wordDoc.save(outPath, SaveFormat.DOCX);
    }

    /**
     * For Word to PDF: sets the font directory on Linux and returns a FileOutputStream
     *
     * @param outPath PDF output path
     * @return FileOutputStream opened on the PDF output path
     * @throws FileNotFoundException if the output file cannot be opened
     */
    private static FileOutputStream getFileOutputStream(String outPath) throws FileNotFoundException {
        if (!System.getProperty(OS_NAME_STR).toLowerCase().startsWith(WINDOWS_STR)) {
            // Non-Windows systems (Linux) need the font directory configured
            log.info("[WordUtils -> docToPdf] Linux font directory: {}", LINUX_FONTS_PATH);
            FontSettings.getDefaultInstance().setFontsFolder(LINUX_FONTS_PATH, false);
        }
        return new FileOutputStream(outPath);
    }

    /**
     * Render each page of a Word document to an image
     *
     * @param inputStream word input stream
     * @return BufferedImage list
     * @throws Exception exception
     */
    public static List<BufferedImage> wordToImg(InputStream inputStream) throws Exception {
        Document doc = new Document(inputStream);
        ImageSaveOptions options = new ImageSaveOptions(SaveFormat.PNG);
        options.setPrettyFormat(true);
        options.setUseAntiAliasing(true);
        options.setUseHighQualityRendering(true);
        int pageCount = doc.getPageCount();
        List<BufferedImage> imageList = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            OutputStream output = new ByteArrayOutputStream();
            options.setPageIndex(i);
            doc.save(output, options);
            ImageInputStream imageInputStream = ImageIO.createImageInputStream(parse(output));
            imageList.add(ImageIO.read(imageInputStream));
        }
        return imageList;
    }

    /**
     * Convert an OutputStream (must be a ByteArrayOutputStream) to an InputStream
     *
     * @param out OutputStream
     * @return inputStream
     */
    private static ByteArrayInputStream parse(OutputStream out) {
        return new ByteArrayInputStream(((ByteArrayOutputStream) out).toByteArray());
    }

    /**
     * Load and apply the license file, if present on the classpath
     */
    private static void checkLicense() {
        try {
            InputStream is = com.aspose.words.Document.class.getResourceAsStream("/com.aspose.words.lic_2999.xml");
            if (is == null) {
                return;
            }
            License asposeLicense = new License();
            asposeLicense.setLicense(is);
            is.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
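
The `docToImage` method derives its output directory and per-page file names from the source path. As a minimal sketch of that naming logic in isolation, the helper below uses only the JDK. The class `PageImagePaths`, its method names, and the `page<n>_<name>.png` pattern are assumptions made for illustration and are not part of the original `WordUtils`. Unlike `name.substring(0, name.lastIndexOf("."))`, it does not throw when the file name has no extension, and it numbers pages from 1.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not in the original WordUtils) that builds output paths for page images. */
final class PageImagePaths {

    private PageImagePaths() {
    }

    /** Returns the file name without its last extension; names without a dot are returned unchanged. */
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot <= 0 covers both "no extension" and dot-files such as ".gitignore"
        return dot <= 0 ? fileName : fileName.substring(0, dot);
    }

    /** Builds <parent>/<base>/page<n>_<base>.png for a 1-based page number. */
    static Path pageFile(String wordPath, int pageNumber) {
        // Resolve against the working directory so getParent() is non-null for relative paths
        Path source = Paths.get(wordPath).toAbsolutePath();
        String base = baseName(source.getFileName().toString());
        return source.getParent().resolve(base).resolve("page" + pageNumber + "_" + base + ".png");
    }
}
```

A caller could pass `PageImagePaths.pageFile(inPath, i + 1).toFile()` to `ImageIO.write` inside the page loop, after creating the parent directory.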

Image merging utility class


import java.awt.image.BufferedImage;
import java.util.List;

/**
 * Image merging utility
 *
 * @author wuxianglong
 */
public class MergeImage {

    /**
     * Merge any number of images into a single image
     *
     * @param isHorizontal true to merge horizontally, false to merge vertically
     * @param images       list of images to merge
     * @return BufferedImage
     */
    public static BufferedImage mergeImage(boolean isHorizontal, List<BufferedImage> images) {
        // The merged image
        BufferedImage destImage;
        // Dimensions used to size the merged image
        int allWidth = 0, allHeight = 0, allWidthMax = 0, allHeightMax = 0;
        // Accumulate total width, total height, maximum width and maximum height
        for (int i = 0; i < images.size(); i++) {
            BufferedImage img = images.get(i);
            allWidth += img.getWidth();
            // Leave a 2-pixel gap after every image except the last one
            if (images.size() != i + 1) {
                allHeight += img.getHeight() + 2;
            } else {
                allHeight += img.getHeight();
            }
            if (img.getWidth() > allWidthMax) {
                allWidthMax = img.getWidth();
            }
            if (img.getHeight() > allHeightMax) {
                allHeightMax = img.getHeight();
            }
        }
        // Create the merged image (TYPE_INT_RGB starts out black, so uncovered areas stay black)
        if (isHorizontal) {
            destImage = new BufferedImage(allWidth, allHeightMax, BufferedImage.TYPE_INT_RGB);
        } else {
            destImage = new BufferedImage(allWidthMax, allHeight, BufferedImage.TYPE_INT_RGB);
        }
        // Copy every source image into the merged image
        int wx = 0, wy = 0;
        for (BufferedImage img : images) {
            int w1 = img.getWidth();
            int h1 = img.getHeight();
            // Buffer for the RGB values of the source image
            int[] imageArrayOne = new int[w1 * h1];
            // Read the RGB value of every pixel, row by row, into the array
            imageArrayOne = img.getRGB(0, 0, w1, h1, imageArrayOne, 0, w1);
            if (isHorizontal) {
                // Horizontal merge: place the image at the current x offset
                destImage.setRGB(wx, 0, w1, h1, imageArrayOne, 0, w1);
            } else {
                // Vertical merge: place the image at the current y offset
                destImage.setRGB(0, wy, w1, h1, imageArrayOne, 0, w1);
            }
            wx += w1;
            wy += h1 + 2;
        }
        return destImage;
    }

}
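
Because `MergeImage` creates a `TYPE_INT_RGB` canvas, which starts out black, the 2-pixel gaps between pages and any area beside a narrower page appear black in the merged result. The following is a minimal sketch of an alternative vertical merge that fills the canvas with white first and draws with `Graphics2D` instead of copying pixel arrays. The class name `VerticalMergeSketch` and its `GAP` constant are assumptions made for illustration; this is a different technique from the `getRGB`/`setRGB` approach above, not a drop-in description of it.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

/** Hypothetical alternative to MergeImage: vertical merge on a white background. */
final class VerticalMergeSketch {

    /** Gap in pixels between consecutive images, matching the 2 used in MergeImage. */
    static final int GAP = 2;

    private VerticalMergeSketch() {
    }

    static BufferedImage mergeVertically(List<BufferedImage> images) {
        if (images.isEmpty()) {
            throw new IllegalArgumentException("at least one image is required");
        }
        // Canvas width is the widest image; height is the sum of heights plus the gaps
        int width = 0;
        int height = GAP * (images.size() - 1);
        for (BufferedImage img : images) {
            width = Math.max(width, img.getWidth());
            height += img.getHeight();
        }
        BufferedImage dest = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dest.createGraphics();
        try {
            // Paint the whole canvas white so gaps and margins are not black
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            int y = 0;
            for (BufferedImage img : images) {
                g.drawImage(img, 0, y, null);
                y += img.getHeight() + GAP;
            }
        } finally {
            // Release the graphics context even if drawing fails
            g.dispose();
        }
        return dest;
    }
}
```

In `docToOneImage`, a call such as `VerticalMergeSketch.mergeVertically(bufferedImages)` could stand in for `MergeImage.mergeImage(false, bufferedImages)` if a white background is preferred.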
