Using POI and iText (itextpdf) in Java to convert Word and PowerPoint files to PDF and to stamp the PDF files

1. Environment and background

This is a Spring Boot project that uses POI and iText (itextpdf) to convert Excel, PowerPoint, and Word files to PDF and to sign the resulting PDF files.
It also adds an image watermark to Excel files and password-protects them so that they are read-only.
All of the methods below return byte arrays, which can be written out as files as the situation requires.
Office files come in two generations, the 2003 binary formats (.doc, .ppt, .xls) and the 2007 XML-based formats (.docx, .pptx, .xlsx), so each needs its own processing method.
There are other ways to convert to PDF. For example, the third-party library Aspose can process Office files, but it is a paid product;
Spire can also be used, and it is also paid.
You can also use the OpenOffice components together with org.jodconverter. This approach works on both Linux and Windows.

2. Stamping PDF documents

Digital certificate generation:

keytool -storepass 123456 -genkeypair -keyalg RSA -keysize 1024 -sigalg SHA1withRSA -validity 3650 -alias mycert -keystore my.keystore -dname "CN=www.sample.com, OU=sample, O=sample, L=BJ, ST=BJ, C=CN"

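Open a cmd window and run the command above; it generates a file named my.keystore. Note that a 1024-bit RSA key and SHA1withRSA are considered weak today, so for anything beyond a demo a 2048-bit key with SHA256withRSA is a safer choice.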

/**
     * Sign pdf
     * @param src pdf File input stream
     * @param imgPath Signature image path
     * @param reason Reason for signing
     * @param location Place of signature
     * @return
     * @throws GeneralSecurityException
     * @throws IOException
     * @throws DocumentException
     */
    public static byte[] sign(InputStream src,String imgPath,String reason,String location) throws GeneralSecurityException, IOException, DocumentException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] resBytes = null;
        try{
            //Read the keystore to obtain the private key and certificate chain
            KeyStore keyStore = KeyStore.getInstance("JKS");
            keyStore.load(ConvertImgToBase64Util.loadImgResource("templates/keystore/keystore.p12"),"123456".toCharArray());
            String alias = (String)keyStore.aliases().nextElement();
            PrivateKey PrivateKey = (PrivateKey) keyStore.getKey(alias, "123456".toCharArray());
            Certificate[] chain = keyStore.getCertificateChain(alias);

            PdfReader pdfReader = new PdfReader(src);
            Rectangle  rectangle= pdfReader.getPageSize(1);
            float urx =  rectangle.getRight()-100;
            float ury = rectangle.getTop()-100;
            float llx = urx-200;
            float lly = ury-200;
            // '\0' keeps the PDF version of the original document
            PdfStamper stamper = PdfStamper.createSignature(pdfReader, baos, '\0', null, false);
            // Get the attribute object of digital signature and set the attribute of digital signature
            PdfSignatureAppearance appearance = stamper.getSignatureAppearance();
            appearance.setReason(reason);
            appearance.setLocation(location);
            appearance.setVisibleSignature(new Rectangle(llx,lly,urx,ury), 1, "sign");
            //Get stamped pictures
            byte[] imgBytes = ConvertImgToBase64Util.image2Bytes(imgPath);
            Image image = Image.getInstance(imgBytes);
            appearance.setSignatureGraphic(image);
            //Set certification level
            appearance.setCertificationLevel(PdfSignatureAppearance.NOT_CERTIFIED);
            //The rendering method of the seal. Here, choose to display only the seal
            appearance.setRenderingMode(PdfSignatureAppearance.RenderingMode.GRAPHIC);
            ExternalDigest digest = new BouncyCastleDigest();
            //Signature: the private key, the digest algorithm name (for example SHA-1 or SHA-256), and the security provider (null for the default)
            ExternalSignature signature = new PrivateKeySignature(PrivateKey, DigestAlgorithms.SHA1, null);
            //Call itext signature method to complete pdf signature
            MakeSignature.signDetached(appearance, digest, signature, chain, null, null, null, 0,MakeSignature.CryptoStandard.CMS);
            resBytes = baos.toByteArray();
        }catch (Exception e){
            log.error("Failed to sign PDF document", e);
        }finally {
            try{
                if(baos != null){
                    baos.close();
                }
            }catch (IOException e){
                log.error("Failed to close output stream", e);
            }
        }
        return resBytes;
    }

The keystore.p12 loaded in the code above is the my.keystore file generated earlier (renamed).
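
To make the placement of the visible stamp easier to follow, here is a minimal sketch of the corner arithmetic that sign() performs before calling setVisibleSignature. StampBox and topRight are hypothetical names introduced only for this illustration, and plain floats stand in for iText's Rectangle so the snippet runs without any library.

```java
/** Hypothetical helper: repeats the corner arithmetic from sign() with plain floats. */
final class StampBox {
    final float llx; // lower-left x
    final float lly; // lower-left y
    final float urx; // upper-right x
    final float ury; // upper-right y

    StampBox(float llx, float lly, float urx, float ury) {
        this.llx = llx;
        this.lly = lly;
        this.urx = urx;
        this.ury = ury;
    }

    /**
     * Builds a size x size box whose upper-right corner sits `margin` points
     * inside the top-right corner of the page.
     */
    static StampBox topRight(float pageRight, float pageTop, float margin, float size) {
        float urx = pageRight - margin;
        float ury = pageTop - margin;
        return new StampBox(urx - size, ury - size, urx, ury);
    }
}
```

With the values used in sign() (margin 100, size 200) and a page of 595 x 842 points, StampBox.topRight(595f, 842f, 100f, 200f) yields the corners (295, 542) and (495, 742). PDF coordinates start at the lower-left corner of the page by default, so the stamp lands near the top right of page 1.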

3. Word to PDF

3.1 docx to pdf

public static byte[] docxToPdf(InputStream src) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] resBytes = null;
        String result;
        try {
            // PDF page size (A3) and the four margins in points
            Document pdfDocument = new Document(PageSize.A3, 72, 72, 72, 72);
            PdfWriter pdfWriter = PdfWriter.getInstance(pdfDocument, baos);
            XWPFDocument doc = new XWPFDocument(src);
            pdfWriter.setInitialLeading(20);
            java.util.List<XWPFParagraph> plist = doc.getParagraphs();
            pdfWriter.open();
            pdfDocument.open();
            for (int i = 0; i < plist.size(); i++) {
                XWPFParagraph pa = plist.get(i);
                java.util.List<XWPFRun> runs = pa.getRuns();
                for (int j = 0; j < runs.size(); j++) {
                    XWPFRun run = runs.get(j);
                    java.util.List<XWPFPicture> piclist = run.getEmbeddedPictures();
                    Iterator<XWPFPicture> iterator = piclist.iterator();
                    while (iterator.hasNext()) {
                        XWPFPicture pic = iterator.next();
                        XWPFPictureData picdata = pic.getPictureData();
                        byte[] bytepic = picdata.getData();
                        Image imag = Image.getInstance(bytepic);
                        pdfDocument.add(imag);
                    }
                    // Use a CJK font so that Chinese text renders (requires the itext-asian jar on the classpath)
                    BaseFont bf = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
                    Font font = new Font(bf, 11.0f, Font.NORMAL, BaseColor.BLACK);
                    String text = run.getText(-1);
                    if (text != null) {
                        // Use the string directly; a getBytes()/new String() round trip depends on the platform charset
                        pdfDocument.add(new Chunk(text, font));
                    }
                }
                pdfDocument.add(new Chunk(Chunk.NEWLINE));
            }
            //Close the document before reading the bytes, otherwise the PDF output is incomplete
            pdfDocument.close();
            pdfWriter.close();
            resBytes = baos.toByteArray();
        } catch (Exception e) {
            log.error("Failed to convert docx to PDF", e);
        }finally {
            try{
                if(baos != null){
                    baos.close();
                }
            }catch (IOException e){
                log.error("Failed to close output stream after docx to PDF conversion", e);
            }
        }
        return resBytes;
    }

3.2 doc to pdf

/**
     * Converts doc to PDF by first converting doc to HTML and then HTML to PDF, because POI cannot convert doc to PDF directly
     * @param src
     * @return
     */
    public static  byte[]  doc2pdf(InputStream src){
        byte[]  res = null;
        try{
            String html = OfficeToPdfUtil.doc2Html(src);
            html = OfficeToPdfUtil.formatHtml(html);
            res = OfficeToPdfUtil.htmlToPdf(html);
        }catch (Exception e){
            log.error("Failed to convert doc to PDF", e);
        }
        return res;
    }


public class OfficeToPdfUtil {
    /**
     * html Convert to pdf
     * @param html
     * @return
     */
    public static byte[] htmlToPdf(String html) {
        com.itextpdf.text.Document document = null;
        ByteArrayInputStream bais = null;
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] resBytes = null;
        try {
            document = new com.itextpdf.text.Document(PageSize.A4);
            PdfWriter writer = PdfWriter.getInstance(document, baos);
            document.open();
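            // Encode with the same charset that is passed to the parser below, not the platform default
            bais = new ByteArrayInputStream(html.getBytes(Charset.forName("UTF-8")));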
            XMLWorkerHelper.getInstance().parseXHtml(writer, document, bais,
                    Charset.forName("UTF-8"), new FontProvider() {
                        @Override
                        public boolean isRegistered(String s) {
                            return false;
                        }

                        @Override
                        public Font getFont(String s, String s1, boolean embedded, float size, int style, BaseColor baseColor) {
                            // Configure fonts
                            Font font = null;
                            try {
                                BaseFont bf = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.EMBEDDED);
                                font = new Font(bf, size, style, baseColor);
                                font.setColor(baseColor);
                            } catch (Exception e) {
                                e.printStackTrace();
                            }
                            return font;
                        }
                    });
            document.close();
            writer.close();
            resBytes = baos.toByteArray();
        } catch (Exception e) {
            log.error("Failed to convert HTML to PDF", e);
        } finally {
            if (document != null) {
                document.close();
            }
            if (bais != null) {
                try {
                    bais.close();
                } catch (IOException e) {
                    log.error("Failed to close input stream after HTML to PDF conversion", e);
                }
            }
        }
        return resBytes;
    }

    /**
     * doc File to html
     * @param inputStream
     * @return
     */
    public static String doc2Html(InputStream inputStream) {
        String content = null;
        ByteArrayOutputStream baos = null;
        try {
            HWPFDocument wordDocument = new HWPFDocument(inputStream);
            WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
            wordToHtmlConverter.setPicturesManager(new PicturesManager() {
                @Override
                public String savePicture(byte[] content, PictureType pictureType, String suggestedName, float widthInches, float heightInches) {
                    return null;
                }
            });
            wordToHtmlConverter.processDocument(wordDocument);
            org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
            DOMSource domSource = new DOMSource(htmlDocument);
            baos = new ByteArrayOutputStream();
            StreamResult streamResult = new StreamResult(baos);

            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer serializer = tf.newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            serializer.transform(domSource, streamResult);
        } catch (Exception e) {
            log.error("Failed to convert doc to HTML", e);
        } finally {
            try {
                if (baos != null) {
                    content = new String(baos.toByteArray(), "utf-8");
                    baos.close();
                }
            } catch (Exception e) {
                log.error("Failed to close output stream after doc to HTML conversion", e);
            }
        }
        return content;
    }

    /**
     * Normalize html with jsoup
     * @param html html content
     * @return Normalized html
     */
    public static String formatHtml(String html) {
        org.jsoup.nodes.Document doc = Jsoup.parse(html);
        // Remove excessive width
        String style = doc.attr("style");
        if (StringUtils.isNotEmpty(style) && style.contains("width")) {
            doc.attr("style", "");
        }
        Elements divs = doc.select("div");
        for (org.jsoup.nodes.Element div : divs) {
            String divStyle = div.attr("style");
            if (StringUtils.isNotEmpty(divStyle) && divStyle.contains("width")) {
                div.attr("style", "");
            }
        }
        // Make jsoup emit XML syntax so that every tag is closed (XHTML)
        doc.outputSettings().syntax(org.jsoup.nodes.Document.OutputSettings.Syntax.xml);
        doc.outputSettings().escapeMode(Entities.EscapeMode.xhtml);
        return doc.html();
    }

}
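
The DOM-to-string step inside doc2Html relies only on JDK classes, so it can be tried in isolation. The sketch below is an illustration under stated assumptions: DomToHtmlSketch, serialize and sampleDocument are hypothetical names, and the hand-built sample document merely stands in for the one that WordToHtmlConverter produces.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper that isolates the serialization step of doc2Html. */
final class DomToHtmlSketch {

    /** Serializes a DOM tree as HTML, writing and decoding the bytes as UTF-8. */
    static String serialize(Document dom) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            serializer.transform(new DOMSource(dom), new StreamResult(out));
            // Decode with the same charset that was requested from the serializer
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the DOM", e);
        }
    }

    /** Builds a tiny html/body/p document as a stand-in for converter output. */
    static Document sampleDocument() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = dom.createElement("html");
            Element body = dom.createElement("body");
            Element p = dom.createElement("p");
            p.setTextContent("hello");
            body.appendChild(p);
            html.appendChild(body);
            dom.appendChild(html);
            return dom;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the sample DOM", e);
        }
    }
}
```

Calling DomToHtmlSketch.serialize(DomToHtmlSketch.sampleDocument()) returns the markup as a String. Because htmlToPdf tells the parser that its input is UTF-8, keeping the encoding explicit at every step avoids depending on the platform default charset.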

4. PPT to PDF

4.1 pptx to pdf

public static byte[] pptxToPdf(InputStream src) {
        Document document = null;
        XMLSlideShow slideShow = null;
        FileOutputStream fileOutputStream = null;
        PdfWriter pdfWriter = null;
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] resBytes = null;
        try {
            //Read the pptx file from the input stream
            slideShow = new XMLSlideShow(src);
            //Get the size of the slide
            Dimension dimension = slideShow.getPageSize();
            //Create a container for writing content
            document = new Document(PageSize.A3, 72, 72, 72, 72);
            //Write with output stream
            pdfWriter = PdfWriter.getInstance(document, baos);
            //Must be opened before use
            document.open();
            pdfWriter.open();
            PdfPTable pdfPTable = new PdfPTable(1);
            //Get slides
            java.util.List<XSLFSlide> slideList = slideShow.getSlides();
            for (int i = 0, row = slideList.size(); i < row; i++) {
                //Get every slide
                XSLFSlide slide = slideList.get(i);
                for (XSLFShape shape : slide.getShapes()) {
                    //Determine whether it is text
                    if(shape instanceof XSLFTextShape){
                        // Set a Chinese font ("宋体", i.e. SimSun) so that Chinese text is not garbled
                        XSLFTextShape textShape = (XSLFTextShape) shape;
                        for (XSLFTextParagraph textParagraph : textShape.getTextParagraphs()) {
                            for (XSLFTextRun textRun : textParagraph.getTextRuns()) {
                                textRun.setFontFamily("宋体");
                            }
                        }
                    }
                }
                //Create graphic objects based on slide size
                BufferedImage bufferedImage = new BufferedImage((int)dimension.getWidth(), (int)dimension.getHeight(), BufferedImage.TYPE_INT_RGB);
                Graphics2D graphics2d = bufferedImage.createGraphics();
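                //Paint a white background before the slide is drawn
                graphics2d.setPaint(Color.white);
                graphics2d.fillRect(0, 0, (int) dimension.getWidth(), (int) dimension.getHeight());
                //"宋体" is the SimSun font family; it has to be installed on the machine
                graphics2d.setFont(new java.awt.Font("宋体", java.awt.Font.PLAIN, 12));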
                //Write content to graphic object
                slide.draw(graphics2d);
                graphics2d.dispose();
                //Encapsulated in Image object
                Image image = Image.getInstance(bufferedImage, null);
                image.scalePercent(50f);
                // Note: the table is never added to the document, so only document.add(image) affects the output
                pdfPTable.addCell(new PdfPCell(image, true));
                document.add(image);
            }
            document.close();
            pdfWriter.close();
            resBytes = baos.toByteArray();
        } catch (Exception e) {
            log.error("Failed to convert pptx to PDF", e);
        } finally {
            try {
                if (baos != null) {
                     baos.close();
                }
            } catch (IOException e) {
                log.error("Failed to close output stream after pptx to PDF conversion", e);
            }
        }
        return resBytes;
    }

4.2 ppt to pdf

/**
     * Convert ppt to pdf. Intended to handle both ppt and pptx, but in practice only ppt works (see the note after the method)
     * @param is
     * @return
     */
    public static byte[] ppt2pdf(InputStream is) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] resBytes = null;
        try {
            Document pdfDocument = new Document();
            PdfWriter pdfWriter = PdfWriter.getInstance(pdfDocument, baos);
            HSLFSlideShow hslfSlideShow = new HSLFSlideShow(is);
            double zoom = 2;
            if (hslfSlideShow == null) {
                XMLSlideShow ppt = new XMLSlideShow(is);
                if (ppt == null) {
                    throw new NullPointerException("Failed to read ppt file data");
                }
                Dimension pgsize = ppt.getPageSize();
                List<XSLFSlide> slide = ppt.getSlides();
                AffineTransform at = new AffineTransform();
                at.setToScale(zoom, zoom);
                pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
                pdfWriter.open();
                pdfDocument.open();
                PdfPTable table = new PdfPTable(1);
                for (XSLFSlide xslfSlide : slide) {
                    BufferedImage img = new BufferedImage((int) Math.ceil(pgsize.width * zoom), (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
                    Graphics2D graphics = img.createGraphics();
                    graphics.setTransform(at);

                    graphics.setPaint(Color.white);
                    graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
                    xslfSlide.draw(graphics);
                    //Release the graphics context once the slide has been drawn
                    graphics.dispose();
                    Image slideImage = Image.getInstance(img, null);
                    table.addCell(new PdfPCell(slideImage, true));
                }
                ppt.close();
                pdfDocument.add(table);
                pdfDocument.close();
                pdfWriter.close();
                resBytes = baos.toByteArray();
                return resBytes;
            }
            Dimension pgsize = hslfSlideShow.getPageSize();
            List<HSLFSlide> slides = hslfSlideShow.getSlides();
            pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
            pdfWriter.open();
            pdfDocument.open();
            AffineTransform at = new AffineTransform();
            //Scale the drawing to match the enlarged image, as in the pptx branch
            at.setToScale(zoom, zoom);
            PdfPTable table = new PdfPTable(1);
            for (HSLFSlide hslfSlide : slides) {
                BufferedImage img = new BufferedImage((int) Math.ceil(pgsize.width * zoom), (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
                Graphics2D graphics = img.createGraphics();
                graphics.setTransform(at);

                graphics.setPaint(Color.white);
                graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
                hslfSlide.draw(graphics);
                //Release the graphics context once the slide has been drawn
                graphics.dispose();
                Image slideImage = Image.getInstance(img, null);
                table.addCell(new PdfPCell(slideImage, true));
            }
            hslfSlideShow.close();
            pdfDocument.add(table);
            pdfDocument.close();
            pdfWriter.close();
            resBytes = baos.toByteArray();
            return resBytes;
        } catch (Exception e) {
            log.error("Failed to convert ppt to PDF", e);
        }
        return resBytes;
    }

There is a compatibility problem here. new HSLFSlideShow(is) never returns null: when the input is not a valid .ppt file the constructor throws an exception instead, so the XMLSlideShow branch is never reached. By that point the input stream has also been read from, so an XMLSlideShow could no longer be created from it anyway. In practice this method can only convert files with the .ppt suffix.
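
One way around the consumed-stream problem is to look at the first bytes of the file before choosing a parser. The sketch below is only an illustration and uses nothing but the JDK: OfficeFormatSniffer, Format and sniff are hypothetical names, not POI APIs. It relies on two well-known signatures: binary Office files start with the OLE2 header D0 CF 11 E0 A1 B1 1A E1, and the XML-based formats are ZIP archives that start with "PK" 03 04. Recent POI versions also ship a FileMagic class for the same purpose, if it is available in the version you use.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of POI): guesses the container format from the file header. */
final class OfficeFormatSniffer {

    enum Format { OLE2, OOXML, UNKNOWN }

    // OLE2 compound file header, used by the binary .doc/.ppt/.xls formats
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header, used by .docx/.pptx/.xlsx (any ZIP file matches, so this is only a hint)
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Peeks at up to 8 bytes and rewinds, so the same stream can still be handed to a parser. */
    static Format sniff(BufferedInputStream in) {
        try {
            in.mark(8);
            byte[] header = new byte[8];
            int total = 0;
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // end of stream before 8 bytes were available
                }
                total += n;
            }
            in.reset();
            if (startsWith(header, total, OLE2_MAGIC)) {
                return Format.OLE2;
            }
            if (startsWith(header, total, ZIP_MAGIC)) {
                return Format.OOXML;
            }
            return Format.UNKNOWN;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean startsWith(byte[] header, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((header[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could wrap the incoming stream in a BufferedInputStream, call sniff once, and then pass the same wrapped stream to HSLFSlideShow for OLE2 or to XMLSlideShow for OOXML. Because the stream is rewound after the check, neither parser sees a partially consumed input.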

5. Add an image watermark to an Excel file and password-protect it

@Slf4j
public class ExcelWaterMakerUtils {

    /**
     * Add a picture watermark to Excel, password-protect every sheet, and return the byte array of the new file
     * @param imgPath
     * @param fileName
     * @param fis
     * @return
     */
    public static byte[] addWaterMaker(String imgPath,String fileName, InputStream fis){
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        BufferedImage bufferImg = null;
        byte[] resBytes = null;
        try {
            Workbook workbook;
            if (fileName.endsWith(".xlsx")){
                workbook = new XSSFWorkbook(fis);
            } else {
                workbook = new HSSFWorkbook(fis);
            }
            //Protect every sheet with a random 6-digit password so that the workbook is effectively read-only
            String password = RandomStringUtils.random(6,false,true);
            int sheetNumbers = workbook.getNumberOfSheets();
            short rightNum = 3;
            short leftNum = 2;
            for (int i = 0; i < sheetNumbers; i++) {
                Sheet sheet =  workbook.getSheetAt(i);
                sheet.protectSheet(password);
                Row row = sheet.getRow(1);
                if(row == null){
                    //No second row on this sheet: skip the watermark here but keep processing the other sheets
                    continue;
                }
                short cellLastNum = row.getLastCellNum();
                rightNum = (short) (cellLastNum-1);
                leftNum = (short) (cellLastNum - 2);
                ByteArrayOutputStream byteArrayOut = new ByteArrayOutputStream();
                InputStream img = ConvertImgToBase64Util.loadImgResource(imgPath);
                bufferImg = ImageIO.read(img);
                //The second parameter will determine the form of the inserted picture. If it is a png picture, the background is transparent, but if it is set to jpg format here, the black background will be automatically added
                ImageIO.write(bufferImg, "png", byteArrayOut);
                //The top-level drawing container; each sheet has only one
                Drawing drawing = sheet.createDrawingPatriarch();
                //anchor is mainly used to set the properties of pictures
                ClientAnchor anchor = drawing.createAnchor(0, 0, 255, 255,leftNum, 1, rightNum, 1);
//            anchor.setAnchorType(2);
                //Insert the picture once and resize it to its native dimensions
                Picture pic = drawing.createPicture(anchor,
                        workbook.addPicture(byteArrayOut.toByteArray(), Workbook.PICTURE_TYPE_PNG));
                pic.resize();
            }
            workbook.write(baos);
            resBytes = baos.toByteArray();
            fis.close();
            baos.close();
        } catch (Exception e) {
            log.error("Failed to add watermark to Excel file", e);
        }finally{
            try{
                if(fis != null){
                    fis.close();
                }
                if(baos != null){
                    baos.close();
                }
            }catch (IOException ioe){
                log.error("Failed to close streams after adding Excel watermark", ioe);
            }
        }
        return resBytes;
    }
}

The picture is positioned by the rows and columns of the Excel sheet, so its size may differ from one Excel file to another; this still needs to be optimized. I have not managed to solve it so far, so I am using it as it is for now.
Originally, the plan was to convert Excel files to PDF and then sign them, but for complex workbooks the styles changed too much after conversion and the result was messy.
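
The column arithmetic used for the watermark anchor has one edge case worth guarding: Row.getLastCellNum() returns the index of the last cell plus one, or -1 for a row without cells, so subtracting 2 can produce a negative column. The sketch below is a minimal illustration with plain ints; WatermarkColumns and lastTwoColumns are hypothetical names, and the clamping rule is an assumption rather than part of the original code. It does not address the picture-size variation mentioned above.

```java
/** Hypothetical helper: picks the two anchor columns for the watermark. */
final class WatermarkColumns {

    /**
     * @param lastCellNum value in the convention of POI's Row.getLastCellNum():
     *                    index of the last cell plus one, or -1 if the row has no cells
     * @return {leftColumn, rightColumn}, never negative and with left < right
     */
    static int[] lastTwoColumns(int lastCellNum) {
        // Same as the original (lastCellNum - 1) when the row has at least two cells
        int right = Math.max(lastCellNum - 1, 1);
        int left = Math.max(right - 1, 0);
        return new int[] {left, right};
    }
}
```

For a row with five cells this gives columns 3 and 4, matching the original calculation; for a row with a single cell or no cells it falls back to columns 0 and 1 instead of a negative index.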


Tags: Back-end Front-end Android Interview

Posted by Adrianc333 on Sat, 06 Aug 2022 02:47:41 +0930