How to convert a Word file into a db file (using pandoc from Java to convert Markdown to Word documents)

Outline

pandoc: download and basic usage
Batch-converting to Word documents with pandoc from Java
Generating database documentation with screw

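Scenario overview

During project development I usually write documentation in Markdown.

When the project is delivered to a customer, however, the Markdown has to be converted into more formal Word documents.

So I use pandoc to convert the .md files to Word documents.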

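pandoc: download and basic usage

Download pandoc from the official site: https://www.pandoc.org/installing.html

Pandoc converts between many document formats, for example Markdown, Microsoft Word, HTML, EPUB, roff man, LaTeX, and PDF.

pandoc command-line examples: https://www.pandoc.org/demos.html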

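// 1. Convert a .md file to a Word document
pandoc test.md -o newtest.docx

// 2. Merge several .md files into one Word document
pandoc xx1.md xx2.md -o 合并后的.docx

// 3. Use a custom document template
// 3.1 First export pandoc's built-in docx template as muban.docx
pandoc.exe -o muban.docx --print-default-data-file=reference.docx

// 3.2 Open muban.docx and edit the styles in it
//     (the styles must be changed by right-clicking a style and modifying it, otherwise the template has no effect)

// 3.3 Convert using the template document
pandoc.exe xx1.md xx2.md -o 合并后的.docx --reference-doc=C:\reference.docx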
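When these commands are issued from a program instead of a terminal, it can help to assemble the argument list separately from running it. The following is a minimal sketch under a few assumptions: the class `PandocCommand` and its method `build` are hypothetical names introduced here, `pandoc` is assumed to be on the PATH, and only the flags shown above (`-o` and `--reference-doc`) are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of pandoc: builds the argument list for
//   pandoc in1.md in2.md -o out.docx [--reference-doc=template.docx]
public class PandocCommand {

    public static List<String> build(List<String> inputs, String output, String referenceDoc) {
        if (inputs == null || inputs.isEmpty()) {
            throw new IllegalArgumentException("at least one input file is required");
        }
        List<String> command = new ArrayList<>();
        command.add("pandoc");            // assumption: pandoc is on the PATH
        command.addAll(inputs);           // several inputs are merged into one output
        command.add("-o");
        command.add(output);
        if (referenceDoc != null) {       // null means: use pandoc's default styles
            command.add("--reference-doc=" + referenceDoc);
        }
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build(Arrays.asList("xx1.md", "xx2.md"), "merged.docx", "muban.docx");
        System.out.println(String.join(" ", command));
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command)` as-is. Keeping each argument as a separate list element means file names that contain spaces do not need manual quoting.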

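The adjusted template document muban.docx:

https://javabus.oss-cn-beijing.aliyuncs.com/reference.docx

Styles must be changed by right-clicking the style and modifying it; otherwise the template has no effect.

Where the Markdown-to-Word conversion does not style something well, create a new style in the generated Word file and apply it by hand.

Style names control the order in which the styles are sorted.

Setting the paragraph background color.

Controlling the indentation before and after a paragraph.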

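Batch-converting to Word documents with pandoc from Java

Directory structure of the documents to convert: [screenshot]

Preview of the result:

The Markdown document [screenshot]

The Word document after conversion [screenshot]

The code: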

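import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/**
 * Export pandoc's default template to custom.docx:
 *   pandoc.exe -o custom.docx --print-default-data-file=reference.docx
 *
 * --reference-doc=FILE (here custom.docx) names the Word template to use as the
 * style reference for the output file. The content of the reference file is ignored;
 * only its styles and document properties (margins, page size, headers, footers, etc.) are used.
 */
public class Markdown2Docx_backup {
    // Absolute path of pandoc.exe
    private static final String PANDOC_PATH = "C:\\doc\\pandoc.exe";
    // Directory containing the markdown documents (.md)
    private static final String DOCS_DIR = "C:\\doc\\docs\\docs";
    // Directory with the images the documents depend on
    private static final String IMG_DIR = "C:\\doc\\docs\\docs\\imgs";
    // Sidebar file _sidebar.md; documents are converted in sidebar order
    private static final String _sidebar = "C:\\doc\\docs\\docs\\_sidebar.md";
    // docx template, see "Document conversion with pandoc (markdown to word)": http://t.zoukankan.com/kofyou-p-14932700.html
    private static final String reference_docx = "C:\\doc\\docs\\docs\\reference.docx";

    public static void main(String[] args) {
        copyImageDir();
        List<String> mdFilePaths = buildAllMdFilePath();
        if (mdFilePaths.isEmpty()) {
            System.err.println("No markdown files found, nothing to convert");
            return;
        }
        convertmd2Docx(mdFilePaths);
    }

    /**
     * Parses the sidebar _sidebar.md into the list of markdown file paths, e.g. xx1.md xx2.md
     *
     * Example sidebar entry: -[文档说明](/job-debug.md)
     */
    private static List<String> buildAllMdFilePath() {
        List<String> mds = new ArrayList<>();
        File sidebarFile = new File(_sidebar);
        if (!sidebarFile.exists()) {
            System.err.println("Sidebar file _sidebar.md does not exist");
            return mds;
        }
        List<String> contents;
        try {
            // Read the sidebar line by line
            contents = Files.readAllLines(sidebarFile.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            System.err.println("Failed to read _sidebar.md: " + e.getMessage());
            return mds;
        }
        for (String content : contents) {
            // Each entry must wrap the md path in (), e.g. -[文档说明](/job-debug.md)
            try {
                if (content.trim().isEmpty()) {
                    continue;
                }
                if (!content.contains("](")) {
                    System.out.println(content + ": not a markdown path, skipped");
                    continue;
                }
                // Extract /job-debug.md
                String mdPath = content.split("]\\(")[1].replace(")", "").replace("/", "\\").trim();
                if (!mdPath.endsWith(".md")) {
                    mdPath = mdPath + ".md";
                }
                mds.add(DOCS_DIR + "\\" + mdPath);
            } catch (Exception e) {
                System.err.println("Failed to parse an md file path from the sidebar");
            }
        }
        return mds;
    }

    private static void convertmd2Docx(List<String> mdFilePaths) {
        String docxPath = DOCS_DIR + "\\开发手册.docx";
        // pandoc xx1.md xx2.md -o test.docx
        List<String> command = new ArrayList<>();
        command.add(PANDOC_PATH);
        command.addAll(mdFilePaths);
        command.add("-o");
        command.add(docxPath);
        if (new File(reference_docx).exists()) {
            System.out.println("Converting with docx template: " + reference_docx);
            command.add("--toc-depth=3"); // only has an effect when --toc is also passed
            command.add("--reference-doc=" + reference_docx);
        }
        System.out.println(String.join(" ", command));
        try {
            // Passing the arguments as a list keeps paths that contain spaces intact
            Process process = new ProcessBuilder(command).inheritIO().start();
            // Wait for pandoc to finish so that failures are not silently lost
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                System.err.println("pandoc exited with code " + exitCode);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            System.err.println("Interrupted while waiting for pandoc");
        } catch (IOException e) {
            System.err.println("Failed to run pandoc to generate the Word document: " + e.getMessage());
        }
    }

    /**
     * 1. "pandoc test.md -o test.docx" cannot resolve ../imgs, so ../imgs is replaced with the absolute path /imgs.
     *
     * 2. Change how images are referenced in the markdown documents: manually replace every
     *    ../imgs with /imgs, e.g. ![业务用例图](/imgs/complex-email.png)
     *
     * 3. copyImageDir() copies the images under /docs/imgs into imgs under the root
     *    directory the program runs from (/imgs).
     */
    private static void copyImageDir() {
        // Directory with the images the documents depend on
        File docImgsDir = new File(IMG_DIR);
        // Works around pandoc not finding the image path ../imgs during conversion
        File dir = new File("/imgs");
        if (dir.exists()) {
            System.out.println("Document images already exist at: " + dir.getAbsolutePath());
            return;
        }
        File[] files = docImgsDir.listFiles();
        if (files == null) {
            System.err.println("Image directory not found: " + IMG_DIR);
            return;
        }
        if (!dir.mkdir()) {
            System.err.println("Could not create directory: " + dir.getAbsolutePath());
            return;
        }
        for (File file : files) {
            if (!file.isFile()) {
                continue;
            }
            try {
                // Copy the bytes unchanged; reading an image as UTF-8 text would corrupt it
                Files.copy(file.toPath(), new File(dir, file.getName()).toPath(), StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        System.out.println("Copied document images to: " + dir.getAbsolutePath());
    }
}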
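Step 2 in the comment above, replacing every `../imgs` with `/imgs`, is described as a manual edit. It could also be done in code before the files are handed to pandoc. The sketch below only shows the string rewrite; the class `ImagePathRewriter` and its method `rewrite` are hypothetical names, and it assumes images are referenced in the inline form `![alt](../imgs/name.png)`. Reference-style links and HTML `img` tags are not handled.

```java
// Hypothetical helper: rewrites relative image targets so they point at /imgs.
public class ImagePathRewriter {

    // Replaces the "](../imgs/" part of inline markdown links with "](/imgs/".
    // Other relative links, such as "](../other/a.md)", are left untouched.
    public static String rewrite(String markdown) {
        return markdown.replace("](../imgs/", "](/imgs/");
    }

    public static void main(String[] args) {
        String before = "![业务用例图](../imgs/complex-email.png)";
        System.out.println(rewrite(before)); // prints ![业务用例图](/imgs/complex-email.png)
    }
}
```

To apply it to a file, one option is to read the file with `Files.readAllBytes`, decode it as UTF-8, call `rewrite`, and write the result to a temporary copy, so the original Markdown sources stay unchanged.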

References

Pandoc usage tips: https://blog.csdn.net/weixin_39617497/article/details/117897721
Converting to PDF with pandoc and a LaTeX template: https://blog.51cto.com/u_1472521/5202317
Document conversion with pandoc (markdown to word): http://t.zoukankan.com/kofyou-p-14932700.html

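Generating database documentation with screw

Documentation: https://gitee.com/leshalv/screw
Sample of the generated document: [screenshot]

Automatically generated table of contents [screenshot]

Sample table and column descriptions [screenshot]

Adding the screw dependency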

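<dependency>
    <groupId>org.freemarker</groupId>
    <artifactId>freemarker</artifactId>
    <version>2.3.30</version>
</dependency>
<dependency>
    <groupId>cn.smallbun.screw</groupId>
    <artifactId>screw-core</artifactId>
    <version>1.0.3</version>
</dependency>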

Generating the document in code

import cn.smallbun.screw.core.Configuration;
import cn.smallbun.screw.core.engine.EngineConfig;
import cn.smallbun.screw.core.engine.EngineFileType;
import cn.smallbun.screw.core.engine.EngineTemplateType;
import cn.smallbun.screw.core.execute.DocumentationExecute;
import cn.smallbun.screw.core.process.ProcessConfig;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import javax.sql.DataSource;
import java.util.ArrayList;

/**
 * Database documentation generator, run from a plain main method.
 * Reference: https://gitee.com/leshalv/screw
 *
 * Note: DateUtils and DocGeneratorApplication.OUTPUT_DIR are not defined in this article;
 * supply your own version string and output directory.
 */
public class DocGen1 {
    public static final String db_users = "monitor,mgr";
    public static final String db_pwds = "1,1";

    public static void main(String[] args) {
        DocGen1 gen1 = new DocGen1();
        String[] users = db_users.split(",");
        String[] pwds = db_pwds.split(",");
        // Generate one document per user
        for (int i = 0; i < users.length; i++) {
            gen1.docGeneration(hikariConfig(users[i], pwds[i]));
        }
    }

    interface Config {
        EngineFileType engineFileType = EngineFileType.HTML;
        String docDesc = "数据库表说明文档 ";
        String version = DateUtils.UTC_TIME_ZONE.getDisplayName();
    }

    // Data source
    private static HikariConfig hikariConfig(String db_user, String pwd) {
        HikariConfig hikariConfig = new HikariConfig();
        // hikariConfig.setDriverClassName("oracle.jdbc.driver.OracleDriver");
        hikariConfig.setDriverClassName("com.mysql.cj.jdbc.Driver");
        hikariConfig.setJdbcUrl("jdbc:mysql://127.0.0.1:3306/database");
        hikariConfig.setUsername(db_user);
        hikariConfig.setPassword(pwd);
        hikariConfig.addDataSourceProperty("useInformationSchema", "true");
        hikariConfig.setMinimumIdle(2);
        hikariConfig.setMaximumPoolSize(5);
        return hikariConfig;
    }

    /**
     * Document generation
     */
    void docGeneration(HikariConfig hikariConfig) {
        DataSource dataSource = new HikariDataSource(hikariConfig);
        EngineConfig engineConfig = EngineConfig.builder()
                // Output directory of the generated file
                .fileOutputDir(DocGeneratorApplication.OUTPUT_DIR)
                // Open the directory when done
                .openOutputDir(true)
                // File type
                .fileType(Config.engineFileType)
                // Template engine implementation
                .produceType(EngineTemplateType.freemarker)
                .build();

        // Tables to ignore
        ArrayList<String> ignoreTableName = new ArrayList<>();
        ignoreTableName.add("test_user");
        // Table prefixes to ignore
        ArrayList<String> ignorePrefix = new ArrayList<>();
        ignorePrefix.add("test_");
        ignorePrefix.add("bak_");
        // Table suffixes to ignore
        ArrayList<String> ignoreSuffix = new ArrayList<>();
        ignoreSuffix.add("_001");
        ignoreSuffix.add("_009");
        ignoreSuffix.add("_copy");

        // Which tables to generate
        ProcessConfig processConfig = ProcessConfig.builder()
                // Generate only the tables with these names
                .designatedTableName(new ArrayList<>())
                // Generate only the tables with these prefixes
                .designatedTablePrefix(new ArrayList<>())
                // Generate only the tables with these suffixes
                .designatedTableSuffix(new ArrayList<>())
                // Ignore by table name
                .ignoreTableName(ignoreTableName)
                // Ignore by table prefix
                .ignoreTablePrefix(ignorePrefix)
                // Ignore by table suffix
                .ignoreTableSuffix(ignoreSuffix).build();

        // Configuration
        Configuration config = Configuration.builder()
                // Version
                .version(Config.version)
                // Description
                .description(Config.docDesc + hikariConfig.getUsername())
                // Data source
                .dataSource(dataSource)
                // Engine configuration
                .engineConfig(engineConfig)
                // Generation configuration
                .produceConfig(processConfig).title(hikariConfig.getUsername())
                .build();
        // Run the generation
        new DocumentationExecute(config).execute();
    }
}
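The ignore lists in the code above are meant to keep test and backup tables out of the generated document. As a rough illustration of that intent, the sketch below applies the same three kinds of rule (exact name, prefix, suffix) to a list of table names. It is not screw's implementation: the class `TableFilterSketch` and the table names other than `test_user` are made up for the example, and screw's actual matching rules (case sensitivity, for instance) may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: shows which tables the ignore rules are intended to drop.
public class TableFilterSketch {

    // A table is ignored if it matches an exact name, a prefix, or a suffix.
    public static boolean isIgnored(String table, List<String> names,
                                    List<String> prefixes, List<String> suffixes) {
        return names.contains(table)
                || prefixes.stream().anyMatch(table::startsWith)
                || suffixes.stream().anyMatch(table::endsWith);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("test_user");
        List<String> prefixes = Arrays.asList("test_", "bak_");
        List<String> suffixes = Arrays.asList("_001", "_009", "_copy");

        // Hypothetical table names for the example
        List<String> tables = Arrays.asList("orders", "test_user", "bak_orders", "orders_copy", "users");
        List<String> documented = tables.stream()
                .filter(t -> !isIgnored(t, names, prefixes, suffixes))
                .collect(Collectors.toList());
        System.out.println(documented); // prints [orders, users]
    }
}
```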