A Complete Tutorial and Code Walkthrough for Two-Way Conversion Between Word Documents and JSON in Python

Updated: 2025-12-18 08:23:15 Author: 東方佑

Office automation and data-processing workflows increasingly need to convert between Word documents and JSON. This article shows how to convert .docx files to JSON and back in Python, and provides a complete working script.

1. Overview and Use Cases

Converting between Word documents and JSON is useful in several scenarios:

Content extraction and analysis: convert Word content into structured JSON for downstream processing and analysis
Automated report generation: fill JSON data into a predefined Word template
Format conversion: use JSON as an intermediate step between Word and other formats such as Markdown or HTML
Content management systems: version-control document content and store it in structured form

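2. Core Techniques and Library Choices

The conversion relies mainly on the following Python libraries:

python-docx: the most widely used library for reading and writing Word .docx files
json: the Python standard-library module for JSON data

Compared with alternatives such as Simplify-Docx or the FastMCP framework, using python-docx directly offers more flexibility and control, which suits cases that need fine-grained handling of document styles.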

3. Implementation Walkthrough

3.1 Extracting JSON Data from a Word Document

The docx_to_json function converts a Word document into structured JSON data. Its core logic is as follows:

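from docx import Document
from docx.enum.style import WD_STYLE_TYPE


def docx_to_json(docx_path):
    document = Document(docx_path)
    doc_data = {
        "paragraphs": [],
        "styles": [],
        "tables": []
    }

    # Extract document style information
    styles = document.styles
    for style in styles:
        if style.type == WD_STYLE_TYPE.PARAGRAPH:
            style_info = {}
            # Record only the style attributes that are actually set
            if style.name:
                style_info["name"] = style.name
            if style.font.name:
                style_info["font_name"] = style.font.name
            # More style attributes are extracted in the full listing...

            if style_info:
                doc_data["styles"].append(style_info)

    # Paragraph and table extraction are omitted in this excerpt
    return doc_data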

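Besides the text, this approach records the style attributes that are explicitly set, so the JSON carries the information needed to rebuild the document's basic formatting. Attributes that are not set directly are skipped rather than stored as null.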
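The "record only what is set" rule can also be applied after the fact, on a dict that has already been built. The following is a minimal sketch and not part of the original script: prune_empty is a hypothetical helper that recursively drops None values and empty containers while keeping explicit False and 0, which are real settings for attributes such as bold.

```python
def prune_empty(value):
    """Recursively drop None values and empty dicts/lists.

    Explicit False and 0 are kept, because "bold": False is a real setting.
    prune_empty is a hypothetical helper, not part of the original script.
    """
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            item = prune_empty(item)
            if item is None or item == {} or item == []:
                continue
            cleaned[key] = item
        return cleaned
    if isinstance(value, list):
        cleaned_items = [prune_empty(item) for item in value]
        return [
            item for item in cleaned_items
            if item is not None and item != {} and item != []
        ]
    return value


# Example data shaped like one entry of the "styles" list
style_info = {
    "name": "Normal",
    "bold": False,
    "font_name": None,
    "paragraph_format": {"alignment": None},
}
print(prune_empty(style_info))
```

Here font_name and the emptied paragraph_format are removed, while "bold": False survives.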

3.2 Rebuilding a Word Document from JSON Data

The json_to_docx function performs the reverse conversion. Its key points are:

from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH


def json_to_docx(json_data, output_path):
    document = Document()

    # Paragraphs and text styles
    for para_data in json_data.get("paragraphs", []):
        style_name = para_data.get("style", "Normal")
        paragraph = document.add_paragraph()
        try:
            paragraph.style = style_name
        except (KeyError, ValueError):
            # Unknown style name: fall back to "Normal"
            paragraph.style = "Normal"

        # Paragraph alignment (docx_to_json stores it under "paragraph_format")
        alignment_str = para_data.get("paragraph_format", {}).get("alignment")
        if alignment_str:
            if "CENTER" in alignment_str:
                paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
            # Other alignment values are handled in the full listing...

        # Text runs and their styles
        for run_data in para_data.get("runs", []):
            run = paragraph.add_run(run_data.get("text", ""))
            # Bold, italic, underline and so on; missing keys default to False
            run.bold = run_data.get("bold", False)
            run.italic = run_data.get("italic", False)
            # More style settings in the full listing...

    document.save(output_path)

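The implementation fills in defaults for style attributes that are missing from the JSON (for example, bold and italic default to False), so the generated document stays readable even when the input is incomplete.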
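Matching the alignment by substring is order-sensitive, because JUSTIFY is itself a substring of JUSTIFY_MED. A minimal sketch of a stricter parse follows. It assumes the stored string is the str() of an enum member, for example "CENTER (1)", and parse_alignment_name is a hypothetical helper that is not part of the original script.

```python
import re

# Member names handled by the article's json_to_docx
KNOWN_ALIGNMENTS = {"LEFT", "CENTER", "RIGHT", "JUSTIFY", "DISTRIBUTE", "JUSTIFY_MED"}


def parse_alignment_name(alignment_str):
    """Return the alignment member name found in alignment_str, or None.

    Whole tokens are compared, so "JUSTIFY_MED (5)" is not mistaken for
    "JUSTIFY". Hypothetical helper; the input format is an assumption.
    """
    for token in re.findall(r"[A-Z_]+", alignment_str or ""):
        if token in KNOWN_ALIGNMENTS:
            return token
    return None


print(parse_alignment_name("JUSTIFY_MED (5)"))
print(parse_alignment_name("CENTER (1)"))
print(parse_alignment_name("unknown"))
```

With python-docx imported, the returned name could then be resolved with getattr(WD_ALIGN_PARAGRAPH, name) and assigned to paragraph.alignment.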

3.3 Table Handling

The code also handles Word tables:

# Fragment from json_to_docx: document and json_data are defined there
for table_data in json_data.get("tables", []):
    if table_data.get("rows"):
        # Create the table dynamically; the first row sets the column count
        first_row = table_data["rows"][0]
        num_rows = len(table_data["rows"])
        num_cols = len(first_row["cells"]) if first_row.get("cells") else 1

        table = document.add_table(rows=num_rows, cols=num_cols)

        # Fill in the table contents
        for i, row_data in enumerate(table_data["rows"]):
            row = table.rows[i]
            for j, cell_data in enumerate(row_data.get("cells", [])):
                if j < len(row.cells):
                    cell = row.cells[j]
                    cell.text = cell_data.get("text", "")

Tables are created dynamically: the row count and column count are derived from the JSON data (the column count from the first row), so the rebuilt table follows the stored structure.
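Because the column count comes from the first row, cells in any longer row are skipped by the j < len(row.cells) check. A minimal sketch of one alternative sizes the table by its widest row instead. table_shape is a hypothetical helper, not part of the original script, and it works on the plain JSON data only.

```python
def table_shape(table_data):
    """Return (num_rows, num_cols) for a table dict shaped like the article's JSON.

    The column count is that of the widest row rather than the first row.
    Hypothetical helper, not part of the original script.
    """
    rows = table_data.get("rows", [])
    num_cols = max((len(row.get("cells", [])) for row in rows), default=0)
    return len(rows), num_cols


# Illustrative data: the second row is wider than the first
ragged = {
    "rows": [
        {"cells": [{"text": "Name"}, {"text": "Qty"}]},
        {"cells": [{"text": "Bolt"}, {"text": "4"}, {"text": "spare"}]},
    ]
}
print(table_shape(ragged))
```

The resulting pair could be passed to document.add_table(rows=..., cols=...) in place of the first-row calculation.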

4. Usage

4.1 Environment Setup

First install the required dependency:

pip install python-docx

4.2 Basic Usage

Convert a Word document to JSON:

import json

from docx_to_json_converter import docx_to_json

# Convert the Word document to JSON data
json_data = docx_to_json("example.docx")

# Save the JSON file
with open("document_data.json", "w", encoding="utf-8") as f:
    json.dump(json_data, f, ensure_ascii=False, indent=2)

Rebuild a Word document from the JSON data:

import json

from docx_to_json_converter import json_to_docx

# Load the JSON data
with open("document_data.json", "r", encoding="utf-8") as f:
    json_data = json.load(f)

# Convert it back to a Word document
json_to_docx(json_data, "restored.docx")
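For orientation, the JSON file exchanged between the two snippets has roughly the shape sketched below. The values are invented for illustration and the alignment string format is an assumption; only the key names are taken from the converter's code.

```python
import json

# Illustrative data only: key names follow the converter, values are invented
sample = """
{
  "paragraphs": [
    {
      "text": "Quarterly Report",
      "style": "Heading 1",
      "paragraph_format": {"alignment": "CENTER (1)"},
      "runs": [{"text": "Quarterly Report", "bold": true, "font_size": 16.0}]
    }
  ],
  "styles": [{"name": "Normal", "type": "paragraph", "font_name": "Calibri"}],
  "tables": [
    {"rows": [{"cells": [{"text": "Name"}, {"text": "Qty"}]}]}
  ]
}
"""

doc_data = json.loads(sample)

# Walk the paragraphs the same way json_to_docx does: style first, then runs
for para in doc_data["paragraphs"]:
    run_texts = [run.get("text", "") for run in para.get("runs", [])]
    print(para["style"], run_texts)
```

Editing such a file by hand (for example, changing a run's text) and feeding it back to json_to_docx is one simple way to produce a modified document.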

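4.3 Advanced Usage

The script also provides an interactive command-line interface. Run it directly and choose the conversion direction:

python docx_to_json_converter.py

At the prompt, choose the operation (1 or 2) and then enter the file path to run the conversion.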
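The interactive script names its output after the path you enter: a .json file next to the .docx, or a _restored.docx file next to the .json. A minimal sketch of that naming step using pathlib follows. derive_output_path and its direction values are hypothetical names chosen for this sketch, not part of the original script.

```python
from pathlib import Path


def derive_output_path(input_path, direction):
    """Build the output path for a conversion.

    direction is "to_json" or "to_docx" (names chosen for this sketch).
    Only the final suffix is replaced, so directory names that happen to
    contain ".docx" or ".json" are left alone.
    """
    path = Path(input_path)
    if direction == "to_json":
        return path.with_suffix(".json")
    if direction == "to_docx":
        return path.with_name(path.stem + "_restored.docx")
    raise ValueError(f"unknown direction: {direction!r}")


print(derive_output_path("reports.docx/q1.docx", "to_json"))
print(derive_output_path("data/q1.json", "to_docx"))
```

Working on the suffix rather than on the whole string also means an input without the expected extension never resolves to the input file itself.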

5. Extensions and Advanced Techniques

5.1 Reusing Style Templates

In practice, a template-reuse mechanism can improve efficiency:

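from docx_to_json_converter import docx_to_json


# Build a style template
def create_style_template(docx_path):
    json_data = docx_to_json(docx_path)
    # Extract the style information and attach metadata (placeholder values)
    template = {
        "styles": json_data["styles"],
        "metadata": {"created_time": "2023-01-01", "type": "report"}
    }
    return template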

This approach is particularly suited to generating standardized documents in bulk, such as reports and contracts.
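To reuse a template across runs it has to be stored somewhere. A minimal sketch, assuming the template is kept as a JSON file: build_template, save_template and load_template are hypothetical helpers, and the styles list stands in for the "styles" entry that docx_to_json would return.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def build_template(styles, doc_type):
    """Wrap extracted styles with metadata stamped with today's date."""
    return {
        "styles": styles,
        "metadata": {"created_time": date.today().isoformat(), "type": doc_type},
    }


def save_template(template, path):
    """Write the template to disk as UTF-8 JSON."""
    text = json.dumps(template, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def load_template(path):
    """Read a template previously written by save_template."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Stand-in for docx_to_json(...)["styles"]
styles = [{"name": "Normal", "type": "paragraph", "font_name": "Calibri"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    template_path = Path(tmp_dir) / "report_template.json"
    save_template(build_template(styles, "report"), template_path)
    restored = load_template(template_path)

print(restored["metadata"]["type"], len(restored["styles"]))
```

Stamping created_time at build time avoids a hard-coded date going stale; whether the metadata needs more fields depends on the workflow.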

5.2 Integrating with LangChain

The tool can be combined with AI frameworks such as LangChain for intelligent document processing:

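# Requires the langchain-community and docx2txt packages
from langchain_community.document_loaders import Docx2txtLoader

# Load the generated Word document
loader = Docx2txtLoader("restored.docx")
documents = loader.load()

# Continue with text analysis, question answering, or other AI processing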

Combining the two opens up further possibilities, such as automatic summarization and content classification.

6. Performance Tips

Large files: process large Word documents in chunks to avoid running out of memory
Caching: cache frequently used style templates to speed up conversion
Batch processing: convert large numbers of documents in parallel
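The batch-processing tip can be sketched with the standard library's concurrent.futures. This is a minimal illustration under stated assumptions: convert_many is a hypothetical helper, and fake_convert stands in for docx_to_json so that the example runs without any Word files.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_many(paths, convert_one, max_workers=4):
    """Apply convert_one to every path concurrently and return {path: result}.

    convert_one is any callable taking a path, for example docx_to_json.
    A failure in one document is recorded as the exception object instead
    of stopping the whole batch.
    """
    paths = list(paths)

    def safe_convert(path):
        try:
            return convert_one(path)
        except Exception as exc:  # report the error per file and keep going
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_convert, paths))
    return dict(zip(paths, results))


def fake_convert(path):
    """Stand-in converter used only for this demonstration."""
    if not path.endswith(".docx"):
        raise ValueError(f"not a docx file: {path}")
    return {"paragraphs": [], "styles": [], "tables": [], "source": path}


batch = convert_many(["a.docx", "b.docx", "notes.txt"], fake_convert)
for path, result in batch.items():
    print(path, "failed" if isinstance(result, Exception) else "ok")
```

Threads keep the sketch simple. Since document parsing is likely CPU-bound, a ProcessPoolExecutor with the same map interface may scale better, at the cost of requiring a picklable, module-level converter function.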

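7. Summary

The Word-to-JSON two-way conversion approach described here has the following strengths:

Coverage: converts the core elements of a Word document, including text, styles, and tables
Flexibility: usable both as an API and from the command line
Practicality: the code runs as-is and is easy to extend

A tool like this is valuable for document automation, content management systems, and data migration. Integrating other tools (such as pandoc or OCR) can extend it to further document-processing tasks.

The complete code is listed below; copy it as-is or adapt it as needed.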

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
Docx to JSON and JSON to Docx converter
Extracts the styles and content of a docx file into a JSON object, and rebuilds a docx file from such a JSON object
"""

import json
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.style import WD_STYLE_TYPE
from docx.shared import RGBColor, Pt
from docx.oxml.ns import qn
import os


def docx_to_json(docx_path):
    """
    Convert a docx file to JSON data (returned as a dict).
    Style attributes whose value is None are omitted.
    """
    document = Document(docx_path)

    # Dictionary holding everything that is extracted
    doc_data = {
        "paragraphs": [],
        "styles": [],
        "tables": []
    }

    # Collect all paragraph styles
    styles = document.styles
    for style in styles:
        if style.type == WD_STYLE_TYPE.PARAGRAPH:
            style_info = {}
            # Add only attributes that are set
            if style.name:
                style_info["name"] = style.name
            if style.type:
                style_info["type"] = "paragraph"
            if style.font.name:
                style_info["font_name"] = style.font.name
            if style.font.size:
                style_info["font_size"] = style.font.size.pt
            if style.font.bold is not None:
                style_info["bold"] = style.font.bold
            if style.font.italic is not None:
                style_info["italic"] = style.font.italic
            if style.font.underline is not None:
                style_info["underline"] = style.font.underline
            if style.font.color.rgb:
                style_info["color"] = str(style.font.color.rgb)

            # Paragraph format information
            if style.paragraph_format:
                paragraph_format = {}
                if style.paragraph_format.alignment is not None:
                    paragraph_format["alignment"] = str(style.paragraph_format.alignment)
                if style.paragraph_format.left_indent:
                    paragraph_format["left_indent"] = style.paragraph_format.left_indent.pt
                if style.paragraph_format.right_indent:
                    paragraph_format["right_indent"] = style.paragraph_format.right_indent.pt
                if style.paragraph_format.first_line_indent:
                    paragraph_format["first_line_indent"] = style.paragraph_format.first_line_indent.pt
                if style.paragraph_format.space_before:
                    paragraph_format["space_before"] = style.paragraph_format.space_before.pt
                if style.paragraph_format.space_after:
                    paragraph_format["space_after"] = style.paragraph_format.space_after.pt
                # Skip line_spacing values above 100 to avoid overflow
                if style.paragraph_format.line_spacing and style.paragraph_format.line_spacing <= 100:
                    paragraph_format["line_spacing"] = style.paragraph_format.line_spacing
                if style.paragraph_format.keep_with_next is not None:
                    paragraph_format["keep_with_next"] = style.paragraph_format.keep_with_next
                if style.paragraph_format.keep_together is not None:
                    paragraph_format["keep_together"] = style.paragraph_format.keep_together
                if style.paragraph_format.page_break_before is not None:
                    paragraph_format["page_break_before"] = style.paragraph_format.page_break_before
                if style.paragraph_format.widow_control is not None:
                    paragraph_format["widow_control"] = style.paragraph_format.widow_control
                
                if paragraph_format:
                    style_info["paragraph_format"] = paragraph_format

            # Append style_info only when it is not empty
            if style_info:
                doc_data["styles"].append(style_info)

    # Collect all paragraphs
    for para in document.paragraphs:
        para_info = {}

        # Add only attributes that are set
        if para.text:
            para_info["text"] = para.text
        if para.style and para.style.name:
            para_info["style"] = para.style.name
        
        # Paragraph format information
        if para.paragraph_format:
            paragraph_format = {}
            if para.paragraph_format.alignment is not None:
                paragraph_format["alignment"] = str(para.paragraph_format.alignment)
            if para.paragraph_format.left_indent:
                paragraph_format["left_indent"] = para.paragraph_format.left_indent.pt
            if para.paragraph_format.right_indent:
                paragraph_format["right_indent"] = para.paragraph_format.right_indent.pt
            if para.paragraph_format.first_line_indent:
                paragraph_format["first_line_indent"] = para.paragraph_format.first_line_indent.pt
            if para.paragraph_format.space_before:
                paragraph_format["space_before"] = para.paragraph_format.space_before.pt
            if para.paragraph_format.space_after:
                paragraph_format["space_after"] = para.paragraph_format.space_after.pt
            # Skip line_spacing values above 100 to avoid overflow
            if para.paragraph_format.line_spacing and para.paragraph_format.line_spacing <= 100:
                paragraph_format["line_spacing"] = para.paragraph_format.line_spacing
            if para.paragraph_format.keep_with_next is not None:
                paragraph_format["keep_with_next"] = para.paragraph_format.keep_with_next
            if para.paragraph_format.keep_together is not None:
                paragraph_format["keep_together"] = para.paragraph_format.keep_together
            if para.paragraph_format.page_break_before is not None:
                paragraph_format["page_break_before"] = para.paragraph_format.page_break_before
            if para.paragraph_format.widow_control is not None:
                paragraph_format["widow_control"] = para.paragraph_format.widow_control
            
            if paragraph_format:
                para_info["paragraph_format"] = paragraph_format

        # Process runs
        runs_list = []
        for run in para.runs:
            run_info = {}

            # Add only attributes that are set
            if run.text:
                run_info["text"] = run.text
            if run.bold is not None:
                run_info["bold"] = run.bold
            if run.italic is not None:
                run_info["italic"] = run.italic
            if run.underline is not None:
                run_info["underline"] = run.underline
            if run.font.name:
                run_info["font_name"] = run.font.name
            if run.font.size:
                run_info["font_size"] = run.font.size.pt
            if run.font.color.rgb:
                run_info["color"] = str(run.font.color.rgb)
            if run.font.highlight_color:
                run_info["highlight_color"] = str(run.font.highlight_color)
            if run.font.strike is not None:
                run_info["strike"] = run.font.strike
            if run.font.superscript is not None:
                run_info["superscript"] = run.font.superscript
            if run.font.subscript is not None:
                run_info["subscript"] = run.font.subscript
            if run.font.all_caps is not None:
                run_info["all_caps"] = run.font.all_caps
            if run.font.small_caps is not None:
                run_info["small_caps"] = run.font.small_caps

            # Append run_info only when it is not empty
            if run_info:
                runs_list.append(run_info)

        if runs_list:
            para_info["runs"] = runs_list

        # Append para_info only when it is not empty
        if para_info:
            doc_data["paragraphs"].append(para_info)

    # Collect all tables
    for table in document.tables:
        table_info = {
            "rows": []
        }

        # Table attributes
        if hasattr(table, 'style') and table.style:
            table_info["style"] = table.style.name

        for row in table.rows:
            row_info = {
                "cells": []
            }

            for cell in row.cells:
                cell_info = {}

                # Add only attributes that are set
                if cell.text:
                    cell_info["text"] = cell.text

                paragraphs_list = []
                # Collect the paragraphs inside the cell
                for para in cell.paragraphs:
                    para_dict = {}
                    if para.text:
                        para_dict["text"] = para.text
                    if para.style and para.style.name:
                        para_dict["style"] = para.style.name
                    
                    # Paragraph format information
                    if para.paragraph_format:
                        paragraph_format = {}
                        if para.paragraph_format.alignment is not None:
                            paragraph_format["alignment"] = str(para.paragraph_format.alignment)
                        if para.paragraph_format.left_indent:
                            paragraph_format["left_indent"] = para.paragraph_format.left_indent.pt
                        if para.paragraph_format.right_indent:
                            paragraph_format["right_indent"] = para.paragraph_format.right_indent.pt
                        if para.paragraph_format.first_line_indent:
                            paragraph_format["first_line_indent"] = para.paragraph_format.first_line_indent.pt
                        if para.paragraph_format.space_before:
                            paragraph_format["space_before"] = para.paragraph_format.space_before.pt
                        if para.paragraph_format.space_after:
                            paragraph_format["space_after"] = para.paragraph_format.space_after.pt
                        # Skip line_spacing values above 100 to avoid overflow
                        if para.paragraph_format.line_spacing and para.paragraph_format.line_spacing <= 100:
                            paragraph_format["line_spacing"] = para.paragraph_format.line_spacing
                        
                        if paragraph_format:
                            para_dict["paragraph_format"] = paragraph_format

                    if para_dict:
                        paragraphs_list.append(para_dict)

                if paragraphs_list:
                    cell_info["paragraphs"] = paragraphs_list

                if cell_info:
                    row_info["cells"].append(cell_info)

            if row_info["cells"]:
                table_info["rows"].append(row_info)

        if table_info["rows"]:
            doc_data["tables"].append(table_info)

    return doc_data


def json_to_docx(json_data, output_path):
    """
    Convert JSON data to a docx file.
    Missing style attributes are given default values.
    """
    document = Document()

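    # Add paragraphs
    for para_data in json_data.get("paragraphs", []):
        # Use the "Normal" style when none is given
        style_name = para_data.get("style", "Normal")
        paragraph = document.add_paragraph()
        try:
            paragraph.style = style_name
        except (KeyError, ValueError):
            # Unknown style name: fall back to "Normal" on the same paragraph
            paragraph.style = "Normal"

        # Apply the paragraph format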
        paragraph_format_data = para_data.get("paragraph_format", {})
        if paragraph_format_data:
            # Paragraph alignment
            alignment_str = paragraph_format_data.get("alignment")
            if alignment_str:
                # The stored string contains the enum member name. JUSTIFY_MED is
                # tested before JUSTIFY because it contains JUSTIFY as a substring.
                if "LEFT" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
                elif "CENTER" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
                elif "RIGHT" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
                elif "JUSTIFY_MED" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY_MED
                elif "JUSTIFY" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY
                elif "DISTRIBUTE" in alignment_str:
                    paragraph.alignment = WD_ALIGN_PARAGRAPH.DISTRIBUTE

            # Paragraph spacing and indentation
            if "left_indent" in paragraph_format_data:
                paragraph.paragraph_format.left_indent = Pt(paragraph_format_data["left_indent"])
            if "right_indent" in paragraph_format_data:
                paragraph.paragraph_format.right_indent = Pt(paragraph_format_data["right_indent"])
            if "first_line_indent" in paragraph_format_data:
                paragraph.paragraph_format.first_line_indent = Pt(paragraph_format_data["first_line_indent"])
            if "space_before" in paragraph_format_data:
                paragraph.paragraph_format.space_before = Pt(paragraph_format_data["space_before"])
            if "space_after" in paragraph_format_data:
                paragraph.paragraph_format.space_after = Pt(paragraph_format_data["space_after"])
            # Skip line_spacing values above 100 to avoid overflow
            if "line_spacing" in paragraph_format_data and paragraph_format_data["line_spacing"] <= 100:
                paragraph.paragraph_format.line_spacing = paragraph_format_data["line_spacing"]
            if "keep_with_next" in paragraph_format_data:
                paragraph.paragraph_format.keep_with_next = paragraph_format_data["keep_with_next"]
            if "keep_together" in paragraph_format_data:
                paragraph.paragraph_format.keep_together = paragraph_format_data["keep_together"]
            if "page_break_before" in paragraph_format_data:
                paragraph.paragraph_format.page_break_before = paragraph_format_data["page_break_before"]
            if "widow_control" in paragraph_format_data:
                paragraph.paragraph_format.widow_control = paragraph_format_data["widow_control"]

        # Clear any default content before adding runs
        paragraph.clear()

        # Process runs
        runs_data = para_data.get("runs", [])
        if runs_data:
            for run_data in runs_data:
                text = run_data.get("text", "")
                run = paragraph.add_run(text)

                # Run attributes; missing keys default to False
                run.bold = run_data.get("bold", False)
                run.italic = run_data.get("italic", False)
                run.underline = run_data.get("underline", False)
                run.font.strike = run_data.get("strike", False)
                run.font.superscript = run_data.get("superscript", False)
                run.font.subscript = run_data.get("subscript", False)
                run.font.all_caps = run_data.get("all_caps", False)
                run.font.small_caps = run_data.get("small_caps", False)

                # Font size; defaults to Pt(12)
                font_size = run_data.get("font_size")
                if font_size:
                    run.font.size = Pt(font_size)
                else:
                    run.font.size = Pt(12)

                # Font name; when missing, the default font is kept
                font_name = run_data.get("font_name")
                if font_name:
                    run.font.name = font_name
                    run._element.rPr.rFonts.set(qn('w:eastAsia'), font_name)

                # Font color; when missing or invalid, the default color is kept
                color = run_data.get("color")
                if color and color != "None":
                    try:
                        run.font.color.rgb = RGBColor.from_string(color)
                    except (ValueError, TypeError):
                        # Malformed color string: keep the default color
                        pass

                # Highlight color
                highlight_color = run_data.get("highlight_color")
                if highlight_color and highlight_color != "None":
                    # Simplified: the highlight is not restored here. A full
                    # implementation would map the string to a WD_COLOR_INDEX value.
                    pass
        else:
            # No runs data: add the paragraph text directly
            text = para_data.get("text", "")
            run = paragraph.add_run(text)
            # Apply the default font size
            run.font.size = Pt(12)

    # Add tables
    for table_data in json_data.get("tables", []):
        if table_data.get("rows"):
            # Create the table; the column count is taken from the first row
            first_row = table_data["rows"][0]
            num_rows = len(table_data["rows"])
            num_cols = len(first_row["cells"]) if first_row.get("cells") else 1

            table = document.add_table(rows=num_rows, cols=num_cols)
            
            # Table style
            table_style = table_data.get("style")
            if table_style:
                try:
                    table.style = table_style
                except (KeyError, ValueError):
                    # Style not available: keep the default table style
                    pass

            # Fill in the table contents
            for i, row_data in enumerate(table_data["rows"]):
                row = table.rows[i]
                for j, cell_data in enumerate(row_data.get("cells", [])):
                    if j < len(row.cells):
                        cell = row.cells[j]
                        cell.text = cell_data.get("text", "")

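                        # Paragraphs inside the cell
                        cell_paragraphs = cell_data.get("paragraphs", [])
                        if cell_paragraphs:
                            # Clear the cell's initial paragraph
                            cell.paragraphs[0].clear()

                            # Add paragraphs; the first entry reuses the initial
                            # paragraph so no empty paragraph is left at the top
                            for k, para_data in enumerate(cell_paragraphs):
                                para = cell.paragraphs[0] if k == 0 else cell.add_paragraph()
                                para.text = para_data.get("text", "")

                                # Paragraph style
                                para_style = para_data.get("style")
                                if para_style:
                                    try:
                                        para.style = para_style
                                    except (KeyError, ValueError):
                                        pass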
                                
                                # Apply the paragraph format
                                paragraph_format_data = para_data.get("paragraph_format", {})
                                if paragraph_format_data:
                                    # Paragraph alignment
                                    alignment_str = paragraph_format_data.get("alignment")
                                    if alignment_str:
                                        # JUSTIFY_MED is tested before JUSTIFY because
                                        # it contains JUSTIFY as a substring
                                        if "LEFT" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.LEFT
                                        elif "CENTER" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.CENTER
                                        elif "RIGHT" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.RIGHT
                                        elif "JUSTIFY_MED" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY_MED
                                        elif "JUSTIFY" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY
                                        elif "DISTRIBUTE" in alignment_str:
                                            para.alignment = WD_ALIGN_PARAGRAPH.DISTRIBUTE

                                    # Paragraph spacing and indentation
                                    if "left_indent" in paragraph_format_data:
                                        para.paragraph_format.left_indent = Pt(paragraph_format_data["left_indent"])
                                    if "right_indent" in paragraph_format_data:
                                        para.paragraph_format.right_indent = Pt(paragraph_format_data["right_indent"])
                                    if "first_line_indent" in paragraph_format_data:
                                        para.paragraph_format.first_line_indent = Pt(paragraph_format_data["first_line_indent"])
                                    if "space_before" in paragraph_format_data:
                                        para.paragraph_format.space_before = Pt(paragraph_format_data["space_before"])
                                    if "space_after" in paragraph_format_data:
                                        para.paragraph_format.space_after = Pt(paragraph_format_data["space_after"])
                                    # Skip line_spacing values above 100 to avoid overflow
                                    if "line_spacing" in paragraph_format_data and paragraph_format_data["line_spacing"] <= 100:
                                        para.paragraph_format.line_spacing = paragraph_format_data["line_spacing"]

    # Save the document
    document.save(output_path)


def main():
    """
    Entry point demonstrating the conversion functions
    """
    print("Docx Converter")
    print("1. Convert docx to json")
    print("2. Convert json to docx")

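    choice = input("Choose an operation (1 or 2): ")

    if choice == "1":
        docx_path = input("Enter the docx file path: ")
        if not os.path.exists(docx_path):
            print("File not found!")
            return

        json_data = docx_to_json(docx_path)
        # Replace only the extension, so a path without ".docx" can never
        # resolve to the input file itself
        json_path = os.path.splitext(docx_path)[0] + ".json"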

        with open(json_path, "w", encoding="utf-8") as f:
            json.dump(json_data, f, ensure_ascii=False, indent=2)

        print(f"Conversion complete! JSON file saved as: {json_path}")

    elif choice == "2":
        json_path = input("Enter the json file path: ")
        if not os.path.exists(json_path):
            print("File not found!")
            return

        with open(json_path, "r", encoding="utf-8") as f:
            json_data = json.load(f)

        output_path = os.path.splitext(json_path)[0] + "_restored.docx"
        json_to_docx(json_data, output_path)
        print(f"Conversion complete! Docx file saved as: {output_path}")

    else:
        print("Invalid choice!")


if __name__ == "__main__":
    main()
