A Simple Guide to Automating Word to PDF Conversion in Python

Casie Liu

Converting Word documents to PDF is practical because PDF is widely supported across systems and platforms and preserves layout reliably, which makes it well suited to sharing, displaying, and printing documents. You can export Word files to PDF by hand, but that is inefficient for large volumes, and simply renaming a file's extension does not produce a valid PDF. This article explains how to convert Word to PDF automatically in Python, along with advanced features like encryption, to help you streamline workflows and save time.


Python Library to Convert Word to PDF

Several Python libraries can automate Word to PDF conversion. Commercial options like Aspose.Words for Python via .NET and Spire.Doc for Python perform conversions on their own, without requiring Microsoft Word to be installed, and offer a wide range of additional features. Alternatively, libraries like comtypes and pywin32 drive Microsoft Word itself through COM automation. This preserves formatting, but it requires Windows with Word installed, which makes these libraries less convenient on servers and other platforms.

In this article, we’ll focus on Spire.Doc for Python because of its intuitive API and dependable technical support. You can install it with pip: pip install Spire.Doc

Convert Word to PDF in Python Directly

With the library installed, let’s turn to the main topic: how to convert Word to PDF in Python. The Document.SaveToFile() method saves a Word document as a PDF, and it also supports other output formats such as XPS, HTML, and RTF.

Steps to turn Word to PDF:

  • Create an instance of the Document class.

  • Load a Word file using the Document.LoadFromFile() method.

  • Save the Word file as a PDF with the Document.SaveToFile() method.

Here is a code example of converting a Word file to a PDF:

from spire.doc import *
from spire.doc.common import *

# Create a Document instance
document = Document()


# Load a doc or docx file
document.LoadFromFile("/sample.docx")

# Save the document to a PDF
document.SaveToFile("/ToPDF.pdf", FileFormat.PDF)
document.Close()
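
Since the goal is to handle many files at once, the single-file steps above can be wrapped in a loop over a folder. The following is a minimal sketch and not part of Spire.Doc: find_word_files and convert_folder are hypothetical helper names, and the actual conversion is passed in as a function (convert_one) so that the folder logic stays independent of the library.

```python
from pathlib import Path
from typing import Callable, List


def find_word_files(input_dir: str) -> List[Path]:
    """Return the .doc and .docx files directly inside input_dir, sorted by name."""
    folder = Path(input_dir)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in {".doc", ".docx"}
        and not p.name.startswith("~$")  # skip the lock files Word creates while a document is open
    )


def convert_folder(input_dir: str, output_dir: str,
                   convert_one: Callable[[str, str], None]) -> List[Path]:
    """Call convert_one(source, target) for each Word file and return the target PDF paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    targets = []
    for source in find_word_files(input_dir):
        # Same base name, .pdf extension (a.doc and a.docx would map to the same target)
        target = out / (source.stem + ".pdf")
        convert_one(str(source), str(target))
        targets.append(target)
    return targets
```

With Spire.Doc installed, convert_one could be a small function that repeats the three steps from the example above for the given paths: create a Document, call LoadFromFile(), call SaveToFile() with FileFormat.PDF, and then Close().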


Automate Word to PDF Conversion in Python with Password Protection

Building on the basic conversion, let’s look at how to customize the process. This section covers PDF encryption, which involves two passwords: an open password, which a reader must enter to view the file, and a permissions password, which controls restrictions such as printing or editing. Together they help protect confidential documents against unauthorized access.

Steps to convert Word to PDF and encrypt the resulting file:

  • Create an instance of the Document class and load a Word document with the Document.LoadFromFile() method.

  • Create a ToPdfParameterList object.

  • Specify the open password and/or permission password.

  • Apply the passwords to the output PDF using the ToPdfParameterList.PdfSecurity.Encrypt() method.

  • Save the Word document as a PDF through the Document.SaveToFile(string fileName, ToPdfParameterList paramList) method.

The code example below shows how to password-protect a PDF when converting a Word file to a PDF:

from spire.doc import *
from spire.doc.common import *

# Create a Document instance
document = Document()

# Load a doc or docx file
document.LoadFromFile("/sample.docx")

# Create a ToPdfParameterList object
parameter = ToPdfParameterList()

# Specify open password and permission password
openPsd = "abc-123"
permissionPsd = "permission"

# Protect the PDF to be generated with an open password and a permission password
parameter.PdfSecurity.Encrypt(openPsd, permissionPsd, PdfPermissionsFlags.Default, PdfEncryptionKeySize.Key128Bit)

# Save the Word document to PDF
document.SaveToFile("/ToPdfWithPassword.pdf", parameter)
document.Close()
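
The example above hardcodes both passwords for brevity. If you would rather generate them, the sketch below uses Python's standard secrets module. generate_password is a hypothetical helper, not a Spire.Doc function, and the default length of 16 characters is an arbitrary assumption.

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password made of ASCII letters and digits."""
    if length < 1:
        raise ValueError("length must be at least 1")
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from a cryptographically strong random source
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Same variable names as in the example above
openPsd = generate_password()
permissionPsd = generate_password()
```

The two values can then be passed to parameter.PdfSecurity.Encrypt() in place of the literal strings. Store them somewhere safe, since the PDF cannot be opened without the open password.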


How to Convert Word to PDF in Python with Embedded Fonts

Embedding fonts keeps a converted PDF looking the same on every device. Without embedded fonts, a viewer that lacks the original typefaces substitutes others, which can shift the layout or produce garbled text. In Python, you can embed fonts by creating a ToPdfParameterList object and setting its IsEmbeddedAllFonts property to True.

Steps to embed fonts when saving Word as PDF:

  • Create a Document object and load a Word document as the source file using the Document.LoadFromFile() method.

  • Create a ToPdfParameterList object.

  • Embed fonts by setting the ToPdfParameterList.IsEmbeddedAllFonts property to True.

  • Save the Word document as a PDF through the Document.SaveToFile() method.

Below is an example of converting a Word file to a PDF and embedding fonts in it:

from spire.doc import *
from spire.doc.common import *

# Create a Document instance
document = Document()

# Load a doc or docx file
document.LoadFromFile("/sample.docx")

# Create a ToPdfParameterList object
parameter = ToPdfParameterList()

# Embed fonts in PDF
parameter.IsEmbeddedAllFonts = True

# Save the Word document to PDF
document.SaveToFile("/topdfEmbedFonts.pdf", parameter)
document.Close()


Transform Word to PDF in Python and Set Image Quality

The Document.JPEGQuality property controls how strongly the images in a Word document are compressed during conversion. Lower values reduce the size of the PDF, which saves storage space and speeds up file transfer, at the cost of image fidelity. This is especially useful for large, image-heavy documents.

Steps to convert Word to PDF and compress images:

  • Create an instance of the Document class and load a Word document with the Document.LoadFromFile() method.

  • Adjust the quality of images in the Word file using the Document.JPEGQuality property.

  • Save the Word file as a PDF with the Document.SaveToFile() method.

The code example below sets the image quality to 40% when converting a Word document to a PDF:

from spire.doc import *
from spire.doc.common import *

# Create a Document instance
document = Document()

# Load a doc or docx file
document.LoadFromFile("/sample.docx")

# Compress the image to 40% of its original quality
document.JPEGQuality = 40

# Preserve original image quality
# document.JPEGQuality = 100

# Save the Word document to a PDF
document.SaveToFile("/toPDFSetImageQuality.pdf", FileFormat.PDF)
document.Close()
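
To see how much a given quality setting actually saves, you can compare the sizes of two output files, for example one saved with JPEGQuality = 100 and one saved with JPEGQuality = 40. This sketch uses only the standard library. size_reduction_percent is a hypothetical helper, and you supply the paths of the two PDFs you want to compare.

```python
import os


def size_reduction_percent(original_path: str, compressed_path: str) -> float:
    """Return how much smaller the compressed file is, as a percentage of the original size."""
    original = os.path.getsize(original_path)
    compressed = os.path.getsize(compressed_path)
    if original == 0:
        raise ValueError("original file is empty")
    return (original - compressed) / original * 100
```

A negative result means the second file is larger than the first, which can happen when a document contains few or no images.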

The Bottom Line

This article covered how to convert Word to PDF using Python, from basic direct conversion to more advanced features like encryption, font embedding, and image quality adjustment. Each section includes step-by-step instructions and a code example for easy reference, so you can adapt them to automate your own Word to PDF conversions.

