Translate an HTML file automatically with Python and the Google Translate API

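To translate an HTML file automatically, you can use Python with Google Translate or another translation service. This walkthrough uses googletrans, an unofficial Python client for Google Translate, and BeautifulSoup to parse the HTML. Here's how you can do it: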
Step 1: Install Required Libraries
Install the necessary Python libraries:
pip install googletrans==4.0.0-rc1 beautifulsoup4
Step 2: Create the Translation Script
Save the following Python script as translate_html.py. It translates every text node in the HTML file:
from bs4 import BeautifulSoup
from bs4.element import Comment, Doctype
from googletrans import Translator

# File paths
input_html = "/media/backup_006_3/workspace_all/books_ws/business.analyst/output.html"
output_html = "/media/backup_006_3/workspace_all/books_ws/business.analyst/output_translated.html"

# Initialize the translator
translator = Translator()

# Read and parse the HTML file
with open(input_html, "r", encoding="utf-8") as file:
    soup = BeautifulSoup(file, "html.parser")

# Translate text nodes
for element in soup.find_all(string=True):
    # Ignore empty or whitespace-only strings
    if not element.strip():
        continue
    # Leave comments, the doctype, and script/style contents untouched
    if isinstance(element, (Comment, Doctype)) or element.parent.name in ("script", "style"):
        continue
    try:
        # Change "es" to the target language code
        translated_text = translator.translate(str(element), src="en", dest="es").text
        element.replace_with(translated_text)
    except Exception as e:
        print(f"Error translating: {element.strip()} - {e}")

# Write the translated HTML to a new file
with open(output_html, "w", encoding="utf-8") as file:
    file.write(str(soup))

print(f"Translated HTML saved to {output_html}")
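One detail worth checking in the script above: a translation may come back without the leading and trailing whitespace of the original string, so words next to inline tags such as <b> can end up glued together. The sketch below shows one way to keep that whitespace. The names translate_preserving_whitespace and fake_translate are hypothetical helpers, not part of googletrans, and fake_translate only stands in for the real call so the example runs offline.

```python
def translate_preserving_whitespace(text, translate):
    """Translate the core of `text`, keeping its leading/trailing whitespace.

    `translate` is any callable that maps a string to its translation
    (hypothetical parameter; in the script it would wrap translator.translate).
    """
    core = text.strip()
    if not core:
        return text  # whitespace-only: nothing to translate
    leading = text[:len(text) - len(text.lstrip())]
    trailing = text[len(text.rstrip()):]
    return leading + translate(core) + trailing


# Stand-in translator so the sketch runs without network access.
def fake_translate(s):
    return {"Hello": "Hola", "world": "mundo"}.get(s, s)


print(repr(translate_preserving_whitespace("  Hello ", fake_translate)))
```

In the script, the callable could be lambda t: translator.translate(t, src="en", dest="es").text, with the result passed to element.replace_with.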
Step 3: Execute the Script
Run the script:
python3 translate_html.py
Step 4: Verify the Translated HTML
Open the translated HTML file in your default browser to verify the result (xdg-open is the Linux command; on macOS use open instead):
xdg-open /media/backup_006_3/workspace_all/books_ws/business.analyst/output_translated.html
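Besides looking at the page, a rough automated check can confirm that only the text changed and the tags did not. The sketch below compares the sequence of tags in two HTML strings using the standard library's html.parser. TagCollector, tag_sequence, and same_structure are hypothetical names introduced for this example. Since BeautifulSoup may repair malformed markup when it writes the file, a mismatch is a hint to look closer, not proof of a translation problem.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the sequence of opening and closing tags in a document."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br> so the two spellings compare as equal.
        self.tags.append(("start", tag))


def tag_sequence(html_text):
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags


def same_structure(original_html, translated_html):
    """True if both documents contain the same tags in the same order."""
    return tag_sequence(original_html) == tag_sequence(translated_html)


original = "<p>Hello <b>world</b><br></p>"
translated = "<p>Hola <b>mundo</b><br/></p>"
print(same_structure(original, translated))
```

To apply it to the files from the script, read input_html and output_html into strings and pass them to same_structure.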
Customization
Change the source (src="en") and destination (dest="es") languages to your preferences. Use language codes like fr for French, de for German, etc.
If you encounter API rate limits, split the HTML into smaller parts and process them individually.
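As one possible way to handle rate limits, the sketch below processes the text in small batches, pauses between batches, and retries a failed call with an increasing delay. Note that it batches strings instead of splitting the HTML file itself, and the retry with backoff is an addition to the advice above. All function names, batch sizes, and delays here are assumptions chosen for illustration, and translate is any callable that returns the translated string.

```python
import time


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_with_retry(text, translate, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call translate(text), waiting longer after each failed attempt."""
    for attempt in range(retries):
        try:
            return translate(text)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # delay doubles on each retry


def translate_in_batches(texts, translate, batch_size=20, pause=1.0, sleep=time.sleep):
    """Translate `texts` batch by batch, pausing after each batch."""
    results = []
    for batch in chunked(texts, batch_size):
        results.extend(translate_with_retry(t, translate, sleep=sleep) for t in batch)
        sleep(pause)
    return results


# Offline demo: an upper-casing stand-in translator and a no-op sleep.
print(translate_in_batches(["a", "b", "c"], str.upper, batch_size=2, sleep=lambda s: None))
```

With googletrans, the callable could be lambda t: translator.translate(t, src="en", dest="es").text, and texts would be the non-empty strings collected from soup.find_all(string=True).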
