Excel Translation Skill
Version: 1.0.0
Category: Data Engineering
Triggers: Translate Excel calculation files, Spanish to English conversion
Tools: scripts/python/translate_excel.py
Overview
This skill batch-translates engineering calculation Excel files from Spanish to English. It uses a specialized dictionary of engineering terms and preserves each file's structure, formulas, and formatting.
Capabilities
- Batch Processing: Translates all .xlsx files in a target directory.
- Structure Preservation: Maintains sheet layout, cell locations, and formulas (sheet names themselves are translated).
- Engineering Vocabulary: Uses a curated dictionary for accurate technical translation.
- Safety: Uses regex-based word boundary detection to prevent partial word replacement errors (e.g., preventing "DE" replacement inside "MODEL").
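The word-boundary safeguard can be illustrated with a minimal sketch. The helper name replace_whole_word and the DE to OF mapping are illustrative assumptions, not taken from the script.

```python
import re


def replace_whole_word(text: str, word: str, replacement: str) -> str:
    """Replace `word` only where it stands alone, not inside a longer word."""
    pattern = r"\b" + re.escape(word) + r"\b"
    # A function replacement avoids any backslash processing of `replacement`.
    return re.sub(pattern, lambda _match: replacement, text)


# Naive substring replacement damages the word that contains "DE".
naive = "MODELO DE CARGA".replace("DE", "OF")

# Word-boundary replacement only touches the standalone "DE".
safe = replace_whole_word("MODELO DE CARGA", "DE", "OF")

print(naive)
print(safe)
```

The naive call rewrites the "DE" inside "MODELO" as well, which is the kind of partial-word error the boundary check is meant to prevent.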
Usage
Command
python3 scripts/python/translate_excel.py
Configuration
The script is currently configured to target:
/mnt/github/workspace-hub/doris/62092_sesa/data/calculations
To change the target, modify the target_dir variable in the main() function of the script.
Dictionary
The translation dictionary is embedded in the script. To add new terms, update the replacements dictionary in scripts/python/translate_excel.py.
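As a rough sketch of what adding terms might look like, assuming replacements is a plain dict mapping Spanish strings to English strings. The entries below are illustrative and not copied from the script, and the longest-first ordering is a suggested precaution rather than documented behavior.

```python
# Hypothetical entries; the real dictionary lives in translate_excel.py.
replacements = {
    "PRESION DE DISEÑO": "DESIGN PRESSURE",
    "ESPESOR": "THICKNESS",
    "DE": "OF",
}

# Adding new terms is a matter of extending the mapping.
replacements.update({
    "TUBERIA": "PIPE",
    "CARGA": "LOAD",
})

# Assumption: applying longer keys first lets multi-word phrases match
# before the short words they contain (for example "DE") are replaced.
ordered_terms = sorted(replacements, key=len, reverse=True)

print(ordered_terms)
```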
Technical Details
- Dependencies: openpyxl
- Logic:
  - Iterates through all .xlsx files in the target directory (skipping *_en.xlsx).
  - Loads workbook using openpyxl.
  - Translates sheet names (truncating to 31 chars if necessary).
  - Iterates through all cells with string values.
  - Applies regex-based replacement for short words and phrase-based replacement for longer terms.
  - Saves the translated file with _en.xlsx suffix.
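The steps above can be sketched roughly as follows. This is not the script's actual code: the function names, the sample REPLACEMENTS entries, the SHORT_WORD_MAX cutoff, and the choice to leave formula cells untouched are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical dictionary; the real one is embedded in translate_excel.py.
REPLACEMENTS = {
    "PRESION DE DISEÑO": "DESIGN PRESSURE",
    "ESPESOR": "THICKNESS",
    "DE": "OF",
}
SHORT_WORD_MAX = 3    # assumed cutoff between "short words" and longer terms
MAX_SHEET_TITLE = 31  # Excel's limit on sheet-name length


def translate_text(text: str) -> str:
    """Translate one string, trying the longest terms first."""
    for term in sorted(REPLACEMENTS, key=len, reverse=True):
        target = REPLACEMENTS[term]
        if len(term) <= SHORT_WORD_MAX:
            # Short words: whole-word regex match only.
            text = re.sub(r"\b" + re.escape(term) + r"\b",
                          lambda _m, t=target: t, text)
        else:
            # Longer terms: plain phrase replacement.
            text = text.replace(term, target)
    return text


def output_path(src: Path) -> Path:
    """calc.xlsx -> calc_en.xlsx, in the same directory."""
    return src.with_name(src.stem + "_en.xlsx")


def source_files(target_dir: Path) -> list[Path]:
    """All .xlsx files except previously translated *_en.xlsx outputs."""
    return sorted(p for p in target_dir.glob("*.xlsx")
                  if not p.name.endswith("_en.xlsx"))


def translate_workbook(src: Path) -> Path:
    # Third-party dependency, imported here so the pure helpers above
    # can be used without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(src)  # default data_only=False keeps formula text
    for ws in wb.worksheets:
        ws.title = translate_text(ws.title)[:MAX_SHEET_TITLE]
        for row in ws.iter_rows():
            for cell in row:
                # Assumption: formula cells (data_type "f") are skipped.
                if isinstance(cell.value, str) and cell.data_type != "f":
                    cell.value = translate_text(cell.value)
    dst = output_path(src)
    wb.save(dst)
    return dst
```

A driver would then call translate_workbook on each path returned by source_files(target_dir).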
Verification
After running the translation, verify the output files (ending in _en.xlsx) to ensure:
- Sheet names are legible and correct.
- Technical terms are translated correctly.
- No formula corruption occurred.
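The formula check can be partly automated. The sketch below is an assumption about how one might do it, not part of the skill: collect_formulas and diff_formulas are hypothetical helpers, and sheets are matched by position because their names change during translation.

```python
def diff_formulas(original: dict, translated: dict) -> list:
    """Return (location, before, after) for each formula that changed or vanished."""
    return [
        (loc, formula, translated.get(loc))
        for loc, formula in sorted(original.items())
        if translated.get(loc) != formula
    ]


def collect_formulas(path) -> dict:
    """Map (sheet index, cell coordinate) -> formula text for one workbook."""
    # Third-party dependency, imported here so diff_formulas stays usable alone.
    from openpyxl import load_workbook

    wb = load_workbook(path)  # default data_only=False keeps formula text
    formulas = {}
    for index, ws in enumerate(wb.worksheets):
        for row in ws.iter_rows():
            for cell in row:
                if cell.data_type == "f":
                    # Array formulas are objects; fall back to their text.
                    formulas[(index, cell.coordinate)] = getattr(
                        cell.value, "text", cell.value)
    return formulas
```

Calling diff_formulas(collect_formulas(original_file), collect_formulas(translated_file)) returns an empty list when every formula survived unchanged. Formulas that reference a renamed sheet are still worth checking by hand, since this comparison only looks at the formula text.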