Search

Formatting python-created Spreadsheets

Excel is a powerful tool that helps organize, store and visualize information effectively. However, the monotonous nature of some excel tasks can make working with the application a cumbersome experience. A plethora of excel operations like copying data, searching for specific rows, performing complex calculations can be automated via python which makes it a brilliant choice for automating excel operations especially when manipulating excel files is a part of a larger pythonic automation - like scraping web data, processing it, creating an excel report and auto-delivering it via email. Having said that, the excel files that are code-created can look quite unappealing with overlapping columns and no formatting. This undermines the viewing experience, especially when the reports are sent over to recipients via automated emails. Hence, it is only prudent that the excel file is appropriately formatted after it is code-created.


As an example, let us create an excel sheet using python to see how our resulting spreadsheet appears

import pandas as pd

data = [
            ["Saint Bernard", 190, 110, "Fair Working/Obedience Intelligence", 0.3],
            ["Mastiff", 190, 175, "Lowest Degree of Working/Obedience Intelligence", pd.NA],
            ["Great Dane", 160, 120, "Average Working/Obedience Intelligence", 0.5],
            ["Rottweiler", 110, 90, "Brightest Dogs", 0.95],
            ["Labrador Retriever", 80, 55, "Brightest Dogs", 0.95],
            ["Golden Retriever", 75, 55, "Brightest Dogs", 0.95],
            ["Belgian Sheepdog", 75, 60, "Excellent Working Dogs", 0.85],
            ["Dalmatian", 70, 45, "Above Average Working Dogs", 0.7],
        ]

df = pd.DataFrame(data, columns=["Breed", "weight_high_lbs", "weight_low_lbs", "Classification", "Obey"])
df.to_excel("dog breeds by weight and intelligence.xlsx", index=False)

The excel:

The resulting excel file's columns have a default width irrespective of the length of the data that is in them. This presents the report with a data that is incomprehensible at the first glance until columns are manually adjusted later on.


Let us try to format this report so that the data appears refined while also staying comprehensible.

from openpyxl.styles import Font, PatternFill
from openpyxl import load_workbook

workbook = load_workbook(filename="dog breeds by weight and intelligence.xlsx")

for sheet in workbook.sheetnames:
    active_sheet = workbook[sheet]
    rd = active_sheet.row_dimensions[1]
    rd.height = 18  # Fixed Column Height
    
    fill = PatternFill(start_color='c5d9f1', end_color='c5d9f1', fill_type="solid")
    # Applying a blue background to the header to distinguish it from the other rows 
    for cell in active_sheet["1:1"]:
        cell.fill = fill  
        cell.font = Font(size=11, bold=True)

    # Adjusting column widths automatically - 
    # according to the width of the column's largest cell
    for col in active_sheet.columns:
        max_length = 0
        column = col[0].column_letter
        for cell in col:
            try:
                if len(str(cell.value)) > max_length:
                    max_length = len(str(cell.value))
            except:
                pass

            adjusted_width = (max_length + 2) * 1.1
            active_sheet.column_dimensions[column].width = adjusted_width

workbook.save(filename="dog breeds by weight and intelligence.xlsx")

The resulting excel:

This snippet can be utilized to make excel reports look intelligible and polished while saving manual effort at formatting the files later on.