Large Language Models and Multimodal AI for Utilities

The Business Problem: Making Sense of Complex, Unstructured Data

Utilities operate in environments that generate a constant mix of structured and unstructured data. Structured data—such as SCADA readings, AMI meter data, and asset inventories—is well understood and widely used in analytics. However, an equally important layer of information lives in unstructured formats: maintenance logs, inspection notes, incident reports, regulatory filings, and even field images or drone footage.

Historically, these two worlds have remained separate. Engineers interpret logs and inspection notes manually, while structured data feeds into automated reports and dashboards. This split leads to slow, fragmented decision-making. For example, if a transformer fails, SCADA data may show an overload condition, but the underlying issue—like a minor oil leak noted in a maintenance report six months earlier—is buried in text. Without tools to unify these sources, valuable context is lost.

This challenge grows as the grid digitizes. Distributed energy resources, smart devices, and IoT sensors introduce new data streams. Field inspections increasingly involve photos or video from drones. Call center transcripts capture customer complaints that may indicate emerging reliability issues. Utilities need ways to synthesize these diverse data types to support faster, better-informed decisions.

The Analytics Solution: LLMs and Multimodal Integration

Large language models (LLMs) excel at interpreting unstructured text, summarizing documents, and extracting key entities or themes. When combined with structured data, they can create a more complete picture of utility operations. For example, LLMs can read maintenance logs, extract references to specific equipment, and pair those findings with SCADA measurements to create context-rich asset profiles.

Multimodal AI takes this further by handling text, images, and sensor data together. A single model can analyze drone photos for defects, summarize technician notes, and correlate findings with temperature or load history to generate prioritized maintenance recommendations. This integration enables workflows that previously required manual coordination between teams.

Practical Applications

Utilities can use LLMs to automate the review of compliance documents, quickly identifying sections relevant to specific standards. Field inspection notes can be summarized automatically, highlighting critical observations and linking them to asset IDs. Multimodal models can screen solar panel images for cracks and pair results with inverter performance data, pinpointing which defects are reducing output.

In control centers, operators could query LLM-powered systems in natural language to retrieve SCADA trends, past incidents, and maintenance history for any given feeder or substation. This reduces the friction of navigating multiple systems and databases manually.

Transition to the Demo

In this chapter’s demo, we will build a pipeline that uses an LLM to analyze a set of maintenance logs and pair extracted insights with sensor data. We will:

Use an LLM to summarize incident notes and extract asset references.
Join this information to SCADA-like sensor readings for the same equipment.
Demonstrate how combining text and structured data provides richer insights for asset risk assessment.

This example illustrates how LLMs and multimodal AI unlock a new dimension of utility analytics, bridging unstructured and structured data to create comprehensive, actionable intelligence.

pyfile shortcode: missing param 'file'. Example: {{< pyfile file="script.py" >}}

Code

"""
Chapter 17: Advanced AI for Utilities (LLMs and Multimodal Models)
Uses LLMs to analyze incident reports and combine insights with structured sensor data.
"""

import pandas as pd
import numpy as np
from openai import OpenAI

# --- Synthetic Data ---
def generate_incident_logs():
    logs = [
        "Breaker trip occurred at Substation A after storm. Reset required.",
        "Overheating detected on Transformer T-102. High oil temperature noted.",
        "Vegetation contact on feeder line caused temporary outage.",
        "Routine inspection completed with no anomalies detected.",
        "Cybersecurity alert: suspicious login attempts flagged at control center."
    ]
    return logs

def generate_sensor_data():
    np.random.seed(42)
    return pd.DataFrame({
        "TransformerID": ["T-101", "T-102", "T-103", "T-104", "T-105"],
        "Temp_C": np.random.normal(65, 5, 5),
        "Vibration_g": np.random.normal(0.22, 0.03, 5),
        "OilQuality_Index": np.random.uniform(60, 80, 5)
    })

# --- LLM Analysis ---
def analyze_incidents_with_llm(logs):
    """
    Summarize maintenance incident logs and extract root causes using an LLM.
    """
    client = OpenAI()
    prompt = "Analyze these incident logs and summarize root causes:\n\n" + "\n".join(logs)

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": "You are an expert in utility operations."},
                  {"role": "user", "content": prompt}],
        max_tokens=300
    )
    return response.choices[0].message.content

# --- Multimodal Context Fusion ---
def fuse_structured_unstructured(sensor_df, logs_summary):
    """
    Combine sensor context with LLM-driven incident log summary.
    """
    print("\n--- Sensor Data Snapshot ---")
    print(sensor_df)
    print("\n--- Incident Log Analysis ---")
    print(logs_summary)

if __name__ == "__main__":
    logs = generate_incident_logs()
    sensors = generate_sensor_data()
    summary = analyze_incidents_with_llm(logs)
    fuse_structured_unstructured(sensors, summary)