The Elegant Dict-Object Hybrid: A Pythonic Design Pattern for Flexible Data Containers

Introduction
In Python development, we frequently work with structured data: configuration settings, API responses, form submissions, event logs, and more. Most developers reach for one of two common approaches:
# Dictionary approach
user_data = {"name": "Alice", "email": "alice@example.com", "active": True}
print(user_data["name"])  # Dictionary access

# Class-based approach
class UserData:
    def __init__(self, name, email, active):
        self.name = name
        self.email = email
        self.active = active

user = UserData("Alice", "alice@example.com", True)
print(user.name)  # Attribute access
But what if we could get the best of both worlds? What if we could create a flexible data container that combines the dynamic nature of dictionaries with the elegant attribute access of objects, while adding powerful transformation capabilities?
This blog post explores a surprisingly simple yet powerful design pattern that can transform how you handle data in your Python applications.
The Problem Space
Let's consider a common scenario: handling API responses.
Imagine you're building an application that interacts with various REST APIs. Each API returns JSON data that needs to be parsed, validated, transformed, and passed through different parts of your application.
Here are the pain points:
Access syntax: Dictionary access (data["key"]) is more verbose and error-prone than attribute access (data.key)
Data transformation: You often need to create modified versions of the data without mutating the original
Contextual metadata: Sometimes you need to track metadata about the fields without cluttering the data itself
Consistency: Different parts of your codebase might expect different formats
Selective access: You frequently need to extract subsets of the data based on field types or categories
Let's see how our design pattern addresses these challenges.
Building a Flexible Data Container
Let's create a flexible DataContainer class that solves these problems:
class DataContainer:
    def __init__(self, base=None, **kwargs):
        # Internal storage (names starting with "_" are kept as real attributes)
        self._store = {}
        self._metadata = {}
        # Initialize from another DataContainer (or a subclass instance)
        if isinstance(base, DataContainer):
            self._store = base._store.copy()
            self._metadata = base._metadata.copy()
        # Initialize from a dict
        elif isinstance(base, dict):
            self._store = base.copy()
        # Update with provided kwargs
        self._store.update(kwargs)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails. Private and dunder
        # names are never served from the store; this also prevents infinite
        # recursion if _store is looked up before __init__ has set it.
        if key.startswith("_"):
            raise AttributeError(key)
        if key in self._store:
            return self._store[key]
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{key}'")

    def __setattr__(self, key, value):
        # Private names become real attributes; everything else goes in the store
        if key.startswith("_"):
            super().__setattr__(key, value)
        else:
            self._store[key] = value

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __contains__(self, key):
        return key in self._store

    def __repr__(self):
        return f"{type(self).__name__}({self._store})"

    def keys(self):
        return self._store.keys()

    def values(self):
        return self._store.values()

    def items(self):
        return self._store.items()

    def get(self, key, default=None):
        return self._store.get(key, default)

    def copy(self, **kwargs):
        # Shallow copy: nested mutable values are shared with the original
        result = type(self)(base=self)
        result._store.update(kwargs)
        return result

    def without(self, *keys):
        result = self.copy()
        for key in keys:
            # pop() with a default ignores keys that are not present
            result._store.pop(key, None)
        return result

    def with_metadata(self, **metadata):
        result = self.copy()
        result._metadata.update(metadata)
        return result

    def filter(self, predicate):
        # predicate receives (key, value); the result does not carry metadata
        filtered = {k: v for k, v in self._store.items() if predicate(k, v)}
        return type(self)(filtered)
This class gives us:
Both dictionary-style and attribute-style access
Non-mutating operations that return new copies (like copy() and without())
Metadata tracking via the _metadata dictionary
Filtering capabilities
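To make the first two properties concrete, here is a compact, self-contained sketch. MiniContainer is a hypothetical, trimmed-down stand-in written only for this illustration, not the full class; it shows that attribute and item access read the same store and that copy() and without() leave the original untouched.

```python
class MiniContainer:
    """Trimmed-down stand-in for the DataContainer idea (hypothetical name)."""

    def __init__(self, base=None, **kwargs):
        # Bypass our own __setattr__ so the store is a real instance attribute.
        object.__setattr__(self, "_store", dict(base or {}, **kwargs))

    def __getattr__(self, key):
        # Called only when normal lookup fails; never serve private names.
        if key.startswith("_"):
            raise AttributeError(key)
        try:
            return self._store[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self._store[key] = value

    def __getitem__(self, key):
        return self._store[key]

    def __contains__(self, key):
        return key in self._store

    def copy(self, **kwargs):
        # Build a new object; the original store is not modified.
        return type(self)(self._store, **kwargs)

    def without(self, *keys):
        kept = {k: v for k, v in self._store.items() if k not in keys}
        return type(self)(kept)


user = MiniContainer(name="Alice", email="alice@example.com")
renamed = user.copy(name="Bob")   # new object with one field changed
trimmed = user.without("email")   # new object with one field dropped

print(user.name, user["name"])    # both access styles read the same store
print(renamed.name, "email" in trimmed, "email" in user)
```

One caveat of this style: a field whose name matches a method (for example "copy") is only reachable through item access, because normal attribute lookup finds the method first.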
Real-World Example: API Response Handling
Let's see how this improves a typical API workflow:
import requests
from datacontainer import DataContainer  # assumes the class above is saved as datacontainer.py

def get_user(user_id):
    response = requests.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body
    data = response.json()
    # Convert the plain dict to our DataContainer
    user = DataContainer(data)
    # Add some metadata about this request
    user = user.with_metadata(
        source="api.example.com",
        timestamp=response.headers.get("Date"),
        request_id=response.headers.get("X-Request-ID")
    )
    return user

# Usage
user = get_user(123)

# Attribute-style access reads more cleanly than bracket lookups
print(f"Hello, {user.name}!")

# We can make a modified copy without affecting the original
# (format_date is assumed to be a date-formatting helper defined elsewhere)
display_user = user.copy(
    full_name=f"{user.first_name} {user.last_name}",
    display_date=format_date(user.created_at)
).without("password_hash", "security_question")

# We can filter fields by key, value type, or other criteria
contact_info = user.filter(
    lambda key, value: key in ("email", "phone", "address")
)
Comparison with Pydantic and dataclasses
A common question might be: "Why not just use Pydantic or dataclasses?" Let's compare these approaches to understand where our DataContainer fits in the ecosystem.
Dataclasses
from dataclasses import dataclass, field, asdict

@dataclass
class UserData:
    name: str
    email: str
    active: bool = True
    metadata: dict = field(default_factory=dict)

    def to_dict(self):
        return asdict(self)
Pros of dataclasses:
Type hints and IDE support
Auto-generated methods
Clear structure defined upfront
Cons of dataclasses:
Fields need to be predefined
Adding dynamic fields requires extra work
Transformations often require creating new classes
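To put the last point in context, the standard library does cover same-shape copies: dataclasses.replace returns a new instance with selected fields changed. What it cannot do is add or drop fields, so a view with a different shape falls back to a plain dict or a second class. A minimal sketch, using a reduced UserData without the metadata field:

```python
from dataclasses import dataclass, replace, asdict


@dataclass(frozen=True)
class UserData:
    name: str
    email: str
    active: bool = True


user = UserData("Alice", "alice@example.com")

# Same shape, one field changed: replace() builds a new instance.
deactivated = replace(user, active=False)

# Different shape (a field removed): we are back to a plain dict.
public_view = {k: v for k, v in asdict(user).items() if k != "email"}

print(deactivated)
print(public_view)
```

Passing a field name that the dataclass does not define to replace() raises a TypeError, which is the strictness the flexible container deliberately gives up.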
Pydantic
from pydantic import BaseModel, Field
from typing import Dict, Any

class UserData(BaseModel):
    name: str
    email: str
    active: bool = True
    metadata: Dict[str, Any] = Field(default_factory=dict)
Pros of Pydantic:
Powerful validation
Schema generation
Serialization capabilities
Type safety
Cons of Pydantic:
Less flexibility for dynamic fields
More verbose for simple use cases
Performance overhead for validation
Transformations can be cumbersome
Our DataContainer
user = DataContainer(name="Alice", email="alice@example.com")
user.active = True
user = user.with_metadata(source="signup_form")
Pros of DataContainer:
Maximum flexibility for both defined and dynamic fields
Clean, fluent API for transformations
Lightweight with minimal dependencies
Both dict-like and object-like interfaces
Excellent for evolving or unpredictable data structures
Cons of DataContainer:
No built-in validation
No type hinting for specific fields
Less formal structure can lead to inconsistency
When to Use Each Approach
Use dataclasses when:
You have a well-defined, stable data structure
You value type hints and IDE support
You don't need many dynamic transformations
Use Pydantic when:
Data validation is critical
You're working with external APIs and need schema validation
You're building larger applications where type safety matters
You need automatic documentation (via schema generation)
Use our DataContainer when:
You need maximum flexibility for evolving data structures
You're working with unpredictable data sources
You want clean, chainable transformations
You value simplicity and readability over strict validation
You're building prototypes or smaller applications
Enhanced Configuration Management
One area where our DataContainer pattern truly shines is in configuration management. Let's expand on this use case with a more robust implementation:
import os
import json
from pathlib import Path

import yaml  # third-party: PyYAML


class Config(DataContainer):
    def __init__(self, base=None, **kwargs):
        super().__init__(base, **kwargs)
        self._frozen = False
        # Track where config values came from as (key, source) pairs.
        # Copying from another Config carries its source history along.
        self._sources = list(getattr(base, "_sources", []))

    def __setattr__(self, key, value):
        # Read _frozen from __dict__ directly so this check is safe while
        # __init__ is still running and the attribute does not exist yet.
        if not key.startswith("_") and self.__dict__.get("_frozen", False):
            raise AttributeError(f"Cannot modify frozen config key '{key}'")
        super().__setattr__(key, value)

    def __setitem__(self, key, value):
        # Item assignment must respect freezing as well
        if self.__dict__.get("_frozen", False):
            raise TypeError(f"Cannot modify frozen config key '{key}'")
        super().__setitem__(key, value)

    def freeze(self):
        """Return a frozen copy of the config"""
        result = self.copy()
        result._frozen = True
        return result

    def with_prefix(self, prefix, strip_prefix=True):
        """Extract all keys with a specific prefix"""
        result = type(self)()
        prefix_len = len(prefix)
        for key, value in self.items():
            if key.startswith(prefix):
                new_key = key[prefix_len:] if strip_prefix else key
                result[new_key] = value
        return result

    def with_source(self, source, **kwargs):
        """Add config values with tracking of their source"""
        result = self.copy(**kwargs)
        for key in kwargs:
            result._sources.append((key, source))
        return result

    def get_source(self, key):
        """Get the most recent source of a config value"""
        # Later entries override earlier ones, so search newest first
        for k, source in reversed(self._sources):
            if k == key:
                return source
        return None

    def merge(self, other_config, overwrite=True):
        """Merge with another config, with option to preserve existing values"""
        result = self.copy()
        for key, value in other_config.items():
            if overwrite or key not in result:
                result[key] = value
                # Preserve source information if available
                if hasattr(other_config, "get_source"):
                    source = other_config.get_source(key)
                    if source:
                        result._sources.append((key, source))
        return result

    def with_env_override(self, prefix="APP_"):
        """Override config values from environment variables"""
        result = self.copy()
        for env_key, env_value in os.environ.items():
            if env_key.startswith(prefix):
                config_key = env_key[len(prefix):].lower()
                # Convert the string to the type of the existing value.
                # bool is checked before int because bool is a subclass of int.
                if config_key in result and isinstance(result[config_key], bool):
                    typed_value = env_value.lower() in ("true", "yes", "1")
                elif config_key in result and isinstance(result[config_key], int):
                    typed_value = int(env_value)
                elif config_key in result and isinstance(result[config_key], float):
                    typed_value = float(env_value)
                else:
                    typed_value = env_value
                result = result.with_source(f"env:{env_key}", **{config_key: typed_value})
        return result

    @classmethod
    def from_file(cls, filepath):
        """Load from a config file based on extension"""
        path = Path(filepath)
        if not path.exists():
            raise FileNotFoundError(f"Config file not found: {filepath}")
        with open(path, "r") as f:
            if path.suffix.lower() in (".yml", ".yaml"):
                data = yaml.safe_load(f)
            elif path.suffix.lower() == ".json":
                data = json.load(f)
            else:
                raise ValueError(f"Unsupported config file type: {path.suffix}")
        # An empty YAML file loads as None; treat it as an empty config
        return cls.from_dict(data or {}, source=f"file:{filepath}")

    @classmethod
    def from_dict(cls, data, source="dict"):
        """Create from a dictionary with source tracking"""
        config = cls(data)
        config._sources.extend((key, source) for key in data)
        return config

    def to_dict(self):
        """Convert config to a plain dictionary"""
        return dict(self.items())

    def to_file(self, filepath):
        """Save config to a file based on extension"""
        path = Path(filepath)
        suffix = path.suffix.lower()
        # Validate first so an unsupported type does not leave an empty file behind
        if suffix not in (".yml", ".yaml", ".json"):
            raise ValueError(f"Unsupported config file type: {path.suffix}")
        with open(path, "w") as f:
            if suffix == ".json":
                json.dump(self.to_dict(), f, indent=2)
            else:
                yaml.dump(self.to_dict(), f)
Building a Layered Configuration System
With our enhanced Config class, we can create a sophisticated configuration system that loads from multiple sources with clear precedence:
def load_application_config(app_name, env="development"):
    # Start with default configuration
    config = Config.from_file("config/defaults.yaml")

    # Add environment-specific configuration
    try:
        env_config = Config.from_file(f"config/{env}.yaml")
        config = config.merge(env_config)
    except FileNotFoundError:
        print(f"No configuration found for environment: {env}")

    # Add local overrides (not committed to version control)
    try:
        local_config = Config.from_file("config/local.yaml")
        config = config.merge(local_config)
    except FileNotFoundError:
        # Local config is optional
        pass

    # Override with environment variables
    config = config.with_env_override(prefix=f"{app_name.upper()}_")

    # Freeze configuration to prevent accidental modification
    return config.freeze()

# Usage
config = load_application_config("myapp", env="production")

# Extract subsystem configuration
database_config = config.with_prefix("database_")
logging_config = config.with_prefix("logging_")

# Check sources for debugging
print(f"Database URL source: {config.get_source('database_url')}")
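The precedence rule itself (later layers win, and each key remembers which layer last set it) can be checked without any files on disk. The following is a minimal sketch using plain dicts; the layer() helper, the sample values, and the fake_environ mapping are all hypothetical stand-ins for the files and os.environ used above, and type coercion of environment strings is left out for brevity.

```python
def layer(*sources):
    """Merge (label, mapping) pairs in order; later sources override earlier ones.

    Returns the merged values plus, for each key, the label that last set it.
    """
    values, origins = {}, {}
    for label, mapping in sources:
        for key, value in mapping.items():
            values[key] = value
            origins[key] = label
    return values, origins


defaults = {"database_url": "sqlite:///dev.db", "debug": True, "port": 8080}
production = {"database_url": "postgres://db.internal/app", "debug": False}

# Stand-in for os.environ so the example is deterministic.
fake_environ = {"MYAPP_DEBUG": "true", "PATH": "/usr/bin"}
prefix = "MYAPP_"
env_overrides = {
    name[len(prefix):].lower(): value
    for name, value in fake_environ.items()
    if name.startswith(prefix)
}

values, origins = layer(
    ("file:defaults", defaults),
    ("file:production", production),
    ("env", env_overrides),
)

print(values["database_url"], "<-", origins["database_url"])
print(values["debug"], "<-", origins["debug"])  # still the raw string here
print(values["port"], "<-", origins["port"])
```

Keeping the origin next to each value is what makes a "where did this setting come from?" question answerable after several layers have been merged.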
Hierarchical Configuration
Our pattern also handles hierarchical configurations elegantly:
def nested_get(config, path, default=None):
    """Get a nested configuration value using dot notation"""
    keys = path.split(".")
    current = config
    for key in keys:
        if hasattr(current, key):
            current = getattr(current, key)
        else:
            return default
    return current

# Create a hierarchical config
server_config = Config(
    host="localhost",
    port=8080,
    ssl=Config(
        enabled=True,
        cert="/path/to/cert.pem",
        key="/path/to/key.pem"
    ),
    cors=Config(
        enabled=True,
        origins=["https://example.com"]
    )
)

# Access nested values
ssl_enabled = nested_get(server_config, "ssl.enabled")
cors_origins = nested_get(server_config, "cors.origins")

# Or use direct attribute access for a cleaner API
ssl_enabled = server_config.ssl.enabled
cors_origins = server_config.cors.origins
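One thing to keep in mind: chained attribute access works above because the nested values were built as Config objects by hand. Data parsed from YAML or JSON arrives as nested plain dicts, so the inner levels would need wrapping too. A minimal sketch of recursive wrapping follows; AttrDict and wrap() are hypothetical helpers for this illustration and are not part of the Config class.

```python
class AttrDict(dict):
    """A dict that also allows read access through attributes (hypothetical helper)."""

    def __getattr__(self, key):
        # Called only when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None


def wrap(value):
    """Recursively convert nested dicts (including dicts inside lists) to AttrDict."""
    if isinstance(value, dict):
        return AttrDict({k: wrap(v) for k, v in value.items()})
    if isinstance(value, list):
        return [wrap(item) for item in value]
    return value


# Shape of data as it might come back from a YAML or JSON parser.
raw = {
    "host": "localhost",
    "port": 8080,
    "ssl": {"enabled": True, "cert": "/path/to/cert.pem"},
    "cors": {"enabled": True, "origins": ["https://example.com"]},
}

server = wrap(raw)
print(server.ssl.enabled)
print(server.cors.origins)
print(server["ssl"]["cert"])  # dictionary access still works
```

Because AttrDict subclasses dict, keys that collide with dict method names (such as "items") remain reachable only through bracket access.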
The Grand Reveal: Inspiration from DSPy
If you found this pattern useful, you might be interested to know that it was inspired by the Example class in DSPy, a powerful framework for programming with language models.
In DSPy, this pattern is used to represent examples in machine learning datasets, elegantly separating inputs from labels while maintaining a clean, unified interface.
Here's a simplified version of how DSPy uses this pattern:
from dspy import Example

# Creating a dataset for sentiment analysis
dataset = [
    Example(text="I loved this movie!", sentiment="positive").with_inputs("text"),
    Example(text="Terrible acting and plot.", sentiment="negative").with_inputs("text")
]

# Using the data in a model
# (`model` is a placeholder for any callable that maps text to a label)
for item in dataset:
    # Just get the inputs (the "text" field)
    inputs = item.inputs()
    # Make a prediction
    prediction = model(inputs.text)
    # Compare to the actual label
    is_correct = (prediction == item.sentiment)
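To show how input/label separation can sit on top of the dict-object hybrid, here is a minimal sketch. LabeledExample is a hypothetical class written for this post; it is not DSPy's actual implementation, and only mirrors the with_inputs() and inputs() calls used above, plus a labels() counterpart.

```python
class LabeledExample:
    """Toy container that remembers which fields are inputs (hypothetical class)."""

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._input_keys = frozenset()

    def __getattr__(self, key):
        # Called only when normal lookup fails; never serve private names.
        if key.startswith("_"):
            raise AttributeError(key)
        try:
            return self._fields[key]
        except KeyError:
            raise AttributeError(key) from None

    def with_inputs(self, *keys):
        # Return a new object; the original keeps its own input marking.
        result = LabeledExample(**self._fields)
        result._input_keys = frozenset(keys)
        return result

    def inputs(self):
        # Only the fields marked as inputs.
        return LabeledExample(
            **{k: v for k, v in self._fields.items() if k in self._input_keys}
        )

    def labels(self):
        # Everything that was not marked as an input.
        return LabeledExample(
            **{k: v for k, v in self._fields.items() if k not in self._input_keys}
        )


item = LabeledExample(text="I loved this movie!", sentiment="positive").with_inputs("text")

print(item.inputs().text)
print(item.labels().sentiment)
print(hasattr(item.inputs(), "sentiment"))  # labels are hidden from the input view
```

The design choice worth noticing is that the input marking is metadata about the fields, stored beside the data rather than inside it, which is the same idea as the _metadata dictionary earlier in this post.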
Conclusion
The Dict-Object hybrid pattern demonstrates the elegance of Python's design flexibility. With just a few magic methods and thoughtful API design, we created a powerful data container that:
Provides both dictionary and object interfaces
Supports immutable transformations
Enables domain-specific extensions
Makes code more readable and maintainable
While Pydantic and dataclasses excel at validation and type safety, our DataContainer pattern shines in scenarios requiring flexibility, dynamic transformations, and clean APIs. The configuration management example shows how a relatively simple design can solve complex real-world problems in an elegant way.
This pattern can be adapted to numerous domains beyond the examples shown here:
Event data in logging systems
ETL pipeline transformations
Command-line argument parsing
JSON/XML data processing
Application state management
The next time you find yourself juggling dictionaries and custom classes for structured data, consider implementing this flexible container pattern. It might just transform how you think about data handling in Python.


