Collections

dbzero provides a set of persistent collections to use when building your models. These collections are designed to be drop-in replacements for their standard Python counterparts - they support the same operations, methods, and syntax you're already familiar with.

The key advantage is that you can write natural Python code without thinking about serialization, storage, or database operations. All this is handled transparently in the background.

Supported Collections

dbzero provides persistent counterparts for the standard Python collections - lists, dictionaries, sets, and tuples. Apart from these standard collections, dbzero also provides a specialized dbzero.index collection for efficient range queries and sorting, which is covered in its own section.

Working with Collections

Basic Usage

Collections in dbzero are designed around transparency and interoperability with native Python types. You can assign standard Python collections to memo object attributes and they are automatically converted to their dbzero counterparts.

import dbzero as db0
 
@db0.memo(singleton=True)
class AppState:
    def __init__(self):
        # Use familiar Python syntax
        self.tasks = []
        self.config = {"host": "localhost", "port": 8080, "debug": False}
        self.tags = {"python", "database", "persistence"}
        
state = AppState()
 
# Work with collections naturally
state.tasks.append("deploy")
state.tasks.extend(["design", "implement", "test"])
state.tasks[0] = "research"
print(state.tasks[1:3])  # ['design', 'implement']
 
# Dictionaries work as expected
state.config["debug"] = True
print(state.config.get("timeout", 30))  # 30
 
# Sets maintain uniqueness
state.tags.update(["performance", "persistence"])
state.tags.discard("database")
print(state.tags == {"python", "persistence", "performance"})  # True
 
# Changes are automatically tracked and persisted

Collections with Memo Objects

dbzero collections integrate seamlessly with memo objects, enabling you to build complex, persistent data structures. Importantly, collections also manage the lifetime of the memo objects they contain. When a memo object is added to a collection, the collection maintains a strong reference to it, keeping it alive. When you remove an object from a collection and no other references to it exist, it is automatically garbage-collected on the next commit.

import dbzero as db0
 
@db0.memo
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
 
@db0.memo
class Role:
    def __init__(self, title, permissions):
        self.title = title
        self.permissions = permissions  # e.g., ["read", "write", "deploy"]
 
@db0.memo
class Team:
    def __init__(self, name):
        self.name = name
        self.members = []
        self.roles = {}  # Maps User -> Role
 
# Create team with members and roles
team = Team("Engineering")
alice = User("Alice", "alice@example.com")
bob = User("Bob", "bob@example.com")
 
lead_role = Role("Lead Developer", ["read", "write", "deploy", "admin"])
engineer_role = Role("Backend Engineer", ["read", "write"])
 
team.members.extend([alice, bob])
team.roles[alice] = lead_role
team.roles[bob] = engineer_role
 
# Access members and their roles
print(team.members[0].name)  # "Alice"
print(team.roles[alice].title)  # "Lead Developer"
print(team.roles[alice].permissions)  # ["read", "write", "deploy", "admin"]
 
# Collections maintain object identity
print(team.members[0] is alice)  # True
 
# Removing from collection triggers garbage collection
team.members.remove(bob)
del team.roles[bob]
# After the next commit, bob will be deleted (no references remain)
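
Because memo objects can serve as dictionary keys (as team.roles shows above), you can iterate the mapping like any ordinary dict. The short sketch below continues the Team example; it assumes the dict counterpart supports standard iteration, as the drop-in behaviour described at the top of this page implies.

# Continues the Team example: list each remaining member with their role
for user, role in team.roles.items():
    print(f"{user.name}: {role.title}")
# Alice: Lead Developer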

Composite Collections

You can build complex nested structures by combining different collection types.

import dbzero as db0
 
@db0.memo
class DataAnalyzer:
    def __init__(self):
        # List of tuples - ordered pairs
        self.coordinates = []
        
        # Dict of lists - categorized items
        self.categories = {}
        
        # Set of tuples - unique combinations
        self.seen_pairs = set()
        
        # Dict of dicts - nested mappings
        self.metadata = {}
 
analyzer = DataAnalyzer()
 
# List of tuples for coordinate pairs
analyzer.coordinates.append((10.5, 20.3))
analyzer.coordinates.append((15.7, 25.1))
 
# Dict of lists for categorization
analyzer.categories["positive"] = [1, 2, 3]
analyzer.categories["negative"] = [-1, -2, -3]
analyzer.categories["positive"].append(4)
 
# Set of tuples for tracking unique pairs
analyzer.seen_pairs.add(("user1", "action1"))
analyzer.seen_pairs.add(("user2", "action2"))
 
# Dict of dicts for nested configuration
analyzer.metadata["experiment1"] = {"runs": 10, "success": 8}
analyzer.metadata["experiment2"] = {"runs": 5, "success": 5}
 
# Access nested structures
print(analyzer.coordinates[0][0])  # 10.5
print(analyzer.categories["positive"][-1])  # 4
print(len(analyzer.seen_pairs))  # 2
print(analyzer.metadata["experiment1"]["success"])  # 8
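
Nested structures can be traversed with ordinary Python iteration. Continuing the DataAnalyzer example, the brief sketch below aggregates the nested metadata dict into per-experiment success rates - assuming, as above, that the nested collections behave like their native counterparts.

# Continues the DataAnalyzer example: summarize the nested metadata
for name, stats in analyzer.metadata.items():
    rate = stats["success"] / stats["runs"]
    print(f"{name}: {rate:.0%}")
# experiment1: 80%
# experiment2: 100%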

Performance Considerations

Lazy Loading

dbzero collections use lazy loading - only the data you actually access is loaded into memory. This lets you work with large datasets efficiently:

@db0.memo
class Dataset:
    def __init__(self):
        # Create a list with 1 million items
        self.data = list(range(1_000_000))
 
dataset = Dataset()
 
# Only items you access are loaded
first_ten = dataset.data[:10]  # Only loads 10 items
item = dataset.data[500_000]  # Loads single item
 
# Efficient iteration
for i, item in enumerate(dataset.data):
    if i >= 100:
        break
    # Only first 100 items loaded
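
Lazy loading also makes it practical to process a large collection in fixed-size chunks. The sketch below continues the Dataset example; assuming slicing loads only the requested items, as shown above, memory use stays bounded regardless of the collection's size.

# Continues the Dataset example: process the data page by page
PAGE_SIZE = 1_000
total = 0
for start in range(0, len(dataset.data), PAGE_SIZE):
    page = dataset.data[start:start + PAGE_SIZE]  # loads only this slice
    total += sum(page)
print(total)  # 499999500000 - the sum of 0..999_999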

Best Practices

  1. Use appropriate collection types: Choose the collection that best fits your use case - lists for ordered sequences, sets for unique items, dicts for key-value mappings.

  2. Leverage automatic conversion: Let dbzero convert Python collections automatically when assigning to memo objects - you don't need to explicitly create dbzero collections everywhere.

  3. Build composite structures: Don't hesitate to nest collections - lists of dicts, dicts of lists, sets of tuples - to model your domain accurately.

  4. Let garbage collection work: Trust dbzero's reference counting - you don't need to manually delete objects from collections in most cases.

  5. Consider immutability: Use tuples for data that shouldn't change (see the sketch after this list).

  6. Profile before optimizing: dbzero's lazy loading handles most performance concerns automatically. Profile your application before making optimizations.
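
To illustrate point 5, here is a minimal sketch that stores values which should never change in place as a tuple. It assumes dbzero's tuple counterpart preserves Python tuple immutability, as the drop-in behaviour described at the top of this page suggests; the Release class is purely illustrative.

import dbzero as db0
 
@db0.memo
class Release:
    def __init__(self, version):
        # A tuple keeps the version components immutable
        self.version = tuple(version)
 
release = Release([2, 1, 0])
print(release.version)  # (2, 1, 0)
 
# Tuples reject in-place modification
try:
    release.version[0] = 3
except TypeError:
    print("version is immutable - assign a new tuple instead")
 
# To change the version, replace the whole attribute
release.version = (3, 0, 0)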