Critics Guide

Critics are the core of Sifaka’s text improvement system. They analyze text and provide structured feedback for iterative refinement.
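
That structured feedback takes the shape of a CritiqueResult. A minimal sketch of what one round of feedback looks like, using the fields that appear in the Custom Critics example later in this guide:

from sifaka.core.models import CritiqueResult

# One round of critic feedback (fields as shown in the Custom Critics section)
critique = CritiqueResult(
    critic="self_refine",
    feedback="Clear argument, but the second paragraph lacks supporting evidence.",
    suggestions=["Add a concrete example"],
    needs_improvement=True,
    confidence=0.8
)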

Overview

Each critic implements a specific evaluation strategy based on academic research:

| Critic | Best For | Research Paper |
|------------------|------------------------|----------------------------------|
| SELF_REFINE | General improvement | Self-Refine (2023) |
| REFLEXION | Learning from mistakes | Reflexion (2023) |
| CONSTITUTIONAL | Safety & principles | Constitutional AI (2022) |
| SELF_CONSISTENCY | Balanced perspectives | Self-Consistency (2022) |
| SELF_RAG | Fact-checking | Self-RAG (2023) |
| META_REWARDING | Self-evaluation | Meta-Rewarding (2024) |
| N_CRITICS | Multiple perspectives | N-Critics (2023), ensemble-based |
| STYLE | Tone & style | Custom implementation |

Using Critics

Single Critic

from sifaka import improve
from sifaka.core.types import CriticType

result = await improve(
    "Your text here",
    critics=[CriticType.SELF_REFINE]
)
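
The call returns a result object containing the improved text. A minimal usage sketch, assuming the attribute is named final_text:

print(result.final_text)  # assumption: final_text holds the improved text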

Multiple Critics

result = await improve(
    "Your text here",
    critics=[CriticType.SELF_REFINE, CriticType.REFLEXION]
)

Custom Configuration

from sifaka.core.config import Config, CriticConfig

config = Config(
    critic=CriticConfig(
        critics=[CriticType.CONSTITUTIONAL],
        confidence_threshold=0.8
    )
)

result = await improve("Your text", config=config)

Critic Details

SELF_REFINE

General-purpose improvement focusing on clarity, coherence, and completeness.

Use when:

- You want a general-purpose polish for clarity, coherence, and completeness
- No single specialized concern (safety, facts, style) dominates

Example:

result = await improve(
    "AI is important for business.",
    critics=[CriticType.SELF_REFINE],
    max_iterations=3
)

REFLEXION

Learns from previous attempts by reflecting on what worked and what didn’t.

Use when:

- Early drafts tend to repeat the same mistakes
- The task benefits from learning across several iterations

Example:

result = await improve(
    "Explain quantum computing simply.",
    critics=[CriticType.REFLEXION],
    max_iterations=4  # Benefits from more iterations
)

CONSTITUTIONAL

Evaluates text against constitutional principles for safety and ethics.

Use when:

- Content is public-facing or safety-sensitive
- Output must respect ethical principles

Drafts are checked against a default set of safety and ethics principles.

Example:

result = await improve(
    "Guide on pest control methods",
    critics=[CriticType.CONSTITUTIONAL]
)

SELF_CONSISTENCY

Generates multiple perspectives and finds consensus.

Use when:

- The topic invites multiple reasonable viewpoints
- You want a balanced, consensus-driven result

Example:

result = await improve(
    "Analysis of renewable energy policies",
    critics=[CriticType.SELF_CONSISTENCY]
)

SELF_RAG

Fact-checks and retrieves information to verify claims.

Use when:

- The text makes factual claims that should be verified
- Claims need supporting evidence retrieved from sources

Requires tools:

# Tools must be configured separately
result = await improve(
    "The Great Wall of China facts",
    critics=[CriticType.SELF_RAG]
)

META_REWARDING

Evaluates its own critique quality through meta-evaluation.

Use when:

- The quality of the critique itself matters
- You want the critic to double-check its own feedback

Example:

result = await improve(
    "Medical advice disclaimer",
    critics=[CriticType.META_REWARDING]
)

N_CRITICS

Uses multiple critical perspectives in parallel.

Use when:

- You want feedback from several perspectives in a single pass
- A single viewpoint is not enough for a fair review

The critic applies a set of default perspectives, each critiquing the text in parallel.

Example:

result = await improve(
    "Product launch announcement",
    critics=[CriticType.N_CRITICS]
)

STYLE

Transforms text style and tone.

Use when:

- Adapting tone for a specific audience
- Matching an established voice or house style

Configuration:

from sifaka.critics.style import StyleCritic

critic = StyleCritic(
    style_description="Casual and friendly",
    style_examples=[
        "Hey there! Let me explain...",
        "No worries, we've got you covered!"
    ]
)
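
The configured instance is then passed to improve directly, the same way custom critic instances are used in the Custom Critics section below:

result = await improve(
    "Your text here",
    critics=[critic]
)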

Combining Critics

Critics work well together:

# Technical accuracy + readability
result = await improve(
    text,
    critics=[CriticType.REFLEXION, CriticType.STYLE]
)

# Safety + factual accuracy
result = await improve(
    text,
    critics=[CriticType.CONSTITUTIONAL, CriticType.SELF_RAG]
)

# Comprehensive review
result = await improve(
    text,
    critics=[
        CriticType.SELF_REFINE,
        CriticType.N_CRITICS,
        CriticType.META_REWARDING
    ]
)

Performance Considerations

Speed vs Quality

Fast critics:

- SELF_REFINE and STYLE: a single critique pass per iteration

Thorough critics:

- REFLEXION: reflects on earlier attempts, so it pays off over more iterations
- META_REWARDING: adds a meta-evaluation step on top of each critique

Resource intensive:

- SELF_CONSISTENCY and N_CRITICS: generate multiple perspectives every round
- SELF_RAG: adds retrieval calls to verify claims

Optimization Tips

  1. Start simple: Use SELF_REFINE first
  2. Add specificity: Layer on specialized critics
  3. Limit iterations: 2-3 for most use cases
  4. Use appropriate models: Smaller models for simple critics
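
Putting these tips together, a two-pass flow might look like the sketch below (final_text is assumed to be the attribute holding the improved text):

# Start simple: one general critic, few iterations
result = await improve(
    "Your text here",
    critics=[CriticType.SELF_REFINE],
    max_iterations=2
)

# Layer on a specialized critic once the basics look good
result = await improve(
    result.final_text,  # assumption: final_text holds the improved text
    critics=[CriticType.STYLE]
)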

Custom Critics

Create your own critic:

from sifaka.plugins import CriticPlugin
from sifaka.core.models import CritiqueResult

class DomainExpertCritic(CriticPlugin):
    def __init__(self, domain: str):
        self.domain = domain

    async def critique(self, text: str, result):
        # Your critique logic
        feedback = f"From a {self.domain} perspective..."

        return CritiqueResult(
            critic=f"domain_expert_{self.domain}",
            feedback=feedback,
            suggestions=["Add more domain-specific details"],
            needs_improvement=True,
            confidence=0.75
        )

# Use it
result = await improve(
    text,
    critics=[DomainExpertCritic("medical")]
)

Best Practices

  1. Match critic to content: Use domain-appropriate critics
  2. Start broad, get specific: SELF_REFINE → specialized critics
  3. Consider your audience: Use STYLE for audience adaptation
  4. Verify facts: Use SELF_RAG for factual content
  5. Ensure safety: Use CONSTITUTIONAL for public content
  6. Iterate wisely: More isn’t always better (2-4 iterations usually sufficient)

Troubleshooting

Text not improving?

- Raise max_iterations to give the refinement loop more room
- Try a critic better matched to the content (see the sketch below)
- Experiment with confidence_threshold in CriticConfig
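
If results plateau, one option is to loosen the loop. A sketch, assuming critiques scoring below confidence_threshold are discarded and that max_iterations can be passed alongside config:

from sifaka import improve
from sifaka.core.config import Config, CriticConfig
from sifaka.core.types import CriticType

# Assumption: lowering confidence_threshold lets more critiques through
config = Config(
    critic=CriticConfig(
        critics=[CriticType.SELF_REFINE],
        confidence_threshold=0.6
    )
)

# Assumption: max_iterations combines with config as in earlier examples
result = await improve("Your text", config=config, max_iterations=4)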

Too many changes?

- Reduce max_iterations (2-3 is usually enough)
- Use a single focused critic instead of several at once

Inconsistent results?

- Use SELF_CONSISTENCY to drive toward consensus
- Keep the critic list and configuration stable between runs