Agent Skills: Active Inference Robotics Skill (Second-Order)

Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack for predictive coding in robot locomotion

ID: plurigrid/asi/active-inference-robotics

Install this agent skill locally:

pnpm dlx add-skill https://github.com/plurigrid/asi/tree/HEAD/plugins/asi/skills/active-inference-robotics

plugins/asi/skills/active-inference-robotics/SKILL.md

Skill Metadata

Name
active-inference-robotics

"The agent's job is to predict its actions by predicting its sensations." — Patrick Kenny

Trigger Conditions

  • User asks about bridging active inference with robot control
  • Questions about predictive coding in locomotion policies
  • Connecting KL divergence minimization to RL training
  • Mean field approximation in robotics state estimation
  • Sim2Real as inference about future observations

Overview

Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack. This skill emerges from the constructive collision between:

  1. Active Inference Institute (ActInf ModelStream 019.1, Jan 2025)
  2. K-Scale Labs (ksim, kos, kinfer ecosystem)
  3. MuJoCo Playground (DeepMind's sim2real framework)

The Constructive Collision

┌─────────────────────────────────────────────────────────────────────────────┐
│  CONSTRUCTIVE COLLISION: Two Threads Converging                              │
│                                                                              │
│  Thread A: Patrick Kenny (Nov 2025)                                          │
│  ════════════════════════════════════                                        │
│  "Active inference can be formulated as constrained KL divergence           │
│   minimization solved by standard mean field methods"                        │
│                                                                              │
│  Key insight: Expected Free Energy ≈ KL Divergence + Entropy Regularizer    │
│                                                                              │
│  Thread B: K-Scale Labs (2024-2025)                                          │
│  ═══════════════════════════════════                                         │
│  "RL-based closed-loop control using policies trained in simulation         │
│   has firmly won as the best way of achieving real-time control"            │
│                                                                              │
│  Key insight: Stateless vs Stateful behaviors as pure/coalgebraic semantics │
│                                                                              │
│  COLLISION POINT: Both minimize surprise about future observations          │
│  ══════════════════════════════════════════════════════════════════         │
│                                                                              │
│       Active Inference              Robotics RL                              │
│       ────────────────              ──────────                               │
│       Predictive Distribution  ←→   Policy π(a|s)                           │
│       Hidden Markov Model      ←→   MDP/POMDP                                │
│       Mean Field Updates       ←→   PPO Gradient Steps                       │
│       Variational Free Energy  ←→   Policy Loss                              │
│       Expected Free Energy     ←→   Value Function + Entropy                 │
│       Perception/Action Loop   ←→   Observation/Action Loop                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Kenny's Key Contribution

From arXiv:2511.20321:

Perception/Action Divergence = VFE(past) + KL(future states)

Where:
- VFE(past) = Standard variational free energy on observed history
- KL(future) = Divergence of predictive distribution from HMM

This differs from Expected Free Energy by an ENTROPY REGULARIZER:
  EFE ≈ Pragmatic Value + Mutual Information
  PAD ≈ Pragmatic Value + Entropy(Q)
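
The VFE(past) term above is the standard variational free energy. As a minimal numerical sketch (the two-state model and every number below are made up for illustration and are not taken from Kenny's paper), the textbook identity F = KL(Q(s) ‖ P(s|o)) − log P(o) can be checked in plain Python:

```python
import math

def kl(q, p):
    """KL(q || p) for discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def variational_free_energy(q, prior, likelihood_o):
    """F = E_q[log q(s) - log p(o, s)] for a single observed o.

    prior[s] = p(s); likelihood_o[s] = p(o | s) for the observed o.
    """
    return sum(
        qi * (math.log(qi) - math.log(pr * lk))
        for qi, pr, lk in zip(q, prior, likelihood_o)
        if qi > 0
    )

# Hypothetical two-state model (illustrative numbers only)
prior = [0.5, 0.5]
likelihood_o = [0.9, 0.2]  # p(o | s) for the observed o
evidence = sum(p * l for p, l in zip(prior, likelihood_o))  # p(o)
posterior = [p * l / evidence for p, l in zip(prior, likelihood_o)]

q = [0.6, 0.4]  # an arbitrary approximate posterior
f_q = variational_free_energy(q, prior, likelihood_o)
f_post = variational_free_energy(posterior, prior, likelihood_o)
```

Here `f_post` equals −log P(o), the lowest value F can take, and any other `q` pays an extra KL penalty. That is why minimizing VFE over Q performs approximate perception.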

Why Entropy Regularization Matters for Robotics

# In ksim PPO training, entropy bonus prevents policy collapse:
loss = policy_loss + value_loss - entropy_coef * entropy

# Kenny's formulation shows this is NOT ad-hoc but principled:
# Entropy regularizer = not being overconfident about predictions
# Biological rationale: know limitations of future predictions
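
To make the effect of the entropy term concrete, here is a small sketch for a categorical policy in plain Python. The helper names (`softmax`, `entropy`, `total_loss`) and the loss values are hypothetical and are not part of the ksim API. Only the shape of the loss follows the line above.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def total_loss(policy_loss, value_loss, probs, entropy_coef=0.01):
    # The entropy term is subtracted, so a more uncertain
    # (higher-entropy) policy lowers the total loss.
    return policy_loss + value_loss - entropy_coef * entropy(probs)

confident = softmax([8.0, 0.0, 0.0])  # nearly deterministic policy
hedged = softmax([1.0, 0.5, 0.0])     # keeps some mass on every action

# Same (made-up) policy and value losses; only the entropy differs
loss_confident = total_loss(0.3, 0.1, confident)
loss_hedged = total_loss(0.3, 0.1, hedged)
```

With identical policy and value losses, the hedged policy gets the lower total loss. This is the "do not be overconfident" pressure described above.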

Mapping to ksim Architecture

| Active Inference Concept | ksim Implementation |
|--------------------------|---------------------|
| Hidden Markov Model | PhysicsEngine (MJX/MuJoCo) |
| Observation distribution | Observation.observe(state) |
| State inference Q(s) | Critic.forward(obs, carry) |
| Action inference Q(a) | Actor.forward(obs, carry) |
| Mean field factorization | Independent Q(s_t) per timestep |
| Predictive distribution | Policy rollout trajectory |
| VFE minimization | PPO policy gradient |
| EFE/PAD minimization | Value function + entropy bonus |
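
The "mean field factorization" row can be illustrated with generic coordinate updates on a toy problem. This is a sketch of the standard mean field method, not code from ksim or from Kenny's paper. The joint table `p` over two consecutive states is hypothetical, and the factorized approximation is Q(s1, s2) = q1(s1) q2(s2).

```python
import math

def normalize(w):
    z = sum(w)
    return [x / z for x in w]

def kl_factorized(q1, q2, p):
    """KL(q1(s1) q2(s2) || p(s1, s2)) for a joint table p[s1][s2]."""
    return sum(
        q1[i] * q2[j] * math.log(q1[i] * q2[j] / p[i][j])
        for i in range(len(q1)) for j in range(len(q2))
        if q1[i] > 0 and q2[j] > 0
    )

def update_q1(q2, p):
    # q1(s1) ∝ exp( E_{q2}[ log p(s1, s2) ] )
    return normalize([
        math.exp(sum(q2[j] * math.log(p[i][j]) for j in range(len(q2))))
        for i in range(len(p))
    ])

def update_q2(q1, p):
    # q2(s2) ∝ exp( E_{q1}[ log p(s1, s2) ] )
    return normalize([
        math.exp(sum(q1[i] * math.log(p[i][j]) for i in range(len(q1))))
        for j in range(len(p[0]))
    ])

# Hypothetical joint over two consecutive states (illustrative numbers)
p = [[0.40, 0.10],
     [0.15, 0.35]]
q1, q2 = [0.5, 0.5], [0.5, 0.5]

# Alternate the two updates and record the divergence after each one
history = [kl_factorized(q1, q2, p)]
for _ in range(20):
    q1 = update_q1(q2, p)
    history.append(kl_factorized(q1, q2, p))
    q2 = update_q2(q1, p)
    history.append(kl_factorized(q1, q2, p))
```

Each update is the exact minimizer for one factor with the other held fixed, so the recorded divergence never increases.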

Second-Order Behavior Types

1. Reflexive Control (Kenny's "Sufficient" Model)

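# Agent predicts proprioceptive sensations → fulfills them reflexively
from jax import Array

class ReflexiveController:
    """
    Kenny: "If the agent can successfully predict its future sensations,
    it can fulfill them unconsciously via motor reflexes."
    """
    def __init__(self, kp: float, kd: float):
        self.kp, self.kd = kp, kd  # PD gains, chosen by the caller

    def step(self, predicted_proprio: Array, joint_pos: Array, joint_vel: Array) -> Array:
        # Low-level PD control fulfills the proprioceptive prediction
        # (assumed here to be predicted joint positions): pull toward it, damp velocity
        return self.kp * (predicted_proprio - joint_pos) - self.kd * joint_vel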
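
As a toy illustration of reflexive fulfilment, the sketch below drives a single unit-inertia joint toward a predicted position with a PD law. The gains, time step, and point-mass dynamics are all assumptions chosen for the example. They are not values from ksim or from any K-Scale robot.

```python
def pd_torque(target_pos, pos, vel, kp=40.0, kd=12.0):
    """PD law: pull the joint toward the predicted position and damp velocity."""
    return kp * (target_pos - pos) - kd * vel

def simulate(target_pos, steps=2000, dt=0.001, inertia=1.0):
    """Semi-implicit Euler integration of one joint (toy dynamics)."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        torque = pd_torque(target_pos, pos, vel)
        vel += dt * torque / inertia  # update velocity first
        pos += dt * vel               # then position, using the new velocity
    return pos, vel

# The "prediction" is a joint position of 0.5 rad; the reflex arc fulfils it
final_pos, final_vel = simulate(target_pos=0.5)
```

After two simulated seconds the joint sits at the predicted position with negligible velocity. No planning is involved: the prediction alone determines the action.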

2. Deliberative Planning (EFE Extension)

# When reflexive prediction fails, engage deliberative inference
class DeliberativeController:
    """
    Extends reflexive control with policy search over trajectories.
    This is where EFE differs from Kenny's PAD formulation.
    """
    def plan(self, beliefs: Distribution, horizon: int) -> Policy:
        # Search over candidate policies, scoring each by expected free energy
        best_policy, best_efe = None, float("inf")
        for policy in self.policy_space:
            efe = self.expected_free_energy(beliefs, policy, horizon)
            # EFE includes mutual information (curiosity/exploration)
            # PAD would use entropy instead (uncertainty awareness)
            if efe < best_efe:
                best_policy, best_efe = policy, efe
        # Greedy choice: return the policy with the lowest expected free energy
        return best_policy
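
For a concrete sense of how policies are scored, the sketch below computes a one-step expected free energy in its textbook risk-plus-ambiguity form and turns the scores into a policy posterior with a softmax. This is the standard discrete EFE, not Kenny's PAD criterion. The likelihood matrix `A`, the preferences, and the two named policies are hypothetical.

```python
import math

def predicted_obs(q_s, A):
    """Q(o) = sum_s P(o | s) Q(s), with A[o][s] = P(o | s)."""
    return [sum(A[o][s] * q_s[s] for s in range(len(q_s))) for o in range(len(A))]

def expected_free_energy(q_s, A, preferred_obs):
    """One-step EFE = risk + ambiguity."""
    q_o = predicted_obs(q_s, A)
    # Risk: KL between predicted and preferred observations
    risk = sum(q * math.log(q / c) for q, c in zip(q_o, preferred_obs) if q > 0)
    # Ambiguity: expected entropy of the likelihood P(o | s)
    ambiguity = -sum(
        q_s[s] * sum(A[o][s] * math.log(A[o][s]) for o in range(len(A)) if A[o][s] > 0)
        for s in range(len(q_s))
    )
    return risk + ambiguity

def policy_posterior(efes, precision=1.0):
    """Q(pi) ∝ exp(-precision * G(pi)), computed stably."""
    m = min(efes)
    w = [math.exp(-precision * (g - m)) for g in efes]
    z = sum(w)
    return [x / z for x in w]

# Hypothetical two-state, two-observation model
A = [[0.9, 0.1],
     [0.1, 0.9]]
preferred_obs = [0.8, 0.2]  # the agent prefers observation 0
predicted_states = {"stay": [0.9, 0.1], "move": [0.1, 0.9]}  # Q(s | policy)

efes = {name: expected_free_energy(q, A, preferred_obs)
        for name, q in predicted_states.items()}
posterior = policy_posterior(list(efes.values()))
best = min(efes, key=efes.get)
```

Both policies have the same ambiguity here because `A` is symmetric, so the preference for `"stay"` comes entirely from the risk term.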

3. Hierarchical Composition

Level 3: Goal Selection (minimize long-horizon EFE)
    ↓ sets reference for
Level 2: Trajectory Planning (predictive distribution)
    ↓ sets reference for  
Level 1: Reflexive Execution (fulfill proprio predictions)
    ↓ actuates
Level 0: Motor Primitives (PD control, actuator dynamics)

GF(3) Balanced Triad

active-inference (0) ⊗ kscale-ksim (0) ⊗ mujoco-playground (0) = 0 ✓

All three are ERGODIC — coordination/infrastructure skills.
This is a "resonant triad" where all components coordinate.

For generation (+1), add: skill-creator, algorithmic-art
For verification (-1), add: sheaf-cohomology, code-review
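
The balance condition can be checked mechanically. The sketch below assumes that ⊗ combines trits by addition in GF(3), which is one reading of the notation above and not a documented rule. The helper `gf3_sum` is hypothetical.

```python
def gf3_sum(trits):
    """Sum of trits in GF(3), returned as a balanced value in {-1, 0, +1}."""
    r = sum(trits) % 3
    return r - 3 if r == 2 else r

# Trit assignments as listed for the three skills above
triad = {"active-inference": 0, "kscale-ksim": 0, "mujoco-playground": 0}
balanced = gf3_sum(triad.values()) == 0

# Adding one generator (+1) and one verifier (-1) keeps the set balanced
extended = dict(triad, **{"skill-creator": +1, "code-review": -1})
still_balanced = gf3_sum(extended.values()) == 0
```

Adding only a generator, or only a verifier, would leave a nonzero sum under this reading. The two roles have to be added as a pair to stay balanced.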

Skill Colors (drand seed 12005093902789493003)

| Skill | Trit | Color | Role |
|-------|------|-------|------|
| active-inference | 0 | #DF8D0F | Coordination (theory) |
| kscale-ksim | 0 | #25BC3D | Coordination (simulation) |
| mujoco-playground | 0 | #93DBDA | Coordination (framework) |

2-3-5-7 Prime Sieve Experts

Applying prime-indexed refinement to identify domain experts:

| Prime | Expert | Domain | Key Contribution |
|-------|--------|--------|------------------|
| 2 | Patrick Kenny | Active Inference | Mean field formulation, PAD criterion |
| 3 | Thomas Parr | Active Inference | 2022 textbook, EFE derivation |
| 5 | Ben Bolte | K-Scale | ksim architecture, open-source humanoids |
| 7 | Karl Friston | Free Energy Principle | FEP foundations, continuous formulation |
| 11 | (DeepMind team) | MuJoCo Playground | MJX, sim2real zero-shot |
| 13 | Wesley Maa | K-Scale | Tooling, visualization |

Mutual Awareness

This skill references and is referenced by:

depends_on:
  - kscale-ksim        # Simulation implementation
  - kscale-ecosystem   # Hardware context
  - mujoco-playground  # Framework foundation
  
referenced_by:
  - cognitive-superposition  # Team mental models
  - parametrised-optics-cybernetics  # Category theory bridge
  - reafference-corollary-discharge  # Sensorimotor prediction

Implementation Pattern

# Unified Active Inference + RL Training Loop
class ActiveInferenceTrainer:
    """
    Combines Kenny's PAD criterion with ksim's PPO.
    """
    def __init__(self, hmm: PhysicsEngine, config: Config):
        self.hmm = hmm
        self.actor = Actor(config)
        self.critic = Critic(config)
        
    def perception_action_divergence(
        self, 
        observations: Array,  # O_{1:t} (past)
        q_future: Distribution  # Q(S_{t+1:T}, O_{t+1:T})
    ) -> Scalar:
        """
        Kenny's PAD = VFE(past) + KL(future states from HMM)
        """
        # Past: standard VFE on observation history
        vfe_past = self.variational_free_energy(observations)
        
        # Future: KL divergence of predicted states from HMM
        # Note: Observable emissions cancel out in future KL
        kl_future = self.kl_future_states(q_future, self.hmm)
        
        return vfe_past + kl_future
    
    def train_step(self, trajectory: Trajectory) -> Metrics:
        # PPO updates approximate mean field coordinate ascent
        # Entropy bonus provides Kenny's regularization
        return ppo_update(
            self.actor, 
            self.critic, 
            trajectory,
            entropy_coef=0.01  # ← The regularizer!
        )

ACSet Schema

@present SchActiveInferenceRobotics(FreeSchema) begin
    # Objects
    HMM::Ob           # Hidden Markov Model (generative model)
    State::Ob         # Latent state
    Observation::Ob   # Sensory observation
    Action::Ob        # Motor command
    Policy::Ob        # Action sequence
    StateAction::Ob   # State-action pair (FreeSchema has no product S × A)
    
    # Morphisms (inference)
    perceive::Hom(Observation, State)    # Perception: O → S
    predict::Hom(State, Observation)     # Prediction: S → O
    act::Hom(State, Action)              # Action selection: S → A
    sa_state::Hom(StateAction, State)    # Projection: (S, A) → S
    sa_action::Hom(StateAction, Action)  # Projection: (S, A) → A
    transition::Hom(StateAction, State)  # Dynamics: (S, A) → S'
    
    # Attributes
    FreeEnergy::AttrType
    vfe::Attr(State, FreeEnergy)         # Variational free energy
    efe::Attr(Policy, FreeEnergy)        # Expected free energy
    pad::Attr(Policy, FreeEnergy)        # Perception/action divergence
    
    # The key relationship (Kenny's contribution):
    # pad ≈ efe + entropy_regularizer
end