How to Use Chakra: Complete Guide
This guide covers everything you need to know to use Chakra: from installation to running experiments, understanding results, tracking with W&B, extending the framework, and troubleshooting.
Table of Contents
- Installation
- Project Structure
- Core Concepts
- Running Your First Experiment
- The Seven-Step Lifecycle
- Configuration System
- Weights & Biases Integration
- CLI Reference
- Working with Each Domain
- Adding Your Own Domain
- Testing
- Troubleshooting
1. Installation
Requirements
- Python 3.10+ (tested with 3.10, 3.11, 3.12, 3.13)
- Git
- (Optional) Weights & Biases account for cloud experiment tracking
- (Optional) Kaggle CLI for cloud notebook execution
Clone and Install
# Clone the repository
git clone https://github.com/The-Harsh-Vardhan/Chakra-Autonomous-Research-System.git
cd Chakra-Autonomous-Research-System
# Create a virtual environment
python -m venv .venv
# Activate it
# Linux/macOS:
source .venv/bin/activate
# Windows PowerShell:
.venv\Scripts\activate
# Windows Command Prompt:
.venv\Scripts\activate.bat
# Install with all dependencies (including dev/test tools)
pip install -e ".[dev]"
Verify Installation
python -m chakra list-domains
You should see output like:
Name Display Name Primary Metric
--------------------------------------------------------------------------------
hndsr_vr HNDSR Satellite Super-Resolution psnr_mean
nlp_lm NLP Language Modelling val_bpb
tabular_cls Tabular Classification accuracy
If you see all three domains, you're ready to go.
2. Project Structure
Chakra-Autonomous-Research-System/
│
├── src/chakra/ # Python source code
│ ├── core/ # Domain-agnostic engine (never touch for new domains)
│ │ ├── interfaces.py # DomainLifecycleHooks protocol
│ │ ├── domain_registry.py # Auto-discovers domains
│ │ ├── lifecycle.py # Generic scaffold → sync → review → promote
│ │ ├── tracker.py # W&B tracker + NullTracker fallback
│ │ └── utils.py # Config loading, .env, seeding, path helpers
│ │
│ ├── domains/ # Each domain is a self-contained plugin
│ │ ├── hndsr_vr/ # Satellite super-resolution (CV)
│ │ ├── nlp_lm/ # Character-level language model (NLP)
│ │ └── tabular_cls/ # Tabular classification (ML)
│ │
│ ├── cli.py # Traditional CLI (python -m chakra)
│ └── chakra_cli.py # Chakra CLI (chakra sutra/yantra/...)
│
├── configs/ # YAML configuration files
│ ├── hndsr_vr/ # Per-domain config sets
│ ├── nlp_lm/
│ └── tabular_cls/
│
├── benchmarks/ # Measured baseline registries (JSON)
├── data/ # Datasets (e.g., titanic.csv)
├── programs/ # Research program documents
├── docs/ # Documentation
├── notebooks/versions/ # Generated Kaggle/Colab notebooks
├── reports/ # Reviews and generated ablation plans
├── artifacts/ # Training outputs (gitignored)
├── tests/ # Test suite (47 tests)
├── .github/workflows/test.yml # CI pipeline
├── pyproject.toml # Build config and dependencies
└── .env # W&B credentials (gitignored, create manually)
Key Directories
| Directory | What Goes Here | Tracked in Git? |
|---|---|---|
| configs/ | YAML training configs | ✅ Yes |
| artifacts/ | Checkpoints, metrics, manifests | ❌ No (gitignored) |
| benchmarks/ | Measured baseline JSON registries | ✅ Yes |
| notebooks/versions/ | Generated Kaggle notebooks | ✅ Yes |
| reports/reviews/ | Version review/roast docs | ✅ Yes |
| reports/generated/ | Ablation suggestions | ❌ No (gitignored) |
3. Core Concepts
Domains
A domain is a self-contained research lane (e.g., NLP, computer vision, tabular ML). Each domain defines its own models, datasets, metrics, and lifecycle hooks. The core engine doesn't know or care about domain-specific details — it only calls the standardized protocol.
Versions
Every experiment run is tied to a version (e.g., v1.0, v2.0, v1.0.1). Each version has:
- A notebook (Kaggle-ready .ipynb)
- A doc (markdown describing the experiment)
- A review (findings + roast + ablation suggestions)
- Three configs: control, smoke, train
Config Variants
Each version always has three config variants:
| Variant | Purpose | When to Use |
|---|---|---|
| control | Baseline model (e.g., logistic regression), minimal or no training | First: establishes the accuracy floor |
| smoke | Main model with heavily limited batches (3 epochs, 5 batches) | Second: validates the pipeline runs end-to-end |
| train | Main model, full training (30+ epochs, all data) | Third: produces the real results |
The Lifecycle
Scaffold → Control → Smoke → Train → Evaluate → Sync → Review → Validate → (Promote or Iterate)
Frozen Ablation Plans
Once a version is scaffolded, its configs are frozen. This prevents goal-drift during autonomous runs. If you want to try different hyperparameters, create a new version (e.g., v1.1).
4. Running Your First Experiment
The fastest experiment uses the Tabular Classification domain with the Iris dataset (150 samples, runs in ~10 seconds on CPU).
Step 1: Scaffold
python -m chakra --domain tabular_cls scaffold-version --version v1.0 --force
This generates the notebook, doc, review template, and config files for v1.0.
Step 2: Control Baseline
python -m chakra.domains.tabular_cls.train_runner \
--config configs/tabular_cls/v1.0_control.yaml \
--run-name v1.0-control \
--device cpu
Output: Best val accuracy: ~16.67%. With 3 balanced classes, chance level is about 33%, so an untrained logistic regression scoring at or below chance is expected; this run sets the floor.
Step 3: Smoke Test
python -m chakra.domains.tabular_cls.train_runner \
--config configs/tabular_cls/v1.0_smoke.yaml \
--run-name v1.0-smoke \
--device cpu
Output: Best val accuracy: ~83%, which confirms that the MLP is learning and the pipeline works end-to-end.
Step 4: Full Training
python -m chakra.domains.tabular_cls.train_runner \
--config configs/tabular_cls/v1.0_train.yaml \
--run-name v1.0-train \
--device cpu
Output: Best val accuracy: ~93.3% — the MLP learns the Iris classification task.
Step 5: Evaluate
python -m chakra.domains.tabular_cls.evaluate_runner \
--config configs/tabular_cls/v1.0_train.yaml \
--run-name v1.0-eval \
--checkpoint artifacts/v1.0-train/checkpoints/v1.0_train_best.pt \
--device cpu
Output: Accuracy=93.33%, F1=0.9296
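The percentages reported in Steps 2 to 5 are easier to read as counts. The following is a rough arithmetic check, assuming the Iris run uses val_split: 0.2 (the value shown in the Config Schema section), so 30 of the 150 samples are held out. The correct-prediction counts are inferred from the percentages, not read from Chakra's logs.

```python
# Assumption: 150 Iris samples with val_split = 0.2 (value from the config schema).
total_samples = 150
val_split = 0.2
val_size = round(total_samples * val_split)  # 30 held-out samples

def accuracy_pct(correct: int, total: int = val_size) -> float:
    """Return accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)

chance_pct = round(100 / 3, 2)  # uniform guessing over 3 balanced classes

print(val_size)          # 30
print(chance_pct)        # 33.33
print(accuracy_pct(5))   # 16.67 (inferred count for the control run)
print(accuracy_pct(25))  # 83.33 (inferred count for the smoke run)
print(accuracy_pct(28))  # 93.33 (inferred count for the train and eval runs)
```

With only 30 validation samples, each prediction moves accuracy by about 3.3 points, so small differences between runs should be read with caution.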
Step 6: Sync Results
python -m chakra --domain tabular_cls sync-run \
--version v1.0 \
--source-dir artifacts/v1.0-train
This indexes training outputs into a structured run manifest at artifacts/runs/v1.0/run_manifest.json.
Step 7: Review
python -m chakra --domain tabular_cls review-run --version v1.0
Generates findings, metric deltas against the benchmark baseline, ablation suggestions, and a "roast" at reports/reviews/v1.0_Tabular_CLS.review.md.
Step 8: Validate
python -m chakra --domain tabular_cls validate-version --version v1.0
Output: v1.0 contract passed for domain 'tabular_cls'.
This checks that all required files (notebook, doc, review, configs) exist.
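Conceptually, the contract check is a file-existence test. The sketch below illustrates the idea only; it is not Chakra's implementation. The helper name missing_files and the path list are assumptions for illustration (the real paths come from the domain's resolve_version_paths() hook).

```python
from pathlib import Path

def missing_files(required: list[Path]) -> list[Path]:
    """Return the required paths that do not exist on disk."""
    return [p for p in required if not p.is_file()]

# Hypothetical subset of the v1.0 contract for tabular_cls.
required = [
    Path("configs/tabular_cls/v1.0_control.yaml"),
    Path("configs/tabular_cls/v1.0_smoke.yaml"),
    Path("configs/tabular_cls/v1.0_train.yaml"),
    Path("reports/reviews/v1.0_Tabular_CLS.review.md"),
]

missing = missing_files(required)
if missing:
    print("contract failed, missing:", *missing, sep="\n  ")
else:
    print("contract passed")
```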
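5. The Seven-Step Lifecycle
Every version goes through this lifecycle, regardless of domain. The diagram shows the seven stages; the table below it also lists the Evaluate step, which runs between Train and Sync.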
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐
│ Scaffold │→ │ Control │→ │ Smoke │→ │ Train │
└─────────┘ └─────────┘ └─────────┘ └──────────┘
│
┌──────────┐ ┌────────┐ ┌──────────┐ │
│ Validate │← │ Review │← │ Sync │ ←─┘
└──────────┘ └────────┘ └──────────┘
│
┌─────┴─────┐
│ Promote? │
└─────┬─────┘
Yes │ No
│ │
Freeze Fork→ v1.1
| Step | CLI Command | What It Does |
|---|---|---|
| Scaffold | scaffold-version --version v1.0 | Creates notebook, doc, review, configs |
| Control | train_runner --config *_control.yaml | Runs baseline model (sets the floor) |
| Smoke | train_runner --config *_smoke.yaml | Quick pipeline sanity check |
| Train | train_runner --config *_train.yaml | Full training run |
| Evaluate | evaluate_runner --checkpoint ... | Evaluate on validation set |
| Sync | sync-run --version v1.0 --source-dir ... | Index outputs into a run manifest |
| Review | review-run --version v1.0 | Generate findings + ablation suggestions |
| Validate | validate-version --version v1.0 | Check all files exist |
6. Configuration System
YAML with Inheritance
All configs use YAML format with an inherits: key for layered configuration:
# configs/tabular_cls/v1.0_train.yaml
inherits: configs/tabular_cls/base.yaml # ← loads base config first
project:
  group: v1.0-train # ← only override what changes
training:
  epochs: 30
  checkpoint_name: v1.0_train_best.pt
The inherits key tells the config loader to:
1. Load base.yaml first
2. Deep-merge the current file's keys on top
3. Return the merged result
This means you only need to specify what's different from the base.
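The three loading steps can be sketched with plain dictionaries. This is a minimal illustration of deep-merging, not Chakra's loader; the function name deep_merge and the base values are assumptions made for the example.

```python
from copy import deepcopy

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged

# Illustrative values only; the real base.yaml may differ.
base = {
    "seed": 42,
    "project": {"name": "chakra", "group": "base"},
    "training": {"epochs": 10, "lr": 0.001},
}
child = {
    "project": {"group": "v1.0-train"},
    "training": {"epochs": 30, "checkpoint_name": "v1.0_train_best.pt"},
}

config = deep_merge(base, child)
print(config["project"])   # {'name': 'chakra', 'group': 'v1.0-train'}
print(config["training"])  # {'epochs': 30, 'lr': 0.001, 'checkpoint_name': 'v1.0_train_best.pt'}
```

In this sketch, non-dict values such as lists are replaced wholesale rather than concatenated; whether Chakra's loader treats lists the same way is not stated in this guide.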
Config Schema
Here's every field supported in a config file:
seed: 42 # Random seed for reproducibility
project:
  name: chakra # W&B project name
  group: v1.0-train # W&B run group
  tags: [tabular, classification] # W&B tags
runtime:
  version: v1.0 # Version label
  lineage: scratch # "scratch" or "pretrained"
  parent: null # Parent version (null for first version)
paths:
  artifact_root: artifacts # Where checkpoints/metrics are saved
  report_root: reports # Where reviews/generated docs go
data:
  dataset: iris # Dataset name ("iris" or "titanic")
  data_file: data/titanic.csv # Path to CSV (Titanic only)
  val_split: 0.2 # Validation set fraction
  batch_size: 32 # DataLoader batch size
tracking:
  enabled: true # Enable W&B tracking
  mode: online # "online", "offline", or "disabled"
  project: chakra
  entity: null # W&B team/entity (null for personal)
  notes: "Description of this run" # Shown in W&B dashboard
model:
  kind: mlp # Model type ("logistic", "mlp", etc.)
  hidden_dim: 64 # MLP hidden layer size
  dropout: 0.2 # Dropout rate
training:
  epochs: 30 # Number of training epochs
  lr: 0.001 # Learning rate
  weight_decay: 0.0001 # L2 regularization
  max_train_batches: null # Limit train batches (null = all)
  max_val_batches: null # Limit val batches (null = all)
  checkpoint_name: v1.0_train_best.pt # Best checkpoint filename
evaluation:
  sample_limit: 100 # Max val batches during evaluation
  save_limit: 8 # Max samples to save for visualization
Creating a New Version's Configs
For a new version (e.g., v1.1), create three config files:
configs/tabular_cls/v1.1_control.yaml # Baseline
configs/tabular_cls/v1.1_smoke.yaml # Quick check
configs/tabular_cls/v1.1_train.yaml # Full training
Each inherits from base.yaml and overrides the specific fields for that variant.
7. Weights & Biases Integration
Setup
1. Get your API key from wandb.ai/authorize
2. Create a .env file in the project root: WANDB_API_KEY=your_api_key_here
3. That's it. All runners call load_dotenv() at startup, which loads the key.
How It Works
The init_tracker() function in core/tracker.py decides which backend to use:
Has WANDB_API_KEY? ──Yes──→ Has wandb installed? ──Yes──→ WandbTracker (full cloud tracking)
│ │
No No
│ │
└──────────────────────────┴──→ NullTracker (local JSON only)
You never lose data. If W&B isn't available, the NullTracker saves all metrics and artifacts to local JSON files in the artifacts/ directory.
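The decision flow can be written as a small pure function. This is a sketch of the logic described in this section, not the code in core/tracker.py; the function name choose_tracker, its arguments, and the default for enabled are assumptions.

```python
import importlib.util
import os

def choose_tracker(tracking: dict, env: dict, wandb_installed: bool) -> str:
    """Return the name of the backend the decision flow would select."""
    # Assumption: a missing "enabled" key is treated as enabled.
    if not tracking.get("enabled", True) or tracking.get("mode") == "disabled":
        return "NullTracker"
    if not env.get("WANDB_API_KEY"):
        return "NullTracker"
    if not wandb_installed:
        return "NullTracker"
    return "WandbTracker"

# Check the current environment without importing wandb itself.
has_wandb = importlib.util.find_spec("wandb") is not None
print(choose_tracker({"enabled": True, "mode": "online"}, dict(os.environ), has_wandb))
```

Keeping the decision separate from the tracker classes makes the fallback easy to test without network access or a W&B account.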
Tracking Modes
Set in your config YAML under tracking.mode:
| Mode | Behavior |
|---|---|
| online | Streams metrics to W&B cloud in real-time |
| offline | Saves W&B run locally, sync later with wandb sync |
| disabled | Forces NullTracker regardless of API key |
What Gets Tracked
| Artifact | Logged As |
|---|---|
| Full resolved config | W&B artifact (type: config) |
| Dataset split manifest | W&B artifact (type: dataset_manifest) |
| Per-epoch metrics | tracker.log_metrics() — visible as charts in W&B |
| Best checkpoint | W&B artifact (type: checkpoint) |
| Train/eval summary | W&B artifact (type: metrics_summary) |
Viewing Results
After a run with mode: online, go to your W&B dashboard:
- Metrics tab: Loss curves, accuracy over epochs
- Artifacts tab: Checkpoints, configs, summaries with full lineage
- System tab: GPU/CPU utilization, memory usage
8. CLI Reference
The CLI has two interfaces: the Chakra CLI (chakra command) and the Traditional CLI (python -m chakra).
Chakra CLI (Recommended)
The Chakra CLI maps each command to a stage of the research cycle:
chakra <command> --domain DOMAIN [OPTIONS]
| Command | Stage | Description |
|---|---|---|
| chakra sutra | Plan | Scaffold version assets and freeze configs |
| chakra yantra | Execute | Run training or evaluation (--stage control\|smoke\|train\|eval) |
| chakra rakshak | Guard | Validate version contract (all required files exist) |
| chakra vimarsh | Review | Sync results and generate structured review |
| chakra manthan | Improve | Propose ablation suggestions for next iteration |
| chakra aavart | Full Cycle | Run the complete loop: Plan → Execute → Guard → Review → Improve |
| chakra list-domains | Discovery | List all auto-discovered domains |
One-command full cycle:
chakra aavart --domain tabular_cls --version v1.0 --device cpu --force
Traditional CLI
The traditional entrypoint is python -m chakra.
Global Options
python -m chakra [--domain DOMAIN_NAME] COMMAND [OPTIONS]
--domain is required for all lifecycle commands.
Discovery Commands
# List all auto-discovered domains
python -m chakra list-domains
# Show detailed info about a domain
python -m chakra --domain tabular_cls domain-info
Lifecycle Commands
| Command | Arguments | Description |
|---|---|---|
| scaffold-version | --version V [--parent P] [--lineage scratch\|pretrained] [--force] | Create version assets |
| validate-version | --version V | Check all files exist |
| sync-run | --version V [--source-dir DIR] [--wandb-url URL] [--dry-run] | Index outputs into manifest |
| review-run | --version V | Generate review + roast |
| next-ablation | --version V | Write ablation suggestions |
| mirror-obsidian | --version V [--output-dir DIR] [--dry-run] | Generate Obsidian note |
Kaggle Commands
| Command | Arguments | Description |
|---|---|---|
| push-kaggle | --version V [--title T] [--username U] [--dry-run] | Push notebook to Kaggle |
| kaggle-status | --version V [--username U] [--dry-run] | Check kernel run status |
| pull-kaggle | --version V [--username U] [--dry-run] | Pull outputs into artifacts |
Execution Orchestration
The run-execution command chooses between local and Kaggle execution paths and always runs a local smoke gate before Kaggle submission.
python -m chakra --domain nlp_lm run-execution --version v1.0 --strategy auto --dry-run
python -m chakra --domain tabular_cls run-execution --version v1.0 --strategy local --dry-run
Behavior summary:
- local runs the train runner directly.
- kaggle performs a local smoke gate first, then push/status/pull.
- auto uses manifest lifecycle/execution hints and system info to choose a backend.
- --dry-run prints the command flow without invoking the backend tools.
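The strategy behavior can be pictured as a function that maps a strategy to an ordered list of steps. This is an illustrative sketch only: plan_steps and the prefer_kaggle flag are hypothetical, and the flag stands in for the manifest hints and system info that the real auto mode consults.

```python
def plan_steps(strategy: str, prefer_kaggle: bool = False) -> list[str]:
    """Return the ordered steps for a strategy: 'local', 'kaggle', or 'auto'."""
    if strategy == "auto":
        # Hypothetical stand-in for the manifest hints and system info.
        strategy = "kaggle" if prefer_kaggle else "local"
    if strategy == "local":
        return ["train"]
    if strategy == "kaggle":
        # The local smoke gate always precedes the Kaggle round trip.
        return ["smoke", "push-kaggle", "kaggle-status", "pull-kaggle"]
    raise ValueError(f"unknown strategy: {strategy!r}")

print(plan_steps("local"))                     # ['train']
print(plan_steps("auto", prefer_kaggle=True))  # ['smoke', 'push-kaggle', 'kaggle-status', 'pull-kaggle']
```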
Runner Commands (Direct)
Runners are invoked directly via Python modules, not through the CLI:
# Training
python -m chakra.domains.<domain>.train_runner \
--config <path> --run-name <name> [--device cpu|cuda]
# Evaluation
python -m chakra.domains.<domain>.evaluate_runner \
--config <path> --run-name <name> --checkpoint <path> [--device cpu|cuda]
9. Working with Each Domain
Tabular Classification (tabular_cls)
Datasets: Iris (150 samples, 4 features, 3 classes) and Titanic (891 samples, 9 features, 2 classes)
Models:
- logistic — Single linear layer (control baseline)
- mlp — Two hidden layers with ReLU and dropout
Metrics: Accuracy (primary), macro F1, cross-entropy loss
Iris (v1.0):
# Full lifecycle
python -m chakra --domain tabular_cls scaffold-version --version v1.0 --force
python -m chakra.domains.tabular_cls.train_runner --config configs/tabular_cls/v1.0_control.yaml --run-name v1.0-control --device cpu
python -m chakra.domains.tabular_cls.train_runner --config configs/tabular_cls/v1.0_train.yaml --run-name v1.0-train --device cpu
python -m chakra.domains.tabular_cls.evaluate_runner --config configs/tabular_cls/v1.0_train.yaml --run-name v1.0-eval --checkpoint artifacts/v1.0-train/checkpoints/v1.0_train_best.pt --device cpu
python -m chakra --domain tabular_cls sync-run --version v1.0 --source-dir artifacts/v1.0-train
python -m chakra --domain tabular_cls review-run --version v1.0
python -m chakra --domain tabular_cls validate-version --version v1.0
Titanic (v2.0):
python -m chakra --domain tabular_cls scaffold-version --version v2.0 --parent v1.0 --force
python -m chakra.domains.tabular_cls.train_runner --config configs/tabular_cls/v2.0_control.yaml --run-name v2.0-control --device cpu
python -m chakra.domains.tabular_cls.train_runner --config configs/tabular_cls/v2.0_train.yaml --run-name v2.0-train --device cpu
python -m chakra.domains.tabular_cls.evaluate_runner --config configs/tabular_cls/v2.0_train.yaml --run-name v2.0-eval --checkpoint artifacts/v2.0-train/checkpoints/v2.0_train_best.pt --device cpu
python -m chakra --domain tabular_cls sync-run --version v2.0 --source-dir artifacts/v2.0-train
python -m chakra --domain tabular_cls review-run --version v2.0
python -m chakra --domain tabular_cls validate-version --version v2.0
NLP Language Modelling (nlp_lm)
Dataset: Tiny Shakespeare (character-level text)
Models:
- bigram — Bigram baseline (control, no context)
- gpt_nano — Minimal GPT transformer (4 layers, 4 heads, 64 embedding dim)
Metrics: Bits-per-byte (BPB, primary — lower is better), perplexity, cross-entropy
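The three metrics are related by standard conversions. The sketch below assumes the cross-entropy is a mean in nats per token (the usual convention in deep learning frameworks) and that each character occupies one byte, which holds for ASCII text. The function names are illustrative and are not taken from Chakra's metrics module.

```python
import math

def bits_per_byte(cross_entropy_nats: float, bytes_per_token: float = 1.0) -> float:
    """Convert mean cross-entropy (nats per token) to bits per byte."""
    return cross_entropy_nats / (math.log(2) * bytes_per_token)

def perplexity(cross_entropy_nats: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy in nats."""
    return math.exp(cross_entropy_nats)

# Loss of a model that spreads probability evenly over 4 characters.
ce = math.log(4)
print(round(bits_per_byte(ce), 6))  # 2.0
print(round(perplexity(ce), 6))     # 4.0
```

Lower cross-entropy lowers both BPB and perplexity, which is why BPB is marked as lower-is-better.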
# Full lifecycle
python -m chakra --domain nlp_lm scaffold-version --version v1.0 --force
python -m chakra.domains.nlp_lm.train_runner --config configs/nlp_lm/v1.0_control.yaml --run-name v1.0-control --device cpu
python -m chakra.domains.nlp_lm.train_runner --config configs/nlp_lm/v1.0_train.yaml --run-name v1.0-train --device cpu
python -m chakra.domains.nlp_lm.evaluate_runner --config configs/nlp_lm/v1.0_train.yaml --run-name v1.0-eval --checkpoint artifacts/v1.0-train/checkpoints/v1.0_train_best.pt --device cpu
python -m chakra --domain nlp_lm sync-run --version v1.0 --source-dir artifacts/v1.0-train
python -m chakra --domain nlp_lm review-run --version v1.0
python -m chakra --domain nlp_lm validate-version --version v1.0
Note: NLP training takes ~5 minutes on CPU. Use --device cuda if you have a GPU.
HNDSR Satellite Super-Resolution (hndsr_vr)
Dataset: Satellite imagery patches (requires local data; see configs/hndsr_vr/base.yaml)
Models:
- Bicubic baseline (control)
- SR3 diffusion model
Metrics: PSNR (primary — higher is better), SSIM
Note: This domain requires the HNDSR dataset configured locally. See configs/hndsr_vr/local.yaml for path overrides.
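For reference, PSNR follows directly from the mean squared error between the restored and reference images. The sketch below uses the standard definition and assumes pixel values scaled to [0, 1]; how Chakra aggregates per-image scores into psnr_mean is not covered here.

```python
import math

def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

print(round(psnr(0.01), 2))   # 20.0
print(round(psnr(0.001), 2))  # 30.0
```

Each tenfold reduction in MSE adds 10 dB, which is why higher PSNR is better.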
10. Adding Your Own Domain
Chakra is designed to be extended. Adding a new domain requires zero changes to the core engine. See Contributing for the full tutorial.
Step-by-Step
1. Create the domain package:
src/chakra/domains/my_domain/
├── __init__.py
├── domain.yaml
├── lifecycle.py
├── models.py
├── dataset.py
├── metrics.py
├── train_runner.py
└── evaluate_runner.py
2. Write domain.yaml:
name: my_domain
display_name: My Research Domain
version_pattern: "^v\\d+\\.\\d+(?:\\.\\d+)?$"
model_kinds: [baseline, main_model]
primary_metric: my_metric
metric_direction: higher_is_better
benchmark_registry: benchmarks/my_domain_registry.json
config_dir: configs/my_domain
programs_doc: programs/my_domain.md
entrypoints:
  lifecycle: chakra.domains.my_domain.lifecycle
  train_runner: chakra.domains.my_domain.train_runner
  evaluate_runner: chakra.domains.my_domain.evaluate_runner
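The version_pattern value is a regular expression, with backslashes doubled because it sits inside a double-quoted YAML string. A quick way to see which labels it accepts is shown below; the helper is_valid_version is an illustrative name, and using fullmatch is an assumption about how strictly the pattern is applied.

```python
import re

# The YAML string "^v\\d+\\.\\d+(?:\\.\\d+)?$" decodes to this regular expression.
VERSION_PATTERN = re.compile(r"^v\d+\.\d+(?:\.\d+)?$")

def is_valid_version(version: str) -> bool:
    """Return True when the whole label matches the domain's version_pattern."""
    return VERSION_PATTERN.fullmatch(version) is not None

for label in ["v1.0", "v2.0", "v1.0.1", "1.0", "v1", "v1.0.0.1"]:
    print(label, is_valid_version(label))
# v1.0, v2.0 and v1.0.1 print True; 1.0, v1 and v1.0.0.1 print False
```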
3. Implement LifecycleHooks in lifecycle.py:
Use tabular_cls/lifecycle.py as your template — it's the simplest complete implementation. You must implement all methods from the DomainLifecycleHooks protocol:
| Method | Purpose |
|---|---|
| version_stem() | Convert v1.0 → v1.0_My_Domain |
| version_slug() | Convert v1.0 → v1-0-my-domain |
| resolve_version_paths() | Return all canonical file paths for a version |
| build_version_configs() | Generate control/smoke/train config dicts |
| render_notebook() | Render a Kaggle-ready notebook JSON |
| render_doc() | Render version documentation markdown |
| render_review() | Render initial review template |
| render_notebook_readme() | Render notebooks directory README |
| default_kernel_metadata() | Return Kaggle kernel metadata dict |
| build_findings() | Analyse a run and return findings + deltas |
| ablation_suggestions() | Suggest next-version experiments |
| roast_lines() | Return domain-specific roast/audit lines |
| validate_version() | Check all required files exist |
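The two naming hooks are simple string transforms. The sketch below reproduces the conversions listed in the table; the standalone functions and the explicit label argument are assumptions for illustration, since the real protocol methods may take different parameters.

```python
def version_stem(version: str, label: str) -> str:
    """Join the version and a display label, e.g. v1.0 -> v1.0_My_Domain."""
    return f"{version}_{label}"

def version_slug(version: str, label: str) -> str:
    """Build a lowercase, hyphen-only slug, e.g. v1.0 -> v1-0-my-domain."""
    return f"{version}-{label}".lower().replace(".", "-").replace("_", "-")

print(version_stem("v1.0", "My_Domain"))  # v1.0_My_Domain
print(version_slug("v1.0", "My_Domain"))  # v1-0-my-domain
```

The stem keeps dots and case so it reads well in file names, while the slug contains only lowercase letters, digits, and hyphens.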
4. Wire W&B tracking in your runners:
from chakra.core.tracker import init_tracker
from chakra.core.utils import load_dotenv, describe_run_dirs

def main():
    load_dotenv()
    # ... parse args, load config ...
    dirs = describe_run_dirs(config, run_name)
    tracker = init_tracker(config, run_name, dirs["tracker"])
    # During training:
    tracker.log_metrics({"loss": 0.5, "accuracy": 0.9}, step=epoch)
    # Save artifacts:
    tracker.log_file_artifact("checkpoint", path, "checkpoint")
    # Always finish:
    tracker.finish()
5. Add supporting files:
# Config files
configs/my_domain/base.yaml
configs/my_domain/v1.0_control.yaml
configs/my_domain/v1.0_smoke.yaml
configs/my_domain/v1.0_train.yaml
# Benchmark registry (empty initially)
benchmarks/my_domain_registry.json
# Research program doc
programs/my_domain.md
6. Register in pyproject.toml:
[tool.setuptools.package-data]
"chakra.domains.my_domain" = ["domain.yaml"]
7. Add tests:
tests/test_my_domain.py
8. Verify:
python -m chakra list-domains # Should show your domain
python -m pytest tests/test_my_domain.py -v
11. Testing
Run All Tests
python -m pytest tests/ -v
Run Specific Domain Tests
python -m pytest tests/test_tabular_domain.py -v
python -m pytest tests/test_nlp_domain.py -v
python -m pytest tests/test_core.py -v
Quick Mode
python -m pytest tests/ -q
Test Coverage Summary
| Test File | Tests | What It Covers |
|---|---|---|
| test_core.py | 16 | Config loading, utils, seeding, domain registry |
| test_tabular_domain.py | 9 | Discovery, protocol, models, dataset, metrics |
| test_nlp_domain.py | 6 | GPT-nano, bigram, dataset, metrics |
| test_domain_registry.py | 5 | Multi-domain discovery, manifest validation |
| test_cli_dispatch.py | 3 | CLI argument parsing, domain dispatch |
| test_runtime_contract.py | 6 | Path resolution, workspace isolation |
| test_lifecycle_review.py | 1 | Full sync → review pipeline |
| test_notebook_contract.py | 1 | Notebook JSON structure |
12. Troubleshooting
"No module named 'chakra'"
You need to install the package:
pip install -e ".[dev]"
"ModuleNotFoundError: No module named 'sklearn'"
Install scikit-learn:
pip install scikit-learn
"No research domains discovered"
Make sure domain.yaml files exist in each domain's directory and are listed in pyproject.toml under [tool.setuptools.package-data].
W&B Not Tracking (using NullTracker)
Check:
1. .env file exists in the repo root with WANDB_API_KEY=...
2. wandb is installed: pip install wandb
3. Config has tracking.enabled: true and tracking.mode: online
"CONFLICT: Merge conflict in README.md"
This happens when both branches changed the same lines of README.md. Resolve it by choosing the version you want (--theirs takes the incoming branch's copy, --ours keeps your branch's copy):
git checkout --theirs README.md # Accept incoming changes
git add README.md
git commit --no-edit
Artifacts Directory Missing
The artifacts/ directory is gitignored. It's created automatically when you run a training command. If you need to start fresh:
# Windows
Remove-Item -Recurse -Force artifacts
# Linux/macOS
rm -rf artifacts
GPU Not Detected
Use --device cpu to force CPU mode, or check:
import torch
print(torch.cuda.is_available())
Config Inheritance Not Working
Make sure the inherits: path is relative to the repo root, not the config file:
# Correct
inherits: configs/tabular_cls/base.yaml
# Wrong
inherits: base.yaml
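The difference between the two forms is easiest to see by resolving them against the repository root. This is a sketch of the path rule only; the helper resolve_inherits and the checkout location are hypothetical.

```python
from pathlib import PurePosixPath

def resolve_inherits(repo_root: PurePosixPath, inherits: str) -> PurePosixPath:
    """Resolve an inherits: value against the repo root, not the child config's folder."""
    return repo_root / inherits

root = PurePosixPath("/work/Chakra-Autonomous-Research-System")  # hypothetical checkout location
print(resolve_inherits(root, "configs/tabular_cls/base.yaml"))
# /work/Chakra-Autonomous-Research-System/configs/tabular_cls/base.yaml
print(resolve_inherits(root, "base.yaml"))
# /work/Chakra-Autonomous-Research-System/base.yaml (not inside configs/tabular_cls/)
```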
Command Cheat Sheet
# ---- Setup ----
pip install -e ".[dev]"
python -m chakra list-domains
# ---- Lifecycle (replace DOMAIN and VERSION) ----
python -m chakra --domain DOMAIN scaffold-version --version VERSION --force
python -m chakra.domains.DOMAIN.train_runner --config configs/DOMAIN/VERSION_control.yaml --run-name VERSION-control --device cpu
python -m chakra.domains.DOMAIN.train_runner --config configs/DOMAIN/VERSION_smoke.yaml --run-name VERSION-smoke --device cpu
python -m chakra.domains.DOMAIN.train_runner --config configs/DOMAIN/VERSION_train.yaml --run-name VERSION-train --device cpu
python -m chakra.domains.DOMAIN.evaluate_runner --config configs/DOMAIN/VERSION_train.yaml --run-name VERSION-eval --checkpoint artifacts/VERSION-train/checkpoints/VERSION_train_best.pt --device cpu
python -m chakra --domain DOMAIN sync-run --version VERSION --source-dir artifacts/VERSION-train
python -m chakra --domain DOMAIN review-run --version VERSION
python -m chakra --domain DOMAIN validate-version --version VERSION
# ---- Chakra CLI (one-command cycle) ----
chakra aavart --domain DOMAIN --version VERSION --device cpu --force
# ---- Testing ----
python -m pytest tests/ -v