sft_trainer

General-purpose SFT (supervised fine-tuning) trainer: fine-tunes any local LM on trace-derived pairs.

Delegates to LoRATrainer from training/lora.py when use_lora=True. Supports train(trace_store) for the end-to-end pipeline and train_on_pairs() for pre-extracted data.

Classes

SFTTrainer

SFTTrainer(config: SFTConfig)

General-purpose supervised fine-tuning trainer.

PARAMETER DESCRIPTION
config

SFTConfig controlling model, LoRA params, and training hyperparams.

TYPE: SFTConfig

Source code in src/openjarvis/learning/intelligence/sft_trainer.py
def __init__(self, config: SFTConfig) -> None:
    self.config = config
Attributes
target_module_list property
target_module_list: List[str]

Parse comma-separated target_modules string into a list.
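The property's source is not shown on this page, so the following is only a minimal sketch of how a comma-separated string might be parsed. The helper name parse_target_modules is hypothetical, and trimming whitespace and dropping empty entries are assumptions rather than confirmed behavior.

```python
from typing import List


def parse_target_modules(target_modules: str) -> List[str]:
    """Split a comma-separated module string into a list (hypothetical helper)."""
    # Assumption: surrounding whitespace is trimmed and empty entries are dropped.
    return [name.strip() for name in target_modules.split(",") if name.strip()]


print(parse_target_modules("q_proj, v_proj"))  # ['q_proj', 'v_proj']
```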

Functions
train
train(trace_store: Any) -> Dict[str, Any]

End-to-end: mine SFT pairs from traces, then train.

PARAMETER DESCRIPTION
trace_store

Object with list_traces() returning trace objects.

TYPE: Any

RETURNS DESCRIPTION
dict with at least a status key.
Source code in src/openjarvis/learning/intelligence/sft_trainer.py
def train(self, trace_store: Any) -> Dict[str, Any]:
    """End-to-end: mine SFT pairs from traces, then train.

    Parameters
    ----------
    trace_store:
        Object with ``list_traces()`` returning trace objects.

    Returns
    -------
    dict with at least ``status`` key.
    """
    pairs = self._mine_pairs(trace_store)
    return self.train_on_pairs(pairs)
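Because trace_store is typed as Any, the only documented requirement is a list_traces() method. One way to make that duck-typed contract explicit is sketched below. TraceStoreLike and InMemoryTraceStore are hypothetical names, the zero-argument signature is an assumption, and the dict used as a trace is a placeholder: the fields that _mine_pairs actually reads are not shown on this page.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class TraceStoreLike(Protocol):
    """Hypothetical protocol: any object exposing list_traces()."""

    def list_traces(self) -> List[Any]: ...


class InMemoryTraceStore:
    """Hypothetical stand-in store that keeps traces in a plain list."""

    def __init__(self, traces: List[Any]) -> None:
        self._traces = list(traces)

    def list_traces(self) -> List[Any]:
        # Return a copy so callers cannot mutate the stored traces.
        return list(self._traces)


# Placeholder trace; the real trace schema is an unknown here.
store = InMemoryTraceStore([{"query": "hi", "result": "hello"}])
print(isinstance(store, TraceStoreLike))  # True
```

An object like this could be handed to train() for a dry run, provided its trace objects match whatever _mine_pairs expects.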
train_on_pairs
train_on_pairs(pairs: List[Dict[str, Any]]) -> Dict[str, Any]

Train on pre-extracted SFT pairs.

PARAMETER DESCRIPTION
pairs

List of dicts with at least input and output keys.

TYPE: List[Dict[str, Any]]

RETURNS DESCRIPTION
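dict with status, training_samples, and training metrics. If pairs is empty or contains fewer than config.min_pairs entries, training is skipped and the dict instead holds status "skipped" and a reason.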
Source code in src/openjarvis/learning/intelligence/sft_trainer.py
def train_on_pairs(self, pairs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Train on pre-extracted SFT pairs.

    Parameters
    ----------
    pairs:
        List of dicts with at least ``input`` and ``output`` keys.

    Returns
    -------
    dict with ``status``, ``training_samples``, and training metrics.
    """
    if not pairs:
        return {"status": "skipped", "reason": "no training data"}

    if len(pairs) < self.config.min_pairs:
        return {
            "status": "skipped",
            "reason": f"only {len(pairs)} pairs, min_pairs={self.config.min_pairs}",
        }

    if self.config.use_lora:
        return self._train_lora(pairs)

    return self._train_full(pairs)
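The two early returns above can be exercised without loading a model. The sketch below copies just that guard logic into a standalone function so callers can see which result dicts to expect. skip_reason is a hypothetical helper, not part of the library, and min_pairs is passed directly instead of being read from SFTConfig.

```python
from typing import Any, Dict, List, Optional


def skip_reason(pairs: List[Dict[str, Any]], min_pairs: int) -> Optional[Dict[str, Any]]:
    """Mirror the guards in train_on_pairs; return None when training would proceed."""
    if not pairs:
        return {"status": "skipped", "reason": "no training data"}
    if len(pairs) < min_pairs:
        return {
            "status": "skipped",
            "reason": f"only {len(pairs)} pairs, min_pairs={min_pairs}",
        }
    return None


pairs = [{"input": "2+2?", "output": "4"}]
print(skip_reason(pairs, min_pairs=10))
# {'status': 'skipped', 'reason': 'only 1 pairs, min_pairs=10'}
```

Callers of train() or train_on_pairs() would typically check result["status"] before reading any training metrics, since a skipped run carries only status and reason.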