training¶
Training data extraction and fine-tuning pipelines for trace-driven learning.
Classes¶
TrainingDataMiner¶
Extract supervised training pairs from stored traces.
| PARAMETER | DESCRIPTION |
|---|---|
| `trace_store` | Any object with a<br>TYPE: |
| `min_quality` | Minimum<br>TYPE: |
| `min_samples_per_class` | Minimum number of samples a query class must have to appear in routing/agent-config results.<br>TYPE: |
Source code in src/openjarvis/learning/training/data.py
Functions¶
extract_sft_pairs¶
Return SFT training pairs from high-quality traces.
Each entry is a dict with keys: `input`, `output`, `query_class`, `model`, `feedback`.
Duplicate `(input, output)` pairs are collapsed; the first occurrence is kept.
Source code in src/openjarvis/learning/training/data.py
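A minimal sketch of the documented deduplication behavior: duplicate `(input, output)` pairs collapse and the first occurrence wins. The pair contents below are hypothetical, but the dict keys match the field list above.

```python
# Hypothetical SFT pairs; keys mirror the documented schema.
pairs = [
    {"input": "2+2?", "output": "4", "query_class": "math", "model": "model-a", "feedback": 0.9},
    {"input": "2+2?", "output": "4", "query_class": "math", "model": "model-b", "feedback": 0.8},
    {"input": "Capital of France?", "output": "Paris", "query_class": "qa", "model": "model-a", "feedback": 1.0},
]

# Collapse duplicate (input, output) pairs, keeping the first occurrence.
seen = set()
deduped = []
for p in pairs:
    key = (p["input"], p["output"])
    if key not in seen:
        seen.add(key)
        deduped.append(p)
```

After deduplication, the second `("2+2?", "4")` entry is dropped and the `model-a` record survives.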
extract_routing_pairs¶
Return per-query-class routing recommendations.
Returns a dict mapping query class to:
- `best_model`: model with the highest average feedback for the class.
- `avg_feedback`: average feedback across all models for the class.
- `sample_count`: total number of qualifying traces in the class.
- `all_models`: dict of `{model: {"avg_feedback": float, "count": int}}`.
Source code in src/openjarvis/learning/training/data.py
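An illustrative, hand-built result matching the field list above (class names, model names, and numbers are invented). It shows how the per-class fields relate: `best_model` tracks the highest per-model average feedback, and `sample_count` is the total across all models.

```python
# Hypothetical extract_routing_pairs output for one query class.
routing = {
    "coding": {
        "best_model": "model-a",
        "avg_feedback": 0.74,
        "sample_count": 30,
        "all_models": {
            "model-a": {"avg_feedback": 0.85, "count": 18},
            "model-b": {"avg_feedback": 0.58, "count": 12},
        },
    },
}

rec = routing["coding"]
# best_model is the entry in all_models with the highest avg_feedback.
best = max(rec["all_models"], key=lambda m: rec["all_models"][m]["avg_feedback"])
# sample_count is the sum of per-model counts.
total = sum(v["count"] for v in rec["all_models"].values())
```

A router can consume this dict directly, e.g. dispatching "coding" queries to `rec["best_model"]` when `rec["sample_count"]` clears a confidence threshold.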
extract_agent_config_pairs¶
Return per-query-class agent and tool recommendations.
Returns a dict mapping query class to:
- `best_agent`: agent with the highest average feedback.
- `best_tools`: most frequently used tools by the best agent.
- `avg_feedback`: average feedback across all agents for the class.
- `sample_count`: total number of qualifying traces in the class.
Source code in src/openjarvis/learning/training/data.py
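The `best_tools` field can be pictured as a frequency ranking over the best agent's tool usage. The sketch below mirrors that logic on invented trace data; the real implementation in `data.py` may differ in details.

```python
from collections import Counter

# Hypothetical traces for the best agent in one query class.
best_agent_traces = [
    {"tools": ["web_search", "calculator"]},
    {"tools": ["web_search"]},
    {"tools": ["calculator", "web_search"]},
]

# Rank tools by how often the best agent used them.
tool_counts = Counter(t for tr in best_agent_traces for t in tr["tools"])
best_tools = [tool for tool, _ in tool_counts.most_common()]
```

Here `web_search` appears three times and `calculator` twice, so the ranking puts `web_search` first.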
LoRATrainer¶
LoRATrainer(config: LoRATrainingConfig, *, model_name: str = 'Qwen/Qwen3-0.6B', device: Optional[str] = None)
Fine-tune a local causal LM with LoRA (or QLoRA) adapters.
| PARAMETER | DESCRIPTION |
|---|---|
| `config` | LoRA training configuration.<br>TYPE: |
| `model_name` | HuggingFace model identifier or local path.<br>TYPE: |
| `device` | PyTorch device string.<br>TYPE: |
| RAISES | DESCRIPTION |
|---|---|
| `ImportError` | If |
Source code in src/openjarvis/learning/training/lora.py
Functions¶
prepare_dataset¶
Convert SFT pairs to tokenized examples.
Each returned dict contains `input_ids`, `attention_mask`,
and `text` (the raw formatted string before tokenization).
| PARAMETER | DESCRIPTION |
|---|---|
| `pairs` | List of dicts with at least<br>TYPE: |
Source code in src/openjarvis/learning/training/lora.py
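A sketch of the documented output shape only. The token ids and the formatted `text` string below are invented stand-ins, not real tokenizer output, and the prompt format is an assumption; the docs guarantee only the three keys.

```python
# Hypothetical single example from prepare_dataset.
example = {
    "input_ids": [1, 345, 678, 2],          # token ids (invented)
    "attention_mask": [1, 1, 1, 1],         # one mask entry per token
    "text": "Instruction: 2+2?\nResponse: 4",  # raw string before tokenization (format assumed)
}
```

A downstream collator would typically expect `input_ids` and `attention_mask` to be the same length, as they are here.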
train¶
Run LoRA fine-tuning on the given SFT pairs.
| PARAMETER | DESCRIPTION |
|---|---|
| `pairs` | List of dicts with at least<br>TYPE: |
| RETURNS | DESCRIPTION |
|---|---|
| `dict` | Training summary with keys: |
Source code in src/openjarvis/learning/training/lora.py
LoRATrainingConfig dataclass¶
LoRATrainingConfig(lora_rank: int = 16, lora_alpha: int = 32, lora_dropout: float = 0.05, target_modules: List[str] = (lambda: ['q_proj', 'v_proj'])(), num_epochs: int = 3, batch_size: int = 4, learning_rate: float = 2e-05, weight_decay: float = 0.01, warmup_ratio: float = 0.1, max_grad_norm: float = 1.0, max_seq_length: int = 2048, use_4bit: bool = False, output_dir: str = 'checkpoints/lora', save_every_n_epochs: int = 1, gradient_checkpointing: bool = True)
Configuration for LoRA / QLoRA fine-tuning.
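The signature above can be exercised without installing openjarvis by re-declaring the dataclass locally. This re-declaration mirrors the documented fields and defaults exactly; the real class lives in `src/openjarvis/learning/training/lora.py`.

```python
from dataclasses import dataclass, field
from typing import List

# Local re-declaration for illustration, mirroring the documented signature.
@dataclass
class LoRATrainingConfig:
    lora_rank: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.05
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])
    num_epochs: int = 3
    batch_size: int = 4
    learning_rate: float = 2e-05
    weight_decay: float = 0.01
    warmup_ratio: float = 0.1
    max_grad_norm: float = 1.0
    max_seq_length: int = 2048
    use_4bit: bool = False
    output_dir: str = "checkpoints/lora"
    save_every_n_epochs: int = 1
    gradient_checkpointing: bool = True

# Override only the fields that differ from the defaults, e.g. a smaller
# rank with 4-bit base-model loading (QLoRA-style):
cfg = LoRATrainingConfig(lora_rank=8, use_4bit=True)
```

Note that `target_modules` uses a `default_factory` (rendered in the signature above as an immediately invoked lambda) so each instance gets its own list rather than sharing one mutable default.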