Hydra Configuration Reference
This document describes the Hydra configuration structure and parameters used for model training and management.
Configuration Overview
The configuration is organized into several sections controlling different aspects of the training pipeline:
defaults:
- _self_
- override hydra/job_logging: disabled
- override hydra/hydra_logging: disabled
model:
# Model architecture and training parameters
# ...
training:
# Training hyperparameters
# ...
paths:
# Directory paths and system locations
# ...
Main Configuration Sections
Defaults Configuration
Key |
Description |
|---|---|
|
Includes current config in composition hierarchy |
|
Disables Hydra’s default job logging |
|
Disables Hydra’s internal system logging |
Model Configuration
Parameter |
Description |
Default |
|---|---|---|
model_name |
Base model identifier from Hugging Face Hub |
“Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it” |
lora_r |
LoRA rank dimension |
16 |
lora_alpha |
LoRA alpha scaling factor |
32 |
qtype |
Quantization type for GGUF conversion |
“q4_1” |
torch_dtype |
Base model dtype (float16/float32) |
“float16” |
Training Configuration
Parameter |
Description |
Default |
|---|---|---|
per_device_train_batch_size |
Batch size per GPU |
1 |
gradient_accumulation_steps |
Number of update steps before backward pass |
4 |
learning_rate |
Initial learning rate |
2e-5 |
max_seq_length |
Maximum input sequence length |
2048 |
gradient_checkpointing |
Enable memory-efficient training |
true |
Paths Configuration
Parameter |
Description |
Example |
|---|---|---|
data_dir |
Input dataset directory |
“data” |
output_dir |
Trained model output directory |
“models” |
llama_cpp_dir |
Path to llama.cpp installation |
“../llama.cpp” |
quantized_path |
llama.cpp quantizer executable path |
“build/bin/llama-quantize” |
Training Pipeline Workflow
The complete training process follows these stages:
- Initialization
Configure logging and environment
Load base model with 4-bit quantization
Prepare tokenizer with custom padding
- Data Preparation
Load dataset from JSON files
Generate chat-formatted prompts
Tokenize with sequence length truncation
- Model Training
Apply LoRA configuration to base model
Train using either SFTTrainer or GRPO
Merge adapter weights with base model
- Model Conversion
Convert merged model to GGUF format
Quantize using llama.cpp tools
Save final weights to output directory
# Simplified pipeline flow
def train_pipeline(cfg):
steps = train(cfg)
with TemporaryDirectory() as tmp_dir:
model_merge_for_converting(cfg, steps, tmp_dir)
convert_to_gguf(tmp_dir, ...)
quantize_model(...)
copy_final_weights(...)
Important Implementation Notes
LoRA Configuration
The model uses Low-Rank Adaptation with these key settings:
Module |
Target Layers |
Parameters |
|---|---|---|
peft.LoraConfig |
proj layers (q_proj, v_proj, etc) |
r=16, alpha=32 |
Modules to Save |
lm_head |
Quantization Setup
The system supports two-stage quantization:
- Training Quantization
4-bit NFQuant via BitsAndBytes
Compatible dtype: float16
- Post-Training Quantization
GGUF conversion with llama.cpp
Supported types: q4_0, q4_1, etc
Note
For optimal performance, ensure llama.cpp is compiled with CUDA support when quantizing on GPU systems.
Logging Configuration
Custom logging setup includes:
Hydra logging disabled for cleaner outputs
W&B integration for experiment tracking
Custom logging levels via
logging_config.py
Warning
The hf_token field must be updated with a valid Hugging Face token
when using private models or datasets.
Environment Requirements
The system requires these key dependencies:
Python 3.8+
PyTorch 2.0+
Transformers 4.30+
PEFT 0.4+
Hydra 1.3+
llama.cpp (latest version)
Full configuration schema available in conf/config.yaml