.. _configuration-reference: Hydra Configuration Reference ================================= This document describes the Hydra configuration structure and parameters used for model training and management. Configuration Overview ---------------------------- The configuration is organized into several sections controlling different aspects of the training pipeline: .. code-block:: yaml defaults: - _self_ - override hydra/job_logging: disabled - override hydra/hydra_logging: disabled model: # Model architecture and training parameters # ... training: # Training hyperparameters # ... paths: # Directory paths and system locations # ... Main Configuration Sections --------------------------- Defaults Configuration ~~~~~~~~~~~~~~~~~~~~~~ .. list-table:: Defaults Overrides :widths: 30 70 :header-rows: 1 * - Key - Description * - ``_self_`` - Includes current config in composition hierarchy * - ``override hydra/job_logging`` - Disables Hydra's default job logging * - ``override hydra/hydra_logging`` - Disables Hydra's internal system logging Model Configuration ~~~~~~~~~~~~~~~~~~~ .. list-table:: Model Parameters :widths: 25 50 25 :header-rows: 1 * - Parameter - Description - Default * - model_name - Base model identifier from Hugging Face Hub - "Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it" * - lora_r - LoRA rank dimension - 16 * - lora_alpha - LoRA alpha scaling factor - 32 * - qtype - Quantization type for GGUF conversion - "q4_1" * - torch_dtype - Base model dtype (float16/float32) - "float16" Training Configuration ~~~~~~~~~~~~~~~~~~~~~~ .. list-table:: Training Parameters (Key Items) :widths: 25 50 25 :header-rows: 1 * - Parameter - Description - Default * - per_device_train_batch_size - Batch size per GPU - 1 * - gradient_accumulation_steps - Number of update steps before backward pass - 4 * - learning_rate - Initial learning rate - 2e-5 * - max_seq_length - Maximum input sequence length - 2048 * - gradient_checkpointing - Enable memory-efficient training - true Paths Configuration ~~~~~~~~~~~~~~~~~~~ .. list-table:: Path Directories :widths: 25 50 25 :header-rows: 1 * - Parameter - Description - Example * - data_dir - Input dataset directory - "data" * - output_dir - Trained model output directory - "models" * - llama_cpp_dir - Path to llama.cpp installation - "../llama.cpp" * - quantized_path - llama.cpp quantizer executable path - "build/bin/llama-quantize" Training Pipeline Workflow -------------------------- The complete training process follows these stages: 1. **Initialization** - Configure logging and environment - Load base model with 4-bit quantization - Prepare tokenizer with custom padding 2. **Data Preparation** - Load dataset from JSON files - Generate chat-formatted prompts - Tokenize with sequence length truncation 3. **Model Training** - Apply LoRA configuration to base model - Train using either SFTTrainer or GRPO - Merge adapter weights with base model 4. **Model Conversion** - Convert merged model to GGUF format - Quantize using llama.cpp tools - Save final weights to output directory .. code-block:: python # Simplified pipeline flow def train_pipeline(cfg): steps = train(cfg) with TemporaryDirectory() as tmp_dir: model_merge_for_converting(cfg, steps, tmp_dir) convert_to_gguf(tmp_dir, ...) quantize_model(...) copy_final_weights(...) Important Implementation Notes ------------------------------ LoRA Configuration ~~~~~~~~~~~~~~~~~~ The model uses Low-Rank Adaptation with these key settings: .. list-table:: LoRA Parameters :widths: 30 50 20 :header-rows: 1 * - Module - Target Layers - Parameters * - peft.LoraConfig - proj layers (q_proj, v_proj, etc) - r=16, alpha=32 * - Modules to Save - lm_head - - Quantization Setup ~~~~~~~~~~~~~~~~~~ The system supports two-stage quantization: 1. **Training Quantization** - 4-bit NFQuant via BitsAndBytes - Compatible dtype: float16 2. **Post-Training Quantization** - GGUF conversion with llama.cpp - Supported types: q4_0, q4_1, etc .. note:: For optimal performance, ensure llama.cpp is compiled with CUDA support when quantizing on GPU systems. Logging Configuration ~~~~~~~~~~~~~~~~~~~~~ Custom logging setup includes: - Hydra logging disabled for cleaner outputs - W&B integration for experiment tracking - Custom logging levels via ``logging_config.py`` .. warning:: The ``hf_token`` field must be updated with a valid Hugging Face token when using private models or datasets. Environment Requirements ------------------------ The system requires these key dependencies: - Python 3.8+ - PyTorch 2.0+ - Transformers 4.30+ - PEFT 0.4+ - Hydra 1.3+ - llama.cpp (latest version) Full configuration schema available in ``conf/config.yaml``