---
library_name: diffusers
tags:
- fp8
- safetensors
- precision-recovery
- mixed-method
- converted-by-gradio
---

# FP8 Model with Per-Tensor Precision Recovery

- **Source**: `https://huggingface.co/spacepxl/Wan2.1-VAE-upscale2x`
- **Original File**: `Wan2.1_VAE_upscale2x_imageonly_real_v1.safetensors`
- **FP8 Format**: `E5M2`
- **FP8 File**: `Wan2.1_VAE_upscale2x_imageonly_real_v1-fp8-e5m2.safetensors`
- **Recovery File**: `Wan2.1_VAE_upscale2x_imageonly_real_v1-recovery.safetensors`
## Recovery Rules Used

```json
[
  {
    "key_pattern": "vae",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "encoder",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "decoder",
    "dim": 4,
    "method": "diff"
  },
  {
    "key_pattern": "all",
    "method": "none"
  }
]
```
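The rule list above can be read as an ordered matcher: each layer name is checked against the patterns in order, and the first match decides the recovery method, with `"all"` acting as a catch-all. The sketch below illustrates that interpretation; `select_method` is a hypothetical helper written for this card, not part of the converter.

```python
# Minimal sketch of first-match rule selection, assuming substring matching
# on layer names and "all" as a wildcard (an assumption, not confirmed by
# the converter's source).
def select_method(key: str, rules: list) -> str:
    for rule in rules:
        pattern = rule["key_pattern"]
        if pattern == "all" or pattern in key:
            return rule["method"]
    return "none"  # no rule matched: leave the layer as plain FP8

rules = [
    {"key_pattern": "vae", "dim": 4, "method": "diff"},
    {"key_pattern": "encoder", "dim": 4, "method": "diff"},
    {"key_pattern": "decoder", "dim": 4, "method": "diff"},
    {"key_pattern": "all", "method": "none"},
]

print(select_method("decoder.block1.conv.weight", rules))  # diff
print(select_method("time_embed.weight", rules))           # none
```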
## Usage (Inference)

```python
import os

import torch
from safetensors.torch import load_file

FP8_FILE = "Wan2.1_VAE_upscale2x_imageonly_real_v1-fp8-e5m2.safetensors"
RECOVERY_FILE = "Wan2.1_VAE_upscale2x_imageonly_real_v1-recovery.safetensors"

# Load FP8 model
fp8_state = load_file(FP8_FILE)

# Load recovery weights if available
recovery_state = load_file(RECOVERY_FILE) if os.path.exists(RECOVERY_FILE) else {}

# Reconstruct high-precision weights
reconstructed = {}
for key in fp8_state:
    fp8_weight = fp8_state[key].to(torch.float32)  # convert to float32 for computation

    # Apply LoRA recovery if available
    lora_a_key = f"lora_A.{key}"
    lora_b_key = f"lora_B.{key}"
    if lora_a_key in recovery_state and lora_b_key in recovery_state:
        A = recovery_state[lora_a_key].to(torch.float32)
        B = recovery_state[lora_b_key].to(torch.float32)
        # Add back the low-rank approximation of the quantization error
        fp8_weight = fp8_weight + B @ A

    # Apply difference recovery if available
    diff_key = f"diff.{key}"
    if diff_key in recovery_state:
        fp8_weight = fp8_weight + recovery_state[diff_key].to(torch.float32)

    reconstructed[key] = fp8_weight

# Use the reconstructed weights in your model
model.load_state_dict(reconstructed)
```
> **Note**: For best results, use the same recovery configuration during inference as was used during extraction.
> Requires PyTorch ≥ 2.1 for FP8 support.

## Statistics

- **Total layers**: 194
- **Layers with recovery**: 60
  - LoRA recovery: 0
  - Difference recovery: 60