mem-agent 4B - Q4_K_M GGUF

This is a 4-bit GGUF quantization of driaforall/mem-agent, a persistent memory agent trained with online RL.

Model Description

mem-agent is a 4B-parameter language model based on Qwen3-4B-Thinking-2507, trained using GSPO (Group Sequence Policy Optimization) to interact with a markdown-based memory system inspired by Obsidian.

This GGUF conversion enables:

  • βœ… LM Studio compatibility - Run the model locally with an intuitive GUI
  • βœ… Windows support - Part of a broader mem-agent Windows porting project
  • βœ… CPU/GPU inference - Optimized for various hardware configurations
  • βœ… Reduced memory footprint - ~2GB model size with minimal performance loss

Original Model

The original model was developed by driaforall and reports strong results:

  • 75% overall score on md-memory-bench
  • Rivals models 50x its size on memory tasks
  • Trained on three core capabilities: Retrieval, Updating, and Clarification

Read the full technical blog post: mem-agent: Persistent, Human Readable Memory Agent

Quantization Details

  • Format: GGUF (Q4_K_M)
  • Precision: 4-bit quantization
  • Size: ~2GB
  • Performance: Minimal degradation compared to full precision (original 4-bit MLX version: 76.8% overall score)
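The ~2GB figure can be sanity-checked with simple arithmetic. Note that Q4_K_M is not a flat 4 bits per weight: it mixes 4- and 6-bit blocks plus per-block scales, so the sketch below assumes an average of roughly 4.8 bits per weight (an approximation; the true average depends on the tensor mix):

```python
# Back-of-envelope estimate of the Q4_K_M file size for a 4B-parameter model.
# The bits-per-weight average is an assumption, not an exact figure.
params = 4e9            # 4B parameters
bits_per_weight = 4.8   # assumed Q4_K_M average (mixed 4/6-bit blocks + scales)
size_gb = params * bits_per_weight / 8 / 1e9
print(f"estimated size: {size_gb:.1f} GB")  # prints "estimated size: 2.4 GB"
```

This lands in the same ballpark as the ~2GB quoted above, consistent with a roughly 4x reduction from 16-bit weights.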

Why This Port?

The original mem-agent MCP server was designed for Mac and Linux environments. This GGUF conversion is part of a Windows porting project to make mem-agent accessible to a broader audience through:

  • LM Studio integration for easy local deployment
  • Cross-platform compatibility
  • Standard GGUF toolchain support (llama.cpp, Ollama, etc.)

πŸͺŸ Windows Port Project

This model is part of the Windows porting effort of mem-agent:

  • Repository: mem-agent-mcp-Windows
  • Goal: Enable mem-agent functionality on Windows systems
  • Integration: Compatible with LM Studio and other GGUF-based tools

Usage

LM Studio

  1. Download the GGUF file
  2. Load it in LM Studio
  3. Configure the model with appropriate system prompts for memory agent functionality
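Once loaded, the model can also be reached through LM Studio's OpenAI-compatible local server (default base URL http://localhost:1234/v1). A minimal sketch of the JSON request body you would POST to /v1/chat/completions; the model name and system prompt below are illustrative placeholders, not the official mem-agent prompt:

```python
import json

# OpenAI-style chat request for LM Studio's local server.
# Endpoint (default): http://localhost:1234/v1/chat/completions
# Model name and system prompt are placeholders for illustration.
payload = {
    "model": "mem-agent-4B-Q4-K-M",
    "messages": [
        {"role": "system",
         "content": "You are a memory agent. Read and update the markdown "
                    "memory files before answering."},  # placeholder prompt
        {"role": "user", "content": "What do you remember about me?"},
    ],
    "max_tokens": 512,
}
print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client can send this payload once the server is running in LM Studio's Developer tab.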

llama.cpp

./llama-cli -m mem-agent-4B-Q4-K-M.gguf -p "Your prompt here" -n 512

(Recent llama.cpp builds ship the CLI as llama-cli; older builds used ./main.)