mem-agent 4B - Q4_K_M GGUF

This is a 4-bit GGUF quantization of driaforall/mem-agent, a persistent memory agent trained with online RL.

Model Description

mem-agent is a 4B-parameter language model based on Qwen3-4B-Thinking-2507, trained using GSPO (Group Sequence Policy Optimization) to interact with a markdown-based memory system inspired by Obsidian.

This GGUF conversion enables:

  • βœ… LM Studio compatibility - Run the model locally with an intuitive GUI
  • βœ… Windows support - Part of a broader mem-agent Windows porting project
  • βœ… CPU/GPU inference - Optimized for various hardware configurations
  • βœ… Reduced memory footprint - ~2GB model size with minimal performance loss

Original Model

The original model was developed by driaforall and reports strong results:

  • 75% overall score on md-memory-bench
  • Rivals models 50x its size on memory tasks
  • Trained on three core capabilities: Retrieval, Updating, and Clarification

Read the full technical blog post: mem-agent: Persistent, Human Readable Memory Agent

Quantization Details

  • Format: GGUF (Q4_K_M)
  • Precision: 4-bit quantization
  • Size: ~2GB
  • Performance: Minimal degradation compared to full precision (original 4-bit MLX version: 76.8% overall score)
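The ~2GB figure can be sanity-checked with simple arithmetic. Note that Q4_K_M is not a flat 4 bits per weight: it mixes 4- and 6-bit blocks plus per-block scales, so the sketch below assumes an average of roughly 4.8 bits per weight (an approximation; the true average depends on the tensor mix):

```python
# Back-of-envelope estimate of the Q4_K_M file size for a 4B-parameter model.
# The bits-per-weight average is an assumption, not an exact figure.
params = 4e9            # 4B parameters
bits_per_weight = 4.8   # assumed Q4_K_M average (mixed 4/6-bit blocks + scales)
size_gb = params * bits_per_weight / 8 / 1e9
print(f"estimated size: {size_gb:.1f} GB")  # prints "estimated size: 2.4 GB"
```

This lands in the same ballpark as the ~2GB quoted above, consistent with a roughly 4x reduction from 16-bit weights.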

Why This Port?

The original mem-agent MCP server was designed for Mac and Linux environments. This GGUF conversion is part of a Windows porting project to make mem-agent accessible to a broader audience through:

  • LM Studio integration for easy local deployment
  • Cross-platform compatibility
  • Standard GGUF toolchain support (llama.cpp, Ollama, etc.)

πŸͺŸ Windows Port Project

This model is part of the Windows porting effort of mem-agent:

  • Repository: mem-agent-mcp-Windows
  • Goal: Enable mem-agent functionality on Windows systems
  • Integration: Compatible with LM Studio and other GGUF-based tools

Usage

LM Studio

  1. Download the GGUF file
  2. Load it in LM Studio
  3. Configure the model with appropriate system prompts for memory agent functionality
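Once loaded, the model can also be reached through LM Studio's OpenAI-compatible local server (default base URL http://localhost:1234/v1). A minimal sketch of the JSON request body you would POST to /v1/chat/completions; the model name and system prompt below are illustrative placeholders, not the official mem-agent prompt:

```python
import json

# OpenAI-style chat request for LM Studio's local server.
# Endpoint (default): http://localhost:1234/v1/chat/completions
# Model name and system prompt are placeholders for illustration.
payload = {
    "model": "mem-agent-4B-Q4-K-M",
    "messages": [
        {"role": "system",
         "content": "You are a memory agent. Read and update the markdown "
                    "memory files before answering."},  # placeholder prompt
        {"role": "user", "content": "What do you remember about me?"},
    ],
    "max_tokens": 512,
}
print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client can send this payload once the server is running in LM Studio's Developer tab.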

llama.cpp

./llama-cli -m mem-agent-4B-Q4-K-M.gguf -p "Your prompt here" -n 512

(Recent llama.cpp builds ship the CLI as llama-cli; older builds used ./main.)