Aaron Feng (AaronFeng753)
- GGUF seems broken on latest llama.cpp · 6 · #1 opened 2 months ago by AaronFeng753
- Outdated chat template · 👍 ➕ 2 · #3 opened 2 months ago by AaronFeng753
- Exceptional Release, This is The Most Powerful OSS Model for 24GB Cards · 👍 4 · 1 · #29 opened 2 months ago by AaronFeng753
- How to break the safety policy more reliably even when using MXFP4 quant · #14 opened 4 months ago by AaronFeng753
- 20B Parameters vs ChatGPT 4 · 13 · #123 opened 4 months ago by Maria99934
- How to change reasoning_effort when using llama-server · 2 · #20 opened 4 months ago by AaronFeng753
- baichuan-inc/Baichuan-M2-32B · 4 · #1263 opened 4 months ago by AaronFeng753
- baichuan-inc/Baichuan-M2-32B · 1 · #1 opened 4 months ago by AaronFeng753
- When 32B? · 👍 8 · 2 · #1 opened 4 months ago by AaronFeng753
- UD-Q4_K_XL and Q5KS · 👍 1 · 2 · #2 opened 4 months ago by AaronFeng753
- microsoft/NextCoder-32B · #1 opened 5 months ago by AaronFeng753
- Multilingual? · #1 opened 6 months ago by AaronFeng753
- Potentially still broken? · 👍 3 · 13 · #8 opened 7 months ago by qenthousiast
- UD-Q5_K_XL? · 👍 3 · 5 · #3 opened 7 months ago by AaronFeng753
- Does the YaRN GGUF work with Ollama? · #2 opened 8 months ago by AaronFeng753
- What are the performance improvements of UD-Q8_K_XL? · #3 opened 8 months ago by AaronFeng753
- 14B-128K? · 2 · #1 opened 7 months ago by AaronFeng753
- [Bug] Potential performance degradation · 👀 2 · 3 · #6 opened 8 months ago by AaronFeng753
- Different KV count · #7 opened 8 months ago by AaronFeng753
- Re-quant GLM-4-32B using llama.cpp/pull/13021, even works in Ollama v0.6.6 · #2 opened 8 months ago by AaronFeng753