mudkipdev's comments | Hacker News

If anyone has a better workflow for creating lots of captions in Kdenlive, please let me know. I had to duplicate each title in the media library and drag it into the timeline, because if I simply copy/pasted, the text content and styling would be shared across instances.

Re-read that

You should. 3.5 MoE was worse than 3.5 dense, so expecting 3.6 MoE to be superior to 3.5 dense is questionable; one could instead argue that 3.6 dense (not yet released) will be superior to 3.5 dense.

Ok, but you made a claim about the new model by stating a fact about the old model, so it's easy to see how it looked like you were talking about different things. As for the claim, Qwen do indeed say that their new 3.6 MoE model is on a par with the old 3.5 dense model:

> Despite its efficiency, Qwen3.6-35B-A3B delivers outstanding agentic coding performance, surpassing its predecessor Qwen3.5-35B-A3B by a wide margin and rivaling much larger dense models such as Qwen3.5-27B.

https://qwen.ai/blog?id=qwen3.6-35b-a3b


This says a slightly different thing:

https://x.com/alibaba_qwen/status/2044768734234243427?s=48&t...

If you look, on many benchmarks the old dense model is still ahead, but in a couple of benchmarks the new 35B demolishes the old 27B. "Rivaling", so YMMV.


Does the large system prompt work fine for this model? If needed, you could use a lightweight CLI like Pi, which comes with only 4 tools by default.

I built a Claude-inspired UI for Ollama/llama.cpp

https://github.com/mudkipdev/chat
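In case it's useful context: a UI like this mostly just wraps the local chat endpoint. A minimal sketch of the kind of request involved, assuming a default Ollama install on localhost:11434 with a model already pulled (the model name below is just an example, not necessarily what the repo uses):

    import requests

    # Minimal sketch: send one chat turn to a local Ollama server.
    # Assumes Ollama is running on its default port and the model
    # (here "llama3.2", just an example) has already been pulled.
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.2",
            "messages": [{"role": "user", "content": "hello"}],
            "stream": False,  # one JSON response instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["message"]["content"])

A real UI would stream tokens instead of waiting for the full response, but the request shape is the same.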


People have made toki pona translation models before, though not ones trained exclusively on it.

It's strange that my iPhone 14 stays at a regular temperature when using the E2B model. But it's also a lot slower (not sure how to measure the exact tokens per second; ~12 if I had to guess).
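If you can run the same model through Ollama on a desktop (not on-device, so only a rough baseline for comparison), its generate endpoint reports eval_count and eval_duration, which give you tokens per second directly. A minimal sketch, assuming a local Ollama install; the model tag is just an example:

    import requests

    # Rough tokens/sec estimate via Ollama's generate endpoint, which
    # reports eval_count (tokens produced) and eval_duration (in
    # nanoseconds) in its final response. The model tag is an example.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "gemma3n:e2b",
            "prompt": "Tell me a short story.",
            "stream": False,
        },
        timeout=300,
    ).json()

    tokens_per_second = r["eval_count"] / (r["eval_duration"] / 1e9)
    print(f"{tokens_per_second:.1f} tok/s")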

This is probably a consequence of the training data being fully lowercase:

You> hello
Guppy> hi. did you bring micro pellets.

You> HELLO
Guppy> i don't know what it means but it's mine.


Great find! It appears uppercase tokens are completely unknown to the tokenizer.

But the character still comes through in the response :)
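This is easy to verify if the tokenizer is published on Hugging Face. A minimal sketch with transformers; the model ID is a placeholder, not the real repo name:

    from transformers import AutoTokenizer

    # Compare how lowercase vs. uppercase input is tokenized.
    # The model ID is a placeholder; substitute the actual repo.
    tok = AutoTokenizer.from_pretrained("some-org/guppy")

    for text in ["hello", "HELLO"]:
        print(text, "->", tok.tokenize(text))

    # If the vocab was built from lowercase-only data, "HELLO" tends
    # to shatter into byte-level fallback pieces rather than map to
    # a single "hello"-like token.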


If you use the 'run' command, it pulls the model automatically for you.


Personally, under 15 tokens per second is too slow for conversation. I guess 5 tokens per second is nice if you're one of the people who likes letting coding agents run overnight.


Can't wait for gemma4-31b-it-claude-opus-4-6-distilled-q4-k-m on huggingface tomorrow


I'd rather see a distill of the 26B model that uses only 3.8B parameters at inference time. Seems like it would be wildly productive for locally-hosted stuff.


gemma4-31b-it-claude-opus-4-6-distilled-abliterated-heretic-GGUF-q4-k-m

