> It is totally plausible but do we really think just in words?
I find that proposition totally implausible. Some people certainly report only thinking in words & having a continuous inner monologue, but I'm not one of them. I think, then I describe my thoughts in words if I'm speaking or writing or thinking about speaking or writing.
Also: I simply asked the 7B model "hi, who are u":
Thought for 1 seconds
Okay, the user is asking "Who are u?" with a casual tone ("u" instead of "you", exclamation mark). Let me break this down.
First, I need to address the question itself. They want to know who I am as an AI assistant. My standard response is that I'm DeepSeek R1, designed to be helpful and handle reasoning step by step. But maybe they're not familiar with AI terms in general. The "u" part might just be their way of speaking, so I should reply politely using "you" form.
Next, considering the context. They started with an HTML tag " "
----
And it hung there without a reply. It might be an OpenWebUI bug, but it's not a promising start.
7B: Hi! I'm Olmo 3, an AI assistant created by the non-profit organization Ai2. I'm here to help with questions, ideas, or tasks you have—just let me know what you need! How can I assist you today? Rawr!
32B: Hi! I'm Olmo, a helpful AI assistant built by the Allen Institute for AI (Ai2). My knowledge is up to December 2024, and I'm designed to assist with a wide range of tasks. How can I help you today?
Every time a new model is released, there are a bunch of reports or written experiences about people using the model with software that seemingly doesn't support it.
GPT-OSS really made that clear, where 90% of the ecosystem declared it broken, but most people were using dumb quants and software that didn't properly support it. Guess we'll repeat the same thing with OLMo now.
I'm really glad to read this, as this was my experience with OLMo in LM Studio. It worked for the first message but got progressively more unstable. It also doesn't seem to reset model state for a new conversation: every response after the model load gets progressively worse, even in new chats.
There are a bunch (currently 3) of examples of people getting funny output, two of which say it's in LM Studio (I don't know what that is). It does seem likely that the model is somehow being misused here and that the results aren't representative.
Definitely. Usually I'd wait 2-3 weeks for the ecosystem to catch up and iron out the kinks, or do what I did for GPT-OSS: fix it in the places where it's broken, then judge it once I'm sure it's actually being used correctly.
Otherwise, in that early period, only use the scripts/tools provided by the people releasing the model itself, which is probably the only way in those 2-3 weeks to be sure you're actually getting the expected responses.
Reminds me of an old joke where a guy is walking down the street and another person says “good morning”. The guy starts deconstructing what “good morning” means until he finally reaches the conclusion “that bastard was calling me an asshole”.
I think they should start aiming for 20B models along with 32B and 7B. Usually 7B is enough for an 8GB GPU, while 32B requires a 24GB GPU for decent quants (I can fit a 32B with IQ3_XXS, but it's not ideal), and 20-ish-B models (such as Magistral or gpt-oss) are a perfect fit for 16GB GPUs.
Depends heavily on the architecture too. I think the free-for-all to find the best sizes is still ongoing, and rightly so. GPT-OSS-120B, for example, fits in around 61GB of VRAM for me with MXFP4.
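As a rough sanity check on those numbers, weight memory is roughly parameter count times bits-per-weight divided by 8. This is only a back-of-the-envelope sketch: it ignores the KV cache, activations, and runtime overhead, and the bits-per-weight values below (IQ3_XXS ≈ 3.06, a mid Q4 quant ≈ 4.5, MXFP4 ≈ 4.25) and the ~117B size for GPT-OSS-120B are approximate.

```javascript
// Back-of-the-envelope weight memory: params * bits-per-weight / 8 bytes.
// Ignores KV cache, activations, and runtime overhead, so real usage is higher.
function estimateWeightGB(paramsBillions, bitsPerWeight) {
  return (paramsBillions * 1e9 * bitsPerWeight) / 8 / 1e9; // decimal GB
}

console.log(estimateWeightGB(7, 4.5).toFixed(1));    // ~3.9 GB  -> comfortable on an 8GB card
console.log(estimateWeightGB(32, 3.06).toFixed(1));  // ~12.2 GB -> the tight IQ3_XXS fit
console.log(estimateWeightGB(32, 4.5).toFixed(1));   // ~18.0 GB -> why 32B wants a 24GB card
console.log(estimateWeightGB(117, 4.25).toFixed(1)); // ~62.2 GB -> close to the ~61GB MXFP4 figure
```

Once you add context and KV cache on top of the weights, the 8GB / 16GB / 24GB pairings above line up with these estimates.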
Personally, I hope GPU makers instead start adding more VRAM, or if one can dream, expandable VRAM.
Well, GPUs are getting more VRAM, although it's pricey. We didn't use to have 96GB VRAM GPUs at all; now they exist :) For those who can afford it, it is at least possible today. Slowly it increases.
Hehe, me too… I went all out on an MBP in 2022 and did it again in April. The only upgrade I didn't bother with was topping out at 128 GB of RAM instead of 64. Then GPT-OSS 120B came out and quickly made me very sad that I can't use it locally.
Same. I repeatedly kick myself for not getting the 128GB version, although not for the GPT-OSS model because I really haven’t been too impressed with it (through cloud providers). But now it’s best to wait until the M5 Max is out due to the new GPU neural accelerators that should greatly speed up prompt processing.
I was very hyped: maybe Google finally did something new and complete, unifying CLI and IDE, a sort of Claude Code Web but as an efficient, IDE-like, local thing.
I like this. I'm imagining a companion extension for Chrome/Firefox that uses you-get as a backend to implement it seamlessly. Forward-thinking idea: imagine going on YouTube and having the you-get extension bypass the YouTube player and play the content directly, without ads. And where I say YouTube, the same goes for any other platform.
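A minimal sketch of how that companion extension could hang together, assuming Chrome's native messaging API: the background worker forwards the page URL to a hypothetical native host (here called "com.example.you_get_host") that wraps `you-get` and replies with a direct stream URL, and the content script swaps the site's player for a plain video element. The host name, message shape, and the "#player" selector are all made up for illustration.

```javascript
// background.js (MV3 service worker) -- needs the "nativeMessaging" permission.
// "com.example.you_get_host" is a hypothetical native host that runs
// `you-get --json <url>` and answers with { streamUrl: "..." }.
chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
  if (msg.type !== "resolve-stream") return;
  chrome.runtime.sendNativeMessage("com.example.you_get_host", { url: msg.url }, sendResponse);
  return true; // keep the channel open for the async reply
});

// content.js -- ask the background worker for a direct URL and bypass the site player.
chrome.runtime.sendMessage({ type: "resolve-stream", url: location.href }, (reply) => {
  if (!reply || !reply.streamUrl) return;                  // fall back to the normal player
  const player = document.querySelector("#player") || document.body; // site-dependent
  const video = document.createElement("video");
  video.src = reply.streamUrl;
  video.controls = true;
  player.replaceChildren(video);
});
```

The native host would be doing the real work; the extension stays a thin messenger, which is also the only way a content script can reach a local binary like you-get.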
This is surely useful right now. I wonder what will happen to all the nice X11 tools once Wayland (hopefully soon) becomes the golden standard. There are options to enable X11 behaviors in Wayland, but I guess that is just a fallback to the insecure implementation.
### Added
- New `model_tokens.json` file containing token limits for various Ollama models.
- Dynamic token limit updating based on selected model in options.
- Automatic loading of model-specific token limits from `model_tokens.json`.
- Chunking and recursive summary for long pages (see the sketch after this list)
- Better handling of markdown returns
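A minimal sketch of the chunking / recursive-summary idea from the list above, assuming the extension talks to Ollama's standard `POST /api/generate` endpoint; the prompts, chunk size, and model handling are placeholders rather than the extension's actual code.

```javascript
// Split long text into chunks, summarize each, then summarize the merged
// summaries, recursing until the text fits in a single pass.
async function ollamaGenerate(model, prompt) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  return (await res.json()).response;
}

async function summarize(text, model, chunkChars = 8000) {
  if (text.length <= chunkChars) {
    return ollamaGenerate(model, `Summarize the following text:\n\n${text}`);
  }
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkChars) {
    chunks.push(text.slice(i, i + chunkChars));
  }
  const partials = [];
  for (const chunk of chunks) {
    // Assumes each summary comes back shorter than its chunk, so the recursion shrinks.
    partials.push(await ollamaGenerate(model, `Summarize this part of a longer page:\n\n${chunk}`));
  }
  return summarize(partials.join("\n\n"), model, chunkChars);
}
```

In the extension itself the chunk size would presumably be derived from the per-model limits in `model_tokens.json` rather than a fixed character count.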
### Changed
- Updated `manifest.json` to include `model_tokens.json` as a web accessible resource.
- Modified `options.js` to handle dynamic token limit updates (see the sketch after the changelog):
- Added `loadModelTokens()` function to fetch model token data.
- Added `updateTokenLimit()` function to update token limit based on selected model.
- Updated `restoreOptions()` function to incorporate dynamic token limit updating.
- Added event listener for model selection changes.
### Improved
- User experience in options page with automatic token limit updates.
- Flexibility in handling different models and their respective token limits.
### Fixed
- Potential issues with incorrect token limits for different models.
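I haven't read the actual source, but here is roughly how I picture the `options.js` changes described above fitting together. The function names come from the changelog; the JSON shape, element IDs, and the 4096 fallback are guesses.

```javascript
// options.js (sketch) -- load per-model token limits and keep the options form in sync.
let modelTokens = {};

// model_tokens.json is listed as a web accessible resource in manifest.json.
// Assumed shape: { "llama3.1:8b": 131072, "mistral:7b": 32768, ... }
async function loadModelTokens() {
  const res = await fetch(chrome.runtime.getURL("model_tokens.json"));
  modelTokens = await res.json();
}

// Update the token-limit field whenever the selected model changes.
function updateTokenLimit() {
  const model = document.getElementById("model").value;
  document.getElementById("tokenLimit").value = modelTokens[model] ?? 4096;
}

// restoreOptions() would load the data, restore saved settings, and wire the listener.
async function restoreOptions() {
  await loadModelTokens();
  // ...restore the saved model/endpoint from chrome.storage here...
  updateTokenLimit();
  document.getElementById("model").addEventListener("change", updateTokenLimit);
}

document.addEventListener("DOMContentLoaded", restoreOptions);
```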
It almost seems like we got inspiration from our brain to build neural networks!