Hacker News | manishsharan's comments

Gemini's large context window is incredible. I concatenate my entire repo and the repos of supporting libraries, then ask it questions.

My last use case was like this: I had an old codebase that used Backbone.js for the UI, along with jQuery and a bunch of old, barely documented JS, to generate the UI for a Clojure web application.

Gemini was able to unravel this hairball of code and guide me step by step to htmx. I am not using AI Studio; I am using a Gemini subscription.

Since I manually patch the code, it's like pair programming with an incredibly patient and smart programmer.

For the record, I am too old for vibe coding... I like to maintain total control over my code and all the abstractions and logic.


I had Gemini ingest our huge AWS CloudFormation repo and asked it to describe each infrastructure component, how it relates to the others, the creation hierarchy, and the IAM setup.

I got a nice, comprehensive infrastructure requirements document out of this.

Now I am using it to create a Terraform repo, deploying that via OpenTofu, and comparing the result to my existing AWS CloudFormation setup. This part is still a WIP.


Yes, the cost of building software dropped by 90%.

However, the cost of software maintenance went up by 1000%. Let's hope you never need to add a new business rule or user interface to your vibe-coded software.


I am curious: could GenAI have written the paper "Attention Is All You Need"? We were trapped in CNN/RNN architectures for a while: could GenAI have arrived at a better architecture?


I have yet to see a convincing example of LLMs producing anything substantially insightful.


Depends on how you define "insight" really.

Is doing a meta-analysis and discovering a commonality "insightful", for example?

Or is insight only something new you discover without basing your discovery on anything?


No, it couldn't have unless these ideas were sandwiched between other ideas that it could interpolate between.

You have to approach genai as a high-dimensional interpolation machine. It can perform extrapolation when you, the user, provide enough information to operate on. It can interpolate between what you provide and what it knows as well.

With these constraints, it is still pretty powerful, and I am generalizing of course. But in my experience, it is terrible at truly novel implementations of anything. It makes countless mistakes, because it continually attempts to fit to patterns found in existing code.

So you can really see the weaknesses at the frontier. I would encourage experimenting there to confirm what I am saying.


BS. I grew up in Delhi. We used to have a large open space where the neighborhood kids and I used to play cricket. Eventually the whole area turned into slums as people from Bangladesh took it over. I was too young to care about ethnicity, but the loss of my cricket field still bothers me. My neighbor was a bank manager, and he once said that government politicians forced him to give "loans" to Bangladeshi people before elections, with no documents and only a thumbprint, to ensure victory for the ruling party.


What does any of this have to do with voter registrations?


How? Please don't say "use the langxxx library".

I am looking for language- and library-agnostic patterns, like MVC for web applications or the Gang of Four patterns, but for building agents.


The whole post is about not using frameworks; all you need is the LLM API. You could do it with plain HTTP without much trouble.
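For anyone wondering what "plain HTTP" looks like, here is a minimal sketch in Clojure (the stack another commenter mentions downthread), using the clj-http and cheshire libraries against an OpenAI-style chat endpoint. The model name and env var are illustrative assumptions, not anything from the post:

    (require '[clj-http.client :as http]
             '[cheshire.core :as json])

    (defn chat
      "POST a chat request to an OpenAI-compatible endpoint and
       return the assistant's reply text."
      [messages]
      (let [resp (http/post "https://api.openai.com/v1/chat/completions"
                            {:headers {"Authorization"
                                       (str "Bearer " (System/getenv "OPENAI_API_KEY"))}
                             :content-type :json
                             :body (json/generate-string
                                     {:model "gpt-4o-mini" :messages messages})
                             :as :json})]
        (get-in resp [:body :choices 0 :message :content])))

    ;; (chat [{:role "user" :content "Say hello."}])

That's the whole "framework": a POST, a JSON body, and a path into the response.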


When I ask for patterns, I am seeking help with recurring problems that I have encountered. Context management, for example: small LLMs (ones with small context windows) break, get confused, and forget the work they have done or the original goal.


Start by thinking about how big the context window is, and what the rules should be for purging old context.

Design patterns can't help you here. The hard part is figuring out what to do; the "how" is trivial.
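To make "figuring out what to do" concrete, here is a hedged Clojure sketch of one possible purging rule: keep the system prompt and drop the oldest messages first. The 4-characters-per-token estimate is a crude assumption; a real implementation should use the provider's tokenizer:

    (defn approx-tokens
      "Crude estimate: roughly 4 characters per token."
      [msg]
      (quot (count (:content msg "")) 4))

    (defn trim-context
      "Keep the system message; drop the oldest of the remaining
       messages until the context fits the token budget."
      [messages token-budget]
      (let [system (first messages)]
        (loop [msgs (vec (rest messages))]
          (if (or (empty? msgs)
                  (<= (reduce + (map approx-tokens msgs)) token-budget))
            (into [system] msgs)
            (recur (subvec msgs 1))))))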


That's why you want to use sub-agents that handle smaller tasks and return results to a delegating agent. That way, every agent has its own very specialized context window.
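A sketch of that delegation pattern, reusing the hypothetical `chat` helper from the earlier snippet: each sub-agent runs in a fresh, isolated context, and only its final answer flows back to the delegator.

    (require '[clojure.string :as str])

    (defn run-subagent
      "Run one task in a fresh, isolated context; only the answer escapes."
      [task]
      (chat [{:role "system" :content "You are a focused worker. Be terse."}
             {:role "user"   :content task}]))

    (defn delegate
      "Fan tasks out to sub-agents, then synthesize their (small) results
       in the delegating agent's own context."
      [tasks]
      (let [results (mapv run-subagent tasks)]
        (chat [{:role "system" :content "Synthesize these results into one answer."}
               {:role "user"   :content (str/join "\n---\n" results)}])))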


That's one legit answer. But if you're not stuck in Claude's context model, you can do other things. One extremely stupid simple thing you can do, which is very handy when you're doing large-scale data processing (like log analysis): just don't save the bulky tool responses in your context window once the LLM has generated a real response to them.
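A hedged sketch of that trick in Clojure; the 500-character threshold and the message shape are assumptions for illustration:

    (defn compact-tool-results
      "Once the model has answered, replace bulky tool outputs with stubs."
      [messages]
      (mapv (fn [msg]
              (if (and (= (:role msg) "tool")
                       (> (count (:content msg "")) 500))
                (assoc msg :content "[tool output elided after use]")
                msg))
            messages))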

I gave my own dumb TUI agent a built-in `lobotomize` tool, which dumps a text list of everything in the context window (short summary text plus token count) and then lets it Eternal Sunshine of the Spotless Agent things out of the window. It works! The models know how to drive that tool. It'll do a series of giant-ass log queries, filling up the context window, and then you can watch as it zaps things out of the window to make space for more queries.

This is like 20 lines of code.
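In that spirit, here is a rough Clojure sketch of such a tool. The entry shape and function names are invented for illustration, but the mechanics (list entries with token counts, delete by id) really are the whole trick:

    (require '[clojure.string :as str])

    (def context (atom []))  ; entries: {:id 1 :summary "..." :tokens 1234 :content "..."}

    (defn lobotomize-list
      "Tool output: one line per entry, summary plus token count,
       so the model can decide what to zap."
      []
      (str/join "\n"
                (for [{:keys [id summary tokens]} @context]
                  (str id ": " summary " (" tokens " tokens)"))))

    (defn lobotomize-remove!
      "Zap the given entry ids out of the context window."
      [ids]
      (swap! context (fn [entries]
                       (vec (remove (comp (set ids) :id) entries)))))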


Did something similar - added `summarize` and `restore` tools to maximize/minimize messages. Haven't gotten it to behave like I want. Hoping that some fiddling with the prompt will do it.


FYI -- I vouched for you to undead this comment. It felt like a fine comment? I don't think you are shadowbanned, but consider emailing the mods if you think you might be.


I'm not going to link my blog again but I have a reply on this post where I link to my blog post where I talk about how I built mine. Most agents fit nicely into a finite state machine or a directed acyclic graph that responds to an event loop. I do use provider SDKs to interact with models but mostly because it saves me a lot of boilerplate. MCP clients and servers are also widely available as SDKs. The biggest thing to remember, imo, is to keep the relationship between prompts, resources, and tools in mind. They make up a sort of dynamic workflow engine.
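For readers who want the shape of that finite-state-machine idea, here is a toy Clojure sketch; the states and transitions are invented placeholders, not the commenter's actual design:

    (def transitions
      {:await-input (fn [st ev] (assoc st :state :call-llm :input (:text ev)))
       :call-llm    (fn [st _]  (assoc st :state :run-tools
                                       :plan (str "plan for: " (:input st))))
       :run-tools   (fn [st _]  (assoc st :state :respond))
       :respond     (fn [st _]  (assoc st :state :done))})

    (defn step
      "Advance the machine by one event; ignore events once done."
      [st ev]
      (if-let [f (transitions (:state st))]
        (f st ev)
        st))

    (defn run-agent [events]
      (reduce step {:state :await-input} events))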


Fine... say your country has several years of drought and bad harvests. What happens then? Do you trade then?

Or... let's say that due to weather, your farmers cannot grow enough oranges or some other fruit, which drives up local prices. Should only the richest people in your country get to eat fruit?

Or you discover lithium deposits that your national industry cannot use. Should you let them just sit there, knowing they could make your province prosperous if traded?


You took a far too extreme interpretation, and it ended up backwards. What normal countries do is trade anyway, but with tariffs and subsidies, so that in normal circumstances local food stays competitive and farmers keep operating, while if there's a local production problem, imported goods become more competitive. That buffers the population from extremes of price and availability.


Sure, you can trade, but it is a choice. Claiming that free trade is universally good amounts to saying there is only one right choice: no barriers.


I am here to hear from folks running LLMs on the Framework Desktop (128GB). Is it usable for agentic coding?


Just started going down that route myself. For the money it performs well and runs most of the models at reasonable speeds.

1. Thermal considerations are important, because chips throttle for thermal protection. Apple seems best at this, but $$$$. The Framework (AMD) seems a reasonable compromise (you can get almost three for the price of one Mini). Laptops will likely not perform as well. NVIDIA seems really bad at thermal/power considerations.

2. The memory model matters, and AMD's APU design is an improvement. NVIDIA GPUs were designed for graphics but were better than CPUs for AI, so they got used. Bespoke AI solutions will eventually dominate; that may or may not be NVIDIA in the future.

My primary interest is AI at the edge.


Thanks for sharing. TIL about rerankers.

Chunking strategy is a big issue. I got acceptable results by shoving large texts into Gemini Flash and having it summarize and extract chunks, instead of using whatever text splitter I tried before. I use the method published by Anthropic (https://www.anthropic.com/engineering/contextual-retrieval), i.e. include the full summary along with the chunks for each embedding.
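A rough Clojure sketch of that contextual-retrieval step, again reusing the hypothetical `chat` helper from earlier in the thread; the prompt wording is mine, not Anthropic's:

    (defn contextualize-chunks
      "Summarize the whole document once, then prepend that summary to
       every chunk so each embedding carries document-level context."
      [document chunks]
      (let [summary (chat [{:role "user"
                            :content (str "Summarize this document in a few sentences:\n\n"
                                          document)}])]
        (mapv (fn [chunk]
                {:text-to-embed (str summary "\n\n" chunk)
                 :raw-chunk     chunk})
              chunks)))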

I also created a tool that lets the LLM do vector search on its own.
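Such a tool can be as small as a handler like this, exposed to the model via tool-calling; the embedding function and in-memory store are hypothetical stand-ins for whatever vector store is actually used:

    (defn cosine [a b]
      (let [dot (reduce + (map * a b))
            mag (fn [v] (Math/sqrt (reduce + (map * v v))))]
        (/ dot (* (mag a) (mag b)))))

    (defn vector-search-tool
      "Tool handler the LLM can call: embed the query, return the
       top-k closest chunks from the store."
      [store embed-fn query k]
      (let [qv (embed-fn query)]
        (->> store
             (map (fn [{:keys [text embedding]}]
                    {:text text :score (cosine qv embedding)}))
             (sort-by :score >)
             (take k)
             (mapv :text))))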

I do not use LangChain or Python; I use Clojure plus the LLMs' REST APIs.


I made a startup, https://tokencrush.ai/, to do just this.

I've struggled to find a target market though. Would you mind sharing what your use case is? It would really help give me some direction.


Have you measured your latency, and how sensitive are you to it?


>> Have you measured your latency, and how sensitive are you to it?

Not sensitive to latency at all. My users would rather have well researched answers than poor answers.

Also, I use batch-mode APIs for chunking... they are so much cheaper.


You mean a multi-cloud strategy! You wanna know how you got here?

See, the sales team from Google flew one executive out to the NBA Finals, the Azure sales team flew another executive out to the NFL Super Bowl, and the AWS team flew yet another executive out to the Wimbledon finals. And that's how you end up with a multi-cloud strategy.


In this particular case, it was resume-oriented architecture (ROAr!) The original team really wanted to use all the hottest new tech. The management was actually rather unhappy, so the job was to pare that down to something more reliable.


Eh, businesses want to stay resilient against a single vendor going down. My least favorite interview question this past year was about multi-cloud, because imho it just isn't worth it: the increased complexity, the attempt to match like-for-like services across clouds that aren't really the same, and the ongoing cost of chaos-monkeying and testing that it all actually works, especially in the face of a partial outage like this one versus something "easy" like a complete loss of network connectivity... but that is almost certainly not what CEOs want to hear (and they are mostly who I am dealing with here, going for VPE- or CTO-level jobs).

I couldn't care less about having more vendor dinners when I know I am promising a falsehood that is extremely expensive and will likely cost me my job or my credibility at some point.


sticker shock / looking at alternative vendors

