I did a post [0] about this last year, and vanilla LLMs didn't do nearly as well as I'd expected on Advent of Code, though I'd be curious to try this again with Claude Code and Codex
> LLMs, and especially coding focused models, have come a very long way in the past year.
I see people assert this all over the place, but personally I have decreased my usage of LLMs in the last year. Over that same period I've also increasingly developed a reputation as "the guy who can get things shipped" at my company.
I still use LLMs, and likely always will, but I no longer let them do the bulk of the work and have benefited from it.
Last April I asked Claude Sonnet 3.7 to solve AoC 2024 day 3 in x86-64 assembler and it one-shotted solutions for parts 1 and 2(!)
It's true this was 4 months after AoC 2024 came out, so it may have been trained on the answer, but I think that's too soon for solutions to have made it into training data.
Day 3 in 2024 isn't a Math Olympiad tier problem or anything, but it seems novel enough, and my prior experience with LLMs was that they were absolutely atrocious at assembler.
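For reference (from memory, so treat the details as approximate): part 1 of that puzzle asks you to scan corrupted memory for valid mul(X,Y) instructions with 1-3 digit operands and sum the products. A minimal Python sketch, not the assembly version the model was asked for:

```python
import re

def sum_muls(memory: str) -> int:
    # Find well-formed mul(X,Y) instructions (1-3 digit operands)
    # and sum the products; anything malformed is ignored.
    return sum(int(a) * int(b)
               for a, b in re.findall(r"mul\((\d{1,3}),(\d{1,3})\)", memory))
```

Part 2 (again from memory) adds do()/don't() instructions that enable and disable the muls, which mostly means scanning the three patterns in order and keeping a flag.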
Let me pose a similar challenge, with these constraints:
- runs on a laptop CPU
- decides whether a long article is relevant to a specified topic; maybe even produces a summary of the article or picks out the interesting parts as specified in the prompt instructions
- no fine-tuning, please
Though instead of being a single file, you and the LLM shape your context to be easily searchable (folders and files). It's all version controlled too, so you can easily update context as the project evolves.
Although I developed it explicitly without search; instead I catered it to the latest agents, which are all really good at searching and reading files. It's meant for dev workflows (i.e. a project context, a user context)
I made a video showing how easy it is to pull context into whatever IDE/desktop app/CLI tool you use
I’ve been building a tool to help me co-manage context better with LLMs
When you load it into your favourite agents, you can safely assume whatever agent you're interacting with is immediately up to speed with what needs to get done, and it too can update the context via MCP
I built a tool with this exact workflow in mind, but with MCP and versioning included, so you can easily track and edit the files on any platform, including Cursor, Claude Desktop, etc.
CLI for the humans, MCP for the LLMs. Whatever is in the context repository should be used by the LLM for its next steps, and you are both responsible for maintaining it as tasks get done and the project evolves
I have a similar flow to this, but using files which are part of the repository. For your tool: what made you choose to version it with git, given that context is not code? Wouldn't you end up with multiple versions of, say, your backlog, and to what benefit?
The way I see it, context evolves somewhat orthogonally to code, but you still want to track it in similar ways. Having git under the hood makes it easy to track its evolution and undo/diff things LLMs might decide to change, and it also means that tracking your todos and new feature ideas doesn't pollute your codebase
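The "git under the hood, separate from the codebase" idea can be sketched as a plain shell workflow (the layout and file names here are hypothetical, not the tool's actual structure):

```shell
# Hypothetical layout: a context repo living next to, not inside, the codebase
mkdir -p ctx/project ctx/user
git -C ctx init -q
git -C ctx config user.name "ctx"
git -C ctx config user.email "ctx@example.com"

# Context files that both the human and the LLM edit
echo "- [ ] add auth to the API" > ctx/project/backlog.md
echo "prefers small PRs, tests first" > ctx/user/preferences.md

git -C ctx add -A
git -C ctx commit -q -m "seed project context"

# Later: diff or undo whatever the agent changed
git -C ctx log --oneline
```

Because the context lives in its own repo, the backlog has one current version at HEAD, with its history available for diffing, rather than multiple competing copies.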
https://www.jerpint.io/blog/2021-03-18-cnn-cheatsheet/