I think both Cursor and Cognition are going in the same direction as SWE-grep[0].
SWE-grep was able to hit ~700 tokens/s and Cursor ~300 tokens/s, though precision/recall and cost-effectiveness are hard to compare, considering SWE-grep also adopted the "hack" of running on Cerebras.
I'm trying to kickstart an RL-based code search project called "op-grep" here[1]. It's still pretty early, but I'm looking for collaborators!
Thanks for sharing your blog! Very interesting work, and I 100% agree with your 3 criteria on the sweet spot for AI. Most systems-performance problems fit right in.
I recommend reading Shopify CEO Tobi's try[0] as a good example of how Ruby's block behavior and meta-programming make it easy to create a single-file shell wrapper.
I've drafted an architecture; the steps are roughly:
1. Collect action (grep/glob/read) policies, either from usage logs or open datasets
2. Optimize them by removing redundant actions or parallelizing independent ones
3. Train a model on the optimized action policies
4. Release the model as a single-file MCP tool
(Refer to repo for visual diagram of the architecture)
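To make step 2 concrete, here's a minimal sketch of what the optimization pass could look like. This is my own illustration, not code from the repo: it deduplicates repeated actions, then batches consecutive reads (which touch distinct files and don't depend on each other) into groups that could run in parallel.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    tool: str  # "grep" | "glob" | "read"
    arg: str   # search pattern or file path

def optimize(trace: list[Action]) -> list[list[Action]]:
    """Deduplicate repeated actions, then batch consecutive
    read actions into groups that could run in parallel."""
    seen: set[Action] = set()
    deduped: list[Action] = []
    for a in trace:
        if a not in seen:
            seen.add(a)
            deduped.append(a)
    # Naive batching: consecutive reads are treated as independent.
    batches: list[list[Action]] = []
    current: list[Action] = []
    for a in deduped:
        if a.tool == "read" and current and current[-1].tool == "read":
            current.append(a)
        else:
            if current:
                batches.append(current)
            current = [a]
    if current:
        batches.append(current)
    return batches
```

A real implementation would need actual dependency analysis (e.g. a read that consumes a grep result can't be reordered past it), but the dedupe-then-batch shape is the idea.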
I've just released the base model and added `openai_forwarder.py` to start collecting action policies.
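I haven't looked at the forwarder's internals, but the core idea of collecting action policies from live traffic can be sketched as: proxy OpenAI-style chat responses and append any tool calls to a JSONL log. Everything here (the log path, the record shape) is my assumption, not the repo's actual format.

```python
import json
import time
from pathlib import Path

# Assumed log location; the real forwarder may use something else.
LOG = Path("action_policies.jsonl")

def log_tool_calls(response: dict, log_path: Path = LOG) -> int:
    """Append tool calls found in an OpenAI-style chat-completion
    response dict to a JSONL log; return how many were recorded."""
    records = []
    for choice in response.get("choices", []):
        for tc in choice.get("message", {}).get("tool_calls", []) or []:
            records.append({
                "ts": time.time(),
                "tool": tc["function"]["name"],
                "args": tc["function"]["arguments"],
            })
    with log_path.open("a") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return len(records)
```

Each JSONL line is then a (tool, arguments) pair that can later be assembled into the traces used for training.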
Looking for more eyes and contributors to make this a reality, thanks!
This bears very little resemblance to SWE-grep haha. At least fine-tune a small pre-trained LLM or something on a retrieval dataset. But no, this literally tries to train a small RNN from scratch to retrieve results given a natural-language query...
We have other things in store that can be used by other coding agents; this one was tuned to use custom fast search tools that kinda wouldn't be useful in other agents.
[0]: https://blog.toolkami.com/mcp-server-in-a-file/