Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

LLMs forcing us to improve our documentation habits. Seriously though, many languages allow API doc generation out of comments. Maybe these docs can just be flattened into a file.


Yes sort of. This particular codebase is a mix of Java and Kotlin, and all my internal code is documented with proper Javadocs/KDocs already since years, just for myself and other people I work with. That's partly why Gemini can make such accurate maps.

The problem isn't a lack of docs but rather birds-eye context: even with models that allow huge context windows and are fast, you can drown a model in irrelevant stuff and it's expensive. I'm still with Claude 3.5 for coding and its window is large but not unlimited. You really don't want to add a bunch of source files _and_ the complete API docs for tens of thousands of classes into every prompt, not unless you like waiting whilst money burns and getting problems due to the model getting distracted.

It's also just wasteful, docs contain a lot of redundancy and stuff the model can guess. If you ask models to make notes about only the surprising stuff, it's a form of compression that lets you make smaller prompts.

Aider provides a quick fix because it's easy to control what files are in the context. But to 'level up' I need to let the AI find and add files itself. Aider can do this: it gives the model tools for requesting files to be added to the chat. And in theory, Aider computes a PageRank over symbols and symbolic references to find the most important stuff in the repository and computes a map that's prepended to the prompt so the model knows what to ask for. In practice for reasons I don't understand, Aider's repo maps in this project are full of random useless stuff. Maybe it works better for Python.

Finding the right way to digest codebases is still an open problem. I haven't tried RAG, for instance. If things are well abstracted it in theory shouldn't be needed.


Indexing automatically generated API docs using Cursor seems to work very well. I also index any guides/mdbooks libraries have available, depending on whether I’m trying to implement something new or modifying existing code.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: