Hacker News | davidkunz's comments

Falling sand games always remind me of the game Clonk. As a kid, I enjoyed digging tunnels, flooding them with water, all physics based. Great times.

Please standardize the folder.

  .claude/skills
  .codex/skills
  .opencode/skills
  .github/skills


This is happening as we speak.

Codex started this and OpenCode followed suit within the hour.

https://x.com/embirico/status/2018415923930206718


“Proposal: include a standard folder where agent skills should be”

https://github.com/agentskills/agentskills/issues/15


Could we adhere to the XDG standard and put config in ~/.config/agents? Or perhaps create a new XDG standard, like $XDG_AGENTS_HOME?


I find that even though this isn't standard, these CLI tools will scan the repo for .md files and, for the most part, execute the skills accordingly. Having said that, I would much prefer standards, not just for this but for plugins as well.


Standards for plugins makes sense, because you're establishing a protocol that both sides need to follow to be able to work together.

But I don't see why you need a strict standard for "an informal description of how to do a particular task". I say "informal" because it's necessarily written in prose -- if it were formal, it'd be a shell script.



I mean, it'd be good if these tools followed the XDG base directory spec and put their config in `~/.config/claude` etc. instead of `~/.claude`.

It's one of my biggest pet peeves with a lot of these tools (now admittedly a lot of them have a config env var to override, but it'd be nice if they just did the right thing automatically).
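
For reference, the lookup the XDG Base Directory spec asks for is short. A minimal sketch in Python, assuming a hypothetical helper `config_dir` and using `claude` only as an example name; this is not how any of these tools actually resolve their config today:

```python
import os
from pathlib import Path


def config_dir(tool, env=None):
    """Return the XDG-style config directory for `tool` (hypothetical helper)."""
    env = os.environ if env is None else env
    base = env.get("XDG_CONFIG_HOME", "")
    # Per the spec, an unset or empty XDG_CONFIG_HOME falls back to ~/.config,
    # and a relative path is invalid and should be ignored.
    if not base or not os.path.isabs(base):
        base = os.path.join(env.get("HOME", os.path.expanduser("~")), ".config")
    return Path(base) / tool
```

With `HOME=/home/u` and no override this resolves to `/home/u/.config/claude`; setting `XDG_CONFIG_HOME` moves it without any tool-specific env var.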


.agent/

Skills seem a bit early to standardize. We are so early in this, why do we want to handcuff our creativity so soon?


Skills are a really simple concept. They're just custom prompts with a name and some metadata. What are you afraid of handcuffing?



All the more reason to standardise it


Eventually, you can standardize what you don't understand

The problem I see now is that everyone wants to be the winner in a hype cycle and be the standards bringer. How many "standards" have we seen put out now? No one talks about MCP much anymore, langchain I haven't seen in more than a year, will we be talking about Skills in another year?


We keep standardising without adding versioning :(


They are more than that, for example the frontmatter and code files around them. The spec: https://agentskills.io/specification

Why do I want to throw away my dependency management system and shared libraries folder for putting scripts in skills?

What tools do they have access to, can I define this so it's dynamic? Do skills even have a concept for sub tools or sub agents? Why do I want to put references in a folder instead of a search engine? Does frontmatter even make sense, why not something closer to a package.json in a file next to it?

Does it even make sense to have skills in the repo? How do I use them across projects? How do we build an ecosystem and dependency management system for skills (which are themselves versioned)


> They are more than that, for example the frontmatter and code files around them.

You are right. I have edited my post slightly.

> Why do I want to throw away my dependency management system and shared libraries folder for putting scripts in skills?

You don't have to put scripts in skills. The script can be anywhere the agent can access. The skill just needs to tell the LLM how to run it.

> Does it even make sense to have skills in the repo? How do I use them across projects?

You don't have to put them in the repo. E.g. with Claude Code you can put project-specific skills in `.claude/skills` in the repo and system-wide skills in `~/.claude/skills`.
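
To make the two locations concrete, here is a rough sketch of a lookup that checks the project folder before the system-wide one. The function names are made up, the "project first" ordering is my assumption rather than documented behavior, and the `SKILL.md` file name follows the Agent Skills spec layout:

```python
from pathlib import Path


def skill_dirs(project_root, home):
    """Candidate skill folders, project-specific first (ordering is an assumption)."""
    return [project_root / ".claude" / "skills", home / ".claude" / "skills"]


def find_skill(name, project_root, home):
    """Return the first folder called `name` that contains a SKILL.md, else None."""
    for base in skill_dirs(project_root, home):
        candidate = base / name / "SKILL.md"
        if candidate.is_file():
            return candidate.parent
    return None
```

A skill that exists only under `~/.claude/skills` is then visible from every repo, while a same-named folder in the repo shadows it.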


2. The spec / docs show people how to put code in a subdir. While you can reference external scripts, there is a blessed pattern that seems like an anti-pattern to me

3. generalize: how do I store, maintain, and distribute skills shared by employees who work on multiple repos. Sounds like standard dependency management to me. Does to some of the people building collections / registries. Not sure if any of them account for versioning, have not seen anything tied to lock files (though I'd avoid that by using MVS for dep selection)


Agreed. I think being overly formal about what can be in the frontmatter would be a mistake, but the beauty of doing this with an LLM is that you can pretty much emulate skills in any agent by telling it to start by reading the frontmatter of each skills file and use that to decide when to read the rest, so given that as a fallback, it's hardly imposing some massive burden to standardise it a bit.
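
As a sketch of that fallback: read only the frontmatter of each skill file and build a short index the agent sees first. This assumes plain `key: value` lines between `---` delimiters and `name`/`description` keys; it is deliberately not a YAML parser, and both function names are invented for illustration:

```python
def read_frontmatter(text):
    """Parse simple `key: value` frontmatter between leading `---` lines.

    Nested or multi-line values are not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter, so treat the file as having no frontmatter


def skill_index(skills):
    """Build the short listing an agent would read before opening any skill.

    `skills` maps a file path to that file's full text.
    """
    rows = []
    for path, text in sorted(skills.items()):
        meta = read_frontmatter(text)
        rows.append(f"{path}: {meta.get('name', '?')} - {meta.get('description', '')}")
    return "\n".join(rows)
```

The agent gets the index in its prompt and only opens the full file when a description matches the task.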


it's actually .agents/ :)


why plural?


Marvin Minsky's Society of Mind:

https://en.wikipedia.org/wiki/Society_of_Mind


How many do you think belong there? 1 or more than 1?


because more than one accesses it? :shrug:


There are 14 competing standards.


The problem is that the de facto standard is `.claude`, which is problematic for folks not using Claude.


Your skill then just becomes an .md file containing

>any time you want to search for a skill in `./codex`, search instead in `./claude`

and continue as you were.


I see it as similar to browser user-agents all claiming to be an ancient version of Mozilla or KHTML. We pick whatever works and then move on. It might not be "correct," but as long as our tools know what to do, who cares?


My repos are littered with agent-specific files containing “treat this other file as if it were this one.” We’re moving so fast on so many fronts, and it seems odd that this is the persistent problem. It doesn’t even help lock folks into one agent, so I’m not clear why the industry hasn’t yet standardized on one project-specific file name.


Now, there are 15 competing standards.


Soon...


Worse yet; opencode uses singular words by default:

    .opencode/skill


On the website[1] it says:

  .opencode/skills

[1]: https://opencode.ai/docs/skills/#place-files


They changed it. It was singular.


ln -s to the rescue!


That doesn't work very well if your developers are on Windows (and most are). Uneven Git support for symbolic links across platforms is going to end up causing more problems than it solves.


Win developers aren't using WSL?

It's why I wrapped my tiny skills repo [1] with a script that symlinks the skills into whichever skills folder you use, defaulting to Claude's, but it could be any other.

I treat my skills the same way I used to write tiny bash scripts and fish functions: to simplify my life by typing 2 words instead of 2 sentences. A tiny improvement that only makes sense for a programmer at heart.

[1] https://github.com/flurdy/agent-skills
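
The idea is roughly the following. This is a sketch of the approach, not the script from the repo; `link_skills` and its arguments are made up for illustration:

```python
import os
from pathlib import Path


def link_skills(source, target):
    """Symlink every skill folder under `source` into `target`.

    Returns the names that were newly linked.
    """
    target.mkdir(parents=True, exist_ok=True)
    linked = []
    for skill in sorted(p for p in source.iterdir() if p.is_dir()):
        dest = target / skill.name
        if dest.is_symlink() or dest.exists():
            continue  # never overwrite something that is already there
        os.symlink(skill.resolve(), dest, target_is_directory=True)
        linked.append(skill.name)
    return linked
```

Calling it with the repo's skills folder and, say, `Path.home() / ".claude" / "skills"` links everything once and is safe to re-run. On Windows, `os.symlink` may need Developer Mode or elevated privileges.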


The root cause should be fixed.


Why not hardlinks?


You can't hardlink a directory.


might be too early to standardize

standards are good but they slow development and experimentation


> It's in Java, but the lessons can be applied in every language.

I can only discourage anyone from applying Java patterns all over the place. One example in JavaScript: There was a functionality that required some parameters with default values. The plain solution would have been:

    function doStuff({ x = 9, y = 10 } = {}) {  ... }

Instead, they created a class with private properties and used the builder pattern to set them. Totally unnecessary.


In JavaScript, I love the `async` keyword as it's a good indicator that something goes over the wire.


One step closer to Skynet


What I would love:

- Everything locally stored in the repo: PRs, comments, issues, discussions, boards, ...

- CLI first

- Offline first (+ syncing)

- A website for hosting/presentation


Noted :) In another comment I linked to beads, which is a cool project to keep your issue tracker in your repo, but that's just a personal thing, no comment on what the company plans to do (or not) in this area.


I use command-line tooling much more than IDEs (e.g. VS Code), so the `gh` command-line tool (https://cli.github.com) for doing most of the usual hub-oriented workflow (PR authoring, viewing issues, status updates, etc) really helps a lot - I don't have to constantly <cmd>+<tab> to my browser, and point-click-point-click through web pages so much. It would be fantastic if ersc or any other jj-centered code-sharing hub had similar tooling early on.


I'm a big CLI for VCS person, so yeah, I use those tools too :)


So you want Fossil?


When I tried Fossil it had things weirdly separated.

I was expecting that when I make a commit, I would have the facility to specify which issues it addressed, and it would close them for me automatically. It seemed there was so much opportunity to "close the loop" when the issue tracker etc. are integrated in your VCS, but it wasn't taken.


except fossil decided to never allow changing history, vs jj which makes history rewriting so much easier


That's my favourite thing about fossil though. History is what it is, not simplified to look "clean" (i.e. hide what actually happened and when) and you get a lot fewer footguns to ruin everything by accidentally rebasing things to the wrong place without noticing.


jj describe -m "Good luck, Steve!"


Thanks!


I have huge respect for Mitchell, it's impressive what he achieved.

I agree with all the points of this article and would like to add one: Have a quick feedback loop. For me, it's really motivating to be able to make a change and quickly see the results. Many problems just vanish or become tangible to solve when you playfully modify your source code and observe the effect.


This perfectly aligns with my experience. Every large project I have worked on showed a clear correlation between the ease of setup and running and the number of problems on the project, like bugs and missed deadlines.


Totally agree. I work in LLM training software and I believe progress in the field is actually much slower than it should be because of the excruciatingly long feedback loops involved in development. The software stacks are deep and abstract and much of the testing involves full integration tests that take a long time to spin up.


Interesting. What aspects of the development workflow/cycle have the most room for improvement (i.e. is there ranking of the "height" of the "hanging fruit" throughout the process)? What sort of software tooling would help?


If you have the time, watch Bret Victor’s talk Inventing on Principle. The talk covers feedback loops. https://www.youtube.com/watch?v=PUv66718DII


YES that is one of the all-time most inspiring talks I've ever seen. DX is so important. I got a taste for this kind of thing when I first encountered LiveReload (circa 2012?) and radically upgraded my and my team's webdev workflows.


Would you say that testcases help here? I've been thinking about applying e2e tests on any bugs I find so I know they're fixed


E2E tests in a high ratio to other tests will cause problems. They’re slow and brittle and become a job all on their own. It’s possible that they might help at the start of debugging, but try to isolate the bugs to smaller units of code (or interactions between small pieces of code).


Hermetic e2e tests (i.e. ones that can run offline and fake APIs/databases) don't have that problem so much.

They also have the advantage that you can A) refactor pretty much everything underneath them without breaking the test, B) test realistically (an underrated quality) and C) write tests which more closely match requirements rather than implementation.
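
A minimal sketch of what I mean, using only the Python standard library. The endpoint, payload, and `greeting` function are invented for illustration; the point is that the app still speaks real HTTP, just to a fake running on localhost:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeUserApi(BaseHTTPRequestHandler):
    """Stand-in for a remote API (hypothetical endpoint and payload)."""

    def do_GET(self):
        body = json.dumps({"id": 1, "name": "Ada"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def greeting(api_base):
    """The 'application': fetch a user over real HTTP and render a greeting."""
    # An empty ProxyHandler bypasses any system proxy so the call stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{api_base}/users/1", timeout=5) as resp:
        user = json.load(resp)
    return f"Hello, {user['name']}!"


def run_hermetic_check():
    """Start the fake on a free local port, exercise the app, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), FakeUserApi)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return greeting(f"http://127.0.0.1:{server.server_port}")
    finally:
        server.shutdown()
        server.server_close()
```

Nothing in `greeting` knows it is talking to a fake, so everything under it can be refactored freely, and the check runs offline.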


> i.e. ones that can run offline and fake apis/databases

I can see a place for this, but these are no longer e2e tests. I guess that’s what “hermetic” means? If so it’s almost sinister to still call these e2e tests. They’re just frontend tests.

> A) refactor pretty much everything underneath them without breaking the test

This should always be true of any type of tests unless it’s behavior you want to keep from breaking.

> B) test realistically (an underrated quality)

Removing major integration points from a test is anything but realistic. You can do this, but don’t pretend you’re getting the same quality as a colloquial e2e test.

> C) write tests which more closely match requirements rather than implementation

If you’re ever testing implementation you’re doing it wrong. Tests should let you know when a requirement of your app breaks. This is why unit tests are often kinda harmful. They test contracts that might not exist.


> try to isolate the bugs to smaller units of code (or interactions between small pieces of code).

This is why unit tests before e2e tests.

It's higher risk to build on components without unit test coverage, even if the paltry smoke/e2e tests say it's fine per the customer's input examples.

Is it better to fuzz low-level components or high-level user-facing interfaces first?

IIUC in relation to Formal Methods, tests and test coverage are not sufficient but are advisable.

Competency Story: The customer and product owner can write BDD tests in order to validate the app against the requirements

Prompt: Write playwright tests for #token_reference, that run a named factored-out login sequence, and then test as human user would that: when you click on Home that it navigates to / (given browser MCP and recently the Gemini 2.5 Computer Operator model)


And I would add that e2e tests should be more about the business rules: making sure everything is there for a specific flow and not caring that much about the intricacy of things. As such, they should really be part of Ops, not Dev.

Quick feedback with unit tests can help. It can be a pain to decouple stuff so you can test them better, but it’s worth it IMO.


MitchellH also talks about this in some interviews he gave about Ghostty.


Couldn't agree more. Quick feedback is so important, it requires its own post.

When I want to try/fix something, if the setup itself takes hours, I lose heart and move on.

That's why I love Lisp (or anything with a decent REPL). Instant gratification.


seriously, especially for personal projects.

The second you lose motivation the whole thing poofs into non-existence, so making it enjoyable is almost the most important facet.


It's all good, man!


> there is very little point to any of this to anybody else. Don't expect some great useful guitar pedal experience.

Yeah... He said similar things about Linux.


This might be said in jest. But does everything have to be for world domination? Is the guy not allowed to have actual hobby projects? That go just where he fancies, including potentially nowhere at all really...


I think subsurface was similar.

