Hacker News | tybug's comments

I didn't expect to see Hegel when opening up HN today! Feel free to ask any questions about it. We released hegel-go earlier this week, and plan to release hegel-cpp sometime next week, so look forward to that :)

How exciting! I wrote my own PBT lib for Zig (https://github.com/AntoineBalaine/zlowcheck) and it made me sad that I couldn't get it anywhere near as close to Hypothesis. Looking forward to seeing this grow! Any hope for FFI through the C ABI?

We plan to rewrite Hypothesis in Rust and expose an FFI through that! This is a medium-term plan (months, not weeks or years), but we're acutely aware that relying on a Python component is not a long-term solution.

Is the protocol documented so that other people can build language front-ends?

Yes! I just wrote up documentation for the protocol earlier this week: https://hegel.dev/reference/protocol.

In practice, we hope to provide more guidance than this to people who want to write their own language frontend. The protocol reference doesn't cover the practicalities of [hegel-core](https://github.com/hegeldev/hegel-core) and how to invoke it, for example.

We intend to write a "How to write your own Hegel library" how-to guide. You can subscribe to this issue to get notified when we write that: https://github.com/hegeldev/website/issues/3.

If you're eager, pointing your favorite LLM at https://hegel.dev/reference/protocol + https://github.com/hegeldev/hegel-rust and asking it to write you one for your language of choice should be enough to get you started!


To put it on the record: my position is that current models can't get us there, and neither can the next iteration of models, but in two model iterations this will be worth doing. There are a lot of fiddly details in Hypothesis that are critical to get right. You can get a plausible 80% port with agents today, but you'll find they've structured it in a way that makes it impossible to get to 100%.


Yep, `#[derive(DefaultGenerator)]` and `generators::default<T>()` are the right tools here.

This is one of the areas we've dogfooded the least, so we'd definitely be happy to get feedback on any sharp corners here!

I think `from_type` is one of Hypothesis's most powerful and ergonomic strategies, and that while we probably can't get quite to that level in Rust, we can still get something that's pretty great.


What do you think we're currently missing that Python's `from_type` has? I actually think the auto-deriving stuff we currently have in Rust is as good or better than from_type (e.g. it gets you the builder methods, has support for enums), but I've never been a heavy from_type user.


`from_type` just supports a bunch more things than Rust ever can due to the flexibility of Python's type system. `from_type(object)` is amazing, for example, and not something we can write in Rust.
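For a concrete sense of what `from_type` does: given nothing but type annotations, it builds a whole generator for you (the `User` class here is just an illustration, not anything from the Hegel codebase):

```python
from dataclasses import dataclass

from hypothesis import given, strategies as st


@dataclass
class User:
    name: str
    age: int


# from_type inspects the annotations and derives a strategy automatically,
# with no generator code written by hand.
@given(st.from_type(User))
def test_user_fields_have_expected_types(user):
    assert isinstance(user.name, str)
    assert isinstance(user.age, int)


test_user_fields_have_expected_types()
```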


Yeah, that's true. I was going to say that it's maybe not fair to count things that just don't even make sense in Rust, but I guess the logical analogue is something like `Box<dyn MyTrait>` which it would make sense to have a default generator for but also we're totally not going to support that.


Thank you! I have some particularly annoying proptest-based tests that I'll try porting over to Hegel soon. (Thanks for writing the Claude skill to do this.)


Please let us know how it goes!

As Liam says, the derive generator is not very well dogfooded at present. The Claude skill is a bit better, but we've only been through a few iterations of using it and getting Claude to improve it, and porting from proptest is one of the less well-tested areas (because we don't use proptest much ourselves).

I expect all of this works, but I'd like to know ways that it works less well than it could. Or, you know, to bask in the glow of praise of it working perfectly if that turns out to be an option.


Yep! Here's an experience report: https://github.com/hegeldev/hegel-rust/issues/148


I actually think there's another angle here where PBT helps, which wasn't explored in the blog post.

That angle is legibility. How do you know your AI-written slop software is doing the right thing? One would normally read all the code. Bad news: that's not much less labor-intensive than not using AI at all.

But if one has comprehensive property-based tests, one can instead read only those tests to be convinced the software is doing the right thing.

By analogy: one doesn't need to see the machine-checked proof to know the claim is correct. One only needs to check the theorem statement is saying the right thing.
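To make the analogy concrete, here's a sketch (the run-length encoding functions stand in for arbitrary AI-written code): the one-line property at the bottom is the "theorem statement" a reviewer reads, without needing to read the implementation above it.

```python
from itertools import groupby

from hypothesis import given, strategies as st


# Stand-in for the implementation under review: run-length encoding.
def rle_encode(s):
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)


# The "theorem statement": decoding an encoding returns the original string.
# Checking this one property is much cheaper than reading the code above.
@given(st.text())
def test_roundtrip(s):
    assert rle_decode(rle_encode(s)) == s


test_roundtrip()
```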


Right, I said that property-based tests are easier to read, and that's good. But people still have to actually read them. Also, because they still work best at the "unit" level, the people reading them need to know how all the units are connected in order to understand them (e.g. a single person cannot review even the PBTs required for 10KLOC per day [1]).

My point isn't so much about PBT, but about how we don't yet know just how much agents help write real software (and how to get the most help from them).

[1]: I'm only using that number because Garry Tan, CEO of YC, claimed to generate 10K lines of text per day that he believes to be working code, and developers working with AI agents know it can't all be.


As possibly the one community on earth where it's actually better to post the code than the blog post: TL;DR this is a universal property-based testing protocol (https://github.com/hegeldev/hegel-core) and family of libraries (https://github.com/hegeldev/hegel-rust, more to come later).

I've talked with lots of people in the PBT world who have always seen something like this as the end goal of the PBT ecosystem. It seemed like a thing that would happen eventually; someone just had to do it. I'm super excited to actually be doing it and bringing great PBT to any and every language.

It doesn't hurt that this is coming right as great PBT in every language is suddenly a lot more important thanks to AI code!


(Hypothesis maintainer here) If you have recommendations for a better example on the front page, I'd love to hear them! (I mean this entirely genuinely and non-sarcastically; I agree sorting can give misleading ideas, but it is also concise and well understood by every reader).


The more I think about it, the more I think calling it a bad example may be unfair. It can be extremely misleading for someone unfamiliar with the concept who comes at it with a particular viewpoint, but with more time to think, I'm less sure that an example better on that front wouldn't be worse in other ways.

I like sorting as an example, and I like that using the built-in is concise, and reimplementing the behavior of an existing function where using the existing function as an oracle is a reasonable thing to do for a test isn't all that uncommon.

I feel like something with a couple of properties described in comments with assertions testing those properties (but where the functionality and properties are familiar enough that it would make a clear connection) would be a bit better, in theory, but I don't have a great particular example to use, and anything done that way will be, at best, somewhat less concise.
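A sketch of what that might look like, sticking with sorting so the properties stay familiar (at the cost of some conciseness):

```python
from collections import Counter

from hypothesis import given, strategies as st


@given(st.lists(st.integers()))
def test_sort(xs):
    result = sorted(xs)
    # Property 1: the output is in non-decreasing order.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Property 2: the output is a permutation of the input.
    assert Counter(result) == Counter(xs)


test_sort()
```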


Appreciate the thoughts <3. I do think there might be stronger examples we could choose. Possibly JSON encode/decode.
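One way that example might look, using the stdlib `json` module and a recursive strategy for JSON-shaped values (just a sketch, not a proposed front-page snippet):

```python
import json

from hypothesis import given, strategies as st

# Arbitrary JSON-shaped values: scalars at the leaves, lists and
# string-keyed dicts as the recursive cases.
json_values = st.recursive(
    st.none() | st.booleans() | st.integers()
    | st.floats(allow_nan=False) | st.text(),
    lambda children: st.lists(children)
    | st.dictionaries(st.text(), children),
    max_leaves=20,
)


# Property: encoding then decoding any JSON value round-trips.
@given(json_values)
def test_json_roundtrip(value):
    assert json.loads(json.dumps(value)) == value


test_json_roundtrip()
```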


(Hypothesis maintainer here)

Yup, a standard test suite just doesn't run for long enough for coverage guidance to be worthwhile by default.

That said, coverage-guided fuzzing can be a really valuable and effective form of testing (see e.g. https://hypofuzz.com/).


Thank you, Hypothesis is brilliant!


Thanks for the good work!


Nice! "testing your test code" is particularly important when dealing with PBT distributions, especially when your generator gets more complicated.

Tyche [0] is another cool tool for addressing the same problem, visualizing the PBT distribution but not making assertions about it.

[0] https://github.com/tyche-pbt/tyche-extension


The Hypothesis explain phase [1][2] does this!

  fails_on_empty_third_arg(
      a = "",  # or any other generated value
      b = "",  # or any other generated value
      c = "",  
      d = "",  # or any other generated value
  )
[1] https://hypothesis.readthedocs.io/en/latest/reference/api.ht...

[2] https://github.com/HypothesisWorks/hypothesis/pull/3555


That kind of behavior can happen at the threshold of Hypothesis' internal limit on entropy - though if you're not hitting HealthCheck.data_too_large then this seems unlikely.

Let me know if you have a reproducer, I'd be curious to take a look.

