Hacker News | akoboldfrying's comments

The binomial distribution is well approximated by the Poisson distribution when n is high and p is low, which is exactly the case here. Despite the high variance in the sample mean, we can still make high-confidence statements about what range of incident rates are likely -- basically, dramatically higher rates are extremely unlikely. (Not sure, but I think it will turn out that confidence in statements about the true incident rate being lower than observed will be much lower.)

More data would certainly be better, but it's not as bad as you suggest -- the large number of miles driven till first incident does tell us something statistically meaningful about the incident rate per mile driven. If we view the data as a large sample of miles driven, each with some observed number of incidents, then what we have is "merely" an extremely skewed distribution. I can confidently say that, if you pick any sane family of distributions to model this, then after fitting just this "single" data point, the model will report that P(MTTF < one hundredth of the observed number of miles driven so far) is negligible. This would hold even if there were zero incidents so far.

We get a statistically meaningful result about an upper bound of the incident rate. We get no statistically meaningful lower bound.
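
To make the upper-bound side concrete: under a Poisson model, if you've driven m miles with zero incidents, the probability of that outcome at incident rate lambda is exp(-lambda*m), and the classic "rule of three" gives lambda <= 3/m as a 95% upper bound. A minimal sketch in Go (the mileage figure is made up):

    package main

    import (
        "fmt"
        "math"
    )

    // Probability of seeing zero incidents over m miles if the true
    // incident rate is lambda incidents per mile (Poisson model).
    func pZero(lambda, m float64) float64 {
        return math.Exp(-lambda * m)
    }

    func main() {
        const miles = 1e7 // hypothetical mileage, zero incidents observed

        // Largest rate still consistent with the data at the 95% level:
        // exp(-lambda*miles) >= 0.05  =>  lambda <= -ln(0.05)/miles ~= 3/miles.
        upper := -math.Log(0.05) / miles
        fmt.Printf("95%% upper bound on incident rate: %g per mile\n", upper)
        fmt.Printf("check: P(zero incidents at that rate) = %.2f\n", pZero(upper, miles))
    }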

No doubt the top influencer is doing better than the top plumber, but I'd say the median plumber is streets ahead of the median influencer.

The Kalman filter examples I've seen always involve estimating a very simple quantity, like the location of a single 3D point, from noisy sensors. It's clear how multiple estimates can be combined into a new estimate.

I'd guess that cameras on a self-driving car are trying to estimate something much more complex, something like 3D surfaces labeled with categories ("person", "traffic light", etc.). It's not obvious to me how estimates of such things from multiple sensors and predictions can be sensibly and efficiently combined to produce a better estimate. For example, what if there is a near red object in front of a distant red background, so that the camera estimates just a single object, but the lidar sees two?


https://www.bzarg.com/p/how-a-kalman-filter-works-in-picture...

A Kalman filter's basic concept is essentially this (a minimal one-dimensional sketch follows the steps):

1. Make a prediction about the next state change of some measurable n-dimensional quantity, and estimate the covariance matrix across those n dimensions, which essentially describes the probability that the i-th dimension is going to increase (or decrease) together with the j-th dimension, where i and j are between 0 and n (indices of the vector).

2. Gather sensor data (which can be noisy), and reconcile the predicted measurement with the measured one to get the best guess. The covariance matrix acts as a kind of weight for each of the elements.

3. Update the covariance matrix based on the measurements in the previous step.
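
Here is that predict/reconcile/update loop in one dimension, in Go (all the noise figures are made up; a real filter uses full matrices for the state and covariance):

    package main

    import "fmt"

    // One-dimensional Kalman filter state: an estimate and its variance.
    type kf struct {
        x float64 // current estimate (e.g. a position)
        p float64 // variance of the estimate (the 1x1 "covariance matrix")
    }

    // Step 1: predict the next state; uncertainty grows by process noise q.
    func (k *kf) predict(motion, q float64) {
        k.x += motion
        k.p += q
    }

    // Steps 2 and 3: reconcile a noisy measurement z (sensor variance r)
    // with the prediction, weighted by the relative uncertainties, then
    // shrink the estimate's variance accordingly.
    func (k *kf) update(z, r float64) {
        gain := k.p / (k.p + r) // Kalman gain: how much to trust the sensor
        k.x += gain * (z - k.x)
        k.p *= 1 - gain
    }

    func main() {
        f := kf{x: 0, p: 1}
        for _, z := range []float64{1.1, 0.9, 1.05} { // noisy readings near 1.0
            f.predict(0, 0.01) // stationary target, small process noise
            f.update(z, 0.5)
            fmt.Printf("estimate %.3f, variance %.3f\n", f.x, f.p)
        }
    }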

You can do this for any vector of numbers. For example, instead of tracking individual objects, you can have a grid where each element represents a physical location the car should not drive into, with a value representing the certainty of an object being there. Then when you combine sensor readings, you can still use your vision model, but that model would be enhanced by what the lidar detects, both in terms of seeing things the camera doesn't pick up and rejecting things that aren't there.
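
A toy sketch of that grid idea (hypothetical numbers; one standard trick is to store each cell as log-odds, so independent sensors can be fused by simple addition):

    package main

    import (
        "fmt"
        "math"
    )

    // Convert between probability and log-odds; fusing independent
    // sensor evidence is then just addition of log-odds.
    func logOdds(p float64) float64 { return math.Log(p / (1 - p)) }
    func prob(l float64) float64    { return 1 / (1 + math.Exp(-l)) }

    func main() {
        grid := make([]float64, 5) // tiny 1-D "grid", prior log-odds 0 (p = 0.5)

        camera := []float64{0.5, 0.9, 0.5, 0.5, 0.5} // camera sees cell 1
        lidar := []float64{0.5, 0.9, 0.5, 0.9, 0.5}  // lidar sees cells 1 and 3

        for i := range grid {
            grid[i] += logOdds(camera[i]) + logOdds(lidar[i])
        }
        for i, l := range grid {
            fmt.Printf("cell %d: P(obstacle) = %.2f\n", i, prob(l))
        }
    }

Cell 1 (seen by both sensors) ends up near certainty, cell 3 (lidar only) stays at 0.9, and everything else stays at the prior.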

And the concept is generic enough that you can set up a system to plug in any additional sensor with its own noise, and it all works out in the end. You can even extend the concept past Gaussian noise and linearity; there are a number of other filters that deal with that, broadly under the umbrella of sensor fusion.

The problem is that Karpathy is more of a computer scientist, so he is on his Software 2.0 train of having ML models do everything. I dunno if he is like that himself or Musk's "I'm smarter than everyone else that came before me" attitude rubbed off on him.

And of course when you think like that, it's going to be difficult to integrate lidar into the model. But the problem with that thinking is that a forward-inference LLM is not AI, and it will never ever be able to drive a car as well as a true "reasoning" AI with feedback loops.


You're a machine. You're literally a wet, analog device converting some forms of energy into other forms just like any other machine as you work, rest, type out HN comments, etc. There is nothing special about the carbon atoms in your body -- there's no metadata attached to them marking them out as belonging to a Living Person. Other living-person-machines treat "you" differently than other clusters of atoms only because evolution has taught us that doing so is a mutually beneficial social convention.

So, since you're just a machine, any text you generate should be uninteresting to me -- correct?

Alternatively, could it be that a sufficiently complex and intricate machine can be interesting to observe in its own right?


If humans are machines, they are still a subset of machines, and they (among other animals) are the only ones who can be demotivated, so it is still a mistake to assume an entirely different kind of machine would have those properties.

>Other living-person-machines treat "you" differently than other clusters of atoms only because evolution has taught us that doing so is a mutually beneficial social convention

Evolution doesn't "teach" anything. It's just an emergent property of the fact that life reproduces (and sometimes doesn't). If you're going to have this radically reductionist view of humanity, you can't also treat evolution as having any kind of agency.


"If humans are machines, they are still a subset of machines and they (among other animals) are the only ones who can be demotivated and so it is still a mistake to assume an entirely different kind of machine would have those properties."

Yet.


Sure, but the entire context of the discussion is surprise that they don't.

Agreed -- there is no guarantee of what will happen in the future. I'm not for or against the outcome, but certainly curious to see what it is.

Humans and all other organisms are "literally" not machines or devices by the simple fact that those terms refer to works made for a purpose.

Even as an analogy "wet machine" fails again and again to adequately describe anything interesting or useful in life sciences.


Wrong level of abstraction. And not the definition of machine.

I might feel awe or amazement at what human-made machines can do -- the reason I got into programming. But I don't attribute human qualities to computers or software, a category error. No computer ever looked at me as interesting or tenacious.


Berkson's Paradox seems to rely on the selection criteria being a combination of the two traits in question -- in the example I keep reading about, only "famous" actors are selected, and actors can be famous if they are either highly talented or highly attractive. But in TFA, surely the "high performance" selection filter applies only to the adult performance level?

To put it another way: If selection was restricted to people who performed highly in either their youth or in adulthood (or both), Berkson's Paradox explains the result. If selection was restricted to people who performed highly in their youth, or if selection was restricted to people who performed highly in adulthood, Berkson's doesn't explain it.


>Berkson's Paradox seems to rely on the selection criteria being a combination of the two traits in question

100% correct. For traits x and y, selecting for datapoints in the region x + y > z will always yield a spurious negative correlation for sufficiently uncorrelated data, since the boundary of the inequality x + y > z is a negatively sloping line.
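
This is easy to check numerically. A sketch in Go with made-up parameters (two independent standard-normal traits, selection on x + y > 1):

    package main

    import (
        "fmt"
        "math"
        "math/rand"
    )

    // Pearson correlation of two equal-length samples.
    func corr(xs, ys []float64) float64 {
        n := float64(len(xs))
        var sx, sy, sxx, syy, sxy float64
        for i := range xs {
            sx += xs[i]
            sy += ys[i]
            sxx += xs[i] * xs[i]
            syy += ys[i] * ys[i]
            sxy += xs[i] * ys[i]
        }
        cov := sxy/n - (sx/n)*(sy/n)
        vx := sxx/n - (sx/n)*(sx/n)
        vy := syy/n - (sy/n)*(sy/n)
        return cov / math.Sqrt(vx*vy)
    }

    func main() {
        var xs, ys, selX, selY []float64
        for i := 0; i < 100000; i++ {
            x, y := rand.NormFloat64(), rand.NormFloat64() // independent traits
            xs, ys = append(xs, x), append(ys, y)
            if x+y > 1 { // Berkson-style selection: keep only the "famous"
                selX, selY = append(selX, x), append(selY, y)
            }
        }
        fmt.Printf("correlation, full population: %+.3f\n", corr(xs, ys))
        fmt.Printf("correlation, selected subset: %+.3f\n", corr(selX, selY))
    }

The full population comes out near zero; the selected subset comes out clearly negative.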

>But in TFA, surely the "high performance" selection filter applies only to the adult performance level?

Doesn't seem that way. Reading the full paper [0], they say:

   In sports, several predictor effects on early junior performance and on later senior world-class performance are not only different but are opposite. [...] The different pattern of predictor effects observed among adult world-class athletes is also evident in other domains. For example, Nobel laureates in the sciences had slower progress in terms of publication impact during their early years than Nobel nominees. Similarly, senior world top-3 chess players had slower performance progress during their early years than 4th-to 10th-ranked senior players, and fewer world top-3 than 4th- to 10th-ranked senior chess players earned the grandmaster title of the International Chess Federation (FIDE) by age 14.
It really does seem they took the set of people who were either elite as a kid, elite as an adult, or both, and concluded that this biased selection constitutes a negative correlation.

[0] https://www.kechuang.org/reader/pdf/web/viewer?file=%2Fr%2F3...


Thanks for finding the relevant quote, but I interpret it the opposite way. Specifically:

> The different pattern of predictor effects observed among adult world-class athletes

This seems to be selecting on just one trait (high adult performance). Where is the second trait (high childhood performance) mentioned?


Well, it's to be expected that heuristics are needed, since the join ordering subproblem is already NP-hard -- in fact, a special case of it, restricted to left-deep trees and with selectivity a function of only the two immediate child nodes in the join, is already NP-hard, since this amounts to the problem of finding a lowest-cost path in an edge-weighted graph that visits each vertex exactly once, which is basically the famous Traveling Salesperson Problem. (Vertices become tables, edge weights become selectivity scores; the only difficulty in the reduction is dealing with the fact that the TSP wants to include the cost of the edge "back to the beginning", while our problem doesn't -- but this can be dealt with by creating another copy of the vertices and a special start vertex; ask me for the details if you're interested.)

Self-nitpick (too late to edit my post above): I used the phrase "special case" wrongly here -- restricting the valid inputs to a problem creates a strictly-no-harder special case, but constraining the valid outputs (as I do here regarding left-deep trees) can sometimes actually make the problem harder -- e.g., integer linear programming is harder than plain "fractional" linear programming.

So it's possible that the full optimisation problem over all join tree shapes is "easy", even though an output-constrained version of it is NP-hard... But I think that's unlikely. Having an NP-hard constrained variant like this strongly suggests that the original problem is itself NP-hard, and I suspect this could be shown by some other reduction.

> with selectivity a function of only the two immediate child nodes in the join

This should be "with selectivity a function of the rightmost leaves of the two child subtrees", so that it still makes sense for general ("bushy") join trees. (I know, I'm talking to myself... But I needed to write this down to convince myself that the original unconstrained problem wasn't just the (very easy) minimum spanning tree problem in disguise.)


Also AMT-130 for Huntington's disease.

Nit: s/Reddit/Redis/

Though it is fun to imagine using Reddit as a key-value store :)


That is hilarious... and to prove the point of this whole comment thread, I created reddit-kv for us. It seems to work against a mock; I did not test it against Reddit itself, as I think it violates the ToS. My prompts are in the repo.

https://github.com/ConAcademy/reddit-kv/blob/main/README.md


Typo-Driven Development!

Aaarg I was typing quickly and mistyped. :face-palm:

Thanks for the correction.


Why would statically linking a library reduce the number of vulnerabilities in it?

AFAICT, static linking just means the set of vulnerabilities you get landed with won't change over time.


> Why would statically linking a library reduce the number of vulnerabilities in it?

I use pure go implementations only, and that implies that there's no statically linked C ABI in my binaries. That's what disabling CGO means.
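
For what it's worth, that's just CGO_ENABLED=0 at build time, and on Go 1.18+ you can confirm the setting was baked into a binary via its embedded build info. A small sketch:

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    // Prints the CGO_ENABLED setting recorded in this binary at build
    // time. Build with:  CGO_ENABLED=0 go build .
    func main() {
        info, ok := debug.ReadBuildInfo()
        if !ok {
            fmt.Println("no embedded build info")
            return
        }
        for _, s := range info.Settings {
            if s.Key == "CGO_ENABLED" {
                fmt.Println("CGO_ENABLED =", s.Value)
            }
        }
    }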


What I mean is: There will be bugs* in that pure Go implementation, and static linking means you're baking them in forever. Why is this preferable to dynamic linking?

* It's likely that C implementations will have bugs related to dynamic memory allocation that are absent from the Go implementation, because Go is GCed while C is not. But it would be very surprising if there were no bugs at all in the Go implementation.


Prioritizing memory corruption vulnerabilities is the point of going to extremes to ensure there's no compiled C in their binaries.

It would be nice if there were something similar to the eBPF verifier, but for static C, so that loop mistakes, out-of-bounds mistakes and avoidable satisfiability problems are caught right at the compile step.

The reason I avoid C libraries at all costs is that the ecosystem doesn't prioritize maintenance or other forms of code quality in its distribution. If you have to go to great lengths such as header-only libraries, then what's the point of using C99/C++ at all? Back when conan came out I had hopes for it, but meanwhile I have given up on the ecosystem.

Don't get me wrong, Rust is great for its use cases, too. I just chose the mutex hell as a personal preference over the wrapping hell.


I believe this is fil-c[1].

[1] https://fil-c.org/


What do you consider to be a loop mistake?

Anything that amounts to "too clever" state management in an iterative loop.

Examples that come to mind: queues that are manipulated inside a loop, slice operations that forget to decrement a length variable set in the loop's init statement, char arrays that overflow because the loop doesn't check the length at the correct position in the code, and conditions that are re-set inside the loop, like a min/max boundary that is set by an outer loop.

This kind of stuff. I guess you could argue these are memory safety issues. I've seen such crappy loop statements that the devs didn't bother to test them, because they still believed they were "smart code", even after I sent the devs a PoC that exploited their naive parser assumptions.
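
A Go illustration of the first example (hypothetical, but representative -- a queue mutated while it's being ranged over):

    package main

    import "fmt"

    func main() {
        // "Too clever": draining a queue with range while appending to it.
        // The range expression is evaluated once, so items pushed inside
        // the loop body are silently skipped.
        queue := []string{"a", "b"}
        for _, item := range queue {
            if item == "a" {
                queue = append(queue, "c") // never visited by this loop
            }
            fmt.Println(item)
        }

        // The "dumb", readable version makes the queue's state explicit.
        queue = []string{"a", "b"}
        for len(queue) > 0 {
            item := queue[0]
            queue = queue[1:]
            if item == "a" {
                queue = append(queue, "c") // visited, as intended
            }
            fmt.Println(item)
        }
    }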

In Go I try to write clear, concise and "dumb" code so that a future me can still read it after years of not touching it. That's what I understand by Go's maintainability idiom, I suppose.


You can have memory corruption in pure Go code, too.

And in Rust (yes, safe Rust can have memory safety vulnerabilities). Who cares? They basically don't happen in practice.

Uh huh. That's where all the Go memory corruption vulnerabilities come from!

Nobody claimed otherwise. You're interacting with a kernel that invented its own programming language based on macros, after all, instead of relying on a compiler for that.

What could go wrong with this, right?

/s


About a year ago I had some code I had been working on for about a year subject to a pretty heavy-duty security review by a reputable review company. When they asked what language I implemented it in and I told them "Go", they joked that half their job was done right there.

While Go isn't perfect and you can certainly write some logic bugs that sufficiently clever use of a more strongly-typed language might let you avoid (though don't underestimate what sufficiently clever use of what Go already has can do for you either when wielded with skill), it has a number of characteristics that keep it somewhat safer than a lot of other languages.

First, it's memory safe in general, which obviously out of the gate helps a lot. You can argue about some super, super fringe cases with unprotected concurrent access to maps, but you're still definitely talking about something on the order of .1% to .01% of the surface area of C.

Next, many of the things that people complain about in Go on Hacker News actually contribute to general safety in the code. One of the biggest ones is that it lacks any ability to take a string and simply convert it to a type, which has been the source of catastrophic vulnerabilities in Ruby [1] and Java (Log4Shell), among others. While I use this general technique quite frequently, you have to build your own mechanism for it (not a big deal, we're talking ~50 lines of code or so tops) and that mechanism won't be able to use any class (using general terminology, Go doesn't have "classes" but user-defined types fill in here) that wasn't explicitly registered, which sharply contains the blast radius of any exploit. Plus a lot of the exploits come from excessively clever encoding of the class names; generally when I simply name them and simply do a single lookup in a single map there isn't a lot of exploit wiggle room.
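
For a sense of what that looks like, here is a hypothetical cut-down sketch of such a registry (all names made up; the point is the single map lookup over explicitly registered types):

    package main

    import (
        "encoding/json"
        "fmt"
    )

    type Message interface{ Kind() string }

    type Ping struct{ Seq int }

    func (Ping) Kind() string { return "ping" }

    // Only explicitly registered kinds can ever be instantiated from a
    // wire-format string, which sharply bounds the blast radius.
    var registry = map[string]func() Message{
        "ping": func() Message { return &Ping{} },
    }

    func decode(kind string, payload []byte) (Message, error) {
        mk, ok := registry[kind] // one lookup; no clever name decoding
        if !ok {
            return nil, fmt.Errorf("unregistered message kind %q", kind)
        }
        m := mk()
        return m, json.Unmarshal(payload, m)
    }

    func main() {
        m, err := decode("ping", []byte(`{"Seq":1}`))
        fmt.Println(m, err)
        _, err = decode("../../EvilClass", nil) // rejected outright
        fmt.Println(err)
    }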

In general though it lacks a lot of the features that get people in trouble that aren't related to memory unsafety. Dynamic languages as a class start out behind the eight-ball on this front because all that dynamicness makes it difficult to tell exactly what some code might do with some input; goodness help you if there's a path to the local equivalent of "eval".

Go isn't entirely unique in this. Rust largely shares the same characteristics, there's some others that may qualify. But some other languages you might expect to don't; for instance, at least until recently Java had a serious problem with being able to get references to arbitrary classes via strings, leading to Log4Shell, even though Java is a static language. (I believe they've fixed that since then but a lot of code still has to have the flag to flip that feature back on because they depend on it in some fundamental libraries quite often.) Go turns out to be a relatively safe security language to write in compared to the landscape of general programming languages in common use. I add "in common use" and highlight it here because I don't think it's anywhere near optimal in the general landscape of languages that exist, nor the landscape of languages that ought to exist and don't yet. For instance in the latter case I'd expect capabilities to be built in to the lowest layer of a language, which would further do great, great damage to the ability to exploit such code. However no such language is in common use at this time. Pragmatically when I need to write something very secure today, Go is surprisingly high on my short list; theoretically I'm quite dissatisfied.

[1]: https://blog.trailofbits.com/2025/08/20/marshal-madness-a-br...


I love golang a lot, and in the context of QuickJS I feel it would be interesting to see what a golang port would look like security-wise, and how it would compare to rust.

Of course golang and rust are an apples-to-oranges comparison, but still: if someone experienced in golang were to port QuickJS to golang, and someone did the same for rust, then aside from some performance cost which can arise from golang's GC, what would the security analysis of both look like?

Also off-topic, but I love how golang has a library for almost literally everything -- yet its language-development side, i.e. runtimes for interpreted langs, JITs, transpilation efforts etc., does feel thinner than rust's.

For Python, for example, there are libraries for calling rust code from Python. I wish there was something like this for golang. I did find such a project (https://github.com/go-python/gopy), but it still feels a little less targeted than rust-within-Python, which has polars and other more mature libraries.


If you want to see what a JS interpreter in Go would look like, you can look at https://pkg.go.dev/github.com/robertkrimen/otto and https://github.com/dop251/goja . Of course they aren't "ports", but I feel like having two fairly complete interpreters is probably enough to prove the point. Arguably even a "port" would require enough changes that it wouldn't really be a "port" anyhow.

(The quickjs package in the sibling comment is the original C compiled in. It will probably have all the security quirks of the original as a result.)


