I think OP is referring to the "unprivileged user namespaces" [1] feature of Linux, which has caused numerous security incidents in the past. AFAIK, this is mainly because with this feature enabled, unprivileged users can create environments/namespaces which allow them to exploit kernel bugs much more easily. Most of them revolve around broken permission checks (read: root inside the container but not outside, yet feature X wrongly checks the permissions _inside_). [2] has a nice list of CVEs caused by unprivileged user namespaces.
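To make the "root inside, but not outside" situation concrete, here is a minimal sketch (mine, not taken from [1] or [2]) of what any unprivileged process can do when the feature is enabled, in Rust using the libc crate:

    // Sketch: an unprivileged process creates a user namespace and maps
    // itself to UID 0 inside it. Assumes the `libc` crate.
    use std::fs;

    fn main() {
        let euid = unsafe { libc::geteuid() };

        // Create a new user namespace; needs no privileges when
        // unprivileged user namespaces are enabled.
        if unsafe { libc::unshare(libc::CLONE_NEWUSER) } != 0 {
            panic!("unshare failed: {}", std::io::Error::last_os_error());
        }

        // Map our outside UID to root *inside* the namespace. Writing a
        // single line that maps your own EUID is allowed without privileges.
        fs::write("/proc/self/uid_map", format!("0 {} 1", euid)).unwrap();

        // We are now "root" inside the namespace -- exactly the state that
        // permission checks get wrong when they look *inside* instead of
        // *outside*.
        assert_eq!(unsafe { libc::geteuid() }, 0);
    }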
Given that rootful Docker, for example, is also prone to security issues, it's ultimately an attacker-model / pick-your-poison situation, though.
My experience with C++ isn't that extensive, but what is the use case of a garbage collector in this language? I always had the impression that, with well-defined object lifetimes, you wouldn't really create garbage to begin with, but I guess there are some use cases I don't yet know about?
It's pretty useful. Chrome, for example, uses one extensively (called Oilpan), and so does Unreal Engine. GC in C++ is much more widely used than people realize.
The problem is that big programs in the real world often don't have well-defined lifetimes for everything. The idea that everything in your app can have its lifetime worked out in advance isn't true when you're modelling the world (a world), or when lifetimes are controlled by other people (e.g. website authors).
Generally what you see is that these apps start out trying to do manual memory management, decide it's too difficult to do reliably at scale, and switch to a conservatively GC'd heap with a custom collector or Boehm.
Note that Rust doesn't fix this problem. Rust just encodes the assumption of pre-known lifetimes much more strongly in the language. If you're not in such a domain then you have to fall back to refcounting, which is (a) slow and (b) easy to get wrong such that you leak anyway. Early versions of Rust used refcounting for everything and iirc they found anywhere between 10-15% of the resulting binaries was refcounting instructions!
> Note that Rust doesn't fix this problem. Rust just encodes the assumption of pre-known lifetimes much more strongly in the language. If you're not in such a domain then you have to fall back to refcounting, which is (a) slow and (b) easy to get wrong such that you leak anyway. Early versions of Rust used refcounting for everything and iirc they found anywhere between 10-15% of the resulting binaries was refcounting instructions!
Well, modern idiomatic Rust only uses Arc/Rc on the few objects where it's needed, so the overhead of reference count adjustment is so tiny as to never show up. You typically only see reference count traffic be expensive when either (a) everything is reference counted, as in ancient Rust; or (b) on super-inefficient implementations of reference counting, as in COM where AddRef() and Release() are virtual calls.
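As a minimal sketch of that idiom (the names are mine, purely illustrative): ownership covers almost everything, and Rc appears only on the one genuinely shared object.

    use std::rc::Rc;

    struct Texture { bytes: Vec<u8> }

    struct Sprite {
        position: (f32, f32), // plainly owned: no counting at all
        texture: Rc<Texture>, // the one genuinely shared resource
    }

    fn main() {
        let tex = Rc::new(Texture { bytes: vec![0; 1024] });
        // Only these clones touch a reference count; everything else
        // moves or borrows, so count-adjustment traffic stays negligible.
        let sprites: Vec<Sprite> = (0..3)
            .map(|i| Sprite { position: (i as f32, 0.0), texture: Rc::clone(&tex) })
            .collect();
        println!("{} sprites share one texture", sprites.len());
    }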
Right, but that's what I mean by encoding the assumption of known lifetimes in the language. The design is intended for cases where most lifetimes are known, and only a few need ref counting, and you can reason about lifetimes and relationships well enough in advance to avoid cycles. At least, that's my understanding (my Rust-fu is shamefully poor).
> Early versions of Rust used refcounting for everything and iirc they found anywhere between 10-15% of the resulting binaries was refcounting instructions!
Do you happen to have a citation for this? I don’t remember ever hearing about it, but it’s possible this was before my time, as I started in the smart pointers era.
The Rust compiler does this. Even so, 19% of the binary size in rustc is adjusting reference counts.
I am not exaggerating this. One-fifth of the code in the binary is sitting there wasted adjusting reference counts. This is much of the reason we're moving to tracing garbage collection.
It's interesting how many strategies Rust tried before settling on linear types.
> It's interesting how many strategies Rust tried before settling on linear types.
Rust doesn’t actually have linear types. I’m not sure what Rust’s types are called (affine?), but linear types are the “must be consumed” (can’t leak) types, and Rust doesn’t have any support for this.
Rust’s guarantee is that you MUST NOT use an object after dropping it. Linear types would add the additional requirement that you MUST drop the object.
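A tiny illustration of the distinction (my example, not from any Rust docs):

    // Affine: a value may be used at most once, but need not be consumed.
    fn main() {
        let token = String::from("resource");

        let moved = token;      // ownership moves; `token` is now unusable
        // println!("{token}"); // compile error: the MUST NOT side

        // ...but safe Rust never *forces* consumption: leaking is fine.
        std::mem::forget(moved); // linear types would reject this (MUST drop)
    }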
Due to the nature of web engine workloads, migrating objects to being GC'd isn't performance-negative (as most people would expect). With care, it can often end up performance-positive.
There are a few tricks that Oilpan can apply. Concurrent tracing helps a lot (e.g. instead of incrementing/decrementing refs, you can trace on a different thread); in addition, when destructing objects, the destructors typically become trivial, meaning the object can just be dropped from memory. Both of these free up main-thread time. (The tradeoff with concurrent tracing is that you need atomic barriers when assigning pointers, which needs care.)
This is on top of the safety improvements you gain from being GC'd vs. smart pointers, etc.
One major tradeoff is that UAF bugs become more difficult to fix, as you are just accessing objects which "should" be dead.
> Are you referring to access through a raw pointer after ownership has been dropped and then garbage collection is non deterministic?
No - basically, objects sometimes have some notion of when they are "destroyed", e.g. an Element detached from the DOM tree[1]. Other parts of the codebase might have references to these objects, and previously, accessing them after they were destroyed would be a UAF. Now it's just a bug. This is good! It's not a security bug anymore! However, it's much harder to determine what is happening, as it isn't a hard crash.
Not sure early versions of Rust are the best example of refcounting overhead. There are a bunch of tricks you can use to decrease it, and it usually doesn't make sense to invest too much time into that sort of thing while there is so much flux in the language.
Yeah I was thinking the same thing. "10 years ago the Rust compiler couldn't produce a binary without significant size coming from reference counts after spending minimal effort to try and optimize it" doesn't seem like an especially damning indictment of the overall strategy. Rust is a language which is sensitive to binary size, so they probably just saw a lower-effort, higher-reward way to get that size back and made the decision to abandon reference counts instead of sinking time into optimizing them.
It was probably right for that language at that time, but I don't see it as being a generalizable decision.
Swift and ObjC have plenty of optimizations for reference counting that go beyond "Elide unless there's an ownership change".
It's worth noting that percent of instructions is a bad metric, since modern CPUs have lots of spare compute; adding simple integer instructions that aren't on the critical path will often not affect the wall time at all.
The problem is that once threading gets involved they have to be interlocked adjustments and that's very slow due to all the cache coherency traffic. Refcounts are also branches that may or may not be well predicted. And you're filling up icache, which is valuable.
You could have a hybrid refcount where you use plain integers when adjusting the ref count on the current thread, and atomics to adjust the global count when the local one hits 0 (hybrid-rc is the crate you can use, though something Swift-like, where the compiler does ARC for specific values when you opt in, may not be the worst idea). Also, when the type isn't Send you don't need atomic refcounts at all, although the interaction with unsafe does get a little more complex. (Rough sketch below.)
But yeah, at this point Rust's niche is as a competitor to C/C++, and in that world implicit refcounting doesn't have a lot of traction; people favor explicit GC and "manual" resource management (RAII mechanisms like Drop and destructors are OK).
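For what it's worth, a rough sketch of the hybrid idea (my own simplification, not the actual hybrid-rc API): within a thread, clones are plain Rc increments; only handing a value to another thread touches the atomic count, once per thread.

    use std::rc::Rc;
    use std::sync::Arc;

    // Thread-local handle: cloning bumps a non-atomic count.
    type LocalRc<T> = Rc<Arc<T>>;

    // Crossing threads: pay a single atomic increment.
    fn for_other_thread<T>(h: &LocalRc<T>) -> Arc<T> {
        Arc::clone(h) // the only atomic operation
    }

    fn main() {
        let local: LocalRc<String> = Rc::new(Arc::new("shared".to_owned()));

        // Hot path: non-atomic clones, no cache-coherency traffic.
        let a = Rc::clone(&local);
        let b = Rc::clone(&local);

        // Cold path: one atomic bump to cross a thread boundary...
        let handle = for_other_thread(&local);
        std::thread::spawn(move || {
            // ...and non-atomic again on the other side.
            let rewrapped: LocalRc<String> = Rc::new(handle);
            println!("worker sees: {}", rewrapped);
        })
        .join()
        .unwrap();

        println!("{} {} {}", a, b, local);
    }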
> what is the use-case of a garbage collector in this language?
Same as other languages.
> I always had the impression that with the well-defined lifetimes of objects, you wouldn't really create garbage to begin with
There's no well-defined lifetime of objects when it comes to dynamic allocation. For example, if you allocate something with the new keyword, there are no language guarantees that it won't leak.
I'm using C++ to build jank, a native Clojure dialect on LLVM with C++ interop. All of Clojure's runtime objects are dynamically allocated, and the churn of reference counting is far too slow compared to garbage collection. I had originally started with an intrusive_ptr and an atomic count; the Boehm GC was about 2x faster for that benchmark (and at least as much for every later benchmark).
Even outside of writing languages on top of C++, if you're using something like immer for persistent immutable data structures in C++ (as jank does), it has memory policies for reference counting or garbage collection. This is because immutable data generally results in more garbage, even when transients are used for the most pathological cases. That garbage is the trade-off for knowing your values will never be mutated in place. The huge win is complete thread safety for reading those values, as well as complete trust in reproducibility/purity, and trivial memoization.
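To give a feel for the trade-off, here is a Rust analogue using the im crate (my choice for the sketch; it is not what jank uses): persistent structures share state, so old values stay valid for readers while edits produce garbage instead of mutating in place.

    use im::Vector;

    fn main() {
        let v1: Vector<i32> = (0..5).collect();
        let mut v2 = v1.clone(); // O(1): shares structure with v1
        v2.push_back(99);        // copies only the touched path; the rest is shared

        // v1 is untouched: readers holding it never observe mutation, which
        // is the reproducibility/memoization win. The superseded nodes are
        // the extra garbage a GC (or refcounting policy) has to reclaim.
        assert_eq!(v1.len(), 5);
        assert_eq!(v2.len(), 6);
    }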
MIT-licensed projects can be taken and sold by others without sharing any changes made to them. With the GPL, all changes have to be made available again when a third party distributes the software.
So the hardcore FOSS community (which GNU certainly belongs to; see Richard Stallman) is against using MIT and the like, as they enable others to profit from your work without giving any improvements back to the open community.
(hand-wavy, non-detailed explanation - read other sources for clarity)
So the MIT license is less restrictive for users, but ultimately only in the interest of people looking for profit, not consumers.
Specifically, if you are making a widget with embedded Linux, the anti-tivoisation clauses in GPLv3 are a bit of a pain (even if you are not particularly trying to lock the system down against tinkering): they effectively mean you need to develop and provide an extra partial-update mechanism.
This is an ideological view that I would oppose.
GPL is beneficial for the user compared to MIT because any improvements made by users of a project will be open for everybody again.
With MIT it's just easier to make a profit without giving back. That is not in the interest of consumers.
Conversely, it means that the software can end up in places it wouldn’t otherwise. This could be better for consumers if the alternative is a poorer quality library with a more permissive license.
It is worth noting that this release currently breaks the banIP package [1], which relies on the old fw3. So for those relying on it, it might be worth waiting for a short while.
Fascinating link, thank you!
On that note, though: how are the horizontal bars (the first one is right below the introduction) made? My guess is that they gathered all colours from all pixels, made a histogram, and then converted that into a horizontal stacked bar (rough sketch of this hypothesis below). However, the ordering seems weird, as some colours look like they appear multiple times (for example, the almost-black regions), disproving my histogram hypothesis. Then again, that may be due to image compression. And ordering the colour vectors would depend heavily on the colour space and the partial order used.
I similarly stumbled over the image version of these bars in the section "The colours within a single object" further down; I just don't quite understand how they're made.
Does someone have insight on the methodology or maybe a link to a paper? The present methodology section seems to focus on object similarity.
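For reference, this is what I mean by the histogram hypothesis, as a rough Rust sketch (the image crate and the sort order are my assumptions, not the site's actual method):

    use image::GenericImageView;
    use std::collections::HashMap;

    fn main() {
        // Count how often each exact RGB colour occurs.
        let img = image::open("painting.jpg").expect("failed to open image");
        let mut hist: HashMap<[u8; 3], u64> = HashMap::new();
        for (_, _, px) in img.pixels() {
            let [r, g, b, _] = px.0;
            *hist.entry([r, g, b]).or_insert(0) += 1;
        }

        // Each colour would become one bar segment with width proportional
        // to its count. The open question is the ordering: sorting by
        // frequency (below), hue, or lightness all give different bars.
        let mut entries: Vec<_> = hist.into_iter().collect();
        entries.sort_by_key(|&(_, n)| std::cmp::Reverse(n));
        for ([r, g, b], n) in entries.iter().take(10) {
            println!("#{r:02x}{g:02x}{b:02x}: {n} px");
        }
    }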
> Note: This blog post has been rewritten at least 3 times. I started with describing how I configured the system, then I went to bragging about how I love bspwm, how I set up all my jails, etc. I might still write about it at some point, but not this time. Every time I started writing the post, I realised that I was missing a point. I can say now that I know what I really wanted to say: that I love FreeBSD and I find joy in using it.
I really appreciate the honesty and I think it was a good choice to focus on fleshing out the message.
Also: Quite the un-evangelistic stance on this rather controversy-inducing topic!
What I found interesting are the ripple-like throughput fluctuations. Especially on the TX2 v8.1a, this seems very odd.
User 'MB' has already asked this on the site's forum, but the author could not explain it either.
Maybe someone on HN has an idea?
I took a university class on said book last year, taught by one of the authors, Prof. Tobias Nipkow.
It's a fascinating introduction to general program semantics, formal analysis, and proofs of program properties.
During the lecture I struggled somewhat, as I didn't really have sufficient background knowledge for it to go smoothly.
But all the concepts are supported by, well, concrete examples which you can immediately try to solve in Isabelle. So I definitely recommend giving it a go!
It was an eye-opening experience for me, especially since I never really looked into formal program analysis.
I am no expert, but Coq and Isabelle "feel" very similar at first; Isabelle, however, seems to provide much more powerful tools for proof automation than Coq. Isar, Isabelle's proof language, tries to stay close to natural-language proofs, which makes it more intuitive.
I really wish Coq had equally powerful automation tools and a more intuitive proof language, because it feels like the more solid language overall, and I really liked the clear correspondence of types <-> statements, terms <-> proofs, and type checking <-> proof verification, including the ability to print the raw proof terms of theorems.
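For readers unfamiliar with that correspondence, a tiny illustration (written in Lean 4 here purely for brevity; Coq's Print does the same job):

    -- The type is the statement; the term is the proof; type checking
    -- is proof verification.
    theorem modus_ponens (p q : Prop) : p → (p → q) → q :=
      fun hp hpq => hpq hp   -- this term *is* the proof

    #print modus_ponens      -- prints the raw proof term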
[1] https://www.man7.org/linux/man-pages/man7/user_namespaces.7....
[2] https://security.stackexchange.com/a/209533