panstromek's comments | Hacker News

> There must be a really good reason for this, such as Rust doesn’t interop well with C++

Yea, I'd bet it's that. Ideally, you'd want to stop writing C++ and continue with Rust on all new code, but Rust has stricter semantics, so the interop is somewhat "easy" in one direction, but very hairy in the other direction.

This means that in practice, you want to start porting from leaf components and slowly grow closer to the root, which stays in C++ for quite some time and just calls into Rust through C API (or something close to it).
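To make the "C++ root calls into Rust leaves through a C API" direction concrete, here's a minimal sketch. The function name and logic are hypothetical, not from any real codebase - the point is just the shape of the boundary:

```rust
// Hypothetical leaf component ported to Rust, exposed through a C ABI
// so a C++ caller can use it without understanding Rust semantics.
#[no_mangle]
pub extern "C" fn checksum(data: *const u8, len: usize) -> u32 {
    // SAFETY: the C++ caller must pass a valid pointer/length pair.
    let slice = unsafe { std::slice::from_raw_parts(data, len) };
    slice.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

fn main() {
    // Exercising it from Rust itself, just for illustration.
    let buf = [1u8, 2, 3];
    println!("{}", checksum(buf.as_ptr(), buf.len())); // 6
}
```

Going the other way - letting Rust own and borrow-check objects whose lifetimes are managed by C++ - is where it gets hairy, which is exactly the asymmetry Zngur is built around.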

If you're curious about the topic, there's an interop library called Zngur (https://hkalbasi.github.io/zngur/), which is built on this assumption. They have a pretty good explanation of the concrete problems on the homepage.


A lot of DOM APIs are like that. You have methods like element.[parent|children](), which imply a circular structure, and then you have APIs like element.click(), which emits a click event that bubbles through the DOM - which means that the element has to have some mutable reference to the DOM state. Or even element.remove(), which seems like a super weird API to have on an element of a collection, from a Rust API design point of view.

You can model these with reference counting, but this turned out to be infeasible in browsers. There's a great talk from when Blink (Chrome) transitioned from reference counting to GC, which provides a lot more details about these problems in practice: https://www.youtube.com/watch?v=_uxmEyd6uxo
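For a sense of what the reference-counting approach looks like in Rust, here's a toy sketch (types are illustrative, not from any browser): parent links have to be Weak to avoid leaking cycles, and every mutation goes through RefCell, which is exactly the kind of overhead and awkwardness the talk describes:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Toy node type; real engines moved away from this design.
struct Element {
    name: String,
    parent: RefCell<Weak<Element>>,      // Weak, or the cycle leaks
    children: RefCell<Vec<Rc<Element>>>, // shared ownership of children
}

fn new_element(name: &str) -> Rc<Element> {
    Rc::new(Element {
        name: name.into(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

fn append(parent: &Rc<Element>, child: Rc<Element>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

fn main() {
    let root = new_element("body");
    let div = new_element("div");
    append(&root, div.clone());
    // Navigating back up through the weak parent link:
    let p = div.parent.borrow().upgrade().unwrap();
    println!("{} -> {}", p.name, div.name); // body -> div
}
```

Even this tiny example has runtime borrow bookkeeping on every access, and any API like element.remove() would need the element to reach "up" into shared mutable state.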

> I'd get it if they were writing the UI in this, but the rest of this post is about the JS engine.

I think this might be the reason they started with the JS engine and not with some more fundamental browser structures. The JS object model has these problems too, but the engine has to solve them in a more generic way. All JS objects can just be modeled as some JSObject class/struct where this is handled on the engine level.

DOM and other browser structures are different because the engine has to understand them, so the browser developers have to interact with the GC manually and if you watch the talk above, you'll see that it's quite involved to do even in C++, let alone in Rust, which puts a bunch of restrictions on top of that.


> Modern C++ pretty much solves the safety issues.

I always wonder how one can come to such a conclusion. Modern C++ has no way to enforce relationships between two objects in memory, or the shared-xor-mutable rule, which means it can't even do the basic checks that are the foundation of Rust's safety features.
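For reference, the shared-xor-mutable rule as the Rust compiler enforces it, in a toy example (the commented-out line is the one rustc rejects):

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    let shared = &v[0]; // shared borrow into v's contents
    // v.push(4);       // rejected: can't mutate v while `shared` is live
    println!("{}", shared);

    v.push(4); // fine once the shared borrow has ended
    println!("{}", v.len()); // 4
}
```

The rejected line is precisely the kind of bug (mutation invalidating a live reference, here via a potential reallocation) that no amount of modern C++ style can statically rule out.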

Of course, this statement is also trivially debunked by the reality of any major C++ program with the complexity and attack surface of something like a browser. Modern C++ certainly didn't save Chrome from CVEs. They ban a bunch of C++ features, enforce the Rule of 2, and do a bunch of hardening and fuzzing on top of it, and they still aren't spared from safety issues.


FWIW Chrome includes third-party libraries like FreeType, and lots of bugs are in JavaScript. I imagine defensive checks in JavaScript will be controversial, since the performance of JavaScript is controlled by the webdev, not by the browser.

Note that Chrome is replacing[1] FreeType with Skrifa[2], which is a Rust-based library that can handle a lot of the things FreeType is being used for in Chrome. A lot of Chrome's dependencies are being rewritten in Rust.

[1]: https://developer.chrome.com/blog/memory-safety-fonts

[2]: https://github.com/googlefonts/fontations/tree/main/skrifa


Yeah, sure. Thing is, C does just fine; people are making "safe" ways to run libc. Rust is a complicated monstrosity with a bunch of "unsafe" sprinkled in.

What does the memory safety even matter when hackers poison heavily used crates?


> besides the obvious

Well, what else is there besides the obvious? It's a browser.


I don't see the idea being visual tools; I've never even heard anybody talk about it like that. The plan is to target enterprise customers with advanced features. I feel like you should just go and watch some interviews or something where they talk about their plan; Evan You was recently on a few podcasts mentioning their plans.

Also, the paradox is not really even there. The JS ecosystem largely gave up on JS tools a long time ago. Pretty much all major build tools are migrating to native or have already migrated, at least partially. This has been going on for the last 4 years or so.

But the key to all of this is that most of these tools still support JS plugins. Rolldown/Vite is compatible with Rollup JS plugins, and OXLint has an ESLint-compatible API (it's in preview atm). So it's not really even a bet at all.


I think I agree (though I think about this maybe one level higher). I wrote about this a while ago in https://yoyo-code.com/programming-breakthroughs-we-need/#edi... .

One interesting thing I got in replies is the Unison language (content-addressed functions; a function is defined by its AST). Also, I recommend checking out the Dion language demo (an experimental project which stores the program as an AST).

In general I think there's a missing piece between text and storage. Structural editing is likely a dead end; writing text seems superior, but text as a storage format is just fundamentally problematic.

I think we need a good bridge that allows editing via text, but storage like a structured database (I'd go as far as to say a relational database, maybe). This would unlock a lot of IDE-like features for simple programmatic usage, or manipulating language semantics in some interesting ways, but the challenge is of course how to keep the mapping from textual input in shape.


Structural diff tools like difftastic[1] are a good middle ground and still underexplored IMO.

[1] https://github.com/Wilfred/difftastic


IntelliJ diffs are also really good; they are somewhat semi-structural, I'd say. Not going as far as difftastic, it seems (but I haven't used that one).

  > Dion language demo (experimental project which stores program as AST).
Michael Franz [1] invented slim binaries [2] for the Oberon System. Slim binaries were program (or module) ASTs compressed with some kind of LZ-family algorithm. At the time they were much smaller than Java's JAR files, despite JAR being a ZIP archive.

[1] https://en.wikipedia.org/wiki/Michael_Franz#Research

[2] https://en.wikipedia.org/wiki/Oberon_(operating_system)#Plug...

I believe that this storage format is still in use in Oberon circles.

Yes, I am that old, I even correctly remembered Franz's last name. I thought then he was and still think he is a genius. ;)


Interesting. It looks to me like this was more about the portability of the resulting binary, IIUC.

The Dion project was more about the user interface to the programming language, and unifying tools to use the AST (or typed AST?) as the source of truth instead of text, and what that unlocks.

Dion demo is here: https://vimeo.com/485177664


I took a look.

Their system allows for intermediate states with errors. If that erroneous state can be stored to disk, they are using a storage representation that is equivalent to text. If an erroneous state cannot be stored, that makes the Dion system much less usable, at least for me.

They also deliberately avoided the pitfalls of languages like C. They can do that because they're designing the language from scratch, but I'd like to see how they would extend their concepts - the user interface to the programming language, and unifying tools around a (typed) AST - to C or, forgive me, C++, and what that would unlock.

Also, there is an interesting approach of error-correcting parsers: https://www.cs.tufts.edu/comp/150FP/archive/doaitse-swierstr...

A much extended version is available in Haskell on Hackage: https://hackage.haskell.org/package/uu-parsinglib

As it allows monadic parser combinators, it can parse context-sensitive grammars such as C's.

It would be interesting to see whether their demonstration around 08:30, of Visual Studio being unable to recover from an error properly, could be improved with error correction.


I'm quite sure I've read your article before, and I've thought about this one a lot. Not so much from the Git perspective, but about the textual representation still being the "golden source" for what the program is when interpreted or compiled.

Of course, text is so universal and allows for so many ways of editing that it's hard to give up. On the other hand, while text is great for input, it comes with overhead and core issues (most are already in the article, but I'm writing them down anyway):

  1. Substitutions such as renaming a symbol where ensuring the correctness of the operation pretty much requires having parsed the text to a graph representation first, or letting go of the guarantee of correctness in the first place and performing plain text search/replace.
  2. Alternative representations requiring full and correct re-parsing such as:
  - overview of flow across functions
  - viewing graph based data structures, of which there tend to be many in a larger application
  - imports graph and so on...
  3. Querying structurally equivalent patterns when they have multiple equivalent textual representations and search in general being somewhat limited.
  4. Merging changes and diffs have fewer guarantees than merging graphs or trees.
  5. Correctness checks, such as cyclic imports, ensuring the validity of the program itself are all build-time unless the IDE has effectively a duplicate program graph being continuously parsed from the changes that is not equivalent to the eventual execution model.
  6. Execution and build speed is also a permanent overhead as applications grow when using text as the source. Yes, parsing methods are quite fast these days and the hardware is far better, but having a correct program graph is always faster than parsing, creating & verifying a new one.
I think input as text is a must-have to start with no matter what, but what if the parsing step was performed immediately on stop symbols rather than later and merged with the program graph immediately rather than during a separate build step?

Or what if it was like a "staging" step? E.g., write a separate function that gets parsed into the program model immediately, then try executing it, and then merge it into the main program graph later, which can perform all necessary checks to ensure the main program graph remains valid. I think it'd be more difficult to learn, but having these operations and a program graph as a database would give so much when it comes to editing, verifying and maintaining more complex programs.


> what if the parsing step was performed immediately on stop symbols rather than later and merged with the program graph immediately rather than during a separate build step?

I think this is the way to go, kinda like on GitHub, where you write Markdown in the comments, but that is only used for input; after that it's merged into the system, all code-like constructs (links, references, images) are resolved, and from then on you interact with the higher-level concept (the rendered comment with links and images).

For a programming language, Unison does this - you write one function at a time in something like a REPL, and functions are saved in a content-addressed database.
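A toy sketch of the content-addressed idea (not Unison's actual scheme): hash a normalized form of the function body - names stripped, formatting gone - and use that hash as the function's identity in the store, so renames and reformatting don't change what the function *is*:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy "AST": a function body reduced to its normalized token stream.
fn content_id(body_tokens: &[&str]) -> u64 {
    let mut h = DefaultHasher::new();
    body_tokens.hash(&mut h);
    h.finish()
}

fn main() {
    let mut store: HashMap<u64, Vec<String>> = HashMap::new();

    // `increment` and a later rename to `bump` have the same body,
    // so they resolve to the same stored definition.
    let body = ["x", "+", "1"];
    let id = content_id(&body);
    store.insert(id, body.iter().map(|s| s.to_string()).collect());

    assert_eq!(content_id(&body), id); // identity is stable under renames
    println!("definitions stored: {}", store.len()); // 1
}
```

The interesting consequences (exact incremental recompilation, trivially correct renames) fall out of identity being structural rather than textual.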

> Or what if it was like "staging" step?

Yes, and I guess it'd have to go even deeper. The system should be able to represent a broken program (in an edited state), so conceptually it has to be something like a structured database for code which separates the user input from the stored semantic representation and the final program.

IDEs like IntelliJ already build a program model like this and incrementally update it as you edit; they just have to work very hard to do it, and that model is imperfect.

There are a million issues to solve with this, though. It's a hard problem.


Why would structural editing be a dead end? It has nothing to do with the storage format. At least the meaning of the term I am familiar with is about how you navigate and manipulate semantic units of code instead of manipulating characters of the code - for example, pressing some shortcut keys to invert the nesting of AST nodes, or wrap an expression inside another, or change the order of expressions, all at the press of a button or key combo. I think you might be referring to something else, or a different definition of the term.

I'm referring to UI interfaces that only allow structural editing and usually only store the structural shape of the program (e.g. no whitespace or indentation). I think at this point nobody uses them for programming; they're pretty frustrating to use because they don't let you make edits that break the semantic text structure too much.

I guess the most-used one is the styles editor in Chrome dev tools, and that one is only really useful for small tweaks; even just adding new properties is already a pretty frustrating experience.

[edit] Otherwise I agree that structural editing à la IDE shortcuts is useful; I use that a lot.


Some very bright JetBrains folks were able to solve most of those issues. Check out their MPS IDE [1]; its structured/projectional editing experience is in a class of its own.

[1] https://www.youtube.com/watch?v=uvCc0DFxG1s


Come to the BABLR side. We have cookies!

In all seriousness this is being done. By me.

I would say structural editing is not a dead end, because, as you mention, projects like Unison and Smalltalk show us that storing structures is compatible with having syntax.

The real problem is that we need a common way of storing parse-tree structures so that we can build a semantic editor that works on the syntax of many programming languages.


I think neither Unison nor Smalltalk use structural editing, though.

[edit] On the level of code in a function, at least.


No, I know that. But we do have an example of something that does: the web browser.

> but storage format as text is just fundamentally problematic.

Why? The AST needs to be stored as bytes on disk anyway; what is problematic in having those bytes be human-readable text?


Steam takes a 30% cut, though?


Yes, and that is also excessive.

https://en.wikipedia.org/wiki/Whataboutism


I have to respond to your point, though. Whether a 30% cut is excessive depends on whether devs feel like they are getting a good deal. As far as I can tell, game developers don't seem to complain about Steam's cut very much; it seems like the value you get is worth it.

For example, in this thread https://www.reddit.com/r/Steam/comments/10wvgoo/do_you_think... it seems like the majority is positive about it, even though people debate it. When the Apple tax is brought up, there's almost never even a discussion; it's pretty universally hated.

Apple seems to have an almost adversarial relationship with its developers. I deploy to the App Store and I feel like I'm getting screwed. Even compared to Google, which takes the same cut but does behave a lot more nicely toward its developers.


I'm not judging that, it just seems to contradict the "But Steam shows us another model..." sentence, so I'm trying to make sense of that.


You're right, I didn't know it was 30%.

Checking an LLM, it sounds like they more or less all charge 30%. That's shit.


> Note: This image has been edited to include a pile of cash.

I giggled


There are so many ways this benchmark can go wrong that there's pretty much no way I can trust this conclusion.

> All the loops call a dummy function DATA.doSomethingWithValue() for each array element to make sure V8 doesn't optimize out something too much.

This is probably the most worrying comment - what is "too much"? How are you sure it doesn't change between different implementations? Are you sure V8 doesn't do anything you don't expect? If you don't look into what's actually happening in the engine, you have no idea at this point. Either you do the real work and measure that, or you do the fake work but verify that the engine does what you think it does.
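The same hazard exists outside JS. Rust microbenchmarks, for comparison, deal with it via std::hint::black_box - an explicit optimization barrier, rather than a dummy call you hope the optimizer respects (a rough sketch; the workload here is arbitrary):

```rust
use std::hint::black_box;
use std::time::Instant;

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();

    let t = Instant::now();
    let mut sum = 0u64;
    for &x in &data {
        // black_box makes the value opaque to the optimizer, so the
        // loop body can't be constant-folded or vectorized away.
        sum = sum.wrapping_add(black_box(x));
    }
    // Keep the result "used" so the whole loop isn't dead-code eliminated.
    black_box(sum);
    println!("sum={} in {:?}", sum, t.elapsed());
}
```

V8 has no equivalent public contract, which is why "make sure it doesn't optimize out something too much" needs to be verified against the engine's actual output, not assumed.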


There are a lot of "probably"s in the article. I was also suspicious that the author didn't say they did any pre-measurement runs of the code to ensure that it was warmed up first. Nor did they, e.g., use V8 arguments with Node (like --trace-opt) to check what was actually happening.


You can compile to V8's final TurboFan bytecode and use AI to analyze and compare the instructions.


> Maybe that's me, but I rarely saw teams which over-document, under-documenting is usually the case.

This is a good point, although this recently changed with LLMs, which often spit out a ton of redundant comments by default.


Claude Code in particular seems to use very few redundant comments. That or it's just better at obeying the standing instruction I give it to not create them, something other assistants seem to blithely ignore.

