Terminal graphics protocol (kovidgoyal.net)
77 points by tosh on Dec 8, 2023 | 70 comments


"I retired from modeling a long time ago. It's time I retired from tech, too... Let's commit to losing me."

https://en.wikipedia.org/wiki/Lenna


[flagged]


Could you please not break the site guidelines like this? The parent comment has been heavily upvoted.

If you see a comment that has been unfairly downvoted, give it a corrective upvote (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...) and move on. If it's an egregious case, you can always let us know at hn@ycombinator.com and we'll be happy to take a look. But comments like what you posted here just add noise and usually end up false, as yours did. Unfortunately they linger on as uncollected garbage (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...).

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.

Edit: it looks like you've unfortunately been breaking the site guidelines in other places too. Please see https://news.ycombinator.com/item?id=38561818 also.


[flagged]


Respecting a single real person's basic requests about use of their own image is "politically correct" now? It's worth considering at what point your knee-jerk reactions have gone off the deep end on the other side.


It's HER wish. We could respect it without sacrificing other minorities.


My brain read this as "we could respect it without respecting other minorities", maybe the orange has my brain conditioned.


I have, so far, been lucky not to have experienced much of this attitude myself.


“Hey maybe we should all stop slapping this one person in the face?” “How is that going to help with other things that really matter?”


I'm wondering if one day we'll see an alternative to terminal technology that leaves behind in-band signalling for good.


It wouldn't be called a "terminal" then, and wouldn't be connected to a tty (at least not on Unix-like OSs).

A lot of the power of the terminal metaphor is its simplicity and universality. You can implement one as a purely mechanical device (as has been done numerous times), or as an application running on your graphical environment. With a GUI terminal window you can run the very same applications you'd run on an ASR-33 teletype (although the sounds and smells won't be present).


I don't understand why such legacy support is necessary for 99% of my terminal use cases. I just want to run command-line programs in something that doesn't emulate weird hardware from last century, but uses OS standard text input and display. I will never be convinced that setting colors and cursor positions with keycodes is a good design.


I know it sounds unbelievable, but the default apps bundled with stock operating systems have not, in fact, been selected to exactly match your personal use cases.

There's actually this rather common expectation that the end user is the one that is in charge of personalising their system to their exact needs.


> I don't understand why such legacy support is necessary for 99% of my terminal use cases.

People need to run apps via uart or any other kind of remote link that might come in the future.


Yes, but even UART could handle some kind of a simple multiplexer protocol with multiple data streams.


It's the OS's job to expose a stream interface on top of a UART.

There was a graphical terminal called the Blit (sold as the AT&T 5620, succeeded by the DMD 630 and then the 730 "Multi-Tasking Graphical Terminals"). They implemented windowing, multiple streams, graphics and a lot more on top of serial connections. Essentially, multiplexing streams on top of "a UART". What you describe is more or less what they did back then, minus the windowing and multi-tasking functions, because we let the host OS deal with that.

There is even a neat 5620 emulator for macOS you can download from the app store. It requires a proprietary shell running for that tty, and I never took the time to figure out exactly how it works.

https://archives.loomcom.com/3b2/documents/DMD_Terminal/ has some good docs on that, and bitsavers.org has a ton more on the 630 and 730.

I hope one day someone finds a ton of them forgotten in a warehouse in the middle of a desert in the US, because, where I live people recycle old computers with an uncanny dedication (those barbarians!).


which would be in-band signalling again since uart is just timed 0's and 1's over a single tx and rx signal.


Windows has out-of-band terminal controls (size, ...) through a terminal API.

They decided it was a bad thing (also compatibility) and now they implemented in-band ANSI escape sequences and recommend them.
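
For anyone who hasn't looked at what "in-band" means here: the control codes travel in the same byte stream as the text. A minimal sketch, assuming a terminal that understands the common CSI cursor-position (CUP) and 24-bit SGR colour sequences; the helper names are made up for illustration:

```python
ESC = "\x1b"
CSI = ESC + "["        # Control Sequence Introducer
RESET = CSI + "0m"     # SGR 0: reset colours and attributes


def move_cursor(row, col):
    """CUP: move the cursor to a 1-based row and column."""
    return f"{CSI}{row};{col}H"


def set_fg_rgb(r, g, b):
    """SGR 38;2: select a 24-bit foreground colour."""
    return f"{CSI}38;2;{r};{g};{b}m"


def styled_at(row, col, text, rgb):
    """Build one string that positions the cursor, colours the text, then resets."""
    return move_cursor(row, col) + set_fg_rgb(*rgb) + text + RESET
```

Writing `styled_at(1, 1, "hello", (255, 128, 0))` to stdout is all it takes, which is also why it works unchanged over SSH or a serial line: there is no second channel to keep in sync.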


I'm not sure if this was the reason for them, but one potential problem is synchronization between the two bands.


Fascinating. Do you know why?


The biggest one is compatibility with Linux console apps.

But also the difficulty of running a terminal session outside the traditional Windows command line app (think VS Code terminal pane).

https://devblogs.microsoft.com/commandline/windows-command-l...


Surely the VS Code terminal pane can't display an arbitrary Unix terminal session, so this shouldn't be a concern. Arguably Windows makes it easier: since color control is out of band, it can be ignored and only text is displayed.


because of the hacker community circle jerk. there was no rational reason. it's like back when people used ANSI.SYS to display a text file with colors. it's leet factor and has no application in reality.


Plan 9 is built as a text-first graphical system without any text terminal protocols. Like Unix it has a command line, but it's just a typewriter: you can't control the cursor. Instead you have draw(3), which serves the graphical facilities of a system which you load assets into and issue draw commands to manipulate and display them. Of course, to maintain backwards compatibility there is vt(1), which is a graphical program that transparently gives you a vt-100 or 220 display.


I don't think in-band signaling is the problem. You can describe anything with a sequence of symbols and I find it elegant to be honest. It's just that the protocols are outdated and fragmented.


>You can describe anything with a sequence of symbols

That does not mean it is efficient. If the data is on the GPU then with in-band signaling you have to copy it off of the GPU to send it to the terminal and then the terminal has to copy it back to the GPU.


What's wrong with extending ReGIS, Sixel, or Tektronix (just kidding on that one) protocols? Or, maybe, PostScript/Display PostScript?

Can't we define what behavior a modern terminal should have with these already existing tools instead of inventing a brand new wheel?


Those existing tools are poorly designed. If you read the article, it has a link to the discussion about its design choices, which in turn contains discussion of all the problems with sixel: https://github.com/kovidgoyal/kitty/issues/33#issuecomment-2...


I have been part of these discussions, mostly on the VTE side, as I made a promise to implement something that I'm yet to be able to deliver upon.


The linked spec in the original has all the hallmarks of "oh wait, maybe we do need...".

It reads like what you get when someone has a poor idea of what 2D UI requirements are (alignment, relative positioning, layout, out-of-band image delivery, updates, animation, ...) and then has to retroactively fit them in one by one.

And then the mechanisms it has for doing so read like they are a poor fit for how you want to use them, requiring weird global bookkeeping/state, and sound like a nightmare without a proper API around it, abstracting it away.

Aka reinventing all the lessons of 30 years of UI development from scratch.

From what I remember sixel is _worse_ but that doesn't really change things.


Or instead it reads like someone thoughtfully came up with a minimal design and then iterated on that based on real world feedback for what is actually useful in this context.

Your only actual complaints were "weird global state" and "no API". I look forward to seeing you design a protocol that involves two way communication between two unrelated entities that avoids some shared state between them. As for no API, this is a protocol specification. If you need an API so badly go browse NPM.


Because pixel-perfect display of images is nowhere near as important as people think it is, and because implementing a terminal which can display both images and text is much easier if you let go of the idea that pixels need to be displayed perfectly by the terminal.

Pixel aspect ratios are not always 1:1. Fonts have character height to width ratios which are all over the place and a terminal which must adapt any character cell to any region of an image pixel perfectly is way more difficult than it needs to be. Unless you're Casey Muratori, probably. Then there are things like display scaling, which, ideally, the terminal application itself shouldn't need to know anything about in order to render correctly. What good is a terminal that renders pixels exactly if the OS can then magnify the window so that you lose the pixel "perfection"?

I don't like the Kitty protocol described in this post, either. The Windows Terminal people are thinking about this, believe it or not, and they are the ones that convinced me that existing methods are insufficient. We need yet another standard for doing this. One that can adapt to video display technology of the 1990s, at least.


The only protocol I mentioned that expects pixel-perfect images is the sixel one (which is, essentially, what a DEC printer would print on paper, but on the screen). ReGIS, while it is traditionally rendered as pixels, doesn't really care about how many pixels the screen has. DPS doesn't care much about pixels, and Tektronix doesn't even have the concept of pixels, just coordinates in an x-y plane (because Tektronix terminals didn't even have pixels).

> I don't like the Kitty protocol described in this post, either. The Windows Terminal people are thinking about this

I've been part of these discussions (mostly from the VTE side). I agree sixel, ReGIS, and Tek are "legacy" and have been implemented in such ways it's a miracle that they ever worked, and what you see on one brand of terminal is entirely dependent on the positions of the planets when the code was written and when the image is rendered and might not resemble at all what you'd see on another brand unless by coincidence ;-P.

Which is kind of why I brought up DPS - it's the only one that takes pure geometry and generates pixels (Tektronix does so as well, but it's very limited, and doesn't really like to be anchored to the text screen).


Display PostScript makes sense to me, as well. It's almost as if it was designed for this exact problem.

Very different from VT100-compatible communication, though, but maybe it can be encoded to fit that mold.


The time-honored tradition of escape sequences can deal with this. Unfortunately, we might need some new escape sequences as the current ones are already taken with the traditional protocols.


Synthetic images like generated graphs with fine lines and text (i.e. not photographs), need pixel perfect rendering otherwise they look terrible. This is likely to be a far more common and useful use case for images in a terminal than showing photos.


For images with fine lines we have both ReGIS and Tek4014. ReGIS specifies pixel values (and became somewhat of a pain when DEC terminals with more than 240 scanlines appeared) but Tektronix graphics work with (IIRC) a coordinate space of 4096x4096 addressable points. It's not too amenable to extensions and has very little functionality besides drawing straight lines in various styles and monospaced text in a couple (3, IIRC) sizes.


> What's wrong with extending ReGIS, Sixel, or Tektronix (just kidding on that one) protocols? Or, maybe, PostScript/Display PostScript?

Nothing is wrong with them: Sixels for example work perfectly for all the tasks I throw at them in my terminals, from showing pixel perfect images to playing videos with mpv.
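
For those who haven't seen the format: a sixel stream is just printable characters wrapped in a DCS escape, where each character encodes a column of six vertical pixels. A minimal monochrome sketch, assuming a sixel-capable terminal; the function name and the 0/1 bitmap input are illustrative choices, and real encoders add colour palettes and run-length compression:

```python
def encode_sixel(bitmap):
    """Encode a list of rows of 0/1 values as a one-colour sixel string."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    # DCS q starts sixel mode; "#1;2;100;100;100" defines colour 1 as
    # RGB (type 2) with each channel at 100 percent, i.e. white.
    out = ["\x1bPq", "#1;2;100;100;100"]
    for top in range(0, height, 6):
        out.append("#1")  # select colour 1 for this band of six rows
        for x in range(width):
            bits = 0
            for dy in range(6):
                y = top + dy
                if y < height and bitmap[y][x]:
                    bits |= 1 << dy  # least significant bit is the top pixel
            out.append(chr(63 + bits))  # data characters run from '?' to '~'
        out.append("-")  # graphics new line: move down to the next band
    out.append("\x1b\\")  # ST terminates the sequence
    return "".join(out)
```

Printing `encode_sixel([[1] * 20] * 12)` in a terminal with sixel support should draw a small white rectangle; in one without, you get either nothing or a line of stray characters, depending on how the terminal handles unknown DCS strings.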

> Can't we define what behavior should a modern terminal have with these already existing tools instead of inventing a brand new wheel?

Actually, something is wrong with these already existing tools: they are preexisting, and therefore someone else's idea. If you don't invent a new wheel (for which you can take credit), you have to recognize that 40 year old technology is now perfectly adequate.

I think it wasn't ideal in the 1980s due to the bandwidth requirements, but in 2023, if I can play videos with mpv, I think that's enough for most use.

I think the tech has been adequate for some time (at least about 15 years), and I can freely recognize that both because I have no horse in the game, and I care more about actual use than having/creating a shiny new tech.

Inventing a better wheel is the perfect way to avoid confronting such issues: you can argue your new wheel is better, and technically it will be true, even if in practice it's no different than the old wheel, in the sense it allows your vehicle to go forward for all the practical measures you care about.

There may also be a loss-aversion effect: if what you invented that was great is no longer needed, it may be painful to recognize it. You may not want to go for what's technologically not as nice as your invention, but that can now do the job just as well as your invention did, thanks to Moore's law.

Some people also just want to see their favored pet idea win - unfortunately, their pet idea is only theoretical, or if it exists, it has a very small install base: since sixel is "adequate" or "sufficient", blocking/sabotaging sixel support is the logical action to prevent it from becoming established through network effects.

Put all these people together, add a little handwaving about how old formats can't work, and you get the current situation.

BTW here's a project I'll try to resurrect: rendering X applications ... in sixels: https://github.com/csdvrx/xorg-sixel

So you could have graphical applications in your terminal! I've only uploaded the picture, as I'm still fighting with the code, but what you see is xeyes shown inside wezterm thanks to the sixel output of Xorg.

For terminals that don't support sixels (yet), sixel-tmux will convert them "in flight" into derasterized graphics: see https://github.com/csdvrx/sixel-tmux/

I'm sure this tech stack (X apps -> Xsixel -> sixel-tmux -> text) will make many people cringe, but in terms of functionality, it works (latency, bandwidth etc are sufficient for my uses) and if you have sixel support in your terminal, you get pixel perfect results.

FWIW I use and love Hyprland, but I want to use portable graphics apps more than I care about the underlying tech being better.


Discussed a bit here:

Terminal Graphics Protocol - https://news.ycombinator.com/item?id=35940691 - May 2023 (3 comments)


This is ignoring how modern graphics work on a computer. If you want to make a terminal into a compositor you do not want to have programs communicate pixel data over stdout. You instead allocate a buffer (texture) that programs draw into. The approach described is unable to utilize hardware acceleration that is useful for a compositor. If you play a video there will be a lot of overhead in this approach compared to a normal compositor where the video decoding and compositing can happen entirely on the GPU.
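
For context, the in-band path being discussed looks roughly like this. A sketch based on my reading of the kitty spec, so treat the details as assumptions: PNG data (`f=100`) is base64-encoded and sent in APC escapes of at most 4096 payload bytes, with `m=1` meaning more chunks follow and `m=0` marking the last one. The function name is made up for illustration:

```python
import base64

ESC = "\x1b"
CHUNK = 4096  # assumed maximum base64 payload per escape code


def kitty_png_escapes(png_bytes):
    """Split a PNG into kitty graphics escape codes (transmit and display)."""
    payload = base64.standard_b64encode(png_bytes).decode("ascii")
    chunks = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)] or [""]
    escapes = []
    for index, chunk in enumerate(chunks):
        more = 0 if index == len(chunks) - 1 else 1
        # Only the first chunk carries the action and format keys;
        # later chunks only say whether more data follows.
        keys = f"a=T,f=100,m={more}" if index == 0 else f"m={more}"
        escapes.append(f"{ESC}_G{keys};{chunk}{ESC}\\")
    return escapes
```

Every frame pays for base64 (a 4/3 size increase) plus a trip through the pty, which is the overhead in question. As far as I can tell the spec also describes file and shared-memory transmission modes for local use, which avoid pushing the pixels through the pty, but the data still has to pass through CPU-visible memory.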


Anyone remember RIP graphics from the BBS days? There were only a couple of BBSes near me that supported it, but it sure seemed magical at the time.


This was my first thought, and I'm glad it was the first comment HN showed to me. RIP graphics were pretty slow to render compared to their ANSI counterparts, but that vector art could render visuals otherwise unseen on independent BBSs.

Legend of the Red Dragon even supported RIP.


Mitchell Hashimoto's new terminal Ghostty (private beta) supports the Kitty graphics protocol [0].

0. https://mitchellh.com/writing/ghostty-and-useful-zig-pattern...


Sidelining all the interesting stuff that the article discusses, people decided to focus on the picture of Lenna. Yes, I understand it is a problem, but it needs to be dealt with in a collective manner by the tech community, say by introducing an alternative.



There are plenty of alternatives, as well as several calls for using them. For example you can't use Lena in any Nature journal, and they suggest ‘Cameraman’, ‘Mandril’ or ‘Peppers’.

In other words, there is practically speaking a well established alternative, and the remaining task is to badger all the holdouts into switching.


scikit-image replaced Lenna with a picture of astronaut Eileen Collins: https://scikit-image.org/docs/stable/api/skimage.data.html#s...


If everyone decided to discuss that then it sounds like they are dealing with it in a collective manner.

And I think it's a little ignorant to suggest that the issue is lack of an alternative. How about just don't use it? Or take a photo of your own kid and upload it


Demonstrations like this used various images forever.


If you are not using kitty, some programs like chafa can convert images into ANSI color codes + Unicode characters. It's very handy in many cases.
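
The basic trick behind this kind of output can be sketched in a few lines. This is not chafa's actual algorithm (it picks from a much larger set of symbols), just a minimal illustration assuming a terminal with 24-bit colour support: the upper half block character gets the top pixel as its foreground colour and the bottom pixel as its background, so each character cell shows two pixels. The function name is made up:

```python
UPPER_HALF = "\u2580"  # upper half block: foreground on top, background below


def half_block_rows(pixels):
    """Turn rows of (r, g, b) tuples into terminal lines, two pixel rows per line."""
    lines = []
    for y in range(0, len(pixels), 2):
        top = pixels[y]
        # Pad an odd final row with black so every cell has two pixels.
        bottom = pixels[y + 1] if y + 1 < len(pixels) else [(0, 0, 0)] * len(top)
        cells = []
        for (tr, tg, tb), (br, bg, bb) in zip(top, bottom):
            cells.append(
                f"\x1b[38;2;{tr};{tg};{tb}m\x1b[48;2;{br};{bg};{bb}m{UPPER_HALF}"
            )
        lines.append("".join(cells) + "\x1b[0m")  # reset colours at end of line
    return lines
```

Feed it pixel data from whatever image loader you like and print the lines; an 80x24 terminal then behaves like an 80x48 display.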


You can have that functionality integrated within tmux with https://github.com/csdvrx/sixel-tmux/ : if your terminal doesn't support sixels, you'll at least see something close to the picture they represent.

Then of course it's not pixel-perfect unless you make your terminal very large (like 800x240 instead of 80x24) but something being better than nothing, I'd argue it's for the better if all you can do is 80x24 with no pictures otherwise.


I really like the idea of the Terminal Graphics Protocol and the aims of the project, but I feel it's important to address the choice of the Lenna photo. This image, although historically significant in the field of image processing, is increasingly viewed as an outdated and inappropriate choice due to its origins and the implicit message it conveys.

The Lenna photo, originally taken from a Playboy magazine, perpetuates a narrow and objectified view of women. Its continued use in tech demos and educational materials not only overlooks the rich diversity of alternative images available for such purposes but also subtly endorses a culture that many in the tech community are striving to move away from.

In an era where inclusivity and sensitivity are rightfully gaining prominence, it's crucial that we reassess and update our educational and demonstrative tools to reflect these values. Opting for more neutral and universally appropriate images would not only avoid potential discomfort among a diverse audience but also demonstrate a commitment to fostering a more inclusive and respectful tech culture.


If you don't want your image being used broadly without your control, then don't become a model for Playboy. Ultimately she was paid fairly for the rights to her image and so that's that. It is completely fair use and hearing about how she should be able to somehow universally dictate the usage afterwards is boring. If the copyright holder wishes to issue takedowns they can do so, nobody else's opinion on the topic matters whatsoever, including hers.


Sorry, is there some shortage of other pictures on the Internet? Do you only do the right thing when you're legally or contractually forced to do so? Were you expecting a Playboy model in the 70's to anticipate the Internet when she signed that contract?!

Just use a different image. It's a pretty simple ask. For fuck's sake, this shouldn't be so hard to understand with a tiny shred of human empathy.


There's a shortage of photos that have a 50 year history in the field. By using this image you can compare directly to the 1973/4 paper. Using some other stock photo won't let you do that.

This is ignoring that the photo itself is completely innocuous. People who complain about the Lena photo, I suspect that 99.9% of them have absolutely no problem with AOC's appearance at the Met Gala in her Eat Mor Chikin dress. Which, from a perspective of propriety, there is no possible differentiation from the Lena photo. Their complaint has to do with the photo's origins, not with the photo itself. And, sorry, but the world isn't ruled by the pearl clutchers on their fainting couches.

https://www.nbcnewyork.com/entertainment/the-scene/met-gala/...


Yeah, it's not rocket science. It's also not about objectification or diversity or whatever, the picture is 100% innocuous. But if the model does not want to, just don't fucking use it, your paper will be fine.


Are you going to go back and update all the reports from the last 50 years? That is not an easy ask.

It's like asking Americans to use metric.


It's still a dick move. Just because being a dick is legal does not mean it is free of consequences.


So your opinion on the topic doesn't matter whatsoever either.


I'm not the one trying to dictate who does and doesn't use it.


> In an era where inclusivity and sensitivity are rightfully gaining prominence

We don't live in that era. We live in an era where you are either "with us" or "against us".


It's all well and good that you feel that way but what do you offer to replace the Lenna photo? It's easy to say it needs to be replaced but it's actually quite a different matter to find something so universally recognized. Everyone knows exactly what it's /supposed/ to look like and that's why it's useful as a reference to judge image processing.

Just my two cents.


The Nature family of journals has banned its use and recommends ‘Cameraman’, ‘Mandril’ or ‘Peppers’ as alternatives. In other words there is a well established alternative and you just didn't try googling for it.

https://www.nature.com/articles/s41565-018-0337-2


If only they included the sample images. The top Google hit for cameraman doesn't seem very useful.


The images came from this database: https://sipi.usc.edu/database/

Of Pirate, Cameraman and Peppers, only Peppers seems to remain:

https://sipi.usc.edu/database/database.php?volume=misc&image...

Or I'm being thick.

(Edited to add: Mandrill is still there, but the paper doesn't recommend it.)


I still feel like I must be missing something obvious, but I found the complete set of images here:

https://github.com/PLEX-GR00T/OpenCV_Tutorials/tree/master/I...


Sure, let's pick a new image. How about an astronaut.

https://commons.wikimedia.org/wiki/File:Mae_Carol_Jemison.jp...

This is a NASA photo so it's in the public domain in contrast to the copyrighted Lenna image.


According to wiki, Playboy (copyright owner) decided to not pursue copyright infringement for that centerfold.

    > Although Playboy is notorious for cracking down on illegal uses of its images, it has decided to overlook the widespread distribution of this particular centerfold.


I personally would have no problem if it were a different photo, but let's look at some theoretical objections to the one you provided.

An astronaut? Some would see that as classist. Blue and Orange? This impacts people with deuteranomaly and protanomaly color blindness. NASA? They are a capitalist space program that hired Nazis after WWII. A woman? Sitting? A person of color? Rectangular dimensions? etc... etc...

These are, of course, completely absurd objections. But someone somewhere would complain about your choice as firmly as you complain about the Lenna photo. And would you think their objections are as absurd as some think your objections are? Probably.

Standards are standards because they provide a solid reference that everyone can use and they should, by definition, be resistant to change. If it ain't broke, don't fix it.

BUT if we do change it, I vote for a cute cat photo. https://www.photos-public-domain.com/2018/04/26/cat-with-blu...


She wants us to find something else.

"I retired from modeling a long time ago. It's time I retired from tech, too... Let's commit to losing me." -- Lena Forsén


Sounds like a fake problem. Just include the original image for comparison. Problem solved.


There are a ton of universally recognized images.

Put the Linux mascot. Use the Windows XP background image. The Mona Lisa or the Sistine Chapel.



