There are not really any newer instruction sets: we are locked into the von Neumann architecture and, until we move away from it, we will continue to move data between memory and CPU registers (or register to register, etc.), and to add, shift and test the condition flags of arithmetic operations – the same instructions across pretty much any CPU architecture relevant today.
So we have:
CISC – which is still used outside the x86 bubble;
RISC – which is widely used;
Hybrid RISC/CISC designs – excluding x86, that would be the IBM z/Architecture (i.e. mainframes);
EPIC/VLIW – which has been largely unsuccessful outside DSPs and a few niches.
They all deal with registers, data movement and condition testing, though, and one can't say that an ISA 123 that effectively does the same thing as an ISA 456 is older or newer. SIMD instructions have been the latest addition, and they also follow the same well-known mental and compute models.
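To make the "same instructions everywhere" point concrete, here is a trivial C/C++ loop; the mnemonics named in the comment are illustrative only, since the exact spellings differ between x86-64, AArch64, RISC-V and so on:

    // Sums an array. On any mainstream ISA this lowers to the same handful of
    // operations: load an element, add it to a register, bump the index,
    // compare against n, branch back. With optimisation (e.g. -O3) compilers
    // typically emit the SIMD variants of the same operations -- wider loads
    // and wider adds -- so the mental model stays identical; only the register
    // width grows.
    int sum(const int *a, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i) {
            s += a[i];
        }
        return s;
    }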
Radically different designs, such as the Intel iAPX 432 and Smalltalk or Java CPUs, have not received any meaningful acceptance, and it seems that the idea of a CPU architecture tied to a higher-level compute model has been eschewed in perpetuity. Java CPUs were the last massively hyped-up attempt to change that, and that was 30 years ago.
What other viable alternatives outside the von Neumann architecture are available to us? I am not sure.
Modern GPU instructions are often VLIW and the compiler has to do a lot of work to schedule them. For example, Nvidia's Volta (from 2017) uses 128 bits to encode each instruction. According to [1], the 128 bits in a word are used as follows:
• at least 91 bits are used to encode the instruction
• at least 23 bits are used to encode control information associated with multiple instructions
• the remaining 14 bits appear to be unused
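As a quick sanity check of that bit budget, a small sketch; only the three totals come from [1], and the container layout below is an assumption for illustration, not a claim about where the fields actually sit in the word:

    #include <cstdint>

    // Bit budget of a 128-bit Volta instruction word, per [1].
    constexpr int kInstructionBits = 91;  // "at least": opcode, operands, etc.
    constexpr int kControlBits     = 23;  // "at least": scheduling/control info
    constexpr int kUnusedBits      = 14;  // appeared unused in the microbenchmarks

    static_assert(kInstructionBits + kControlBits + kUnusedBits == 128,
                  "the three budgets account for the full 128-bit word");

    // An opaque 128-bit word; the real field positions are not public.
    struct VoltaWord {
        std::uint64_t lo;  // bits  0..63
        std::uint64_t hi;  // bits 64..127
    };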
AMD GPUs are similar, I believe. VLIW is good for instruction density. VLIW was unsuccessful in CPUs like Itanium because the compiler was expected to handle (unpredictable) memory access latency. This is not possible, even today, for largely sequential workloads. But GPUs typically run highly parallel workloads (e.g. MatMul), and the dynamic scheduler can just 'swap out' threads that wait for memory loads. Your GPU will also perform terribly on highly sequential workloads.
[1] Z. Jia, M. Maggioni, B. Staiger, D. P. Scarpazza, Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking. https://arxiv.org/abs/1804.06826
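To illustrate the 'swap out threads' point, a minimal CUDA sketch; the kernel, sizes and launch parameters are purely illustrative and not tied to Volta. The kernel is memory-bound, but because far more warps are resident than can execute at once, the hardware scheduler keeps issuing from ready warps while others wait on their loads:

    #include <cuda_runtime.h>

    // Memory-bound kernel: each thread does one gather load and one store.
    // While a warp waits on its load, the scheduler issues instructions from
    // other resident warps, hiding the memory latency.
    __global__ void gather_scale(const float *in, const int *idx,
                                 float *out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            out[i] = 2.0f * in[idx[i]];  // latency of in[idx[i]] is unpredictable
        }
    }

    int main(void) {
        const int n = 1 << 24;  // ~16M elements: far more threads than ALUs
        float *in, *out; int *idx;
        cudaMallocManaged(&in,  n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        cudaMallocManaged(&idx, n * sizeof(int));
        for (int i = 0; i < n; ++i) { in[i] = 1.0f; idx[i] = (i * 2654435761u) % n; }

        int block = 256;
        int grid  = (n + block - 1) / block;  // tens of thousands of blocks
        gather_scale<<<grid, block>>>(in, idx, out, n);
        cudaDeviceSynchronize();

        cudaFree(in); cudaFree(out); cudaFree(idx);
        return 0;
    }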
Personally, I have a soft spot for VLIW/EPIC architectures, and I really wish they had been more successful in mainstream computing.
I didn't consider GPUs precisely for the reason you mentioned – their unsuitability for sequential workloads, which is what most applications that end users run look like, even though nearly every modern computing contraption in existence has a GPU today.
One most assuredly radical departure from the von Neumann architecture that I completely forgot about is the dataflow CPU architecture, which is vastly different from what we have been using for the last 60+ years. Even though there have been no productionised general-purpose dataflow CPUs, the idea has been successfully implemented for niche applications, mostly in networking. So, circling back to the original point raised, dataflow CPU instructions would certainly qualify as a new design.
The reason that VLIW/EPIC architectures have not been successful for mainstream workloads is the combination of:
• the "memory wall",
• the static unpredictability of memory access (see the pointer-chasing sketch below), and
• the lack of sufficient parallelism for masking latency.
Together, those make dynamically scheduling instructions just much more efficient.
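A small illustration of the second point (the list type here is hypothetical): every load depends on the previous one, so a static VLIW/EPIC scheduler has no independent work to pack into the bundle slots and no way of knowing at compile time whether a given load hits the cache or stalls for hundreds of cycles on DRAM. An out-of-order core at least discovers that dynamically, and a GPU could only hide it if it had thousands of such traversals to interleave.

    struct Node {
        long         value;
        struct Node *next;  // the address of the next load is only known
    };                      // after the current load completes

    // Serial pointer chase: each iteration's load depends on the previous one,
    // so there is nothing for a static scheduler to overlap with the miss.
    long sum_list(const struct Node *n) {
        long s = 0;
        while (n) {
            s += n->value;
            n = n->next;
        }
        return s;
    }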
Dataflow has been tried many, many times for general-purpose workloads.
And every time it has failed for general-purpose workloads.
In the early 2020s I was part of an expensive team doing a blank-slate dataflow architecture for a large semi company: the project got cancelled because the performance figures were weak relative to the complexity of the micro-architecture, which was high (hence expensive verification and high area). As one of my colleagues on that team put it: "Everybody wants to work on dataflow until he works on dataflow." Regarding the history of dataflow architectures, [1] is from 1975, so half a century old this year.
Nope, not until now. It seems to be a much more modern take on the idea of an object-oriented CPU architecture.
Yet there is something about object-oriented ISAs that has made CPU designers eschew them consistently. Ranging from the Intel iAPX-432, to the Japanese Smalltalk Katana CPU, to jHISC, to another, unrelated, Katana CPU by the University of Texas and the University of Illinois, none of them has ever yielded a mainstream OO CPU. Perhaps modern computing is not very object-oriented after all.