Some people in that discussion wonder about "->", which is used for indirect addressing through a structure member.
When C added structures, which did not exist in B, it took the keyword "struct" and both "." and "->" from IBM's PL/I language, from which C also took some other features.
In general, almost every feature C added to B was taken either from PL/I or from Algol 68. The exceptions are "continue" and the generalized "for", which did not exist in any previous language.
(However, the generalized "for" of C was a mistake, because it complicates the frequent use cases in order to simplify seldom encountered use cases. The right way to generalize "for", i.e. with iterators, was introduced by Alphard in the same year as C, i.e. in 1974.) (Compare "for (I=0;I<N;I+=5) {" of C with "for I from 0 to N by 5 do" or "for I in 0:N:5 do" of previous languages. C requires typing a lot of redundant characters in the most frequent cases.)
The oldest symbol for indirection through a pointer (in the language Euler, in January 1966) was a raised middle dot (i.e. a point). This was before ASCII was in wide use, and ASCII did not include the raised middle dot (U+00B7), so it was replaced by the most similar ASCII character, "*".
Euler had used "@" for "address of", and indirection was a postfix operator, as it should be. Making "*" a prefix operator in B and C was a mistake, which forced the import of "->" from PL/I to avoid an excessive number of parentheses. Otherwise "(*x).y" would have been needed instead of "x->y". With a postfix "*", that would have been "x*.y", and "->" would not have been needed.
In CPL, the ancestor of BCPL and B, indirection was implicit, like with the C++ references. Instead of having an "address of" operator, CPL had a distinct symbol for an assignment variant that assigns the address of a variable, instead of assigning its value.
> the generalized "for" of C was a mistake, because it complicates the frequent use cases in order to simplify seldom encountered use cases
The worst offender is the quasi-obligatory "break" statement in "case" blocks. Fall-through cases are useful for things like lexers but probably not needed in 99.999% of other programs. I wonder how many millions of (wo)man hours have been wasted on debugging missing breaks. (yes, I know that linters exist)
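To make the bug class concrete, a minimal sketch (hypothetical names, not from the thread) of the kind of missing break that eats those hours:

    void init(void);
    void step(void);

    enum state { START, RUN };

    void tick(enum state s)
    {
        switch (s) {
        case START:
            init();     /* oops: missing break, silently falls into RUN */
        case RUN:
            step();
            break;
        }
    }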
> The worst offender is the quasi-obligatory "break" statement in "case" blocks.
This was (almost certainly) done to simplify the compiler. CASE in C is not actually a structured control construct despite its syntactic appearance, it's just a computed GOTO. This is what makes things like Duff's device [1] possible.
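For readers who haven't seen it, this is the classic formulation (lightly modernized; in the original, "to" is a memory-mapped device register, hence not incremented, and count is assumed positive). The switch jumps into the middle of the do-while, which is only legal because case labels are just goto targets:

    void send(short *to, short *from, int count)
    {
        int n = (count + 7) / 8;        /* number of do-while passes */
        switch (count % 8) {            /* jump into the middle of the loop */
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
        }
    }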
I think computed-GOTO is easier to understand and more flexible than Duff's device, but computed-GOTO is not part of the C standard. It is sometimes available as a non-portable extension, e.g. https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html. Anyone know why it is not included in C?
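A minimal sketch of that GCC extension (non-portable; see the link above): && takes the address of a label, and goto * jumps to it. The opcode stream here is made up for illustration:

    #include <stdio.h>

    int main(void)
    {
        /* hypothetical opcode stream: inc, inc, dec, halt */
        static void *dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
        int program[] = { 0, 0, 1, 2 };
        int acc = 0, pc = 0;

        goto *dispatch[program[pc++]];
    op_inc:  acc++; goto *dispatch[program[pc++]];
    op_dec:  acc--; goto *dispatch[program[pc++]];
    op_halt:
        printf("%d\n", acc);    /* prints 1 */
        return 0;
    }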
Yeah, I remember having my mind blown when I first discovered Duff's Device.
My mental picture of what C-compilers are up to turned out to be a lot more sophisticated than reality.
But having written a few Forth-interpreters, it makes total sense to me. Treating the input as a mostly unstructured stream of tokens is very convenient.
What I find inconvenient is that a case body can be written with or without curly brackets. I'd rather they were mandatory, like when an if statement applies to more than one statement. The downside is that you need more indentation to have a logical curly-bracket structure.
Choices/options for the same result tend to make code less readable.
    case 1:
        ... code ...
        goto case;
    case 2:
        ... code ...
which makes it clear. No new keywords are needed; C could adopt this easily. All those /* fallthrough */ comments, warnings and compiler switches would just go away.
That (also) came from BCPL. (BCPL had a separate `endcase` statement; `break` applied only to loops. It also had `docase ‹expr›`, which is the equivalent of a loop `continue`, re-entering the `switchon`.)
As I recall, B was modelled on an early BCPL compiler that didn't have an ENDCASE statement, hence the overloading of break for exiting a switch block.
You don't really need a linter. It's just a warning setting, see:
-Wimplicit-fallthrough
From the compiler docs. It's just not a problem. If you waste time once and learn about warning options in the compiler you will be better off anyway.
It's like the panic over assignment being an expression - something that maybe bites you once, and then you learn to read the compiler warnings and it never causes problems again.
If you don't want to read warnings, then there is the [[fallthrough]] attribute.
No separate linters needed, but details differ between compilers.
GCC has a warning in the '-Wextra' warning set, Clang requires the explicit option '-Wimplicit-fallthrough', MSVC is completely silent (apparently it's in the CppCoreCheck rules though).
This should really be in the default warning set with an annotation that the fallthrough is intended (unless the case-branch is completely empty).
Note that gcc and clang will warn you about fallthrough statements if you use -Wextra, and that you then have to insert a /* fallthrough */ comment to make your intent clear.
Pity the (wo)man who does not use -Wall -Wextra for (s)he is truly mistaken
(For backwards compatibility, it still must fallthrough with or without the attribute; the attribute just signals programmer intention and silences the warning.)
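For illustration, the attribute (standardized in C23, mirroring C++17; GCC and Clang also accept __attribute__((fallthrough)) as an extension) looks like this - the handler names are made up:

    void handle_a(void);
    void handle_b(void);

    void dispatch(char c)
    {
        switch (c) {
        case 'a':
            handle_a();
            [[fallthrough]];    /* deliberate: 'a' also runs the 'b' code */
        case 'b':
            handle_b();
            break;
        }
    }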
Incidentally, lint was invented exactly because of this. Rather than improving the way C was, they decided it was better, in the UNIX tradition, to add yet another tool - which those millions would rather not use.
> Otherwise "(*x).y" would have been needed instead of "x->y". With a postfix "*", that would have been "x*.y", and "->" would not have been needed.
I mean technically the compiler could just have been less stupid and made `.` auto-deref, it’s not like C has operator overloading so the LHS is either a struct or a pointer, not both.
It is not that simple, because in C you may chain a great number of prefix and postfix operators.
It is frequent to have multiple "*", "[]", "." and "->".
Now you have the simple rule that the postfix operators are applied first, from left to right, then the prefix operators, from right to left.
If special rules had been invented to avoid writing the parentheses in "(*x).y", then after adding several "*", "[]" and "." it would have become impossible to understand the right evaluation order.
When all of "*", "[]" and "." are postfix, they are just executed in order from left to right, which is easy to understand.
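A small C illustration of that rule, with made-up types; the comments also note what the hypothetical postfix "*" spelling from above would look like:

    struct node { struct node *next; int vals[4]; };

    int example(struct node *p)
    {
        /* postfix first, left to right: ((p->next)->vals)[1] */
        int a = p->next->vals[1];
        /* prefix * binds last: *(p->next->vals), i.e. element 0,
           not (*p)->next->vals; with a postfix "*" this could have
           been written p*.next*.vals[0] with no precedence surprise */
        int b = *p->next->vals;
        return a + b;
    }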
While sometimes auto-dereferencing a pointer would be convenient, in C you frequently want to get the value of a link, or to do pointer arithmetic. With auto-dereferencing, some new ways of using an "address of" operator would have to be invented, and they are unlikely to be simpler than the current rules.
As I have said, the two languages that introduced the concept of pointers were Euler, with explicit dereferencing, and CPL, with implicit auto-dereferencing.
Both are valid ways to design a programming language, but in both cases a lot of rules must be added to cover all the use cases. Those rules would have to be more complex in the languages with auto-dereferencing, so such languages usually solve the problem lazily, by just prohibiting some uses, e.g. pointer arithmetic.
I believe GP means that "a.b" should be enough for pointers as well, you don't need "a->b" or "(*a).b". It's not like pointers can have fields in C, so the meaning would have been unambiguous.
When stepping up from C to C++, it feels ultimately illogical to use an ampersand to indicate a reference in a declaration, while in an assignment or actual parameter it means address-of. Were they out of special chars?
Sorry, I had posted a partial reply before finishing writing the complete one.
As I have written above, a language with implicit dereferencing needs additional syntactic means for denoting the cases when the values of the pointers are needed, e.g. for doing pointer arithmetic, which is frequent in C.
If dereferencing had been made implicit, that would have required a large number of changes in the language. C++ did introduce pointers with implicit dereferencing, i.e. what C++ calls references, but because the other syntax changes that would be needed have not been made - they would break compatibility - C++ references can replace C pointers in only a subset of their uses.
No one here is proposing implicit dereferencing (everywhere), the proposal is just to make the syntax `pointer.field` have the exact same meaning that `pointer->field` has in C today. In all other places, the current syntax and semantics would stay the same.
Since in all situations where pointer->field is valid, pointer.field is an explicit error in C today, this would have been very much doable without major changes in compiler or language implementation.
Of course, at C's level of abstraction, and given the speed limitations of the day, the syntax difference between -> and . may well be argued to have been helpful instead of harmful.
And yes, the same could not be said for C++, where you can implement dereferencing for your own type and get a variable where both `a.b` and `a->b` are valid and have different meanings. This is anyway not a proposal for changing C (or C++) today, merely a "what if" discussion about how C could have been designed differently.
While that’s technically true, it does use a severely cut-down version thereof which is worse in every way: it only supports paragraphs, code blocks, and emphasis, so the ability to format is extremely limited, but the emphasis can still screw you over, which requires an explicit preview (or posting) to notice.
And I think escaping emphasis was only introduced somewhat recently? I do remember that for the longest time you basically had to trick HN into not breaking your comments by using a different character instead.
HN uses asterisks for italics, which sometimes causes a comment to go haywire because someone uses one to reference a footnote or otherwise drops it in mid-sentence, with a matching closing one, not intending italics.
(Up-thread I think it probably was like that when they commented, but has since been fixed.)
Yes, it's also automatic in some cases - originally my comment had an attempt at a joke example of it, but it was caught and didn't work. It was definitely improved relatively recently.
I think you're still misunderstanding the proposal. What are the types of your variables? This discussion makes no sense without type information. We're defining the meaning of . for pointers here. You seem to be misunderstanding the proposal as being purely a textual transformation that ignores types?
The proposal is literally: "If you see . and the left operand is a pointer, pretend the . was -> instead, because otherwise the code is already invalid."
OK, I believe that you are right: if implicit dereferencing had been done only when it is the only interpretation that leads to a valid expression, then "->" would not have been necessary.
However, I assume that this would have been too complex a solution for compilers that had to work in a few tens of kilobytes of memory, while a postfix "*", as already used a decade before C, would have been a trivial solution.
> However, I assume that this would have been too complex a solution for compilers that had to work in a few tens of kilobytes of memory
It's literally the same complexity as the type checking compilers already do to tell you that your `.` does not work because the LHS is a pointer not a struct.
It is not the same complexity, because syntax checking stops immediately at an error.
To determine whether implicit dereferencing may be applied, more analysis has to be done, because the operand may be not only a pointer to a structure, but a pointer to a pointer to a structure and so on, so multiple implicit dereferences may be needed to obtain a valid expression.
However I agree that the difference in complexity is not big.
`a.b` is always valid syntax, there's no way to know if it is a valid C instruction until you resolve the types of a and b. Assigning it valid semantics at that point is exactly as easy as assigning it error semantics.
And no, this would not perform multiple levels of dereferencing any more than -> does today. You could have literally find&replaced every use of -> with . and every C program would have had the exact same semantics. `struct point **a; a.x = 1` would throw the exact same compilation error that `struct point **a; a->x = 1` throws today. The only difference would be that `struct point *a; a.x = 1;` would write 1 to the field x of the object pointed to by a, instead of throwing an error that says "object of type struct point* has no field named x".
> As I have written above, a language with implicit dereferencing needs additional syntactic means for denoting the cases when the values of the pointers are needed, e.g. for doing pointer arithmetic, which is frequent in C.
No it doesn’t. Let’s take this example in valid C:
struct my_struct { int field; };
struct my_struct s = { .field = 42 };
struct my_struct *p = &s;
printf("direct : %d", s.field);
printf("pointer: %d", p->field);
printf("address: %x", p);
What is being proposed here is to make that code valid:
struct my_struct { int field; };
struct my_struct s = { .field = 42 };
struct my_struct *p = &s;
printf("direct : %d", s.field);
printf("pointer: %d", p.field); // note the use of a dot here
printf("address: %x", p);
That is, the naked p is still to be interpreted as what it is: a pointer. It’s just that when we write `a.b`, the language would first check the type of `a`, then dereference it as many times as necessary to get to the underlying struct, and then access its field. For instance:
struct my_struct { int field; };
struct my_struct s = { .field = 42 };
struct my_struct *p = &s;
struct my_struct **pp = &p;
struct my_struct ***ppp = &pp;
Now let’s see how this automatic indirection would work:
// All would print the same value
printf("%d", ppp.field);
printf("%d", pp .field);
printf("%d", p .field);
printf("%d", s .field);
// We can still use explicit indirections
printf("%d", (*ppp ).field);
printf("%d", (**ppp ).field);
printf("%d", (***ppp).field);
printf("%d", (*pp ).field);
printf("%d", (**pp ).field);
printf("%d", (*p ).field);
We can still get to the actual addresses no problem:

    printf("%p", (void *)ppp);
    printf("%p", (void *)pp);
    printf("%p", (void *)p);
The kicker here is that the decision on whether an access to a struct member requires dereferencing the pointer or not, is not done at parsing time. It’s done at type checking time. And by the way, in standard C the decision to give you an error or not is already done at type checking time. All this to say, this would be a fairly benign change to compilers.
Now would users get confused? Possibly. With the conflation of pointers and arrays, the following would be equivalent:
array.field
(*array).field
array[0].field
Looks nifty to some perhaps, but it hides which of the three some people really meant.
There is something to the "explicit is better than implicit" mantra. I appreciate the C way because if I write a.b and it's an error, it clarifies what's going on in my head (I now know a is a pointer, not an object).
> There is something to the "explicit is better than implicit" mantra.
I’ll entertain this when C fixes (or at least removes) integer promotion, which is a source of far more bugs and misunderstandings than this could ever be: `.` auto-deref-ing does not actively undermine what little type system C has.
Integer promotion will probably never be removed from the fundamental types (char, signed char, short, etc.). You can use `_BitInt(N)` types in the future, which don't follow the implicit integer promotion rule. There are at least two compilers today that implement it (clang and SDCC).
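To make the trap concrete, a standard example of promotion biting (assuming 32-bit int; illustrative, not from the thread):

    #include <stdio.h>

    int main(void)
    {
        unsigned char a = 0xFF;
        /* ~a promotes a to int first: ~0x000000FF is 0xFFFFFF00, not 0 */
        if (~a == 0)
            puts("no promotion");
        else
            puts("promoted to int");    /* this branch is taken */
        return 0;
    }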
> Integer promotion will probably never be removed from the fundamental types
Oh let me assure you, I have no actual expectation that C would ever change in such a way.
Although it should be noted that both suggestions are entirely syntactic and could be gated behind a per-file (or even per function) stricture a la strict mode.
Man, this is a completely unrelated problem. Implicit operator overloading is just bad design. Integer promotion has advantages and disadvantages, especially in a language like C where you often use 8-bit or 16-bit types for memory-optimization/alignment reasons.
You assert that you do not want this on grounds of clarity. It is not an unrelated problem to point out that C has numerous obscurity issues significantly worse than this could ever be in the language right now.
> Implicit operator overloading is just bad design.
That's at best a bunch of words arranged nonsensically, and at worst an assertion that you want to remove arithmetic operators from the language?
> Integer promotion has advantages and disadvantages, especially in a language like C where you often use 8-bit or 16-bit types for memory-optimization/alignment reasons.
It's mostly a major actual source of obscurity and bugs.
You are not arguing in good faith, but just in case I will point out what you are missing:
1) Implicit operator overloading is bad design in the case of pointers and objects, because a.b and a->b are both readable, easy to type and short.
2) Integer promotion is not as simple, because requiring explicit promotion makes an unholy unreadable mess out of your code in the simplest of cases.
2) is demonstrated many times by mongo code lines in Java with all the explicit casts. This is also the reason Python has implicit promotion for its number type. You sacrifice something to get something, unlike in 1), where you only lose, for no gain whatsoever.
> 1) Implicit operator overloading is bad design in the case of pointers and objects, because a.b and a->b are both readable, easy to type and short.
That's just a baseless assertion. Here, I can do the same: implicit overloading is a great design in the case of pointers and objects, because a.b is always unambiguous and uniform, and -> is a harder-to-read-and-type extra operator with no justification.
And then obviously your assertion can be used exactly the same way to assert that every numeric type should have its own set of arithmetic operators; after all, u4+ is also readable, easy to type, and short.
> 2) Integer promotion is not as simple, because requiring explicit promotion makes an unholy unreadable mess out of your code in the simplest of cases.
Integer promotion is so much "simpler" that it's literally a never-ending source of bugs.
> This is also the reason Python has implicit promotion for its number type.
Python does not have implicit promotion for its number type; at best Python 2 had that, for performance reasons (definitely not to avoid "mongo code lines with all the explicit casts", which it could not care less about), and in reality even that was not the case, because it does not corrupt your data upfront the way C does.
The thing is, the `->` operator in C is completely unnecessary. `.` can be used instead. The compiler can distinguish by looking to see if the lvalue is a pointer or a value.
The result makes it much easier to refactor code. Ever try replacing a value with a pointer in C? Arrghh.
The stack is not some magical location in RAM that is faster, and in any case you can have a pointer to the stack.
It's super common to define a struct on the stack and then pass a pointer to an init function to initialize the value that is on the stack.
Sometimes pointers are faster because you don't have to copy the entire struct.
What largely determines the performance of values vs. pointers is whether that location in RAM is cached, and whether you need to allocate memory on the heap.
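The idiom in question, sketched with hypothetical names:

    struct widget { int id; int state; };

    void widget_init(struct widget *w)
    {
        w->id = 0;
        w->state = 1;
    }

    void use(void)
    {
        struct widget w;     /* lives on the stack */
        widget_init(&w);     /* a pointer to a stack object is fine */
    }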
The prefix * in C comes from prefix ! in BCPL. C’s weird a[b] == b[a] is also a hidden BCPLism.
BCPL has infix ! as well as prefix ! so you write an array index expression like array!index instead of array[index]. Infix ! is commutative like addition and [] in C.
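The C side of that equivalence, for the curious: a[b] is defined as *(a + b), so the operands commute:

    #include <assert.h>

    int main(void)
    {
        char s[] = "hello";
        assert(s[1] == 1[s]);    /* both mean *(s + 1) */
        return 0;
    }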
You declare structures in BCPL by defining a constant for the offset of each member, so you can write object!MEMBER somewhat like C object->member. The semantics of -> in early C were very similar to BCPL infix ! except that members had types as well as offsets, but like BCPL there was nothing to tie a particular member to a particular structure as there is in modern C.
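A rough C rendering of that BCPL convention (names invented): members are just offset constants, tied to no particular structure:

    enum { NEXT, VALUE };        /* member "offsets", BCPL-style */

    int example(void)
    {
        int cell[2] = { 0, 42 };
        int *node = cell;
        return node[VALUE];      /* BCPL would write node!VALUE */
    }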
There’s a certain elegance to BCPL’s syntax that you don’t get from prefix-only or postfix-only indirection operators. C might have been better if it had stuck closer to BCPL in this respect, but sadly * conflicts with infix multiplication.
Some readers might also have used BCPL-style syntax for WIMP programming in BBC BASIC on RISC OS.
> The right way to generalize "for", i.e. with iterators, was introduced by Alphard in the same year as C, i.e. in 1974.) (Compare "for (I=0;I<N;I+=5) {" of C with "for I from 0 to N by 5 do" or "for I in 0:N:5 do" of previous languages.
Really? How do you do C's equivalent of `for(i = 0; i < N && j < M; i++, j++)` with your `for I from 0 to N` syntax? The for loop in C is extremely flexible and it captures the idea of the loop perfectly: there is the initialization block, the exit condition and the iteration. In other languages at that time the for loop was written in terms of a range, but without a strong range abstraction such a loop is really primitive and limited in applicability.
> In other languages at that time the for loop was written in terms of a range, but without a strong range abstraction such a loop is really primitive and limited in applicability.
That's a good thing: it gives you a clear syntactic marker for "this loop definitely terminates". There's already a full power anything-goes looping construct: `while`.
Using the for(;;) loop in Awk, and the C preprocessor, in a project called cppawk, I created a loop facility that has multiple clauses of different types, which can combine in parallel or as a Cartesian product, and are programmer-definable.
The above manual page includes an example of how to define an alpha_range clause that iterates over string ranges like alpha_range(var, "000", "999") or alpha_range(var, "AAA", "ZZZ").
There is a conditional clause if(...) which takes a condition and another clause as arguments. The iteration of the other clause is suspended while the condition is false.
At the shell prompt: add the values of the odd integers in the range 1 to 50.
$ cppawk '
> #include <iter.h>
>
> BEGIN {
> loop (range (i, 1, 50),
> if (i % 2 == 1, summing (sum, i)))
> ;
> print sum
> }
> '
625
Though "if" is an Awk keyword, this isn't a problem, because this if is a clause, not an ordinary expression. Moreover, though the clause is defined by macros, none of them is called "if"; they merely have "if" embedded in their names.
I interpreted that line as two separate claims: firstly, that the correct generalisation of for loops is iterators, which were introduced in Alphard; secondly, an unrelated point about how the C-style generalised for loop is more verbose for the simple case than existing syntaxes.
That is, the commenter was not asserting the supremacy of "from 0 to N" as a for-loop construct, but rather asserting the supremacy of iterators, and pointing out how poor the C-style syntax is even compared to its less-generalised cousins.
Before C and Alphard, all the "for" loops had control variables that took all values of some arithmetic progression.
Even today, such simple loops over arithmetic progressions make up an overwhelming majority of all "for" loops.
The C syntax made writing these simple "for" loops more difficult, by having to write a lot of redundant symbols, instead of writing the minimum number of separators. Also for reading, the redundant symbols obscure the meaningful text.
The C syntax allows the writing of "for" loops where the values of the control variable are not taken from a progression, but are, for instance, the values of the links of a linked list.
This kind of "for" loops can be written in a simple way by using iterators, without complicating the syntax of the loops with simple arithmetic progressions.
In modern C++ there is no longer any case when you would want to use the C kind of "for", but the kinds of loops used by languages like C++ already existed in languages introduced at about the same time as C, e.g. Alphard and Clu.
Your example would be "for i in 0:N for j in 0:M do".
Equivalent syntax was used in Fortran for writing loops in a simpler way than in C, already 20 years before C ("do 10 i=0,N" and "do 20 j=0,M").
The only case when the C "for" is useful is when the third operation is neither an addition nor a subtraction. The most frequent such case is when the operation is a link dereference, for accessing a linked list.
Such cases, for visiting all members of some non-array aggregate data, e.g. linked lists or trees, are solved more clearly with iterators.
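For concreteness, the linked-list "for" being discussed, with a hypothetical struct:

    struct node { struct node *next; int value; };

    int sum_list(const struct node *head)
    {
        int sum = 0;
        /* the third clause is a link dereference, not arithmetic */
        for (const struct node *p = head; p != NULL; p = p->next)
            sum += p->value;
        return sum;
    }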
The for-loop in C is still primitive and limited in applicability, while it makes the most frequent use case much more difficult.
For example, it is normal practice to do some of the initialization outside of the loop. It is normal to deal with finishing the loop by using goto or if-branching after the loop (think of a search that can be successful or not). It is normal to ignore the update part of the loop and do updates in the body, because either the updates are too big to squeeze into the for header, or you need to do something between updating your counter and checking whether to run another iteration.
With all this said, I know two other approaches to the problem.
1. The Common Lisp approach: create a really flexible loop clause, allowing any loop to be represented in a structural way. I'd recommend looking at the cl-iterate package for CL to see what happens to maniacs who choose this way.
2. A less ambitious approach: define some simple loops for the most frequent cases (while, range iteration, iteration by iterator) and a loop for the general case, looking like "loop { iteration }".
I personally prefer the second approach. I loved the C way, then I was a big fan of the Lisp way, but now I believe that it is silly to create a whole new language just to write iteration, and a half-baked attempt to cover more cases with the for-loop than just range iteration has more downsides than upsides.
What you are doing there is not a "for". Not in the mathematical sense of "for all values in a set". The semantics of the word "for" should matter.
If you want to iterate two variables i and j, it is much better to do it explicitly. It will be much more readable.
A better syntax for loops with geometric progressions like this would be obtained by modifying the syntax for writing arithmetic progressions (e.g. "a:b:c") by using a different separator symbol.
For example, if the separator for geometric progressions were ":>", your loop example would become "for bit in 1:>128:>2". Another example, with a (non-ASCII) separator, would be "for bit in 1⋮128⋮2".
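For reference, the C loop being compared is presumably something like this doubling progression:

    #include <stdio.h>

    int main(void)
    {
        /* 1, 2, 4, ..., 128: eight iterations */
        for (unsigned bit = 1; bit <= 128; bit *= 2)
            printf("%u\n", bit);
        return 0;
    }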
Heh, hopefully not too often. First I thought this loops 128 times. Then 7 times. Then 8 times. Then I thought "undefined behavior" if bit is a uint8_t. But then I thought, I don't have freaking clue because I can't keep all of C's implicit casting and arithmetic operation rules in my head.
I have the feeling the generalized "for" is nice because it allows multiple initializers in a compact manner; how does "for I from 0 to N by 5" deal with this?
Or we could use a different operator for pointer indirection and multiplication? There’s also the pointer/array conflation that could cause some confusion.
>because it complicates the frequent use cases in order to simplify seldom encountered use cases.
Putting complex logic in the three places of a for loop is something we all frown on now because of readability and bugs, but back then it was almost more common than just iterating from 1 to N. Same for dropping through case statements - frowned on now (rightly) because of gotcha bugs, but used heavily.
>Making "*" a prefix operator in B and C was a mistake
C dominates for the same reason it is terrible: it gets shit done for some value of "shit". In this light, there are no mistakes. There are other languages that don't make these "mistakes" and, behold, nobody wrote the majority of the world's operating systems and software in them. I've used C when the viable options were hand-crafted assembly or C. There are no mistakes here, only practicalities.
> Compare "for (I=0;I<N;I+=5) {" of C with "for I from 0 to N by 5 do" or "for I in 0:N:5 do" of previous languages. C requires typing a lot of redundant characters in the most frequent cases.
Um, "for (I=0;I<N;I+=5) {" is fewer characters than "for I from 0 to N by 5 do".
Sure. And the observant will note that I didn't claim C was longer than that version. The other one, "for I ∈ 0:N:5 do", is the shorter one. Even if you replace "∈" with "in", it's still shorter.
Still... when did that syntax come out? Was it in Algol 68, or PL/I? It seems a bit unfair to complain about C being verbose if it was shorter than anything else available at the time.
"∈" was actually proposed for Alphard, at the same time with C.
When later implemented, including in an Algol 68 variant that inspired the UNIX Bourne shell, "∈" was replaced with "in", which could be written with ASCII.
Algol 68 could use either keyword pairs, like "do" and "od" (which became "do" and "done" in the UNIX Bourne shell) or, optionally, various kinds of parentheses in their place, for conciseness.
The notation "0:N:5" for arithmetic progressions, or "0:N" when the step is 1, is ancient. Even Fortran used it, but with the mistake of using a comma instead of a colon, which introduced a syntactic ambiguity, because the comma was also used for other purposes. Using the colon started with Algol 60, but that language preferred keywords to symbols in the arithmetic progressions used inside "for", so it did not use the notation consistently.
Fortran in 1954, or the language of Heinz Rutishauser in 1951 (the first with a "for" statement), were already much more concise than C.
It’s not. Not for humans (the extra parenthesis is just useless clutter), and not for machines (I’ve written parsers, both alternatives are as easy to implement).
Yup. But as I understand it @ wasn't available on the terminals used to create C. Likewise #. They were used for later features, once they could actually be typed. Happy outcome, that there were characters available for fresh features, since C used absolutely everything available on the original (ADM?) terminals.
Another curious thing: Unix came shortly after and was entirely typed in lower case. Because those same terminals had only one font (font ROMs were tiny back then, and expensive), it looked like UPPERCASE, but in fact the keyboard produced lowercase. Once better terminals were used to edit the code, it was noticed for the first time(?) that it had all been edited as lowercase.
Or so the story went, back in those days, when it was all new.
The DEC PDP-11 assembler used the @ symbol for dereferences of registers ("@Rn" or "(Rn)" dereferenced a register, for example).
However, in Unix, the terminal convention was to use the @ symbol to delete the current line, and they didn't have a DEC assembler yet, so they used the asterisk (*) instead in B, as well as in the Unix PDP-11 assembler, which was written in B.
As far as I know, the @ on modern keyboards (above the 2) comes from the IBM Selectric[1], released in 1961; the Model D and the VT-52 both copied the keyboard layout. The 1963 ASCII draft had @, despite not having lowercase yet.
Prior to IBM, it was on at least the Underwood typewriters[2], so it was never quite absent.
1963 ASCII was upper-case only, and many early terminals (notably for Unix, including the Teletype 33) were upper case only. `stty lcase`, which maps upper-case input to lower case, existed early — https://www.tuhs.org/cgi-bin/utree.pl?file=V1/man/man2/stty....
Instance-variable heavy Ruby code does look rather like someone has sneezed on the monitor, so I think it would have been a bad move, aesthetically ...
At the very least, & and * are in the same place on many keyboards; I've been bitten by muscle memory many times going back and forth between US/UK/FR keyboard layouts.
For reference, it's above the ' and between the ;: and #~ keys on my current board, rather than on shift-2 (where " lives).
Maybe this would have stayed consistent worldwide if it had been used in C.
The oldest use of @ to mean "at" can be traced to typewriters for commerce around 1880, where it was used as in "5 apples @ 10 p", meaning "5 apples at 10 pence each."
At least that's what the German Wikipedia claims, while the corresponding paragraph in the English Wikipedia is short.
It's also the "a commercial" in Quebec French [1]. Despite lacking the accent on the A (à), the intention of the A is to stand in for the single letter French word "à" (at) [2].
More recently (but still 1968, before email): "In ALGOL 68, the @ symbol is brief form of the at keyword; it is used to change the lower bound of an array. For example: arrayx[@88] refers to an array starting at index 88."
There are claims it comes from the Latin "ad", similar to how & comes from the Latin "et", but the etymology is far less certain, with the French "à" being another possible source.
What I can't find is when it was first used in the US for indicating home-team for sports. E.g. [1] where there is vs. or @ depending on whether it is a home or away game. I suspect it's relatively modern, but not sure how far back it goes.
My stepdad is from NJ and pronounces toilet as tore-let. He’s a real character and now I don’t know if you’re serious or not and it’s hilarious to me as a non-NJ person who has visited and met his side of the family.
Same! Someone should post a video/audio recording of the pronunciation. Even trying to imitate different accents I'm familiar with, I can't find one where the two words sound even remotely similar lol
Interesting, I've always thought it's & pronounced as Latin `et', which sounds like `at', so &x gives a pointer that "points at" x.
Also, the "near on the keyboard" idea doesn't make sense anyway because C was developed on a bit-paired tty33 where the two symbols are not that close to each other.
And C++/CLI uses ^ for "managed pointers" (pointers to .NET objects) and % for "managed references", which means there are altogether 4 ways to declare various types of pointers, which is super fun.
I have a pet theory that '#' and '*' have such prominent roles in C because Thompson and Ritchie developed B at Bell Labs in 1969, when the first push-button phones were appearing with '#' and '*' buttons.
TIL "ampersand" etymology.
from the Oxford dictionary:
"Origin: late 18th century: alteration of and per se and ‘& by itself is and ’, formerly chanted as an aid to learning the sign."
I've taught C and C++ extensively commercially, and I really can't remember anyone having real problems with understanding pointers, despite some trepidation they might have had when the topic was first introduced.
There are usually patterns that you match the code you're reading against. For example, the simple pattern of

    int f(int x, int y, int len, int *z);

and you immediately think "oh OK, they're doing something with x and y, and z is the out parameter; the function is also likely going to allocate". Usually it matches what you see in the function body and it's all good. In (hopefully) rare cases you view it with such a lens and something doesn't make sense, and you have to scrutinise the code for what it's actually doing. This is all without meaningful variable names; proper names make it much more straightforward to read, e.g. calling the function vec_add.