
I believe GP means that "a.b" should be enough for pointers as well, you don't need "a->b" or "(*a).b". It's not like pointers can have fields in C, so the meaning would have been unambiguous.


I really prefer that '.' and '->' remain separate, because it makes it immediately clear when a potentially expensive pointer dereference is happening:

    x = a->b->c->d->e;
...is a pointer-hunting-nightmare / potential cache-miss-galore.

    x = a.b.c.d.e;
...no problem, the whole chain is resolved at compile time into a single offset and results in at most one memory access.

    x = a.b.c->d.e;
...makes it immediately clear that c->d is a pointer access and everything else isn't.
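To make the difference concrete, here is a minimal sketch in C. All struct and function names are made up for the illustration; it only assumes that the dot chain goes through structs nested by value, while the arrow chain goes through separately allocated objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical by-value nesting: a.b.c.d.e lives inside one object. */
struct d_val { int pad; int e; };
struct c_val { int pad; struct d_val d; };
struct b_val { int pad; struct c_val c; };
struct a_val { int pad; struct b_val b; };

/* Hypothetical pointer-linked layout: every -> needs its own load. */
struct d_ptr { int e; };
struct c_ptr { struct d_ptr *d; };
struct b_ptr { struct c_ptr *c; };
struct a_ptr { struct b_ptr *b; };

/* Offset of a.b.c.d.e from the start of a: a sum of compile-time constants. */
size_t dot_chain_offset(void)
{
    return offsetof(struct a_val, b) + offsetof(struct b_val, c)
         + offsetof(struct c_val, d) + offsetof(struct d_val, e);
}

/* The same offset, measured from actual addresses. */
size_t dot_chain_offset_measured(const struct a_val *a)
{
    return (size_t)((const char *)&a->b.c.d.e - (const char *)a);
}

/* Three dependent pointer loads before e itself can be read. */
int arrow_chain_read(const struct a_ptr *a)
{
    return a->b->c->d->e;
}

/* Usage: both views of the dot chain agree, and the arrow chain still works. */
void chain_demo(void)
{
    struct a_val av = { 0 };
    assert(dot_chain_offset() == dot_chain_offset_measured(&av));

    struct d_ptr d = { 42 };
    struct c_ptr c = { &d };
    struct b_ptr b = { &c };
    struct a_ptr a = { &b };
    assert(arrow_chain_read(&a) == 42);
}
```

Whether the arrow chain actually misses the cache depends on where the objects happen to live, so treat this as an illustration of the dependency chain rather than a benchmark.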

PS: ...and of course C++ messed this simple rule up with the introduction of references.


When stepping up from C to C++, it feels utterly illogical to use an ampersand to indicate a reference in a declaration, while in an assignment or actual parameter it means address-of. Were they out of special chars?


Sorry, I had posted a partial reply before finishing writing the complete one.

As I have written above, a language with implicit dereferencing needs additional syntactic means for denoting the cases when the values of the pointers are needed, e.g. for doing pointer arithmetic, which is frequent in C.

If dereferencing would have been made implicit, that would have required a large number of changes in the language. C++ has introduced pointers with implicit dereferencing, i.e. what C++ calls references, but due to the fact that the other syntax changes that are needed have not been made, because they would break compatibility, the C++ references can replace C pointers only in a subset of their uses.


No one here is proposing implicit dereferencing (everywhere), the proposal is just to make the syntax `pointer.field` have the exact same meaning that `pointer->field` has in C today. In all other places, the current syntax and semantics would stay the same.

Since in all situations where pointer->field is valid, pointer.field is an explicit error in C today, this would have been very much doable without major changes in compiler or language implementation.

Of course, at C's level of abstraction, and given the speed limitations of the day, the syntax difference between -> and . may well be argued to have been helpful instead of harmful.

And yes, the same could not be said for C++, where you can implement dereferencing for your own type and get a variable where both `a.b` and `a->b` are valid and have different meanings. This is anyway not a proposal for changing C (or C++) today, merely a "what if" discussion about how C could have been designed differently.


> If dereferencing would have been made implicit

You're misunderstanding the comment. They aren't asking for implicit derefs. They're asking for explicit derefs, using * or . instead of * or ->


Your comment and GP's make no sense unless the reader knows that * is a markdown directive and infers where you typed them.

Which is ironic when discussing overloading!


I'm confused, where in my comment does the Markdown syntax come into play? You're not seeing italics or backslashes, are you?


I see now - Hacki renders italics, but the browser doesn't.

Excuse my confusion, this possibility didn't occur to me.


Hacker News does not use Markdown.


While that’s technically true, it does use a severely cut down version thereof which is worse in every way: it only supports paragraphs, code blocks, and emphasis so the ability to format is extremely limited but the emphasis can still screw you over, which requires an explicit preview (or posting) to notice.

And I think escaping emphasis was only introduced somewhat recently? I do remember that for the longest time you basically had to trick HN into not breaking your comments by using a different character instead.


No, but it does use:

    *
for italics, which sometimes causes a comment to go haywire because someone uses it to reference a footnote or otherwise drops it in mid-sentence, with a matching closing one, not intending italics.

(Up-thread I think it probably was like that when they commented, but has since been fixed.)


Btw, you can now(?) escape * in a paragraph with \*.


Yes, it's also automatic in some cases - originally my comment had an attempt at a joke example of it, but it was caught and didn't work. It was definitely improved relatively recently.


With explicit derefs, if "*x.y" is taken to mean "(*x).y", then what is the meaning of "****x1.x2[7].x3.x4[9].x5"?

> "a.b" should be enough for pointers as well,

I have interpreted this to mean implicit deref, as there is no "*" (could be a formatting problem).


I think you're still misunderstanding the proposal. What are the types of your variables? This discussion makes no sense without type information. We're defining the meaning of . for pointers here. You seem to be misunderstanding the proposal as being purely a textual transformation that ignores types?

The proposal is literally: "If you see . and the left operand is a pointer, pretend the . was -> instead, because otherwise the code is already invalid."


OK, I believe that you are right and if implicit dereferencing had been done only when this is the only interpretation that leads to a valid expression, then "->" would not have been necessary.

However, I assume that this would have been a too complex solution for compilers that had to work in a few tens of kilobytes of memory, while a postfix "*", as already used a decade before C, would have been a trivial solution.


> However, I assume that this would have been a too complex solution for compilers that had to work in a few tens of kilobytes of memory

It's literally the same complexity as the type checking compilers already do to tell you that your `.` does not work because the LHS is a pointer not a struct.


It is not the same complexity, because syntax checking stops immediately at an error.

To determine if implicit dereferencing may be applied, more analysis has to be done, because there may be not only a pointer to a structure, but a pointer to a pointer to a structure and so on, so multiple implicit dereferences may be needed to obtain a valid expression.

However I agree that the difference in complexity is not big.


`a.b` is always valid syntax, there's no way to know if it is a valid C instruction until you resolve the types of a and b. Assigning it valid semantics at that point is exactly as easy as assigning it error semantics.

And no, this would not perform multiple levels of dereferencing any more than -> does today. You could have literally find-and-replaced every use of -> with . and every C program would have had the exact same semantics. `struct point **a; a.x = 1;` would throw the exact same compilation error that `struct point **a; a->x = 1;` throws today. The only difference would be that `struct point *a; a.x = 1;` would write 1 to the field x of the object pointed to by a, instead of throwing an error that says "object of type struct point* has no field named x".
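For reference, this is how the two cases look in C as it exists today. It is only a small sketch; `struct point` comes from the comment above, while the function names are invented for the example.

```c
#include <assert.h>

struct point { int x; int y; };

/* One level of indirection: a->x is shorthand for (*a).x. */
void set_x(struct point *a, int value)
{
    a->x = value;        /* same as (*a).x = value; */
}

/* Two levels: the outer one has to be removed explicitly.
   Writing pp->x here would not compile. */
void set_x_via_pp(struct point **pp, int value)
{
    (*pp)->x = value;    /* same as (**pp).x = value; */
}

/* Usage: both helpers end up writing to the same object. */
void point_demo(void)
{
    struct point p = { 0, 0 };
    struct point *a = &p;

    set_x(a, 1);
    assert(p.x == 1);

    set_x_via_pp(&a, 2);
    assert(p.x == 2);
}
```

Under the proposal as described here, only the `a->x` line would gain an alternative spelling (`a.x`); the double-pointer case would stay an error until one level is removed by hand.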


> As I have written above, a language with implicit dereferencing needs additional syntactic means for denoting the cases when the values of the pointers are needed, e.g. for doing pointer arithmetic, which is frequent in C.

No it doesn’t. Let’s take this example in valid C:

  struct my_struct { int field; };
  struct my_struct  s = { .field = 42 };
  struct my_struct *p = &s;
  printf("direct : %d", s.field);
  printf("pointer: %d", p->field);
  printf("address: %p", (void *)p);
What is being proposed here is to make that code valid:

  struct my_struct { int field; };
  struct my_struct  s = { .field = 42 };
  struct my_struct *p = &s;
  printf("direct : %d", s.field);
  printf("pointer: %d", p.field); // note the use of a dot here
  printf("address: %p", (void *)p);
That is, the naked p is still to be interpreted as what it is: a pointer. It’s just that when we write `a.b`, the language would first check the type of `a`, then dereference it as many times as necessary to get to the underlying struct, and then access its field. For instance:

  struct my_struct { int field; };
  struct my_struct    s   = { .field = 42 };
  struct my_struct   *p   = &s;
  struct my_struct  **pp  = &p;
  struct my_struct ***ppp = &pp;
Now let’s see how this automatic indirection would work:

  // All would print the same value
  printf("%d", ppp.field);
  printf("%d", pp .field);
  printf("%d", p  .field);
  printf("%d", s  .field);

  // We can still use explicit indirections
  printf("%d", (*ppp  ).field);
  printf("%d", (**ppp ).field);
  printf("%d", (***ppp).field);
  printf("%d", (*pp   ).field);
  printf("%d", (**pp  ).field);
  printf("%d", (*p    ).field);
We can still get to the actual addresses no problem:

  printf("p: %p", (void *)p);
  printf("p: %p", (void *)*pp);
  printf("p: %p", (void *)**ppp);

  printf("pp: %p", (void *)pp);
  printf("pp: %p", (void *)*ppp);

  printf("ppp: %p", (void *)ppp);
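Note that we can play with the & operator too. In valid C we can already write the first two lines below; the `&&s` and `&&&s` lines only illustrate the pattern, since C does not let you take the address of `&s`: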

  printf("s.field: %d", s.field);
  printf("s.field: %d", (*&s).field);
  printf("s.field: %d", (**&&s).field);
  printf("s.field: %d", (***&&&s).field);
With automatic indirection the following would be valid too:

  printf("s.field: %d", (&s).field);
  printf("s.field: %d", (&&s).field);
  printf("s.field: %d", (&&&s).field);
---

The kicker here is that the decision on whether an access to a struct member requires dereferencing the pointer or not, is not done at parsing time. It’s done at type checking time. And by the way, in standard C the decision to give you an error or not is already done at type checking time. All this to say, this would be a fairly benign change to compilers.
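A toy model of that type-checking decision, written in C, might look like the sketch below. Everything in it is hypothetical: `struct toy_type`, `resolve_dot`, and the `max_auto_derefs` knob are invented for illustration and do not correspond to any real compiler's internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified type descriptor: some innermost type
   wrapped in `pointer_depth` levels of pointer (0 = no pointer at all). */
struct toy_type {
    int  pointer_depth;
    bool is_struct;       /* true if the innermost type is a struct */
};

/* For `operand.field`, return how many implicit dereferences the checker
   would insert, or -1 if the expression is a type error.
   max_auto_derefs = 0 models C today, 1 models "treat . like ->",
   and a large value models "dereference as many times as necessary". */
int resolve_dot(struct toy_type operand, int max_auto_derefs)
{
    if (!operand.is_struct)
        return -1;                        /* nothing with fields underneath */
    if (operand.pointer_depth > max_auto_derefs)
        return -1;                        /* too many levels: still an error */
    return operand.pointer_depth;         /* derefs to insert before the access */
}

/* Usage: the same check yields either an error or a deref count. */
void resolve_dot_demo(void)
{
    struct toy_type s = { 0, true };
    struct toy_type p = { 1, true };

    assert(resolve_dot(s, 0) == 0);       /* s.field is fine in C today */
    assert(resolve_dot(p, 0) == -1);      /* p.field is an error in C today */
    assert(resolve_dot(p, 1) == 1);       /* with the proposal: one deref */
}
```

The point of the sketch is only that the error path and the "insert a dereference" path both start from the same type lookup, so neither needs information the checker does not already have.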

Now would users get confused? Possibly. With the conflation of pointers and arrays, the following would be equivalent:

  array.field
  (*array).field
  array[0].field
Looks nifty to some perhaps, but some people really meant:

  array[i].field
and forgot to write the index.
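Worth noting that C already has this hazard with the arrow: an array name decays to a pointer to its first element, so `array->field` compiles today and silently means element 0. A small sketch, with `struct item` and the function name made up for the example:

```c
#include <assert.h>

struct item { int field; };

/* In C today, array->field already means array[0].field. */
int first_field_via_arrow(void)
{
    struct item array[3] = { { 10 }, { 20 }, { 30 } };
    int i = 2;

    assert(array->field == array[0].field);     /* decay: always element 0 */
    assert((*array).field == array[0].field);
    assert(array[i].field == 30);               /* what was probably meant */

    return array->field;
}
```

So the dot version would not introduce a new kind of mistake, it would just make an existing one easier to type.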



