
thradams · 2025-12-15T19:26:54

In this model, ownership is checked statically when variables go out of scope and before assignment.

"Owner pointers" must be uninitialized or null at the end of their scope.

Basically, the nullable state needs to be tracked at compile time, and nullable pointers, despite being a separate feature, reuse the same flow analysis.

For the impatient reader, a simplified way to think about it is to compare it with C++'s unique_ptr.

The difference is that, instead of runtime code being executed at the end of the scope (a destructor), we perform a compile-time check to ensure that the owner pointer is not referring to any object. The same check is performed before assignment.

So we get the same guarantees as C++ RAII, with some extras. In C++, the user has to adopt unique_ptr and additional wrappers (for example, for FILE). In this model, it works directly with malloc, fopen, etc., and is automatically safe, without the user having to opt in to "safety" or write wrappers. Safety is the default, and the safety requirements are propagated automatically.

It is interesting to note that propagation also works very well for struct members. Having an owner pointer as a struct member requires the user to provide a correct "destructor" or free the member manually before the struct object goes out of scope.

    #pragma safety enable

    #include <stdio.h>

    int main() {
        FILE *_Owner _Opt f = fopen("file.txt", "r");
        if (f) {
            fclose(f);
        }
    }

At the end of the scope of f, it can be in one of two possible states: "null" or "moved" (f is moved in the fclose call).

These are the expected states for an owner pointer at the end of its scope, so no warnings are issued.

Removing _Owner and _Opt leaves exactly the code users write today, but with the same guarantees as C++ RAII, or more.
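The "before assignment" check mentioned above can be sketched roughly as follows. This is only an illustration: `reuse_owner` is a made-up helper, and the `__CAKE__` guard assumes Cake predefines that macro, so that other compilers see the qualifiers as empty macros.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: Cake predefines __CAKE__; other compilers get empty macros. */
#ifndef __CAKE__
#define _Owner
#define _Opt
#endif

/* Hypothetical helper: returns 0 on success, -1 if an allocation failed. */
int reuse_owner(void)
{
    char *_Owner _Opt p = malloc(16);
    if (p == NULL)
        return -1;          /* p is null: nothing is owned */
    strcpy(p, "first");

    /* Writing `p = malloc(32);` at this point would overwrite a pointer
       that still owns the first block; that is the situation the
       before-assignment check is meant to catch. */
    free(p);                /* p is moved into free() */

    p = malloc(32);         /* fine: p no longer owns anything */
    if (p == NULL)
        return -1;
    strcpy(p, "second");
    free(p);                /* moved again, so the end of scope is clean */
    return 0;
}
```

With a regular C compiler this is ordinary malloc/free code; the point is only where the assignment is allowed to happen.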

thradams · on Oct 6, 2024

When exploring the design of nullable pointers in C and comparing them with other languages like C# and TypeScript, which have constructors, I realized that C might benefit from a way to represent transient states, that is, the state an object is in while it is being constructed.

The C++ mutable keyword came to mind as a potential solution.

During the object creation (or destruction), the instance is considered to be in a transitional state, where the usual constraints—such as non-nullable pointers and immutability—are lifted. Once the transitional phase is over and the object is returned, the contract that governs the object (such as immutability of name and non-nullability of pointers) is fully reinstated.

thradams · on Aug 9, 2024

A two-minute video explaining the concepts of nullable pointers and pointer ownership in Cake, using a simple example.

thradams · on April 28, 2024

Cake is an open-source compiler and static analyzer in development. (Not production quality yet.)

This video shows how Cake can help programmers write safe code just by fixing warnings.

https://youtu.be/X5tmkF16UMQ

We copy and paste the code, then add #pragma safety enable.

This enables two features: ownership checks and nullable checks. Ownership checks verify, for instance, that fclose is called, and also catch double frees and similar errors, while nullable checks catch dereferences of null pointers.

The new qualifiers _Opt and _Owner are used, but they can be defined as empty macros, allowing the same code to be compiled without Cake.
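A minimal sketch of the empty-macro approach, for illustration only. The `__CAKE__` guard is an assumption about how one might detect Cake, and `count_lines` is a hypothetical helper, not something from the video.

```c
#include <stdio.h>

/* Assumption: Cake predefines __CAKE__; other compilers get empty macros. */
#ifndef __CAKE__
#define _Owner
#define _Opt
#endif

/* Hypothetical helper: counts '\n' characters in a file.
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *_Owner _Opt f = fopen(path, "r");
    if (f == NULL)
        return -1;          /* f is null here: nothing to release */

    long lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            lines++;
    }

    fclose(f);              /* f is moved; leaving this out is the kind of
                               mistake the ownership check reports */
    return lines;
}
```

Compiled without Cake, the qualifiers vanish and this is plain C; the null check before `fgetc` is what the nullable analysis would ask for.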

thradams · on Feb 28, 2024

I think "Keep the language small and simple" should also say: avoid "two ways of doing something".

(The example I have is 0, NULL, and nullptr, where nullptr is the new one. Having two ways of doing something makes the language complex.)

Jorengarenar · on Feb 28, 2024

Yeah, we didn't copy that one over precisely because it was kind of a blocker to introducing replacements for outdated design.

But I think it can be weaseled into that principle. Thanks!

thradams · on Feb 20, 2024

The idea is to keep cake aligned with C, not a language fork. But Cake itself could have a fork to Cake++. :D

JonChesterfield · on Feb 20, 2024

A 'C' -> C compiler which preserves most source code unchanged (i.e. would be the identity transform on some input) and which implements something like constexpr on functions (by running the interpreter during the transform) could be argued to be a forward looking C implementation. Specifically C23 has constexpr, but in an extremely limited form, and aspires to extend that to be more useful later.

Equally one which replaces 'auto' with the name of the type (and similar desugaring games) is still a C to C compiler, just running as a C23 to C99 or whatever. Resolve the branch in _Generic before emitting code as part of downgrading C11.

The lifetime annotations are an interesting one because they're a different language which, if it typechecks, can be losslessly converted into C (by dropping the annotations on the way out).

I'm not sure where in that design space the current implementation lies. In particular folding preprocessed code back into code that has the #defines and #includes in is a massive pain and only really valuable if you want to lean into the round trip capability.

thradams · on Feb 20, 2024

auto, typeof, and _Generic are implemented in Cake. Sometimes, when they are used inside macros, the macros need to be expanded. Cake has #pragma expand MACRO for this task.

A sample macro NEW using C23 typeof:

    #include <stdlib.h>
    #include <string.h>

    static inline void* allocate_and_copy(void* s, size_t n) {
        void* p = malloc(n);
        if (p) {
            memcpy(p, s, n);
        }
        return p;
    }

    #define NEW(...) (typeof(__VA_ARGS__)*) allocate_and_copy(&(__VA_ARGS__), sizeof(__VA_ARGS__))
    #pragma expand NEW

    struct X {
        const int i;
    };

    int main() { 
        auto p = NEW((struct X) {});     
    }

The generated code is

    #include <stdlib.h>
    #include <string.h>

    static inline void* allocate_and_copy(void* s, size_t n) {
        void* p = malloc(n);
        if (p) {
            memcpy(p, s, n);
        }
        return p;
    }

    #define NEW(...) (typeof(__VA_ARGS__)*) allocate_and_copy(&(__VA_ARGS__), sizeof(__VA_ARGS__))
    #pragma expand NEW

    struct X {
        const int i;
    };

    int main() { 
        struct X  * p =  (struct X*) allocate_and_copy(&((struct X) {0}), sizeof((struct X) {0}));     
    }

thradams · on Feb 20, 2024

(by the way, embed is not working on web version because of include directory bug - it is an open issue and regression)

thradams · on Feb 20, 2024

Rust needs to add some runtime checks when calling destructors in scenarios where some object may or may not be moved.

In C++, for instance, the destructor of a smart pointer contains an "if (p != NULL)" check. When the smart pointer is moved, the move sets the source pointer to null, and the destructor checks for that at runtime.
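To make the runtime cost concrete, here is the same pattern written by hand in plain C. It is only a sketch: `int_box`, `int_box_move`, and `int_box_destroy` are hypothetical names standing in for what a C++ smart pointer does internally.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a smart pointer, written in plain C. */
struct int_box {
    int *ptr;
};

/* "Move": the source is set to null so that its cleanup becomes a no-op. */
struct int_box int_box_move(struct int_box *src)
{
    struct int_box dst = { src->ptr };
    src->ptr = NULL;
    return dst;
}

/* "Destructor": has to test at run time whether the box still owns
   anything. Returns 1 if memory was released, 0 otherwise. */
int int_box_destroy(struct int_box *b)
{
    if (b->ptr != NULL) {
        free(b->ptr);
        b->ptr = NULL;
        return 1;
    }
    return 0;   /* moved-from or empty: nothing to do */
}
```

The `if` in `int_box_destroy` is the check that a purely static ownership analysis tries to make unnecessary.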

thradams · on Feb 20, 2024

Cake implements defer as an extension, where ownership and defer work together. The flow analysis must be prepared for defer.

    int * owner p = calloc(1, sizeof(int));
    defer free(p);

However, with ownership checks, the code is already safe. This may also change the programmer's style, as generally, C code avoids returns in the middle of the code.

In this scenario, defer makes the code more declarative and saves some lines of code. It can be particularly useful when the compiler supports defer but not ownership.

One difference between defer and ownership checks, in terms of safety, is that the compiler will not prompt you to create the defer. But, with ownership checks, the compiler will require an owner object to hold the result of malloc, for instance. It cannot be ignored.

The same happens with C++ RAII: if you forget to free something in your destructor, or forget to write the destructor at all, the compiler will not complain.

With Cake's ownership checks, this cannot be ignored.

    struct X {
      FILE * owner file;
    };

    int main(){
       struct X x = {};
       //....
       
    } //error x.file not freed
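For contrast, here is a sketch of a variant where the member is released before the struct goes out of scope. It uses the `_Owner`/`_Opt` spellings rather than `owner`, assumes a `__CAKE__` guard so other compilers see empty macros, and `write_greeting` is a made-up helper.

```c
#include <stdio.h>

/* Assumption: Cake predefines __CAKE__; other compilers get empty macros. */
#ifndef __CAKE__
#define _Owner
#define _Opt
#endif

struct X {
    FILE *_Owner _Opt file;
};

/* Hypothetical helper: returns 0 on success, -1 if the file
   could not be opened. */
int write_greeting(const char *path)
{
    struct X x = { .file = fopen(path, "w") };
    if (x.file == NULL)
        return -1;          /* x.file is null: nothing is owned */

    fputs("hello\n", x.file);
    fclose(x.file);         /* x.file is moved before x leaves scope */
    return 0;
}
```

On every path out of the function, `x.file` is either null or already moved, which is the state the end-of-scope check expects.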

thradams · on Feb 20, 2024

> Can you ask Github Co-pilot to look at C code and answer the question "What is the length of the array 'buf' passed to this function"? That tells you how to express the array in a language where arrays have enforced lengths, which includes both C++ and Rust

This is how you tell C the size of the array.

    void f(int n, int a[n]) {
    }

Animats · on Feb 20, 2024

You can write that in C, but it doesn't really do anything. It's equivalent to

    void f(int n, int a[]) {
    }

Why? So that you can write

    void f(int n, int m, int a[n][m]) {
    }

which declares a 2-dimensional array parameter. In that case, the "m" is used to compute the position of an element in the 2D array. The "n" doesn't do anything. This is equivalent to writing

    void f(int n, int m, int a[][m]) {
    }

This is C's minimal multidimensional array support, known by few and used by fewer.
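A small sketch of the point above, using two hypothetical functions, `sum2d` and `sum2d_unsized`. Both take the same parameter type; only the inner dimension `m` affects how `a[i][j]` is addressed.

```c
/* Outer dimension written out: n documents intent but does not change
   the parameter type, which is "pointer to array of m ints". */
long sum2d(int n, int m, int a[n][m])
{
    long total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            total += a[i][j];   /* row stride comes from m */
    return total;
}

/* Outer dimension omitted: same parameter type, same indexing. */
long sum2d_unsized(int n, int m, int a[][m])
{
    long total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            total += a[i][j];
    return total;
}
```

Calling either one with `int grid[2][3]` and the arguments `2, 3, grid` walks the same six elements. This relies on variably modified parameter types, which some compilers do not support.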

Over a decade ago, I proposed that sizes in parameters should be checkable and readable, and I worked out how to make it work.[1] But I didn't have time for the politics of C standards.

[1] http://animats.com/papers/languages/safearraysforc43.pdf

a_t48 · on Feb 20, 2024

Do you have source on this syntax? Does the `[n]` actually do anything here? Fooling around in godbolt, `void f(int n, int a[n]) {` is the same as `void f(int n, int a[]) {` and doesn't appear to change assembly or generate any warnings/errors with improper usage.

unnah · on Feb 20, 2024

It looks like standard C99 variable-length array (VLA) syntax: https://en.cppreference.com/w/c/language/array#Variable-leng...

The major difference is when the array is multi-dimensional. If you don't have VLAs then you can only set the inner dimensions at compile time, or alternatively use pointer-based work-arounds.

Even in the case of one-dimensional arrays, a compiler or a static analyzer can take advantage of the VLA size information to insert run-time checks in debug mode, or to perform compile-time checks.

a_t48 · on Feb 20, 2024

Thank you - that makes total sense.

JonChesterfield · on Feb 20, 2024

You're missing the word "static" to have that work as intended. See option (2) at https://en.cppreference.com/w/c/language/array

A parameter like `const double b[static restrict 10]` says that b points to at least 10 elements and doesn't alias other parameters.
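A short sketch of that syntax in use; `scale4` is a hypothetical function. Note that `static` here is a promise made by the caller: passing a shorter array is undefined behavior, and a compiler may warn about obvious violations but is not required to.

```c
#include <stddef.h>

/* `static 4`: callers promise at least 4 elements.
   `restrict`: out and in are promised not to overlap. */
void scale4(double out[static restrict 4],
            const double in[static restrict 4],
            double k)
{
    for (size_t i = 0; i < 4; i++)
        out[i] = in[i] * k;
}
```

Inside the function, `out` and `in` are still ordinary pointers, so `sizeof out` is the size of a pointer, not of four doubles.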

Syntactically this is pretty weird.