To achieve a reasonable level of encapsulation in C, a header file must be seen ...

simias · on July 17, 2023

I think what you say makes complete sense at module-level (as in, for a standalone lib for instance) but I never bother segregating things internally within a lib/module/exe and rely on good documentation and coding practices to avoid having member mutations all over the place.

If I code in Rust or C++ I can use namespacing and public/private to give every single object in the codebase a clean interface, but in C doing that is just frustrating, not to mention potentially inefficient.

menaerus · on July 17, 2023

Opaque pointers usually impose a restriction on the API: in order to use the handle, one has to dynamically allocate the object on the heap. That's quite an unfortunate tradeoff IMO.

chongli · on July 18, 2023

Why not have the API take an allocator as a parameter? Pass in pointers to malloc, realloc, and free. Then the library can use your static allocator (or some 3rd party malloc such as jemalloc, or your own arena allocator for that matter) or it can default to the system one if you pass null pointers instead.
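A minimal sketch of what such an allocator-taking API might look like. Every name here (foo_allocator, foo_create, foo_destroy, the counting hooks) is hypothetical and not taken from any real library; realloc is carried in the struct but unused in this small example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hooks, mirroring malloc/realloc/free. */
typedef struct foo_allocator {
    void *(*malloc_fn)(size_t size);
    void *(*realloc_fn)(void *ptr, size_t size);
    void  (*free_fn)(void *ptr);
} foo_allocator;

/* In a real library this definition would live only in the .c file. */
typedef struct foo {
    foo_allocator alloc;  /* remembered so foo_destroy uses the same hooks */
    int value;
} foo;

/* Create a foo with the caller's hooks, or the system allocator if the
   hooks are missing. */
foo *foo_create(const foo_allocator *a, int value)
{
    foo_allocator sys = { malloc, realloc, free };
    if (a == NULL || a->malloc_fn == NULL || a->free_fn == NULL)
        a = &sys;
    foo *f = a->malloc_fn(sizeof *f);
    if (f == NULL)
        return NULL;
    f->alloc = *a;
    f->value = value;
    return f;
}

void foo_destroy(foo *f)
{
    if (f != NULL)
        f->alloc.free_fn(f);
}

/* Example caller-supplied hooks: count live allocations around malloc/free. */
static int live_allocations = 0;

static void *counting_malloc(size_t n)
{
    live_allocations++;
    return malloc(n);
}

static void counting_free(void *p)
{
    if (p != NULL)
        live_allocations--;
    free(p);
}

/* Returns the number of live allocations while one foo exists. */
int allocator_demo(void)
{
    foo_allocator a = { counting_malloc, realloc, counting_free };
    foo *f = foo_create(&a, 42);
    assert(f != NULL);
    int during = live_allocations;
    foo_destroy(f);
    return during;
}
```

Storing the hooks inside the object is one possible design choice; it guarantees the object is freed by the same allocator that created it.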

10000truths · on July 18, 2023

Callback based APIs are better, but not good enough. The problem remains that the user still has to deal with some potentially undesirable constraints:

* I may be running in an environment where allocating memory is an asynchronous operation. A callback based API forces me to block, which can cause unpleasant side effects like halting an event loop.

* In my experience, some libraries with custom allocation hooks forget to define one or both of the two basics in the callback signature: a "context" or "user data" parameter, and a way to return an error.
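For the second point, a hook signature that carries both basics might look like the sketch below. The names (alloc_fn, arena, arena_alloc, lib_make_ints) are made up for illustration, and the arena is only one example of what a caller could hang off the context pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook: receives the caller's context on every call and
   returns 0 on success or nonzero on failure. */
typedef int (*alloc_fn)(void *ctx, size_t size, size_t align, void **out);

/* Caller-side bump arena. The library never sees this type, only ctx. */
typedef struct arena {
    unsigned char *base;  /* assumed to be aligned for any fundamental type */
    size_t cap;
    size_t used;
} arena;

static int arena_alloc(void *ctx, size_t size, size_t align, void **out)
{
    arena *a = ctx;
    /* Round the offset up to align, which must be a power of two. */
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->cap || size > a->cap - start)
        return -1;  /* out of space: reported, not hidden */
    *out = a->base + start;
    a->used = start + size;
    return 0;
}

/* Library side: allocate through the hook and propagate its error code. */
int lib_make_ints(alloc_fn alloc, void *ctx, size_t count, int **out)
{
    void *p = NULL;
    int err = alloc(ctx, count * sizeof(int), _Alignof(int), &p);
    if (err != 0)
        return err;
    int *v = p;
    for (size_t i = 0; i < count; i++)
        v[i] = 0;
    *out = v;
    return 0;
}

/* Returns 0 if a small request succeeds and an oversized one fails. */
int hooks_demo(void)
{
    arena a = { malloc(64), 64, 0 };
    if (a.base == NULL)
        return -2;
    int *v = NULL;
    int ok = lib_make_ints(arena_alloc, &a, 4, &v);
    int too_big = lib_make_ints(arena_alloc, &a, 1000, &v);
    free(a.base);
    return (ok == 0 && too_big != 0) ? 0 : -1;
}
```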

The proper solution is to decouple memory allocation from object initialization entirely. There are two different approaches to this:

1. Expose get_foo_size() and get_foo_align() functions that return the size and alignment that the opaque foo struct needs (at runtime, of course). Then I as the user can allocate that memory, and initialize my opaque foo objects in-place:

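  size_t foo_size = get_foo_size();
  /* char* rather than void*: arithmetic on void* is a GNU extension, not ISO C */
  char* buf = alloc_aligned_memory(foo_size * 1000, get_foo_align());
  for(size_t i = 0; i < 1000; ++i) {
    int err = foo_init(buf + i * foo_size, /* params here */);
    if(err) { /* handle the error */ }
  }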

2. Define foo_init(void* buf, size_t len, ...) which attempts to initialize an opaque foo object in the buffer defined by [buf, buf+len). If the buffer does not have enough space, return an error. Otherwise, return the number of bytes actually used by the object.
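A minimal sketch of the second approach, with the size and alignment getters from the first included for the caller's benefit. The struct layout, the extra value parameter, foo_value() and the alignment check inside foo_init are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Public header part: the type stays opaque to callers. */
typedef struct foo foo;
size_t get_foo_size(void);
size_t get_foo_align(void);
long   foo_init(void *buf, size_t len, int value);
int    foo_value(const foo *f);

/* Private part; in a real library this would live in foo.c. */
struct foo {
    int value;
    double weight;
};

size_t get_foo_size(void)  { return sizeof(struct foo); }
size_t get_foo_align(void) { return _Alignof(struct foo); }

/* Initialize a foo in [buf, buf+len). Returns the number of bytes used,
   or -1 if the buffer is missing, too small, or misaligned. */
long foo_init(void *buf, size_t len, int value)
{
    if (buf == NULL || len < sizeof(struct foo))
        return -1;
    if ((uintptr_t)buf % _Alignof(struct foo) != 0)
        return -1;
    struct foo *f = buf;
    f->value = value;
    f->weight = 0.0;
    return (long)sizeof(struct foo);
}

int foo_value(const foo *f) { return f->value; }

/* Caller side: one allocation, three objects initialized in place.
   Returns the sum of the stored values, or -1 on failure. */
int in_place_demo(void)
{
    size_t n = 3, size = get_foo_size();
    unsigned char *buf = malloc(n * size);  /* malloc memory is aligned for any fundamental type */
    if (buf == NULL)
        return -1;
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (foo_init(buf + i * size, size, (int)i + 1) < 0) {
            free(buf);
            return -1;
        }
        sum += foo_value((foo *)(buf + i * size));
    }
    free(buf);
    return sum;
}
```

Here the caller decides where the memory comes from, and the library only reports whether the buffer it was handed is usable.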

chongli · on July 18, 2023

That second method works fine if you just want a buffer of a bunch of foo’s and they all happen to be the same size. Not so fine if foo is a data structure in its own right with growable capacity.

menaerus · on July 18, 2023

That is possible, and it is the reason why I said "usually". Still, it's an unfortunate complication because you now need to manage the pool of your (static) objects. Also, it's not possible to use the stack with this scheme.

cozzyd · on July 17, 2023

there are ways around this, if VLAs are allowed.

  // in <opaque_foo.h> 
  typedef struct opaque_foo opaque_foo_t;
  size_t opaque_foo_sz(void); 
  void opaque_foo_init(opaque_foo_t* foo);

  // in your code, which you could write a helper macro for if you were so inclined
  char opaque_foo_mem[opaque_foo_sz()]; 
  opaque_foo_t * my_foo = (opaque_foo_t*) opaque_foo_mem;
  opaque_foo_init(my_foo);

menaerus · on July 18, 2023

This approach violates alignment requirements, causing misaligned accesses. On some architectures these are mitigated by generating extra instructions, while on others they are a violation (e.g. segfault).

There is no way for the compiler to infer the alignment requirement of a struct because it does not see its definition. You would always have to align the char buffer by hand, but you cannot do it for the same reason the compiler cannot.

What you can do is to always greedily align the char buffer to the strictest (largest) fundamental requirement for that platform - in other words alignas(max_align_t).

And this is actually what alloca() does for you by default. From https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

> The object is aligned on the default stack alignment boundary for the target determined by the __BIGGEST_ALIGNMENT__ macro.
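The greedy alignment described above can be sketched roughly as follows, using a fixed-size buffer for simplicity. FOO_STORAGE_MAX and the helper names are hypothetical. This only settles the alignment question; it says nothing about the aliasing concerns raised elsewhere in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on the opaque object's size. */
#define FOO_STORAGE_MAX 128

/* Returns 1 if p is aligned strictly enough for any fundamental type. */
int is_max_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(max_align_t) == 0;
}

int aligned_buffer_demo(void)
{
    /* A plain char array only promises alignment 1; _Alignas raises it
       to the strictest fundamental alignment on this platform. */
    _Alignas(max_align_t) unsigned char storage[FOO_STORAGE_MAX];

    /* The start of the buffer, and every multiple of the alignment
       inside it, is a valid address for any fundamental type. */
    return is_max_aligned(storage)
        && is_max_aligned(storage + _Alignof(max_align_t));
}
```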

kopecs · on July 17, 2023

Doesn't this violate strict aliasing?

cozzyd · on July 17, 2023

yes, though you can fix that with compiler flags (or, #pragma if you want strict aliasing elsewhere in your code)

alternatively, gcc supports VLAs in unions, but I don't think clang does, which makes it extra annoying to do.

edit: apparently you can probably apply the may_alias attribute to the type? Or you could try using transparent_union. No idea if clang supports either...

mananaysiempre · on July 17, 2023

At that point, it is IMO better to obtain that block from alloca()[1]—it’s not standard, but where it’s available the compiler will treat the result as untyped for the purposes of aliasing. (If you’re on GCC/LLVM, __builtin_alloca_with_align is also an option, although note that the memory it returns may not outlive the current block—similar to a standard automatic variable, but unlike memory from traditional alloca.) ISO C has a gigantic hole when it comes to obtaining and recycling untyped memory, pretending the hole is not there isn’t going to help.

[1] https://nullprogram.com/blog/2019/10/28/, discussed at the time at https://news.ycombinator.com/item?id=21374863

jenadine · on July 17, 2023

And alignment?

kazinator · on July 18, 2023

Just use alloca.

  // in your code, which you could write a helper macro for if you were so inclined

  opaque_foo_t *my_foo = alloca(opaque_foo_sz());
  opaque_foo_init(my_foo);

cozzyd · on July 17, 2023

yes, you may need an alignas depending on platform (though you probably want it even if unaligned access is supported).

bensecure · on July 18, 2023

char* can alias anything
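A small sketch of the direction that rule covers: reading any object's bytes through a character pointer. The rule does not run the other way, so reading a char buffer as some other type is done here with memcpy instead of a pointer cast. Function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Allowed: inspect the bytes of any object through unsigned char *. */
unsigned byte_sum(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; i++)
        sum += bytes[i];
    return sum;
}

/* The reverse direction: copy the bytes into a real int rather than
   dereferencing the char buffer through an int pointer. */
int load_int(const unsigned char *buf)
{
    int v;
    memcpy(&v, buf, sizeof v);
    return v;
}
```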

cozzyd · on July 18, 2023

The strict aliasing optimization in gcc in theory can cause problems (which is why many, including the Linux kernel, disable that)

hbossy · on July 17, 2023

This is how it's supposed to be done, but you always end up moving them to the header just to make writing unit tests less painful.

10000truths · on July 17, 2023

This is a smell. Your unit tests should not have to rely on internal implementation details.

gpderetta · on July 17, 2023

And in the worst case you can have module-private headers. No need to pollute your interface.

icedchai · on July 17, 2023

Are unit tests common in C? In the mid-2000s, I worked on an "enterprise" system, written in C and C++. There were about 300,000 lines of code, maybe 10 tests. This thing was the core of a billion dollar business.

not2b · on July 17, 2023

That's very broken; more usual practice at the time was to have large numbers of tests, aiming for good code coverage. I've sometimes seen too many full-tool tests and few true unit tests (tests that just test functions for correctness), but "maybe 10 tests" is frightening.

icedchai · on July 18, 2023

This particular company was very heavy on manual testing, unfortunately. (This was roughly 2003 - 2006 ish.)

coldtea · on July 17, 2023

They don't have to and shouldn't, but it's convenient. That's the parent's point ("to make writing unit tests less painful").