
To achieve a reasonable level of encapsulation in C, a header file must be seen as a public-only interface. It should declare only the structs that are relevant for the user of the module. If that's "struct my_module_handle { ... }", declare it and document the corresponding accessor and modifier functions. Everything else must reside in the C source file with internal linkage (static storage class). The whole source file is your implementation.

There is an anti-pattern where header files are used for all the declarations needed internally by the source file. Including (pasting verbatim with the preprocessor) that file from another module would bring in all the unnecessary declarations.
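
A minimal sketch of that split, shown as a single file with comments marking where the header would end. It assumes the handle is kept fully opaque; only the name `my_module_handle` comes from the comment above, and `my_module_create`, `my_module_get_value`, `my_module_set_value`, `my_module_destroy` and `clamp_to_range` are hypothetical names made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* ---- my_module.h: everything a user of the module is allowed to see ---- */
struct my_module_handle;   /* incomplete type: the layout stays private */
struct my_module_handle *my_module_create(int initial);
int  my_module_get_value(const struct my_module_handle *h);
void my_module_set_value(struct my_module_handle *h, int value);
void my_module_destroy(struct my_module_handle *h);

/* ---- my_module.c: the implementation ---- */
struct my_module_handle {
    int value;
};

/* Internal linkage: this helper is invisible to other translation units. */
static int clamp_to_range(int v)
{
    if (v < 0)
        return 0;
    if (v > 100)
        return 100;
    return v;
}

struct my_module_handle *my_module_create(int initial)
{
    struct my_module_handle *h = malloc(sizeof *h);
    if (h != NULL)
        h->value = clamp_to_range(initial);
    return h;   /* NULL on allocation failure */
}

int my_module_get_value(const struct my_module_handle *h)
{
    assert(h != NULL);
    return h->value;
}

void my_module_set_value(struct my_module_handle *h, int value)
{
    assert(h != NULL);
    h->value = clamp_to_range(value);
}

void my_module_destroy(struct my_module_handle *h)
{
    free(h);   /* free(NULL) is a no-op */
}
```

A caller that includes only the header part can hold a `struct my_module_handle *` but cannot read or write `value` directly, and cannot call `clamp_to_range` at all.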



I think what you say makes complete sense at module-level (as in, for a standalone lib for instance) but I never bother segregating things internally within a lib/module/exe and rely on good documentation and coding practices to avoid having member mutations all over the place.

If I code in Rust or C++ I can use namespacing and public/private to give every single object in the codebase a clean interface, but in C doing that is just frustrating, not to mention potentially inefficient.


Opaque pointers usually impose a restriction on the API: in order to use the handle, one has to dynamically allocate the object on the heap. That's quite an unfortunate tradeoff IMO.


Why not have the API take an allocator as a parameter? Pass in pointers to malloc, realloc, and free. Then the library can use your static allocator (or some 3rd party malloc such as jemalloc, or your own arena allocator for that matter) or it can default to the system one if you pass null pointers instead.
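
A rough sketch of what that could look like, assuming a hypothetical growable `foo` container. The `foo_allocator` table and every `foo_*` function here are invented for illustration and are not taken from any real library:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical hook table: the caller supplies all three, or passes NULL. */
struct foo_allocator {
    void *(*malloc_fn)(size_t size);
    void *(*realloc_fn)(void *ptr, size_t size);
    void  (*free_fn)(void *ptr);
};

struct foo {
    struct foo_allocator alloc;   /* remembered so memory is freed by the matching hook */
    int *items;
    size_t len, cap;
};

/* Pass NULL to fall back to the system allocator. A table with any NULL
   entry is rejected rather than silently mixed with malloc/free. */
struct foo *foo_create(const struct foo_allocator *a)
{
    struct foo_allocator use = { malloc, realloc, free };
    if (a != NULL) {
        if (!a->malloc_fn || !a->realloc_fn || !a->free_fn)
            return NULL;
        use = *a;
    }
    struct foo *f = use.malloc_fn(sizeof *f);
    if (f == NULL)
        return NULL;
    f->alloc = use;
    f->items = NULL;
    f->len = 0;
    f->cap = 0;
    return f;
}

/* Appends a value, growing through the caller's realloc hook.
   Assumes the hook accepts a NULL pointer like the standard realloc does. */
int foo_push(struct foo *f, int value)
{
    assert(f != NULL);
    if (f->len == f->cap) {
        size_t new_cap = f->cap ? f->cap * 2 : 4;
        int *p = f->alloc.realloc_fn(f->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;   /* the old buffer is still valid */
        f->items = p;
        f->cap = new_cap;
    }
    f->items[f->len++] = value;
    return 0;
}

size_t foo_len(const struct foo *f)
{
    assert(f != NULL);
    return f->len;
}

void foo_destroy(struct foo *f)
{
    if (f == NULL)
        return;
    struct foo_allocator a = f->alloc;   /* copy the hooks before freeing f */
    if (f->items != NULL)
        a.free_fn(f->items);
    a.free_fn(f);
}
```

Note that these hooks mirror the plain `malloc`/`realloc`/`free` signatures, so they carry no per-allocator state.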


Callback based APIs are better, but not good enough. The problem remains that the user still has to deal with some potentially undesirable constraints:

* I may be running in an environment where allocating memory is an asynchronous operation. A callback based API forces me to block, which can cause unpleasant side effects like halting an event loop.

* In my experience, some libraries with custom allocation hooks forget to define one or both of the two basics in the callback signature: a "context" or "user data" parameter, and a way to return an error.

The proper solution is to decouple memory allocation from object initialization entirely. There are two different approaches to this:

1. Expose get_foo_size() and get_foo_align() functions that return the size and alignment that the opaque foo struct needs (at runtime, of course). Then I as the user can allocate that memory, and initialize my opaque foo objects in-place:

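  size_t foo_size = get_foo_size();
  /* assumes get_foo_size() is a multiple of get_foo_align(), so every slot stays aligned */
  unsigned char* buf = alloc_aligned_memory(foo_size * 1000, get_foo_align());
  for(size_t i = 0; i < 1000; ++i) {
    /* buf is a byte pointer because arithmetic on void* is not standard C */
    int err = foo_init(buf + i * foo_size, /* params here */);
    if(err) { /* handle the failure */ }
  }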

2. Define foo_init(void* buf, size_t len, ...) which attempts to initialize an opaque foo object in the buffer defined by [buf, buf+len). If the buffer does not have enough space, return an error. Otherwise, return the number of bytes actually used by the object.
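
The second approach might look roughly like the sketch below. The struct layout, the `id` parameter, the `long` return type with -1 as the error value, the requirement that `buf` be suitably aligned, and the `foo_id` and `foo_demo` helpers are all assumptions made for illustration:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Public side: callers only ever see the incomplete type. */
typedef struct foo foo_t;

/* Private side: the real layout lives in the implementation file. */
struct foo {
    double weight;
    int id;
};

/* Initializes a foo in [buf, buf+len). Returns the number of bytes used,
   or -1 if buf is NULL, misaligned for a foo, or too small. */
long foo_init(void *buf, size_t len, int id)
{
    if (buf == NULL)
        return -1;
    if ((uintptr_t)buf % alignof(struct foo) != 0)
        return -1;
    if (len < sizeof(struct foo))
        return -1;
    foo_t *f = buf;
    f->weight = 0.0;
    f->id = id;
    return (long)sizeof(struct foo);
}

int foo_id(const foo_t *f)
{
    assert(f != NULL);
    return f->id;
}

/* Caller side: the caller owns the memory and decides where it comes from.
   malloc is used here only because its result is suitably aligned. */
int foo_demo(void)
{
    size_t len = 256;   /* caller's guess at a large enough buffer */
    void *buf = malloc(len);
    if (buf == NULL)
        return -1;
    long used = foo_init(buf, len, 7);
    int id = (used > 0) ? foo_id(buf) : -1;
    free(buf);
    return id;
}
```

Returning the byte count lets the caller pack the next object right after this one, as long as the caller rounds the offset up to the required alignment first.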


That second method works fine if you just want a buffer of a bunch of foo’s and they all happen to be the same size. Not so fine if foo is a data structure in its own right with growable capacity.


That is possible, and it's the reason why I said "usually". Still, it's an unfortunate complication because you now need to manage the pool of your (static) objects. Also, it's not possible to use the stack with this scheme.


there are ways around this, if VLAs are allowed.

  // in <opaque_foo.h>
  typedef struct opaque_foo opaque_foo_t;
  size_t opaque_foo_sz(void);
  void opaque_foo_init(opaque_foo_t* foo);

  // in your code, which you could write a helper macro for if you were so inclined
  char opaque_foo_mem[opaque_foo_sz()];
  opaque_foo_t * my_foo = (opaque_foo_t*) opaque_foo_mem;
  opaque_foo_init(my_foo);


This approach violates alignment requirements, causing misaligned accesses. On some architectures these are mitigated by generating extra instructions, while on others they are a hard fault (e.g. a segfault).

There is no way for a compiler to infer the alignment requirement of a struct because it does not see its definition. You would always have to align the char buffer by hand, but you cannot do it for the same reason the compiler cannot.

What you can do is to always greedily align the char buffer to the strictest (largest) fundamental requirement for that platform - in other words alignas(max_align_t).

And this is actually what alloca() does for you by default. From https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

> The object is aligned on the default stack alignment boundary for the target determined by the __BIGGEST_ALIGNMENT__ macro.
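
A small sketch of the alignas(max_align_t) idea, assuming C11 and a fixed upper bound on the object size rather than a VLA. `FOO_STORAGE_BYTES`, `is_max_aligned` and `demo_aligned_storage` are hypothetical names, and the sketch only demonstrates the alignment part, not how the storage is later used:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on the opaque object's size, published by the library. */
enum { FOO_STORAGE_BYTES = 64 };

/* Nonzero when p is aligned for every fundamental type on this platform. */
int is_max_aligned(const void *p)
{
    assert(p != NULL);
    return (uintptr_t)p % alignof(max_align_t) == 0;
}

/* Declares greedily aligned byte storage on the stack and reports whether
   the requested alignment was honoured. */
int demo_aligned_storage(void)
{
    alignas(max_align_t) unsigned char storage[FOO_STORAGE_BYTES];
    return is_max_aligned(storage);
}
```

This covers fundamental alignment only. A struct that itself asks for extended alignment (for example a SIMD member declared with a larger alignas) would still need more than max_align_t provides.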


Doesn't this violate strict aliasing?


yes, though you can fix that with compiler flags (or, #pragma if you want strict aliasing elsewhere in your code)

alternatively, gcc supports VLAs in unions, but I don't think clang does, which makes that route extra annoying.

edit: apparently you can probably apply the may_alias attribute to the type? Or you could try using transparent_union. No idea if clang supports either...


At that point, it is IMO better to obtain that block from alloca()[1]—it’s not standard, but where it’s available the compiler will treat the result as untyped for the purposes of aliasing. (If you’re on GCC/LLVM, __builtin_alloca_with_align is also an option, although note that the memory it returns may not outlive the current block—similar to a standard automatic variable, but unlike memory from traditional alloca.) ISO C has a gigantic hole when it comes to obtaining and recycling untyped memory, pretending the hole is not there isn’t going to help.

[1] https://nullprogram.com/blog/2019/10/28/, discussed at the time at https://news.ycombinator.com/item?id=21374863


And alignment?


Just use alloca.

  // in your code, which you could write a helper macro for if you were so inclined

  opaque_foo_t *my_foo = alloca(opaque_foo_sz());
  opaque_foo_init(my_foo);


yes, you may need an alignas depending on platform (though you probably want it even if unaligned access is supported).


char* can alias anything
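
For reference, the direction that the rule does cover looks like the sketch below: reading the bytes of some other object through a character pointer. `count_nonzero_bytes` is a hypothetical helper. The reverse direction, accessing a declared char array through an opaque_foo_t*, is what the VLA snippet above does, and the character-type exception does not extend to it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Allowed by the aliasing rules: any object may be inspected through a
   pointer to unsigned char, whatever its real type is. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *bytes = obj;
    size_t n = 0;
    for (size_t i = 0; i < size; ++i) {
        if (bytes[i] != 0)
            ++n;
    }
    return n;
}
```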


The strict aliasing optimization in gcc can in theory cause problems (which is why many projects, including the Linux kernel, disable it).


This is how it's supposed to be done, but you always end up moving them to a header just to make writing unit tests less painful.


This is a smell. Your unit tests should not have to rely on internal implementation details.


And in the worst case you can have module-private headers. No need to pollute your interface.


Are unit tests common in C? In the mid-2000s, I worked on an "enterprise" system, written in C and C++. There were about 300,000 lines of code, maybe 10 tests. This thing was the core of a billion dollar business.


That's very broken; more usual practice at the time was to have large numbers of tests, aiming for good code coverage. I've sometimes seen too many full-tool tests and few true unit tests (tests that just test functions for correctness), but "maybe 10 tests" is frightening.


This particular company was very heavy on manual testing, unfortunately. (This was roughly 2003 - 2006 ish.)


They don't have to and shouldn't, but it's convenient. That's the parent's point ("to make writing unit tests less painful").



