A couple of make flags that are useful and probably not very well known:
Output synchronization, which makes `make` print a target's stdout/stderr only once that target finishes. Otherwise output from parallel jobs is typically interleaved and hard to follow:
make --output-sync=recurse -j10
On busy / multi-user systems, the `-j` flag for jobs may not be best. Instead you can also limit parallelism based on load average:
make -j10 --load-average=10
Randomizing the order in which targets are scheduled (the --shuffle option, added in GNU Make 4.4). This is useful in CI to harden your Makefiles and catch missing dependencies between targets:
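make --shuffle -j10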
Maybe the make authors could compile a list of options somewhere and ship it with their program, so users could read them? Something like a text file or using some typesetting language. This would make that knowledge much more accessible.
If "make -j" successfully drowns a machine, I can argue that the machine has no serious bottlenecks for the job. Because, make is generally I/O bound when run with high parallelism, and if you can't saturate your I/O bandwidth, that's a good thing in general.
However, if "make -j" is saturates a machine, and this is unintentional, I'd assume PEBKAC, or "holding it wrong", in general.
The problem is ‘make -j’ spinning up 100s of C++ compilation jobs, using up all of the system's RAM+swap, and causing major instability.
I get that the OS could mitigate this, but that's often not an option in professional settings. The reality is that most of the time users expect ‘make -j’ to behave like ‘make -j $(N_PROC)’, get bit in the ass, and then the GNU maintainers say PEBKAC, wasting hundreds of hours of junior dev time.
> The problem is ‘make -j’ spinning up 100s of C++ compilation jobs, using up all of the system's RAM+swap, and causing major instability.
I would put that in the “using it improperly” category. I never use⁰ --jobs without specifying a limit.
Perhaps there should have been a much more cautious default instead of the default being ∞, maybe something like four¹, or even just two, and if people wanted infinite they could just specify something big enough to encompass all the tasks that could possibly run. Or perhaps --load-average should have defaulted to something like max(2, CPUs×2) when --jobs was in effect⁴.
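For what it's worth, you can approximate that proposed default by hand today; a rough sketch, assuming GNU coreutils' nproc and POSIX shell arithmetic:

    make -j --load-average="$(( $(nproc) * 2 ))"

This leaves the job count unlimited but stops make from starting new jobs once the load average passes twice the CPU count.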
The biggest bottleneck hit when using --jobs back then wasn't RAM or CPU, though; it was random IO on traditional high-latency drives. A couple of parallel jobs could make much better use of even a single single-core CPU, because the CPU-crunching of a CPU-busy task or two would overlap with the IO of the other tasks. But too many concurrent tasks would cause an IO flood that could practically stall the affected drives for a time, putting the CPU back into a state of waiting ages for IO (probably longer than it would without multiple jobs running). This would throttle a machine² before it ran out of RAM, even with the small RAM we had back then compared to today. With modern IO and core counts, I can imagine RAM being the bigger issue now.
--------
[0] Well, used, I've not touched make for quite some time
[1] Back when I last used make much at all, small USB sticks and SD cards were not uncommon, but SSDs big, quick, and hardy enough for system or work drives were an expensive dream. With frisbee-based drives I found a four-job limit was often a good compromise: approaching but not hitting significantly diminishing returns if you had sufficient otherwise unused RAM, while keeping a near-zero chance of effectively stalling the machine completely with a flood of random IO.
[2] Or every machine… I remember some fool³ bogging down the shared file server of most of the department with a vast parallel job, ignoring the standing request to run large jobs on local filesystems where possible anyway.
[3] Not me, I learned the lesson by DoSing my home PC!
[4] Though in the case of causing an IO storm on a remote filesystem, a load-average limit might be much less effective.
Thanks for the historical perspective. It probably was less of an issue on older hardware because you can ctrl-c if you're IO starved. Linux userspace does not do well when the OOM killer comes out to play.
Personally, I don’t think these footguns need to exist.
Though in the shared drive example, ctrl+c only helps on the host causing the problem. Running something on the file server to work out the culprit (by checking the owner of the files being accessed, for instance) will be pretty much blocked behind everything else affected by the IO storm.
I’ll kindly disagree on wasting junior developer time. Anyone using a tool professionally should read (or at least skim) the manual of said tool, especially if it's something foundational to their whole workflow.
They are junior because they are inexperienced, but being junior is the best place to make mistakes and learn good habits.
If somebody asks what is the most important thing I have learnt over the years, I’d say “read the manual and the logs”.
There’s a difference between understanding your tool and unnecessary cognitive load.
Make does not provide a sane way to run in parallel. You shouldn’t have to compose a command that parses /proc/cpuinfo to get the desired behavior of “fully utilize my system please”. This is not a detail that is particularly relevant to conditional compilation/dependency trees.
This feels like it’s straight out of the Unix Haters Handbook.
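For context, the usual workaround people end up composing looks something like this (assuming GNU coreutils' nproc, or a Linux-style /proc):

    make -j"$(nproc)"
    make -j"$(grep -c '^processor' /proc/cpuinfo)"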
It's trivial to go OOM on a modern dev machine with -j$(nproc) these days because of parallel link jobs. Make is never the bottleneck; it's just the trigger.
I will not restrict myself to an arcane subset of Make just because you refuse to type 'gmake' instead of 'make'. Parallel execution, pattern rules, order-only prerequisites, includes, not to mention dozens of useful functions like (not)dir, (pat)subst, info... There's a reason why most POSIX Makefiles nowadays are generated. It's not GNU's fault that POSIX is stale.
EDIT: There's one exception, and that would be using Guile as an extension language, as that is often not available. However, thanks to conditionals (also not in POSIX, of course), it can be used optionally. I once sped up a Windows build by an order of magnitude by implementing certain things in Guile instead of calling shell (which is notoriously slow on Windows).
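A minimal sketch of that optional-Guile pattern (the to-upper helper here is invented for illustration; GNU Make advertises Guile support through its .FEATURES variable):

    # Use the built-in Guile interpreter if this make was compiled with it.
    ifneq (,$(filter guile,$(.FEATURES)))
    to-upper = $(guile (string-upcase "$(1)"))
    else
    # Fallback: spawn a shell, which is notoriously slow on Windows.
    to-upper = $(shell echo '$(1)' | tr a-z A-Z)
    endif

$(call to-upper,foo) then expands without forking a process whenever Guile is available, which is exactly where that kind of speedup comes from.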
Agreed. My company decided on using GNU Make on every platform we supported, which back then (last century) was a bunch of Unix variants, and Linux. That made it possible to write a simple and portable build system which could be used for everything we did, no hassle. And not difficult, because gmake was available basically everywhere, then just as now.
Completely agree. POSIX is irrelevant anyway. Every single unixlike has unique features that are vastly superior to whatever legacy happens to be standardized by POSIX. Avoiding their use leads to nothing but misery.
And it's available everywhere. All Unix platforms had it back then, and the still existing ones (AIX is alive, at least) have it available. Which made it easy for our company to base our build system on GNU Make for everything, back in the day.
> Portability is overrated.
> GNU Make is [..] itself portable.
Sounds like it's not overrated, then. You just prefer that other people write portable C and package GNU Make for all systems instead of you writing POSIX Make.
Not at all. I think we should all be using the full potential of our preferred system instead of sucky abstractions that provide the lowest common denominator of features.
Portability is overrated. Portability between POSIX systems is especially overrated. Linux and the BSDs have powerful exclusive features and people should be using them as much as possible in their software, simply because it's better than the legacy POSIX nonsense. This also applies to the features of Windows, macOS, iOS, etc.
GNU Make is powerful, ubiquitous and portable. That makes it even more pointless to avoid it. I won't claim it's perfect but it's absolutely a hell of a lot better than some "standard" POSIX variant of make that virtually nobody actually cares about. GNU Make will be present in pretty much every system capable of compiling software. Everyone is used to running make to build things. Avoiding things that make life easier in the name of POSIX is pointless masochism.
People are too quick to [ab]use GNU Make features. IME, learning how to make do with portable make constructs can help discipline oneself to avoid excessive complexity, especially when it comes to macro definitions, where GNU Make's Lispy looping and eval constructs are heavily overused and quickly lead to obtuse, impenetrable code.

POSIX pattern substitutions are quite powerful and often produce easier to read code than the GNU equivalent. I'm not sure if computed variable names/nested variable references are well-defined in POSIX (e.g. "$($(FOO))"), but they are widely supported nonetheless, and often more readable than $(eval ...). (They can also be used for portable conditional constructs; I wouldn't argue they're more readable, though I often find them so.)
Some GNU Make constructs, like pattern rules, are indispensable in all but the simplest projects, but can also be overused.
For some reason there's a strong urge to programmatically generate build rules. But like with SQL queries, going beyond the parameterization already built into the language can be counterproductive. A good Makefile, like a good SQL query, should be easy to read on its face. Yes, it often means greater verbosity and even repetition, but that can be a benefit to be embraced (at least embraced more than is instinctively common).
EDIT: Computed variable references are well-defined as of POSIX-2024, including (AFAICT) on the left-hand side of a definition. In the discussion it was shown the semantics were already supported by all extant implementations.
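As a rough illustration of both points (file and variable names invented): a POSIX suffix substitution next to its GNU spelling, and a computed variable name standing in for a conditional:

    SRCS = main.c util.c
    # POSIX suffix substitution...
    OBJS = $(SRCS:.c=.o)
    # ...equivalent to GNU Make's $(patsubst %.c,%.o,$(SRCS))

    # Computed variable name as a portable conditional:
    MODE = debug
    CFLAGS_debug = -g -O0
    CFLAGS_release = -O2
    CFLAGS = $(CFLAGS_$(MODE))

    prog: $(OBJS)
    	$(CC) $(CFLAGS) -o prog $(OBJS)

Running make MODE=release then switches flag sets without a single ifeq, and every rule stays readable on its face.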
It's a matter of praxis. Targeting portable constructs is (IMO) a useful methodology for achieving the abstract goal. It doesn't have to be strict, but it provides a quantifiable, objective metric (i.e. amount of non-portable constructs employed) to help achieve an otherwise subjective goal.
Otherwise you face an ocean of choices that can be overwhelming, especially if you're not very experienced in the problem space. It's like the common refrain with C++: most developers settle on a subset of C++ to minimize code complexity; but which subset? (They can vary widely, across projects and time.) In the case of Make, you can just pick the POSIX and/or de facto portable subset as your target, avoiding a lot of choice paralysis/anxiety (though you still face it when deciding when to break out of that box to leverage GNU extensions).
Not every project has to be a multi-platform, multi-OS, multi-language monster. It is perfectly fine to target a specific architecture, OS, etc. And I find it insulting and silly to call such a project a “toy project”.
Agreed if you're looking at it through the lens of portable software that you plan to distribute. Automake generates portable Makefiles for a reason.
But there's another huge category: people who are automating something that's not open-source. Maybe it stays within the walls of their company, where it's totally fine to say "build machines will always be Ubuntu" or whatever other environment their company prefers.
GNU Make has a ton of powerful features, and it makes sense to take advantage of them if you know that GNU Make will always be the one you use.