So we’re hiring (there's a good chance that link will be dead by the time you’re reading this, since we will hopefully have hired someone). I’ve been dusting off my interview questions because it’s been a while since I’ve done any interviewing. We’re a C++ shop and all our software is multithreaded, so I’ve got a whole set of questions on multithreading. I start out by asking what one needs to do to safely increment an integer variable from two different threads. It’s a basic question, the expected answer is to guard the increment with a mutex, and so far every applicant who has made it to the in-person interview has gotten it right.
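One way to answer the increment question is to guard the counter with a `std::mutex`. The following is only a minimal sketch; the function name `count_with_two_threads` and its parameter are made up for illustration, and it assumes a C++11 or later compiler.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper (not from any real code base): two threads each
// increment a shared counter `per_thread` times, locking a mutex around
// every single increment.
int count_with_two_threads(int per_thread) {
    int counter = 0;   // plain int, protected only by the mutex below
    std::mutex m;

    auto worker = [&] {
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> lock(m);  // unlocks at end of scope
            ++counter;
        }
    };

    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();

    // Both threads have been joined, so reading without the lock is safe here.
    return counter;
}
```

Called as `count_with_two_threads(100000)`, this should come back with exactly twice the per-thread count. Take the `lock_guard` line out and the result is a data race.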
But next comes the extra credit question: what else is the mutex doing for you beyond synchronizing access? I have yet to receive anything but blank stares and crickets in response. There isn’t even just one answer that I’m looking for; the applicant has two answers to choose from. Both answers are a bit obscure, but they are important. If the mutex didn’t form a fence against instruction reordering, then an overeager compiler or processor could reorder your instructions right out of the critical section. And if the mutex didn’t make sure that what one thread wrote is visible to the next thread that takes the lock (the cache coherency side of things), then you’d have undefined behavior all over the place. So far no one has come up with either of these answers in an interview.
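To make the ordering half of that concrete, here is a hedged sketch that spells out by hand roughly what an unlock/lock pair gives you: the release store plays the role of the unlock, the acquire load plays the role of the lock. The function name `publish_and_read` and the value 42 are invented for illustration. This is not how a mutex is implemented, just the same kind of guarantee written out with `std::atomic`.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread publishes a plain int, another reads it.
int publish_and_read() {
    int payload = 0;                   // ordinary, non-atomic data
    std::atomic<bool> ready{false};    // the "door" the data is published through

    std::thread producer([&] {
        payload = 42;
        // Release: the write to payload may not be reordered after this store.
        ready.store(true, std::memory_order_release);
    });

    int seen = -1;
    std::thread consumer([&] {
        // Acquire: the read of payload may not be reordered before this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                // guaranteed to observe 42
    });

    producer.join();
    consumer.join();
    return seen;
}
```

If both operations on `ready` were relaxed, or `ready` were a plain `bool`, nothing would stop the compiler or processor from moving the `payload` accesses across the flag, and the consumer could legally read stale data.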
You don’t have to know about this stuff to program effectively with multiple threads, which is why I consider it extra credit in the interview. You use the mutexes and they quietly handle all the obscure stuff for you. Unfortunately, people can step outside the safe bounds of mutexes without being aware of the pitfalls they’re headed for. There was a code base I encountered where one of the programmers had decided that it was OK to read and write boolean variables from multiple threads without mutexes since booleans are “atomic”. This predates the C++11 memory model by years. They also were not using any of the platform-specific predecessors of std::atomic. So the definition of “atomic” that was in use here was woefully incomplete. Luckily that code was only being run on a single-core processor, where their definition of “atomic” was complete enough to keep them out of trouble. But they’re going to be in for lots of fun heisenbugs if that code ever gets run on a multi-core processor. So while I may consider this outside what you have to know, if you’re going to try to be clever like the ill-fated developer above then you’d better know it.
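In post-C++11 terms, the fix for that kind of shared flag is small. This is a sketch under assumptions, not the code from that code base: the function name `spin_until_stopped` and the stop-flag scenario are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: a worker spins until another thread raises a flag.
long spin_until_stopped() {
    // With a plain `bool` this would be a data race: the compiler would be
    // free to read the flag once, hoist the check out of the loop, and spin
    // forever. std::atomic<bool> rules that out.
    std::atomic<bool> stop{false};
    long iterations = 0;               // touched only by the worker until join()

    std::thread worker([&] {
        while (!stop.load()) {         // default ordering: sequentially consistent
            ++iterations;
            std::this_thread::yield(); // keep the demo loop from spinning flat out
        }
    });

    stop.store(true);                  // the worker is guaranteed to see this
    worker.join();
    return iterations;                 // safe to read after join()
}
```

The iteration count varies from run to run and may well be zero; the point is only that the loop is guaranteed to terminate.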
Maybe this position will be the one where an applicant will have the answer I’ve been looking for. We are looking for a senior-level developer this time around, so if anyone is going to know this stuff it will probably be one of our applicants. And if you are an applicant and you’re reading this (I’m guessing that’s probably not the case given this blog’s traffic numbers), notice that I was really vague up above. If you do regurgitate one of my answers above then you’d better be prepared to talk about memory barriers and cache coherency. I’ll know if you don’t know what you’re talking about. Hopefully that’s motivation enough to study up.
The funny thing is that we’ll spend a bunch of time in the interview talking about mutexes and condition variables and the like, but if the applicant ends up coming to work here she’ll probably never use any of those things. Instead we’ve got a software transactional memory system that’s been handling all that stuff for us for a few years now (I also ask if the applicant has ever heard of STM; so far that’s gotten nothing but crickets as well). I’ve been meaning to write about my experiences with STM in a decent-sized scientific data collection and analysis software package, but haven’t found the time. For now I’ll give you the tl;dr version: it’s awesome, and if you want to take it away from me you can pry it out of my cold, dead hands.