Blog Archive for / 2021 /
Online Concurrency Workshop at C++ on Sea 2021
Wednesday, 09 June 2021
The restrictions brought upon us by COVID-19 are not over yet, and C++ on Sea is the latest conference that will be running as an online-only conference.
I will be running my More Concurrent Thinking class as an online workshop for C++ on Sea on 30th June and 1st July 2021.
The workshop will run from 09:30 UTC to 18:15 UTC each day. For attendees from North and South America, this is likely to mean quite an early start, and it may mean a late night for attendees from the Far East, so please check the times in your local timezone.
Tickets include the full day of "normal" conference presentations on 2nd July 2021. Get yours from the C++ on Sea tickets page.
I hope to see you there!
Posted by Anthony Williams
[/ news /] permanent link
Tags: C++, concurrency, classes, workshops
Stumble It! | Submit to Reddit | Submit to DZone
If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.
Using atomics for thread synchronization in C++
Monday, 19 April 2021
In my previous blog post I wrote about spin locks, and how compilers must not move the locking loop above a prior unlock.
After thinking about this some more, I realised that this is not something specific to locks: the same issue arises with any two-step synchronization between threads.
Consider the following code:
std::atomic<bool> ready1{false};
std::atomic<bool> ready2{false};

void thread1(){
    ready1.store(true, std::memory_order_release);
    while(!ready2.load(std::memory_order_acquire)){}
}

void thread2() {
    while(!ready1.load(std::memory_order_acquire)) {}
    ready2.store(true, std::memory_order_release);
}
thread1 sets ready1 to true, then waits for thread2 to set ready2 to true. Meanwhile, thread2 waits for ready1 to be true, then sets ready2 to true.
This is almost identical to the unlock/lock case from the previous blog post, except the waiting thread is just using plain load rather than exchange.
If the compiler moves the wait loop in thread1 above the store then both threads will hang forever. However, it cannot do this, for the same reason the spinlocks can't deadlock in the previous post: the store has to be visible to the other thread in a finite period of time, so it must be issued before the wait loop. The relevant wording is at https://eel.is/c++draft/intro.multithread#intro.progress-18:
An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.
If the optimizer moved the store across the loop in thread1, then it could not guarantee that the value became visible to the other thread in a finite period of time. Therefore such an optimization is forbidden.
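To try the handshake out, here is a minimal sketch that runs the two functions from this post on real threads. The run_handshake driver is a hypothetical helper added for illustration, not part of the original code; it only returns once both threads have finished, which is what the visibility guarantee above implies should happen.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready1{false};
std::atomic<bool> ready2{false};

// Announce readiness, then wait for the other thread to respond.
void thread1() {
    ready1.store(true, std::memory_order_release);
    while (!ready2.load(std::memory_order_acquire)) {}
}

// Wait for thread1's announcement, then respond.
void thread2() {
    while (!ready1.load(std::memory_order_acquire)) {}
    ready2.store(true, std::memory_order_release);
}

// Hypothetical driver (an assumption for this sketch): runs both functions
// concurrently and reports whether both flags ended up set.
bool run_handshake() {
    std::thread t1(thread1);
    std::thread t2(thread2);
    t1.join();  // completes once thread2 has set ready2
    t2.join();  // completes once thread1 has set ready1
    return ready1.load() && ready2.load();
}
```

Calling run_handshake() from main should return true. If the store in thread1 could be delayed past the whole wait loop, neither join would ever return.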
Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: cplusplus, atomics, multithreading, synchronization
Can non-overlapping spinlocks deadlock in C++?
Thursday, 15 April 2021
There has been discussion on Twitter recently about whether or not the C++ memory model allows spinlocks to deadlock if they just use memory_order_acquire in lock and memory_order_release in unlock, due to compiler optimizations. The case in question is where a thread locks one mutex, unlocks it, and locks a second: can the compiler reorder the second lock above the first unlock? If it does, and another thread does the same in the reverse order, with the same optimization, then sequential locks could deadlock.
Here is the code in question, with all the lock/unlock code inlined.
std::atomic<bool> mutex1{false};
std::atomic<bool> mutex2{false};

int x=0;
int y=0;

void thread1(){
    while(mutex1.exchange(true,std::memory_order_acquire)){} // #1
    x=1;
    mutex1.store(false,std::memory_order_release); // #2
    while(mutex2.exchange(true,std::memory_order_acquire)){} // #3
    y=1;
    mutex2.store(false,std::memory_order_release); // #4
}

void thread2(){
    while(mutex2.exchange(true,std::memory_order_acquire)){} // #5
    x=2;
    mutex2.store(false,std::memory_order_release); // #6
    while(mutex1.exchange(true,std::memory_order_acquire)){} // #7
    y=2;
    mutex1.store(false,std::memory_order_release); // #8
}
For there to even be the possibility of deadlock, thread1 must successfully execute line #1 before thread2 successfully executes line #7, and thread2 must successfully execute line #5 before thread1 successfully executes line #3.
Because these are RMW operations, the threads must agree on the ordering. The modification order of mutex1 must thus be #1(success), #2, #7(success), #8. Similarly, the modification order of mutex2 must be #5(success), #6, #3(success), #4. All threads must agree on these modification orders (see https://eel.is/c++draft/intro.multithread#intro.races-4).
From the point of view of thread1, everything must run in program order: compilers can only optimize things as long as they run "as if" in program order.
The store to mutex1 at #2 is guaranteed to be visible to thread2 in "a finite period of time" (see https://eel.is/c++draft/intro.multithread#intro.progress-18). Consequently, thread2 must eventually see that store at line #7, even if it executes line #7 a large number of times first.
Therefore, the compiler cannot move line #3 completely above line #2, since doing so would not guarantee the visibility of #2 to other threads in a finite period of time. It can move an arbitrary number of executions of line #3 above line #2 (all of which will see that mutex2 is still true), but not all the executions of line #3.
Given that thread2 eventually sees the store from #2 at line #7, the exchange at line #7 will eventually succeed, and thread2 will eventually complete.
Likewise, the store at #6 must become visible to thread1 in a finite period of time. Therefore the exchange at line #3 will eventually see the value stored by #6, the exchange will succeed, and thread1 will complete. The compiler is not allowed to move all the executions of line #7 above #6.
No amount of compiler optimization is allowed to break this, so no: spinlocks cannot deadlock if they don't overlap.
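As a rough way to exercise this, the sketch below wraps the inlined lock and unlock code in a small spinlock class and has two threads take the locks in opposite orders, one at a time. The spinlock class, the counters struct and run_sequential_locks are hypothetical names introduced for illustration. Unlike the x and y variables above, each counter here is guarded by exactly one lock, so that the final values are well defined.

```cpp
#include <atomic>
#include <thread>

// Hypothetical wrapper around the acquire/release operations discussed above.
class spinlock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        // Spin until the previous value was false, i.e. we took the lock.
        while (locked.exchange(true, std::memory_order_acquire)) {}
    }
    void unlock() {
        locked.store(false, std::memory_order_release);
    }
};

struct counters { int a; int b; };

// Two threads take the two locks sequentially (never nested), in opposite
// orders, and each increments the counter guarded by the lock it holds.
counters run_sequential_locks(int iterations) {
    spinlock mutex1;
    spinlock mutex2;
    int a = 0;  // guarded by mutex1
    int b = 0;  // guarded by mutex2

    auto first = [&] {
        for (int i = 0; i < iterations; ++i) {
            mutex1.lock(); ++a; mutex1.unlock();
            mutex2.lock(); ++b; mutex2.unlock();
        }
    };
    auto second = [&] {
        for (int i = 0; i < iterations; ++i) {
            mutex2.lock(); ++b; mutex2.unlock();
            mutex1.lock(); ++a; mutex1.unlock();
        }
    };

    std::thread t1(first);
    std::thread t2(second);
    t1.join();
    t2.join();
    return counters{a, b};
}
```

With 1000 iterations, both counters should come back as 2000. A run that completes is of course not a proof; the argument above is what rules out deadlock.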
Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: cplusplus, atomics, multithreading, spinlocks
Ticket Maps
Saturday, 20 March 2021
An increasingly common scenario that I've encountered is one where you have some ID that's monotonically increasing, such as a subscription or connection index, or user ID, and you need your C++ program to hold some data that's associated with that ID value. The program can then pass the ID around, and use it to access the associated data at a later point.
Over time, IDs can become invalidated, so the data associated with that value is removed, and you end up with a sparse set of currently-active IDs. You would therefore naturally lean towards using a map (whether a std::map, std::unordered_map or some other custom map) to associate the data with the ID.
Often such maps are implemented as node-based containers, which means that the nodes can be allocated at disparate memory addresses, which is bad for cache performance. Adding and removing nodes also always requires memory allocation/deallocation.
In his "Better Code: Relationships" presentation, Sean Parent describes an alternative implementation, which he calls the "Russian Coat-Check Algorithm". In this algorithm, the map is implemented as a vector of pairs of key/optional data. Because the keys come from a monotonically increasing index, the vector is always sorted, and inserts are always at the end. Entries can be removed by clearing the data, and if there are too many empty entries then the vector can be compacted. Lookups are always fast, because the vector is always sorted, so a simple binary search will find the right element.
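The algorithm as described can be sketched roughly as follows. This is an illustration only: the coat_check class, its member names and the "more than half empty" compaction threshold are assumptions made for this sketch, and it is not the code from Sean's presentation or from any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a vector of (ticket, optional value) pairs.
template <typename Value>
class coat_check {
    using entry = std::pair<std::size_t, std::optional<Value>>;
    std::vector<entry> entries;   // always sorted by ticket
    std::size_t next_ticket = 0;  // monotonically increasing key
    std::size_t live = 0;         // number of non-empty entries

    // Binary search for the first entry whose ticket is >= the one requested.
    typename std::vector<entry>::iterator locate(std::size_t ticket) {
        return std::lower_bound(
            entries.begin(), entries.end(), ticket,
            [](entry const& e, std::size_t t) { return e.first < t; });
    }

    // Drop all the emptied entries; the remaining ones stay sorted.
    void compact() {
        entries.erase(
            std::remove_if(entries.begin(), entries.end(),
                           [](entry const& e) { return !e.second; }),
            entries.end());
    }

public:
    // New tickets are larger than all existing ones, so append at the end.
    std::size_t insert(Value v) {
        entries.emplace_back(next_ticket, std::move(v));
        ++live;
        return next_ticket++;
    }

    // Returns a pointer to the value, or nullptr if the ticket is not active.
    Value* find(std::size_t ticket) {
        auto it = locate(ticket);
        if (it == entries.end() || it->first != ticket || !it->second)
            return nullptr;
        return &*it->second;
    }

    // Clears the data for the ticket; compacts once over half the slots are empty.
    bool erase(std::size_t ticket) {
        auto it = locate(ticket);
        if (it == entries.end() || it->first != ticket || !it->second)
            return false;
        it->second.reset();
        --live;
        if (entries.size() > 2 * live) compact();
        return true;
    }

    std::size_t slots() const { return entries.size(); }
};
```

For example, with a coat_check<std::string>, inserting three strings and erasing the first leaves three slots with one empty; erasing a second triggers compaction down to a single slot, and the remaining ticket still finds its value.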
Inspired by watching Sean's presentation at ACCU 2021 last week, I implemented what I call a Ticket Map (it maps a Ticket to a Value). This is an implementation of the algorithm Sean described, fleshed out to a full container. When you insert a value, it is assigned the next available ticket value. You can later access or erase the value using that ticket.
#include <string>
#include <iostream>
#include "ticket_map.hpp"

int main(){
    jss::ticket_map<int,std::string> map;

    auto ticket1=map.insert("hello");
    auto ticket2=map.insert("world");

    std::cout<<map[ticket1]<<" "<<map[ticket2]<<std::endl;

    map.erase(ticket1);
}
You can of course iterate through the container: in this case the iterators are Input Iterators, where the value_type is a std::pair<Ticket const&,Value&>. This allows you to access both the tickets and the raw elements, but also allows the iterator to provide a nice view over the data without exposing the std::optional implementation detail.
#include <string>
#include <iostream>
#include "ticket_map.hpp"

int main(){
    jss::ticket_map<int,std::string> map;

    auto ticket1=map.insert("hello");
    auto ticket2=map.insert("world");
    auto ticket3=map.insert("goodbye");

    for(auto& [ticket,value]: map){
        std::cout<<ticket<<": "<<value<<std::endl;
    }
}
The code is available on GitHub under the Boost Software License.
Enjoy!
Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: cplusplus, maps, containers
Design and Content Copyright © 2005-2025 Just Software Solutions Ltd. All rights reserved. | Privacy Policy