Thread confinement in Java
As Java developers, probably all of us have written a single-threaded program at some point. Such a program is much easier to reason about than a multi-threaded one, because with only one thread we never have to worry about shared resources. However, Java has built-in support for multithreading, and computers keep gaining transistors and CPU cores year after year, so making programs multi-threaded is often essential. The most obvious reason is to take full advantage of the available hardware.
But writing a correct, thread-safe program is not a trivial task. We need to take extra care to make our code work correctly in a multi-threaded environment, and it helps to know the different ways to get there. Two of the most common ways to obtain a thread-safe class are synchronization (guarding the state shared between threads with an intrinsic lock or a private lock) and immutability (an immutable object cannot be modified after it is created, so it is inherently thread-safe). In this article, I want to introduce yet another technique for achieving thread safety: thread confinement.
We know that accessing shared, mutable data from multiple threads requires some level of synchronization. One way to avoid that is simply not to share. The basic idea of thread confinement is that if data is only ever accessed by a single thread, no synchronization is needed. There are several ways to achieve thread confinement, so let's explore them.
Ad-hoc thread confinement
The first type is ad-hoc thread confinement, where the responsibility for maintaining confinement falls entirely on the implementation. In other words, it is your job to document the code so that an object is only used from a single thread. The correctness of the program then rests on human discipline rather than on any language-level feature, which is very fragile and should be avoided in most cases. Here is a simple example:
public class ThreadConfinement {
    // Only use this object in thread X, never from any other thread.
    private Point point;

    public Point getPoint() {
        return point;
    }
}

class Point {
    int x;
    int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
You document your code like this and hope other developers will use your object properly. But again, nothing stops them from violating the rules you described. There are better techniques that use language features to enforce thread confinement (described in later sections).
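One way to make a documented rule fail fast, instead of relying on comments alone, is to record the owning thread and check it on every access. The sketch below is only an illustration: `OwnedPoint`, `checkOwner`, and `OwnedPointDemo` are hypothetical names that do not appear in the original example, and the check detects misuse at run time rather than preventing it at compile time.

```java
// Hypothetical sketch: a point that remembers the thread that created it
// and rejects access from any other thread.
final class OwnedPoint {
    private final Thread owner = Thread.currentThread(); // the confining thread
    private int x;
    private int y;

    OwnedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Throws if the caller is not the owning thread.
    private void checkOwner() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("OwnedPoint is confined to " + owner.getName());
        }
    }

    int getX() { checkOwner(); return x; }
    int getY() { checkOwner(); return y; }

    void move(int dx, int dy) {
        checkOwner();
        x += dx;
        y += dy;
    }
}

class OwnedPointDemo {
    // Returns true if a second thread was rejected when it touched the point.
    static boolean foreignThreadIsRejected() {
        OwnedPoint point = new OwnedPoint(1, 2); // owned by the calling thread
        point.move(1, 1);                        // allowed: same thread
        boolean[] rejected = {false};
        Thread other = new Thread(() -> {
            try {
                point.getX();
            } catch (IllegalStateException expected) {
                rejected[0] = true;
            }
        });
        other.start();
        try {
            other.join(); // join makes the write to rejected[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        System.out.println("Foreign thread rejected: " + foreignThreadIsRejected());
    }
}
```

The guard costs one comparison per call and still depends on every accessor remembering to call it, so it narrows the gap left by documentation but does not close it.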
One special case of thread confinement involves the volatile keyword. It carries memory-visibility semantics: when a variable is declared volatile, any thread that reads it is guaranteed to see the most recently written value. Without volatile, a thread may keep working with a copy of the value held in a register or a CPU cache. Because a machine can have multiple CPUs, each with its own cache, different versions of the variable can exist at the same time, and the value one thread sees may not reflect what other threads see.
The volatile keyword addresses this visibility problem: a write to a volatile variable is guaranteed to be visible to every later read of it, as if all reads and writes went straight to main memory rather than to a CPU cache. It does not make compound operations atomic, however. It is safe to perform read-modify-write operations on a volatile variable only as long as you can ensure that it is written by a single thread. Any number of threads can read it, since the visibility guarantee ensures they see the most up-to-date value. Here is another example of ad-hoc thread confinement:
class Counter {
    // getCount() may be called from multiple threads,
    // but only one thread should ever call inc().
    private volatile int count = 0;

    public int getCount() {
        return count;
    }

    public void inc() {
        count++; // read-modify-write: safe only with a single writer
    }
}
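To see the single-writer rule at work, the sketch below runs one writer thread that increments a volatile counter while a reader thread polls it. `SingleWriterCounter`, `SingleWriterDemo`, and `runSingleWriter` are illustrative names, not part of the original example. With a single writer, the reader eventually observes exactly the number of increments performed. With several writers, `count++` could lose updates, because the read, the add, and the write are three separate steps.

```java
class SingleWriterCounter {
    private volatile int count = 0;

    int getCount() { return count; }

    // Not atomic: safe only while exactly one thread calls it.
    void inc() { count++; }
}

class SingleWriterDemo {
    // Runs one writer and one reader; returns the value the reader finally saw.
    static int runSingleWriter(int increments) {
        SingleWriterCounter counter = new SingleWriterCounter();
        int[] lastSeen = {0};

        Thread writer = new Thread(() -> {
            for (int i = 0; i < increments; i++) {
                counter.inc();
            }
        }, "writer");

        Thread reader = new Thread(() -> {
            // volatile guarantees the reader eventually sees the writer's updates
            while (lastSeen[0] < increments) {
                lastSeen[0] = counter.getCount();
            }
        }, "reader");

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join(); // join makes lastSeen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lastSeen[0];
    }

    public static void main(String[] args) {
        System.out.println(runSingleWriter(100_000)); // prints 100000
    }
}
```

If more than one thread needs to increment, this pattern no longer applies and something like `java.util.concurrent.atomic.AtomicInteger` or a lock is needed instead.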
Stack confinement
The next type is stack confinement. The idea is that each thread has its own execution stack that holds its local variables and method calls. If an object is reachable only through local variables, and we never let a reference to it escape, then the object is confined to the executing thread: local variables exist only on that thread's stack and are not accessible from any other thread. Let's look at an example to make this more concrete:
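import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class ThreadConfinement {
    // Dictionary is assumed to be an application class whose getDictionary()
    // returns a SortedSet<String>; it is not java.util.Dictionary.
    public long countWordStartWithVowel(Dictionary dictionary) {
        Set<Character> vowels = Set.of('a', 'e', 'u', 'i', 'o');
        SortedSet<String> words = dictionary.getDictionary();
        // Copy into a local set that never leaves this method (stack confined).
        TreeSet<String> localSortedWords = new TreeSet<>();
        localSortedWords.addAll(words);
        return localSortedWords.stream()
                .filter(word -> !word.isEmpty() && vowels.contains(word.charAt(0)))
                .count();
    }
}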
In the countWordStartWithVowel method, we simply take a dictionary and count the words in it that start with a vowel. Notice that instead of using the set returned from dictionary.getDictionary() directly, we create a local variable localSortedWords and copy all of the words in the dictionary into it. Because localSortedWords is a local variable that exists only on the executing thread's stack and cannot be referenced from any other thread, we say that it is stack confined. The countWordStartWithVowel method is therefore thread-safe (assuming the dictionary itself is not being modified while we copy it), and it stays that way as long as we don't violate the stack confinement by publishing localSortedWords to the outside world and letting it escape. If the method uses primitive local variables, we cannot violate stack confinement even if we try, because there is no way to obtain a reference to a primitive.
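The same idea can be tried out in a self-contained form. The sketch below is a simplified variant, not the article's exact class: `VowelCounter`, `countWordsStartingWithVowel`, and `countFromThreads` are hypothetical names, and the source set is assumed not to change while the threads run. Each thread works on its own local copy and its own primitive counter, so all threads arrive at the same answer without any locking.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class VowelCounter {
    static long countWordsStartingWithVowel(SortedSet<String> source) {
        Set<Character> vowels = Set.of('a', 'e', 'i', 'o', 'u');
        TreeSet<String> localWords = new TreeSet<>(source); // local copy, never published
        long count = 0;                                      // primitive local, cannot escape
        for (String word : localWords) {
            if (!word.isEmpty() && vowels.contains(word.charAt(0))) {
                count++;
            }
        }
        return count;
    }

    // Calls the method from several threads at once and collects each result.
    static long[] countFromThreads(SortedSet<String> source, int threads) {
        long[] results = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i; // each thread writes only its own slot
            workers[i] = new Thread(() -> results[slot] = countWordsStartingWithVowel(source));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        SortedSet<String> words =
                new TreeSet<>(List.of("apple", "banana", "egg", "kiwi", "orange"));
        for (long n : countFromThreads(words, 4)) {
            System.out.println(n); // prints 3 for every thread
        }
    }
}
```

Confinement would be lost if `localWords` were stored in a field or handed to another thread, because it would then no longer be reachable only from this method's stack frame.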
But what if my Dictionary is already thread-safe? Do I still need to enforce stack confinement in the method that uses it? The point of stack confinement is to make the method itself behave correctly in a multi-threaded environment. If that invariant already holds, confining the variables to the stack is often unnecessary. Copying a dictionary with a small number of words costs almost nothing, but that is far from the real world, where a dictionary can contain millions of entries and the copy becomes expensive.
ThreadLocal
The final thread confinement technique I want to introduce in this article is the ThreadLocal class. This class lets you associate a per-thread value with a value-holding object. In other words, it allows you to store a different object for each thread and keeps track of which object belongs to which thread. It provides the accessor methods get and set, which maintain a separate copy of the value for each thread that uses it. When you call get, it returns the value most recently passed to set from the currently executing thread. Conceptually, you can think of ThreadLocal<T> as a Map<Thread, T> that maps each Thread to its corresponding T value, although that is not how it is actually implemented under the hood. Here is a simple example of ThreadLocal:
import java.util.ArrayList;
import java.util.List;

class ThreadConfinement {
    public static void main(String[] args) {
        ThreadLocal<Author> threadLocal = new ThreadLocal<>();

        Runnable authorWorker1 = () -> {
            Article article1 = Article.of("Java concurrency", "Thread confinement in Java...");
            Article article2 = Article.of("Regex", "Let's have a regex tutorial...");
            List<Article> articles = new ArrayList<>();
            articles.add(article1);
            articles.add(article2);
            Author authorA = Author.of("Nam V. Do", articles);
            threadLocal.set(authorA);
            System.out.println(threadLocal.get());
        };

        Runnable authorWorker2 = () -> {
            // Nothing has been set on this thread yet, so get() returns null.
            System.out.println("What do we get here: " + threadLocal.get());
            Article article1 = Article.of("Java Concurrency High Level APIs", "ExecutorService, CompletableFuture, ForkJoinPool, some others...");
            Article article2 = Article.of("Clone vs Copy Constructor", "We prefer the Copy Constructor to Clone for so many reasons...");
            List<Article> articles = new ArrayList<>();
            articles.add(article1);
            articles.add(article2);
            Author authorB = Author.of("Anonymous", articles);
            threadLocal.set(authorB);
            System.out.println("After setting author in the current thread for `threadLocal`\n" + threadLocal.get());
            System.out.println("Change the author's name, remove the first article and add a new one...");
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
            // Only this thread's author is modified; the other thread's value is untouched.
            authorB = threadLocal.get();
            authorB.name = "Rick Sanchez";
            authorB.articles.remove(0);
            Article newArticle = Article.of("IO programming in Java", "Buffered vs Non-buffered IO characteristic...");
            authorB.articles.add(newArticle);
            System.out.println(threadLocal.get());
        };

        Thread thread1 = new Thread(authorWorker1);
        Thread thread2 = new Thread(authorWorker2);
        thread1.start();
        thread2.start();
    }
}
class Author {
    String name;
    final List<Article> articles;

    public static Author of(String name, List<Article> articles) {
        return new Author(name, articles);
    }

    private Author(String name, List<Article> articles) {
        this.name = name;
        this.articles = articles;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("Author: ").append(name).append("\n");
        sb.append("Articles:\n");
        for (Article article : articles) {
            sb.append("Title: ").append(article.title)
                    .append(" - ").append(article.content.substring(0, Math.min(article.content.length(), 50)))
                    .append("\n");
        }
        return sb.toString();
    }
}
class Article {
    final String title;
    final String content;

    public static Article of(String title, String content) {
        return new Article(title, content);
    }

    private Article(String title, String content) {
        this.title = title;
        this.content = content;
    }
}
In the code example above, we create a ThreadLocal object parameterized with the Author type. Each author has a name and a list of articles. I created two worker threads, each of which creates a new author and stores it in the threadLocal object. Here is one possible output (keep in mind that the order of the output on your machine might look slightly different from mine):
Author: Nam V. Do
Articles:
Title: Java concurrency - Thread confinement in Java...
Title: Regex - Let's have a regex tutorial...
What do we get here: null
After setting author in the current thread for `threadLocal`
Author: Anonymous
Articles:
Title: Java Concurrency High Level APIs - ExecutorService, CompletableFuture, ForkJoinPool,
Title: Clone vs Copy Constructor - We prefer the Copy Constructor to Clone for so man
Change the author's name, remove the first article and add a new one...
Author: Rick Sanchez
Articles:
Title: Clone vs Copy Constructor - We prefer the Copy Constructor to Clone for so man
Title: IO programming in Java - Buffered vs Non-buffered IO characteristic...
Each worker runs on a different thread. In authorWorker1 we create an author, authorA, and store it in threadLocal. Let's presume the thread that runs authorWorker1 is Thread-A. Our threadLocal object now has one entry that associates Thread-A with authorA.
Notice that authorWorker2 runs on a different thread (let's say Thread-B) from authorWorker1. On the very first line of its run() method, we try to get the value from the ThreadLocal object. Because the current thread, Thread-B, hasn't set any value yet, it returns null. Then we store a new author, authorB, in threadLocal on Thread-B. Now there is a value associated with Thread-B in the ThreadLocal instance, so when we call threadLocal.get(), it returns the author instance associated with the current thread, Thread-B. Later on, we let the thread sleep for a while and then modify authorB on Thread-B. Keep in mind that authorA is still intact: it is associated with a different thread and is not affected by anything we do on Thread-B.
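A ThreadLocal returns null until the current thread calls set. When every thread should simply start with its own fresh value, `ThreadLocal.withInitial` can create it lazily on the first get. The sketch below assumes a hypothetical `PerThreadBuffer` class (with `repeat` and `buffersSeenBy` as illustrative helpers) that gives each thread a private `StringBuilder`.

```java
class PerThreadBuffer {
    // Each thread that calls BUFFER.get() receives its own StringBuilder.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    // Builds a string using the calling thread's private buffer.
    static String repeat(String text, int times) {
        StringBuilder sb = BUFFER.get();
        sb.setLength(0); // reuse this thread's buffer from a clean state
        for (int i = 0; i < times; i++) {
            sb.append(text);
        }
        return sb.toString();
    }

    // Starts the given number of threads and records the buffer each one received.
    static StringBuilder[] buffersSeenBy(int threads) {
        StringBuilder[] seen = new StringBuilder[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                seen[slot] = BUFFER.get(); // created on first access in this thread
                BUFFER.remove();           // optional cleanup, useful when threads are pooled
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        StringBuilder[] seen = buffersSeenBy(2);
        System.out.println(seen[0] != seen[1]); // prints true: one buffer per thread
        System.out.println(repeat("ab", 3));    // prints ababab
    }
}
```

Calling `remove()` matters mostly with thread pools, where a thread outlives the task that set the value and would otherwise carry it into the next task.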
One more note: the Article class is thread-safe because everything inside it is immutable, so there is not much to worry about there. The Author class, however, is not thread-safe. Even though articles is a final field, the list it refers to can still be modified, and ArrayList is not thread-safe (name is a plain mutable field as well). In fact, most of the general-purpose collection classes in the java.util package are not thread-safe; the legacy Hashtable and Vector are the well-known synchronized exceptions. So keep in mind to use the proper techniques when working with Java concurrency.
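As one illustration of such a technique, for the case where a list really does have to be shared between threads, the sketch below wraps an ArrayList with `Collections.synchronizedList`. `SharedArticles` and `addFromThreads` are hypothetical names used only for this example, and the titles are made-up strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedArticles {
    // Lets several threads append to one shared list and returns its final size.
    static int addFromThreads(int threads, int perThread) {
        List<String> titles = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    titles.add("article-" + id + "-" + j); // each add is synchronized
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return titles.size();
    }

    public static void main(String[] args) {
        System.out.println(addFromThreads(4, 1000)); // prints 4000
    }
}
```

The wrapper only makes individual calls atomic. Iterating over the list still requires holding the list's lock manually, for example with `synchronized (titles) { ... }` around the loop.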