Sunday, December 14, 2008

Date-Race-Ful Lazy Initialization for Performance

I was asked a question about benign data races in Java this week, so I thought I would take the opportunity to discuss one of the (only) approved patterns for benign races. So, at the risk of encouraging bad behavior (don't use data races in your code!), I will discuss the canonical example of "benign races for performance improvement". Also, I'll put in another plug for Josh Bloch's new revision of Effective Java, which I continue to recommend.

As a reminder: a data race occurs when you have one or more writes, and potentially some reads, all to the same memory location; they can happen at the same time; and there is nothing in the program to prevent it. This is different from a race condition, which is when you simply don't know the order in which two actions will occur. I've put a more careful definition of what a data race actually is at the bottom of this post.

A lot of people think that it is okay to have data races all over your code. By and large, that is a lot more dangerous than those people think it is. For example, the version of double-checked locking that doesn't use volatile has a data race in it. As I've discussed before, that version is broken.

So, having said that, when can you have a harmless data race in your code? Here's some code from Sun's implementation of java.lang.String, which is available under the GPL:

public final class String {
    /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    public int hashCode() {
        int h = hash;
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

    // ...
}
The principle here is pretty simple. The program calls hashCode(). If it hasn't calculated the hash code yet, then it does, and then stores the value for future use. The people who wrote String didn't bother to calculate the hash code in the constructor, because that would make construction (which is done all the time) more expensive. Because Strings are immutable, we don't need to worry about the hash code potentially changing in the future.

Do you see the potential data race? Here's some code that triggers it:

class BenignDataRace {
    final String s = "blah";

    public static void main(String[] args) {
        final BenignDataRace obj = new BenignDataRace();
        class CalcHash implements Runnable {
            public void run() {
                int h = obj.s.hashCode();
            }
        }
        Thread t = new Thread(new CalcHash());
        Thread t2 = new Thread(new CalcHash());

        t.start();
        t2.start();
    }
}
Two threads access s.hash without doing synchronization, and at least one of those threads is going to write to it. Item 71 in Effective Java calls this idiom racy single-check. In this case, though, it actually works. Why is this safe, when the double-checked locking example, which uses synchronization, isn't?

There are basically two parts to the answer, and they are both pretty simple:

  1. In this case, we don't care how many times we actually calculate the answer and assign it to the hash field. It just isn't a big deal if we happen to calculate it multiple times. Furthermore,

  2. Java has a special guarantee for fields that are 32 bits wide or narrower, and for object references — a thread will always see some value that was actually assigned to that field. As a result, a thread reading hash will see either 0 or the calculated value. Edited to add: I just noticed that this was misleading. An object reference will point to the right object, but the contents of the object aren't guaranteed to be correct unless the object is immutable. Please see my series on immutability.

That second item might sound obvious, but bear in mind that 64-bit values don't get that guarantee; this code would not work if hash were a long or a double.
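To make that concrete, here is a sketch (the LongString class is hypothetical) of why the same trick is broken for a 64-bit hash:

class LongString {
    private long hash; // 64-bit field: racy reads get no atomicity guarantee

    public long longHashCode() {
        long h = hash;
        if (h == 0) {
            h = computeHash();
            hash = h; // a racing reader may see only half of this write:
                      // say, the low 32 bits of h with the high 32 bits
                      // still 0 -- a value that no thread ever assigned
        }
        return h;
    }

    private long computeHash() {
        return 0x123456789ABCDEFL; // stand-in for a real computation
    }
}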

Now, let's break the code:

public int hashCode() {
    if (hash == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        int h = 0;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return hash;
}
All I did was make it so that the temporary variable was only used for calculating the hash code! How could that break the code?

What I've done here is to add an additional read: the second read of hash, before the return. As odd as it sounds, and as unlikely as it is to happen, the first read can return the correctly computed hash value, and the second read can return 0! This is allowed under the memory model because the model allows extensive reordering of operations. The second read can actually be moved, in your code, so that your processor does it before the first!
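To see how, here is a sketch of one transformation the memory model permits (this is my illustration of a legal reordering, not something a particular JIT is known to emit):

public int hashCode() {
    int result = hash;  // the read for "return hash", hoisted to the top
    if (hash == 0) {    // the guard is now the *second* read; it can see a
                        // racing write that the hoisted read missed
        int off = offset;
        char val[] = value;
        int len = count;

        int h = 0;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
        result = h;
    }
    return result;      // if the guard saw another thread's write, but the
                        // hoisted read saw 0, this returns 0
}

Single-threaded, this behaves identically to the original, which is why the transformation is allowed.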

(Side note: the first version is better anyway, because it performs fewer memory reads. If you really care that much about performance, that's the way to go.)

What's the moral here? The moral is that this stuff is very confusing, and you have to work very hard to get it right. Make sure you understand exactly what your code does. Avoid data races wherever possible.




Here are the details on data races that I promised above:

A program has a data race in it when it is possible to have an execution of that program where:

  1. All of the instructions execute in the order in which they appear in the program (this is important for technical reasons),

  2. There are two accesses to the same field / array element,

  3. At least one of those accesses is a write, and

  4. There is no happens-before ordering between those actions.


I defined "happens-before ordering" in my last post. The absence of such an ordering between two actions means that there is nothing establishing an order between them; for example, there is no locking protecting the accesses, no intervening volatile writes and reads, and no thread starts and joins.

Saturday, November 15, 2008

What Volatile Means in Java

Today, I'm going to talk about what volatile means in Java. I've sort-of covered this in other posts, such as my posting on the ++ operator, my post on double-checked locking and the like, but I've never really addressed it directly.

First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way:

  • Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one).

  • You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system.

  • Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in which they get seen.

The Java volatile modifier is an example of a special mechanism to guarantee that communication happens between threads. When one thread writes to a volatile variable, and another thread sees that write, the first thread is telling the second about all of the contents of memory up until it performed the write to that volatile variable.

At this point, I usually rely on a visual aid, which we call the "two cones" diagram, but which my officemate insists on calling the "two trapezoids" diagram, because he is picky. ready is a volatile boolean variable initialized to false, and answer is a non-volatile int variable initialized to 0.
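Since the diagram won't reproduce here, here is the code it depicts (a sketch; the writer and reader method names are mine):

class TwoCones {
    volatile boolean ready = false;
    int answer = 0;

    // Thread 1: the sender side
    void writer() {
        answer = 42;
        ready = true;    // volatile write
    }

    // Thread 2: the receiver side
    void reader() {
        if (ready) {     // volatile read that sees the write
            System.out.println(answer);
        }
    }
}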




The first thread writes to ready, which is going to be the sender side of the communications. The second thread reads from ready and sees the value the first thread wrote to it. It therefore becomes a receiver. Because this communication occurs, all of the memory contents seen by Thread 1, before it wrote to ready, must be visible to Thread 2, after it reads the value true for ready.

This guarantees that Thread 2 will print "42", if it prints anything at all.

If ready were not volatile, what would happen? Well, there wouldn't be anything explicitly communicating the values known by Thread 1 to Thread 2. As I pointed out before, the value written to the (now non-volatile) ready could "leak through" to Thread 2, so Thread 2 might see ready as true. However, the value for answer might not leak through. If the value for ready does leak through, and the value for answer doesn't leak through, then this execution will print out 0.

We call the communications points "happens-before" relationships, in the language of the Java memory model.

(Minor niggle: The read of ready doesn't just ensure that Thread 2 sees the contents of memory of Thread 1 up until it wrote to ready, it also ensures that Thread 2 sees the contents of memory of any other thread that wrote to ready up until that point.)




With this in mind, let's look at the Double-Checked Locking example again. To refresh your memory, it goes like this:

class Foo {
    private volatile Helper helper = null;
    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null) {
                    helper = new Helper();
                }
            }
        }
        return helper;
    }
}
The object of the double-checked locking pattern is to avoid synchronization when reading a lazily constructed singleton that is shared between threads. If you have already constructed the object, the helper field will not be null, so you won't have to perform the synchronization.

However, this is only part of the solution. If one thread creates the object, it has to communicate the contents of its memory to another thread. Otherwise, the object will just sit in the first thread's memory. How do we communicate the contents of memory to another thread? Well, we can use volatile variables. That's why helper has to be volatile -- so that other threads see the fully constructed object.

Locking in Java also forms these "happens-before" communication points. An unlock is the sender side, and a lock on the same object is the receiver side. The reason that doesn't work for (non-volatile) double-checked locking is that only the writing thread ever performs the locking. The whole point of the idiom is that the reader side doesn't do the locking. Without the explicit communication in the form of the volatile variable, the reading thread is never guaranteed to see the update performed by the writer thread.
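In code form, the lock version of the send/receive looks like this (a sketch; the Box class is a made-up name):

class Box {
    private final Object lock = new Object();
    private int contents;

    void write(int v) {
        synchronized (lock) { // the unlock at the end is the "send"
            contents = v;
        }
    }

    int read() {
        synchronized (lock) { // acquiring the same lock is the "receive"
            return contents;
        }
    }
}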

Let me know if that makes sense.

Saturday, November 8, 2008

G1 Garbage Collector in Latest OpenJDK Drop

ETA: The G1 collector is now in the latest JDK6 release, which you can download from Sun here. You can enable it for testing with the flags -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC. Original post follows.

For those of you who are interested in tinkering with new garbage collectors, the "Garbage-first" G1 garbage collector is in the new JDK7 drop from OpenJDK. I haven't tried it yet, but I am looking forward to seeing what it does.

(You will notice the link doesn't say anything about the new GC. Here's the appropriate bug.)

G1 is supposed to provide a dramatic improvement on existing GCs. There was a rather good talk about it at this year's JavaOne. It allows the user to provide pause time goals, both in terms of actual seconds and in terms of percentage of runtime.

The principle is simple: the collector splits the heap up into fixed-size regions and tracks the live data in those regions. It keeps a set of pointers — the "remembered set" — that records references into each region from elsewhere in the heap. When a GC is deemed necessary, it collects the regions with the least live data first (hence, "garbage first"). Often, this can mean collecting an entire region in one step: if the number of pointers into a region is zero, then it doesn't need to do a mark or sweep of that region.

For each region, it tracks various metrics that describe how long it will take to collect them. You can give it a soft real-time constraint about pause times, and it then tries to collect as much garbage as it can in that constrained time. This is rather nifty (although it does add yet another layer of complexity to managing your GC).
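For example, something like this should set a 50 millisecond pause-time goal (MaxGCPauseMillis is the documented pause-goal flag; check the flags available in your particular drop):

java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=50 MyApp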

Although this is likely not ready for prime time just yet, for the final version of JDK7, Sun is looking at making this the new "high performance" collector, replacing parallel newgen + CMS. Here is much more information about the collector.

Sunday, October 26, 2008

A Brief Digression into Politics

A plea for you to vote "No" on CA Proposition 8, and maintain the rights of same-sex couples in California.

I really hate using this blog as a political forum. I have a personal blog I use for that, read only by people who know me. This blog should be expressly for computer-related topics. I apologize to readers expecting me to talk about concurrency or APIs or anything else. I apologize to readers who don't like to see discussion of this issue, or to those who have a problem with the way I handle it.

Today's post is just for the California voters. On November 4th, we are going to be asked to vote on Proposition 8, which is only 14 words long: "Only marriage between a man and a woman is valid or recognized in California." Please vote no on this proposition, and keep California safe for families of all kinds.

A lot of you might wonder why a heterosexual who only blogs about software engineering and computer science might feel compelled to write about this topic. The answer is simple: I see it as one of the most important American civil rights issues of the early part of the 21st century (there are obviously much more serious issues in other countries).

I know lesbians and gays with children. Why should they have to explain to their children that their friends' mommies and daddies can be married, but that their mommies and daddies can't?

I know lesbians and gays who have been together for decades: through pain, and joy, and hardship, and victory. I've seen the love in their eyes when they were finally able to say those vows to each other and have it mean something in the eyes of society. They have just as much right to that as I do. Don't take that away from them.

I wonder what will happen when my friends in same-sex couples get sick when they are in a state with the wrong kind of law, and their partners can't visit them, or make life-or-death decisions for them. I wonder what happens when two people work all of their lives to build something together, and, when one of them dies, the state comes and takes half of it away because they didn't happen to be in a relationship that fits everyone's idea of what a relationship should be.

A lot of people say that civil unions provide the same thing without calling it marriage. This is simply not true. The issue is that civil unions will be treated differently by other states and by the Federal government, many of which don't recognize civil unions.

A lot of people are worried that churches will be forced to perform these ceremonies. This just clouds the issue. If you are worried that churches might be forced to perform weddings against their will, then lobby for a change to the constitution that says that churches can't be forced to perform marriages against their will. Don't side with the people — and we all know that they are out there, and how they will vote — who hate without reason, and take the right to be married away.

A lot of people are worried that their children will be exposed to homosexuality in the classroom. There is a law in our state that says that you can't teach children about those issues without their parents' consent. It's that simple.

A lot of people talk about how marriage is just for people who want to raise children. This is a smokescreen. Lots of same-sex couples raise children. Lots of heterosexual couples don't.

Don't destroy people's lives and families by trying to ban their marriages. Everyone is entitled to be treated equally under the law.

One last note: I know some people who feel that the government shouldn't be in the business of marriages at all. That's a completely separate issue, Libertarians.

We now return you to your regularly scheduled Java.

Sunday, August 24, 2008

Don't Use StringBuffer!

One thing I have noticed among Java API users is that some don't seem to know that there is a difference between StringBuffer and StringBuilder. There is: StringBuffer is synchronized, and StringBuilder isn't. (The same is true for Vector and ArrayList, as well as Hashtable and HashMap). StringBuilder avoids extra locking operations, and therefore you should use it where possible. That's not the meat of this post, though.

Some of those who know that StringBuffer is synchronized think that it has magical thread-safety properties. Here's a simplified version of some code I wrote recently:

final StringBuilder sb = new StringBuilder();
Runnable runner = new Runnable() {
    @Override public void run() {
        sb.append("1");
    }
};
Thread t = new Thread(runner);
t.start();
try { t.join(); } catch (InterruptedException e) {}
// use sb.
A fellow looked at this code, and said that he thought I should use StringBuffer, because it is safer with multiple threads. Well, no. Using StringBuffer here would actually be bad coding practice.

The issue here is that it isn't actually thread-safe to use StringBuffer. It is just pretend thread-safe. Yes, that's an official term. It is completely useless in practice to have multiple threads pounding on a StringBuffer, if at least one of them is mutating it, because you then can't make any guarantees about the StringBuffer's contents. Let's see what would happen if I added another thread to the example we were examining, and replaced the Builder with a Buffer:

Thread 1:
sb.append("a");

Thread 2:
sb.append("b");

Thread 3:
join 1,2
print(sb.toString());
Sure, it is "thread-safe", in the sense that there are no data races (which are basically concurrent accesses without adequate synchronization). But you have no idea what thread 3 will print: "ab" or "ba". I would have to introduce more synchronization to make this produce a sensible result. The locks that come with the StringBuffer haven't helped anything.

What I did in that original code was a perfectly legitimate handoff, and a widely used Java idiom. One thread hands off an object to another thread in a properly synchronized way, and then stops using it. The second thread uses it for a while, stops using it, and hands it back to the first.

Using Java memory model words: The initialization of the StringBuilder happens-before the spawn of the thread, which happens-before the append operation, which happens-before the join. Thread starts and joins induce happens-before edges, which is something that several people have managed to miss recently.

The additional locking provided by the StringBuffer class is completely superfluous: the guarantees are provided entirely by the Thread.start and Thread.join invocations. Using StringBuffer is even harmful, in the sense that you need to perform a whole bunch of (relatively) expensive synchronization, and prevent the compiler from aggressively optimizing the code.

To go back to the original point, I would even argue that using the StringBuffer class would be bad coding style: it would imply that that extra locking is providing something of benefit in the presence of multiple threads, which would be misleading.

As it turns out, I have yet to see a practical use case for the StringBuffer class over the StringBuilder class. A quick poll of Josh Bloch (who also works at Google) indicates that he has yet to see one, too. There are similar problems in some of the synchronized OutputStream classes in Java, too -- who writes to an OutputStream in multiple threads?

In short: a locked version of an API isn't automatically better than an unlocked one just because you happen to be using it in multiple threads.

I've often thought that a short discussion of this should be added to the StringBuffer JavaDoc, since everyone who maintains that JavaDoc believes this, too.


A final note. Some people think that there is no performance impact to using StringBuffer instead of StringBuilder, because of all of the clever things JVMs do to eliminate synchronization (I can blog about this, too, if you want). The reality is murkier. Whether the JVM can even perform these optimizing transformations depends wildly on which JDK you use; I wrote a microbenchmark to test it, and the results differed across JDKs -- but all that microbenchmarks really demonstrate is the performance of the microbenchmark.


ETA: One of the comments points out that javac often replaces Strings with StringBuilders when constructing them. This is correct; javac realizes that when you are performing changes to the object, it makes more sense to use a StringBuilder than a String. If you have code like this:
while (guard) {
    string += value;
}
You are almost always better off writing it like this:
StringBuilder stringBuilder = new StringBuilder();
while (guard) {
    stringBuilder.append(value);
}
string = stringBuilder.toString();
The first way will allocate a new String object on every iteration, and copy the characters of the previous String object. The second way will allocate a single array and append the characters to it. I have improved the performance of some pieces of code by several orders of magnitude by switching from the first form to the second. Don't use String concatenation in a loop!

Monday, August 18, 2008

Top 3 Reasons Why Constructors are Worthless

In my last post, I made an offhand reference to the fact that object constructors are worthless in Java. I was asked why this is, so I thought I'd fill in the details.

This topic is a little more well-understood by the developer community than some of the concurrency tidbits that I usually discuss. People who use dependency injection toolkits like Guice and Spring's inversion of control tend to have very strong feelings on the advanced suckitude of constructors. I know quite a few developers who claim that use of Guice changed the way they code (for the better).

ETA: I've had a few people tell me just to use SPI. The point of this post is not that DI is wonderful, it is that constructors are awful. SPI doesn't stop constructors from being awful. In fact, it is yet another horrible solution, what with all of those Foo and FooImpl classes. If you like SPI, go ahead and use it. Furthermore, neither DI nor SPI addresses two of the three issues I mentioned.

So, without further ado:
  1. They are the one kind of instance method that can't be overridden. This is the major pain point for most people using constructors. For testing purposes, programmers like to create mock objects, which duplicate the functionality of the objects you are not testing. They are usually created because the objects they represent aren't easy to construct properly in a unit test, either because they are non-deterministic, or require interaction with a client, or are simply too complex to create easily.

    One of the major difficulties of creating mock instances in Java is that the way we create objects is to call the constructor:
    GiantNetworkedDatabaseConnector connector =
        new GiantNetworkedDatabaseConnector(); // connects to the giant networked database.

    You don't really want to connect to the giant networked database when you are unit testing your code. The typical approach is to abstract the construction into a factory method:
    GiantNetworkedDatabaseConnector connector =
        GiantNetworkedDatabaseConnector.newConnector();
    The newConnector code returns an object. But the method is a static one, which means that you can't replace it at runtime, either. You actually need a static field to hold the factory that makes the connector:
    GiantNetworkedDatabaseConnector connector =
        GiantNetworkedDatabaseConnector.getFactory().newConnector();
    and then you need to set the factory in some other part of the program:
    GiantNetworkedDatabaseConnector.setFactory(mockFactory);
    and then you need to hope and pray that there aren't multiple threads trying to inject their own factories.

    Getting around this is, of course, the stock-in-trade of the dependency injection frameworks mentioned above. Guice and Spring tend to do what they do very well. It is just completely awful that they have to do it.

    In short: the constructor is just like a static method. You can't override it properly. You can't test it properly. Constructors bring pain.

  2. All immutable state must be set up front. I spent the last few blog entries describing how and why immutable state is good. But all immutable state must be set in the constructor. The degree to which this is terrible doesn't necessarily occur to people who are used to programming in Java. You just have to stop, take a step back, and realize that having separate StringBuilder and String classes is violently counterintuitive.

    Why do these two classes exist? Because there are actually two parts to String creation:

    1. Generate the String, and

    2. Publish it.

    The StringBuilder class exists solely because there is no way to have a clean break between step 1 and step 2. You need a way to say "I will be constructing this object for a while, and then I will be done." For those familiar with Ruby, you are probably thinking of the "freeze" operation at this point, which makes any mutable object into an immutable one. That is closer to what I am imagining, but the hard part of that is enforcing the fact that you call it.

    This problem isn't limited to Strings, of course. A minute's hard thought will make you think of a dozen other possible uses. Code bases tend to have lots of other Builder classes; I probably write a couple a month. (There is a minimal sketch of the idiom after this list.)

    In short: immutable objects can't be constructed properly when you can't just shove all of the built state into the constructor. Constructors bring pain.

  3. They make custom serialization and deserialization more painful. I'm not sure that I can do this topic justice in a single blog post. For a good overview of why serialization is awful, go purchase and read Effective Java.

    A brief precis: when you do deserialization, you need an instance of the object so that you can deserialize into it. The way this works is that the default constructor (the one with no arguments) is invoked for the first non-serializable class in the hierarchy. This means that (ETA: I phrased this incorrectly at first, so I'm editing it) a) you need a visible no-arg constructor there if something ever might get serialized (which is hard to predict), and b) that constructor might get called.

    That's just for the default mechanism, of course. For anything that requires custom deserialization (i.e., anything non-trivial), you have to use the magic readObject method, which is not a constructor. That method, of course, doesn't have the semantics of constructors, so it can't set final fields (as discussed in my post on deserialization and reflection).

    Serialization is, of course, in general, completely awful. It is no surprise that people turned to XML. At Google, we use a more compact system called protocol buffers, which are still rather hacktackular, but definitely better than serialization. Serialization is awful, and constructors make it worse. Constructors bring pain.
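Here is the minimal sketch of the Builder idiom promised in item 2 (the Point class and its fields are hypothetical names; this is the general shape of the idiom, not code from any particular library):

public final class Point {
    private final int x;
    private final int y;

    private Point(Builder b) { // all immutable state is set here, once
        this.x = b.x;
        this.y = b.y;
    }

    public static final class Builder {
        private int x;
        private int y;

        public Builder x(int x) { this.x = x; return this; }
        public Builder y(int y) { this.y = y; return this; }

        // build() acts as the "freeze": after it returns, the Point is
        // immutable, and its final fields get the thread-safety guarantee
        // discussed in my immutability posts.
        public Point build() { return new Point(this); }
    }
}

Usage: Point p = new Point.Builder().x(1).y(2).build();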

At one level removed, a big part of the problem is that object construction and initialization are tied together. Imagine if you could construct an object with all of its fields set to null or zero, and then call as many initializer methods as you wanted:
  1. Client code could create the object and could override the initializer methods, thereby neatly avoiding the problem of not being able to override the initializer. Even better, you could specify the type you wanted created at runtime:
    GiantNetworkedDatabaseConnector buildConnector(
            Class<? extends GiantNetworkedDatabaseConnector> c) {
        // This is impossible with standard Java bytecode.
        return new c.init();
    }

    // MockGiantNetworkedDatabaseConnector overrides the init() method
    GiantNetworkedDatabaseConnector c =
        buildConnector(MockGiantNetworkedDatabaseConnector.class);
  2. You could call initializer methods, and have them set immutable state (possibly using the notion of freeze to make an object immutable):
    String s = new String.append("foo").append("bar");
    assert s.equals("foobar");
    freeze s;

  3. You wouldn't have to worry about serialization calling the no-argument constructor, because the no-argument constructor wouldn't do anything. readObject could have initializer semantics, and be able to set final fields.
The point isn't really the solution: this solution is broken because the object doesn't always maintain its invariants (until the init() methods are done being called). You can continue to brainstorm to fix this, and come up with as many straw man approaches as you want, but the fact remains that Java programmers are still going to be stuck with constructors.

There isn't much of a moral to this story, other than that constructors bring pain.

Monday, July 21, 2008

Immutability in Java, Part 3: Deserialization and Reflection

This time, I'll talk about deserialization, immutability and final fields. I'll try to be a little shorter.

I spent the last two posts on immutability (here and here) talking about what it takes to make an object's fields immutable. This advice mostly boiled down to "it has to be final and set before the constructor ends".

When you have deserialization, though, the fields can't be set before the constructor ends. A deserialized object is constructed by invoking the no-argument constructor of its first non-serializable superclass, and the fields have to be filled in later. We gave deserialization special semantics, so that when an object is constructed this way, it behaves as if it had been constructed by an ordinary constructor. No problems there.

The problems come when and if you need to write custom deserialization (using, for example, the readObject method of ObjectInputStream). You can do this using reflection:
Class cl = ...; // get the class instance
Field f = cl.getDeclaredField("myFinalField");
f.setAccessible(true);
f.set(enclosingObject, newValue);
Note that you need setAccessible to change final fields using reflection.

ETA: Jared Levy points out that you should probably use Class.getDeclaredField if the field isn't public. The original code used Class.getField(). Thanks to him for pointing that out.

In the last post, I mentioned that something very like a "freeze" action happens at the end of the constructor; as long as your final fields (and everything reachable from them) are fully constructed by the end of that, your data are immutable. There is a similar "freeze" action after you modify final fields using reflection; as long as your final fields (and everything reachable from them) are fully constructed as of when you change them via reflection, your data are immutable. This means that if you have data you want your final field to point to, and you want other threads to see, they all have to be constructed by the time you set the final field. See the previous postings for examples of this.

The other restrictions on immutability apply, too. You can't let another thread see the object before the final field is set via reflection, for example. See the previous blog postings for more on this, too.

This is slightly different for static final fields, since they are often completely removed at javac compile time. You can't change static final fields after class initialization. Well, almost. You can change System.out, System.in and System.err using System.setOut(), System.setIn() and System.setErr(). We felt very dirty after writing these rules.

By the way, the general treatment of static final fields was a big mistake in the original spec. The compiler (javac) is forced to inline static final compile time constants everywhere it can. They can even be inlined into other classes. This means that you can compile a class A with a static final int x set to 1, compile another class B that prints A.x, recompile the first class with x set to 2, and then run the program and have class B print out "1" instead of "2". Weird and bogus. However, we are stuck with it.
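To make the scenario concrete (the class and field names are mine):

// A.java -- first compile: javac A.java B.java
class A {
    static final int x = 1;
}

// B.java
class B {
    public static void main(String[] args) {
        System.out.println(A.x); // javac bakes the constant 1 into B.class
    }
}

// Now change x to 2 and recompile only A (javac A.java).
// Running java B still prints 1: the old constant was inlined
// into B.class, and B was never recompiled.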

Monday, July 14, 2008

Immutability in Java, Part 2

I'd like to talk a little more about what it takes to ensure thread-safe immutability in Java, following on from a (semi)recent post I made on the subject.

The basic gist of that post was that if you make data immutable, then they can be shared between threads without additional synchronization. I call this "thread-safe immutability". In that post, I said this:

Now, in common parlance, immutability means "does not change". Immutability doesn't mean "does not change" in Java. It means "is transitively reachable from a final field, has not changed since the final field was set, and a reference to the object containing the final field did not escape the constructor".

I just wanted to go over these points again, because I get a lot of questions about them and there are a couple of things I glossed over. Let's take them one by one. Immutability means...
  1. the object is transitively reachable from a final field

    What this means is that you can reach the object or value by following references from the final field. Let's say we're making an immutable String class, like the one in the Java libraries. We might have something like this (notice I'm not checking for null pointers in the example code. Go ahead, make me care.):
    class String {
        private final byte[] bytes;
        public String(byte[] value) {
            byte[] newArray = new byte[value.length];
            System.arraycopy(value, 0, newArray, 0, value.length);
            bytes = newArray;
        }
    }
    The thing we are storing in our String is a final reference to the byte array, not the array itself. The contents of the byte array can be changed, according to the semantics of Java, but what is important, for the purposes of thread-safety, is that they are reachable by following a reference from the final field.

    ETA: So, apparently, I wasn't clear enough on this. Let's try again.

    Transitively reachable means that you can get from the final field to the data you want to be immutable by following zero or more references. So, if you had:
    class String {
        final byte[] bytes;
    }
    The array of bytes is reachable from the bytes field. If you have:
    class String {
        static class BytesHolder {
            byte[] bytes;
        }
        final BytesHolder holder;
    }
    The array of bytes is still transitively reachable from the holder field. In fact, you can add as many layers of indirection as you want — as long as you can get there by following references from the field, it is still transitively reachable from the field.

  2. the object has not changed since the final field was set

    This one is simple, right? Here's the String class again, but the bytes are no longer immutable:
    class String {
        private final byte[] bytes;
        public String(byte[] value) {
            byte[] newArray = new byte[value.length];
            System.arraycopy(value, 0, newArray, 0, value.length);
            bytes = newArray;
        }
        public void setZerothElt(byte b) {
            bytes[0] = b;
        }
    }
    Well, now, obviously that's not thread-safe immutable. If two threads call setZerothElt(), then all sorts of bad thread horseplay can happen.

    However, I did mislead you a little here. I said that, if you want the data to be thread-safe immutable, it should not be changed after the final field gets set. It turns out that the immutable data can be changed after the final field is set. It just can't be changed after the constructor ends:
    class String {
        // Absolutely fine.
        private final byte[] bytes;
        public String(byte[] value) {
            bytes = new byte[value.length];
            System.arraycopy(value, 0, bytes, 0, value.length);
        }
    }
    You can think of the end of the constructor as being something like a "freeze" operation. The data reachable by the object's final fields must be frozen by the end of the constructor. When figuring out how this should work, we made this decision because it is a big pain in the neck to try to get everything done by the time you actually set the final field, and besides, the semantics were simpler this way.

  3. A reference to the object containing the final field did not escape the constructor

    "Escaping" the constructor simply means that when you were inside the constructor, you stored a reference to the object under construction somewhere where another thread could get at it.

    In some ways, this is kind of a nasty restriction. There are all sorts of classes that let a reference to the object escape the constructor. Swing has lots of examples of this (I think — I don't feel like looking it up). Here's an example where our String class tries to keep track of all of its instances:
    class String {
        // Don't do this.
        static Set<String> allStrings =
            Collections.synchronizedSet(new HashSet<String>());
        private final byte[] bytes;
        public String(byte[] value) {
            allStrings.add(this);
            bytes = new byte[value.length];
            System.arraycopy(value, 0, bytes, 0, value.length);
        }
    }
    It's pretty clear that another thread could come along and read the object being constructed out of allStrings before it is done being constructed. So, there's no free ride there.

    But this one gets nastier, too! Let's take an example that looks correct at first blush, but isn't:
    class String {
        // Don't do this, either.
        static String lastConstructed;
        private final byte[] bytes;
        public String(byte[] value) {
            bytes = new byte[value.length];
            System.arraycopy(value, 0, bytes, 0, value.length);
            lastConstructed = this;
        }
    }
    It looks as if, when you read lastConstructed, it will always give you the last String instance that was constructed. Don't be fooled. Even if this does give you the last String instance that was constructed (and it may or may not, but that's beside the point), it is quite possible for the reader thread to see uninitialized data. The compiler can take the assignment to lastConstructed and move it to before the assignment to bytes. So, you have to avoid allowing references to objects to escape constructors. (To preempt questions — yes, this example would work if you made lastConstructed volatile.)
A couple more points, too:
  • The byte array has to be reachable from the final field by the end of the constructor. If I had the final field point to a container object, and then made the container object point to the array after the constructor finished, then I wouldn't be able to claim it was thread-safe.
    // The bytes reachable from bytesHolder *after* later()
    // is called are not thread-safe immutable.
    class String {
        static class BytesHolder {
            byte[] bytes;
        }
        final BytesHolder bytesHolder;
        public String() {
            bytesHolder = new BytesHolder();
        }
        void later(byte[] value) {
            byte[] newArray = new byte[value.length];
            System.arraycopy(value, 0, newArray, 0, value.length);
            bytesHolder.bytes = newArray;
        }
    }
    This, of course, violates the rule that the byte array is set before the constructor finishes — but even if you don't violate that rule, you don't get the thread-safety guarantee:
    // The bytes reachable from bytesHolder *after*
    // later() is called are still not thread-safe immutable.
    class String {
        static class BytesHolder {
            byte[] bytes;
        }
        final BytesHolder bytesHolder;
        byte[] bytes;
        public String(byte[] value) {
            bytesHolder = new BytesHolder();
            bytes = new byte[value.length];
            System.arraycopy(value, 0, bytes, 0, value.length);
        }
        void later() {
            bytesHolder.bytes = bytes;
        }
    }
  • The other point is that you can't just acquire these guarantees by waiting an inordinately long time. Here's another example of an instance escaping its constructor:
    class String {
        // Don't do this, either.
        static String firstInstance;
        private final byte[] bytes;
        public String(byte[] value) {
            bytes = new byte[value.length];
            System.arraycopy(value, 0, bytes, 0, value.length);
            if (firstInstance == null) {
                firstInstance = this;
            }
        }
        public static String getFirst() {
            String instance = firstInstance;
            if (instance == null) return null;
            // Wait a while for the constructor to end...
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {}
            return instance;
        }
    }
    (Obviously, your code wouldn't look like this; this is just an example.) The fact of the matter is that without the guarantees of final fields or some form of explicit synchronization, there is no guarantee that a thread that reads the reference that escaped the constructor will ever see the correctly constructed object. The failure can come about in various ways, all of which have to do with sneaky compiler tricks.
There is, of course, far more to say about immutability. At some point, I should talk about its relationship with serialization and reflection. And best practices for making immutable objects when values aren't available for the constructor. And best practices for making immutable data types (like Sets and Maps). I think I've done enough typing for one evening, though. :)

If you liked this post, read the followup: Immutability in Java, Part 3.

Friday, May 23, 2008

Double Checked Locking

I still get a lot of questions about whether double-checked locking works in Java, and I should probably post something to clear it up. And I'll plug Josh Bloch's new book, too.

Double Checked Locking is this idiom:

// Broken -- Do Not Use!
class Foo {
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null) {
          helper = new Helper();
        }
      }
    }
    return helper;
  }
}

The point of this code is to avoid synchronization when the object has already been constructed. This code doesn't work in Java. The basic principle is that compiler transformations (this includes the JIT, which is the optimizer that the JVM uses) can change the code around so that the code in the Helper constructor occurs after the write to the helper variable. If it does this, then after the constructing thread writes to helper, but before it actually finishes constructing the object, another thread can come along and read helper before it is finished initializing.
See the double-checked locking is broken declaration for a much more detailed explanation.

Anyway, the question I always get is "does making the helper field volatile fix this?" The answer is yes. If you make the helper field volatile, the actions that happen before the write to helper in the code must, when the program executes, actually happen before the write to helper — no sneaky reordering is allowed.
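In other words, with one keyword added, the idiom works (as of the JDK5 memory model):

class Foo {
  private volatile Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null) {
          helper = new Helper();
        }
      }
    }
    return helper;
  }
}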

Having said this, there are often much better ways of lazily initializing a singleton. One of the items in Josh Bloch's new revision of Effective Java deals exclusively with this problem. This book is chock-full of useful Java knowledge, and is highly recommended for all Java programmers. He has revised it for Java 6; if you only have the old edition, pick this up, because he deals with things like generics, the new threading stuff in JDK5, autoboxing, and how these things all work together to make your life unpleasant.
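One such alternative is the lazy initialization holder class idiom, which gets its laziness from class initialization rather than from explicit locking (a sketch, not Bloch's exact code):

class Foo {
  private static class HelperHolder {
    static final Helper INSTANCE = new Helper();
  }
  public static Helper getHelper() {
    // HelperHolder isn't initialized until the first call; class
    // initialization safely publishes INSTANCE to all threads.
    return HelperHolder.INSTANCE;
  }
}

Note that this works for a static singleton; for a per-instance field, the volatile version above is the one to use.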

Sunday, May 11, 2008

Which Lock is Which?

I was at JavaOne last week, and attended Bill Pugh's Defective Java talk. Bill had an important point about what kind of locks you want to be using and what kind you don't, and that point is worth repeating.

Bill is, with Dave Hovemeyer, the guy behind FindBugs, which everyone should be using religiously. He was also my graduate advisor; we worked together on the new Java Memory Model. If there are two things Bill knows, they are concurrency and bug patterns.

There is a tremendously useful defensive locking discipline that I use all of the time, and recommend that other people use, too:

class Foo {
  private final Object lock = new Object();
  private Bar lockProtectsMe;
  ...
  public void blah() {
    synchronized (lock) {
      // accesses lockProtectsMe...
      ...
    }
  }
}

This has the following benefits:
  • If you acquire the lock on the Foo object itself instead of on the private lock object, you run the risk that some code outside the class obtains that lock. Perhaps forever.

  • Using the Foo object also has the downside that it is coarse-grained locking; if you use that object to protect some other field called, say, lockProtectsMeToo, then you won't be able to access both lockProtectsMe and lockProtectsMeToo at the same time, even if they have nothing to do with each other.

  • If you try to acquire the lock on the lockProtectsMe object, you run the risk that someone changes that field. Since you acquire locks on objects, not on fields, if this happens, you will be acquiring a completely different lock.

There are reasons not to use this pattern, of course, and if you are curious, I can fill you in (ask in the comments), but that's not really the point of this post.

Anyway, Bill noticed that quite a few people were using the following pattern to lock objects:

private final String myString = "LOCK";
private Object lockProtectsMe;

...
public void blah() {
  synchronized (myString) {
    // accesses lockProtectsMe
    ...
  }
}

This is awfully similar to the pattern above, but should never be used. Why? It turns out that when you initialize a field with a String literal like this, that String will be interned. This means that there will be one canonical instance of that String for the entire JVM.

Now, if you are likely to use this pattern in one place, you are likely to use this pattern in lots of places...

class Bar {
  private final String lock = "LOCK";
  ...
}

class Baz {
  private final String lock = "LOCK";
  ...
}

All of these lock fields will be interned. They will all share one common instance. If you run the code:

class Bar {
  static final String lock = "LOCK";
}

public class Baz {
  static final String lock = "LOCK";
  public static void main(String[] args) {
    System.out.println(Baz.lock == Bar.lock);
  }
}

It will print out true, because, as a result of the interning, those will be the same object. Since we synchronize on objects, not fields, any code that synchronizes on them will be using the same lock. This is potentially disastrous; imagine if you had this code somewhere:

Thread 1:

synchronized (a) {
  synchronized (Baz.lock) {
    // ...
  }
}

Thread 2:

synchronized (Bar.lock) {
  synchronized (a) {
    // ...
  }
}

We've now turned perfectly inoffensive code into a potential deadlock. And that's an ugly one, because it isn't going to be obvious why it happened.

You should be equally careful with synchronizing on autoboxed / unboxed types. The reason for this is similar, and left as an exercise for the reader.

Friday, April 25, 2008

Immutability in Java

Another topic that comes up again and again in questions that people ask me is, "How does immutability work in Java"? Immutability is a godsend for concurrent programmers, because you don't have to do lots of sticky reasoning about what threads are updating what variables when, and you don't have to worry about cache thrashing, and you don't have to worry about all sorts of things. When I write concurrent code (which is reasonably often), I try to make as many things immutable as possible.

Now, in common parlance, immutability means "does not change". Immutability doesn't mean "does not change" in Java. It means "is transitively reachable from a final field, has not changed since the final field was set, and a reference to the object containing the final field did not escape the constructor".

In circumstances other than this, even if a given field is not mutated, the Java memory model requires that there be some form of synchronization (which can include the use of volatile, static initialization, synchronized blocks, any of the java.util.concurrent collections, or the use of a java.util.concurrent.atomic.AtomicFoo object) for a thread to make sure that it sees the correctly constructed object for the first time. Subsequent reads of the object by any given thread don't require additional synchronization.

So, a correctly written version of HashMap that was immutable and thread-safe would look like this:

public class ImmutableHashMap<K, V> implements Map<K, V> {
  private final Map<K, V> map;
  public ImmutableHashMap(Map<K, V> map) {
    this.map = new HashMap<K, V>(map);
  }

  @Override
  public V get(Object key) {
    // And similarly all other accessors
    return map.get(key);
  }

  @Override
  public V put(K key, V value) {
    // And similarly all other mutators
    throw new UnsupportedOperationException();
  }
}

ETA: This is how Collections.unmodifiableMap() works.

Because of the special meaning of the keyword "final", instances of this class can be shared with multiple threads without any additional synchronization; when another thread calls get() on the instance, it is guaranteed to get the object you put into the map. You should probably use something that is thread-safe to perform the handoff between threads (like LinkedBlockingQueue or something), but if you forget to do this, then you still have the guarantee.
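For example, a handoff through a queue might look like this (a sketch; the queue, keys and values are made up, and InterruptedException handling is elided):

BlockingQueue<ImmutableHashMap<String, Integer>> queue =
    new LinkedBlockingQueue<ImmutableHashMap<String, Integer>>();

// Thread 1: build a map and hand it off.
Map<String, Integer> m = new HashMap<String, Integer>();
m.put("answer", 42);
queue.put(new ImmutableHashMap<String, Integer>(m));

// Thread 2: take() is a properly synchronized handoff. But even if the
// reference were passed through a data race instead, the final field
// guarantee would still make get("answer") return 42.
ImmutableHashMap<String, Integer> shared = queue.take();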

There are two major points to make about this kind of "immutability":
  1. It's not immutable. So, I've completely misled you with mutable immutability. The following code is perfectly legal:

    HashMap<Integer, StringBuilder> map =
        new HashMap<Integer, StringBuilder>();
    StringBuilder builder = new StringBuilder();
    builder.append("foo");
    map.put(1, builder);
    ImmutableHashMap<Integer, StringBuilder> immutableMap =
        new ImmutableHashMap<Integer, StringBuilder>(map);
    builder.append("bar");
    System.out.println(immutableMap.get(1));

    I think we all know that that println() method is printing "foobar", not "foo". So, even if we call this an "immutable" hash map, values (and keys!) can still be mutated. This is a bad idea, of course. Other threads are not guaranteed to see the updates you make to immutable objects (at least, not without additional synchronization).

  2. The final field is absolutely necessary for the thread-safety guarantee. I recently saw an implementation of an ImmutableHashMap that looked more like this:

    public class ImmutableHashMap<K, V> extends HashMap<K, V> {
        public ImmutableHashMap(Map<K, V> map) {
            super(map);
        }

        @Override
        public V put(K key, V value) {
            // And similarly all other mutators
            throw new UnsupportedOperationException();
        }
    }
    This has the great virtue of avoiding the extra indirection of the delegation-based version, and also has the great virtue of being shorter (because you don't have to rewrite all of the accessors). The flip side is that if you share instances of this ImmutableHashMap with other threads, then you absolutely have to use synchronization, because it does not get the special guarantees that the final field provides. If you call get(), you can actually get the wrong value out. It isn't likely to happen in practice right now, but compiler writers are allowed to take advantage of this.

So, the moral of the story is:
  • Use final fields whenever you can, and

  • Immutability is a funny thing.

That's all I wanted to say.

If you liked this post, read the followup: Immutability in Java, Part 2.

Thursday, March 6, 2008

Talk on x86 Memory Model

Rick Hudson gave a talk at Google about the new official memory model for the Intel x86 platform. Good stuff, for memory model geeks.


Wednesday, March 5, 2008

Java Performance Talk

Paul Tyma and I gave a talk at SD West today on Java Performance Myths.

The big takeaway was this: Most of the things you think you know about Java performance are really hard to quantify. Write the best code you can. Profile it to find the bottlenecks. Remove the bottlenecks. Rinse and repeat.

A couple of other takeaway points here:
  • Don't forget to upgrade every so often. JDK6, for example, runs code compiled for JDK4 just fine, and you get a big performance boost. For free.

  • Memory allocation is pretty cheap. Know that there are three garbage collectors, and know which one is best, when. Read the Java memory management white paper.

  • Hotspot can do lots of things to optimize your locks, but it often doesn't. Having said that, the bottlenecks in concurrent code usually have to do with contention, not synchronization overhead. Use low contention data structures like ConcurrentHashMap to avoid contention, but don't try to get clever with locking.

  • Contrary to popular belief, thread-per-request synchronous servers are often faster than selector-based asynchronous servers.

The Java Performance Myths slides are here.

Hotspot Garbage Collection

Thought of the Day: The production version of Hotspot has 211 command line options for garbage collection. Getting the right combination of options is hard.

Tuesday, February 12, 2008

Causality and the Java Memory Model

NB: I started writing this last August, and sent it to a friend of mine for thoughts. I just realized that he never got back to me on it, and I never posted it. I don't much feel like editing it now, so here it is in its original glory.

Okay, I know. It is nearly impossible to read and understand the causality bits of the Java memory model. That's all that stuff about "Executions and Causality Requirements" on pages 568 – 570 of the Java Language Specification. Bill Pugh, Sarita Adve and I wrote it. Basically, in the last three years, there have been three kinds of reactions to it:

  1. ZOMGWTF!? (I'm a normal human being. Give me a break.)

  2. This is incomprehensible gibberish. (I'm a pretty smart cookie and I think that everything I can't understand in three seconds is worthless.)

  3. Okay, I think I get it, but I have some questions. (I'm a pretty smart cookie and I spent several days trying to understand it.)


I find that the last group usually pretty much understands it, with a few minor clarifications. They are also overwhelmingly graduate students, for what it's worth. They have found a few flaws in it, which is also nice.

(Wait, I hear you ask, nice? Alas, I must admit that I never thought the JMM would be bulletproof. It was no more likely to be bulletproof than version 1.0 of a major software release. The main goal was to create a memory model which described the problem space completely, was practical for a real programming language and was rigorous (so that it could be completely understood with sufficient effort). The fact that it was possible to find a few flaws in specific cases (rather than by saying "this is a complete mess", which used to be the case) means that people could understand it and reason about it, and that therefore the mission was accomplished. No one had ever done that for a real language before, and I'm proud of the work we did.)

That still leaves us with the original problem, which is that it is really, really hard to understand. What I'm going to do in this post is to try to explain the key intuition behind it very briefly. If it still doesn't make sense, then tell me about it, and I'll try to clean it up.

(The prerequisite to understanding what I am about to say is an understanding of the happens-before order. If you understand that, and you understand that the lack of that in your program can lead to weird effects, then the rest of the post may interest you. If you don't understand that, this is not the post for you.)

There are two basic questions that we need to address to understand the memory model:

Why is the happens-before relationship not sufficient?

The main guarantee we want to give programmers is something called "Sequential Consistency for Data Race Free programs" (SC for DRF, for short). What this means, in brief, is that if your program is free from data races, then you won't see any weird effects. If you are still reading, you know that a data race is two accesses, at least one of which is a write, to the same variable, which are not ordered by happens-before. Now, consider the example:


Initially:

0: x == y == 0

Thread 1:
1: r1 = x;
   if (r1 == 1)
2:   y = 1;

Thread 2:
3: r2 = y;
   if (r2 == 1)
4:   x = 1;


If all of this program's statements were to execute, there would be happens-before relationships between 0 and 1, 1 and 2, 0 and 3, 3 and 4. But there would be no happens-before relationships between Thread 1 and Thread 2.

Now, an intuitive model, built entirely around happens-before, basically says that if you are trying to figure out the value that can be seen by a read, you can either look at the last write to a given variable via happens-before, or a write that is in a data race with the read. That would mean that the read at 3 could see the write at 2, and the read at 1 can see the write at 4, because those writes are in a data race with those reads.

We can now say that the read of x at 1 sees the value 1, and the read of y at 3 sees the value 1, and have that be legal under the happens-before model.

Do you see what happened there? The read of x at 1 saw the value 1, then wrote 1 to y at 2. The read of y at 3 saw the value 1, and wrote 1 to x at 4. That justified the read of x at 1 seeing the value 1.

We can't allow this, because it violates the SC for DRF property. A program is data race free if, under all SC executions -- those are the executions without any weird reorderings -- there are no data races. There are no data races under any SC execution of this program -- the reads will both return 0, so the writes won't occur, so there is no data shared between the two threads. Therefore, this program is data race free, and it has to have only SC executions. The execution I just described -- where the writes occur -- is not a SC execution.

Since the happens-before model allows this example, it doesn't enforce the SC for DRF property. We therefore had to make some additional guarantees.

Um... okay. What are the additional guarantees?

Now that we have decided that there are actions that are not allowed to happen -- in the above example, the writes are not allowed to occur -- the trick is to figure out when an action is allowed to occur, and when it is not.

Now, if you are writing code, and want to figure out if an action can occur, what do you do? Well, you look at all of the actions that came before it. So let's look at the actions that happen-before the action you want to justify, and determine whether those actions justify it.

In the above example, we look at instruction 2, and then look at the actions that happen-before it, which are 1 and 0. 1 and 0 don't justify 2 happening, so it is not allowed to happen.

Let's try this on a different example. This is a good one:


0: x == y == 0

Thread 1:
1: r1 = x;
2: y = 1;

Thread 2:
3: r2 = y;
4: x = r2;


The interesting result here is r1 == r2 == 1, which is possible in a real execution if Thread 1's actions are reordered by the compiler. Basically, the write to y occurs, then Thread 2 reads the 1 from y and writes it to x, and then Thread 1 reads that value for x.

The first instruction we want to justify is the one labeled "2". If you look at the actions that happen-before 2 (again, 1 and 0), they will always lead to 2 being executed. So 2 is allowed to happen.

That's well and good for action 2, but the interesting result requires that the value 1 be read from y in instruction 3. This isn't justifiable based on what happens-before 3 -- if we are only looking at what happens-before 3, the only value that can be read from y is 0.

So we need some way of justifying the read of y in 3 returning the value 1. It can only do this if something wrote the value 1 to y. But wait -- we've already said that it was justifiable to write 1 to y! So if, in addition to being able to justify actions based on the actions that happen-before them, we also say that it is possible to justify actions based on actions you have already justified, then we have now allowed our result!

So there is the intuition behind the model. When you want to justify the fact that a read returns the value of a write, you can do so if:

a) That write happened before it, or
b) You have already justified the write.

The way that the model works is that you start with an execution that has all of the actions justified by property a), and you start justifying actions iteratively. So, in the second example above, you create successive executions that justify 0, then 2, then 3, then 4, then 1.

This definition has the lovely property of being reasonably intuitive and also guaranteeing SC for DRF programs.

That's the most complicated bit of the model (perhaps excluding the stuff about observable actions and infinite executions, but they are much less important).

I hope that helps.