value types and struct tearing

This note explains how a notion of “value types” for the VM
should protect the integrity of that value type’s invariants,
and points out important differences, in memory effects,
between “struct-like” and “persistent” designs for values.
First, the running example
Consider this two-variable data structure, a pair of encapsulated
integral coordinates, with a couple of access functions:
class XY {
    private int x = 1, y = 0;
    public XY() { }  // make a blank one
    public boolean step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0)  return false;
        x = x1; y = y1;
        return true;
    }
    public double measure() {  // inverse radius
        return Math.pow((double)x*x + (double)y*y, -0.5);
    }
    public void copyFrom(XY that) {
        this.x = that.x; this.y = that.y;
    }
}
The reader function measure produces a real number, but will fail to
do so if both values x, y are zero at the same time.
The writer function step
updates the two coordinates (incrementally,
as if in a random walk) except when they would both be zero.
The details are not very important; the main point is that the two
values are mostly independent, but are coupled by an invariant that
can be violated by an uncoordinated change to either value alone.
Off to the races
What could go wrong? Why, nothing, if one of the following
conditions is satisfied:
- If the object is confined to one thread only.
- If multiple threads are updating the object, but some sort of
  synchronization is preventing calls to step and measure from
  overlapping.
- If there is some assurance that an XY cannot change and has been
  safely published. Stashing a copy of the object is the basic idea;
  that is why we might need methods like copyFrom.
Side note 1:
Even a fully confined XY object could in principle be broken by an
asynchronous interrupt which cancels an assignment to the y field,
but such interrupts are not now part of the Java landscape.
Side note 2:
General descriptions of these sharing options, and more, may be
found in Chapter 3 of Java Concurrency in Practice.
The easy way to cover all the cases is to mark all the methods
as synchronized.
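For concreteness, here is a minimal sketch of that design applied to
XY; the class name SyncXY is my own, and a safe copyFrom (omitted) would
additionally have to lock both the source and the destination in a
consistent order.

class SyncXY {
    private int x = 1, y = 0;
    public synchronized boolean step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0)  return false;
        x = x1; y = y1;
        return true;
    }
    public synchronized double measure() {  // no caller can observe a half-written pair
        return Math.pow((double)x*x + (double)y*y, -0.5);
    }
}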
This is the design of the old Java class java.util.Vector,
which synchronizes all of its access functions.
Since synchronization is usually not free, this design choice
pushes a cost onto all users.
This cost can sometimes be reduced by using transactions under
the hood, such as the hardware transactional memory Intel supports,
but since the language and bytecode set do not directly express the
transactions, there is always a possibility that the JVM will have to
create a real critical section, so the optimization is not reliable.
This is why many programmers (including those who designed newer
standard classes like ArrayList) omit the synchronization
and instead push the responsibility for locking onto the user.
This allows users to balance the cost of locking against the
danger of races.
The user of an ArrayList
must take responsibility to avoid publishing a
reference to the list without proper mutual exclusion between threads.
Often this is easy enough. If the data is shared, it is shared between threads
that are programmed together under common design rules and tested well enough
to catch simple bugs.
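As a reminder of what that user-side responsibility looks like in
practice, here is a small sketch using the standard
Collections.synchronizedList wrapper; the class and its contents are my
own illustration.

import java.util.*;

class SharedListDemo {
    public static void main(String[] args) {
        List<String> shared = Collections.synchronizedList(new ArrayList<String>());
        shared.add("hello");            // each call is synchronized by the wrapper
        synchronized (shared) {         // compound actions (like iteration) still need the lock
            for (String s : shared)  System.out.println(s);
        }
    }
}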
Insecurity complex
The responsibility of the user increases if the list will contain
sensitive data, such as parameters to a privileged operation.
In that case, the programmer must ensure that a reference to the shared
list cannot leak to uncontrolled code, even if the program is abused
in some way.
Otherwise, an attacker may create or obtain a shared list reference,
hand it to privileged code, and then at the same time mutate it.
The mutations will appear to the victim code via data races.
Most such mutations will be harmless, since the privileged code
will validity-check the arguments and throw an exception.
But with enough attempts from the attacker, a race can happen
which puts the list into a state which confuses the privileged
code into doing something unpredicted—except by the attacker.
This kind of weakness, known as a TOCTTOU bug, is a risk
whenever privilege checks are based on mutable data objects.
It is important to note that marking a variable as private does not
protect against inconsistent updates from data races.
In the simple example above, where all the state is private,
a data race can create a disallowed state as follows:
final XY victim = new XY();
Thread racer = new Thread() { public void run() {
    for (int i = 0; i < 1e6; i++) { victim.step(-1, 1); victim.step(1, -1); }
} };
racer.start();
for (int i = 0; i < 1e6; i++) {
    assert(victim.measure() <= 1e12) : i;  // run with -ea so the assertion is checked
}
This code can fail its assertion quickly, often in the first 100
tries. The victim is observed in a state where both fields are zero.
Locally valid updates to single fields of the victim can cause
this violation.
The sequence of updates that leads the victim to that disallowed
state, plus the existence of additional surprising states, is
an exercise for the reader.
Side note:
Since measure
reads each field twice, it is allowed
to pick up different values for the two reads of one field.
This is rare, because optimizers tend to merge the reads, but it
cannot be excluded.
Thus, inconsistent reads of the same variable can be a source of
bugs, and this is true even for one-field objects.
We could call this a single-variable-double-read hazard.
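One common way to rule out the double-read part of the hazard is to
read each field exactly once into a local variable. Here is a sketch
of measure rewritten that way, inside the XY class above; it does not
fix the two-field race, only the double reads.

public double measure() {  // inverse radius, reading each field only once
    int xv = x, yv = y;    // still racy against step, but no double reads of one field
    return Math.pow((double)xv*xv + (double)yv*yv, -0.5);
}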
The invariant on XY fails because race conditions cause the two
fields to lose their mutual coherence. Let’s call this loss of
coherence structure tearing, because it looks as if the victim has
been torn into parts and reassembled from other parts.
In the analogous case of (non-volatile) 64-bit primitives
on old 32-bit JVMs, the 64-bit value can come apart into
independent 32-bit halves, which can race apart.
This is not exactly the same as “word tearing” but
is close enough (I think) to merit the term.
Persistence pays off
In our simple example, another way to protect the data is to formulate
it using final variables, changing the design to be persistent.
Side note:
These days we are recycling the term “persistent” as an
alternative and refinement to the older term “immutable”.
A problem with the older term is evident whenever you have a
conversation that stalls on the audible similarity of phrases like
“immutable object” and “a mutable object”. Mumblable “immutable” is
semblable to “mutable”; say that ten times fast.
class XY {
    final private int x, y;
    private XY(int x, int y) { this.x = x; this.y = y; }
    public static final XY BLANK = new XY(1, 0);
    public XY step(int p, int q) {
        int x1 = x + p, y1 = y + q;
        if (x1 == 0 && y1 == 0)  return null;
        return new XY(x1, y1);
    }
    public double measure() {  // inverse radius
        return Math.pow((double)x*x + (double)y*y, -0.5);
    }
    //XY copyFrom(XY that) { return that; }  // no special copy
}
In contrast to this persistent version of XY, we can call the first
version struct-like. Despite the encapsulation of the fields, it
interacts with memory like a C struct.
The persistent version does not suffer from races on its x and y
variables.
A persistent XY object state can be captured or published simply by
moving a reference.
There only needs to be one public “blank” initial value, and the
constructor itself is hidden inside the capsule.
In exchange for this new stability, users must update their references
whenever they call the updater function step.
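Concretely, a caller’s update loop against the persistent class might
look like this; the walk method and its step sizes are my own sketch.

static void walk() {
    XY p = XY.BLANK;
    for (int i = 0; i < 10; i++) {
        XY next = p.step(1, 1);     // a fresh XY, or null if the move is disallowed
        if (next == null)  break;   // the old p is untouched either way
        p = next;                   // the caller updates its own reference
    }
    System.out.println(p.measure());
}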
Crucially, it is now impossible to observe an XY
object in a
disallowed state. The class has full control over its internal
invariants, even in the face of deviously attacking racer threads.
There are still races possible, but they are on the references.
And since updating a reference is one memory operation, there
is no issue of multiple fields parting ways.
The single-variable-double-read bug can still happen, rarely.
It seems there is always a cost for stability and security.
In this case, the cost is allocating a new XY
object for
each distinct state.
This is required because the class insists on creating a new XY
object to represent each new position in the coordinate space.
Put another way, the user is forbidden to use the optimization
of re-using XY
objects to represent multiple values over time.
This is a reasonable restriction, since that optimization can
lead to race conditions, as described above.
Side note:
Since the XY constructor is private, the class might cache
values like java.lang.Integer does, but this has its own
sometimes surprising costs, since the caching logic can
interfere with optimizations such as escape analysis.
In any case, making the constructor private is a helpful move.
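For illustration only, such caching could hide behind a static factory
inside the persistent XY class, in the style of
java.lang.Integer.valueOf; the method name of is my invention.

public static XY of(int x, int y) {
    if (x == 1 && y == 0)  return BLANK;   // reuse the shared blank instance
    return new XY(x, y);                   // otherwise allocate as usual
}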
Where flattery gets us
But, suppose the JVM had an optimization to flatten instances of XY,
in either the mutable or the persistent form. A flattened instance
of XY would have both of its fields stored directly in some
containing object (or array). There would be no need to have a
separate XY object on the heap to hold the fields.
Of course, if references to XY
were required for some uses,
these references could be created temporarily and then discarded.
Obviously, this sort of thing is what we are calling value types.
In a nutshell, a value type encapsulates a group of component values,
and can be efficiently stored in any type of Java variable.
Would value types provide significantly better options for designing
XY and similar classes than the current ones sketched above?
The answer is a qualified “yes”.
Perhaps the biggest advantage would be better use of memory.
Overhead for the distinct XY
object (header+padding) would vanish.
And code would access the x and y fields directly within
a containing object, using static offset arithmetic rather
than a dynamic and cache-busting pointer chase.
This is most evident if many pairs are stored in an array:
Hardware can get clever about sequential accesses to the
fields.
The other main advantage would be better support for
methods which pass and return composite values. A method
could receive or return two or more values inside a single
flattened value “on the stack” (or in registers, usually).
Of course, locals could also contain flattened values.
So complex numbers and vectors get practical, along with
a host of other small but useful types.
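For example, a persistent-style complex number can be written in
today’s Java, and a flattening JVM could then pass and return it
without heap allocation. This Complex is my sketch, not a standard
class.

final class Complex {
    final double re, im;   // components a flattening JVM could keep in registers
    Complex(double re, double im) { this.re = re; this.im = im; }
    Complex plus(Complex that)  { return new Complex(re + that.re, im + that.im); }
    Complex times(Complex that) {
        return new Complex(re * that.re - im * that.im,
                           re * that.im + im * that.re);
    }
}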
There are more advantages that may accrue from value types,
but this is not the place to go into the details.
Memory vs. method
The two main advantages—better use of memory and better method
types—are at some tension with each other. If value types are
conceived as a memory layout mechanism only, then their primary usage
will be via a Java pointer. The flattened realization in registers
(in method types) will be a fiction to be uneasily maintained, at the
cost of confusing various JVM optimizations.
On the other hand, value types will fail to deliver effective use of
memory if they are conceived as only a clever way of loading registers
(again, holding method arguments, locals, and return values).
This is because their representation in other variables (fields and
array elements) must live in memory, and as such is sensitive to
quality of memory layout.
Naturally, we want both advantages, with good representations in
both register-based and memory-based variables.
To the extent the system materializes Java references that point
to flattened values, those references must be easily and routinely
optimized away. And such materialized references should be rare.
Therefore, a good design for value types needs a primary representation
that is pointer-free, even if there is a secondary “boxed” representation
which uses references.
Climbing into the capsule
We also want encapsulation, which means privacy of component fields
and of methods. Any credible value type design for the JVM must allow
a value type to restrict access and enforce invariants.
But encapsulation is incomplete—and therefore a dangerous
illusion—if the protected invariants can be subverted by race
conditions, since those race conditions are available to anyone who
can perform a value type assignment.
This brings us back to the defects of the struct-like programming
style for XY
in the first example above. As seen above, the Java
memory model permits structure tearing, and requires the author of
a class to clearly state its contract, allowing the user to take
defensive action if the class does not adequately defend itself
against races.
But I wish to use value types, someday, to contain security critical
values, such as 96-bit timestamps or encapsulated native pointers.
The persistent-style design of String
is integral to its usability,
outside of any container, for carrying security critical values.
Struct-like value types will, I think, be difficult or impossible
to secure to the same degree, because of their weaker encapsulation.
They will have to be put inside a container to manage safe access
(a bit like an ArrayList inside Collections.synchronizedList),
and this will tend to cancel the advantages of flattening them
in the first place.
Assignment as an act of violence
There are a few specific reasons the persistent design provides better
encapsulation and safer APIs than the struct-like design.
First, since value types can be stored in all sorts of variables, not
just memory locations, it will be extremely common to assign them from
place to place. Note that assignment is a JVM primitive (in the
current JVM design) and cannot be customized by a class.
This is of no consequence for existing Java APIs, but if it is simply
generalized to a racy componentwise copy (as in C), then every
assignment statement which operates on a value type will be at risk of
structure tearing.
But, Java programmers do not expect assignments to corrupt
any internal structure of the assigned quantity.
(The sole exception of 64-bit primitives on 32-bit machines
is widely neglected.)
Allowing a value type assignment to racily disturb the underlying
value’s invariants is likely to introduce a new and persistent family
of bugs to Java programs.
Fixing this problem for struct-like types would require reifying the
assignment operation as an explicit method which could then perform
synchronization on both the source and destination of the copied
value. The code to do this would be complex and prone to errors and
deadlocks. In practice, designers of value types would punt on the
problem, pushing it (with wide-eyed trust and hope, doubtless) to
their users. Their users, meanwhile, would have to learn the
conventional distinctions between safe and unsafe versions of types,
for example choosing between FastString and ThreadSafeString.
Publishing via finals
Second, and more subtly, safe publication is one of the least
well understood aspects of the Java memory model, but it is crucial to
the safe sharing of any non-persistent type. Safe publication
requires either accurate synchronization on the object containing the
mutable state, or publishing the state via a final variable. Both of
these options require levels of indirection which are convenient
enough today, but would tend to vanish as data structures are
flattened.
The simplest (and thus safest) extension of safe publication patterns
to value types is to declare that their component fields are final, so
that when a value is assigned, its components are automatically
published individually. This pattern comes out naturally from
persistent-style values, but must be imposed on struct-like values.
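Here is a minimal sketch of the final-field flavor of safe
publication; the Holder class is mine, and the guarantee described in
the comment is the Java memory model’s final field rule.

class Holder {
    final XY snapshot;   // written only in the constructor; frozen when it completes
    Holder(XY value) { this.snapshot = value; }
}
// A thread that obtains a reference to a fully constructed Holder, even via
// a data race, is guaranteed to see snapshot properly initialized (provided
// the Holder does not escape during its construction).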
In a sense, we are revisiting the old design decision in Java to make
variables mutable by default. Recall that blank final fields were
added in Java 1.1, enabling persistent-style types, and even today we
don’t yet have frozen (immutable, persistent) arrays. It would be
consistent with the oldest versions of Java to define the new
composite types to be mutable unless requested otherwise.
But more modern forms of inter-thread communication have
been created outside of that original model, avoiding
synchronization and adding (via the JMM) safe publication
semantics to final variables.
Although it would be a stretch to say, “Java should have been
designed with immutability as the default”, I believe that
consistency with Java 1.0 conventions for mutability could
reasonably be traded away in order to gain modern thread safety.
Costs of persistence
Those are the reasons why I think the persistent design pattern
is preferable over the struct-like pattern.
That is true despite the difficulties of the persistent design,
which I will describe next.
First, the stability provided by final variables is not free.
In the current JVM, they may require memory fences in constructors.
In a JVM supporting persistent-style value types, every assignment
to a memory location (not to registers) may require a similar
handshake with the memory system, to ensure that writes of the
component values are safe publications.
Correctness vs. throughput?
Second, the atomicity provided by single-reference updates
goes away if objects are flattened, and it must be recovered
some other way.
A store to a persistent-style value type must ensure that other
threads either see the whole store, or none of it.
This will require another kind of handshake with the memory
system, along the lines of the Intel transactions mentioned
above, or an atomic multi-word store instruction.
Note that the example type XY
fits in 64 bits, and so
would be supported at no extra cost by all 64-bit processors.
Processors which provide larger atomic vector store instructions
will cheaply support any value that fits in their vectors.
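One way to see this with today’s JVMs is to pack both 32-bit
coordinates into a single 64-bit word, so that every read and write of
the pair is one atomic memory operation. PackedXY below is my sketch,
not a proposed API.

import java.util.concurrent.atomic.AtomicLong;

// Packs the two 32-bit coordinates into one 64-bit word, so reads and
// writes of the pair are atomic on any JVM, even without value types.
class PackedXY {
    private final AtomicLong xy = new AtomicLong(pack(1, 0));

    private static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFF_FFFFL);
    }
    private static int unpackX(long w) { return (int) (w >> 32); }
    private static int unpackY(long w) { return (int) w; }

    public boolean step(int p, int q) {
        while (true) {
            long w = xy.get();
            int x1 = unpackX(w) + p, y1 = unpackY(w) + q;
            if (x1 == 0 && y1 == 0)  return false;
            if (xy.compareAndSet(w, pack(x1, y1)))  return true;  // the pair changes atomically
        }
    }

    public double measure() {  // inverse radius, from one atomic read of both fields
        long w = xy.get();
        double x = unpackX(w), y = unpackY(w);
        return Math.pow(x * x + y * y, -0.5);
    }
}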
In the case of a jumbo value type which spills over multiple hardware
vectors and/or cache lines, the JVM’s software will need to perform
additional handshakes, perhaps including old fashioned boxing.
For sophisticated users who know what they are doing, a
non-transactional store operation, with the possibility of structure
tearing, needs to be provided.
Because of tearing, the library would have to provide additional
interlocks to ensure proper confinement, immutability, and so forth.
In other words, the struct-like component-wise assignment operator
will need to be made available as a privileged operation derived from
the persistent pattern, and used only inside carefully designed
concurrency-safe libraries.
This is the flip side of the necessity for providing persistent
containers for struct-like values, for safe publication. The key
question, I think, is which should be the default (mutable or
persistent) and under what circumstances the non-default mode should
be supplied.
Notation, notation, notation
A third downside to the persistent style of values is notational.
It seems there are times when you want to say “just change the
imaginary component of this number to zero, please”.
If a value type is willing to expose its components and
accept component-wise updates, it seems harsh to require
the user to create a new value from scratch:
Complex c = ...;
c = new Complex(c.re, 0.0); // rebuild from scratch
c = c.changeIm(0.0); // maybe use a helper method
c.im = 0.0; // but what I meant was this
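The helper on the middle line could be as small as this; changeIm is
hypothetical, shown only to make the comparison concrete.

Complex changeIm(double newIm) {
    return new Complex(re, newIm);   // copy the other component, replace just this one
}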
This can be viewed mainly as a matter of syntax sugar, but there is
also a deep connection to the hardware here. A multi-component value
type is realized either as bits in a block of memory words or as bits
in a collection of live registers. It is fundamentally reasonable for
the user to ask to change just one of those low-level components
in isolation, assuming the library designer allows the operation.
At the JVM level, it should be possible to render the bytecodes
of a component-wise update fairly directly and without confusion
to a register or memory write (adding memory handshakes as needed).
Reconstructing a new value from the ground up, just because one
component shifted, might create a bunch of noisy intermediate
representation which could distract the JIT compiler from more
important optimization work.
Our race is now run
To conclude, I believe that the implementation challenges of
persistent-style values are manageable, and that the corresponding
trade-offs for struct-like values are much more difficult to
control. In the end it comes down to a safe and clean user model versus
direct compilation to memory instructions. And when safety seems
opposed to speed, I think we can agree that the JVM needs to lean
towards designs that are safe by default, in the expectation that
slowness is easier to fix than insecurity.
And in the end, I think we will get both safety and speed.