Why the '+1' in Two's Complement Arithmetic?

by Daniel S. Wilkerson; 22 June 2004

This is a draft. Sorry for any unclarity.

Have you ever wondered where that +1 comes from when computing negatives in two's complement arithmetic? Inverting the bits to compute the negative seems so natural, but then there is this +1 you have to do. What is the source of the asymmetry that results in this +1, this fly in the ointment?

The problem

Consider a string of N bits as a non-negative binary number. Consider operations plus_N and times_N as arithmetic on N bits except we discard the carry bit that goes off the end; that is, we are doing the usual unsigned arithmetic on binary strings.

This is clearly the ring Z_{2^N} under the standard unsigned interpretation of a bit string as a number. The additive part of the ring is a group, and so inverses are unique. We would like a method for computing the additive inverse of a number in this ring given its representation in bits.

Theorem 1:

The inverse of a number represented by a string of bits is obtained by:

toggling each bit, operation T, followed by
incrementing by one, operation I.

Proof 1:

[This is the standard proof given in CS61C lectures.]

Note that for any x, toggling its bits and x one gives a string of pure ones; Adding one to that gives zero:


        T(x) + x + 1 = 0.

In a ring we can do standard algebra (here, we are using uniqueness of additive inverses).


        I(T(x)) = T(x) + 1 = -x.

Proof 2:

For a string of bits X, its numerical value is


          N(X) = Sum_i ( X_i * 2^i ).

Multiply by -1 to get the negative:


        - N(X) = Sum_i ( - X_i * 2^i ).

That is, replace 0 by 0 and 1 by -1 and you have the negative. The problem is we don't allow -1 as a digit.

So, add one to every digit; together with the negation, this results in simply togging the original bits. But what is the value of the all_ones string so that we can cancel out what we did? Assume for the moment we know it is -1; we look at how to see that below.


        - N(X) = Sum_i ( - X_i * 2^i ) + -1       + 1
               = Sum_i ( - X_i * 2^i ) + all_ones + 1
               = T(X) + 1
               = I(T(X))

One way we can check that all-ones string is negative one since if you add one to it you get zero and inverses are unique.

Another way is to just recall the sum of a geometric series. Let S denote Sum_{i=0,infinity}.


        S = 1 + r + r^2 + ...
          = 1 + r * (1 + r + r^2 ...)
          = 1 + rS.

Therefore S - rS = 1, or


        S = 1/(1-r).

And for a finite sum


        Sum_{i=0,N-1} = Sum_{i=0,infinity} - Sum_{i=N,infinity}
                      = 1/(1-r) - r^N * Sum_{i=0,infinity}
                      = (1-r^N) / (1-r).

So the value of the N ones is:


        (1 - 2^N) / (1 - 2) = 2^N - 1.

Adding one more gives 2^N = 0.

Proof 3:

Let N be 8 for specificity. Draw a clock with 8 equally-spaced points and labeled sequentially in binary.

       000
   111  |  001
  110  -*-  010
   101  |  011
       100

To take the negative we want to reflect across the 0,4 axis. However toggling the bits reflects across the (-0.5, 3.5) axis.

We can do a change of basis of the reflection we have to get the reflection we want, and further the change of basis matrices are just rotations. Here F_x means reflection about the axis through the point x and R_y means rotate by y. (I'm not checking the signs carefully here.) In my notation, function composition applies from left to right, that is A;B means do A then B.


    (Negation) = F_-1/2
               = R_-1/2; F_0; R_1/2
               = F_0; R_1
               = (Toggle); (Increment)

So why does inverting the bits reflect across the F_-1/2 axis instead of F_0? As Rob Johnson points out, any bijection, in particular a flip, has to maintain the center of mass of a bunch of points, since addition commutes. Inverting the bits is a bijection, and therefore the center of mass of the numbers doesn't move. [Need to show it is a flip also.] The axis of the flip has to go through the center of mass which is the expected value of a point if you pick each digit at random.

Normalizing the clock to sum to 1, not 8, we have the value of a one in each binary place is:


        1/2, 1/4, ....

Each has a probability of 1/2 of being one, and is zero otherwise. Thus, the expected value is:


        1/4 + 1/8 + 1/16

which is also:


        1/4 + 1/8 + 1/16 + 1/32 + .... minus
                           1/32 + ....

equals:


        1/2 - 1/16.

Re-normalizing the sum back to 8 (multiplying by 8) the 1/16 becomes 1/2, the source of the asymmetry.

Conclusion

Therefore, you can see the asymmetry as coming from the fact that the expected value of a digit is 0.5, not 0. If one balances the number system using, say, balanced trinary {-1, 0, +1} as your digits, then taking the negative of each digit computes the exact negative, leading a version of proof 2 but without the need to add a string of ones to get back to legal digits. Thus, we can see the source of the asymmetry as being the fact that we approximate points on the line with a geometric series from below, rather than converging from both positive and negative sides using both positive and negative digits.

Acknowledgments

Thanks to Saul Schleimer for proofreading this. Any remaining errors are of course my own. I have not finished including his recommended changes.