The other day I was working on a proof to show that there is a polynomial transformation from SUBSET-SUM-ZERO to SUBSET-SUM-ONE where the two problems are decision problems and are defined as follows:

  1. SUBSET-SUM-ZERO: Given a list of integers, if there exists a non-empty subset of those integers that sum to $0$, return YES. Otherwise return FALSE.
  2. SUBSET-SUM-ONE: Given a list of integers, if there exists a non-empty subset of those integers that sum to $1$, return YES. Otherwise return FALSE. For a while, I had come up with a transformation $f(I)$ where I could show that if I had a instance $I$ that should evaluate to true for SUBSET-SUM-ZERO (SSZ), then there is one for $f(I)$ that evaluates true for SUBSET-SUM-ONE (SSO), but couldn’t find one to prove the backwards direction.

After some back and forth of modify/adding/removing parts of $f(I)$ to gain the ability to fix a structure for a subset sum in $f(I)$ which is $1$, I ended up with the transformation:

  1. Define $P$ as the sum of all positive numbers in $I$ and $N$ as the sum of all negative numbers.
  2. Define $K = P + |N| + 1$. The importance of the $+1$ will become apparent in setting up inequalities in the following sections.
  3. Add $2x_i + 2K$ to the list generated by $f(I)$ where $x_i$ is an entry in $I$.
  4. Add $n-1$ entries of $-2K$ to $f(I)$ where $n$ is the length of $I$.
  5. Add $-2K + 1$ to $f(I)$.

Using this definition of $f(I)$ I was able to eliminate some possible structures for a subset of $f(I)$ that adds to $1$ (which I leave the elimination as an exercise to the reader 🙃) to be left with the only possible structure(s) being of the form: $$ 2(x_e + … + x_f) + j(2K) + c(-2K) + (-2K + 1) = 1 $$ where:

  1. $2(x_e + … + x_j) + j(2K)$ is some non-empty subset sum of elements in $f(I)$ that “come from” $I$ but are transformed. $j$ is the number of entries in $(x_e + … + x_j)$.
  2. $c(-2K)$ are some entries of $-2K$ where $0 \leq c \leq n - 1$.
  3. $-2K + 1$ is the last entry, and must be included since it is the only non-odd entry.

After doing some rearranging and factoring, we are left with: $$ x_e + … + x_j = K(i - j) $$ where $i = c + 1$. My goal here is to show that $x_e + … + x_j$ sums to zero, which shows that $I$ contains a subset who’s sum is zero.

The key observation is that the righthand side is some positive or negative multiple of $K = P + |N| + 1$. Then, the smallest positive value that can appear on the righthand side of the equation is $K = P + |N| = 1$ while the greatest negative number (closest to zero) is $-K = -(P + |N| + 1)$.

We know that the largest possible positive sum we can find in $I$ can only be as large as the sum of positive values in $I$, since any other value is either zero or reduces the sum as it is negative. Then we have $x_e + … + x_j < K$ since $P < K = P + |N| + 1$. A same statement can be made about the greatest negative number to determine that $-K < x_e + … + x_j$. This means that $-K < x_e + … + x_j < K$ and $x_e + … + x_j$ is a non-empty subset of $I$ that sums to a multiple of $K$. See where this is going? The only integer multiple of $K$ between $-K$ and $K$ that is $0$, which means $x_e + … + x_j$ is a non-empty sum of $I$ that sums to $0$ (as required).

To summarize: the reason we were able to deduce that the sum was zero was because we had (among some other choices of $f(I$) to eliminate possible other structures) a choice of $K$ that has a lowest possible positive value and greatest possible negative value out of the range of values that any subset of $I$ could sum too. Finding these bounds are a very simple task (unlike the NP nature of the SSO and SSZ class of problems), which makes finding $K$ in polytime easy too.

This also highlights the nature of some problems are NP (like finding whether a subset of given integers sum to a particular value), but related problems are in fact P (like finding the upper and lower bounds of the sum of a subset of given integers).