We will try to prove that the Subset Sum Problem is NP-Complete. To do this, we’ll assume that the problem is NP-Complete, and we’ll show that we can reduce the problem to the Subset Sum problem in polynomial time.
Hence, given a boolean formula in , we will try to construct an instance of the Subset Sum Problem in polynomial time such that the subset problem will be satisfiable if and only if is satisfiable.
The boolean variables in the formula will be and the clauses will be .
An example of a 3-CNF we’ll use for demonstrative purposes is
We’ll construct a set of numbers, with each variable and clause corresponding to 2 numbers each.
Every boolean variable will correspond to 2 numbers and , while every clause corresponds to 2 numbers and .
Every number we construct will be in the decimal base system even though we’re only using 0s and 1s, and they will all be of digits, where is the number of boolean variables and is the number of clauses.
We’ll talk about each number with respect to indexing from the MSB and not the LSB, as it will make it easier to understand.
Initially set all the digits to , and set digits according to this method.
For and , set the digit to , and keep the rest of the digits in the first digits set to .
If appears in clause , then the digit will be set to for . In the table, ’s digit is set to as appears in the first clause.
Similarly, if a variable appears in clause , then the digit will be set to for . As appears in the first clause, the digit is set to for .
For each clause , we have 2 corresponding numbers in the set - and . These two numbers are identical, and only have the digit set to , and the rest to .
The target sum is given as , where we have the first digits as , and the last digits as . We claim that we can pick a subset of the set of numbers we constructed if and only if the is satisfiable.
The first property we ensure is that we pick either or for a given . We can’t pick both, or pick none, we must pick exactly one of them.
This property is ensured by the fact that the first digits are s. We know that only and can have a digit in the first digits be set. For a given position in the first positions, we know that there are only two numbers that have the digit set, and .
Hence, we have the option of choosing exactly one of them in the subset, as choosing both would set the sum of digits at the position to be , and choosing none would sum up to for that position, both cases which don’t satisfy the target sum.
For to be satisfiable, it must make sure that every clause has at least one term that is true. This is ensured by the last digits of every number.
First, we’ll consider only the numbers and . We need to pick the numbers in such a way that each clause has at least one true variable. In terms of the subset problem, we can restate the problem as ensuring that digits to in the sum are all , as this implies that we’ve picked at least one true variable for each clause.
The max value a digit can have amongst the last digits is (when all three variables are true in a clause). To ensure that you can reach exactly , we have and .
Hence, we can say that , were added to the set to provide some sort of padding for testing if each clause is true for a given subset of numbers chosen.
If there exists a subset of numbers that satisfy the subset problem constructed, we get a satisfying assignment to variables for the problem. For all where was chosen, we assign , and for the rest we assign .
Hence, we’ve proven that the problem can be reduced to the Subset Sum Problem, proving that Subset Sum Problem is