Decision Problem in NP -π is in NP if for any YES instance I, ∃ a proof that I is a YES instance that can be verified in polytime. Counting Problem - We define a counting problem for a decision problem π in NP, where given an instance I of π, we have to produce an output the number of solutions for the instance.
Polynomial Approximation Scheme
A deterministic algorithm A for a counting problem P such that it takes an instance I as input, and ϵ∈R>0, and in polynomial time wrt n=∣I∣ produces an output A(I) such that
(1−ϵ)#(I)≤A(I)≤(1+ϵ)#(I)
A fully polynomial approximation scheme (FPAS) is a PAS such that it runs in time poly(n,ϵ1)
Polynomial Randomized Approximation Scheme
A randomized algorithm A for a counting problem P that takes instance I as input, and ϵ∈R>0, in time poly(n) produces A(I) such that
P[(1−ϵ)#(I)≤A(I)≤(1+ϵ)#(I)]≥43
A fully polynomial randomized approximation scheme (FPRAS) is a PRAS such that it runs in time poly(n,ϵ1)
An (ϵ,δ)−FPRAS for a counting problem is an FPRAS that takes as input an instance I and computes an ϵ−approximation to #(I) with probability ≥1−δ in time poly(n,e1,δ1)
Abstract Problem
We have U, a finite set of known size, and a function f:U→{0,1}, using which we define G={u∈U∣f(u)=1}. We have to estimate ∣G∣.
Attempt 1
Sample u1,u2,…,uN from U independently.
For all i∈[N]
However, N is dependent on ρ=∣U∣∣G∣, hence N can be exponential if ρ is exponentially small. Next, we look at an example of how we can reduce the universal set size to get a better ρ.
Example of DNF Counting
DNFs are the opposite of CNF, where we take a logical OR of clauses, which have logical ANDs.
F=T1∨T2∨⋯∨Tm,Ti=L1∧L2∧⋯∧Lk
The universe is U={T,F}n
We need to find the number of satisfying solutions to F
G={u∈U∣F(u)=T}
To count this, we make use of biased sampling, which is essentially reducing our universe so that we can get a more accurate estimation with lesser trials.
We define Hi as the subset of assignments that satisfy term Ti, hence we have H1,H2,…,Hm
We know ∣Hi∣=2n−ri, as there is exactly one assignment of the ri literals in term Ti to make it satisfy, and all other n−ri literals can take any value.
H=i=1⋃mHi
is the count we want, instead, we first work with a multiset union of these. Definition of the Universal Set
U=H1⊎H2⊎⋯⊎Hm
∣U∣=i=1∑m∣Hi∣≥∣H∣
To make things easier, we define U as
U={(v,i)∣v∈Hi}
To distinguish between duplicates and where they came from.
Coverage Set
cov(v)={i∣(v,i)∈U}
This essentially captures all the terms that would be satisfied by this assignment.
A trivial bound on this is ∣cov(v)∣≤m as an assignment can’t satisfy more terms than those that exist.
∣U∣=v∈H∑∣cov(v)∣
Function to define G in terms of U f:U→{0,1}
f((v,i))={10if i=min{j∣(v,j)∈U}otherwise
G={(v,j)∣f((v,j))=1}
By defining in terms of this, we essentially just take the minimum index occurrence of an assignment in case of duplicates, hence avoiding duplicates in G.
We can see that ∣G∣=∣H∣
Lemma -
ρ=∣U∣∣G∣≥m1
Proof -
∣U∣=v∈H∑∣cov(v)∣≤v∈H∑m=m∣H∣=m∣G∣
⟹∣U∣∣G∣≥m1
Hence, we’ve modified U such that the ratio has a much tighter bound in this specific counting problem.
Using the Estimator theorem, we can say that if ρ≥m1 , then the monte carlo method yields an ϵ−approximation to ∣G∣ with probability ≥1−δ, provided
N≥ϵ24mlnδ2
Randomized algorithm for DNF
We still need to ensure that the queries u1,u2,…,un are still randomly sampled from U. The algorithm to do this is given as follows.
Sample i∈[m] with probability ∣U∣∣Hi∣
Sample from ∣Hi∣ uniformly. Fix the ri bits that are there in the term, and for the rest of the bits you can just flip a coin to decide each of their values.
P[(v,i) is sampled]=P[i is sampled]⋅P[v is sampled from Hi∣i is sampled]=∣U∣∣Hi∣⋅∣Hi∣1=∣U∣1
Hence, every element in U is sampled with uniform probability.