You are building an on-device feature-selection routine for a predictive-maintenance model that must transmit at most k sensor channels to the cloud. The objective is to maximize the mutual information I(Y; S) between the selected channel set S and the target variable Y. Because solving the problem exactly is NP-hard, you implement a greedy algorithm that iteratively adds the channel giving the largest marginal increase in I(Y; S) until the quota k is reached. Under which mathematical condition on the utility function does this greedy procedure guarantee that the value obtained is at least (1 - 1/e) ≈ 63 % of the true optimum?
The utility function is concave and Lipschitz continuous.
The utility function merely exhibits the greedy-choice property and optimal substructure.
The utility function is monotone and submodular.
The utility function is convex and twice differentiable.
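A classical result by Nemhauser, Wolsey and Fisher (1978) shows that, for any non-negative monotone submodular set function f(·) maximized under a cardinality constraint, the simple greedy algorithm that adds the element with the largest marginal gain at each step returns a set whose value is at least (1 - 1/e) times the optimum. I(Y; S) is always monotone (adding channels never decreases the information about Y), but it is not submodular in general; it has the diminishing-returns property under additional assumptions, for example when the channels are conditionally independent given Y. As a counterexample, if Y is the XOR of two independent fair-coin channels, each channel alone carries zero information about Y, yet together they determine it, so the marginal gain grows instead of shrinking. Convexity, concavity, Lipschitz continuity, and differentiability are properties of functions over continuous domains and do not translate directly to a set function, while greedy-choice and optimal substructure describe problems where greedy is exactly optimal rather than a condition that yields an approximation ratio. None of these gives the (1 - 1/e) bound; the proof relies specifically on monotonicity and submodularity.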
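For reference, the two properties and the resulting bound can be written out as follows. This is a sketch in standard notation; the symbols V (the full set of channels), A, B, x, and S_greedy are introduced here for illustration and do not appear in the question.

```latex
% Monotone: adding elements never lowers the value.
f(A) \le f(B) \quad \text{for all } A \subseteq B \subseteq V

% Submodular (diminishing returns): an element helps a smaller set
% at least as much as it helps any superset of that set.
f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B)
\quad \text{for all } A \subseteq B \subseteq V,\; x \in V \setminus B

% Greedy guarantee under a cardinality constraint |S| \le k,
% assuming f is non-negative, monotone, and submodular.
f(S_{\text{greedy}}) \;\ge\; \left(1 - \tfrac{1}{e}\right) \max_{|S| \le k} f(S)
```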
Therefore, among the options, only monotonicity together with submodularity guarantees the (1 - 1/e) bound, making that the correct choice; the other conditions do not ensure any approximation ratio.
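The greedy procedure and its guarantee can be illustrated with a minimal Python sketch. It does not estimate mutual information; as a stand-in it uses a coverage function (the number of distinct fault modes a channel set can detect), which is a textbook example of a monotone submodular utility. The channel names, the `COVERAGE` table, and all function names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations
import math

# Hypothetical toy data: the fault modes each sensor channel can detect.
COVERAGE = {
    "vibration": {1, 2, 3, 4},
    "temperature": {1, 2, 5},
    "current": {3, 4, 6},
}


def utility(selected):
    """Count distinct fault modes covered (monotone and submodular)."""
    covered = set()
    for channel in selected:
        covered |= COVERAGE[channel]
    return len(covered)


def greedy_select(channels, k, f):
    """Repeatedly add the channel with the largest marginal gain."""
    selected = []
    for _ in range(k):
        remaining = [c for c in channels if c not in selected]
        if not remaining:
            break
        # Ties are broken by order of appearance in `channels`.
        best = max(remaining, key=lambda c: f(selected + [c]) - f(selected))
        selected.append(best)
    return selected


def brute_force_optimum(channels, k, f):
    """Exhaustive search over all size-k subsets; tiny instances only."""
    return max(f(list(subset)) for subset in combinations(channels, k))


channels = list(COVERAGE)
k = 2
chosen = greedy_select(channels, k, utility)
greedy_value = utility(chosen)
optimal_value = brute_force_optimum(channels, k, utility)
bound = (1 - 1 / math.e) * optimal_value

print(chosen, greedy_value, optimal_value, round(bound, 2))
```

On this toy instance greedy first takes "vibration" (4 fault modes) and then gains only one more, reaching 5, while the best pair ("temperature" and "current") covers 6. Greedy is suboptimal here but still above the (1 - 1/e) · 6 ≈ 3.79 floor, which is what the bound promises.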