Set Cover Problem

Table of contents

Set Cover Algorithm
Implementation

The set cover algorithm provides a solution to many real-world resource allocation problems. For instance, consider an airline assigning crew members to each of its airplanes so that every flight has enough people to meet the requirements of the journey. The airline takes into account the flight timings, the duration, the stopovers, and the availability of the crew when assigning them to flights. This is where the set cover algorithm comes into the picture.

Consider a universal set U containing a finite number of elements, together with a collection of subsets S = {S₁, S₂, S₃, S₄... S_n} whose union is U. The set cover algorithm finds the minimum number of subsets such that, together, they cover all the elements present in the universal set.
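To make the definition concrete, the following is a minimal sketch that checks whether a collection of subsets covers a universe and finds a smallest cover by exhaustive search. The function names `is_cover` and `minimum_cover` and the toy instance are hypothetical, chosen only for illustration. The search is exponential in the number of subsets, so it is suitable only for tiny inputs.

```python
from itertools import combinations


def is_cover(universe, chosen):
    """Return True if the union of the chosen subsets equals the universe."""
    covered = set().union(*chosen)
    return covered == set(universe)


def minimum_cover(universe, subsets):
    """Try collections of increasing size and return the indices of the first cover found.

    Because sizes are tried in increasing order, the first cover found uses
    the fewest subsets. Returns None if the subsets cannot cover the universe.
    """
    for k in range(len(subsets) + 1):
        for combo in combinations(range(len(subsets)), k):
            if is_cover(universe, [subsets[i] for i in combo]):
                return list(combo)
    return None


# Hypothetical toy instance (not taken from the text).
U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(minimum_cover(U, S))  # [0, 3]
```

Here no single subset covers U, and the first pair that does is S[0] together with S[3].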

As shown in the above diagram, the dots represent the elements present in the universal set U, which are grouped into different sets, S = {S₁, S₂, S₃, S₄, S₅, S₆}. The smallest collection of sets that covers all the elements is the optimal output = {S₁, S₂, S₃}.

Set Cover Algorithm

The set cover algorithm takes the universal set and the collection of sets as input and returns a collection of sets that together include all the universal elements, using as few sets (or as little total cost) as it can.

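The set cover problem is NP-Hard. The greedy algorithm described below is an approximation algorithm: the cost of its solution is at most H(d) ≤ ln d + 1 times the optimum, where d is the size of the largest subset and H(d) = 1 + 1/2 + ... + 1/d.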

Algorithm

Step 1 − Initialize Output = {} where Output represents the output set of elements.

Step 2 − While the Output set does not include all the elements in the universal set, do the following −

Find the cost-effectiveness of every subset S_i in the collection S using the formula, $\frac{Cost\left(S_{i}\right)}{\left|S_{i} - Output\right|}$, that is, the cost of the subset divided by the number of its elements that are not yet in Output.
Find the subset with the minimum cost-effectiveness in this iteration. Add the subset to the Output set.

Step 3 − Repeat Step 2 until no uncovered elements are left in the universe. The output achieved is the final Output set.
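The steps above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: the function name `greedy_weighted_set_cover`, the dictionary-based input format, and the toy instance are assumptions made for the example. Ties are broken in favour of the subset that appears first.

```python
def greedy_weighted_set_cover(universe, subsets, costs):
    """Greedy set cover driven by cost-effectiveness.

    subsets: dict mapping a name to a set of elements.
    costs:   dict mapping the same names to positive costs.
    Returns the chosen names in the order they were selected.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best_name, best_ratio = None, None
        for name, elements in subsets.items():
            new_elements = len(elements & uncovered)
            if new_elements == 0:
                continue  # adds nothing new, so its ratio is undefined
            ratio = costs[name] / new_elements
            if best_ratio is None or ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best_name)
        uncovered -= subsets[best_name]
    return chosen


# Hypothetical toy instance (not taken from the text).
universe = {1, 2, 3, 4, 5}
subsets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 2, 3, 4, 5}}
costs = {"A": 3, "B": 1, "C": 2, "D": 10}
print(greedy_weighted_set_cover(universe, subsets, costs))  # ['B', 'A', 'C']
```

In the first iteration B has the lowest ratio (1 / 2 = 0.5), then A (3 / 2 = 1.5), then C (2 / 1 = 2), for a total cost of 6.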

Pseudocode

APPROX-GREEDY-SET_COVER(X, S)
   U = X
   OUTPUT = ∅
   while U ≠ ∅
      select S_i ∈ S which has maximum |S_i ∩ U|
      U = U – S_i
      OUTPUT = OUTPUT ∪ {S_i}
   return OUTPUT

Analysis

Assuming the overall number of elements equals the overall number of sets (|X| = |S|), the code runs in O(|X|³) time.
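A short derivation of this bound, assuming the straightforward implementation in which every |S_i ∩ U| is recomputed from scratch in each iteration and one intersection takes O(|X|) time:

$$
T = \underbrace{\min(|X|, |S|)}_{\text{iterations}} \times \underbrace{|S|}_{\text{sets scanned per iteration}} \times \underbrace{O(|X|)}_{\text{one intersection}} = O\left(|X| \cdot |S| \cdot \min(|X|, |S|)\right)
$$

Each iteration covers at least one new element and selects a set that was not selected before, so the loop runs at most min(|X|, |S|) times. Setting |X| = |S| gives O(|X|³).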

Example

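Let us look at an example that describes the approximation algorithm for the set covering problem in more detail. The universal set is the union of the five sets below, U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 20}.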

S₁ = {1, 2, 3, 4}                cost(S₁) = 5
S₂ = {2, 4, 5, 8, 10}            cost(S₂) = 10
S₃ = {1, 3, 5, 7, 9, 11, 13}     cost(S₃) = 20
S₄ = {4, 8, 12, 16, 20}          cost(S₄) = 12
S₅ = {5, 6, 7, 8, 9}             cost(S₅) = 15

Step 1

The output set, Output = ∅

Find the cost effectiveness of each set while there are no elements in the output set,

S₁ = cost(S₁) / |S₁ – Output| = 5 / 4 = 1.25
S₂ = cost(S₂) / |S₂ – Output| = 10 / 5 = 2
S₃ = cost(S₃) / |S₃ – Output| = 20 / 7 ≈ 2.86
S₄ = cost(S₄) / |S₄ – Output| = 12 / 5 = 2.4
S₅ = cost(S₅) / |S₅ – Output| = 15 / 5 = 3

The minimum cost effectiveness in this iteration is achieved at S₁; therefore, this subset is added to the output set, Output = {S₁} with elements {1, 2, 3, 4}.

Step 2

Find the cost effectiveness of each remaining set, counting only the elements that are not yet in the output set,

S₂ = cost(S₂) / |S₂ – Output| = 10 / |{5, 8, 10}| = 10 / 3 ≈ 3.33
S₃ = cost(S₃) / |S₃ – Output| = 20 / |{5, 7, 9, 11, 13}| = 20 / 5 = 4
S₄ = cost(S₄) / |S₄ – Output| = 12 / |{8, 12, 16, 20}| = 12 / 4 = 3
S₅ = cost(S₅) / |S₅ – Output| = 15 / |{5, 6, 7, 8, 9}| = 15 / 5 = 3

The minimum cost effectiveness in this iteration is shared by S₄ and S₅. Breaking the tie in favour of the lower-numbered set, S₄ is added to the output set, Output = {S₁, S₄} with elements {1, 2, 3, 4, 8, 12, 16, 20}.

Step 3

Find the cost effectiveness of each remaining set for the new elements in the output set,

S₂ = cost(S₂) / |S₂ – Output| = 10 / |{5, 10}| = 10 / 2 = 5
S₃ = cost(S₃) / |S₃ – Output| = 20 / |{5, 7, 9, 11, 13}| = 20 / 5 = 4
S₅ = cost(S₅) / |S₅ – Output| = 15 / |{5, 6, 7, 9}| = 15 / 4 = 3.75

The minimum cost effectiveness in this iteration is achieved at S₅; therefore, this subset is added to the output set, Output = {S₁, S₄, S₅} with elements {1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 16, 20}.

Step 4

Find the cost effectiveness of each remaining set for the new elements in the output set,

S₂ = cost(S₂) / |S₂ – Output| = 10 / |{10}| = 10 / 1 = 10
S₃ = cost(S₃) / |S₃ – Output| = 20 / |{11, 13}| = 20 / 2 = 10

The two sets are tied in this iteration. Breaking the tie in favour of the lower-numbered set again, S₂ is added to the output set, Output = {S₁, S₄, S₅, S₂} with elements {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 16, 20}.

Step 5

Find the cost effectiveness of the remaining set for the new elements in the output set,

S₃ = cost(S₃) / |S₃ – Output| = 20 / |{11, 13}| = 20 / 2 = 10

S₃ is the only set left and it covers the remaining elements; therefore, it is added to the output set, Output = {S₁, S₄, S₅, S₂, S₃} with elements {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 20}.

The final output that covers all the elements present in the universal finite set is, Output = {S₁, S₄, S₅, S₂, S₃}, with a total cost of 5 + 12 + 15 + 10 + 20 = 62.
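The example can be checked with a short Python sketch that runs the cost-effectiveness rule on the five sets above and compares the result with an exhaustive search. The helper names `greedy_cover` and `cheapest_cover` are hypothetical, and ties are assumed to be broken in favour of the lower-numbered set. The selection order is left aside and only the chosen sets and their total cost are printed.

```python
from itertools import combinations

subsets = {
    "S1": {1, 2, 3, 4},
    "S2": {2, 4, 5, 8, 10},
    "S3": {1, 3, 5, 7, 9, 11, 13},
    "S4": {4, 8, 12, 16, 20},
    "S5": {5, 6, 7, 8, 9},
}
costs = {"S1": 5, "S2": 10, "S3": 20, "S4": 12, "S5": 15}
universe = set().union(*subsets.values())


def greedy_cover(universe, subsets, costs):
    """Repeatedly pick the set with the lowest cost per newly covered element."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Only sets that still add at least one uncovered element are candidates.
        # The universe here is the union of the sets, so this list is never empty.
        candidates = [n for n in subsets if subsets[n] & uncovered]
        best = min(candidates, key=lambda n: costs[n] / len(subsets[n] & uncovered))
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


def cheapest_cover(universe, subsets, costs):
    """Exhaustive search over all collections; feasible only for tiny instances."""
    best, best_cost = None, float("inf")
    names = list(subsets)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            if set().union(*(subsets[n] for n in combo)) == universe:
                total = sum(costs[n] for n in combo)
                if total < best_cost:
                    best, best_cost = list(combo), total
    return best, best_cost


greedy = greedy_cover(universe, subsets, costs)
greedy_cost = sum(costs[n] for n in greedy)
optimal = cheapest_cover(universe, subsets, costs)
print(sorted(greedy), greedy_cost)  # ['S1', 'S2', 'S3', 'S4', 'S5'] 62
print(optimal)                      # (['S2', 'S3', 'S4', 'S5'], 57)
```

The greedy rule selects all five sets at a cost of 62, while the exhaustive search shows that S₁ is redundant: {S₂, S₃, S₄, S₅} already covers every element at a cost of 57. This illustrates that the greedy algorithm is an approximation rather than an exact method.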

Implementation

Following are the implementations of the above approach in various programming languages (C, C++, Java, and Python, in that order). They follow the pseudocode, in which every set has the same cost −

#include <stdio.h>
#define MAX_SETS 100
#define MAX_ELEMENTS 1000
// Greedy set cover for sets of equal cost.
// S[i][j] == 1 means that set i contains the element X[j].
// Stores the indices of the chosen sets in output and returns their count.
int setCover(int X[], int S[][MAX_ELEMENTS], int numSets, int numElements, int output[]) {
   (void)X; // Only element positions are needed, not the element values
   // U[j] == 1 means that element X[j] is still uncovered
   int U[MAX_ELEMENTS];
   for (int i = 0; i < numElements; i++) {
      U[i] = 1;
   }
   int selectedSets[MAX_SETS];
   for (int i = 0; i < MAX_SETS; i++) {
      selectedSets[i] = 0; // Initialize all to 0 (not selected)
   }
   int outputIdx = 0;
   while (outputIdx < numSets) {  // Ensure we don't exceed the maximum number of sets
      int maxIntersectionSize = 0;
      int selectedSetIdx = -1;
      // Find the set Si with the maximum intersection with U
      for (int i = 0; i < numSets; i++) {
         if (selectedSets[i] == 0) { // Check if the set is not already selected
            int intersectionSize = 0;
            for (int j = 0; j < numElements; j++) {
               if (U[j] && S[i][j]) {
                  intersectionSize++;
               }
            }
            if (intersectionSize > maxIntersectionSize) {
               maxIntersectionSize = intersectionSize;
               selectedSetIdx = i;
            }
         }
      }
      // If no set covers an uncovered element, break from the loop
      if (selectedSetIdx == -1) {
         break;
      }
      // Mark the selected set as "selected" in the array
      selectedSets[selectedSetIdx] = 1;
      // Remove the elements covered by the selected set from U
      for (int j = 0; j < numElements; j++) {
         if (S[selectedSetIdx][j]) {
            U[j] = 0;
         }
      }
      // Add the selected set to the output
      output[outputIdx++] = selectedSetIdx;
   }
   return outputIdx;
}
int main() {
   int X[MAX_ELEMENTS] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
   int S[MAX_SETS][MAX_ELEMENTS] = {
      {1, 1, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 1, 1, 1, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 1, 1, 1, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 1, 1, 1}
   };
   int numSets = 5;
   int numElements = 10;
   int output[MAX_SETS];
   int numSelectedSets = setCover(X, S, numSets, numElements, output);
   printf("Selected Sets: ");
   for (int i = 0; i < numSelectedSets; i++) {
      printf("%d ", output[i]);
   }
   printf("\n");
   return 0;
}

Output

Selected Sets: 1 3 4 0 2

#include <iostream>
using namespace std;
#define MAX_SETS 100
#define MAX_ELEMENTS 1000
// Function to find the set cover using the Approximate Greedy Set Cover algorithm.
// S[i][j] == 1 means that set i contains the element X[j].
// Stores the indices of the chosen sets in output and returns their count.
int setCover(int X[], int S[][MAX_ELEMENTS], int numSets, int numElements, int output[])
{
   (void)X; // Only element positions are needed, not the element values
   // U[j] == 1 means that element X[j] is still uncovered
   int U[MAX_ELEMENTS];
   for (int i = 0; i < numElements; i++) {
      U[i] = 1;
   }
   int selectedSets[MAX_SETS];
   for (int i = 0; i < MAX_SETS; i++) {
      selectedSets[i] = 0; // Initialize all to 0 (not selected)
   }
   int outputIdx = 0;
   while (outputIdx < numSets) {  // Ensure we don't exceed the maximum number of sets
      int maxIntersectionSize = 0;
      int selectedSetIdx = -1;
      // Find the set Si with maximum intersection with U
      for (int i = 0; i < numSets; i++) {
         if (selectedSets[i] == 0) { // Check if the set is not already selected
            int intersectionSize = 0;
            for (int j = 0; j < numElements; j++) {
               if (U[j] && S[i][j]) {
                  intersectionSize++;
               }
            }
            if (intersectionSize > maxIntersectionSize) {
               maxIntersectionSize = intersectionSize;
               selectedSetIdx = i;
            }
         }
      }
      // If no set covers an uncovered element, break from the loop
      if (selectedSetIdx == -1) {
         break;
      }
      // Mark the selected set as "selected" in the array
      selectedSets[selectedSetIdx] = 1;
      // Remove the elements covered by the selected set from U
      for (int j = 0; j < numElements; j++) {
         if (S[selectedSetIdx][j]) {
            U[j] = 0;
         }
      }
      // Add the selected set to the output
      output[outputIdx++] = selectedSetIdx;
   }
   return outputIdx;
}
int main()
{
   int X[MAX_ELEMENTS] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
   int S[MAX_SETS][MAX_ELEMENTS] = {
      {1, 1, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 1, 1, 1, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 1, 1, 1, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 1, 1, 1}
   };
   int numSets = 5;
   int numElements = 10;
   int output[MAX_SETS];
   int numSelectedSets = setCover(X, S, numSets, numElements, output);
   cout << "Selected Sets: ";
   for (int i = 0; i < numSelectedSets; i++) {
      cout << output[i] << " ";
   }
   cout << endl;
   return 0;
}

Output

Selected Sets: 1 3 4 0 2

import java.util.*;
public class SetCover {
   public static List<Integer> setCover(int[] X, int[][] S) {
      Set<Integer> U = new HashSet<>();
      for (int x : X) {
         U.add(x);
      }
      List<Integer> output = new ArrayList<>();
      while (!U.isEmpty()) {
         int maxIntersectionSize = 0;
         int selectedSetIdx = -1;
         for (int i = 0; i < S.length; i++) {
            int intersectionSize = 0;
            for (int j = 0; j < S[i].length; j++) {
               if (U.contains(S[i][j])) {
                  intersectionSize++;
               }
            }
            if (intersectionSize > maxIntersectionSize) {
               maxIntersectionSize = intersectionSize;
               selectedSetIdx = i;
            }
         }
         if (selectedSetIdx == -1) {
            break;
         }
         for (int j = 0; j < S[selectedSetIdx].length; j++) {
            U.remove(S[selectedSetIdx][j]);
         }
         output.add(selectedSetIdx);
      }
      return output;
   }
   public static void main(String[] args) {
      int[] X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
      int[][] S = {
         {1, 2},
         {2, 3, 4},
         {4, 5, 6},
         {6, 7, 8},
         {8, 9, 10}
      };
      List<Integer> selectedSets = setCover(X, S);
      System.out.print("Selected Sets: ");
      for (int idx : selectedSets) {
         System.out.print(idx + " ");
      }
      System.out.println();
   }
}

Output

Selected Sets: 1 3 4 0 2

def set_cover(X, S):
    """Greedy set cover for sets of equal cost; returns the indices of the chosen sets."""
    U = set(X)  # Elements that are still uncovered
    output = []
    while U:
        max_intersection_size = 0
        selected_set_idx = -1
        # Find the set that covers the most uncovered elements
        for i, s in enumerate(S):
            intersection_size = len(U.intersection(s))
            if intersection_size > max_intersection_size:
                max_intersection_size = intersection_size
                selected_set_idx = i
        # No set covers any remaining element, so stop
        if selected_set_idx == -1:
            break
        U = U - set(S[selected_set_idx])
        output.append(selected_set_idx)
    return output


if __name__ == "__main__":
    X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    S = [
        {1, 2},
        {2, 3, 4},
        {4, 5, 6},
        {6, 7, 8},
        {8, 9, 10}
    ]
    selected_sets = set_cover(X, S)
    # Unpack the list so the indices print separated by spaces
    print("Selected Sets:", *selected_sets)

Output

Selected Sets: 1 3 4 0 2
