DevOps Engineer
Scheduling deployments

Schedule resource-heavy deployments on customer-facing hardware at low-traffic times so that customers are least likely to be impacted.
As a DevOps engineer, you are responsible for scheduling deployments throughout the day while minimizing the maximum load on the servers at any given time. You have data on the expected load during the day from regular customer usage, which consumes part of the available load on the servers. Additionally, you have several deployments planned, each with its own load requirements and duration.
The challenge is to schedule these deployments such that the load deviation on the servers is minimized. You also need to ensure that the load never surpasses 100%.
Objective: Minimize the total deviation from the average load on the servers.
Constraints:
- All deployments have to be executed and each deployment's start time must be in the range given in `Deployment Start Window Start` and `Deployment Start Window End`
- The total load at any given time (customer load + deployment load) should not exceed the server capacity.
- Deployments must be non-preemptive (i.e., once started, a deployment must run to completion).
Data:
The customer load can be found in scheduling_deployments_base_load.csv and has the following columns: Time,Customer Load
The deployments can be found in scheduling_deployments_deployments.csv, which has the following columns: Deployment ID,Deployment Load,Deployment Duration,Deployment Start Window Start,Deployment Start Window End
scheduling_deployments_base_load.csv:
Time,Customer Load
0,20
1,25
2,30
3,22
4,18
5,15
6,20
7,35
8,50
9,60
10,55
11,45
12,40
13,38
14,35
15,33
16,30
17,40
18,50
19,55
20,60
21,45
22,35
23,25
scheduling_deployments_deployments.csv:
Deployment ID,Deployment Load,Deployment Duration,Deployment Start Window Start,Deployment Start Window End
D1,10,2,0,8
D2,15,1,8,16
D3,20,3,15,22
D4,25,2,6,12
D5,30,1,0,23
D6,10,1,0,8
D7,20,2,5,13
D8,15,3,9,15
D9,25,2,14,21
D10,30,1,0,23
🔢 Problem Definition
Let \(T\) be the total number of time periods in the day.
\(L_t\) is the customer load at time \(t\).
\(D_i\) is the deployment load for deployment \(i\).
\(C_i\) is the duration of deployment \(i\).
\(S_i\) is the start time of deployment \(i\), i.e. the period \(t\) for which \(x_{i,t} = 1\).
\(x_{i,t}\) is a binary variable indicating whether deployment \(i\) starts at time \(t\).
Objective
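We aim to minimize the maximum deviation from the average load:
\[\min \; \max_{t} \left| \text{Load}_t - \bar{L} \right|\]
Where
\[\text{Load}_t = L_t + \sum_{i} \sum_{k=0}^{C_i-1} D_i \cdot x_{i,t-k}, \qquad \bar{L} = \frac{1}{T} \left( \sum_{t} L_t + \sum_{i} D_i \cdot C_i \right)\]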
Constraints
Deployment Window Constraints:
Each deployment \(i\) must start within its allowed time window:
\[S_i \in [\text{Deployment Start Window Start}_i, \text{Deployment Start Window End}_i]\]
Non-preemptive Constraint:
Each deployment starts exactly once within its window. Because \(x_{i,t}\) marks the start and the load constraint counts the deployment for the following \(C_i\) periods, it runs uninterrupted for its full duration:
\[\sum_{t=\text{Start Window Start}_i}^{\text{Start Window End}_i} x_{i,t} = 1 \quad \forall i\]
Load Capacity Constraint:
The total load at any time \(t\) must not exceed the server capacity:
\[\sum_{i} \sum_{k=0}^{C_i-1} D_i \cdot x_{i,t-k} + L_t \leq 100 \quad \forall t\]
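To make the load and deviation terms concrete, the following sketch evaluates two hand-picked schedules in plain Python, without a solver. The helper names `load_profile` and `max_deviation` and the chosen start times are illustrative assumptions; the customer load (first six hours) and the deployments D1 and D6 are taken from the data above.

```python
def load_profile(customer_load, deployments, starts):
    """Total load per period: customer load plus every running deployment."""
    total = list(customer_load)
    for name, start in starts.items():
        load, duration = deployments[name]
        for t in range(start, start + duration):
            if t < len(total):  # periods past the end of the horizon are ignored
                total[t] += load
    return total


def max_deviation(total):
    """Largest absolute gap between a period's load and the average load."""
    avg = sum(total) / len(total)
    return max(abs(v - avg) for v in total)


customer_load = [20, 25, 30, 22, 18, 15]       # hours 0-5 from the base load data
deployments = {"D1": (10, 2), "D6": (10, 1)}   # (load, duration)

# Two hypothetical schedules; both respect the start windows of D1 and D6 (0-8)
naive = load_profile(customer_load, deployments, {"D1": 0, "D6": 0})
spread = load_profile(customer_load, deployments, {"D1": 4, "D6": 0})

print(naive, max_deviation(naive))
print(spread, max_deviation(spread))
```

Moving D1 into the low-traffic hours 4 and 5 flattens the profile, which is the effect the objective rewards.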
from gurobipy import Model, GRB, quicksum
import pandas as pd
# Load the data from the provided CSV files
base_load_data = pd.read_csv('/mnt/data/scheduling_deployments_base_load.csv')
deployments_data = pd.read_csv('/mnt/data/scheduling_deployments_deployments.csv')
# Define parameters
T = len(base_load_data)
customer_load = base_load_data['Customer Load'].tolist()
deployments = {
    row['Deployment ID']: {
        'load': int(row['Deployment Load']),
        'duration': int(row['Deployment Duration']),
        'window': (
            int(row['Deployment Start Window Start']),
            int(row['Deployment Start Window End'])
        )
    }
    for _, row in deployments_data.iterrows()
}
# Create model
m = Model("deployment_scheduling")
# Add variables: x[d, t] = 1 if deployment d starts at time t
x = m.addVars(list(deployments), range(T), vtype=GRB.BINARY, name="x")
# Total load at time t: customer load plus every deployment that started
# within the last `duration` periods and is therefore still running
total_load = {
    t: customer_load[t] + quicksum(
        deployments[d]['load'] * x[d, t - k]
        for d in deployments
        for k in range(deployments[d]['duration'])
        if t - k >= 0
    )
    for t in range(T)
}
# Objective: minimize the maximum absolute deviation from the average load
# (avg_load assumes every deployment runs entirely within the T periods)
avg_load = (sum(customer_load) + sum(d['load'] * d['duration'] for d in deployments.values())) / T
max_deviation = m.addVar(name="max_deviation")
m.addConstrs((max_deviation >= total_load[t] - avg_load for t in range(T)), name="deviation_pos")
m.addConstrs((max_deviation >= avg_load - total_load[t] for t in range(T)), name="deviation_neg")
m.setObjective(max_deviation, GRB.MINIMIZE)
# Constraints
# 1. Each deployment starts exactly once, and only inside its start window
for d in deployments:
    window_start, window_end = deployments[d]['window']
    m.addConstr(quicksum(x[d, t] for t in range(window_start, window_end + 1)) == 1, name=f"deployment_{d}")
    # Forbid starts outside the window so no deployment is launched twice
    for t in range(T):
        if t < window_start or t > window_end:
            m.addConstr(x[d, t] == 0, name=f"outside_window_{d}_{t}")
# 2. The load at any given time should not exceed 100% of the server capacity
m.addConstrs((total_load[t] <= 100 for t in range(T)), name="capacity")
# Optimize model
m.optimize()
# Extract the results
if m.status == GRB.OPTIMAL:
    optimal_schedule = pd.DataFrame({
        'Time': range(T),
        'Total Load': [total_load[t].getValue() for t in range(T)]
    })
    print(optimal_schedule)
else:
    print("No optimal solution found.")
Assigning workloads

Having a limited number of machines to schedule workloads on, assign the jobs so as to minimize the number of machines impacted.
A DevOps Engineer is responsible for scheduling workloads on a limited number of onsite machines. Each machine has specific capacities in terms of virtual CPUs (vCPU), RAM, and GPU FLOPS. Each workload requires a certain amount of these resources to run.
Objective: Maximize the total resources of machines without any workloads assigned to them.
Constraints:
- vCPU constraint: The total vCPU requirement of the workloads assigned to a machine must not exceed the vCPU capacity of that machine.
- RAM constraint: The total RAM requirement of the workloads assigned to a machine must not exceed the RAM capacity of that machine.
- GPU FLOPS constraint: The total GPU FLOPS requirement of the workloads assigned to a machine must not exceed the GPU FLOPS capacity of that machine.
- All workloads must be scheduled.
Data:
The machines can be found in assigning_workloads_machines.csv, which has the following columns: Machines,vCPU Capacity,RAM Capacity (GB),GPU Capacity (GFLOPS)
The workloads can be found in assigning_workloads_workloads.csv, which has the following columns: Workloads,vCPU Requirement,RAM Requirement (GB),GPU Requirement (GFLOPS)
assigning_workloads_machines.csv:
Machines,vCPU Capacity,RAM Capacity (GB),GPU Capacity (GFLOPS)
Machine 1,32,128,20
Machine 2,16,64,15
Machine 3,32,64,30
Machine 4,8,32,10
Machine 5,16,64,15
Machine 6,32,128,20
Machine 7,8,32,10
Machine 8,8,32,10
assigning_workloads_workloads.csv:
Workloads,vCPU Requirement,RAM Requirement (GB),GPU Requirement (GFLOPS)
Workload A,8,16,5
Workload B,16,32,10
Workload C,4,8,2
Workload D,32,64,25
Workload E,2,4,1
Workload F,4,16,3
Workload G,16,32,12
Workload H,8,16,6
Problem Definition
Objective:
Maximize the sum of unused resources across all machines without any workloads assigned to them.
Sets:
\(M\): Set of machines, indexed by \(m\).
\(W\): Set of workloads, indexed by \(w\).
Parameters:
For each machine \(m\) in \(M\):
\(\text{vCPU}_m\): vCPU capacity of machine \(m\).
\(\text{RAM}_m\): RAM capacity (in GB) of machine \(m\).
\(\text{GPU}_m\): GPU capacity (in GFLOPS) of machine \(m\).
For each workload \(w\) in \(W\):
\(\text{vCPU}_w\): vCPU requirement of workload \(w\).
\(\text{RAM}_w\): RAM requirement (in GB) of workload \(w\).
\(\text{GPU}_w\): GPU requirement (in GFLOPS) of workload \(w\).
Decision Variables:
\(x_{mw} \in \{0, 1\}\): Binary variable indicating if workload \(w\) is assigned to machine \(m\).
\(y_m \in \{0, 1\}\): Binary variable indicating if machine \(m\) has no workloads assigned to it.
Updated Objective:
Maximize the total unused resources across all machines without any workloads assigned:
\[\text{Maximize} \sum_{m \in M} y_m \cdot (\text{vCPU}_m + \text{RAM}_m + \text{GPU}_m)\]
Constraints:
vCPU Capacity:
\[\sum_{w \in W} x_{mw} \cdot \text{vCPU}_w \leq \text{vCPU}_m \quad \forall m \in M\]
RAM Capacity:
\[\sum_{w \in W} x_{mw} \cdot \text{RAM}_w \leq \text{RAM}_m \quad \forall m \in M\]
GPU Capacity:
\[\sum_{w \in W} x_{mw} \cdot \text{GPU}_w \leq \text{GPU}_m \quad \forall m \in M\]
All Workloads Must be Scheduled:
\[\sum_{m \in M} x_{mw} = 1 \quad \forall w \in W\]
Unused Machine Condition:
\[\sum_{w \in W} x_{mw} \leq (1 - y_m) \times |W| \quad \forall m \in M\]
This ensures that if a machine has any workload assigned (\(\sum_{w} x_{mw} > 0\)), then \(y_m\) is forced to 0, meaning the machine is considered used.
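The capacity constraints and the unused-machine condition can be checked by hand for a given assignment. The sketch below does this in plain Python for a three-machine, three-workload subset of the data above; the function name `evaluate` and the example assignments are illustrative assumptions, and the summed "freed" figure mirrors the objective, which adds vCPU, RAM, and GPU capacity despite their different units.

```python
# name: (vCPU, RAM in GB, GPU in GFLOPS), a subset of the data above
machines = {
    "Machine 1": (32, 128, 20),
    "Machine 3": (32, 64, 30),
    "Machine 4": (8, 32, 10),
}
workloads = {
    "Workload A": (8, 16, 5),
    "Workload C": (4, 8, 2),
    "Workload D": (32, 64, 25),
}


def evaluate(assignment):
    """Return (feasible, freed) for a workload -> machine mapping."""
    used = {m: [0, 0, 0] for m in machines}
    for w, m in assignment.items():
        for r in range(3):
            used[m][r] += workloads[w][r]
    feasible = all(
        used[m][r] <= machines[m][r] for m in machines for r in range(3)
    )
    # y_m = 1 only for machines that received no workload at all
    idle = [m for m in machines if m not in assignment.values()]
    freed = sum(sum(machines[m]) for m in idle)
    return feasible, freed


packed = {"Workload A": "Machine 1", "Workload C": "Machine 1", "Workload D": "Machine 3"}
scattered = {"Workload A": "Machine 4", "Workload C": "Machine 1", "Workload D": "Machine 3"}
overloaded = {"Workload A": "Machine 4", "Workload C": "Machine 4", "Workload D": "Machine 1"}

print(evaluate(packed))
print(evaluate(scattered))
print(evaluate(overloaded))
```

Packing A and C onto Machine 1 leaves Machine 4 idle, so its capacity counts toward the objective; scattering the workloads leaves nothing idle, and placing Workload D on Machine 1 violates the GPU capacity.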
from gurobipy import Model, GRB, quicksum
import pandas as pd
# Load the CSV files
machines_df = pd.read_csv('/mnt/data/assigning_workloads_machines.csv')
workloads_df = pd.read_csv('/mnt/data/assigning_workloads_workloads.csv')
# Initialize the model
model = Model("Workload Scheduling")
# Sets
machines = machines_df['Machines'].tolist()
workloads = workloads_df['Workloads'].tolist()
# Parameters
vCPU_capacity = dict(zip(machines_df['Machines'], machines_df['vCPU Capacity']))
RAM_capacity = dict(zip(machines_df['Machines'], machines_df['RAM Capacity (GB)']))
GPU_capacity = dict(zip(machines_df['Machines'], machines_df['GPU Capacity (GFLOPS)']))
vCPU_req = dict(zip(workloads_df['Workloads'], workloads_df['vCPU Requirement']))
RAM_req = dict(zip(workloads_df['Workloads'], workloads_df['RAM Requirement (GB)']))
GPU_req = dict(zip(workloads_df['Workloads'], workloads_df['GPU Requirement (GFLOPS)']))
# Decision Variables
x = model.addVars(machines, workloads, vtype=GRB.BINARY, name="x")
y = model.addVars(machines, vtype=GRB.BINARY, name="y")
# Objective: Maximize unused resources only if no workloads are assigned
model.setObjective(
    quicksum(y[m] * (vCPU_capacity[m] + RAM_capacity[m] + GPU_capacity[m]) for m in machines),
    GRB.MAXIMIZE
)
# Constraints
# vCPU capacity constraints
model.addConstrs(
    (quicksum(x[m, w] * vCPU_req[w] for w in workloads) <= vCPU_capacity[m] for m in machines),
    name="vCPU_Capacity"
)
# RAM capacity constraints
model.addConstrs(
    (quicksum(x[m, w] * RAM_req[w] for w in workloads) <= RAM_capacity[m] for m in machines),
    name="RAM_Capacity"
)
# GPU capacity constraints
model.addConstrs(
    (quicksum(x[m, w] * GPU_req[w] for w in workloads) <= GPU_capacity[m] for m in machines),
    name="GPU_Capacity"
)
# Each workload must be scheduled exactly once
model.addConstrs(
    (quicksum(x[m, w] for m in machines) == 1 for w in workloads),
    name="Workload_Assignment"
)
# Adding constraint: if any workload is assigned to a machine, y_m must be 0
model.addConstrs(
    (quicksum(x[m, w] for w in workloads) <= (1 - y[m]) * len(workloads) for m in machines),
    name="No_Workload_Assigned"
)
# Optimize the model
model.optimize()
# Collecting the results
assignments = []
unused_resources = {}
if model.status == GRB.OPTIMAL:
    for m in machines:
        for w in workloads:
            if x[m, w].x > 0.5:  # if assigned
                assignments.append((m, w))
        if y[m].x > 0.5:  # if no workloads are assigned
            unused_resources[m] = {
                "vCPU": vCPU_capacity[m],
                "RAM": RAM_capacity[m],
                "GPU": GPU_capacity[m]
            }
# Display results
print("Assignments:", assignments)
print("Unused Resources:", unused_resources)
Incident Response Planning

A complex system of internal and customer-facing services with many interdependencies should be brought online efficiently in case of a disaster. The customer-facing services are assigned a priority value, which determines the order in which the services should be brought back online.
To induce urgency, we utilize the following formula that states that customer-facing services should be brought online as quickly as possible, with more important services getting a higher priority:
\(V(t) = V_0 \cdot e^{-0.0398t}\)
You are a DevOps Engineer responsible for developing an optimized incident response plan to prioritize critical systems and allocate resources efficiently during outages.
You need to plan the recovery of all 60 interconnected systems as far as possible.
Each system has dependencies on other systems, and only systems with higher numbers (customer-facing systems) have priority scores.
The goal is to get systems with a priority score up and running as quickly as possible. We start at t=0, and the time required for every recovery is indicated by "Recovery Time (minutes)".
As time goes by, the value of each system goes down. The value of a system at time \(t\) can be calculated via the following function:
\(V(t) = V_0 \cdot e^{-0.0398t}\), where \(t\) is the time at which the system has finished recovering and \(V_0\) is its initial priority score.
Objective: Maximize the total of priority scores given the above function.
Data:
The systems and their dependencies can be found in incident_response.json and has the following fields: System, Priority, Dependencies, Recovery Time (minutes)
[
{"System": "S1", "Priority": 0, "Dependencies": [], "Recovery Time (minutes)": 2},
{"System": "S2", "Priority": 0, "Dependencies": ["S1"], "Recovery Time (minutes)": 3},
{"System": "S3", "Priority": 0, "Dependencies": [], "Recovery Time (minutes)": 1},
{"System": "S4", "Priority": 0, "Dependencies": ["S2", "S3"], "Recovery Time (minutes)": 2},
{"System": "S5", "Priority": 0, "Dependencies": ["S1"], "Recovery Time (minutes)": 4},
{"System": "S6", "Priority": 0, "Dependencies": ["S4", "S5"], "Recovery Time (minutes)": 3},
{"System": "S7", "Priority": 0, "Dependencies": ["S3"], "Recovery Time (minutes)": 2},
{"System": "S8", "Priority": 0, "Dependencies": ["S7"], "Recovery Time (minutes)": 1},
{"System": "S9", "Priority": 0, "Dependencies": ["S8"], "Recovery Time (minutes)": 3},
{"System": "S10", "Priority": 0, "Dependencies": ["S2"], "Recovery Time (minutes)": 2},
{"System": "S11", "Priority": 0, "Dependencies": ["S10"], "Recovery Time (minutes)": 4},
{"System": "S12", "Priority": 0, "Dependencies": ["S4"], "Recovery Time (minutes)": 1},
{"System": "S13", "Priority": 0, "Dependencies": ["S11", "S12"], "Recovery Time (minutes)": 3},
{"System": "S14", "Priority": 0, "Dependencies": ["S2", "S5"], "Recovery Time (minutes)": 4},
{"System": "S15", "Priority": 0, "Dependencies": ["S13"], "Recovery Time (minutes)": 2},
{"System": "S16", "Priority": 0, "Dependencies": ["S14"], "Recovery Time (minutes)": 1},
{"System": "S17", "Priority": 0, "Dependencies": ["S9"], "Recovery Time (minutes)": 2},
{"System": "S18", "Priority": 0, "Dependencies": ["S16"], "Recovery Time (minutes)": 3},
{"System": "S19", "Priority": 0, "Dependencies": ["S15"], "Recovery Time (minutes)": 4},
{"System": "S20", "Priority": 0, "Dependencies": ["S17"], "Recovery Time (minutes)": 3},
{"System": "S21", "Priority": 0, "Dependencies": ["S6", "S8"], "Recovery Time (minutes)": 2},
{"System": "S22", "Priority": 0, "Dependencies": ["S19", "S9"], "Recovery Time (minutes)": 1},
{"System": "S23", "Priority": 0, "Dependencies": ["S18", "S7"], "Recovery Time (minutes)": 3},
{"System": "S24", "Priority": 0, "Dependencies": ["S5", "S10"], "Recovery Time (minutes)": 4},
{"System": "S25", "Priority": 0, "Dependencies": ["S21", "S12"], "Recovery Time (minutes)": 2},
{"System": "S26", "Priority": 0, "Dependencies": ["S13", "S11"], "Recovery Time (minutes)": 1},
{"System": "S27", "Priority": 0, "Dependencies": ["S23", "S14"], "Recovery Time (minutes)": 3},
{"System": "S28", "Priority": 0, "Dependencies": ["S26", "S22"], "Recovery Time (minutes)": 4},
{"System": "S29", "Priority": 5, "Dependencies": ["S24", "S15"], "Recovery Time (minutes)": 2},
{"System": "S30", "Priority": 5, "Dependencies": ["S17", "S28"], "Recovery Time (minutes)": 1},
{"System": "S31", "Priority": 5, "Dependencies": ["S27", "S16"], "Recovery Time (minutes)": 3},
{"System": "S32", "Priority": 5, "Dependencies": ["S30", "S18"], "Recovery Time (minutes)": 4},
{"System": "S33", "Priority": 5, "Dependencies": ["S29", "S19"], "Recovery Time (minutes)": 2},
{"System": "S34", "Priority": 5, "Dependencies": ["S20", "S31"], "Recovery Time (minutes)": 1},
{"System": "S35", "Priority": 5, "Dependencies": ["S22", "S32"], "Recovery Time (minutes)": 3},
{"System": "S36", "Priority": 8, "Dependencies": ["S25", "S21"], "Recovery Time (minutes)": 4},
{"System": "S37", "Priority": 8, "Dependencies": ["S34", "S23"], "Recovery Time (minutes)": 2},
{"System": "S38", "Priority": 8, "Dependencies": ["S24", "S33"], "Recovery Time (minutes)": 1},
{"System": "S39", "Priority": 8, "Dependencies": ["S26", "S36"], "Recovery Time (minutes)": 3},
{"System": "S40", "Priority": 8, "Dependencies": ["S27", "S35"], "Recovery Time (minutes)": 4},
{"System": "S41", "Priority": 5, "Dependencies": ["S28", "S37"], "Recovery Time (minutes)": 2},
{"System": "S42", "Priority": 5, "Dependencies": ["S38", "S29"], "Recovery Time (minutes)": 1},
{"System": "S43", "Priority": 10, "Dependencies": ["S31", "S40"], "Recovery Time (minutes)": 3},
{"System": "S44", "Priority": 10, "Dependencies": ["S30", "S39"], "Recovery Time (minutes)": 4},
{"System": "S45", "Priority": 10, "Dependencies": ["S32", "S42"], "Recovery Time (minutes)": 2},
{"System": "S46", "Priority": 10, "Dependencies": ["S41", "S33"], "Recovery Time (minutes)": 1},
{"System": "S47", "Priority": 8, "Dependencies": ["S34", "S44"], "Recovery Time (minutes)": 3},
{"System": "S48", "Priority": 10, "Dependencies": ["S43", "S35"], "Recovery Time (minutes)": 4},
{"System": "S49", "Priority": 10, "Dependencies": ["S36", "S45"], "Recovery Time (minutes)": 2},
{"System": "S50", "Priority": 10, "Dependencies": ["S37", "S46"], "Recovery Time (minutes)": 1},
{"System": "S51", "Priority": 10, "Dependencies": ["S38", "S47"], "Recovery Time (minutes)": 1},
{"System": "S52", "Priority": 20, "Dependencies": ["S39", "S48"], "Recovery Time (minutes)": 4},
{"System": "S53", "Priority": 15, "Dependencies": ["S40", "S49"], "Recovery Time (minutes)": 2},
{"System": "S54", "Priority": 25, "Dependencies": ["S41", "S50"], "Recovery Time (minutes)": 3},
{"System": "S55", "Priority": 30, "Dependencies": ["S42", "S51"], "Recovery Time (minutes)": 4},
{"System": "S56", "Priority": 5, "Dependencies": ["S43", "S52"], "Recovery Time (minutes)": 1},
{"System": "S57", "Priority": 10, "Dependencies": ["S44", "S53"], "Recovery Time (minutes)": 2},
{"System": "S58", "Priority": 15, "Dependencies": ["S45", "S54"], "Recovery Time (minutes)": 3},
{"System": "S59", "Priority": 20, "Dependencies": ["S46", "S55"], "Recovery Time (minutes)": 4},
{"System": "S60", "Priority": 25, "Dependencies": ["S47", "S56", "S58", "S59"], "Recovery Time (minutes)": 3}
]
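The dependency and recovery-time fields determine how early each system can possibly be back online. The sketch below computes these earliest finish times for the first four systems of the data, assuming (as the formulation that follows also does) that any number of systems can recover in parallel. The function name `earliest_finish` is an illustrative choice.

```python
from functools import lru_cache

# system: (dependencies, recovery time in minutes), first four entries of the data
systems = {
    "S1": ([], 2),
    "S2": (["S1"], 3),
    "S3": ([], 1),
    "S4": (["S2", "S3"], 2),
}


@lru_cache(maxsize=None)
def earliest_finish(system):
    """Finish time when recovery starts as soon as all dependencies are done."""
    deps, duration = systems[system]
    start = max((earliest_finish(d) for d in deps), default=0)
    return start + duration


finish = {s: earliest_finish(s) for s in systems}
print(finish)
```

S4 cannot start before S2 finishes at minute 5, so it completes at minute 7 even though S3 is ready after one minute.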
🔢 Problem Definition
Objective:
Maximize the total value of the recovered systems over time, where the value of each system at the time \(t\) is given by the function:
\[V(t) = V_0 \cdot e^{-0.0398t}\]
Here:
\(V_0\) is the initial priority score of the system.
\(t\) is the time when the system finishes recovering.
Decision Variables:
\(S_i\): The start time of recovery for system \(i\) (continuous).
\(F_i\): The finish time of recovery for system \(i\) (continuous).
\(V_i\): The decay factor \(e^{-0.0398 F_i}\) of system \(i\) at its finish time (continuous).
Constraints:
Dependency Constraints:
A system \(i\) can only start its recovery after all systems \(j\) it depends on have finished recovering.
\[S_i \geq F_j \quad \forall j \in \text{Dependencies}(i)\]
Recovery Time Calculation:
The finish time for system \(i\) is the start time plus its recovery duration \(T_i\).
\[F_i = S_i + T_i\]
Piecewise Linear Approximation:
The exponential decay function \(e^{-0.0398t}\) is approximated using a piecewise linear function over the interval of possible finish times.
Objective Function:
Maximize the sum of the priority-weighted decay factors of all systems at their respective finish times:
\[\text{Maximize} \sum_{i} V_0^i \cdot V_i(F_i)\]
Where \(V_0^i\) is the initial priority score of system \(i\) and \(V_i(F_i)\) is the piecewise linear approximation of \(e^{-0.0398 F_i}\) at its finish time.
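To get a feel for how close a piecewise linear approximation is to the exponential decay, the sketch below interpolates between breakpoints placed every 10 minutes from 0 to 120, the same grid the model code uses, and compares the result with the exact value. The helper `pwl` and the sample time of 25 minutes are illustrative assumptions.

```python
import math

RATE = 0.0398
breakpoints = list(range(0, 121, 10))              # one breakpoint every 10 minutes
values = [math.exp(-RATE * t) for t in breakpoints]


def pwl(t):
    """Linear interpolation between the two breakpoints surrounding t."""
    for i in range(len(breakpoints) - 1):
        t0, t1 = breakpoints[i], breakpoints[i + 1]
        if t0 <= t <= t1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the breakpoint range")


exact = math.exp(-RATE * 25)
approx = pwl(25)
print(f"exact={exact:.4f} approx={approx:.4f} error={approx - exact:.4f}")
```

Because the decay curve is convex, the interpolation never falls below it, so the approximation slightly overstates the value between breakpoints and is exact at the breakpoints themselves. A finer grid reduces the error at the cost of a larger model.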
import json
import math
import gurobipy as gp
from gurobipy import GRB
# Load data from the JSON file
with open('incident_response.json', 'r') as file:
    data = json.load(file)
# Extract relevant information
systems = [item["System"] for item in data]
priorities = {item["System"]: item["Priority"] for item in data}
dependencies = {item["System"]: item["Dependencies"] for item in data}
recovery_times = {item["System"]: item["Recovery Time (minutes)"] for item in data}
# Set up the model
model = gp.Model("Incident_Response")
# Decision variables
start_times = model.addVars(systems, vtype=GRB.CONTINUOUS, name="start_time")
finish_times = model.addVars(systems, vtype=GRB.CONTINUOUS, name="finish_time")
value_at_finish = model.addVars(systems, vtype=GRB.CONTINUOUS, name="value_at_finish")
# Constraints to ensure dependencies are met
for system in systems:
    for dep in dependencies[system]:
        model.addConstr(start_times[system] >= finish_times[dep])
# Constraints to ensure finish times are correctly calculated
for system in systems:
    model.addConstr(finish_times[system] == start_times[system] + recovery_times[system])
# Defining the piecewise linear approximation for the exponential function
t_values = list(range(0, 121, 10))  # Time intervals (in minutes)
v_values = [math.exp(-0.0398 * t) for t in t_values]  # Corresponding exponential values
for system in systems:
    if priorities[system] > 0:
        # value_at_finish approximates exp(-0.0398 * finish_time)
        model.addGenConstrPWL(finish_times[system], value_at_finish[system], t_values, v_values, "PWL_" + system)
    else:
        # Systems without a priority score contribute nothing to the objective
        model.addConstr(value_at_finish[system] == 0)
# Objective function
model.setObjective(
    gp.quicksum(
        priorities[system] * value_at_finish[system]
        for system in systems
    ),
    GRB.MAXIMIZE
)
# Optimize the model
model.optimize()
# Collect the results
results = {
    "System": [],
    "Start Time (minutes)": [],
    "Finish Time (minutes)": [],
    "Priority": [],
    "Value at Finish Time": []
}
# Solution values are only available if the solve succeeded
if model.status == GRB.OPTIMAL:
    for system in systems:
        start_time = start_times[system].X
        finish_time = finish_times[system].X
        priority = priorities[system]
        value_at_finish_val = value_at_finish[system].X
        results["System"].append(system)
        results["Start Time (minutes)"].append(start_time)
        results["Finish Time (minutes)"].append(finish_time)
        results["Priority"].append(priority)
        results["Value at Finish Time"].append(value_at_finish_val)
Testing strategy optimization

Smartly decide which machines to run tests on and what kind of testing environment to simulate.
A DevOps Engineer wants to optimize the testing strategy for a software application. The application needs to be tested on four operating systems: Linux64, Armlinux64, MacOS, and Windows. Each operating system must be tested once. There are 10 testing environments available, numbered from 1 to 10, and each OS must be assigned a unique testing environment. Additionally, there are 40 testing machines, with 10 machines dedicated to each OS. Each machine can only handle a subset of the testing environments.
The goal is to find the optimal combination of {machine, testing environment} for each operating system to maximize the total score. The score for each {machine, testing environment} combination is calculated by taking the number of days since the Testing Date and then multiplying that number by its modifier.
Objective: Maximize the total score.
Constraints:
- Each operating system (Linux64, Armlinux64, MacOS, Windows) must be tested exactly once on one of their OS-specific machines.
- Each testing environment can be chosen at most once.
- Each {machine, testing environment} combination can only be chosen if the machine supports the given environment.
Data:
Use the data from testing_strategy.csv with the following columns: OS,Testing Environment,Machine,Testing Date,Modifier
The data also show which combinations of {machine, testing environment} are available.
OS,Testing Environment,Machine,Testing Date,Modifier
Linux64,1,Linux1,2024-06-01,1
Linux64,1,Linux2,2024-06-15,2
Linux64,2,Linux3,2024-06-10,1
Linux64,2,Linux4,2024-06-05,1
Linux64,3,Linux5,2024-06-12,2
Linux64,3,Linux6,2024-06-08,1
Linux64,3,Linux7,2024-06-20,2
Linux64,4,Linux8,2024-06-18,1
Linux64,4,Linux9,2024-06-25,2
Linux64,4,Linux10,2024-06-30,1
Armlinux64,1,ArmLinux1,2024-06-01,1
Armlinux64,2,ArmLinux2,2024-06-15,2
Armlinux64,3,ArmLinux3,2024-06-10,1
Armlinux64,4,ArmLinux4,2024-06-05,1
Armlinux64,1,ArmLinux5,2024-06-12,2
Armlinux64,2,ArmLinux6,2024-06-08,1
Armlinux64,3,ArmLinux7,2024-06-20,2
Armlinux64,4,ArmLinux8,2024-06-18,1
Armlinux64,1,ArmLinux9,2024-06-25,2
Armlinux64,2,ArmLinux10,2024-06-30,1
MacOS,2,MacOS1,2024-06-01,1
MacOS,1,MacOS2,2024-06-15,2
MacOS,4,MacOS3,2024-06-10,1
MacOS,3,MacOS4,2024-06-05,1
MacOS,2,MacOS5,2024-06-12,2
MacOS,1,MacOS6,2024-06-08,1
MacOS,4,MacOS7,2024-06-20,2
MacOS,3,MacOS8,2024-06-18,1
MacOS,2,MacOS9,2024-06-25,2
MacOS,1,MacOS10,2024-06-30,1
Windows,4,Windows1,2024-06-01,1
Windows,4,Windows2,2024-06-15,2
Windows,3,Windows3,2024-06-10,1
Windows,3,Windows4,2024-06-05,1
Windows,2,Windows5,2024-06-12,2
Windows,2,Windows6,2024-06-08,1
Windows,2,Windows7,2024-06-20,2
Windows,1,Windows8,2024-06-18,1
Windows,1,Windows9,2024-06-25,2
Windows,1,Windows10,2024-06-30,1
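The scoring rule (days since the Testing Date multiplied by the modifier) can be sketched in a few lines. The function name `score` and the reference date are assumptions made for illustration; the reference date is fixed to 2024-07-01 so that the numbers are reproducible, whereas a live run would use the current date.

```python
from datetime import date


def score(testing_date, modifier, today):
    """Days elapsed since the last test, multiplied by the modifier."""
    return (today - testing_date).days * modifier


today = date(2024, 7, 1)  # hypothetical reference date

# Rows taken from the data above: Linux1 and Linux2, both on environment 1
linux1 = score(date(2024, 6, 1), 1, today)
linux2 = score(date(2024, 6, 15), 2, today)
print(linux1, linux2)
```

Although Linux1 was tested longer ago, Linux2 scores higher on this reference date because its modifier doubles its 16 elapsed days.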
🔢 Problem Definition
We need to formulate an optimization problem to maximize the total score of testing a software application on four different operating systems: Linux64, Armlinux64, MacOS, and Windows. The application needs to be tested on specific machines assigned to unique testing environments.
Objective
Maximize the total score of assigning machines to testing environments for each operating system.
Decision Variables
Let:
\(x_{ijk}\) be a binary variable that is 1 if machine \(j\) for OS \(i\) is assigned to testing environment \(k\), and 0 otherwise.
Sets
\(OS = \{ \text{Linux64}, \text{Armlinux64}, \text{MacOS}, \text{Windows} \}\)
\(E = \{ 1, 2, \ldots, 10 \}\) (testing environments)
\(M_i\) is the set of machines available for OS \(i\)
Parameters
\(s_{ijk}\): score for assigning machine \(j\) of OS \(i\) to testing environment \(k\)
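Constraints
Each OS must be assigned to exactly one {machine, testing environment} combination:
\[\sum_{j \in M_i} \sum_{k \in E} x_{ijk} = 1 \quad \forall i \in OS\]
Each testing environment can be used by at most one OS:
\[\sum_{i \in OS} \sum_{j \in M_i} x_{ijk} \leq 1 \quad \forall k \in E\]
Each {machine, testing environment} combination can only be chosen if the machine supports the given environment, so \(x_{ijk}\) is only defined for the combinations listed in the data.
Objective Function
\[\text{Maximize} \sum_{i \in OS} \sum_{j \in M_i} \sum_{k \in E} s_{ijk} \cdot x_{ijk}\]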
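For a problem this small, the constraints can also be illustrated by brute force: pick one option per OS, discard selections that reuse an environment, and keep the best total. The sketch below does this for two operating systems and four rows of the data; the scores are hypothetical values computed for an assumed reference date of 2024-07-01, and `best_selection` is an illustrative name, not part of the solver-based model.

```python
from itertools import product

# Per OS: (machine, testing environment, score); scores are hypothetical
options = {
    "Linux64": [("Linux1", 1, 30), ("Linux2", 1, 32)],
    "MacOS": [("MacOS1", 2, 30), ("MacOS2", 1, 32)],
}


def best_selection(options):
    """Try every one-option-per-OS combination with distinct environments."""
    best, best_score = None, -1
    for combo in product(*options.values()):
        envs = [env for _, env, _ in combo]
        if len(set(envs)) < len(envs):  # an environment was picked twice
            continue
        total = sum(s for _, _, s in combo)
        if total > best_score:
            best, best_score = dict(zip(options, combo)), total
    return best, best_score


best, best_score = best_selection(options)
print(best, best_score)
```

MacOS2 has the highest individual score among the MacOS options, but it shares environment 1 with both Linux options, so the best feasible selection pairs Linux2 with MacOS1.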
import pandas as pd
from gurobipy import Model, GRB
import datetime as dt
# Load the data
df = pd.read_csv('/mnt/data/devops_testing_strategy.csv')
# Create a model
model = Model('Testing_Optimization')
# Extract sets and parameters
os_list = df['OS'].unique()
env_list = df['Testing Environment'].unique()
machines = {os: df[df['OS'] == os]['Machine'].unique() for os in os_list}
# Score = days since the testing date, multiplied by the modifier
today = dt.datetime.now()
scores = {(row['OS'], row['Machine'], row['Testing Environment']):
          (row['Modifier'] * (today - pd.to_datetime(row['Testing Date'])).days)
          for _, row in df.iterrows()}
# Decision variables: one per supported {machine, testing environment} combination
x = model.addVars(scores.keys(), vtype=GRB.BINARY, name="x")
# Constraints
# Each OS must be assigned to exactly one testing environment
for os in os_list:
    model.addConstr(
        sum(x[os, m, e] for m in machines[os] for e in env_list if (os, m, e) in scores) == 1,
        name=f"assign_{os}"
    )
# Each testing environment can be used by only one OS
for e in env_list:
    model.addConstr(
        sum(x[os, m, e] for os in os_list for m in machines[os] if (os, m, e) in scores) <= 1,
        name=f"use_env_{e}"
    )
# Objective function
model.setObjective(sum(scores[key] * x[key] for key in scores), GRB.MAXIMIZE)
# Solve the model
model.optimize()
# Display results
if model.status == GRB.OPTIMAL:
    solution = model.getAttr('x', x)
    # Binary values may carry small numerical noise, so compare against 0.5
    result = pd.DataFrame([(key[0], key[1], key[2], scores[key]) for key in solution if solution[key] > 0.5],
                          columns=['OS', 'Machine', 'Testing Environment', 'Score'])
    print(result)
else:
    print("No optimal solution found.")