# code for loading the format for the notebook
import os
# path : store the current path to convert back to it later
path = os.getcwd()
os.chdir(os.path.join('..', 'notebook_format'))
from formats import load_style
load_style(plot_style=False)
os.chdir(path)
# 1. magic to print version
# 2. magic so that the notebook will reload external python modules
%load_ext watermark
%load_ext autoreload
%autoreload 2
from collections import namedtuple
from ortools.linear_solver import pywraplp
%watermark -a 'Ethen' -d -t -v -p ortools
The way to think about operation research, or optimization problem is we want to maximizie/minimize some objective, while being subjected to certain constraints.
For example, say we are deciding whether to buy ice cream or boba tea for dessert. Each type of food has an associated value, and cost, while we have a certain budget that we don't wish to exceed.
\begin{align} & \text{maximize} && \text{value}_{\text{ice_cream}} \cdot \text{ice_cream} + \text{value}_{\text{boba}} \cdot \text{boba} \nonumber \\ & \text{subject to} && \text{cost}_{\text{ice_cream}} \cdot \text{ice_cream} + \text{cost}_{\text{boba}} \cdot \text{boba} \leq \text{budget} \end{align}Say we are able to replace the value, cost, and budget part with actual numbers (in practice, assigning actual numbers to each of these coefficients is often times core pieces of the work).
\begin{align} & \text{maximize} && 3 \cdot \text{ice_cream} + 2 \cdot \text{boba} \nonumber \\ & \text{subject to} && 2 \cdot \text{ice_cream} + 1 \cdot \text{boba} \leq 1 \end{align}Given this toy problem, we can eye ball the solution, and see that we should use our limited budget to buy a boba tea for dessert. Operation research, a.k.a optimization techniques helps us algorithmically find solutions for these types of problems at a much larger scale.
The following section, uses ortools
library to solve this problem programmatically.
budget = 1
DessertInfo = namedtuple('DessertInfo', ['name', 'value', 'cost'])
dessert_infos = [
DessertInfo('ice_cream', 3, 2),
DessertInfo('boba', 2, 1),
]
num_desserts = len(dessert_infos)
dessert_infos
# creates solver
solver = pywraplp.Solver.CreateSolver('GLOP')
# creates variables
variables = [solver.NumVar(0, solver.infinity(), dessert_info.name) for dessert_info in dessert_infos]
# define constraints
constraint_coefficients = [dessert_info.cost for dessert_info in dessert_infos]
constraint = [constraint_coefficients[i] * variables[i] for i in range(num_desserts)]
solver.Add(solver.Sum(constraint) <= budget)
# define objective
objective_coefficients = [dessert_info.value for dessert_info in dessert_infos]
objective = constraint = [objective_coefficients[i] * variables[i] for i in range(num_desserts)]
solver.Maximize(solver.Sum(objective))
# solve
status = solver.Solve()
# extract optimal/feasible value
if status == pywraplp.Solver.OPTIMAL or status == pywraplp.Solver.FEASIBLE:
optimal_value = solver.Objective().Value()
print(f'Optimal Value: = {optimal_value}')
for i in range(num_desserts):
print(variables[i].name(), variables[i].solution_value())
A couple of important things to note:
NumVar
specifies that our variables can take on numeric solutions. Often times, our problem might require some of the variables to be integers, these are called Mixed Integer Programming. e.g. In our example, we probably can't buy 1.5 portion of boba. In these cases, we can specify our variables to be IntVar
.ortools
out there, feel free to pick and choose based on preferences or speed. The exact API might be different, but the main idea revolves around defining the objective, defining the variables, adding the constraints, solving it and extracting the optimal/feasible solution.Continuing with our discussions around Mixed Integer Programming, a closely related problem is the assignment problem, where our variables involves boolean decisions of 0 and 1 values.
We'll use the examples from this blog post, Blog: Towards optimal personalization: synthesisizing machine learning and operations research.
Say we are working in the marketing team, and we have different types of churn prevention channel, each having different prices, while different users/customers' retention rate is different for each channel. Our constraint is not spending above our monthly marketing budget, and the goal is to maxmize the total number of retained customers.
\begin{align} \text{maximize} & \sum_{u, c} R_{u, c} A_{u, c} \nonumber \\ \text{subject to} & \sum_{u, c} P_{u, c} A_{u, c} \leq B \\ & \sum_{c} A_{u, c} = 1, \forall u \in U \\ & a_{u, c} \in \{0, 1\} \end{align}Where:
Let's say we have 4 channels: email (0.25), push notification (0.3), text message (0.85), and phone call (5.0). Number in parenthesis indicates the cost/price. As for the retention probability, we will be using some randomly generated numbers, but imagine in real world scenarios where this can come from aggregated historical information, or even generated by some machine learning models.
budget = 1000
price = [25, 30, 85, 250]
# rentention probability for each customer and channel pair
retention_prob = [
[0.02, 0.27, 0.17, 0.87],
[0.14, 0.21, 0.28, 0.014],
[0.13, 0.003, 0.016, 0.64],
[0.14, 0.04, 0.14, 0.26],
[0.04, 0.24, 0.11, 0.31],
]
num_users = len(retention_prob)
num_channels = len(retention_prob[0])
# creates the solver for the mixed integer programming
solver = pywraplp.Solver.CreateSolver('SCIP')
# variable: assignment problem, creating a dictionary of binary variables
variables = {}
for i in range(num_users):
for j in range(num_channels):
variables[i, j] = solver.IntVar(0, 1, f'prob{i}_{j}')
# constraint: each user is assigned to at most 1 channel.
for i in range(num_users):
solver.Add(solver.Sum([variables[i, j] for j in range(num_channels)]) <= 1)
# constraint: total cost should not exceed budget
constraints = []
for j in range(num_channels):
for i in range(num_users):
constraint = price[j] * variables[i, j]
constraints.append(constraint)
solver.Add(solver.Sum(constraints) <= budget)
# objective
objective_terms = []
for i in range(num_users):
for j in range(num_channels):
objective_terms.append(retention_prob[i][j] * variables[i, j])
solver.Maximize(solver.Sum(objective_terms))
# invokes the solver
status = solver.Solve()
if status == pywraplp.Solver.OPTIMAL or status == pywraplp.Solver.FEASIBLE:
optimal_value = solver.Objective().Value()
print(f'Optimal Value: = {optimal_value}')
for i in range(num_users):
for j in range(num_channels):
# check indicator variable's value, with tolerance for floating point arithmetic
if variables[i, j].solution_value() > 0.5:
print(f'User {i} assigned to Channel {j}, Cost = {price[j]}')
In this article, we took a sneak peak into some problems that can benefit from leveraging optimization. The problems that we deal with in real world settings can be a lot more complicated than the examples seen here, but hopefully, this gives you the idea that whenever we see a problem that involves maximizing some objectives given some constraint, we have a tool at hand that we can turn to.