
UNIVERSIDAD DE CHILE

FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS

DEPARTAMENTO DE INGENIERÍA QUÍMICA Y BIOTECNOLOGÍA

DESIGN OF A REAL BIOTECHNOLOGICAL MULTIPRODUCT BATCH PLANT WITH AN OPTIMIZATION BASED APPROACH

THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF ENGINEERING SCIENCES, MENTION IN CHEMISTRY

GABRIELA DANIELA SANDOVAL HEVIA

ADVISORS:
JUAN ASENJO DE LEUZE
NICOLÁS FIGUEROA GONZÁLEZ

COMMITTEE:
DANIEL ESPINOZA GONZÁLEZ
MARÍA ELENA LIENQUEO CONTRERAS
LUIS CISTERNAS ARAPIO

This work is funded by a CONICYT scholarship for doctoral studies in Chile,
by the Fondecyt regular projects 1110024 and 1150046, and by the Basal Centre
CeBiB FB0001, funded by CONICYT.

SANTIAGO DE CHILE
MARCH 2016
SUMMARY OF THE THESIS
SUBMITTED FOR THE DEGREE OF DOCTOR OF ENGINEERING SCIENCES,
MENTION IN CHEMISTRY
BY: GABRIELA DANIELA SANDOVAL HEVIA
DATE: MARCH 2016
ADVISOR: MR. JUAN ASENJO DE LEUZE

Biotechnological products, such as biopharmaceuticals among others, are products whose
production technologies are in constant development. In addition, their production scales
are small, which makes batch plants the most suitable for their production. In particular,
multi-product batch plants allow the production of a variety of biotechnological products
with several stages in common.
One way to model the design of multi-product batch plants is the optimization based
approach, first studied for this type of plant by Robinson and Loonkar, who addressed the
design of such plants by sizing the equipment that makes them up. Despite the many advances
in the area, which include decisions such as the duplication of units, the allocation of
intermediate storage tanks, production scheduling and environmental considerations, among
other improvements, there is still a lack of work in which this type of approach is applied
to real plants.
This work studies a Mixed-Integer Linear (MILP) reformulation of the Mixed-Integer
Non-Linear problem (MINLP) that results from posing the model for the design of a
biotechnological multi-product batch plant. In a first step, a MILP reformulation is studied
that models the design of a plant using equipment sizes from a continuous set and a selection
of hosts from a discrete set of options. This reformulation makes use of advanced
reformulation techniques and proved to be scalable and reliable for application in real
cases. In a second step, the original MILP reformulation was modified to include a selection
of equipment from both a discrete and a continuous set, giving a more realistic approach to
modelling a biotechnological multi-product batch plant, in which units such as reactors can
be built according to the customer's needs, whereas units such as chromatographic columns
are only available in discrete sizes given by the supplier.
Information from real processes that were part of a real multi-product batch plant allowed
the model parameters to be determined, and a comparison between the different production
lines and the real plant showed that this type of model can allow great savings in the cost
of the main equipment of the plant.
Finally, since the approach studied relies on modelling and optimization software, the
model is more accessible to those who may use it in practice. However, lower levels of
implementation could improve solution times, allowing the inclusion of more complex
formulations such as, for example, variable costs or production targets.
To my husband, for being my strength,
to my children, for being my joy
Acknowledgments

I want to thank my advisor, Juan Asenjo, for trusting me, for his generosity in allowing me
to work with Daniel and Nicolás as well, and for his support in the less pleasant moments
I went through during these years of doctoral study.
To Daniel and Nicolás, for their working time, ideas and corrections. Daniel, I thank you
in particular for the time you took to ask a little more, and for your advice, both personal
and for my academic future. It was not always easy to hear, but it was always valuable and
helped me to reflect.
I thank the lovely people I met during my time in the laboratory. Among them Pablo, who
was always available to help me understand the mathematical language that was at times too
hard for me to follow. And those who accompanied me in my first steps and were there for the
great milestones of my story over these 6 years: Cami, Fran, Dani S., Vida, Gianni, Paty and
Alicia. They were my friends and companions for part of this journey, and I keep beautiful
memories of them.
Thanks to my parents for watching from a distance and always being ready to help. To my
mother-in-law for being wonderful to me, and for giving me the peace of mind to work knowing
that my Cati was being well cared for and pampered.
Finally, I want to thank my husband. For allowing me to work even at the expense of his own
time. For his company and unconditional support. For his trust and infinite love.
Today this stage ends, but I know there is a long road ahead. I hope to keep finding on my
way people like you, who gave me lovely moments and a lovely place to work.

Gabriela Sandoval Hevia.

March, 2016
Abstract
Biotechnological products, such as biopharmaceuticals among others, are products whose
production technologies are in constant development. In addition, their production scales
are small, making batch plants the most suitable type for their production. In particular,
multi-product batch plants allow the production of a variety of biotechnological products
with many common steps.
One way to model the design of such plants is the optimization based approach that
was first studied in 1972 by Robinson and Loonkar, who addressed the equipment sizing of
a multi-product batch plant. Despite the advances in the area, including decisions such as
duplication of units, allocation of intermediate storage vessels, scheduling and environmental
considerations, among other improvements, there is a lack of reported work where this type
of approach is applied to real plants.
In this work a Mixed-Integer Linear Programming (MILP) reformulation of the resulting
Mixed-Integer Non-Linear Programming (MINLP) problem for the design of a biotechnological
multi-product batch plant is studied. In a first step, a MILP reformulation that addresses the
design of a plant using continuous equipment sizes and discrete host selection is studied.
This reformulation made use of advanced reformulation techniques and proved to be scalable
and reliable for its application in real cases. In a second step, the former MILP reformulation
was modified to include the selection of equipment in both continuous and discrete sizes,
giving a more realistic approach to modelling a real biotechnological multi-product
batch plant. Items such as reactors may be built according to customer needs, but units such
as chromatographic columns are only available in discrete sets of sizes given by manufacturers.
Information from real processes that were part of an actual multi-product batch plant
allowed the computation of the model parameters, and a comparison of the optimized facilities
with the actual plant showed that this type of model may achieve great savings in the cost
of the main equipment of the plant.
As the studied approach relies on off-the-shelf optimization and modelling software,
the model is more accessible to practitioners. Nevertheless, lower implementation levels could
improve solution times, allowing for the inclusion of more complex formulations such as
variable costs and production target parameters, among others.
Contents

1 Introduction
1.1 Biotechnological industry and batch processes
1.2 Multi-product batch plant design
1.3 Elements for the design and synthesis
1.3.1 Synthesis decisions
1.3.2 Design decisions
1.4 MINLP versus MILP
1.5 Objectives
1.5.1 Main Objective
1.5.2 Specific Objectives
1.6 Summary of methodology and principal results

2 Design of multi-product batch plants
2.1 Introduction
2.2 Current limitations
2.3 Problem Formulation
2.3.1 MINLP formulation
2.3.2 Mixed-Integer linear formulations
2.4 Methods
2.4.1 Solvers and modelling language
2.4.2 Execution environment
2.4.3 Methodology
2.5 Results and Discussion
2.5.1 Size of instances
2.5.2 Selection of cutting points
2.5.3 Equipment sizing: comparison of problems (P1) and (P3)
2.5.4 Routes selection: comparison of problems (P2) and (P4)
2.6 Conclusions

3 Design of a real plant
3.1 Introduction
3.2 Design of a biotechnological multiproduct batch plant
3.2.1 Processes description
3.2.2 Estimation of processes data and plant cost
3.2.3 Multiproduct batch plant
3.3 Problem formulation
3.3.1 Mathematical modeling
3.3.2 Size and time factors
3.3.3 Computational tools / Execution environment
3.4 Results and Discussion
3.4.1 Cost of the real plant
3.4.2 Number of cutting points and a posteriori gaps
3.4.3 Original purification facility versus corresponding stages in a multi-product batch plant
3.4.4 Optimization of the “global” multiproduct batch plant of 44 stages
3.5 Conclusions

4 Main conclusions

References

Appendices

Appendix A Published article
List of Figures

1.1 Multi-product/flowshop plant (taken from Biegler et al. (1997)).
1.2 Multipurpose/jobshop plant (taken from Biegler et al. (1997)).

2.1 Basic techniques used to model synthesis and design decisions considering
continuous equipment sizes and discrete host selection.
2.2 Formulations compared in this article. Model (P1) is the most basic
formulation that only includes design decisions. Model (P2) includes the
selection of the downstream processes without the use of Big-M constraints.
Models (P3) and (P4) are the transformed models of (P1) and (P2),
respectively, using our proposed inner and outer approximations.
2.3 Feasible region (patterned area) of (a) outer and (b) inner approximations (dashed
lines) of an exponential function (solid line). Points $b_i$ are the cutting points
and LB and UB are the lower and upper bounds of x.
2.4 Comparison of performance profiles of (a) the relative optimality gap obtained a
posteriori and (b) the logarithm of the running time of “sizing instances” solved
with linear model (P3) using 17, 33 and 65 cutting points for lower and upper
approximations with a relative optimality gap of 0.1%.
2.5 Comparison of absolute errors using 2 different sets of 33 cutting points, where
f(x) are the linear functions used to approximate the exponential function
between 2 cutting points.
2.6 Comparison of performance profiles of a posteriori gaps obtained using 33
cutting points to solve model (P4), where f(x) are the linear functions used
to approximate the exponential function between 2 cutting points. The time limit
was set to 12 hours.
2.7 Comparison of performance profiles of the logarithm of the running time of “sizing
instances” solved using models (P1) and (P3) with a relative optimality gap
of 0.1% for the linear solver and 2% for non-linear solvers.
2.8 Comparison of performance profiles of the logarithm of the ratio of the
computing time of the model-solver pair to the best time among the
model-solver pairs for “sizing instances” solved with models (P1) and (P3) with
a relative optimality gap of 0.1% for the linear solver and 2% for non-linear
solvers.
2.9 Comparison of performance profiles of the logarithm of the running time of “sizing
instances” solved using models (P1), (P2), (P3) and (P4) with a relative
optimality gap of 0.1% for the linear solver and 2% for non-linear solvers.
2.10 Comparison of performance profiles of the relative difference between the simple and
the more complex formulation for “sizing instances”. Models (P1) and (P2) were
solved to an optimality gap of 2%, and (P3) and (P4) to an optimality gap of
0.1%.
2.11 Comparison of performance profiles of the logarithm of the running time of
“routing” and “sizing instances” solved using models (P2) and (P4) with a
relative optimality gap of 0.1% for the linear solver and 2% for non-linear solvers.

3.1 Production process for Product 1, an intracellular protein synthesized in
Saccharomyces cerevisiae.
3.2 Production process for Product 2, an intracellular protein synthesized in
Escherichia coli.
3.3 Production process of Product 3, an intracellular protein synthesized in
Escherichia coli.
3.4 Production process of Product 4, an extracellular protein synthesized in
Saccharomyces cerevisiae.
List of Tables

2.1 Comparison of the number of constraints and variables of some selected
instances solved using models that account for selection with Big-M constraints,
models (C1) and (C2), and a classic formulation for design decisions only,
(P1). All instances were solved using the DICOPT solver.
2.2 Sizes of sample instances solved using non-linear model (P1).
2.3 Sizes of sample instances solved using linear model (P3) with 33 cutting points
for linear inner approximation.
2.4 Sizes of sample instances solved using non-linear model (P2).
2.5 Sizes of sample instances solved using linear model (P4) with 33 cutting points
for linear inner approximation.

3.1 Production data for the Purification facility (Imperatore & Asenjo, 2001).
3.2 Original equipment sizes and operation times for the production of Product 1,
an intracellular protein synthesized in S. cerevisiae.
3.3 Original equipment sizes and operation times for the production of Product 2,
an intracellular protein synthesized in E. coli.
3.4 Original equipment sizes and operation times for the production of Product 3,
an intracellular protein synthesized in E. coli.
3.5 Original equipment sizes and operation times for the production of Product 4,
an extracellular protein synthesized in S. cerevisiae.
3.6 Cost coefficients and variable bounds needed to size batch units. Costs can be
calculated in U.S.$ with the function $c_j V_j^{\gamma_j}$. Data were updated to year 2012
using the CE index: year 2000, 394.1; year 2012, 584.6.
3.7 Available cost and equipment sizes for semi-continuous units. Costs are in
1000 U.S.$. Data updated with the CE index: year 1998, 389.5; year 2012, 584.6.
3.8 Downstream processing stages that make up a multiproduct biotechnological
batch plant that produces 4 different recombinant proteins synthesized in E.
coli and S. cerevisiae as intra- and extracellular products.
3.9 Data used to estimate size and time factors for Product 1, an intracellular
protein synthesized in S. cerevisiae.
3.10 Data used to estimate size and time factors for Product 2, an intracellular
protein synthesized in E. coli.
3.11 Data used to estimate size and time factors for Product 3, an intracellular
protein synthesized in E. coli.
3.12 Data used to estimate size and time factors for Product 4, an extracellular
protein synthesized in S. cerevisiae.
3.13 Average execution time and relative a posteriori gaps for the purification
facility instance solved 100 times.
3.14 Comparison between the costs of the original and the optimized facilities
considering the maximum capacity of the original purification plant and a
time horizon given by the number of production weeks. Costs are calculated
in U.S.$ based on data for year 2012.
3.15 Optimized structure of the multiproduct batch plant over a time horizon of
5 904 hours. The cost function is equal to U.S.$ 26 000 900.
3.16 Final batch size and cycle time of the 4 products produced in the multiproduct
batch plant optimized over a time horizon of 5 904 hours.
3.17 Comparison between the costs of the original and the optimized facilities
considering different production targets and time horizons. Costs are calculated
in U.S.$ based on data for year 2012.
Nomenclature

Indices
h    host
i    product
j    stage
m    number of duplicated units
UP, LO    upper and lower bounds

Sets
E1    set of batch stages, ⊂ E
E2    set of semi-continuous stages, ⊂ E
E3    set of chromatographic stages, ⊂ E
E    set of all stages
H    set of hosts
I    set of products
M    set of available units operating in parallel in-phase or out-of-phase
R    set of routes: stages j needed to process the product i synthesized by host h
U    set of available hosts h for product i synthesis

Parameters
$\delta$    time horizon
$c_j^n$, $\gamma_j^n$    cost coefficients related to $Y_j^n$, with $n \in \{1, 2, 3\}$
$d_i$    production target for product i
$S_{ij}^n$, $s_{ij}^n$    size factor for product i in stage j related to $Y_j^n$; $s_{ij}^n = \ln S_{ij}^n$, $n \in \{1, 2, 3\}$
$T_{ij}^0$, $t_{ij}^0$    time factor for product i in the batch or chromatographic stage j; $t_{ij}^0 = \ln T_{ij}^0$
$T_{ij}^1$, $t_{ij}^1$    time factor for product i in the semi-continuous or chromatographic stage j; $t_{ij}^1 = \ln T_{ij}^1$

Variables
$v$    slack variable
$X_j^n$, $x_j^n$    number of units operating in parallel in-phase ($X_j^1$) and out-of-phase ($X_j^2$) in stage j; $x_j^n = \ln X_j^n$
$Y_j^1$, $y_j^1$    volumetric capacity for batch units and retentate or feed tank for semi-continuous or chromatographic stages; $y_j^1 = \ln Y_j^1$
$Y_j^2$, $y_j^2$    volumetric capacity for permeate or product tanks for semi-continuous or chromatographic stages; $y_j^2 = \ln Y_j^2$
$Y_j^3$, $y_j^3$    capacity of semi-continuous items; $y_j^3 = \ln Y_j^3$
$Y_i^4$, $y_i^4$    batch size for final product i; $y_i^4 = \ln Y_i^4$
$Y_i^5$, $y_i^5$    cycle time for product i; $y_i^5 = \ln Y_i^5$
$z_{ih}^1$    binary variable: 1 if protein i is synthesized by host h; 0 otherwise
$z_j^2$    binary variable: 1 if stage j is used to process at least one of the products
$z_{jm}^3$    binary variable: 1 if m units are operating in parallel in-phase in stage j; 0 otherwise
$z_{jm}^4$    binary variable: 1 if m units are operating in parallel out-of-phase in stage j; 0 otherwise
Chapter 1

Introduction

1.1 Biotechnological industry and batch processes
According to Moreno & Montagna (2007b):

The batch mode of operation in food and biotechnological industries has received
a renewed interest particularly because of the market which has become more
uncertain, complex and competitive.

Batch plants can be easily reorganized to allow for production modifications within the
same plant (Barbosa-Póvoa, 2007). The most common and most studied structures are the
multi-product and the multipurpose batch plants. In multi-product or flowshop plants all
products follow the same production path (see Figure 1.1). In multipurpose or jobshop
plants, on the other hand, different products can be produced by sharing available equipment,
raw materials, utilities and production time resources. The major difference from
multi-product plants is that different products may be produced in arbitrary sequences and
locations (see Figure 1.2).

[Figure: diagram with products A, B and C and stages 1 to 4]

Figure 1.1 – Multi-product/flowshop plant (taken from Biegler et al. (1997)).

This work focuses on multi-product batch plants, since the main objective is the
application of an optimization based approach to the design of a biotechnological
multi-product batch plant using information from real processes that were in fact part of
a real multi-product batch plant (Imperatore & Asenjo, 2001).


[Figure: diagram with products A, B and C and stages 1 to 4]

Figure 1.2 – Multipurpose/jobshop plant (taken from Biegler et al. (1997)).

1.2 Multi-product batch plant design with an optimization based approach
The multi-product batch plant design may include a variety of problems such as synthesis,
design, production planning, and scheduling (Voudouris & Grossmann, 1993). The main
objective is to minimize the production requirements over a defined time horizon (Barbosa-
Póvoa, 2007). Voudouris & Grossmann (1993) classified such decisions as follows:

1. Synthesis decisions

(a) allocation of tasks to equipment
(b) parallel units operating either in-phase or out-of-phase
(c) location of intermediate storage

2. Design decisions

(a) selection of equipment of standard sizes
(b) sizing of intermediate storage vessels with standard sizes

3. Production planning decisions

(a) optimal length of production cycle during which the optimal schedule is executed
(b) levels of inventory of final products

4. Scheduling decisions

(a) sequencing of products

The optimization based approach for the design of batch plants began with Robinson
& Loonkar (1972), who studied the design decision of selecting equipment sizes, and great
progress has been made since then. In recent years synthesis decisions such as the
duplication of units in series have been incorporated (Moreno et al., 2009a,b). Corsano et
al. (2005) studied synthesis, design and operation decisions simultaneously; Fumero et al.
(2011, 2012b) studied the simultaneous design and scheduling of a multi-product batch plant;
and other authors have included demand uncertainty (Pinto-Varela et al., 2009),
environmental issues (Wang et al., 2010) and transportation concerns (Yi & Reklaitis, 2011).


According to Iribarren et al. (2004), towards the middle of the past decade, despite the
existence of a great amount of work based on expert systems for the synthesis of
bioprocesses, only a few papers using an optimization based approach had been published, and
those that had been published dealt with small portions of the global problem (Montagna et
al., 2004). That is the case of Vásquez-Alvarez et al. (2001), who developed a strategy for
the synthesis of a purification process based on a variety of chromatographic stages; the
MILP model they studied used physico-chemical data of a mixture of proteins. A few years
earlier, on the other hand, Samsatli & Shah (1996a) had studied the design problem for the
entire production of a single product, which included a fermentation stage followed by
primary separation steps and high resolution stages for final purification. A second part
of that work dealt with scheduling decisions for a more accurate determination of the
sequencing and timing of each operation unit in the plant (Samsatli & Shah, 1996b).
In 2000 Montagna et al. began a series of collaborations that studied the design of
a biotechnological multi-product batch plant (Asenjo et al., 2000; Pinto et al., 2001), later
including synthesis decisions (Iribarren et al., 2004). The earlier work studied the
production of 4 recombinant proteins, where 6 separation and purification steps followed the
fermentation process; a few years later this process was used by other authors as an example
to test their own formulations (Dietz et al., 2005; Moreno et al., 2009a). The work of
Iribarren et al. (2004), on the other hand, was described by Moreno-Benito et al. (2014) as
one of the most relevant contributions to the area at that time because its MINLP
formulation combined process synthesis decisions (selection of the microorganism responsible
for the biological process, and selection of separation and purification techniques), plant
allocation decisions (operation mode) and plant design decisions such as equipment sizing.
After these papers, advances in the literature on biotechnological multi-product batch
plants have been fewer than those reported for chemical plants (see Barbosa-Póvoa
(2007) for an extended review); nevertheless, most of the advances made for chemical plants
can also be applied to bioprocesses. Among the papers in biotechnology are the work of
Srinivasan et al. (2003), who included demand uncertainty in the design of a bioreactor that
produces penicillin; Dietz et al. (2005), who included environmental considerations in the
design of a multi-product batch plant; and Moreno & Montagna (2007b), who developed a
model for the design and scheduling of a 5-stage plant for vegetable extraction.

1.3 Elements for the design and synthesis of a biotechnological multi-product batch plant
In this work the design of a biotechnological multi-product batch plant with an optimization
based approach takes into account design and synthesis decisions, which are explained
below.


1.3.1 Synthesis decisions


Synthesis decisions refer to the configuration of the plant and cover 3 topics:

Allocation of tasks to equipment: This refers to the selection of one technique
among two or more that can perform the same downstream processing step.

Parallel units operating either in-phase or out-of-phase: Duplication of units
in-phase eliminates bottlenecks due to equipment capacities, as different units
process the incoming stream simultaneously (Voudouris & Grossmann, 1993). This decision
is used when the maximum available capacity is already in use, allowing the processing of
larger batches. The duplication of units out-of-phase, on the other hand, eliminates
bottlenecks due to cycle times, as different units process the incoming stream with different
starting times (Voudouris & Grossmann, 1993).
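The effect of the two kinds of duplication can be sketched numerically. The minimal example below assumes the usual textbook sizing relations (batch size limited by units × volume / size factor, cycle time given by processing time / number of out-of-phase units); the stage names, volumes, size factors and times are hypothetical and are not data from this work.

```python
def batch_size_limit(volume, size_factor, units_in_phase=1):
    """Largest batch a stage can hold: in-phase units share a single batch."""
    return units_in_phase * volume / size_factor


def stage_cycle_time(processing_time, units_out_of_phase=1):
    """Effective cycle time: out-of-phase units start staggered batches."""
    return processing_time / units_out_of_phase


# Hypothetical two-stage line:
# (unit volume in L, size factor in L/kg, processing time in h)
STAGES = {
    "fermenter": (1000.0, 10.0, 24.0),
    "filter": (200.0, 5.0, 4.0),
}


def plant_limits(in_phase, out_of_phase):
    """Return (maximum batch size, limiting cycle time) for the whole line."""
    batch = min(
        batch_size_limit(vol, sf, in_phase[name])
        for name, (vol, sf, _) in STAGES.items()
    )
    cycle = max(
        stage_cycle_time(time, out_of_phase[name])
        for name, (_, _, time) in STAGES.items()
    )
    return batch, cycle


ONE_EACH = {"fermenter": 1, "filter": 1}
base = plant_limits(ONE_EACH, ONE_EACH)
# Duplicating the capacity-limiting filter in-phase raises the batch size.
dup_in = plant_limits({"fermenter": 1, "filter": 2}, ONE_EACH)
# Duplicating the time-limiting fermenter out-of-phase shortens the cycle.
dup_out = plant_limits(ONE_EACH, {"fermenter": 2, "filter": 1})
print(base, dup_in, dup_out)
```

With these assumed numbers the filter limits the batch size and the fermenter limits the cycle time, so each kind of duplication relieves a different bottleneck and leaves the other unchanged.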

Location of intermediate storage: This decision is closely related to the storage
policy in use, and its objective is to reduce idle times during production (Galiano &
Montagna, 1993). According to Barbosa-Póvoa (2007), 5 policies can be identified:

• Zero-wait: the material is unstable and has to be processed immediately.

• Unlimited intermediate storage: the material is stable and can be arranged in one
or more storage vessels with an unlimited capacity.

• Finite intermediate storage: the material is stable and can be arranged in one or
more storage vessels with a finite capacity.

• Shared intermediate storage: the material is stable and can be arranged in one or
more storage vessels that can be shared with other material but not simultaneously.

• No intermediate storage: the material is stable but no storage vessels are available.
However, it may reside temporarily in the processing equipment where it has been
produced.

1.3.2 Design decisions


Design decisions correspond to the sizing of the different process equipment, which can be
selected from a continuous range of sizes or from a discrete set of available sizes. According
to Salomone et al. (1994), process equipment can be classified into 3 types:

Batch intensive: these are the traditional batch stages in which the process material
remains in the unit for a time that depends on the process kinetics, that in turn
depends on concentration and temperature. Examples of these units are fermenters
and reactors.


Semi-continuous: these are the traditional semi-continuous stages that operate between
two batch stages. These units operate continuously but intermittently to transfer the
material among different batch stages. Examples of these units are heat exchangers.

Batch extensive: these units involve both kinds of item, batch and semi-continuous.
Examples of these units are filters and centrifuges, which need feed and product tanks
together with the semi-continuous item.

In this work batch and semi-continuous items are sized in stages that are batch intensive
and batch extensive.

1.4 Mixed Integer Non-Linear Programming (MINLP) versus Mixed Integer Linear Programming (MILP)
As stated by Grossmann et al. (2000) design decision problems can be written as a Mixed-
Integer Non-Linear Problem (MINLP) of the form:

\[
\begin{aligned}
\min\ & Z = f(x, y) \\
\text{s.t.}\ & h(x, y) = 0 \\
& g(x, y) \le 0 \\
& x \in X,\ y \in \{0, 1\}
\end{aligned}
\tag{1.1}
\]

where f(x, y) is the objective function (e.g. cost); h(x, y) = 0 are the equations that
describe the performance of the system, such as mass and/or energy balances; and g(x, y) ≤ 0
are inequalities that define specific restrictions on the feasible options. At least one of
these functions is non-linear. The x variables model equipment sizes, production times and
volumes, and the y variables are restricted to be 0 or 1 and model the selection of actions.
This problem has a unique global optimum if all of the functions involved are strictly
convex (Grossmann et al., 2000); otherwise, finding a global optimum is not guaranteed. One
way to deal with the non-convexities that arise in the standard model of batch facilities is
the logarithmic change of variables proposed by Kocis & Grossmann (1988), which
linearizes most of the functions and leads to a convex problem. This approach has been used
by Rippin (1993), Montagna et al. (2000) and Moreno et al. (2009b), among others. Another
approach is the use of heuristic procedures (Grossmann et al., 2000), which was the option
selected by Pinto et al. (2001), among others.
Some other authors as Voudouris & Grossmann (1993), Moreno & Montagna
(2007b), Moreno & Montagna (2011) and Fumero et al. (2011) have modeled the design
problem as a Mixed-Integer Linear Problem (MILP or MIP) selecting equipment sizes among
a discrete set of available sizes. This alternative guarantees a global optimality in the solution
of the batch design problem (Voudouris & Grossmann, 1993). Moreno & Montagna (2011)
made a comparison between both MINLP and MILP approaches and conclude that although
the precision of the model is reduced in a MILP approach, a superior performance is achieved.

A key feature of these design problems is the use of Big-M constraints to account for
selection decisions, even though such constraints are known to be problematic (Bosch & Trick,
2005). Authors that have included this type of constraint in their formulations include
Gupta & Karimi (2003), Corsano et al. (2009), Moreno et al. (2009a) and Moreno & Montagna
(2012). These authors have found that the value of the Big-M parameters has a large impact
on the solution time. See, for example, Montagna et al. (2004), who compared Big-M and
convex-hull formulations, or Moreno & Montagna (2007b), who had to test different values
for the Big-M parameters.

1.5 Objectives
1.5.1 Main Objective
The main objective of this work is to study the use of an optimization based approach for the
design of a biotechnological multiproduct batch plant with rigorous information of different
processes from a real plant.

1.5.2 Specific Objectives


The specific objectives of this work are:

• To investigate a formulation for the design of multi-product batch plants that is robust,
scalable and reliable for its application in the design of real cases.

• To introduce a standard methodology, coming from the optimization field, to compare
different formulations for the design of multi-product batch plants.

• To define an appropriate methodology to estimate the parameters of the defined
optimization model based on rigorous information from real processes.

• To investigate the application of the developed methodology in the design of a
biotechnological multi-product batch plant using real data of production processes that
are actually part of a real multi-product batch plant.

1.6 Summary of methodology and principal results
A first approach to study the design of a real biotechnological multi-product batch plant was
the use of the MINLP formulation proposed by Iribarren et al. (2004) using the DICOPT
solver in GAMS language; nevertheless it was found that their formulation was only reliable
for small plants with no more than 15 stages in total. A discussion of this fact can be found
in Sandoval et al. (2016) in Chapter 2.

To avoid the problems found in MINLP formulations, a MILP reformulation is proposed that,
in a first stage, permits the selection of equipment sizes over a range of continuous sizes
(see Chapter 2) and, in a second stage, permits both discrete and continuous sizes (see
Chapter 3).
The selection of techniques to perform a defined step is addressed with the introduction of
a route formulation that makes use of advanced reformulation techniques from the
mixed-integer-programming literature: clique constraints. This formulation avoids the use of
Big-M constraints.
The combination of the MILP reformulation with the clique constraints permits the
definition of a methodology that meets all the desired requirements: it is scalable, robust
and reliable for its application in the design of real multi-product batch plants.
As a final step of this work, the proposed approach is used to study the design of the
real biotechnological multi-product batch plant. Mass balances allow the computation of the
parameters needed by the formulation, and the results obtained illustrate the reliability of
the optimization based approach in the design of real multi-product batch plants. The use of
this type of model may achieve large savings in the cost of the main equipment of the plant.
Finally, it is important to highlight that the proposed approach takes at most a few
minutes to find an optimal solution, leaving plenty of room for the addition of new
and more complex constraints or objective functions. In addition, lower level implementations
in C or C++ could further reduce solution times and allow for more complex models.

Chapter 2

MILP reformulations for the design of biotechnological multi-product batch plants using
continuous equipment sizes and discrete host selection

Published in Computers and Chemical Engineering in January 2016 (Sandoval, G., Espinoza,
D., Figueroa, N. and Asenjo, J.A., Computers and Chemical Engineering 84 (2016) 1-11; see
Appendix A).

2 Design of multi-product batch plants

Abstract
In this article we present a new approach, relying on mixed-integer linear programming
(MILP) formulations, for the design of multi-product batch plants with continuous sizes
for processing units and host selection. The main advantage of the proposed approach is
its scalability, which allows us to solve realistic instances within reasonable precision
requirements. Furthermore, we show that many other alternatives are either numerically
unstable (for the problem sizes that we are interested in), unable to solve large instances,
or much slower than the proposed method. We present extensive computational experiments,
which show that we are able to solve almost all tested instances and that, on average, we
are ten times faster than alternative approaches. As we use a high level implementation
language (AMPL), we should get further time improvements if lower level implementations are
used (C, C++). Reproducibility of our results can be tested using our models and data,
available online at BPLIB [1].
Keywords: multi-product batch plant, MINLP, MILP, production path.

[1] Available at http://www.dii.uchile.cl/~daespino/

2.1 Introduction
The conventional multi-product batch process literature that uses an optimization-based
approach models the design and synthesis of such plants with Mixed-Integer Non-Linear
Programming (MINLP) formulations (Floudas, 1995). The usual objective is to minimize the
investment cost subject to the fulfillment of the production targets of a given set of
products. The major drawbacks are the combinatorial nature of mixed-integer programming and
possible non-convexities due to non-linearities. In computational optimization, the numerical
issues of these formulations caused by rounding errors, numerical instabilities and
approximation errors are well documented (Goldberg, 1991; Koch, 2004; Margot, 2009; Vielma,
2013).
Since Robinson & Loonkar (1972), different procedures have been proposed to tackle
these problems (Reklaitis, 1990; Rippin, 1993; Barbosa-Póvoa, 2007; Verderame et al., 2010;
Nikolopoulou & Ierapetritou, 2012), but which method is most efficient for a particular
example is hard to predict (Ponsich et al., 2007), and the development of effective
solution approaches and algorithms remains necessary (Grossmann & Guillén-Gosálbez, 2010).
The logarithmic change of variables proposed by Kocis & Grossmann (1988) linearizes
most of the functions and leads to a convex MINLP problem, an approach used by Ravemark &
Rippin (1998) and Montagna et al. (2000), among others. Another approach, chosen by Pinto
et al. (2001) and Ponsich et al. (2007), among others, is the use of specially designed
solvers, which can usually find good feasible solutions by means of heuristic procedures
(Grossmann et al., 2000). In practice, the best off-the-shelf solvers for this kind of
problem are the open source codes BONMIN and SCIP and the commercial solvers BARON and
DICOPT, which stand out in Mittelmann's benchmarks for optimization software (Mittelmann,
2013). Nevertheless, none of them guarantees convergence to a global optimum: in some
instances they converge to local optima or do not converge at all. For the particular case
of BARON and DICOPT, performance failures have been reported for non-convex models (Ponsich
et al., 2007; Rebennack et al., 2011; Li et al., 2012); moreover, even in cases where the
algorithms work in theory, we found that in practice they do not converge to the global
optimum. We have run precise experiments that demonstrate these failures in convex MINLP
formulations (see Section 2.2).
There is a large gap between Mixed-Integer Linear Programming (MILP or MIP) and MINLP
solver technology (Nowak, 2005). Nowadays mixed-integer linear techniques are fast, robust
and able to provide solutions to problems with up to millions of variables (Geißler et al.,
2012). Taking advantage of this, Voudouris & Grossmann (1992) used reformulation schemes to
develop MILP models for the preliminary design of multi-product batch plants, introducing
binary variables for the selection of discrete available equipment sizes. Since then, other
decisions have been added to the design decisions: synthesis, production planning and
scheduling (Voudouris & Grossmann, 1993); design and planning in a multiperiod scenario
(Moreno & Montagna, 2007a); design of multi-product batch plants considering duplication of
units in series (Moreno et al., 2009b); and the design and planning of multi-product batch
plants using mixed-product campaigns (Corsano et al., 2009). More recently, these MILP
formulations have been used to account for the design and scheduling of this type of plant
(Fumero et al., 2011, 2012a,b) and for the design under uncertainty
considering different types of decisions (Durand et al., 2012; Moreno & Montagna, 2012;
Durand et al., 2014; Moreno-Benito et al., 2014).
A key feature of these design problems is the use of Big-M constraints to account for
selection decisions, even though such constraints are known to be problematic (Bosch & Trick,
2005). Authors that have included this type of constraint in their formulations include
Gupta & Karimi (2003), Corsano et al. (2009), Moreno et al. (2009a) and Moreno & Montagna
(2012). These authors have found that the value of the Big-M parameters has a large impact
on the solution time; see for example Moreno et al. (2007). In addition, it has been shown
experimentally that other methods, such as the convex hull formulation presented by Montagna
et al. (2004), are better suited to account for selection decisions.
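The sensitivity to the Big-M value can be illustrated with a minimal sketch. It is a generic textbook illustration rather than any of the cited models: a single constraint y ≤ M·z links a continuous quantity y to a binary selection z, and the function name and numbers below are hypothetical. When z is relaxed to the interval [0, 1], the smallest admissible z shrinks as M grows, so the fixed cost seen by the linear relaxation becomes a weaker lower bound.

```python
def relaxed_selection(y_required, big_m):
    """Smallest z in [0, 1] that satisfies y_required <= big_m * z (relaxed binary)."""
    return min(1.0, y_required / big_m)


# Hypothetical data: a unit that must process 50 units and has a fixed cost of 1000.
y_required = 50.0
fixed_cost = 1000.0

for big_m in (50.0, 500.0, 5000.0):
    z = relaxed_selection(y_required, big_m)
    # Fixed cost "paid" in the linear relaxation; the true cost is 1000 in every case.
    print(f"M = {big_m:7.1f}  relaxed z = {z:.3f}  relaxed cost = {fixed_cost * z:.1f}")
```

With the tightest value of M the relaxation already pays the full fixed cost, while larger values let it pay only a fraction, which is one reason why branch-and-bound times depend on how M is chosen.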
In this paper we develop a robust methodology to solve the design problem of
a biotechnological multi-product batch plant in situations where equipment can be
manufactured according to customer needs, such as fermenters or tanks in general. To that
end, we develop a MILP formulation that does not rely on Big-M constraints and does not use
a discrete range of equipment sizes. The formulation uses four basic techniques (see
Figure 2.1). First, an extension of the non-linear (but convex) formulation proposed by
Kocis & Grossmann (1988) is applied. Second, to deal with non-linear convex inequalities,
we construct a priori linear outer (or inner) approximations of them, which allow us to
compute (a posteriori) true feasible solutions and lower (or upper) bounds. Third, to
deal with integer variables, we use advanced reformulation techniques from the
mixed-integer-programming literature (clique constraints). Finally, once the initial problem
is transformed into a standard mixed-integer programming problem, it is possible to take
advantage of mature commercial MIP solvers.
This approach, at least in our experiments, is more stable numerically, more scalable, and
faster to solve than current alternatives, and it can deal with the more general problem of
jointly selecting equipment sizes and alternative production paths for multiple products.
Using our approach, it is possible to quickly and accurately compute solutions at any desired
precision level. In our extensive computational experiments (see Figure 2.11) we found that
current non-linear solvers only solved 43% of the instances generated for this study, while
our approach was able to solve over 95% of the studied instances in a running time that,
on average, was more than ten times faster than MINLP solvers in equivalent and standard
MINLP formulations. To make these comparisons we use performance profiles, a
methodology borrowed from the optimization literature.
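As a rough illustration of how a performance profile is built, the sketch below assumes the usual ratio-to-best definition: for each instance, every solver's time is divided by the best time obtained on that instance, and the profile of a solver at τ is the fraction of instances whose ratio is at most τ. The solver names and running times are hypothetical and are not results from this study.

```python
def performance_profile(times, taus):
    """times: dict solver -> list of running times (None means the solver failed)."""
    solvers = list(times)
    n = len(times[solvers[0]])
    # Best running time per instance among the solvers that solved it.
    best = []
    for p in range(n):
        solved = [times[s][p] for s in solvers if times[s][p] is not None]
        best.append(min(solved) if solved else None)
    profile = {}
    for s in solvers:
        ratios = []
        for p in range(n):
            t = times[s][p]
            # A failure is treated as an infinite ratio.
            ratios.append(t / best[p] if t is not None and best[p] is not None else float("inf"))
        profile[s] = [sum(r <= tau for r in ratios) / n for tau in taus]
    return profile


# Hypothetical running times (seconds) on four instances.
times = {"MILP": [1.0, 2.0, 4.0, 10.0], "MINLP": [10.0, None, 8.0, 5.0]}
taus = [1, 2, 10]
profile = performance_profile(times, taus)
print(profile)
```

The value at τ = 1 is the fraction of instances on which a solver was the fastest, and the value for large τ approaches the fraction of instances it solved at all.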
The rest of this paper is organized as follows. In Section 2.2, typical drawbacks found
with a commonly used MINLP solver and the standard MINLP formulation are presented. In
Section 2.3, classic and novel formulations for the design problem are described. Relevant
information about the methodology used to benchmark different formulations and to avoid
numerical instabilities is given in Section 2.4, and computational results are presented and
discussed in Section 2.5. Finally, the conclusions are presented in Section 2.6.

2.2 Current limitations


Our main objective is to find a robust and scalable methodology for the design of
biotechnological multi-product batch plants that considers equipment sizing (design
decisions) and the selection of the downstream processing stages (synthesis decisions).
Given the complexities that have been added to the original design problem to date, we
decided to go back to the problem studied by Iribarren et al. (2004), where only design and
synthesis decisions are modeled. In their paper they designed a biotechnological batch plant
for the production of four recombinant proteins, i, each of which can be synthesized by two
different hosts, h, with four microorganisms in total. In addition, three of the fifteen
processing stages, j, may be performed by two different unit operations, d. In their
formulation they used constant size ($S_{ijdh}$) and time ($T_{ijdh}$) factors to model each
stage; considered duplication of units in parallel in-phase, $G_{jd}$, and out-of-phase,
$M_{jd}$, in order to reduce either the equipment sizes $V_j$ or the cycle times $TL_i$,
respectively; and used Big-M constraints to account for the selection of hosts and equipment.

[Figure 2.1: flow diagram. MINLP (non-convex) → change of variables $y_{jn} = \ln(Y_{jn})$ →
MINLP (convex) → inner/outer approximations → MILP (equipment sizing) → selection
variables/clique constraints → MILP (equipment sizing and route selection) → MIP solver.]

Figure 2.1 – Basic techniques used to model synthesis and design decisions considering
continuous equipment sizes and discrete host selection.
As a correctness test, we took the example presented in Iribarren et al. (2004) and split it
into 16 different instances that only allow equipment selection. Then we tested two different
yet equivalent formulations based on their model but with host selection removed [2].
In the first (C1), the selection of hosts was eliminated by limiting the set of available
hosts, $H_i$, to just one per protein; in the second (C2), by setting the values of the
selection binary variables to 1 for the selected hosts and 0 for the non-selected ones. If
the solvers find the solution, we should observe two things:

(a) both model formulations (C1) and (C2) give the same solution, and

(b) the minimum of the separated instances is equivalent to the global minimum of the
problem with host selection.

Contrary to what we expected, the differences in the objective function value between the
(C1) and (C2) formulations ranged from 1% to 78% in the 16 instances studied (data not
shown) and, even more striking, the solver finds a local minimum that is worse than those
found for most instances without host selection.

Table 2.1 – Comparison of the number of constraints and variables of some selected instances
solved using models that account for selection with Big-M constraints (models (C1) and (C2))
and a classic formulation for design decisions only, (P1). All instances were solved using
the DICOPT solver.

          Variables          Constraints            Status of
Model     Disc.    Cont.     Linear    Non-linear   solution
(C1)      480      185       303       10           Incorrect
(C2)      512      323       682       18           Incorrect
(P1)      360      71        179       9            Correct

[2] To further isolate the results obtained from problems with non-standard local settings,
the GAMS modelling language was used and the experiments were run on the NEOS server (Gropp
& Moré, 1997; Czyzyk et al., 1998; Dolan, 2001), available at http://www.neos-server.org
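The logic of this correctness test can be sketched as follows. Four proteins with two candidate hosts each give 2^4 = 16 fixed-host instances; checks (a) and (b) then compare the optimal costs reported for each instance. The protein and host labels, the tolerance and the helper name are hypothetical, and the cost lists would come from the solver runs.

```python
from itertools import product

# Hypothetical labels: four proteins, each with two candidate hosts.
proteins = ["protein_1", "protein_2", "protein_3", "protein_4"]
hosts = {p: ["host_a", "host_b"] for p in proteins}

# One instance per combination of fixed hosts.
instances = [dict(zip(proteins, combo)) for combo in product(*(hosts[p] for p in proteins))]
print(len(instances))


def check_consistency(costs_c1, costs_c2, joint_optimum, rel_tol=1e-3):
    """(a) C1 and C2 agree on every instance; (b) the best instance matches the joint optimum."""
    same = all(abs(a - b) <= rel_tol * max(abs(a), abs(b)) for a, b in zip(costs_c1, costs_c2))
    matches = abs(min(costs_c1) - joint_optimum) <= rel_tol * abs(joint_optimum)
    return same, matches
```

A correct solver run would return `(True, True)`; the discrepancies of 1% to 78% reported above correspond to a failure of check (a).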
These numerical instabilities seem to be aggravated by problem size, since DICOPT is known
to work well for small instances, in accordance with the results obtained by Ponsich et al.
(2007). To show these differences in size, Table 2.1 compares the number of constraints and
variables involved in the smaller instances of cases (C1) and (C2), which were incorrectly
solved according to the aforementioned results, with the size of a smaller instance that was
correctly solved by a classic formulation (P1). This formulation solves an equipment sizing
problem similar to that presented by Iribarren et al. (2004), but with no selection of hosts
or equipment, and was also solved using the DICOPT solver. It is presented in Section 2.3.1.

2.3 Problem Formulation


Two major contributions are presented in this section. First, clique constraints are
introduced to formulate the discrete part of the model, allowing the selection of the
production path without the use of Big-M constraints, in models (P2) and (P4). Second, a new
approach is presented in Section 2.3.2 to handle non-linearities using standard reformulation
techniques from the optimization field, which permits the use of linear solvers and leads to
more reliable results and faster computing times. The relation among the four different
models studied is shown in Figure 2.2.

          Equipment sizing    Route selection
MINLP     (P1)                (P2)
MILP      (P3)                (P4)

Figure 2.2 – Formulations compared in this article. Model (P1) is the most basic formulation
that only includes design decisions. Model (P2) includes the selection of the downstream
processes without the use of Big-M constraints. Models (P3) and (P4) are the transformed
models of (P1) and (P2), respectively, using our proposed inner and outer approximations.

2.3.1 MINLP formulation


The equipment-sizing problem (P1)
In this section we present the most basic formulation for the design of biotechnological
multi-product batch plants, in which only equipment sizing and duplication of units in
parallel are considered.
The plant consists of a sequence of batch, semi-continuous and chromatographic stages used
to manufacture different products i, where semi-continuous as well as chromatographic stages
are composed of the semi-continuous items plus feed and product tanks. At each stage j there
are $X_j^2$ groups of units operating in parallel out-of-phase, and each group is made up of
$X_j^1$ units operating in-phase. For semi-continuous or chromatographic stages, feed and
product tanks can only be duplicated out-of-phase. Single production campaigns are considered
and batches are transferred from one stage to the next without delay (zero wait policy).
The objective is to minimize the investment cost of the main equipment of the plant (see
equation (2.1)) given fixed production targets, $d_i$, over a time horizon $\delta$.

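\[
\begin{aligned}
\min\ \text{cost} ={} & \sum_{j \in E^1} X_j^1 X_j^2 c_j^1 \left(Y_j^1\right)^{\gamma_j^1} \\
& + \sum_{j \in E^2 \cup E^3} \left[ X_j^2 c_j^1 \left(Y_j^1\right)^{\gamma_j^1}
  + X_j^2 c_j^2 \left(Y_j^2\right)^{\gamma_j^2}
  + X_j^1 X_j^2 c_j^3 \left(Y_j^3\right)^{\gamma_j^3} \right] + v\rho\delta
\end{aligned}
\tag{2.1}
\]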

Variables $Y_j^{\cdot}$ represent the different equipment sizes. Parameters $c_j^{\cdot}$ and
$\gamma_j^{\cdot}$ are cost coefficients distinctive for each kind of equipment and $v$ is a
slack variable included to assure feasibility (Montagna et al., 2004).
Making the change of variables introduced by Kocis & Grossmann (1988) we get the new
objective function (2.2).

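\[
\begin{aligned}
\min\ \text{cost} ={} & \sum_{j \in E^1} c_j^1 \exp\left(x_j^1 + x_j^2 + y_j^1 \gamma_j^1\right) \\
& + \sum_{j \in E^2 \cup E^3} \left[ c_j^1 \exp\left(x_j^2 + y_j^1 \gamma_j^1\right)
  + c_j^2 \exp\left(x_j^2 + y_j^2 \gamma_j^2\right)
  + c_j^3 \exp\left(x_j^1 + x_j^2 + y_j^3 \gamma_j^3\right) \right] + v\rho\delta
\end{aligned}
\tag{2.2}
\]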
At each stage and for each product, the size of the units must allow the processing of
the incoming batch, which can be split among $X_j^1$ units so as not to surpass the upper
bound on the capacity of the equipment. In batch stages this constraint can be written as
equation (2.3a), which is convexified in equation (2.3b).

\[ Y_j^1 \ge \frac{S_{ij}^1 Y_i^4}{X_j^1} \qquad \forall i \in I,\ j \in E^1 \tag{2.3a} \]

\[ y_j^1 + x_j^1 \ge s_{ij}^1 + y_i^4 \qquad \forall i \in I,\ j \in E^1 \tag{2.3b} \]

As in semi-continuous or chromatographic stages duplication is allowed only for the
semi-continuous items, feed and product tanks are sized using constraints (2.4) and (2.5).

\[ y_j^1 \ge s_{ij}^1 + y_i^4 \qquad \forall i \in I,\ j \in E^2 \cup E^3 \tag{2.4} \]

\[ y_j^2 \ge s_{ij}^2 + y_i^4 \qquad \forall i \in I,\ j \in E^2 \cup E^3 \tag{2.5} \]
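The effect of the logarithmic change of variables can be checked numerically. The sketch below assumes that each lower-case symbol is the natural logarithm of the corresponding upper-case one (as in the change of variables of Kocis & Grossmann) and uses hypothetical values; it only verifies that (2.3a) and (2.3b) accept and reject the same designs.

```python
import math


def size_ok_original(Y1, X1, S, Y4):
    """Constraint (2.3a) in the original variables: Y1 >= S * Y4 / X1."""
    return Y1 >= S * Y4 / X1


def size_ok_log(Y1, X1, S, Y4):
    """Constraint (2.3b): y1 + x1 >= s + y4, with every symbol replaced by its logarithm."""
    y1, x1, s, y4 = (math.log(v) for v in (Y1, X1, S, Y4))
    return y1 + x1 >= s + y4


# Hypothetical data: (unit size, units in-phase, size factor, batch size).
cases = [(100.0, 2, 0.5, 300.0), (50.0, 2, 0.5, 300.0)]
results = [(size_ok_original(*c), size_ok_log(*c)) for c in cases]
print(results)
```

The product and quotient in (2.3a) become sums and differences in (2.3b), which is what makes the size constraints linear in the transformed variables.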


Chromatographic columns have to process the incoming batch, and both duplication in-phase
and out-of-phase are allowed. Duplication in-phase is modeled in size constraint (2.6),
since it permits smaller units, and duplication out-of-phase is reflected in the time
constraints.

\[ y_j^3 + x_j^1 \ge s_{ij}^3 + y_i^4 \qquad \forall i \in I,\ j \in E^3 \tag{2.6} \]


The cycle time for each product i, $Y_i^5$, is defined as the time elapsed between the
production of two consecutive batches and is given by the largest operating time, $T_{ij}$,
among the stages in the process. This time can be decreased if duplication of units
out-of-phase is used:

\[ Y_i^5 \ge \frac{T_{ij}}{X_j^2} \qquad \forall i \in I,\ j \in E \tag{2.7} \]

As batch stages operate for a fixed time, $T_{ij}^0$, the cycle time constraint in its convex
form is given by equation (2.8):

\[ y_i^5 + x_j^2 \ge t_{ij}^0 \qquad \forall i \in I,\ j \in E^1 \tag{2.8} \]


Semi-continuous stages, on the other hand, operate during a time that depends on the
final batch size, $Y_i^4$. For those stages the cycle time is constrained as in equation (2.9).

\[ Y_i^5 \ge \frac{T_{ij}^1 \dfrac{Y_i^4}{X_j^1 Y_j^3}}{X_j^2} \qquad \forall i \in I,\ j \in E^2 \tag{2.9a} \]

\[ y_i^5 + x_j^2 \ge t_{ij}^1 + y_i^4 - x_j^1 - y_j^3 \qquad \forall i \in I,\ j \in E^2 \tag{2.9b} \]

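Lastly, chromatographic stages are modeled considering both fixed and variable operation
times, leading to the highly non-linear constraint (2.10).

\[ Y_i^5 \ge \frac{T_{ij}^0 + T_{ij}^1 \dfrac{Y_i^4}{X_j^1 Y_j^3}}{X_j^2} \qquad \forall i \in I,\ j \in E^3 \tag{2.10a} \]

\[ y_i^5 + x_j^2 \ge \ln\left( \exp\left(t_{ij}^0\right) + \exp\left(t_{ij}^1 + y_i^4 - x_j^1 - y_j^3\right) \right) \qquad \forall i \in I,\ j \in E^3 \tag{2.10b} \]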
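The right-hand side of (2.10b) is a log-sum-exp of two terms. As a small check under hypothetical data, the sketch below evaluates it in a numerically stable way and compares it with the logarithm of the numerator of (2.10a); lower-case symbols are assumed to be the logarithms of the upper-case ones, and the function name is illustrative.

```python
import math


def chrom_time_log(t0, t1, y4, x1, y3):
    """ln(exp(t0) + exp(t1 + y4 - x1 - y3)), shifting by the largest exponent for stability."""
    a, b = t0, t1 + y4 - x1 - y3
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


# Hypothetical data: fixed time, time factor, batch size, units in-phase, column size.
T0, T1, Y4, X1, Y3 = 2.0, 0.5, 100.0, 2, 10.0

direct = math.log(T0 + T1 * Y4 / (X1 * Y3))
via_log = chrom_time_log(math.log(T0), math.log(T1), math.log(Y4), math.log(X1), math.log(Y3))
print(direct, via_log)
```

Both expressions give the logarithm of the same operating time, and since log-sum-exp is a convex function of its arguments, constraint (2.10b) is convex although it is not linear.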

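Production targets for all products, $d_i$, must be satisfied within the time horizon $\delta$.

\[ \sum_{i \in I} \frac{d_i Y_i^5}{Y_i^4} \le \delta + v\delta \tag{2.11a} \]

\[ \sum_{i \in I} \frac{d_i}{\delta} \exp\left(y_i^5 - y_i^4\right) \le 1 + v \tag{2.11b} \]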
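A toy evaluation may help to read constraints (2.7) and (2.11a) together. The sketch below uses two products and two batch stages with hypothetical data, takes the smallest cycle time allowed by (2.7), and computes the time needed to meet the production targets with the slack variable v set to zero.

```python
def cycle_time(T_row, X2):
    """Smallest cycle time allowed by (2.7): the largest T_ij / X2_j over the stages."""
    return max(t / x for t, x in zip(T_row, X2))


def horizon_usage(d, Y5, Y4):
    """Left-hand side of (2.11a): sum over products of d_i * Y5_i / Y4_i."""
    return sum(di * y5 / y4 for di, y5, y4 in zip(d, Y5, Y4))


# Hypothetical data.
T = [[4.0, 6.0], [3.0, 8.0]]   # operating times (h) per product (row) and stage (column)
X2 = [1, 2]                    # groups out-of-phase at each stage
Y4 = [100.0, 200.0]            # batch sizes (kg)
d = [10000.0, 30000.0]         # production targets (kg)
delta = 6000.0                 # time horizon (h)

Y5 = [cycle_time(row, X2) for row in T]
usage = horizon_usage(d, Y5, Y4)
print(Y5, usage, usage <= delta)
```

Here duplicating the second stage out-of-phase halves its contribution to the cycle time, so the number of batches d_i / Y4_i times the cycle time fits within the horizon.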

Finally, the variables for duplication in-phase, $X_j^1$, are restricted to integer values
using constraints (2.12) and (2.13), where $z_{jm}^3$ are binary variables and $M$ is the set
of allowed numbers of units operating in parallel in-phase. The same is valid for the
variables for duplication out-of-phase, $X_j^2$.

\[ x_j^1 = \sum_{m \in M} z_{jm}^3 \ln(m) \qquad \forall j \in E \tag{2.12} \]

\[ \sum_{m \in M} z_{jm}^3 = 1 \qquad \forall j \in E \tag{2.13} \]
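Constraints (2.12) and (2.13) encode an integer number of units through binary variables while keeping the logarithmic variable linear in them. A minimal sketch, with a hypothetical set M and a hypothetical helper name:

```python
import math

M = [1, 2, 3, 4]  # hypothetical set of allowed numbers of units in parallel


def encode(selected, M):
    """Binary vector z with exactly one entry set to 1, and the resulting x = sum z_m ln(m)."""
    if selected not in M:
        raise ValueError("the selected number of units must belong to M")
    z = {m: int(m == selected) for m in M}
    x = sum(z[m] * math.log(m) for m in M)
    return z, x


z, x = encode(3, M)
print(z, x, math.exp(x))
```

Because exactly one binary is active, exp(x) recovers the selected integer, so the transformed variable x only takes the values ln(m) for m in M.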

Appropriate upper and lower bounds are also considered for all of the variables.

The design problem with selection of routes and equipment sizing (P2)
More recent models (such as Iribarren et al. (2004)) take into account the joint selection
of the production processes, including the selection of hosts and equipment. Their
formulation uses classical Big-M constraints. Since these constraints are known to be
problematic (Bosch & Trick, 2005), in this work we propose a different way to formulate the
integer part of the problem, replacing the Big-M constraints by clique constraints. With this
formulation all constraints are ignored except the upper bound on the variables (Dietrich et
al., 1993). Model (P2) includes the sizing of the equipment and the duplication of units in
parallel working in-phase and out-of-phase, and accounts for the selection of the global
process by selecting routes, which are defined as the series of unit operations used to
purify a protein given a certain host that synthesizes it. In this way, once the
product-host pair, $(i, h)$, is selected, the set of stages making up the process is fixed.

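The objective function becomes:

\[
\begin{aligned}
\min\ \text{cost} ={} & \sum_{j \in E^1} z_j^2 c_j^1 \exp\left(x_j^1 + x_j^2 + y_j^1 \gamma_j^1\right) \\
& + \sum_{j \in E^2 \cup E^3} z_j^2 \left[ c_j^1 \exp\left(x_j^2 + y_j^1 \gamma_j^1\right)
  + c_j^2 \exp\left(x_j^2 + y_j^2 \gamma_j^2\right)
  + c_j^3 \exp\left(x_j^1 + x_j^2 + y_j^3 \gamma_j^3\right) \right] + v\rho\delta
\end{aligned}
\tag{2.14}
\]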

Since some stages can be unused and just one route per protein can be selected, we
introduced two binary variables: $z_{ih}^1$ and $z_j^2$. $z_{ih}^1$ is equal to 1 when host h
is selected for the synthesis of product i and 0 otherwise, and $z_j^2$ is 1 when stage j is
used to process at least one of the products and 0 otherwise. Constraint (2.15) enforces the
choice of just one host h to produce protein i, and constraint (2.16) permits stage j to be
used only if at least one product needs it to be processed.

\[ \sum_{(i,h) \in U} z_{ih}^1 = 1 \tag{2.15} \]

\[ z_j^2 \ge z_{ih}^1 \qquad \forall (i, h, j) \in R \mid (i, h) \in U \tag{2.16} \]
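The interplay between the two sets of binaries can be sketched on a toy example. The routes, labels and helper names below are hypothetical, and the sketch assumes, following the text, that constraint (2.15) is applied per product (exactly one host for each protein); it then computes the smallest stage-usage vector compatible with (2.16).

```python
# Hypothetical routes: (product, host) -> stages required by that route.
R = {
    ("A", "h1"): {1, 2, 3},
    ("A", "h2"): {1, 3},
    ("B", "h1"): {1, 2},
    ("B", "h3"): {1, 4},
}


def one_host_per_product(z1):
    """Check that exactly one host is selected for every product."""
    totals = {}
    for (i, _h), value in z1.items():
        totals[i] = totals.get(i, 0) + value
    return all(total == 1 for total in totals.values())


def stage_usage(z1, routes):
    """Smallest z2 with z2[j] >= z1[i, h] for every stage j on route (i, h), as in (2.16)."""
    stages = sorted(set().union(*routes.values()))
    return {j: max(z1[r] for r, js in routes.items() if j in js) for j in stages}


# Hypothetical selection: product A on host h2, product B on host h3.
z1 = {("A", "h1"): 0, ("A", "h2"): 1, ("B", "h1"): 0, ("B", "h3"): 1}
z2 = stage_usage(z1, R)
print(one_host_per_product(z1), z2)
```

In this toy case stage 2 is not used by any selected route, so its binary can stay at zero and, in the full model, its equipment is not charged in the objective.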
For chromatographic stages, the constraints take the form of equations (2.17)-(2.20), which
are trivially satisfied if host h is not selected to produce protein i ($z_{ih}^1 = 0$). When
host h is selected for protein i ($z_{ih}^1 = 1$) and stage j has to be performed to process
product i, then $z_j^2 = 1$ and the constraints are the same as in the previous formulation
(Section 2.3.1).

\[ y_j^1 z_j^2 \ge s_{ihj}^1 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \qquad \forall (i, h, j) \in R,\ j \in E^3 \tag{2.17} \]

\[ y_j^2 z_j^2 \ge s_{ihj}^2 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \qquad \forall (i, h, j) \in R,\ j \in E^3 \tag{2.18} \]

\[ y_j^3 z_j^2 + x_j^1 z_j^2 \ge s_{ihj}^3 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \qquad \forall (i, h, j) \in R,\ j \in E^3 \tag{2.19} \]

\[ y_{ih}^5 z_{ih}^1 + x_j^2 z_j^2 \ge \ln\left( \exp\left(t_{ihj}^0\right) + \exp\left(t_{ihj}^1 + y_{ih}^4 - x_j^1 - y_j^3\right) \right) z_{ih}^1 \qquad \forall (i, h, j) \in R,\ j \in E^3 \tag{2.20} \]

If stage j is not necessary for the process ($z_j^2 = 0$), then the equipment sizes are set
to 0 with constraints such as (2.21) and no unit is considered to make up that stage
(constraint (2.22)):

\[ y_j^{1,LO} z_j^2 \le y_j^1 \le y_j^{1,UP} z_j^2 \qquad \forall j \in E \tag{2.21} \]

\[ \sum_{m \in M} z_{jm}^3 = z_j^2 \qquad \forall j \in E \tag{2.22} \]

Finally, in the planning horizon constraint (2.23) only the terms associated with the
selected host per protein are considered.

\[ \sum_{(i,h) \in U} \frac{d_{ih}}{\delta} z_{ih}^1 \exp\left(y_{ih}^5 - y_{ih}^4\right) \le 1 + v \tag{2.23} \]

2.3.2 Mixed-Integer linear formulations


To obtain more accurate solutions in a reasonable running time, especially for larger
instances, we present a MILP reformulation, which can be solved using any commercial MILP
solver. These models are basically equal to their MINLP counterparts, but the non-linear
objective and time constraints are replaced with sets of linear functions that give
arbitrarily good lower or upper approximations of the respective original functions. The
actual optimal solution lies between both approximations, and the precision level is given
by the number of cutting points selected to generate the set of linear functions that
replaces each non-linear function and by the actual choice of approximation points (for
example, equispaced or non-equispaced). In this way, the accuracy of the solution can be as
high as desired at the cost of longer computing time.

Inner and outer approximations


Given a convex function of one variable, $g$, the constraint $g(x) \le 0$ and a set of points
$\{\hat{x}_k\}_{k=1,\dots,n}$ in the domain of $g$, it is easy to see that:

\[ \left\{ x \mid g(\hat{x}_k) + \nabla g(\hat{x}_k)(x - \hat{x}_k) \le 0,\ k = 1, \dots, n \right\} \supseteq \{ x \mid g(x) \le 0 \} \tag{2.24} \]

and

\[ \left\{ x \;\middle|\; g(\hat{x}_k) + \frac{g(\hat{x}_{k+1}) - g(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \le 0,\ k = 1, \dots, n-1 \right\} \subseteq \{ x \mid g(x) \le 0 \}, \tag{2.25} \]

which allows for straightforward lower and upper approximations of g. Using this fact, it is
easy to find inner and outer approximations of the problems (P1) and (P2).
In fact, for each non-linear constraint of the form $g_j(x) \le 0$, and considering an
arbitrary set of cutting points $\{\hat{x}_k\}_{k=1,\dots,n}$ in the domain, we can consider
the set of constraints

\[ g_j(\hat{x}_k) + \nabla g_j(\hat{x}_k)(x - \hat{x}_k) \le 0 \qquad k = 1, \dots, n \tag{2.26} \]

which leads to a larger feasible set, as can be seen in Figure 2.3a. On the other hand, we
can consider the set of constraints

\[ g_j(\hat{x}_k) + \frac{g_j(\hat{x}_{k+1}) - g_j(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \le 0 \qquad k = 1, \dots, n-1 \tag{2.27} \]

which leads to a smaller feasible set, as can be seen in Figure 2.3b.
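The nesting of the three feasible sets can be verified numerically on a small example. The sketch below uses the hypothetical convex constraint g(x) = exp(x) - 5 ≤ 0 on the interval [0, 3] with five equispaced cutting points, and checks on a grid that every point accepted by the chord (inner) constraints satisfies g(x) ≤ 0, and that every such point is accepted by the tangent (outer) constraints.

```python
import math


def g(x, c=5.0):
    """Hypothetical convex constraint function: g(x) = exp(x) - c."""
    return math.exp(x) - c


def dg(x):
    """Derivative of g."""
    return math.exp(x)


def outer_ok(x, pts):
    """Tangent cuts at every cutting point, as in (2.26)."""
    return all(g(p) + dg(p) * (x - p) <= 0 for p in pts)


def inner_ok(x, pts):
    """Chord cuts between consecutive cutting points, as in (2.27)."""
    return all(g(p) + (g(q) - g(p)) / (q - p) * (x - p) <= 0 for p, q in zip(pts, pts[1:]))


pts = [0.0, 0.75, 1.5, 2.25, 3.0]          # cutting points, from LB = 0 to UB = 3
grid = [i * 0.01 for i in range(301)]      # sample of the domain [0, 3]

inner = [x for x in grid if inner_ok(x, pts)]
exact = [x for x in grid if g(x) <= 0]
outer = [x for x in grid if outer_ok(x, pts)]
print(max(inner), max(exact), max(outer))
```

The largest accepted point grows from the inner set to the exact set to the outer set, which is the one-dimensional picture behind the upper and lower bounds used later.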
In the same way, the minimization of the cost objective function, $f(x)$, can be replaced
by

\[
\begin{aligned}
\min\ & v \\
\text{s.t.}\ & v \ge f(\hat{x}_k) + \nabla f(\hat{x}_k)(x - \hat{x}_k) \qquad k = 1, \dots, n
\end{aligned}
\tag{2.28}
\]

which leads, together with an outer approximation of the constraints, to a lower bound on
the true cost. The objective function can also be replaced by

\[
\begin{aligned}
\min\ & v \\
\text{s.t.}\ & v \ge f(\hat{x}_k) + \frac{f(\hat{x}_{k+1}) - f(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \qquad k = 1, \dots, n-1
\end{aligned}
\tag{2.29}
\]

which leads, together with an inner approximation of the constraints, to an upper bound on
the true cost.
In what follows, $\alpha_k$ denotes $\nabla f(\hat{x}_k)$ in the lower approximation or
$\dfrac{f(\hat{x}_{k+1}) - f(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}$ in the upper
approximation, $b_k = \hat{x}_k$ and $\beta_k = f(\hat{x}_k)$.
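To see how the number of cutting points controls the precision, the following sketch builds both piecewise-linear approximations of the exponential function on a hypothetical interval [0, 3] with equispaced cutting points, and reports the largest distance between the upper and lower approximations. The function names are illustrative; in the notation above, each tangent or chord is defined by its slope α_k, its cutting point b_k and the value β_k.

```python
import math


def cut_points(lb, ub, n):
    """n equispaced cutting points b_1 = lb, ..., b_n = ub."""
    return [lb + (ub - lb) * k / (n - 1) for k in range(n)]


def lower_approx(f, df, pts, x):
    """Maximum over tangents beta_k + alpha_k * (x - b_k), with alpha_k = f'(b_k)."""
    return max(f(b) + df(b) * (x - b) for b in pts)


def upper_approx(f, pts, x):
    """Maximum over chords beta_k + alpha_k * (x - b_k), with alpha_k the secant slope."""
    return max(f(b) + (f(b2) - f(b)) / (b2 - b) * (x - b) for b, b2 in zip(pts, pts[1:]))


def max_gap(n, lb=0.0, ub=3.0, samples=300):
    """Largest difference between the upper and lower approximations on a sample grid."""
    pts = cut_points(lb, ub, n)
    xs = [lb + (ub - lb) * i / samples for i in range(samples + 1)]
    return max(upper_approx(math.exp, pts, x) - lower_approx(math.exp, math.exp, pts, x) for x in xs)


gaps = {n: max_gap(n) for n in (3, 5, 9, 17)}
for n, gap in gaps.items():
    print(n, round(gap, 4))
```

The gap shrinks as cutting points are added, at the price of one extra linear constraint per point, which is the trade-off between accuracy and computing time mentioned above.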

Reformulation for the equipment-sizing problem (P3)


The assumptions for this model are the same as those for the MINLP proposed in
Section 2.3.1.

Objective function The cost functions of equation (2.2) are individually linearized using
the approximations given in Section 2.3.2, which leads to equation (2.30):

\[ \min\ \text{cost} = \sum_{j \in E^1} v_j^1 + \sum_{j \in E^2 \cup E^3} \left( v_j^1 + v_j^2 + v_j^3 \right) \tag{2.30} \]

Constraints The constraints for batch and semi-continuous stages and for the binary
variables for duplication of units in this MILP model are the same as those in the MINLP
model shown in Section 2.3.1.

Chromatographic stages The size constraints for feed and product tanks and the column size
constraint are the same as those in the MINLP model shown in Section 2.3.1.
Time constraint (2.31) is obtained from the linearization of equation (2.10):

\[ y_i^5 + x_j^2 \ge \alpha_k^{6ij} \left( y_i^4 - x_j^1 - y_j^3 - b_k^{6ij} \right) + \beta_k^{6ij} \qquad \forall i \in I,\ j \in E^3,\ k \in K_6 \tag{2.31} \]

Planning horizon From the linearization of equation (2.11), constraint (2.32) is obtained:

\[ \sum_{i \in I} v_i^7 \le 1 \tag{2.32} \]

[Figure 2.3: two panels, (a) outer approximation and (b) inner approximation, each plotting
f(x) against x with cutting points b1 = LB, b2, b3, b4 and b5 = UB.]

Figure 2.3 – Feasible region (patterned area) of (a) outer and (b) inner approximations (dashed
lines) of an exponential function (solid line). Points bi are the cutting points and LB and UB
are the lower and upper bounds of x.

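Auxiliary variables The cost functions in the objective function are linearized as shown
in equation (2.33) and the planning horizon constraint is linearized as shown in equation
(2.34):

\[ v_j^1 \ge \alpha_k^{1j} \left( x_j^1 + x_j^2 + \gamma_j^1 y_j^1 - b_k^{1j} \right) + \beta_k^{1j} \qquad \forall j \in E^1,\ k \in K_1 \tag{2.33} \]

\[ v_i^7 \ge \frac{d_i}{\delta} \alpha_k^{7i} \left( y_i^5 - y_i^4 - b_k^{7i} \right) + \frac{d_i}{\delta} \beta_k^{7i} \qquad \forall i \in I,\ k \in K_7 \tag{2.34} \]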

Reformulation for the design problem considering route selection and equipment sizing (P4)
As with the model presented in Section 2.3.2, this model was built from its MINLP
counterpart, and most of the constraints remain the same.
The objective function is the same as that of model (P3) and, for the design of the stages,
the only differences are found in the time constraints of chromatographic stages. In this
way, applying the inner or outer approximations to equation (2.20), constraint (2.35) is
obtained:

\[ y_{ih}^5 + x_j^2 \ge \alpha_k^{6ihj} \left( y_{ih}^4 - x_j^1 - y_j^3 \right) + \left( \beta_k^{6ihj} - \alpha_k^{6ihj} b_k^{6ihj} \right) z_{ih}^1 \qquad \forall (i, h, j) \in R,\ j \in E^3,\ k \in K_6 \tag{2.35} \]

This set of equations, together with constraints of the type of (2.36) for variables $y_j^1$,
$y_j^2$, $y_{ih}^4$, $y_{ih}^5$, $x_j^1$ and $x_j^2$, models the same situation as in (P2).

\[ y_j^{3,LO} z_j^2 \le y_j^3 \le y_j^{3,UP} z_j^2 \qquad \forall j \in E^3 \tag{2.36} \]


If $z_j^2 = 0$, constraint (2.35) is trivially satisfied. On the other hand, if
$z_{ih}^1 = 0$, constraint (2.35) becomes:

\[ x_j^2 \ge \alpha_k^{6ihj} \left( -x_j^1 - y_j^3 \right) \qquad \forall (i, h, j) \in R,\ j \in E^3,\ k \in K_6 \tag{2.37} \]

Since $\alpha_k^{6ihj}$ is a positive parameter, constraint (2.37) will always be satisfied
only if $|x_j^1| > |y_j^3|$ or if both variables are greater than or equal to 0. To ensure
this, data preprocessing is necessary. As $x_j^1$ is always greater than 0, we normalized
variables $y_j^3$ by their lower bound. More details are given in Section 2.4.3.
For the planning horizon constraint, there is little difference between (2.32) and (2.38);
the latter takes host selection into account:

\[ \sum_{(i,h) \in U} v_{ih}^7 \le 1 \tag{2.38} \]

Finally, constraints for auxiliary variables $v_j^1$, $v_j^2$, $v_j^3$ and $v_{ih}^7$ are different from
those for problem (P3) to account for route selection:

\[ v_j^1 \ge \alpha_k^{1j} \left( x_j^1 + x_j^2 + \gamma_j^1 y_j^1 \right) + \left( \beta_k^{1j} - \alpha_k^{1j} b_k^{1j} \right) z_j^2 \qquad \forall j \in E^1,\ k \in K^1 \tag{2.39} \]

\[ v_j^1 \le z_j^2\, v_j^{1,UP} \qquad \forall j \in E^1 \tag{2.40} \]

\[ v_{ih}^7 \ge \frac{d_i}{\delta}\, \alpha_k^{7ih} \left( y_{ih}^5 - y_{ih}^4 \right) + \frac{d_i}{\delta} \left( \beta_k^{7ih} - \alpha_k^{7ih} b_k^{7ih} \right) z_{ih}^1 \qquad \forall (i,h) \in U,\ k \in K^7 \tag{2.41} \]

\[ v_{ih}^7 \le z_{ih}^1\, v_{ih}^{7,UP} \qquad \forall (i,h) \in U \tag{2.42} \]

If $z_j^2 = 0$, constraints (2.39) to (2.42) are trivially satisfied, and if $z_{ih}^1 = 0$, then
constraints (2.41) and (2.42) are trivially satisfied. Finally, if $z_{ih}^1 = 1$ and $z_j^2 = 1$, then
constraints (2.39) to (2.42) are the same as those in the MILP formulation without route
selection.
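
The last statement can be checked directly for the cost cuts. The short derivation below uses only the symbols of (2.33) and (2.39); the grouping of terms is one reading of those constraints and is offered as a sketch rather than as part of the original formulation.

```latex
% Setting z_j^2 = 1 in (2.39) and regrouping recovers the cut (2.33)
% of the model without route selection:
v_j^1 \ge \alpha_k^{1j}\left(x_j^1 + x_j^2 + \gamma_j^1 y_j^1\right)
          + \beta_k^{1j} - \alpha_k^{1j} b_k^{1j}
      =   \alpha_k^{1j}\left(x_j^1 + x_j^2 + \gamma_j^1 y_j^1 - b_k^{1j}\right)
          + \beta_k^{1j}

% Setting z_j^2 = 0 removes the intercept term from (2.39):
v_j^1 \ge \alpha_k^{1j}\left(x_j^1 + x_j^2 + \gamma_j^1 y_j^1\right)
```

The same regrouping applied to (2.41) with $z_{ih}^1 = 1$ gives the planning horizon cut (2.34), with the index pair (i, h) in place of i.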

2.4 Methods
2.4.1 Solvers and modelling language
For MINLP problems the open-source solvers BONMIN 1.5 and SCIP 3.0.1 were studied.
In our computational tests SCIP uses SoPlex 1.7.1 as the LP solver, and BONMIN (with its
default algorithm, B-Hyb) uses Cbc 2.7.1 as the MIP solver and Ipopt 3.10.0 with MUMPS
as the linear solver. For BONMIN we tested 3 of the 5 available algorithms: B-Hyb, the
default algorithm; B-Ecp, a specific parameter setting of B-Hyb that can be faster in some
cases (Bonami & Lee, 2013); and B-OA using CPLEX as the MILP solver, which according
to Mittelmann (2013) can be faster for convex instances. In preliminary studies, solvers
such as KNITRO and COUENNE were also tested on our MINLP formulations, but their
performance on our simplest instances was poorer than that of the selected solvers.
For MILP problems the commercial solver CPLEX, version 12.4.0.0, was used, as it
is one of the top performers in the literature (Mittelmann, 2013).
All models were coded using the AMPL modelling language.

2.4.2 Execution environment


Each instance was executed using a single thread on an Intel(R) Xeon(R) E5620 CPU at
2.40 GHz with a running time limit of 48 hours, a relative optimality gap of 0.1%
for models (P3) and (P4) and 2% for (P1) and (P2), and a maximum memory usage of 6 GB
of RAM.
The prescribed optimality gap differs between MILP and MINLP solvers because the two
kinds of model give different guarantees. MINLP problems are solved to find the actual
minimum of the cost function within a defined optimality gap, which is therefore an a priori
gap. MILP models instead find true upper and lower bounds for the actual cost function,
leading to an a posteriori optimality gap that is computed afterwards. As will be shown in
Section 2.5.2, this difference ensures that our results are comparable.
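
To make the distinction concrete, the following sketch computes an a posteriori gap from the costs returned by the two linear approximations of one instance. The function name, the example costs and the convention of dividing by the upper bound are assumptions made for illustration; the exact expression is not stated in this section.

```python
def a_posteriori_gap(lower_bound: float, upper_bound: float) -> float:
    """Relative gap between the costs of the outer (lower) and inner (upper) models."""
    if not 0 < lower_bound <= upper_bound:
        raise ValueError("expected 0 < lower_bound <= upper_bound")
    # Assumed convention: gap relative to the upper bound.
    return (upper_bound - lower_bound) / upper_bound

# Hypothetical costs obtained from the two MILP approximations of one instance.
gap = a_posteriori_gap(lower_bound=985_000.0, upper_bound=1_000_000.0)
print(f"a posteriori gap: {gap:.2%}")
```

An MINLP solver, in contrast, receives the target gap as a stopping parameter before the run, which is why a looser value had to be fixed in advance for models (P1) and (P2).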


2.4.3 Methodology
Instances
To compare the different approaches, two sets of instances were built with data generated
randomly between reasonable upper and lower bounds: “sizing instances”, to compare the simpler
models (P1) and (P3), and “routing instances”, to compare the more complex models (P2) and
(P4). We considered different numbers of proteins to be produced (4 to 6), numbers of stages
in the process (11 to 65), numbers of routes to synthesize the product (20 to 65) and
cost coefficient values (1% to 110% of nominal values).

Benchmarking
In order to compare the model-solver pairs studied in this work, we bring to process
engineering a tool that was introduced in the optimization field by Dolan & Moré (2002) to
compare different optimization software: the performance profile.
As stated by Dolan & Moré (2002), the performance profile for a solver is the “cumulative
distribution for a performance metric”, for example computing time. In this way it can be
seen graphically how many instances a solver is able to solve given stopping criteria such as
those shown in Section 2.4.2, or how fast it solves different instances of the same type of
problem.
As an example of how to read these plots, Figure 2.4a shows that with 17 cutting points
40% of the instances were solved to an a posteriori optimality gap of up to 2%, while with
33 cutting points the same fraction of instances was solved to an optimality gap under 0.5%.
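
The reading described above can be reproduced with a few lines of code. The sketch below builds the cumulative profile of a metric over a set of instances; the function name and the gap values are hypothetical and were chosen only to mimic the 40% example, not taken from our results.

```python
import math

def profile(metric_by_instance, thresholds):
    """Fraction of instances whose metric is at or below each threshold.

    Unsolved instances are passed as None and never count as solved.
    """
    values = [math.inf if m is None else m for m in metric_by_instance]
    total = len(values)
    return [sum(v <= t for v in values) / total for t in thresholds]

# Hypothetical a posteriori gaps (in %) for five instances and two settings.
gaps_17_points = [1.8, 2.0, 3.1, 4.2, None]
gaps_33_points = [0.4, 0.5, 0.9, 1.2, 2.6]
thresholds = [0.5, 1.0, 2.0, 5.0]

for name, gaps in (("17 points", gaps_17_points), ("33 points", gaps_33_points)):
    print(name, profile(gaps, thresholds))
```

With these made-up values the first setting reaches 40% of the instances only at a gap of 2%, while the second reaches the same fraction at 0.5%; a curve that lies higher and further to the left is the better one.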

Data pre-processing
It is known that large-scale zero-one problems are hard combinatorial optimization
problems (Crowder et al., 1983; Koch, 2004; Applegate et al., 2007), which is why data
preprocessing is necessary to obtain reliable solutions. Tight bounds and normalized
variables are needed to decrease numerical errors.
Although not all of our instances are big enough to need data preprocessing, all were
subjected to the same treatment:

• All variable bounds and parameters associated with variables $Y_{j\cdot}$ and $Y_{i\cdot}$ were
normalized by their respective lower bounds.

• Size and time factors were normalized and made dimensionless considering the units of
the associated equipment. For example, as size factors for tanks have units of volume
divided by batch size, these parameters are made dimensionless by multiplying by the
lower bound of the final batch size and dividing by the respective tank lower bound.

• Lower bounds for the cycle time were tightened using time constraints, and upper bounds
for the final product batch size were tightened using size constraints.
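
The normalization of a tank size factor can be written out as follows. The symbols are hypothetical and introduced only for this illustration: $S_{ij}$ is the size factor of product i in tank j (volume required per unit of final product), $B_i$ the final batch size, $V_j$ the tank volume, and the superscript LO marks a lower bound. The sizing constraint $V_j \ge S_{ij} B_i$ is assumed to have the usual form found in the batch plant design literature.

```latex
\hat{S}_{ij} = S_{ij}\,\frac{B_i^{LO}}{V_j^{LO}}, \qquad
\hat{V}_j = \frac{V_j}{V_j^{LO}}, \qquad
\hat{B}_i = \frac{B_i}{B_i^{LO}}
\quad\Longrightarrow\quad
V_j \ge S_{ij} B_i \;\iff\; \hat{V}_j \ge \hat{S}_{ij}\,\hat{B}_i
```

Under this scaling every normalized variable has a lower bound of 1, so all quantities in the model are dimensionless and of comparable magnitude.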


Table 2.2 – Sizes of sample instances solved using non-linear model (P1).

            Variables                 Constraints
            Discrete    Continuous    Linear    Non-linear
Small       264         57            139       9
Medium1     840         157           407       5

2.5 Results and Discussion


In this section we show the robustness of our proposed MILP transformations and their
superiority over classic MINLP formulations with Big-M constraints using performance
profiles, a methodology borrowed from the optimization literature. Unlike MINLP
formulations, our approach finds correct solutions in realistic situations, and it does so
in a small fraction of the time required by those approaches. These features have two major
implications: the solutions are exact, which makes this information reliable for
decision-making; and, as the time reduction is significant, numerous alternatives can be
tested with the same formulation or with more complex models that combine different
types of decisions.
This presentation is organized as follows: first, we describe the instances generated for
comparison; then we discuss the selection of the cutting points for the proposed approach;
and finally, we compare the MINLP and MILP formulations in terms of their performance
solving the sets of instances, using time as the metric.

2.5.1 Size of instances


To compare the most basic and easiest to solve problems, (P1) and (P3), a set of 186
instances (“sizing instances”) was generated varying the number of proteins to be produced
(2 to 6), the number of stages in each process (11 to 35) and the cost coefficient values
(1% to 110% of nominal values). The sizes of these instances in terms of number of variables
and constraints are shown in Tables 2.2 and 2.3, where “Small” corresponds to one of the
smaller instances solved with the different models and “Medium1” to one of the bigger
instances solved for these two models. As can be seen in both tables, the new auxiliary
variables and the sets of linear functions generated to replace non-linear constraints make
the problem 7 to 12 times bigger in terms of linear constraints when 33 cutting points
are used for linearization, with an increase of about 50% in continuous variables. However,
as we will see later, this increase in variables and constraints leads to smaller execution
times and more accurate results.
Table 2.3 – Sizes of sample instances solved using linear model (P3) with 33 cutting points for
linear inner approximation.

            Variables                 Constraints
            Discrete    Continuous    Linear    Non-linear
Small       264         87            1357      -
Medium1     840         237           3096      -

To test models (P2) and (P4), which were posed to solve more complex scenarios, a set
of 249 new and bigger instances (“routing instances”) was generated varying the number
of proteins to be produced (4 to 6), the number of stages in the global process
(18 to 65) and the number of routes available to produce the proteins (20 to 40). The sizes
of these instances in terms of number of variables and constraints are shown in Tables 2.4
and 2.5, where “Medium2” corresponds to one of the smaller “routing instances” solved with
models (P2) and (P4) and “Large” to one of the bigger instances solved in this work. As can
be seen in both tables, in comparison with Tables 2.2 and 2.3, the number of discrete
variables increases by about 5% with the addition of the selection variables, z, and the
number of linear constraints increases by about 15% for (P4) and roughly doubles for (P2).
As we will see later, this addition permits the resolution of more complex scenarios
without affecting execution time or optimality gap in comparison with the more basic
formulation.

Table 2.4 – Sizes of sample instances solved using non-linear model (P2).

            Variables                 Constraints
            Discrete    Continuous    Linear    Non-linear
Small       279         57            251       9
Medium1     881         157           719       5
Medium2     488         152           1262      22
Large       1794        613           15766     462

Table 2.5 – Sizes of sample instances solved using linear model (P4) with 33 cutting points for
linear inner approximation.

            Variables                 Constraints
            Discrete    Continuous    Linear    Non-linear
Small       279         87            1539      -
Medium1     881         237           3602      -
Medium2     488         229           4829      -
Large       1794        926           46489     -

2.5.2 Selection of cutting points


Unlike MINLP problems (P1) and (P2), which are solved to an a priori optimality gap,
models (P3) and (P4) give true upper and lower bounds for the actual cost function of each
instance and therefore an optimality gap that is computed a posteriori. Figure 2.4a shows
the performance profiles of the gaps obtained a posteriori for the “sizing instances” solved
with 17, 33 and 65 cutting points, which generate 16, 32 and 64 linear functions for the inner
approximations, respectively. Here we can see that our linear model is able to solve all
instances with a maximum gap of 5% in less than 16 seconds when using 17 cutting points,
and with a gap of less than 0.5% in less than 64 seconds when using 65 cutting points. The
running time profiles can be seen in Figure 2.4b. If 65 cutting points had been selected, the
optimality gap for non-linear solvers would have been set around 0.5%, as that is the worst
gap obtained a posteriori with CPLEX (Figure 2.4a). Given this, in our final experiments we
use a set of 33 cutting points, since this option gives the largest improvement in gap
relative to the increase in execution time, and a slightly larger optimality gap criterion
of 2% was selected for the non-linear solvers.
Once the number of points is selected, the specific values of these points must be chosen.
The most obvious choice is equispaced points which, for the (relevant) exponential function
$e^x$, generate small errors for low values of x and large errors for high values. An
alternative is to use expression (2.43), where N is the total number of cutting points,
including $x_1 = -\infty$ and the upper bound of x, $\bar{x}$. This is a good choice for minimizing
the maximum value of the error (see Figure 2.5).

\[ x_k = 2 \ln\left( \frac{k-1}{N-1} \right) + \bar{x} \qquad \forall k \in 2 \ldots N \tag{2.43} \]
We can see in Figure 2.5 that equispaced points lead to better approximations for low
values of x, but much worse ones for high values. In our numerical experiments, however,
equispaced points work better: while execution time remains the same for both
approaches, a posteriori gaps were slightly worse for non-equispaced points (Figure 2.6).
This leaves open important questions about the optimal point selection to improve the
precision of the upper and lower approximations. Our preliminary simulations seem to
indicate that giving substantial attention to smaller values of x could significantly
improve the results, but this is left for further research.
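
The trade-off between the two sets of points can be explored numerically. The sketch below compares the largest chord error of the exponential for equispaced points and for points given by expression (2.43). It rests on several assumptions: the range of x is hypothetical, the point x_1 = −∞ is replaced by the lower bound of x, the error is measured for chords (inner approximation) only, and the function names are illustrative.

```python
import math

def secant_error(a: float, b: float) -> float:
    """Largest gap between exp and its chord on [a, b] (closed form)."""
    slope = (math.exp(b) - math.exp(a)) / (b - a)
    x_star = math.log(slope)          # point where exp has the chord's slope
    chord = math.exp(a) + slope * (x_star - a)
    return chord - math.exp(x_star)

def equispaced(lb: float, ub: float, n: int) -> list[float]:
    return [lb + k * (ub - lb) / (n - 1) for k in range(n)]

def log_spaced(lb: float, ub: float, n: int) -> list[float]:
    """Points from expression (2.43); x_1 = -inf is replaced by lb (assumption)."""
    pts = [2 * math.log((k - 1) / (n - 1)) + ub for k in range(2, n + 1)]
    return [lb] + [p for p in pts if p > lb]

def errors(points):
    """Chord error on every interval between consecutive points."""
    return [secant_error(a, b) for a, b in zip(points, points[1:])]

LB, UB, N = 0.0, 8.0, 33   # hypothetical range of x; 33 points as in the text
for name, pts in (("equispaced", equispaced(LB, UB, N)),
                  ("expression (2.43)", log_spaced(LB, UB, N))):
    e = errors(pts)
    print(f"{name}: first interval {e[0]:.4f}, worst interval {max(e):.4f}")
```

In this setting the points of expression (2.43) keep the worst-case error much smaller, at the price of a larger error on the first interval. This measures approximation error only; it does not by itself predict the a posteriori gaps of the full model, which in our experiments were slightly better with equispaced points.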

2.5.3 Equipment sizing: comparison of problems (P1) and (P3)
We tested three different solver-model combinations: the linear model (P3) was solved
using CPLEX, while the non-linear model (P1) was solved using SCIP and 3 of the
5 algorithms available in BONMIN, which were chosen based on the BONMIN users’ manual
and Mittelmann’s benchmarking information (Mittelmann, 2013). All of the instances were
solved using the stopping criteria and conditions mentioned in Section 2.4.2.
[Figure 2.4: two performance profile plots, (a) Relative optimality gap and (b) Running time. Vertical axis: solved instances / total instances. Horizontal axes: (a) optimality gap, 0% to 5%; (b) log2(Time [s]). Curves: 17, 33 and 65 cutting points.]

Figure 2.4 – Comparison of performance profiles of (a) relative optimality gap obtained a
posteriori and (b) the logarithm of the running time of “sizing instances” solved with linear
model (P3) using 17, 33 and 65 cutting points for lower and upper approximations with an
optimality relative gap of 0.1%.

[Figure 2.5: plot of the absolute error f(x) − e^x against x, for x between 0 and 8. Curves: non-equispaced points and equispaced points.]

Figure 2.5 – Comparison of absolute errors using 2 different sets of 33 cutting points, where
f(x) are the linear functions used to approximate the exponential function between 2 cutting
points.

[Figure 2.6: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: optimality gap, 0% to 3%. Curves: non-equispaced cutting points and equispaced cutting points.]

Figure 2.6 – Comparison of performance profiles of a posteriori gaps obtained using 33
cutting points to solve model (P4), where f(x) are the linear functions used to approximate
the exponential function between 2 cutting points. The time limit was set to 12 hours.

[Figure 2.7: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: log2(Time [s]). Curves: (P1) SCIP, (P1) B-Ecp, (P1) B-Hyb, (P1) B-OAcpx, (P3) CPLEX lower approximation, (P3) CPLEX upper approximation.]

Figure 2.7 – Comparison of performance profiles of the logarithm of running time of “sizing
instances” solved using models (P1) and (P3) with an optimality relative gap of 0.1% for the
linear solver and 2% for non-linear solvers.

Figure 2.7 shows the performance profile of running time using BONMIN-Hyb, BONMIN-Ecp,
BONMIN-OAcpx, SCIP and our CPLEX-based approach. From this, we can see that
problem (P1) was solved faster using any BONMIN algorithm than using the SCIP solver.
Moreover, SCIP only worked well in about 75% of the instances, while BONMIN was
able to solve 85% of the instances using the OA algorithm and 100% of the studied
instances using either the Ecp or the Hyb algorithm. On the other hand, model (P3) solved
using CPLEX is able to solve all of the instances studied, as is BONMIN, but takes
much less time than problem (P1). As B-Ecp and B-Hyb seem to be equally good for
those instances that take longer to solve, we also used as performance metric
the ratio of the computing time of each model-solver pair to the best time over all
model-solver pairs, denoted by τ. Those performance profiles are plotted in Figure 2.8,
where we can see even more clearly that the CPLEX-based approach is always better than all
of the other options and that the Ecp algorithm is always better than Hyb for the requested
optimality gap.
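
The ratio τ can be computed as in the following sketch, which follows the general idea of Dolan & Moré (2002). The function names and the running times are hypothetical and serve only to show the mechanics; they are not our measured results.

```python
import math

def performance_ratios(times):
    """times[solver][p] is the running time on instance p, or None if unsolved."""
    solvers = list(times)
    n = len(times[solvers[0]])
    ratios = {s: [] for s in solvers}
    for p in range(n):
        solved = [times[s][p] for s in solvers if times[s][p] is not None]
        best = min(solved) if solved else None
        for s in solvers:
            t = times[s][p]
            # Unsolved instances get an infinite ratio.
            ratios[s].append(math.inf if t is None or best is None else t / best)
    return ratios

def rho(ratios, tau):
    """Fraction of instances solved within a factor tau of the best pair."""
    return {s: sum(r <= tau for r in rs) / len(rs) for s, rs in ratios.items()}

# Hypothetical running times in seconds for three instances.
times = {
    "(P3) CPLEX": [2.0, 8.0, 30.0],
    "(P1) B-Ecp": [16.0, 64.0, 600.0],
    "(P1) SCIP": [40.0, None, 9000.0],
}
ratios = performance_ratios(times)
print(rho(ratios, tau=1.0))
print(rho(ratios, tau=20.0))
```

The value of the profile at τ = 1 is the fraction of instances on which a pair is the fastest, and its limit for large τ is the fraction of instances the pair can solve at all, so a single curve shows both speed and robustness.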

2.5.4 Routes selection: comparison of problems (P2) and (P4)
As a test of correctness, models (P2) and (P4) were solved using the generated “sizing
instances”, where just one route was available to produce each product. Contrasting these
results with those obtained by (P1) and (P3), it can be seen in Figure 2.9 that both
formulations, for equipment sizing and considering routes, are consistent, solving the same
number of instances in virtually the same amount of time.
[Figure 2.8: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: log2(τ). Curves: (P1) B-Ecp, (P1) B-Hyb, (P1) B-OAcpx, (P1) SCIP, (P3) CPLEX lower approximation, (P3) CPLEX upper approximation.]

Figure 2.8 – Comparison of performance profiles of the logarithm of the ratio of the computing
time of the pair model-solver versus the best time of the pairs model-solvers for “sizing
instances” solved with models (P1) and (P3) with an optimality relative gap of 0.1% for the
linear solver and 2% for non-linear solvers.

[Figure 2.9: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: log2(Time [s]). Curves: (P1) SCIP, (P2) SCIP, (P1) B-Ecp, (P2) B-Ecp, (P3) CPLEX lower and upper approximations, (P4) CPLEX lower and upper approximations.]

Figure 2.9 – Comparison of performance profiles of the logarithm of running time of “sizing
instances” solved using models (P1), (P2), (P3) and (P4) with an optimality relative gap of
0.1% for the linear solver and 2% for non-linear solvers.

[Figure 2.10: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: relative cost difference from 0 to 0.05, labelled ((P1)cost − (P3)cost)/(P1)cost or ((P2)cost − (P4)cost)/(P2)cost. Curves: SCIP, B-Ecp, CPLEX lower approximation, CPLEX upper approximation.]

Figure 2.10 – Comparison of performance profiles of relative difference between simple and
more complex formulation for “sizing instances”. Models (P1) and (P2) were solved to an
optimality gap of 2% and (P3) and (P4), to an optimality gap of 0.1%.

Performance profiles of the relative difference between the value of the objective function
obtained with the simpler formulations, (P1) and (P3), and with their more complex
counterparts, (P2) and (P4), are presented in Figure 2.10, where it can be seen that the
differences between the MILP formulations, as well as between the MINLP formulations solved
using BONMIN, are at most the optimality gap requested for each solver. The case of SCIP is
different: in almost 10% of the instances the solver gives results with a difference in the
objective function between models (P1) and (P2) greater than the optimality gap, which shows
that this solver is not reliable for this type of problem.
As a final step in this work, we compare in Figure 2.11 the performance profiles of total
running time obtained after solving all instances generated for this work (435 in total). Here
we can see that for “routing instances” (P4) is much more robust than (P2), which was not
able to solve any of those cases. On average (geometric mean), the instances take about
40 seconds to be solved using model (P4), which is about 3% of the time required by
(P2)-BONMIN and less than 1% of the time required by (P2)-SCIP. Nevertheless, for
a few specific instances, which represent 4.6% of the instances studied, the time and/or
memory limits were not enough to reach the desired optimality gap. From this it can be stated
that, for the solution of more realistic instances or even of real problems considering
continuous equipment sizes, the formulation proposed in this work is much more reliable and
faster than the usual and widely studied standard MINLP formulation.
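
The geometric mean is used because running times span several orders of magnitude, so a few slow instances would dominate an arithmetic mean. A minimal sketch, with a hypothetical function name and made-up times, is:

```python
import math

def geometric_mean(times):
    """Geometric mean of positive running times, computed through logarithms."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical running times in seconds; the slowest instance dominates the
# arithmetic mean but not the geometric mean.
times = [2.5, 10.0, 40.0, 160.0, 640.0]
print(geometric_mean(times), sum(times) / len(times))
```

For these values the geometric mean equals the median running time, while the arithmetic mean is more than four times larger.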

2.6 Conclusions
In this work we present a scalable approach to solve, within reasonable running times and
quality assurance requirements, the problem of designing a biotechnological multi-product
batch plant. The approach supports continuous equipment sizes and discrete host and/or
process selection, scales up to the sizes of real instances, and is applicable to any kind
of multi-product batch plant.
[Figure 2.11: performance profile plot. Vertical axis: solved instances / total instances. Horizontal axis: log2(Time [s]). Curves: (P2) SCIP, (P2) B-Ecp, (P4) CPLEX lower approximation, (P4) CPLEX upper approximation.]

Figure 2.11 – Comparison of performance profiles of the logarithm of running time of “routing”
and “sizing instances” solved using models (P2) and (P4) with an optimality relative gap of
0.1% for the linear solver and 2% for non-linear solvers.

The proposed method proved to be more numerically stable than alternative approaches
for the same problem, giving true optimal solutions, and in general faster than the
other tested approaches. Our method takes advantage of two facts: the continuous relaxation
of the feasible region is convex and bounded (which allows us to build, up front, inner or
outer approximations of the feasible space, and thus report true upper/lower bounds for each
instance); and mixed-integer linear solvers are much more stable numerically
and scalable in size than MINLP algorithms. Also, this approach relies on off-the-shelf
optimization and modelling software, which makes it more accessible to practitioners.
To support our claims, we borrow algorithm comparison tools from the optimization
community, which are an interesting way to test the quality of competing algorithms that
tackle the same class of problems.
In a real scenario, semi-continuous units such as centrifuges and microfilters,
among others, are available only in discrete sizes, unlike tanks, which can be built
according to customer needs. We feel that the proposed approach is robust enough to consider
such issues; however, this was left as a next step in our research. Additionally, to
increase the precision of our results for real cases, it is possible to explore a two-step
approach: after using the proposed method to obtain upper and lower approximations for the
objective function, the upper and lower variable bounds can be tightened and a
re-optimization performed.
Finally, if true speed is the goal, we know that low-level implementations of dynamic
inner/outer approximation can provide further time reductions; however, we feel that this is
beyond the scope of this work.

Chapter 3

Optimization of a biotechnological multi-product batch plant design for the manufacture of four different products: a real case scenario

Submitted to Biotechnology and Bioengineering in December 2015.

Abstract
In this work a mixed-integer linear programming (MILP) formulation recently developed
by us is used to optimize the equipment sizes of a hypothetical new biotechnological
multi-product batch plant, based on information from real, known processes for the
production of 4 different biotechnological products. Knowing the specific steps of the
downstream processing of each product, size and time factors were computed and used as
parameters to solve the aforementioned MILP reformulation. New constraints were included to
permit the selection of some equipment, such as centrifuges and membrane filters, from a
discrete set of sizes. For equipment that can be built according to customer needs, such as
fermenters and stirred tanks, the original formulation was retained.
Computational results show the ability of this methodology to deal with real data, giving
reliable solutions for a multi-product batch plant composed of 44 unit operations in a
relatively small amount of time, and show that in the case studied it is possible to save
up to 66% of the capital investment in equipment given the cost data used.
Keywords: multi-product batch plant, biotechnological products, MILP.


3.1 Introduction
The design of multi-product batch plants using an optimization-based approach has been
studied for more than 40 years. The approaches used to deal with the complexity of the
resulting optimization models, which are non-convex mixed-integer non-linear problems
(MINLP), range from the development of new algorithms that are able to solve this type of
problem (Kocis & Grossmann, 1989; Viswanathan & Grossmann, 1990; Borisenko et al., 2011;
Li et al., 2012) to their reformulation as mixed-integer linear problems (MILP) (Montagna
et al., 2004; Moreno & Montagna, 2011; Sandoval et al., 2016) that can be solved using the
well-known and accurate commercial solver CPLEX. In the majority of cases the aim of these
efforts is to find the optimal design of multi-product batch plants in real scenarios,
which is, according to Barbosa-Póvoa (2007), still a challenge.
In this work the methodology developed by Sandoval et al. (2016) is applied to a real case
scenario. The production processes of 4 recombinant proteins, Products 1 to 4, are known,
and with this information a new biotechnological multi-product batch plant is designed.
Products 2 and 3 are synthesized as part of inclusion bodies in recombinant Escherichia
coli, while Products 1 and 4 are synthesized by recombinant Saccharomyces cerevisiae as
intracellular and extracellular products, respectively. Each production process is composed
of 8 to 21 processing stages, of which about 14 can be shared by 2 or more individual
processes. As the equipment involved in each stage may be selected from the equipment
offered by different manufacturers or sized according to customer needs, a new selection
decision was introduced into the routing model presented by Sandoval et al. (2016), which
now allows for the selection from a discrete number of possible sizes when necessary.
The computational results obtained show that the methodology developed earlier is
capable of solving the optimization problem of a realistic biotechnological multi-product
batch plant, with 44 operational stages, reliably and in a small amount of time.
The rest of this paper is organized as follows. In Section 3.2 the known processes to
manufacture the 4 proteins studied are presented together with the proposed structure of a
multiproduct batch plant that produces them. In Section 3.3 the problem formulation and
the equations used for parameter estimation are presented. Results are discussed in
Section 3.4 and conclusions are presented in Section 3.5.

3.2 Design of a biotechnological multiproduct batch plant
3.2.1 Processes description
In this section the downstream processes of the 4 recombinant proteins studied are described.
According to Imperatore & Asenjo (2001), Product 1 is purified in 18 stages, while Products
2 and 3 need 20 and 15 isolation and purification steps, respectively. Product 4, on the
other hand, needs only 7 processing stages because it is an extracellular product.
Flowsheets of the 4 processes are shown in Figures 3.1 to 3.4.


Table 3.1 – Production data for the Purification facility (Imperatore & Asenjo, 2001).

Product   Cycle time   Final batch   Final batch   Suite   Weeks of     Num. of   Max. num.
          (h)          size (kg)     volume (L)            production   batches   of batches
P1        120          6.005         285           1       41           48        49
P2        24           3.777         135           2       15           16        90
                                                   4       20           16        120
P3        32           7.385         285           3       41           95        184
                                                   4       11           25        49
P4        36           14.043        285           2       11           38        44

The original plant is divided into two facilities. The first includes the fermentation and
primary purification stages; in the second, high-resolution steps are performed in 4
different suites, each processing one or more products per year. General production data for
these suites are presented in Table 3.1. The maximum number of batches per year, in the last
column, corresponds to the maximum capacity of the plant given the cycle times, the annual
production weeks for each process and 144 work hours per week.
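
The last column of Table 3.1 can be reproduced from the other columns. The sketch below assumes that the available hours are divided by the cycle time and rounded down to a whole number of batches; the function name is illustrative and the rounding rule is inferred from the tabulated values.

```python
import math

HOURS_PER_WEEK = 144  # work hours per week, as stated in the text

def max_batches(weeks_of_production: int, cycle_time_h: float) -> int:
    """Largest whole number of batches that fits in the available hours."""
    return math.floor(weeks_of_production * HOURS_PER_WEEK / cycle_time_h)

# (product, suite, cycle time in h, weeks of production) taken from Table 3.1
rows = [("P1", 1, 120, 41), ("P2", 2, 24, 15), ("P2", 4, 24, 20),
        ("P3", 3, 32, 41), ("P3", 4, 32, 11), ("P4", 2, 36, 11)]
for product, suite, cycle, weeks in rows:
    print(product, suite, max_batches(weeks, cycle))
```

For example, Product 1 has 41 weeks × 144 h = 5904 h available and a cycle time of 120 h, which allows 49 complete batches.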

Product 1
Product 1 (P1) is an intracellular protein synthesized by S. cerevisiae in a fermentation
process that lasts 23 hours. This step is followed by a concentration stage performed in a
microfilter, where cells are collected in the retentate stream for a subsequent
homogenization step. To wash out the homogenization buffer, a concentration together with a
diafiltration process follows before cellular deactivation in a stirred tank. After that, a
new homogenization step is performed, followed by a sucrose gradient centrifugation for the
collection of a small fraction containing the protein of interest.
At another facility, the purification process continues with a solubilization process and
a centrifugation to concentrate the product. Then the mixture is subjected to a reducing
environment, followed by a gel filtration chromatography and an oxidation reaction to refold
the protein. Before the next chromatographic step, a reversed-phase chromatography, a
cross-flow ultrafiltration is performed to concentrate the product mixture. The protein of
interest is then precipitated and centrifuged before a dissolution step, and is later fed to
a gel filtration chromatographic column. Finally, a concentration and diafiltration step is
performed.

[Figure 3.1: flowsheet, listed as equipment (operation) in process order]
Fermenter (Fermentation) → Microfilter1 (Concentration) → Homogenizer1 (Cell Lysis) → Microfilter2 (Conc./Diafiltration) → Reactor1 (Host cell inact.) → Homogenizer2 (Homogenization) → Centrifuge1 (Sucrose gradient cent.) → Tank1 (Solubilization) → Centrifuge2 (Concentration) → Reactor2 (Reduction) → Chromatographic C1 (Gel Filtration) → Reactor3 (Oxidation) → Ultrafilter1 (Concentration) → Chromatographic C2 (RP-HPLC) → Tank2 (Precipitation) → Centrifuge3 (Concentration) → Tank3 (Dilution) → Chromatographic C3 (Gel Filtration) → Ultrafilter1 (Conc./Diafiltration)

Figure 3.1 – Production process for Product 1, an intracellular protein synthesized in
Saccharomyces cerevisiae.

Product 2
Product 2 (P2) is an intracellular protein synthesized by E. coli as inclusion bodies in a
fermentation process that lasts 11.5 hours. Cell harvest is performed using a centrifugation
step, followed by cell lysis and the harvesting of the inclusion bodies. After 3 consecutive
washing and centrifugation steps the protein is subjected to an oxidative environment,
followed by the refolding of the protein. Before a cation exchange chromatography, a
microfiltration step is carried out to concentrate the mixture, and then, in the second
facility, 2 more chromatographic stages are carried out: gel filtration and anion exchange.
Dilution and concentration steps, followed by a new gel filtration and a hydrophobic
interaction chromatography, precede a concentration and diafiltration step as the last stage
of the process.

Product 3
Product 3 (P3) is also an intracellular protein synthesized by E. coli as inclusion bodies,
but in a fermentation process that lasts 25 hours. Cell harvest is performed using a
centrifugation step, followed by cell washing, cell concentration and cell disruption. A
new centrifugation step is carried out to capture the inclusion bodies, which are
then suspended and subjected to a reducing environment. The next stage, in which the
protein is refolded, is followed by a concentration, a cation exchange chromatography and
a concentration and diafiltration step. In the second facility, three chromatographic stages
(anion exchange, hydrophobic interaction and cation exchange) are performed before the last
concentration and diafiltration step.

Product 4
Product 4 (P4) is a recombinant protein synthesized by S. cerevisiae as an extracellular
product in a fermentation process that lasts 122 hours. After fermentation, a centrifugation
step to discard cells is performed, followed by a microfiltration step to clarify the stream
before two chromatographic stages: a cation exchange chromatography and a hydrophobic
interaction chromatography.
In the second facility, the process continues with two ion exchange chromatographies, anion
and cation exchange, respectively, and ends with a concentration and diafiltration stage.

3.2.2 Estimation of processes data and plant cost
Equipment sizes
Based on mass balances and the process data given by Imperatore & Asenjo (2001), the
equipment sizes for each production process were estimated. A 20% safety factor
was considered for tanks in general and a 15% safety factor for semi-continuous items other
than chromatographic columns, which were sized using a relatively low capacity usage.
Estimated equipment sizes are given in Tables 3.2 to 3.5. The Times column corresponds
to the number of times that the stage is used before passing to the next step. V, V1 and V2
are the sizes of the batch, feed and product tanks, respectively; R is the size of the
semi-continuous item; V3 is the chromatographic column size; and T is the operation time of
the stage in hours. Tanks, reactors and chromatographic columns are sized in L, membrane
filtration systems in m², centrifuges in 1000 m² and homogenizers in L/h. For all products,
V2 in the last membrane filtration step is the size of the final product.

3.2 Design of a biotechnological multiproduct batch plant 3 Design of a real plant

[Figure 3.2: process flow diagram. Units (stages), in order:
Fermenter (Fermentation) -> Centrifuge1 (Cell Harvest) -> Homogenizer1 (Cell Lysis) -> Centrifuge2 (IB Harvest)
-> Tank1 (IB Wash) -> Centrifuge3 (IB Capture) -> Tank2 (IB Wash) -> Centrifuge4 (IB Capture) -> Tank3 (IB Capture)
-> Centrifuge5 (Extraction) -> Reactor4 (Oxidation) -> Reactor5 (Ox. titration) -> Microfilter1 (Concentration)
-> Chromatographic C1 (Cation Exchange) -> Chromatographic C2 (Gel Filtration) -> Chromatographic C3 (Anion Exchange) -> Tank4 (Dilution)
-> Microfilter2 (Conc./Diafiltration) -> Chromatographic C4 (Gel Filtration) -> Chromatographic C5 (HIC)
-> Microfilter3 (Conc./Diafiltration)]

Figure 3.2 – Production process for Product 2, an intracellular protein synthesized in
Escherichia coli.


[Figure 3.3: process flow diagram. Units (stages), in order:
Fermenter (Fermentation) -> Centrifuge1 (Cell Harvest) -> Tank1 (Cell Wash) -> Centrifuge2 (Concentration)
-> Homogenizer1 (Cell Disruption) -> Centrifuge3 (IB Capture) -> Tank2 (IB Suspension) -> Reactor1 (Reduction) -> Reactor2 (Oxidation)
-> Microfilter1 (Concentration) -> Chromatographic C1 (SP-Sepharose FF) -> Microfilter2 (Conc./Diafiltration)
-> Chromatographic C2 (Q-Sepharose HP) -> Chromatographic C3 (Butyl HIC) -> Chromatographic C4 (SP-Sepharose HP)
-> Microfilter3 (Conc./Diafiltration)]

Figure 3.3 – Production process of Product 3, an intracellular protein synthesized in
Escherichia coli.


Table 3.2 – Original equipment sizes and operation times for the production of Product 1, an
intracellular protein synthesized in S. cerevisiae.

No.  Stage  Times  V/V1    V2    V3/R   T
Fermentation
1    fer    1      12500   .     .      23
2    mf1    1      12500   .     21     2.5
3    hom1   1      1964    .     3697   3
4    mf2    1      1964    .     17     8
5    rct1   1      1473    .     .      8
6    hom2   1      1621    .     4067   3
7    cnt1   1      17826   463   56     15
Purification
8    tnk1   8      250     .     .      2
9    cnt2   8      250     38    95     5
10   rct2   4      225     .     .      1
11   chr1   4      225     203   453    24
12   rct3   4      405     .     .      6
13   uf1    1      1620    .     3      2
14   chr2   8      176     159   50     4
15   tnk2   8      635     .     .      1.5
16   cnt3   8      635     63    399    3
17   tnk3   5      226     .     .      2
18   chr3   5      226     90    254    24
19   mf3    1      452     356   21     6


Table 3.3 – Original equipment sizes and operation times for the production of Product 2, an
intracellular protein synthesized in E. coli.

No.  Stage  Times  V/V1    V2     V3/R   T
Fermentation
1    fer    1      15000   .      .      11.5
2    cnt1   1      15000   1500   15     5
3    hom1   1      1650    .      2718   4
4    cnt2   1      1650    413    39     2
5    tnk1   1      1650    .      .      1
6    cnt3   1      1650    413    52     1.5
7    tnk2   1      1650    .      .      1
8    cnt4   1      1650    413    78     1
9    tnk3   1      1650    .      .      1
10   cnt5   1      1650    413    389    8
11   rct1   1      743     .      .      12
12   rct2   1      2673    .      .      18
13   mf1    1      2673    .      16     4
14   chr1   1      668     601    308    10
Purification
15   chr2   1      601     541    5600   24
16   chr3   2      271     244    51     10
17   tnk4   1      1063    .      .      2
18   mf2    1      1063    .      3      4
19   chr4   1      750     675    1800   24
20   chr5   2      338     304    141    5.6
21   mf3    1      608     169    10     7.5


Table 3.4 – Original equipment sizes and operation times for the production of Product 3, an
intracellular protein synthesized in E. coli.

No.  Stage  Times  V/V1    V2     V3/R   T
Fermentation
1    fer    1      12500   .      .      25
2    cnt1   1      12500   1250   15     4
3    tnk1   1      12500   .      .      1
4    cnt2   1      12500   2500   7      9
5    hom1   1      2750    .      4529   4
6    cnt3   1      2750    344    11     12
7    tnk2   1      375     .      .      1
8    rct1   1      3125    .      .      1
9    rct2   1      4688    .      .      26
10   mf1    8      586     .      5      3
11   chr1   1      1000    1100   462    9
12   mf2    1      1100    .      34     6
Purification
13   chr2   1      550     495    283    12
14   chr3   4      124     111    126    8
15   chr4   4      111     100    63     8
16   mf3    1      401     356    31     4

Table 3.5 – Original equipment sizes and operation times for the production of Product 4, an
extracellular protein synthesized in S. cerevisiae.

No.  Stage  Times  V/V1   V2     V3/R   T
Fermentation
1    fer    1      9125   .      .      122
2    cnt1   1      9125   1014   1      12
3    mf1    2      507    425    1      5
4    chr1   3      283    255    308    15
5    chr2   6      128    115    173    8
Purification
6    chr3   3      230    207    60     12
7    chr4   3      207    186    156    8
8    mf2    1      558    356    9      15


[Figure 3.4: process flow diagram. Units (stages), in order:
Fermenter (Fermentation) -> Centrifuge1 (Cell Slurry) -> Microfilter1 (Clarification) -> Chromatographic C1 (SP-Sepharose FF)
-> Chromatographic C2 (HIC) -> Chromatographic C3 (Q-Sepharose FF) -> Chromatographic C4 (SP-Sepharose FF)
-> Microfilter2 (Conc./Diafiltration)]

Figure 3.4 – Production process of Product 4, an extracellular protein synthesized in
Saccharomyces cerevisiae.

Equipment costs
Units such as tanks, reactors and fermenters may be built according to the customer's needs;
therefore, the information needed for their sizing consists of the cost coefficients for functions
of the type $c_j V_j^{\gamma_j}$, together with lower and upper size bounds. The original data for fermenters and reactors
were taken from Iribarren et al. (2004) and for tanks from Harrison et al. (2015). The
corresponding data are presented in Table 3.6.
For semi-continuous items, costs are determined by the unit selected from those
offered by the manufacturer. In these cases, costs and equipment sizes are given in
discrete sets. The data used in this paper are presented in Table 3.7.

3.2.3 Multiproduct batch plant


Given the description of the 4 downstream processes, some
equipment can be shared by 2 or more processes, as is the case for the centrifugation step used
for cell harvesting in the isolation processes of Products 2 and 3.
The isolation and purification processes for each individual protein, and the
stages that can be shared by 2 or more processes in the proposed multiproduct batch
plant, are shown in Table 3.8. For comparison purposes, stages are organized as “Fermentation”
and “Purification” steps.


Table 3.6 – Cost coefficients and variable bounds needed to size batch units. Costs can be
calculated in U.S.$ with the function $c_j V_j^{\gamma_j}$. Data were updated to year 2012 using the CE index:
year 2000, 394.1; year 2012, 584.6.

            Cost coefficients    Size bounds (L)
Item        c_j      γ_j         lower    upper
Fermenter   1491     0.6         20       100000
Reactor     1454     0.5         20       100000
Tank        35945    0.1168      200      5000
            492      0.6217      5000     50000
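
As an illustration of how the coefficients in Table 3.6 are used, the following minimal sketch evaluates $c_j V_j^{\gamma_j}$ for a given volume. The function and dictionary names are hypothetical, and assigning a 5000 L tank to the first (smaller) range is an assumption, since the table lists that volume as a bound of both ranges.

```python
# Cost coefficients and size bounds transcribed from Table 3.6.
# Each entry: (c_j, gamma_j, lower bound in L, upper bound in L).
BATCH_COST_DATA = {
    "fermenter": [(1491.0, 0.6, 20.0, 100000.0)],
    "reactor": [(1454.0, 0.5, 20.0, 100000.0)],
    # Tanks have two size ranges, each with its own coefficients.
    "tank": [(35945.0, 0.1168, 200.0, 5000.0),
             (492.0, 0.6217, 5000.0, 50000.0)],
}


def batch_unit_cost(item: str, volume_l: float) -> float:
    """Return c_j * V_j**gamma_j in U.S.$ for a batch unit of the given volume."""
    for c, gamma, lower, upper in BATCH_COST_DATA[item]:
        # First matching range wins (assumption for the shared 5000 L bound).
        if lower <= volume_l <= upper:
            return c * volume_l ** gamma
    raise ValueError(f"{volume_l} L is outside the size bounds for a {item}")


if __name__ == "__main__":
    for item, volume in [("fermenter", 12500.0), ("reactor", 1473.0),
                         ("tank", 250.0), ("tank", 12500.0)]:
        cost = batch_unit_cost(item, volume)
        print(f"{item:10s} {volume:8.0f} L -> U.S.$ {cost:12,.0f}")
```

The two tank ranges give nearly the same cost at 5000 L, so the choice of range at the shared bound has little practical effect.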

3.3 Problem formulation


3.3.1 Mathematical modeling
The routing formulation proposed by Sandoval et al. (2016) permits the sizing of the
equipment according to each of the different types of stages, $j$, present in a biotechnological
multi-product batch plant. In this article, that formulation was modified to account for the
selection of semi-continuous items from a discrete set of sizes and costs. The model considers
the duplication of units in parallel working in-phase, using continuous variables $x^1_j$, and
out-of-phase, using continuous variables $x^2_j$; it also accounts for the selection of the global process
by selecting routes, which were defined as different sets of downstream processing stages used
to purify a protein given a certain host that synthesizes it. The selection of one route per
protein, and therefore the use of just some of the possible stages, is carried out by the variables
$z^1_{ih}$ and $z^2_j$. The former is 1 when synthesis host $h$ is selected for product $i$ and 0 otherwise,
and the latter is 1 when stage $j$ is part of at least one of the selected routes and 0 otherwise.
The main variations are presented in the following paragraphs.

Objective function
The objective is to minimize the investment cost of the main equipment of the plant given
fixed production targets, $d_i$, over a time horizon $\delta$. As the original cost functions are
non-linear, continuous variables $v_j^{\cdot}$ are defined to transform them into sets of linear
functions. The resulting objective function is given by equation (3.1), which is associated with
constraints such as (3.2) for batch stages other than stirred tanks, such as fermenters and reactors
in subset $\hat{E}^1$.

$$\min\ \text{cost} = \sum_{j \in E^1} v^1_j + \sum_{j \in E^2 \cup E^3} \left( v^1_j + v^2_j + v^3_j \right) \tag{3.1}$$

$$v^1_j \ge \alpha^{1j}_k \left( x^1_j + x^2_j + \gamma^1_j y^1_j \right) + \left( \beta^{1j}_k - \alpha^{1j}_k b^{1j}_k \right) z^2_j \qquad \forall j \in \hat{E}^1,\ k \in K^1 \tag{3.2}$$


Table 3.7 – Available cost and equipment sizes for semi-continuous units. Costs are in 1000
U.S.$. Data updated with the CE index: year 1998, 389.5; year 2012, 584.6.

Item                          Source                    Size    Cost
Micro/ultrafilter (m²)        Harrison et al., 2015     5       150
                                                        15      175
                                                        30      210
                                                        55      230
Centrifuge (1000 m²)          Harrison et al., 2015     10      60
                                                        50      85
                                                        100     190
                                                        210     505
Homogenizer (L/h)             Harrison et al., 2003     55      15
                                                        105     38
                                                        290     53
                                                        700     83
                                                        2100    110
                                                        4500    155
Chromatographic column (L)    Harrison et al., 2015     1       8
                                                        2       10
                                                        4       12
                                                        8       17
                                                        15      20
                                                        35      90
                                                        65      190
                                                        150     240
                                                        250     300
                                                        400     400
                                                        580     600
                                                        1000    800
                                                        1600    1100


Table 3.8 – Downstream processing stages that make up a multiproduct biotechnological batch
plant that produces 4 different recombinant proteins synthesized in E. coli and S. cerevisiae as
intra- and extracellular products.

Stage Description P1 P2 P3 P4
1 fer Fermentation x x x x
2 mf1 Concentration x
3 cnt1 Cell harvest x x
4 tnk1 Cell wash x
5 cnt2 Cell concentration x
6 hom1 Cell lysis x x x
7 mf2 Conc./ Diafiltration x
8 rct1 Cellular inactivation x
9 hom2 Homogenization x
10 cnt3 Concentration x
11 cnt4 IB harvest x
Fermentation

12 tnk2 IB wash x
13 cnt5 IB capture x
14 tnk3 IB wash x
15 cnt6 IB capture x x
16 tnk4 IB suspension x x
17 cnt7 Extraction x
18 rct2 Reduction x
19 rct3 Oxidation x x
20 rct4 Oxidation titration x
21 mf3 Concentration x x
22 cnt8 Cell slurry x
23 mf4 Clarification x
24 chr1 SP-Sepharose FF x x x
25 mf5 Conc./ Diafiltration x
26 chr2 HIC x
27 tnk5 Solubilization x
28 cnt9 Concentration x
29 rct5 Reduction x
30 chr3 Gel filtration x x
31 rct6 Oxidation x
32 uf1 Concentration x
33 chr4 RP-HPLC x
Purification

34 tnk6 Precipitation x
35 cnt10 Concentration x
36 chr5 Q-Sepharose FF x x
37 chr6 Q-Sepharose HP x
38 tnk7 Dilution x x
39 mf6 Concentration x
40 chr7 Gel filtration x x
41 chr8 HIC x x
42 chr9 SP-Sepharose FF x
43 chr10 SP-Sepharose HP x
44 mf7 Conc./ Diafiltration x x x x

where $\alpha^{1j}_k$, $b^{1j}_k$ and $\beta^{1j}_k$ are the parameters used to build the set of linear constraints and $K^1$ is
the set of cutting points used.


For these batch stages, the original cost functions are given by equations such as (3.3).

$$\text{cost} = c^1_j \exp\left( x^1_j + x^2_j + \gamma^1_j y^1_j \right) \qquad \forall j \in \hat{E}^1 \tag{3.3}$$
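
The role of the cutting points can be illustrated numerically. The sketch below assumes that each linear constraint is the tangent of $c \exp(w)$ at a cutting point $b_k$, with $w$ standing for the exponent in (3.3), so that $\alpha_k = \beta_k = c \exp(b_k)$. This choice of parameters, the evenly spaced cutting points and the range of $w$ are assumptions made for illustration; they are not taken from the model data.

```python
import math


def tangent_cuts(c: float, w_min: float, w_max: float, n_cuts: int):
    """Return (alpha_k, beta_k, b_k) for tangents of c*exp(w) at evenly spaced b_k.

    Assumption: alpha_k is the slope and beta_k the function value at b_k,
    so each cut reads v >= alpha_k * w + (beta_k - alpha_k * b_k).
    """
    step = (w_max - w_min) / (n_cuts - 1)
    cuts = []
    for k in range(n_cuts):
        b_k = w_min + k * step
        value = c * math.exp(b_k)
        cuts.append((value, value, b_k))  # for exp, slope equals value
    return cuts


def lower_approximation(cuts, w: float) -> float:
    """Smallest v that satisfies every cut at w (stage selected, z = 1)."""
    return max(alpha * w + (beta - alpha * b) for alpha, beta, b in cuts)


def worst_relative_gap(c: float, w_min: float, w_max: float, n_cuts: int) -> float:
    """Largest relative underestimate of c*exp(w) on a fine grid of w values."""
    cuts = tangent_cuts(c, w_min, w_max, n_cuts)
    worst = 0.0
    for i in range(2001):
        w = w_min + (w_max - w_min) * i / 2000
        exact = c * math.exp(w)
        worst = max(worst, (exact - lower_approximation(cuts, w)) / exact)
    return worst


if __name__ == "__main__":
    # Hypothetical range for w; c is the fermenter coefficient of Table 3.6.
    for n_cuts in (32, 64, 128, 256):
        gap = worst_relative_gap(c=1491.0, w_min=0.0, w_max=8.0, n_cuts=n_cuts)
        print(f"{n_cuts:4d} cutting points: worst relative gap {100 * gap:.4f}%")
```

Because the exponential is convex, every tangent lies below it, so the maximum over the cuts never overestimates the cost and the underestimate shrinks as cutting points are added.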
For the particular case of stirred tanks in set $\bar{E}^1$ (either as a single batch step or as part
of a semi-continuous stage), the cost data found (see Table 3.6) follow a two-piece function
that was modeled using constraint (3.4).

$$\text{cost} = c^{1s}_j \exp\left( x^3_j + \gamma^{1s}_j y^1_j \right) + c^{1b}_j \exp\left( x^3_j + \gamma^{1b}_j y^1_j \right) \qquad \forall j \in \bar{E}^1 \tag{3.4}$$

where $x^3_j = x^1_j + x^2_j$ if the stirred tank is a batch stage, or $x^3_j = x^2_j$ if the tank is part of a
semi-continuous stage.
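When stirred tanks form part of semi-continuous or chromatographic stages, variables $v^1_j$
and $v^2_j$ are constrained with equations such as (3.5) and (3.6).

$$v^1_j \ge \alpha^{1sj}_k \left( x^3_j + \gamma^{1s}_j y^1_j \right) + \left( \beta^{1sj}_k - \alpha^{1sj}_k b^{1sj}_k \right) z^2_j \qquad \forall j \in \bar{E}^1,\ k \in K^1 \tag{3.5}$$

$$v^1_j \ge \alpha^{1bj}_k \left( x^3_j + \gamma^{1b}_j y^1_j \right) + \left( \beta^{1bj}_k - \alpha^{1bj}_k b^{1bj}_k \right) z^2_j \qquad \forall j \in \bar{E}^1,\ k \in K^1 \tag{3.6}$$

with $\alpha^{1bj}_k$, $\alpha^{1sj}_k$, $b^{1bj}_k$, $b^{1sj}_k$, $\beta^{1bj}_k$ and $\beta^{1sj}_k$ being the parameters used to build the set of linear
constraints and $K^1$ the set of cutting points used.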
Finally, the costs of semi-continuous and chromatographic units are constrained using
equations such as (3.7). $c^{3*}_j$, in dimensions equivalent to the cost of other equipment, is calculated
in the same way as $y^3_j$ in constraints (3.8) and (3.9).

$$v^3_j \ge \alpha^{3j}_k \left( x^1_j + x^2_j + c^{3*}_j \right) + \left( \beta^{3j}_k - \alpha^{3j}_k b^{3j}_k \right) z^2_j \qquad \forall j \in E^2 \cup E^3,\ k \in K^3 \tag{3.7}$$

Constraints
Sizing and timing constraints are basically the same as those presented by Sandoval et al.
(2016). The major differences are given by the introduction of binary variables for the selection
of the available equipment sizes and costs.
As previously stated, semi-continuous items, including chromatographic columns, are
only available in a discrete number of sizes. To model this type of selection, variables
$z^5_{jk}$ were introduced into the model. They are restricted by constraints (3.8) and (3.9), which
are of the same type as those used to define the number of parallel units in each stage.

$$y^3_j = \sum_{k \in K^8} z^5_{jk} \ln(u_k) \qquad \forall j \in E^* \tag{3.8}$$

$$\sum_{k \in K^8} z^5_{jk} = z^2_j \qquad \forall j \in E^* \tag{3.9}$$

where $u_k$ is the $k$-th element of the set of available equipment sizes, $U_{E^*}$, defined for each subset
of stages $E^*$, that is to say, centrifuges, micro/ultrafilters, homogenizers and chromatographic
stages.
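
The encoding in (3.8) and (3.9) can be sketched for a single stage as follows. The function name is hypothetical, and choosing the smallest size that covers a required value is only a stand-in for the decision that the optimizer makes on the basis of cost.

```python
import math

# Available micro/ultrafilter areas in m^2, transcribed from Table 3.7.
FILTER_SIZES_M2 = [5.0, 15.0, 30.0, 55.0]


def select_size(sizes, required, stage_used=True):
    """Return binaries z_k and log-size y for the smallest size >= required.

    Mirrors (3.8)-(3.9): y = sum_k z_k * ln(u_k) and sum_k z_k = z2, where
    z2 is 1 if the stage is used and 0 otherwise.
    """
    z = [0] * len(sizes)
    if stage_used:
        feasible = [k for k, u in enumerate(sizes) if u >= required]
        if not feasible:
            raise ValueError("no available size is large enough")
        z[min(feasible, key=lambda k: sizes[k])] = 1
    y = sum(z_k * math.log(u_k) for z_k, u_k in zip(z, sizes))
    return z, y


if __name__ == "__main__":
    z, y = select_size(FILTER_SIZES_M2, required=21.0)
    print("z =", z, "selected size =", math.exp(y), "m^2")
```

When the stage is not used, all binaries are zero and the logarithmic size is zero as well, which is what constraint (3.9) enforces through $z^2_j$.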


Table 3.9 – Data used to estimate size and time factors for Product 1, an intracellular protein
synthesized in S. cerevisiae.

Stage η X f N Jconc NP vs
fer 1.000 1.000 0.000
mf 0.990 7.000 0.000 0 0.200
hom 0.900 1.000 0.100 6
mf 0.991 2.000 0.000 8 0.080
rct 0.950 1.000 0.500
hom 0.810 1.000 0.100 8
cnt 0.850 24.971 10.000 20.0
tnk 0.990 1.000 3.202
cnt 0.850 6.667 0.000 0.5
rct 0.950 1.000 2.000
gf 0.950 1.667 0.000
rct 0.950 1.000 1.000
mf 0.998 1.148 0.000 0 0.030
chr 0.980 1.667 0.000
tnk 0.750 1.000 3.000
cnt 0.850 9.000 0.000 0.5
tnk 0.980 1.000 1.000
gf 0.950 1.667 0.000
mf2 0.993 1.267 0.000 8 0.030

3.3.2 Size and time factors


Given the knowledge of each individual downstream processing stage and of the corresponding
equipment parameters, mass balances were solved and constant size and time factors were
computed. Size factors for product $i$ in stage $j$ (in L/g) for items in batch stages, $S_{ij}$, and
for feed and product tanks, $S^1_{ij}$ and $S^2_{ij}$, respectively, are calculated using equation (3.10):

$$S_{ij} = \frac{V_j}{B_i} \tag{3.10}$$

with $V_j$ [L] being the actual tank volume in Tables 3.2 to 3.5, and $B_i$ [g], equal to $Y^4_i$ in the
nomenclature used by Sandoval et al. (2016), the batch size of product $i$ at the end of
the process, shown in Table 3.1. This last parameter was estimated using the broth volume
in the fermenter (80% of the tank volume), the concentration of product in the fermentation
broth (2.5 g/L for Product 1, 2 g/L for Products 2 and 3, and 3 g/L for Product 4) and the
overall mass yield, given by the product of the yields in each stage, $\eta$, shown in Tables 3.9
to 3.12.
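
The estimate of the final batch size described above can be sketched as follows, using the stage yields of Product 4 and its fermenter volume. The function names are hypothetical, and the result is only an illustration of the calculation; it is not meant to reproduce the values of Table 3.1.

```python
import math

# Stage yields (eta) for Product 4, transcribed from Table 3.12.
P4_STAGE_YIELDS = [1.000, 0.800, 0.992, 0.950, 0.950, 0.950, 0.950, 0.992]


def final_batch_size_g(fermenter_volume_l, titre_g_per_l, stage_yields,
                       broth_fraction=0.8):
    """Estimate B_i [g]: broth volume x product concentration x overall yield."""
    overall_yield = math.prod(stage_yields)
    return broth_fraction * fermenter_volume_l * titre_g_per_l * overall_yield


def size_factor_l_per_g(tank_volume_l, batch_size_g):
    """Size factor S_ij = V_j / B_i from equation (3.10), in L/g."""
    return tank_volume_l / batch_size_g


if __name__ == "__main__":
    # 9125 L fermenter (Table 3.5) and 3 g/L of product in the broth.
    b4 = final_batch_size_g(9125.0, 3.0, P4_STAGE_YIELDS)
    print(f"B_4 = {b4 / 1000:.2f} kg")
    print(f"S for the fermenter = {size_factor_l_per_g(9125.0, b4):.3f} L/g")
```

With these inputs the estimated final batch is about 14 kg, so the fermenter size factor is roughly 0.65 L/g.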
Operation time for each non-batch stage is modelled with equation (3.11):


Table 3.10 – Data used to estimate size and time factors for Product 2, an intracellular protein
synthesized in E. coli.

Stage η X f N Jconc NP vs
fer 1.000 1.000
cnt 0.850 10.000 200.0
hom 0.810 1.000 0.100 7
cnt 0.800 3.000 20.0
tnk 0.950 1.000 2.000
cnt 0.800 4.000 20.0
tnk 0.950 1.000 2.000
cnt 0.800 4.000 20.0
tnk 0.950 1.000 2.000
cnt 0.800 4.000 0.5
rct 0.950 1.000 24.000
rct 0.950 1.000 2.600
mf 0.983 30.000 0 0.030
chr 0.950 1.111
gf 0.950 1.111
chr 0.950 1.111
tnk 0.950 1.000 1.094
mf 0.997 1.417 0 0.030
gf 0.950 1.111
chr 0.950 1.111
mf2 0.991 3.600 8 0.030


Table 3.11 – Data used to estimate size and time factors for Product 3, an intracellular protein
synthesized in E. coli.

Stage η X f N Jconc NP vs
fer 1.000 1.000
cnt 0.800 10.000 200.0
tnk 0.980 1.000 9.000
cnt 0.800 5.000 200.0
hom 0.910 1.000 0.100 7
cnt 0.950 8.000 20.0
tnk 0.950 1.000 0.091
rct 0.950 1.000 24.000
rct 0.950 1.000 0.500
mf 0.987 14.063 0 0.030
chr 0.950 0.909
mf 0.991 2.000 8 0.030
chr 0.950 1.111
chr 0.950 1.111
chr 0.950 1.111
mf2 0.992 1.125 8 0.030

Table 3.12 – Data used to estimate size and time factors for Product 4, an extracellular protein
synthesized in S. cerevisiae.

Stage η X f N Jconc NP vs
fer 1.000 1.000
cnt 0.800 9.000 7000.0
mf3 0.992 6.186 3 0.200
chr 0.950 1.111
chr 0.950 1.111
chr 0.950 1.111
chr 0.950 1.111
mf2 0.992 1.565 8 0.030


$$T_{ij} = T^0_{ij} + T^1_{ij} \frac{B_i}{R_j} \tag{3.11}$$

with $T_{ij}$ [h] being the time needed for step $j$ to process product $i$; $T^0_{ij}$ [h] the equipment startup
time; $T^1_{ij}$ a constant time factor or duty factor; and $R_j$, equal to $Y^3_j$ in the previous nomenclature,
the semi-continuous equipment size in units that depend on the equipment type (see Table 3.7).
Data used to estimate time or duty factors are given in Tables 3.9 to 3.12. In these tables
$\eta$ is a mass yield, $X$ is a volume reduction factor and $f$ is a dilution factor. For membrane
filtration systems, $N$ is the ratio between the buffer volume added and the fixed retentate
volume, and $J_{conc}$, in m³/(m²·h), is the flux in the concentration process. For homogenizers,
$NP$ is the number of passes through the homogenizer; and finally, $v_s$, in $10^3$ mm/h, is the
settling velocity needed in the case of centrifuges.

Cross-flow filtration
This step is used for the removal of suspended particles, the recovery of cells from fermentation
broth and the clarification of homogenates containing cell debris; these operations may be
followed by diafiltration of the retentate (the stream with the larger particles) (Green & Perry,
2007). This last step is essentially a washing step that can be used either to remove more
impurities or, in a clarification process, to increase yield by recovering more product as permeate.
Diafiltration is performed by keeping the level of the feed or retentate tank constant through the
addition of a suitable solvent while the permeate is removed through the membrane (Hearn,
2000).
For a batch operation, the design equation of the filtration unit is given by equation (3.12)
(Green & Perry, 2007):

$$A = \frac{V_0}{t} \left( \frac{1 - \frac{1}{X}}{J_{conc}} + \frac{\frac{N}{X}}{J_{diaf}} \right) \tag{3.12}$$

where $A$ is the membrane area, $V_0$ is the initial retentate volume, $X < 40$ is the volume
reduction factor, given by the ratio between the initial and the final retentate
volumes, and $N$ is the ratio between the buffer volume added and the fixed retentate volume.
$J_{conc}$ is the flux in the concentration process (volumetric permeate flow rate per membrane area)
and $J_{diaf}$ is the flux in the diafiltration process. Values for $J_{conc}$ ranged from 0.2 in the first stages
to 0.03 in the final step (Iribarren et al., 2004). $J_{diaf}$ should be smaller than $J_{conc}$ (Hearn,
2000); its value was taken as $\tfrac{3}{4} J_{conc}$ when necessary.
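Operation time can be written from equation (3.12) as equation (3.13):

$$T_{i,mf}\,[\mathrm{h}] = T^0_{i,mf}\,[\mathrm{h}] + \frac{V_0\,[\mathrm{m^3}]}{B_i\,[\mathrm{kg}]} \left( \frac{1 - \frac{1}{X}}{J_{conc}\left[\frac{\mathrm{m^3}}{\mathrm{m^2 \cdot h}}\right]} + \frac{\frac{N}{X}}{J_{diaf}\left[\frac{\mathrm{m^3}}{\mathrm{m^2 \cdot h}}\right]} \right) \frac{B_i\,[\mathrm{kg}]}{A_{mf}\,[\mathrm{m^2}]} \tag{3.13}$$

Then, the time factor $T^1_{i,mf}$ can be computed using equation (3.14):

$$T^1_{i,mf}\left[\frac{\mathrm{m^2 \cdot h}}{\mathrm{g}}\right] = \frac{V_0\,[\mathrm{m^3}]}{B_i\,[\mathrm{kg}]} \left( \frac{1 - \frac{1}{X}}{J_{conc}\left[\frac{\mathrm{m^3}}{\mathrm{m^2 \cdot h}}\right]} + \frac{\frac{N}{X}}{J_{diaf}\left[\frac{\mathrm{m^3}}{\mathrm{m^2 \cdot h}}\right]} \right) \cdot \frac{1\,[\mathrm{kg}]}{1000\,[\mathrm{g}]} \cdot \frac{1}{0.85} \tag{3.14}$$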
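
A minimal sketch of equations (3.12) and (3.13) follows. The function names and the zero default for the startup time are assumptions; the example inputs combine the last membrane step of Product 3 in Table 3.11 ($X$ = 1.125, $N$ = 8, $J_{conc}$ = 0.03) with a feed volume of 0.401 m³ and a 4 h processing time, which are taken as illustrative values.

```python
def membrane_area_m2(v0_m3, time_h, x, n, j_conc, j_diaf=None):
    """Membrane area from equation (3.12).

    v0_m3: initial retentate volume [m3]; time_h: processing time [h];
    x: volume reduction factor; n: buffer volumes per retentate volume;
    j_conc, j_diaf: fluxes [m3/(m2 h)]. If j_diaf is omitted it is taken
    as 3/4 of j_conc, as in the text.
    """
    if j_diaf is None:
        j_diaf = 0.75 * j_conc
    concentration_term = (1.0 - 1.0 / x) / j_conc
    diafiltration_term = (n / x) / j_diaf
    return v0_m3 / time_h * (concentration_term + diafiltration_term)


def filtration_time_h(v0_m3, area_m2, x, n, j_conc, j_diaf=None, startup_h=0.0):
    """Operation time from equation (3.13) for a given membrane area."""
    # The area needed to finish in one hour equals the required area-hours.
    area_hours = membrane_area_m2(v0_m3, 1.0, x, n, j_conc, j_diaf)
    return startup_h + area_hours / area_m2


if __name__ == "__main__":
    area = membrane_area_m2(0.401, 4.0, x=1.125, n=8, j_conc=0.03)
    print(f"required area: {area:.1f} m^2")
    print(f"time with that area: {filtration_time_h(0.401, area, 1.125, 8, 0.03):.2f} h")
```

For these inputs the diafiltration term dominates, and the required area comes out at roughly 32 m².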


Finally, the yield of this stage is given by equation (3.15a) when the step is used for
concentration and by equation (3.15b) when it is used for clarification. $S_i$ is the
observed solute passage, which can be computed as the ratio between the concentration of the
protein in the permeate and its concentration in the feed stream: $S_i = 0$ for a fully retained
solute and $S_i = 1$ for a fully passing solute (Green & Perry, 2007).

$$\eta_{i,mf} = e^{-S_i (N + \ln X)} \tag{3.15a}$$

$$\eta_{i,mf} = 1 - X^{-S_i} \left( 1 - e^{-S_i N} \right) \tag{3.15b}$$

Centrifugation
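Centrifugation exploits the density difference between the solids and the surrounding fluid
and is often used when solid particles are small and hard to filter (Bell, 1989).
Centrifuge costs are estimated using a $\Sigma$ factor that is equivalent to a cross-sectional area.
Based on the settling velocity of the solid, $v_s$, and the volume to be treated, $V_0$, this factor
can be computed using equation (3.16):

$$\Sigma\,[\mathrm{m^2}] = \frac{V_0\,[\mathrm{m^3}]}{t\,[\mathrm{h}] \cdot v_s\left[\frac{\mathrm{m}}{\mathrm{h}}\right]} \tag{3.16}$$

Then, the time factor $T^1_{i,cnt}$ is computed using equation (3.17):

$$T_{i,cnt}\,[\mathrm{h}] = T^0_{i,cnt}\,[\mathrm{h}] + \frac{V_0\,[\mathrm{m^3}]}{v_s\left[\frac{\mathrm{m}}{\mathrm{h}}\right] \cdot B_i\,[\mathrm{g}]} \cdot \frac{1}{10^6} \cdot \frac{B_i\,[\mathrm{g}]}{\Sigma\,[10^3\,\mathrm{m^2}]} \tag{3.17}$$

from where

$$T^1_{i,cnt}\left[\frac{\mathrm{m^2 \cdot h}}{\mathrm{g}}\right] = \frac{V_0\,[\mathrm{m^3}]}{10^3 \cdot v_s\left[\frac{\mathrm{mm}}{\mathrm{h}}\right] \cdot B_i\,[\mathrm{g}]} \cdot \frac{1}{0.85} \tag{3.18}$$

Based on Hatti-Kaul & Mattiasson (2003), the settling velocity was estimated to be
0.2 mm/h for E. coli and 7 mm/h for S. cerevisiae.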
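
Equation (3.16) can be checked with a short calculation. In the sketch below the function name is hypothetical and the settling velocity is converted from mm/h to m/h before use; the example takes the cell harvest step of Product 2 (15 000 L processed in 5 h, Table 3.3) as input.

```python
# Settling velocities quoted in the text [mm/h].
SETTLING_VELOCITY_MM_PER_H = {"E. coli": 0.2, "S. cerevisiae": 7.0}


def sigma_factor_m2(v0_m3, time_h, settling_velocity_mm_per_h):
    """Sigma factor from equation (3.16), with v_s converted from mm/h to m/h."""
    v_s_m_per_h = settling_velocity_mm_per_h / 1000.0
    return v0_m3 / (time_h * v_s_m_per_h)


if __name__ == "__main__":
    sigma = sigma_factor_m2(15.0, 5.0, SETTLING_VELOCITY_MM_PER_H["E. coli"])
    print(f"Sigma = {sigma:.0f} m^2 = {sigma / 1000:.1f} x 1000 m^2")
```

The result, 15 000 m² or 15 in units of 1000 m², coincides with the centrifuge size listed for that stage in Table 3.3.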

Homogenization
For high-pressure homogenizers, the fraction of protein released depends on the operating
pressure and the number of passes through the homogenizer (Doran, 2012).
According to Pinto et al. (2001), homogenization time is proportional to the volume fed to
the homogenizer, $V_{i,hom}$ (L) $= V_{0,hom} \cdot NP_{i,hom}$, and inversely proportional to the homogenizer
capacity, $Cap_{i,hom}$ (L/h), where $V_{0,hom}$ corresponds to the volume received from the previous
stage plus a 10% extra volume of lysis buffer if needed (Harrison et al., 2003). With this,
equation (3.11) takes the form of equation (3.19):

$$T_{i,hom}\,[\mathrm{h}] = T^0_{i,hom}\,[\mathrm{h}] + \frac{V_{0,hom}\,[\mathrm{L}] \cdot NP_{i,hom}}{B_i\,[\mathrm{g}]} \cdot \frac{B_i\,[\mathrm{g}]}{Cap_{i,hom}\left[\frac{\mathrm{L}}{\mathrm{h}}\right]} \tag{3.19}$$

Therefore, the time factor $T^1_{i,hom}$ can be calculated using equation (3.20),

$$T^1_{i,hom}\left[\frac{\mathrm{L}}{\mathrm{g}}\right] = \frac{V_{0,hom}\,[\mathrm{L}] \cdot NP_{i,hom}}{B_i\,[\mathrm{g}]} \cdot \frac{1}{0.85} \tag{3.20}$$

Based on data given in Bell (1989) and Clonis (1990), the number of passes $NP$ was
taken to be 7 for E. coli and 8 for S. cerevisiae. As no specific kinetic data
were known, yields were assumed to be between 0.8 and 0.9.
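
The homogenization relations (3.19) and (3.20) reduce to two one-line functions. In this sketch the function and parameter names are hypothetical, the startup time defaults to zero as an assumption, and `divisor` is the 0.85 that appears in equation (3.20).

```python
def homogenization_time_h(v0_l, passes, capacity_l_per_h, startup_h=0.0):
    """Operation time from equation (3.19): startup plus pass-through time."""
    return startup_h + v0_l * passes / capacity_l_per_h


def homogenizer_time_factor(v0_l, passes, batch_size_g, divisor=0.85):
    """Time factor T1 from equation (3.20), in L/g."""
    return v0_l * passes / batch_size_g / divisor


if __name__ == "__main__":
    # Illustrative inputs: 1650 L fed, 7 passes (E. coli), 2718 L/h capacity.
    print(f"{homogenization_time_h(1650.0, 7, 2718.0):.2f} h")
```

The batch size cancels in (3.19), so the operation time depends only on the volume fed, the number of passes and the homogenizer capacity.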

Chromatographic separations
As explained by Doran (2012), the basis of chromatography is the selective retardation of
solute molecules during passage through a bed of resin particles.
In the processes studied, two major types of liquid chromatographic separations can be
found: gel filtration and adsorption chromatography.
Adsorption chromatographic columns are sized taking into account the amount of protein
that can be adsorbed onto the column, $B_{i,chr}$, which is in turn related to the final batch size
(Pinto et al., 2001):

$$B_{i,chr}\,[\mathrm{kg}] = V_{chr}\,[\mathrm{m^3}] \cdot \pi_i \cdot \beta_{i,chr}\left[\frac{\mathrm{kg}}{\mathrm{m^3}}\right] = \frac{B_i\,[\mathrm{kg}]}{\prod_{n=chr}^{NE} \eta_{in}} \tag{3.21}$$

with $V_{chr}$ being the column volume; $\beta_{i,chr}$ the column capacity; $\pi_i$ the fraction of the maximum
capacity that is used by the adsorbed protein; and $NE$ the total number of stages.
From this, the column size factor can be computed with equation (3.22):

$$S^3_{i,chr}\left[\frac{\mathrm{L}}{\mathrm{g}}\right] = \frac{1}{\pi_i\, \beta_{i,chr} \prod_{n=chr}^{NE} \eta_{in}} = \frac{V_{chr}\,[\mathrm{L}]}{B_i\,[\mathrm{g}]} \tag{3.22}$$

The operation time of the chromatographic step is given by (3.23):

$$T_{i,chr} = \frac{V_{feed} + V_{wash} + V_{elution} + V_{regeneration}}{A_{chr}\, v_{chr}} \tag{3.23}$$

with $V_{feed}$ the volume of solution containing the protein to be purified, $V_{wash}$ the wash buffer volume
used to eliminate proteins not bound to the resin, $V_{elution}$ the volume of buffer used to recover
the protein and $V_{regeneration}$ the volume of buffer used to regenerate the column.
Considering the time factors, the operation time can be computed using equation (3.24):

$$T_{i,chr}\,[\mathrm{h}] = T^0_{i,chr}\,[\mathrm{h}] + \frac{\left( V_{feed} + V_{elution} \right)[\mathrm{m^3}] \cdot h\,[\mathrm{m}]}{B_i\,[\mathrm{kg}] \cdot v_{chr}\left[\frac{\mathrm{m}}{\mathrm{h}}\right]} \cdot \frac{B_i\,[\mathrm{kg}]}{A_{chr}\,[\mathrm{m^2}] \cdot h\,[\mathrm{m}]} \tag{3.24}$$

If the column height is set to 0.25 m, based on data of Imperatore & Asenjo (2001), and
3 column volumes each are used to wash and to regenerate the resin, then the constant
time factor can be computed as:


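$$T^0_{i,chr}\,[\mathrm{h}] = \frac{6 \cdot A\,[\mathrm{m^2}] \cdot h\,[\mathrm{m}]}{A\,[\mathrm{m^2}] \cdot v\left[\frac{\mathrm{m}}{\mathrm{h}}\right]} = \frac{1.5}{v} \tag{3.25}$$

Finally, factor $T^1_{i,chr}$ is calculated using equation (3.26).

$$T^1_{i,chr}\left[\frac{\mathrm{L \cdot h}}{\mathrm{g}}\right] = \frac{\left( V_{feed} + V_{elution} \right)[\mathrm{m^3}] \cdot h\,[\mathrm{m}]}{B_i\,[\mathrm{kg}] \cdot v_{chr}\left[\frac{\mathrm{m}}{\mathrm{h}}\right]} = \frac{\left( T_{i,chr} - T^0_{i,chr} \right)[\mathrm{h}] \cdot V_{chr}\,[\mathrm{m^3}]}{B_i\,[\mathrm{kg}]} \tag{3.26}$$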
In gel filtration, a constant time was considered independent of the feed stream.
For adsorption chromatography a velocity of 5.5 m/h was considered for ionic-exchange
resins and for hydrophobic resins, 4 m/h. These velocities were defined based on
GE Healthcare Life Sciences handbooks that can be downloaded from their web page
(http://www.gelifesciences.com/ ).
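
The timing relations for adsorption chromatography can be sketched as follows. The function names are hypothetical; the bed height of 0.25 m, the 6 column volumes for washing plus regeneration and the linear velocities are the values stated in the text, while the feed and elution volumes and the column area in the example are illustrative assumptions.

```python
# Linear velocities used in the text [m/h].
LINEAR_VELOCITY_M_PER_H = {"ion exchange": 5.5, "hydrophobic interaction": 4.0}


def fixed_time_h(velocity_m_per_h, bed_height_m=0.25, wash_regen_column_volumes=6.0):
    """Constant time T0 of equation (3.25): wash plus regeneration volumes.

    Each column volume equals A*h, so the area cancels and
    T0 = column_volumes * h / v (1.5 / v for 6 volumes and h = 0.25 m).
    """
    return wash_regen_column_volumes * bed_height_m / velocity_m_per_h


def operation_time_h(feed_m3, elution_m3, area_m2, velocity_m_per_h,
                     bed_height_m=0.25):
    """Total time of equation (3.24): T0 plus feed and elution time."""
    variable = (feed_m3 + elution_m3) / (area_m2 * velocity_m_per_h)
    return fixed_time_h(velocity_m_per_h, bed_height_m) + variable


if __name__ == "__main__":
    for resin, v in LINEAR_VELOCITY_M_PER_H.items():
        print(f"{resin}: T0 = {fixed_time_h(v):.3f} h")
    # Hypothetical column: 1.0 m3 feed, 0.5 m3 elution buffer, 2 m2 area.
    print(f"total: {operation_time_h(1.0, 0.5, 2.0, 4.0):.4f} h")
```

The fixed time is about 0.27 h for ion-exchange resins and 0.375 h for hydrophobic resins, independently of the column diameter.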

3.3.3 Computational tools / Execution environment


The MILP model studied was coded using the AMPL modelling language, and different
instances were solved using the commercial CPLEX solver, version 12.4.0.0. The
execution environment was a single thread on an Intel(R) Xeon(R) CPU
E5620@2.40GHz, with a relative optimality gap of 0.1% and 256 cutting points, for an a
posteriori gap of up to 0.12%.

3.4 Results and Discussion


3.4.1 Cost of the real plant
The real plant comprises both fermentation and purification facilities.
The fermentation facility produces the 4 products in a single production line; therefore, its
cost was estimated based on a plant configuration such as that shown in Table 3.8. The equipment
size for each stage was set to the maximum size needed among the 4 processes; for
semi-continuous items, that size was selected from those available.
The purification facility cost was estimated as the sum of the costs of the 4 suites, each one
sharing the same stages as in Table 3.8. For those units, the largest size among the processes
was considered.
Estimated costs are summarized in Table 3.14 along with a comparison between the
original and the optimized facilities. These results are discussed in Sections 3.4.3 and 3.4.4.

3.4.2 Number of cutting points and a posteriori gaps


As established in previous work (Sandoval et al., 2016) the approach of defining lower and
upper approximations for non-linear functions in constraints and the objective function gives
true upper and lower approximations for the actual optimal plant cost. Therefore a small a
posteriori gap between both approximations is expected.


Table 3.13 – Average execution time and relative a posteriori gaps for the purification facility
instance solved 100 times.

                  Execution time (s)
Cutting points    upper approx.    lower approx.    gap (%)
32                3.78             1.83             0.54
64                5.05             6.46             0.30
128               13.10            13.79            0.13
256               38.65            32.09            0.11

Taking the purification facility as a case study, different numbers of cutting points
were considered to solve the instance: 32, 64, 128 and 256. Execution times and a posteriori
gaps were obtained for a set of 100 runs. Results are shown in Table 3.13. 256 cutting points
were selected to solve the instances studied, since this gives an a posteriori gap very close
to the solver's relative optimality gap and takes less than a minute to solve the instance. For
bigger instances, such as that for the overall multiproduct batch plant, a longer time was
obtained, but it is still small enough (about 6 minutes) considering that the a posteriori gap is
close to 0.11%.

3.4.3 Original purification facility versus corresponding stages in a multi-product batch plant
The original purification facility is divided into 4 suites, and its maximum production
capacity for a time horizon of one year is shown in Table 3.1. With this information
plus the equipment cost data in Tables 3.6 and 3.7, the purification plant,
defined as part of the hypothetical multiproduct batch plant in Table 3.8, was
optimized. The optimization results, together with a comparison between real and optimized costs, are
presented in Table 3.14.
As expected, if the optimization is performed over each individual suite, the optimized
suite is between 22 and 46% less expensive; moreover, if a single production line is
considered, up to 32% of the cost can be saved in comparison to the real case. The difference is
even bigger if the actual production plant is taken into account, with differences of up to 68%
in the equipment costs (data not shown). In that scenario, an optimization model such as the one
presented in this article is not just useful in the evaluation of the project but necessary
to design an optimal and not oversized plant.


Table 3.14 – Comparison between the costs of the original and the optimized facilities
considering the maximum capacity of the original purification plant and a time horizon given
by the number of production weeks. Costs are calculated in U.S.$ based on data for year 2012.

Plant                 Original      Optimized     Difference (%)
Fermentation Plant    7 317 406
Purification Plant    21 487 703    14 561 500    32
  Suite 1             3 494 168     1 893 510     46
  Suite 2             8 010 547     6 182 200     23
  Suite 3             1 511 030     1 117 190     26
  Suite 4             8 417 957     6 619 020     22

3.4.4 Optimization of the “global” multiproduct batch plant of 44 stages
The structure of the multiproduct batch plant proposed in Section 3.2.3 was determined for a
time horizon of 5 904 hours and a production target estimated from the final batch size and
the maximum number of batches in Table 3.1. The structure of the plant was optimized to a
minimum cost of U.S.$ 26 000 900, with the equipment sizes shown in Table 3.15. The estimated
final batch sizes and cycle times for each product are presented in Table 3.16.
Notice that both kinds of duplication, in-phase and out-of-phase, were used in the optimized plant,
reducing the long time needed for the fermentation in the Product 4 process and permitting
small semi-continuous equipment sizes in general.
A comparison to the real plant was not straightforward in this case, given that the
fermentation plant did not produce the 4 proteins within the one-year period. Different time
horizons, based on the production target, were studied, and the results are shown in Table 3.17.
Maximum capacity refers to the production target computed with the maximum number
of batches that can be produced in the purification plant (in Table 3.1), and actual
capacity to the actual number of batches produced in the same facility.
Based on real data on cycle times and the aforementioned capacity, the fermentation plant
should take about 9 436 hours to produce the 4 products. An optimization of the 26 stages
making up that plant over that time horizon and the maximum capacity computed from
Table 3.1 gives a cost of U.S.$ 5 378 660, 26% lower than that calculated from the real sizes.
Finally, real data indicate that the total production of the 4 products, given the capacity
actually used, takes about 15 004 hours. Running the optimization with these data, the plant
cost reaches U.S.$ 9 899 380, about a third of the cost calculated for the real plant.
This result makes the benefits of using this type of model for plant design even more evident.


Table 3.15 – Optimized structure of the multiproduct batch plant over a time horizon of 5
904 hours. The cost function is equal to U.S.$ 26 000 900.

Stage X1 X2 V1 V2 V3 /R
1 1 3 25714.3 . .
2 1 1 16479.7 . 5
3 1 1 25714.3 2571.43 50
4 1 1 21827.9 . .
5 1 1 21827.9 4365.58 50
6 1 1 4802.14 . 4500
7 1 1 2589.67 . 15
8 1 1 1942.25 . .
9 1 1 2136.47 . 700
10 1 1 23501.2 941.141 50
11 1 1 2828.57 942.857 10
12 1 1 2828.57 . .
13 1 1 2828.57 707.143 10
14 1 1 2121.43 . .
15 1 1 4802.14 600.267 50
16 1 1 1591.07 . .
17 3 1 1591.07 397.768 50
18 1 1 16372.3 . .
19 1 3 24558.4 . .
20 1 1 35799.1 . .
21 2 1 35799.1 . 55
22 1 1 18559.3 2062.15 10
23 1 1 2062.15 1728.79 5
24 1 1 1746.38 1921 1000
25 1 1 1921 . 55
26 1 1 1556.07 1400.6 400
27 1 1 3954.67 . .
28 7 1 3954.67 593.171 50
29 1 1 1779.51 . .
30 6 1 1779.51 1067.49 1600
31 1 1 2134.99 . .
32 1 1 2134.99 . 5
33 6 1 1859.75 1115.62 15
34 1 1 4462.5 . .
35 7 1 4462.5 495.833 50
36 8 1 1400.6 1260.67 15
37 2 1 960.498 864.457 400
38 1 1 1821.66 . .
39 1 1 1821.66 . 5
40 2 1 1285.85 1157.28 1600
41 1 1 1157.28 1041.56 1000
42 1 1 1260.67 1134.71 400
43 1 1 778.019 700.224 400
44 1 1 1134.71 . 30


Table 3.16 – Final batch size and cycle time of the 4 products produced in the multiproduct
batch plant optimized over a time horizon of 5 904 hours.

Product Final Batch size (kg) Cycle time (h)


P1 7.916 24
P2 6.410 24
P3 12.831 8.67
P4 28.562 40.67

Table 3.17 – Comparison between the costs of the original and the optimized facilities
considering different production targets and time horizon. Costs are calculated in U.S.$ based
on data for year 2012.

Capacity             Time horizon (h)   Optimized cost (U.S.$)   Difference (%)

Multi-product Batch Plant. Actual cost U.S.$ 28 805 109.
Maximum capacity     5 904              26 000 900                10
                     23 744             12 291 600                57
Actual capacity      5 904              14 826 000                49
                     15 004              9 899 380                66

Fermentation Plant. Actual cost U.S.$ 7 317 406.
Maximum capacity     5 904               8 629 070               -18
                     16 333              5 890 440                20
Actual capacity      5 904               6 163 610                16
                     9 436               5 378 660                26

Purification Plant. Actual cost U.S.$ 21 487 703.
Purification Plant   5 904              14 561 500                32
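
The Difference column of Table 3.17 can be reproduced with a few lines of arithmetic. The sketch below assumes the difference is the saving relative to the actual cost, 100 × (actual − optimized) / actual, rounded to the nearest integer; the table does not state this formula, so it is an assumption, and the variable names are illustrative.

```python
# Actual costs (U.S.$) as stated in Table 3.17.
ACTUAL_COST = {
    "multiproduct": 28_805_109,
    "fermentation": 7_317_406,
    "purification": 21_487_703,
}

# (plant, time horizon in h, optimized cost in U.S.$), in the order of Table 3.17.
OPTIMIZED = [
    ("multiproduct", 5904, 26_000_900),
    ("multiproduct", 23744, 12_291_600),
    ("multiproduct", 5904, 14_826_000),
    ("multiproduct", 15004, 9_899_380),
    ("fermentation", 5904, 8_629_070),
    ("fermentation", 16333, 5_890_440),
    ("fermentation", 5904, 6_163_610),
    ("fermentation", 9436, 5_378_660),
    ("purification", 5904, 14_561_500),
]


def saving_percent(actual, optimized):
    """Saving relative to the actual cost; negative when the optimized plant costs more."""
    return 100.0 * (actual - optimized) / actual


savings = [round(saving_percent(ACTUAL_COST[plant], cost)) for plant, _, cost in OPTIMIZED]
for (plant, horizon, cost), s in zip(OPTIMIZED, savings):
    print(f"{plant:13s} {horizon:6d} h  {cost:11,d}  {s:4d} %")
```

Under this assumption the negative entry for the fermentation plant simply means that the optimized design for maximum capacity costs more than the existing facility.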


3.5 Conclusions
In this article, the model presented in a former article has been modified in order to
apply it to the design of a multiproduct batch plant that produces 4 recombinant proteins
with known processes. The new model includes discrete costs and equipment sizes for
semi-continuous items and preserves the selection of costs and sizes in a continuous range
for stirred tanks, reactors and fermenters.
The application of the model permitted cost savings of up to 66% of the cost of the main
equipment, showing that this tool is not just useful but necessary in order to design a plant
of the optimal and necessary size.
Lower level implementations (in C, C++) could include the effect of variable cost and
production target parameters, but this is beyond the scope of this article and is left for
future investigation. In addition, as the time required to solve each instance is less than
30 seconds, there is plenty of room for adding new and more complex constraints.

Chapter 4

Main conclusions

In this work a scalable approach that can be applied to the design of real multi-product
batch plants is presented.
The proposed methodology was shown to be more numerically stable than alternative
approaches for the same problem, giving true optimal solutions, and in general faster than
the other tested approaches. The developed method takes advantage of two facts: that the
continuous relaxation of the feasible region is convex and bounded (which allows us to
build, up front, inner or outer approximations of the feasible space, and thus to report true
upper/lower bounds for each instance); and that mixed-integer linear solvers are much more
numerically stable and scalable in size than MINLP algorithms.
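
The idea of building linear approximations of a convex region up front can be illustrated on a single convex function. The sketch below is a minimal example, assuming the exponential function as a stand-in for the convex terms of the model; the breakpoints and function names are hypothetical. Tangent lines underestimate a convex function, so replacing g(x) ≤ b by the tangent cuts enlarges the feasible set (outer approximation, which yields a lower bound when minimizing), while chords between breakpoints overestimate it, so they shrink the feasible set (inner approximation, which yields feasible points and an upper bound).

```python
import math


def tangent_cuts(f, df, points):
    """(slope, intercept) of the tangent to a convex f at each given point."""
    return [(df(p), f(p) - df(p) * p) for p in points]


def outer_estimate(cuts, x):
    """Piecewise-linear underestimator: the maximum over all tangent lines."""
    return max(a * x + b for a, b in cuts)


def inner_estimate(f, points, x):
    """Piecewise-linear overestimator: chord between the breakpoints around x."""
    for lo, hi in zip(points, points[1:]):
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            return (1 - w) * f(lo) + w * f(hi)
    raise ValueError("x outside the breakpoint range")


# Hypothetical breakpoints; exp is its own derivative.
breakpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]
cuts = tangent_cuts(math.exp, math.exp, breakpoints)

for x in [-1.5, -0.3, 0.7, 1.9]:
    lower = outer_estimate(cuts, x)
    upper = inner_estimate(math.exp, breakpoints, x)
    print(f"x={x:5.2f}  tangents={lower:.4f}  exp(x)={math.exp(x):.4f}  chords={upper:.4f}")
```

Adding more breakpoints tightens both estimates, which is how a desired precision level could be reached in such a scheme.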
The incorporation of performance profiles, borrowed from the optimization community,
allowed an easy comparison between the proposed MILP formulations and their former
MINLP forms.
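
A performance profile in the sense of Dolan and Moré (2002) can be computed in a few lines. The sketch below is a minimal version: for each instance, the ratio of a solver's time to the best time over all solvers is taken, and the profile value at τ is the fraction of instances whose ratio is at most τ. The solver labels and timings are hypothetical, failures are encoded as infinite time, and every instance is assumed to be solved by at least one solver.

```python
import math


def performance_profile(times, taus):
    """times: {solver: [time per instance]}, with math.inf marking a failure.

    Returns {solver: [fraction of instances with ratio <= tau, for each tau]}.
    """
    solvers = list(times)
    n = len(times[solvers[0]])
    # Best (smallest) time per instance over all solvers.
    best = [min(times[s][p] for s in solvers) for p in range(n)]
    profile = {}
    for s in solvers:
        ratios = [times[s][p] / best[p] for p in range(n)]
        profile[s] = [sum(r <= tau for r in ratios) / n for tau in taus]
    return profile


# Hypothetical running times in seconds for four instances.
example_times = {
    "MILP": [1.0, 2.0, 4.0, 3.0],
    "MINLP": [10.0, math.inf, 2.0, 30.0],
}
example_profile = performance_profile(example_times, taus=[1, 2, 10])
print(example_profile)
```

The value at τ = 1 is the fraction of instances on which a solver is the fastest, and the value for large τ approaches the fraction of instances it solves at all.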
The linear nature of the proposed formulation permitted its modification to include the
selection of equipment sizes from a discrete set of available sizes. The inclusion of both
discrete and continuous sizes allowed for better modeling of a real case scenario, where
tanks and fermenters may be built according to customer needs, and semi-continuous items
such as centrifuges or membrane filtration systems may be found in discrete sets of sizes
given by their manufacturers.
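
The difference between the two kinds of sizing decision can be sketched for a single item taken in isolation. In the example below, a catalog item is picked by enumeration, which is what a set of binary variables with exactly one of them equal to 1 would encode, while a custom-built tank is costed with a power law. The catalog, the cost coefficients and the function names are hypothetical, and in the actual model these choices are coupled through the remaining constraints rather than made one item at a time.

```python
def select_discrete_size(required_capacity, catalog):
    """Cheapest catalog item (capacity, cost) whose capacity covers the requirement."""
    feasible = [item for item in catalog if item[0] >= required_capacity]
    if not feasible:
        raise ValueError("no catalog size is large enough")
    return min(feasible, key=lambda item: item[1])


def continuous_cost(volume, coefficient, exponent):
    """Power-law cost of a unit built to the exact required volume."""
    return coefficient * volume ** exponent


# Hypothetical manufacturer catalog: (capacity, cost in U.S.$).
catalog = [(5, 12_000), (15, 20_000), (50, 45_000), (400, 150_000)]

chosen = select_discrete_size(30, catalog)
print("catalog item:", chosen)

# Hypothetical cost law for a custom-built tank.
print("custom tank cost:", round(continuous_cost(2500.0, 600.0, 0.6)))
```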
The application of the model to the particular case of a plant that produces 4 proteins
in a downstream process of 43 stages showed that up to 66% of the cost of the main
equipment may be saved. This shows that this tool is not just useful but necessary to design
an optimal and not oversized plant. For other cases the application of this methodology
should be straightforward.
Lower level implementations (in C, C++) could permit the study of more complex
constraints such as variable costs and production targets; nevertheless, this is beyond the
scope of this work. In addition, as the model takes at most a few minutes to find
a global optimum, there is plenty of room to make the model more complex, for example by
including operating costs in the objective function.

References

Applegate, D. L., Cook, W., Dash, S., & Espinoza, D. G. (2007). Exact solutions to linear
programming problems. Operations Research Letters, 35, 693–699.

Asenjo, J. A., Montagna, J. M., Vecchietti, A. R., Iribarren, O. A., & Pinto, J. M. (2000).
Strategies for the simultaneous optimization of the structure and the process variables of
a protein production plant. Computers & Chemical Engineering, 24, 2277–2290.

Barbosa-Póvoa, A. P. (2007). A critical review on the design and retrofit of batch plants.
Computers & Chemical Engineering, 31, 833–855.

Bell, G. (1989). Bioseparations: Downstream processing for biotechnology, volume 11. John
Wiley and Sons Inc., New York, NY.

Biegler, L. T., Grossmann, I. E., & Westerberg, A. W. (1997). Systematic methods for
chemical process design. Prentice Hall, Old Tappan, NJ (United States).

Bonami, P. & Lee, J. (2013). BONMIN Users' Manual.

Borisenko, A., Kegel, P., & Gorlatch, S. (2011). Optimal design of multi-product batch
plants using a parallel branch-and-bound method. In V. Malyshkin (Ed.), Lecture Notes in
Computer Science, volume 6873 (pp. 417–430). Springer Berlin Heidelberg.

Bosch, R. & Trick, M. (2005). Integer Programming. In E. K. Burke & G. Kendall
(Eds.), Search methodologies: Introductory Tutorials in Optimization and Decision Support
Techniques. Springer.

Clonis, Y. D. (1990). Separation processes in biotechnology. Process affinity chromatography,
volume 9. CRC Press.

Corsano, G., Aguirre, P. A., & Montagna, J. M. (2009). Multiperiod design and planning of
multiproduct batch plants with mixed-product campaigns. AIChE journal, 55, 2356–2369.

Corsano, G., Iribarren, O. A., Montagna, J. M., Aguirre, P. A., & González Suarez, E. (2005).
Optimization model for the synthesis, design and operation of a batch-semi continuous
ethanol plant. In 2nd Mercosur Congress on Chemical Engineering; 4th Mercosur Congress
on Process Systems Engineering (pp. 1–10).


Crowder, H., Johnson, E. L., & Padberg, M. (1983). Solving large-scale zero-one linear
programming problems. Operations Research, 31, 803–834.

Czyzyk, J., Mesnier, M. P., & Moré, J. J. (1998). The NEOS Server. Computational Science
& Engineering, IEEE, 5, 68–75.

Dietrich, B. L., Escudero, L. F., & Chance, F. (1993). Efficient reformulation for 0–1
programs—methods and computational results. Discrete Applied Mathematics, 42, 147–
175.

Dietz, A., Azzaro-Pantel, C., Pibouleau, L., & Domenech, S. (2005). A Framework for
Multiproduct Batch Plant Design with Environmental Consideration: Application To
Protein Production. Industrial & Engineering Chemistry Research, 44, 2191–2206.

Dolan, E. D. (2001). NEOS Server 4.0 administrative guide. Technical report, Technical
Memorandum ANL/MCS-TM-250, Mathematics and Computer Science Division, Argonne
National Laboratory.

Dolan, E. D. & Moré, J. J. (2002). Benchmarking optimization software with performance
profiles. Mathematical Programming, 91, 201–213.

Doran, P. M. (2012). Bioprocess Engineering Principles, 2nd Edition. Academic Press.

Durand, G. A., Mele, F. D., & Bandoni, J. A. (2012). Determination of storage tanks location
for optimal short-term scheduling in multipurpose/multiproduct batch-continuous plants
under uncertainties. Annals of Operations Research, 199, 225–247.

Durand, G. A., Moreno, M. S., Mele, F. D., Montagna, J. M., & Bandoni, A. (2014).
Comparing the performances of two techniques for the optimization under parametric
uncertainty of the simultaneous design and planning of a multiproduct batch plant.
Iberoamerican Journal of Industrial Engineering, 5, 42–54.

Floudas, C. A. (1995). Nonlinear and mixed-integer optimization: fundamentals and
applications. Oxford University Press.

Fumero, Y., Corsano, G., & Montagna, J. M. (2011). Detailed Design of Multiproduct Batch
Plants Considering Production Scheduling. Industrial & Engineering Chemistry Research,
50, 6146–6160.

Fumero, Y., Corsano, G., & Montagna, J. M. (2012a). Planning and scheduling of multistage
multiproduct batch plants operating under production campaigns. Annals of Operations
Research, 199, 249–268.

Fumero, Y., Montagna, J. M., & Corsano, G. (2012b). Simultaneous design and scheduling
of a semicontinuous/batch plant for ethanol and derivatives production. Computers &
Chemical Engineering, 36, 342–357.

Galiano, F. C. & Montagna, J. M. (1993). Optimal allocation of intermediate storage in
multiproduct batch chemical plants. Mathematical and Computer Modelling, 18, 111–129.


Geißler, B., Martin, A., Morsi, A., & Schewe, L. (2012). Using piecewise linear functions for
solving MINLPs. In Mixed Integer Nonlinear Programming (pp. 287–314). Springer.

Goldberg, D. (1991). What every computer scientist should know about floating-point
arithmetic. ACM Computing Surveys (CSUR), 23, 5–48.

Green, D. & Perry, R. (2007). Perry’s Chemical Engineers’ Handbook, Eighth Edition.
McGraw Hill professional. McGraw-Hill Education.

Gropp, W. & Moré, J. (1997). Optimization environments and the NEOS server.
Approximation theory and optimization, (pp. 167–182).

Grossmann, I. E., Caballero, J. A., & Yeomans, H. (2000). Advances in mathematical
programming for the synthesis of process systems. Latin American Applied Research, 30,
263–284.

Grossmann, I. E. & Guillén-Gosálbez, G. (2010). Scope for the application of mathematical
programming techniques in the synthesis and planning of sustainable processes. Computers
& Chemical Engineering, 34, 1365–1376.

Gupta, S. & Karimi, I. A. (2003). An Improved MILP Formulation for Scheduling
Multiproduct, Multistage Batch Plants. Industrial & Engineering Chemistry Research,
42, 2365–2380.

Harrison, R. G., Todd, P., Rudge, S. R., & Petrides, D. P. (2003). Bioseparations Science
and Engineering. Oxford University Press.

Harrison, R. G., Todd, P. W., Rudge, S. R., & Petrides, D. P. (2015). Bioseparations science
and engineering. Oxford University Press, second edition.

Hatti-Kaul, R. & Mattiasson, B. (2003). Isolation and Purification of Proteins. CRC Press.

Hearn, M. T. (2000). Handbook of Bioseparations, volume 2. Academic Press.

Imperatore, C. & Asenjo, J. A. (2001). Case study in a Multiproduct Multifunctional Plant.
Technical report, Chiron Corporation and University of Chile.

Iribarren, O. A., Montagna, J. M., Vecchietti, A. R., Andrews, B., Asenjo, J. A., & Pinto, J. M.
(2004). Optimal Process synthesis for the production of multiple recombinant proteins.
Computer Aided Chemical Engineering, 18, 427–432.

Koch, T. (2004). The final NETLIB-LP results. Operations Research Letters, 32, 138–142.

Kocis, G. & Grossmann, I. (1989). Computational experience with dicopt solving MINLP
problems in process systems engineering. Computers & Chemical Engineering, 13, 307–315.

Kocis, G. R. & Grossmann, I. E. (1988). Global optimization of nonconvex mixed-integer
nonlinear programming (MINLP) problems in process synthesis. Ind. Eng. Chem. Res.,
27, 1407–1421.


Li, X., Chen, Y., & Barton, P. I. (2012). Nonconvex generalized benders decomposition
with piecewise convex relaxations for global optimization of integrated process design and
operation problems. Industrial and Engineering Chemistry Research, 51, 7287–7299.

Margot, F. (2009). Testing cut generators for mixed-integer linear programming.
Mathematical Programming Computation, 1, 69–95.

Mittelmann, H. (2013). Decision Tree for Optimization Software.

Montagna, J. M., Iribarren, O. A., & Vecchietti, A. R. (2004). Synthesis of Biotechnological
Processes Using Generalized Disjunctive Programming. Industrial & Engineering
Chemistry Research, 43, 4220–4232.

Montagna, J. M., Vecchietti, A. R., Iribarren, O. A., Pinto, J. M., & Asenjo, J. A. (2000).
Optimal design of protein production plants with time and size factor process models.
Biotechnology Progress, 16, 228–237.

Moreno, M. S., Iribarren, O. A., & Montagna, J. M. (2009a). Design of Multiproduct
Batch Plants with Units in Series Including Process Performance Models. Industrial &
Engineering Chemistry Research, 48, 2634–2645.

Moreno, M. S., Iribarren, O. A., & Montagna, J. M. (2009b). Optimal design of multiproduct
batch plants considering duplication of units in series. Chemical Engineering Research and
Design, 87, 1497–1508.

Moreno, M. S. & Montagna, J. M. (2007a). New alternatives in the design and planning of
multiproduct batch plants in a multiperiod scenario. Industrial & engineering chemistry
research, 46, 5645–5658.

Moreno, M. S. & Montagna, J. M. (2007b). Optimal Simultaneous Design and Operational
Planning of Vegetable Extraction Processes. Food and Bioproducts Processing, 85, 360–371.

Moreno, M. S. & Montagna, J. M. (2011). Multiproduct batch plants design using linear
process performance models. AIChE Journal, 57, 122–135.

Moreno, M. S. & Montagna, J. M. (2012). Multiperiod production planning and design of
batch plants under uncertainty. Computers & Chemical Engineering, 40, 181–190.

Moreno, M. S., Montagna, J. M., & Iribarren, O. A. (2007). Multiperiod optimization for the
design and planning of multiproduct batch plants. Computers & Chemical Engineering,
31, 1159–1173.

Moreno-Benito, M., Espuña, A., & Puigjaner, L. (2014). Flexible batch process and
plant design using mixed-logic dynamic optimization: single-product plants. Industrial
& Engineering Chemistry Research, 53, 17182–17199.

Nikolopoulou, A. & Ierapetritou, M. G. (2012). Optimal design of sustainable chemical
processes and supply chains: A review. Computers & Chemical Engineering, 44, 94–103.


Nowak, I. (2005). Overview of Global Optimization Methods. In Relaxation and
Decomposition Methods for Mixed Integer Nonlinear Programming (pp. 121–128). Springer.

Pinto, J. M., Montagna, J. M., Vecchietti, A. R., Iribarren, O. A., & Asenjo, J. A. (2001).
Process performance models in the optimization of multiproduct protein production plants.
Biotechnology and Bioengineering, 74, 451–465.

Pinto-Varela, T., Barbosa-Povoa, A. P. F. D., & Novais, A. Q. (2009). Design and Scheduling
of Periodic Multipurpose Batch Plants under Uncertainty. Industrial & Engineering
Chemistry Research, 48, 9655–9670.

Ponsich, A., Azzaro-Pantel, C., Domenech, S., & Pibouleau, L. (2007). Mixed-integer
nonlinear programming optimization strategies for batch plant design problems. Industrial
& engineering chemistry research, 46, 854–863.

Ravemark, D. E. & Rippin, D. W. T. (1998). Optimal design of a multi-product batch plant.
Computers & Chemical Engineering, 22, 177–183.

Rebennack, S., Kallrath, J., & Pardalos, P. M. (2011). Optimal storage design for a multi-
product plant: A non-convex MINLP formulation. Computers & chemical engineering, 35,
255–271.

Reklaitis, G. V. (1990). Progress and issues in computer-aided batch process design. In Proc.
of the Third Int. Conf. on Foundations of Computer-Aided Process Design, Snowmass
Village, Colorado, July (pp. 10–14).: CACHE/Elsevier.

Rippin, D. (1993). Batch process systems engineering: A retrospective and prospective
review. Computers & Chemical Engineering, 17, S1–S13.

Robinson, J. D. & Loonkar, Y. R. (1972). Minimizing capital investment for multi-product
batch-plants. Process Technology, 17, 861.

Salomone, H. E., Montagna, J. M., & Iribarren, O. A. (1994). Dynamic Simulations in the
Design of Batch Processes. Computers & Chemical Engineering, 18, 191–204.

Samsatli, N. J. & Shah, N. (1996a). An Optimization Based Design Procedure for Biochemical
Processes: Part I: Preliminary Design and Operation. Food and Bioproducts Processing,
74, 221–231.

Samsatli, N. J. & Shah, N. (1996b). An Optimization Based Design Procedure for Biochemical
Processes: Part II: Detailed Scheduling. Food and Bioproducts Processing, 74, 232–242.

Sandoval, G., Espinoza, D., Figueroa, N., & Asenjo, J. A. (2016). MILP reformulations for
the design of biotechnological multi-product batch plants using continuous equipment sizes
and discrete host selection. Computers & Chemical Engineering, 84, 1–11.

Srinivasan, B., Bonvin, D., Visser, E., & Palanki, S. (2003). Dynamic optimization of batch
processes: II. Role of measurements in handling uncertainty. Computers & Chemical
Engineering, 27, 27–44.


Vásquez-Alvarez, E., Lienqueo, M. E., & Pinto, J. M. (2001). Optimal Synthesis of Protein
Purification Processes. Biotechnology Progress, 17, 685–696.

Verderame, P. M., Elia, J. A., Li, J., & Floudas, C. A. (2010). Planning and scheduling
under uncertainty: a review across multiple sectors. Industrial & engineering chemistry
research, 49, 3993–4017.

Vielma, J. P. (2013). Mixed integer linear programming formulation techniques.

Viswanathan, J. & Grossmann, I. (1990). A combined penalty function and outer-
approximation method for MINLP optimization. Computers & Chemical Engineering,
14, 769–782.

Voudouris, V. T. & Grossmann, I. E. (1992). Mixed-integer linear programming
reformulations for batch process design with discrete equipment sizes. Industrial &
Engineering Chemistry Research, 31, 1315–1325.

Voudouris, V. T. & Grossmann, I. E. (1993). Optimal synthesis of multiproduct batch plants
with cyclic scheduling and inventory considerations. Industrial & Engineering Chemistry
Research, 32, 1962–1980.

Wang, Z., Jia, X.-P., & Shi, L. (2010). Optimization of multi-product batch plant
design under uncertainty with environmental considerations. Clean Technologies and
Environmental Policy, 12, 273–282.

Yi, G. & Reklaitis, G. V. (2011). Optimal design of multiperiod batch-storage network
including transportation processes. AIChE Journal, 57, 2821–2840.

Appendices

Appendix A

Published article

Computers and Chemical Engineering 84 (2016) 1–11


MILP reformulations for the design of biotechnological multi-product batch plants using
continuous equipment sizes and discrete host selection

G. Sandoval (a), D. Espinoza (b), N. Figueroa (c), J.A. Asenjo (a,*)
(a) Center of Biotechnology and Bioengineering, CeBiB, Departamento de Ingeniería Química y Biotecnología, Universidad de Chile, Santiago, Chile
(b) Departamento de Ingeniería Industrial, FCFM, Universidad de Chile, Santiago, Chile
(c) Instituto de Economía, Pontificia Universidad Católica de Chile, Santiago, Chile
(*) Corresponding author. Tel.: +56 229784723; fax: +56 226991084.
E-mail addresses: gsandova@ing.uchile.cl (G. Sandoval), daespino@dii.uchile.cl
(D. Espinoza), nicolasf@uc.cl (N. Figueroa), juasenjo@ing.uchile.cl (J.A. Asenjo).

Article history: Received 20 August 2014. Received in revised form 24 July 2015.
Accepted 1 August 2015. Available online 19 August 2015.

Keywords: Multi-product batch plant; MINLP; MILP; Production path

Abstract
In this article we present a new approach, relying on mixed-integer linear programming
(MILP) formulations, for the design of multi-product batch plants with continuous sizes for
processing units and host selection. The main advantage of the proposed approach is its
scalability, which allows us to solve realistic instances within reasonable precision
requirements. Furthermore, we show that many other alternatives are either numerically
unstable (for the problem sizes that we are interested in), unable to solve large instances,
or much slower than the proposed method. We present extensive computational experiments,
which show that we are able to solve almost all tested instances and that, on average, we
are ten times faster than alternative approaches. As we use a high level implementation
language (AMPL) we should get further time improvements if lower level implementations
are used (C, C++).
Reproducibility of our results can be tested using our models and data, available on-line at
BPLIB (http://www.dii.uchile.cl/ daespino/).

http://dx.doi.org/10.1016/j.compchemeng.2015.08.001
0098-1354/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Conventional multi-product batch process literature using an optimization-based approach
models the design and synthesis of such plants with Mixed-Integer Non-Linear Programming
(MINLP) formulations (Floudas, 1995). The usual objective is to minimize the investment
cost subject to the fulfillment of the production targets of a given set of products. Major
drawbacks are given by the combinatorial nature of mixed-integer programming and possible
nonconvexities due to non-linearities. In computational optimization, numerical issues of
these formulations given by rounding errors, numerical instabilities and approximation
errors are well-documented (Goldberg, 1991; Koch, 2004; Margot, 2009; Vielma, 2013).
Since Robinson and Loonkar (1972) different procedures have been proposed to tackle these
problems (Reklaitis, 1990; Rippin, 1993; Barbosa-Póvoa, 2007; Verderame et al., 2010;
Nikolopoulou and Ierapetritou, 2012), but which method is more efficient for a particular
example is hardly predictable (Ponsich et al., 2007), and nowadays the development of
effective solution approaches and algorithms remains very necessary (Grossmann and
Guillén-Gosálbez, 2010).
The logarithmic change of variables proposed by Kocis and Grossmann (1988) linearizes most
of the functions and leads to a convex MINLP problem, an approach used by Ravemark and
Rippin (1998) and Montagna et al. (2000) among others. Another approach, chosen by Pinto
et al. (2001) and Ponsich et al. (2007) among others, is the use of specially designed solvers
which can usually find good feasible solutions by the use of heuristic procedures (Grossmann
et al., 2000). In practice the best off-the-shelf solvers for this kind of problem are the open
source codes BONMIN and SCIP and the commercial solvers BARON and DICOPT, which stand
out in Mittelmann's benchmarks for optimization software (Mittelmann, 2013). Nevertheless
none of them guarantees convergence to a global optimum, converging in some instances to
local optima or not converging altogether. For the particular case of BARON and DICOPT,
performance failures are reported for non-convex models (Ponsich et al., 2007; Rebennack
et al., 2011; Li et al., 2012); nevertheless, even in cases where theoretically the algorithms
work, we found that in practice they do not converge to the global optimum. We have run
precise experiments that demonstrate these failures in convex MINLP formulations (see
Section 2).
It is a fact that there is a huge gap between Mixed-Integer Linear Programming (MILP or
MIP) and MINLP solvers technology (Nowak,
2005). Nowadays mixed-integer linear techniques are fast, robust and able to provide
solutions to problems with up to millions of variables (Geißler et al., 2012). Taking
advantage of this, Voudouris and Grossmann (1992) used reformulation schemes to develop
MILP models for the preliminary design of multi-product batch plants, introducing binary
variables for the selection of discrete available equipment sizes. From this point, other
decisions were added to the design decisions, such as synthesis, production planning and
scheduling (Voudouris and Grossmann, 1993); design and planning in a multiperiod scenario
(Moreno and Montagna, 2007); design of multi-product batch plants considering duplication
of units in series (Moreno et al., 2009); and the design and planning of multi-product batch
plants using mixed-product campaigns (Corsano et al., 2009). Most recently these MILP
formulations have been used to account for the design and scheduling of this type of plants
(Fumero et al., 2011, 2012a,b) and for the design under uncertainty considering different
types of decisions (Durand et al., 2012, 2014; Moreno and Montagna, 2012; Moreno-Benito
et al., 2014).
A key feature in these design problems is the use of Big-M constraints to account for
selection decisions despite being problematic (Bosch and Trick, 2005). Some authors that
have included this type of constraints in their formulations are Gupta and Karimi (2003),
Corsano et al. (2009), Moreno et al. (2009), Moreno and Montagna (2012). These authors
have found that the value of the Big-M parameters has a tremendous impact on the solution
time; see for example Moreno et al. (2007). In addition it has been shown experimentally
that other methods, such as the convex hull formulation presented by Montagna et al. (2004),
are better suited to account for selection decisions.

Nomenclature

Indices
h             host
i             product
j             stage
m             number of duplicated units
UP, LO        upper and lower bounds

Sets
E1            set of batch stages, ⊂ E
E2            set of semi-continuous stages, ⊂ E
E3            set of chromatographic stages, ⊂ E
E             set of all stages
H             set of hosts
I             set of products
M             set of available units operating in parallel in-phase or out-of-phase
R             set of routes: stages j needed to process the product i synthesized by host h
U             set of available hosts h for product i synthesis

Parameters
ı             time horizon
c_jn, jn      cost coefficients related to Y_jn with n ∈ {1, 2, 3}
d_i           production target for product i
S_ijn, s_ijn  size factor for product i in stage j related to Y_jn. s_ijn = ln(S_ijn), n ∈ {1, 2, 3}
T_ij0, t_ij0  time factor for product i in the batch or chromatographic stage j. t_ij0 = ln(T_ij0)
T_ij1, t_ij1  time factor for product i in the semi-continuous or chromatographic stage j. t_ij1 = ln(T_ij1)

Variables
v             slack variable
X_jn, x_jn    number of units operating in parallel in-phase (X_j1) and out-of-phase (X_j2) in stage j. x_jn = ln(X_jn)
Y_j1, y_j1    volumetric capacity for batch units and retentate or feed tank for semi-continuous or chromatographic stages. y_j1 = ln(Y_j1)
Y_j2, y_j2    volumetric capacity for permeate of product tanks for semi-continuous or chromatographic stages. y_j2 = ln(Y_j2)
Y_j3, y_j3    capacity of semi-continuous items. y_j3 = ln(Y_j3)
Y_i4, y_i4    batch size for final product i. y_i4 = ln(Y_i4)
Y_i5, y_i5    cycle time for product i. y_i5 = ln(Y_i5)
z^1_ih        binary variable: 1 if protein i is synthesized by host h; 0 otherwise
z^2_j         binary variable: 1 if stage j is used to process at least one of the products
z^3_jm        binary variable: 1 if m units are operating in parallel in-phase in stage j; 0 otherwise
z^4_jm        binary variable: 1 if m units are operating in parallel out-of-phase in stage j; 0 otherwise

Fig. 1. Basic techniques used to model synthesis and design decisions considering
continuous equipment sizes and discrete host selection.

In this paper we develop a robust methodology to solve the design problem of a
biotechnological multi-product batch plant in situations where equipment can be
manufactured according to customer needs, such as fermentors or tanks in general. To do
that, we develop a MILP formulation which does not rely on the use of Big-M constraints
and does not use a discrete range of equipment sizes. We use four basic techniques (see
Fig. 1): First, an extension of the non-linear (but convex) formulation proposed by Kocis
and Grossmann (1988) is applied. Secondly, to deal with non-linear convex inequalities, we
construct a priori linear outer (or inner) approximations of them, which allow us to compute
(a posteriori) true feasible solutions and lower (or upper) bounds.
Thirdly, to deal with integer variables, we used advanced reformulation techniques coming
from the mixed-integer-programming literature (clique constraints). Finally, once the initial
problem is transformed into a standard mixed-integer programming problem, it is possible
to take advantage of mature commercial MIP solvers.
This approach, at least in our experiments, is more stable numerically, scalable, and faster
to solve than current alternatives and can deal with the more general problem of jointly
selecting equipment sizes and alternative production paths for multiple products. Using
our approach, it is possible to quickly and accurately compute solutions at any desired
precision level. In our extensive computational experiments (see Fig. 11) we found that
current non-linear solvers only solved 43% of the instances generated for this study, while
our approach was able to solve over 95% of the studied instances in a running time that, on
average, was more than ten times faster than MINLP solvers in equivalent and standard
MINLP formulations. To make these comparisons we introduce the performance profiles, a
methodology borrowed from the optimization literature.
The rest of this paper is organized as follows. In Section 2 typical drawbacks found with a
commonly used MINLP solver and the standard MINLP formulation are presented. In Section
3 classic and novel formulations for the design problem are described. Relevant information
about the methodology used to benchmark different formulations and to avoid numerical
instabilities is given in Section 4, and computational results are presented and discussed in
Section 5. Finally, the conclusions are presented in Section 6.

2. Current limitations

Our main objective is finding a robust and scalable methodology for the design of
biotechnological multi-product batch plants considering equipment sizing (design decisions)
and selecting the downstream processing stages (synthesis decisions). Given the complexities
that to date have been added to the original design problem, we decided to go back to the
problem studied by Iribarren et al. (2004), where only design and synthesis decisions are
modeled. In their paper they designed a biotechnological batch plant for the production of
four recombinant proteins, i, where each can be synthesized by two different hosts, h, having
four microorganisms in total. In addition, three of the fifteen processing stages, j, may be
performed by two different unit operations, d. In their formulation they used constant size
(S_ijdh) and time (T_ijdh) factors to model each stage; considered duplication of units in
parallel in-phase, G_jd, and out-of-phase, M_jd, in order to diminish either the equipment
sizes V_j or the cycle times TL_i, respectively; and used Big-M constraints to account for the
selection of hosts and equipment.
As a correctness test, we took the example presented in Iribarren et al. (2004) and split it
into 16 different instances that only allow equipment selection. Then we tested two different
yet equivalent formulations based on their model but removing host selection. To further
isolate the results obtained from problems with non-standard local settings, the GAMS
modelling language was used and the experiments were run on the NEOS server (Gropp and
Moré, 1997; Czyzyk et al., 1998; Dolan, 2001), available at http://www.neos-server.org. In
the first formulation (C1) the selection of hosts was eliminated by limiting the set of
available hosts, H_i, to just one per protein, and in the second (C2), by setting the values of
the selection binary variables to 1 for the selected hosts and 0 for those non-selected. If the
solution is being found by the solvers, we should observe two things:

(a) both model formulations (C1) and (C2) give the same solution, and
(b) the minimum of the separated instances is equivalent to the global minimum of the
    problem with host selection.

Contrary to what we expected, differences in the objective function value between the (C1)
and (C2) formulations went from 1% to 78% in the 16 instances studied (data not shown),
and even more striking, the solver finds a local minimum which is worse than those found
for most instances without host selection.
These numerical instabilities seem to be aggravated with size, since it is known that
DICOPT works fine for small instances. This situation is in accordance with the results
obtained by Ponsich et al. (2007). In order to show these differences in sizes we built Table 1
to compare the number of constraints and variables involved in the smaller instances of the
cases (C1) and (C2), which were incorrectly solved according to the aforementioned results,
with the size of a smaller instance that was correctly solved by a classic formulation (P1)
that solves an equipment sizing problem similar to that presented by Iribarren et al. (2004),
but with no selection of hosts or equipment, and using the DICOPT solver. This last
formulation is presented in Section 3.1.1.

Table 1
Comparison of the number of constraints and variables of some selected instances solved
using models that account for selection with Big-M constraints, models (C1) and (C2), and
a classic formulation for design decisions only, (P1). All instances were solved using the
DICOPT solver.

Model   Variables               Constraints             Status of solution
        Discrete   Continuous   Linear   Non-linear
(C1)    480        185          303      10             Incorrect
(C2)    512        323          682      18             Incorrect
(P1)    360        71           179      9              Correct

3. Problem formulation

Two major contributions are presented in this section. First, clique constraints are
introduced to formulate the discrete part of the model, allowing the selection of the
production path without the use of Big-M constraints, in models (P2) and (P4). Second, a
new approach, in Section 3.2, handles non-linearities using standard reformulation
techniques from the optimization field, which permits the use of linear solvers, leading to
more reliable results and faster computing time. The relation among the four different
models studied is shown in Fig. 2.

Fig. 2. Formulations compared in this article. Model (P1) is the most basic formulation that
only includes design decisions. Model (P2) includes the selection of the downstream
processes without the use of Big-M constraints. Models (P3) and (P4) are the transformed
models of (P1) and (P2), respectively, using our proposed inner and outer approximations.

3.1. MINLP formulation

3.1.1. The equipment-sizing problem (P1)

In this section we present the most basic formulation for the design of biotechnological
multi-product batch plants, as only equipment sizing and duplication of units in parallel are
considered.
The plant consists of a sequence of batch, semi-continuous and chromatographic stages used
to manufacture different products i;
4 G. Sandoval et al. / Computers and Chemical Engineering 84 (2016) 1–11

where semi-continuous as well as chromatographic stages are composed of the semi-continuous items plus feed and product tanks. At each stage $j$ there are $X_j^2$ groups of units operating in parallel out-of-phase, and each group consists of $X_j^1$ units operating in-phase. For semi-continuous or chromatographic stages, feed and product tanks can only be duplicated out-of-phase. Single production campaigns are considered and batches are transferred from one stage to the next without delay (zero-wait policy).

The objective is to minimize the investment cost of the main equipment of the plant (see Eq. (1)) given fixed production targets, $d_i$, over a time horizon $\delta$.

\[
\min\ \mathrm{cost} = \sum_{j \in E_1} X_j^1 X_j^2 c_j^1 \left(Y_j^1\right)^{\gamma_j^1} + \sum_{j \in E_2 \cup E_3} \left[ X_j^2 c_j^1 \left(Y_j^1\right)^{\gamma_j^1} + X_j^2 c_j^2 \left(Y_j^2\right)^{\gamma_j^2} + X_j^1 X_j^2 c_j^3 \left(Y_j^3\right)^{\gamma_j^3} \right] + v\delta \tag{1}
\]

Variables $Y_j^{\cdot}$ represent the different equipment sizes. Parameters $c_j^{\cdot}$ and $\gamma_j^{\cdot}$ are cost coefficients distinctive for each kind of equipment, and $v$ is a slack variable included to ensure feasibility (Montagna et al., 2004).

Making the change of variables introduced by Kocis and Grossmann (1988), in which each lower-case variable or parameter is the natural logarithm of its upper-case counterpart (for example, $x_j^1 = \ln X_j^1$), we get the new objective function (2).

\[
\min\ \mathrm{cost} = \sum_{j \in E_1} c_j^1 \exp\left(x_j^1 + x_j^2 + \gamma_j^1 y_j^1\right) + \sum_{j \in E_2 \cup E_3} \left[ c_j^1 \exp\left(x_j^2 + \gamma_j^1 y_j^1\right) + c_j^2 \exp\left(x_j^2 + \gamma_j^2 y_j^2\right) + c_j^3 \exp\left(x_j^1 + x_j^2 + \gamma_j^3 y_j^3\right) \right] + v\delta \tag{2}
\]

At each stage and for each product, the size of the units must allow the processing of the incoming batch, which can be split among $X_j^1$ units so as not to exceed the upper bound on equipment capacity. In batch stages this constraint can be written as Eq. (3a), which is convexified in Eq. (3b).

\[
Y_j^1 \ge \frac{S_{ij}^1 Y_i^4}{X_j^1} \quad \forall i \in I,\ j \in E_1 \tag{3a}
\]

\[
y_j^1 + x_j^1 \ge s_{ij}^1 + y_i^4 \quad \forall i \in I,\ j \in E_1 \tag{3b}
\]

As in semi-continuous or chromatographic stages duplication is allowed just for semi-continuous items, feed and product tanks are sized using constraints (4) and (5).

\[
y_j^1 \ge s_{ij}^1 + y_i^4 \quad \forall i \in I,\ j \in E_2 \cup E_3 \tag{4}
\]

\[
y_j^2 \ge s_{ij}^2 + y_i^4 \quad \forall i \in I,\ j \in E_2 \cup E_3 \tag{5}
\]

Chromatographic columns have to process the incoming batch, and both duplication in-phase and out-of-phase are allowed. Duplication in-phase is modeled in size constraint (6), since it permits smaller units, and duplication out-of-phase is reflected in the time constraints.

\[
y_j^3 + x_j^1 \ge s_{ij}^3 + y_i^4 \quad \forall i \in I,\ j \in E_3 \tag{6}
\]

The cycle time for each product $i$, $Y_i^5$, is defined as the time elapsed between the production of two consecutive batches and is given by the largest operating time, $T_{ij}$, among the stages in the process. This time can be decreased if duplication of units out-of-phase is used:

\[
Y_i^5 \ge \frac{T_{ij}}{X_j^2} \quad \forall i \in I,\ j \in E \tag{7}
\]

As batch stages operate for a fixed time, $T_{ij}^0$, the cycle time constraint in its convex form is given by Eq. (8):

\[
y_i^5 + x_j^2 \ge t_{ij}^0 \quad \forall i \in I,\ j \in E_1 \tag{8}
\]

Semi-continuous stages, on the other hand, operate during a time that depends on the final batch size, $Y_i^4$. For those stages the cycle time is constrained as in Eq. (9).

\[
Y_i^5 \ge \frac{T_{ij}^1 \dfrac{Y_i^4}{X_j^1 Y_j^3}}{X_j^2} \quad \forall i \in I,\ j \in E_2 \tag{9a}
\]

\[
y_i^5 + x_j^2 \ge t_{ij}^1 + y_i^4 - x_j^1 - y_j^3 \quad \forall i \in I,\ j \in E_2 \tag{9b}
\]

Lastly, chromatographic stages are modeled considering both fixed and variable operation times, leading to the highly non-linear constraint (10).

\[
Y_i^5 \ge \frac{T_{ij}^0 + T_{ij}^1 \dfrac{Y_i^4}{X_j^1 Y_j^3}}{X_j^2} \quad \forall i \in I,\ j \in E_3 \tag{10a}
\]

\[
y_i^5 + x_j^2 \ge \ln\left( \exp\left(t_{ij}^0\right) + \exp\left(t_{ij}^1 + y_i^4 - x_j^1 - y_j^3\right) \right) \quad \forall i \in I,\ j \in E_3 \tag{10b}
\]

Production targets for all products, $d_i$, must be satisfied within the time horizon $\delta$.

\[
\sum_{i \in I} \frac{d_i Y_i^5}{Y_i^4} \le \delta + v\delta \tag{11a}
\]

\[
\sum_{i \in I} \frac{d_i}{\delta} \exp\left(y_i^5 - y_i^4\right) \le 1 + v \tag{11b}
\]

Finally, variables for duplication in-phase, $X_j^1$, are restricted to integer values using constraints (12) and (13), where $z_{jm}^3$ are binary variables and $M$ is the set of available numbers of units to operate in parallel in-phase. The same is valid for the variables for duplication out-of-phase, $X_j^2$.

\[
x_j^1 = \sum_{m \in M} z_{jm}^3 \ln(m) \quad \forall j \in E \tag{12}
\]

\[
\sum_{m \in M} z_{jm}^3 = 1 \quad \forall j \in E \tag{13}
\]

Appropriate upper and lower bounds are also considered for all of the variables.

3.1.2. The design problem with selection of routes and equipment sizing (P2)

More recent models (like Iribarren et al. (2004)) take into account the joint selection of the production processes, including selection of hosts and equipment. Their formulation uses classical Big-M constraints. Since these constraints are known to be problematic (Bosch and Trick, 2005), in this work we propose a different way to formulate the integer part of the problem, replacing the Big-M constraints by clique constraints. With this formulation all constraints are ignored except the upper bound on the variables (Dietrich et al., 1993). Model (P2) includes the sizing of the equipment and the duplication of units in parallel working in-phase and out-of-phase, and accounts for the selection of the global process by selecting routes, which are defined as the series of unit operations used to purify

a protein given a certain host that synthesizes it. In this way, once the product-host pair, $(i, h)$, is selected, the set of stages forming the process is fixed.

The objective function becomes:

\[
\min\ \mathrm{cost} = \sum_{j \in E_1} z_j^2 c_j^1 \exp\left(x_j^1 + x_j^2 + \gamma_j^1 y_j^1\right) + \sum_{j \in E_2 \cup E_3} z_j^2 \left[ c_j^1 \exp\left(x_j^2 + \gamma_j^1 y_j^1\right) + c_j^2 \exp\left(x_j^2 + \gamma_j^2 y_j^2\right) + c_j^3 \exp\left(x_j^1 + x_j^2 + \gamma_j^3 y_j^3\right) \right] + v\delta \tag{14}
\]

Since some stages can be unused and just one route per protein can be selected, we introduce two binary variables, $z_{ih}^1$ and $z_j^2$: $z_{ih}^1$ is equal to 1 when synthesis host $h$ is selected for product $i$ and 0 otherwise, and $z_j^2$ is 1 when stage $j$ is used to process at least one of the products and 0 otherwise. Constraint (15) forces the choice of exactly one host $h$ to produce protein $i$, and constraint (16) permits stage $j$ to be used only if at least one product needs it to be processed.

\[
\sum_{h:\,(i,h) \in U} z_{ih}^1 = 1 \quad \forall i \in I \tag{15}
\]

\[
z_j^2 \ge z_{ih}^1 \quad \forall (i, h, j) \in R \mid (i, h) \in U \tag{16}
\]

For chromatographic stages, constraints take the form of Eqs. (17)–(20), which are trivially satisfied if host $h$ is not selected to produce protein $i$ ($z_{ih}^1 = 0$). When host $h$ is selected for protein $i$ ($z_{ih}^1 = 1$) and stage $j$ has to be performed to process product $i$, then $z_j^2 = 1$ and the constraints are the same as in the previous formulation (Section 3.1.1).

\[
y_j^1 z_j^2 \ge s_{ihj}^1 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \quad \forall (i, h, j) \in R,\ j \in E_3 \tag{17}
\]

\[
y_j^2 z_j^2 \ge s_{ihj}^2 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \quad \forall (i, h, j) \in R,\ j \in E_3 \tag{18}
\]

\[
y_j^3 z_j^2 + x_j^1 z_j^2 \ge s_{ihj}^3 z_{ih}^1 + y_{ih}^4 z_{ih}^1 \quad \forall (i, h, j) \in R,\ j \in E_3 \tag{19}
\]

\[
y_{ih}^5 z_{ih}^1 + x_j^2 z_j^2 \ge \ln\left( \exp\left(t_{ihj}^0\right) + \exp\left(t_{ihj}^1 + y_{ih}^4 - x_j^1 - y_j^3\right) \right) z_{ih}^1 \quad \forall (i, h, j) \in R,\ j \in E_3 \tag{20}
\]

If stage $j$ is not necessary for the process ($z_j^2 = 0$), then equipment sizes are set to 0 with constraints such as (21), and no unit is assigned to that stage (constraint (22)):

\[
y_j^{1,\mathrm{LO}} z_j^2 \le y_j^1 \le y_j^{1,\mathrm{UP}} z_j^2 \quad \forall j \in E \tag{21}
\]

\[
\sum_{m \in M} z_{jm}^3 = z_j^2 \quad \forall j \in E \tag{22}
\]

Finally, in the planning horizon constraint (23) only the terms associated with the selected host per protein are considered.

\[
\sum_{(i,h) \in U} \frac{d_{ih}}{\delta}\, z_{ih}^1 \exp\left(y_{ih}^5 - y_{ih}^4\right) \le 1 + v \tag{23}
\]

3.2. Mixed-integer linear formulations

To obtain more accurate solutions in a reasonable running time, especially for larger instances, we present a MILP reformulation, which can be solved using any commercial MILP solver. These models are basically equal to their MINLP counterparts, but the non-linear objective and time constraints are replaced with sets of linear functions which give arbitrarily good lower or upper approximations of their respective original functions. The actual optimal solution lies between both approximations, and the precision level is given by the number of cutting points selected to generate the set of linear functions that replaces each non-linear function and by the actual selection of the approximation points, for example equispaced or non-equispaced. In this way the accuracy of the solution can be as high as desired at the cost of longer computing time.

3.2.1. Inner and outer approximations

Given a convex constraint in one variable, $g(x) \le 0$, and a set of points $\{\hat{x}_k\}_{k=1,\ldots,n}$ in the domain of $g$, it is easy to see that:

\[
\left\{ x \mid g(\hat{x}_k) + \nabla g(\hat{x}_k)(x - \hat{x}_k) \le 0,\ k = 1, \ldots, n \right\} \supseteq \left\{ x \mid g(x) \le 0 \right\} \tag{24}
\]

and

\[
\left\{ x \;\middle|\; g(\hat{x}_k) + \frac{g(\hat{x}_{k+1}) - g(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \le 0,\ k = 1, \ldots, n-1 \right\} \subseteq \left\{ x \mid g(x) \le 0 \right\}, \tag{25}
\]

which allows for straightforward lower and upper approximations of $g$. Using this fact, it is easy to find inner and outer approximations of the problems (P1) and (P2).

In fact, for each non-linear constraint of the form $g_j(x) \le 0$, and considering an arbitrary set of cutting points in the domain, $\{\hat{x}_k\}_{k=1,\ldots,n}$, using the set of constraints

\[
g_j(\hat{x}_k) + \nabla g_j(\hat{x}_k)(x - \hat{x}_k) \le 0 \quad k = 1, \ldots, n \tag{26}
\]

leads to a larger feasible set, as can be seen in Fig. 3a. On the other hand, using the set of constraints

\[
g_j(\hat{x}_k) + \frac{g_j(\hat{x}_{k+1}) - g_j(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \le 0 \quad k = 1, \ldots, n-1 \tag{27}
\]

leads to a smaller feasible set, as can be seen in Fig. 3b.

In the same way, the minimization of the cost objective function, $f(x)$, can be replaced by

\[
\begin{aligned}
\min\ & v \\
\text{s.t.}\ & v \ge f(\hat{x}_k) + \nabla f(\hat{x}_k)(x - \hat{x}_k) \quad k = 1, \ldots, n
\end{aligned} \tag{28}
\]

which leads, together with an outer approximation of the constraints, to a lower bound of the true cost. The objective function can also be replaced by

\[
\begin{aligned}
\min\ & v \\
\text{s.t.}\ & v \ge f(\hat{x}_k) + \frac{f(\hat{x}_{k+1}) - f(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}(x - \hat{x}_k) \quad k = 1, \ldots, n-1
\end{aligned} \tag{29}
\]

which leads, together with an inner approximation of the constraints, to an upper bound of the true cost.

In what follows, $\alpha_k$ denotes $\nabla f(\hat{x}_k)$ in the lower approximation or $\frac{f(\hat{x}_{k+1}) - f(\hat{x}_k)}{\hat{x}_{k+1} - \hat{x}_k}$ in the upper approximation, $b_k = \hat{x}_k$ and $\beta_k = f(\hat{x}_k)$.

3.2.2. Reformulation for the equipment-sizing problem (P3)

The assumptions for this model are the same as those for the MINLP proposed in Section 3.1.1.

Objective function. The cost functions of Eq. (2) are individually linearized using the approximations given in Section 3.2.1, which leads to Eq. (30):

\[
\min\ \mathrm{cost} = \sum_{j \in E_1} v^{1j} + \sum_{j \in E_2 \cup E_3} \left( v^{1j} + v^{2j} + v^{3j} \right) \tag{30}
\]
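The inner and outer approximations of Section 3.2.1 can be illustrated numerically. The following is a minimal Python sketch, not the authors' implementation: it builds tangent cuts, as in Eq. (28), and chord cuts, as in Eq. (29), for the exponential function and checks that they bracket it from below and above. The interval [0, 3], the 17 equispaced cutting points and all function names are illustrative assumptions.

```python
import math

def tangent_cuts(f, df, points):
    """Outer (lower) approximation: one tangent per cutting point b_k.
    Each cut is (alpha_k, b_k, beta_k) with alpha_k = f'(b_k), beta_k = f(b_k)."""
    return [(df(b), b, f(b)) for b in points]

def chord_cuts(f, points):
    """Inner (upper) approximation: one chord per pair of consecutive points."""
    cuts = []
    for b0, b1 in zip(points, points[1:]):
        alpha = (f(b1) - f(b0)) / (b1 - b0)
        cuts.append((alpha, b0, f(b0)))
    return cuts

def piecewise_max(cuts, x):
    """Smallest v satisfying v >= alpha_k * (x - b_k) + beta_k for all k."""
    return max(alpha * (x - b) + beta for alpha, b, beta in cuts)

def equispaced(lb, ub, n):
    step = (ub - lb) / (n - 1)
    return [lb + k * step for k in range(n)]

# Illustrative data (assumptions, not values from the paper).
LB, UB, N = 0.0, 3.0, 17
points = equispaced(LB, UB, N)
outer = tangent_cuts(math.exp, math.exp, points)
inner = chord_cuts(math.exp, points)

# Check the bracketing outer(x) <= exp(x) <= inner(x) on a fine grid.
grid = equispaced(LB, UB, 301)
lower_ok = all(piecewise_max(outer, x) <= math.exp(x) + 1e-9 for x in grid)
upper_ok = all(piecewise_max(inner, x) >= math.exp(x) - 1e-9 for x in grid)
max_gap = max(piecewise_max(inner, x) - piecewise_max(outer, x) for x in grid)
print(lower_ok, upper_ok, max_gap)
```

Increasing the number of cutting points shrinks `max_gap`, at the cost of more linear constraints per non-linear term.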

Fig. 3. Feasible region (patterned area) of (a) outer and (b) inner approximations (dashed lines) of an exponential function (solid line). Points $b_i$ are the cutting points, and LB and UB are the lower and upper bounds of $x$.

Constraints. The constraints for batch and semi-continuous stages and for the binary variables for duplication of units in this MILP model are the same as those in the MINLP model shown in Section 3.1.1.

Chromatographic stages. The size constraints for feed and product tanks and the column size constraint are the same as those in the MINLP model shown in Section 3.1.1. Time constraint (31) is obtained from the linearization of Eq. (10):

\[
y_i^5 + x_j^2 \ge \alpha_k^{6ij} \left( y_i^4 - x_j^1 - y_j^3 - b_k^{6ij} \right) + \beta_k^{6ij} \quad \forall i \in I,\ j \in E_3,\ k \in K_6 \tag{31}
\]

Planning horizon. From the linearization of Eq. (11), constraint (32) is obtained:

\[
\sum_{i \in I} v^{7i} \le 1 \tag{32}
\]

Auxiliary variables. The cost functions in the objective function are linearized as shown in Eq. (33), and the planning horizon constraint is linearized as shown in Eq. (34):

\[
v^{1j} \ge \alpha_k^{1j} \left( x_j^1 + x_j^2 + \gamma_j^1 y_j^1 - b_k^{1j} \right) + \beta_k^{1j} \quad \forall j \in E_1,\ k \in K_1 \tag{33}
\]

\[
v^{7i} \ge \frac{d_i}{\delta} \alpha_k^{7i} \left( y_i^5 - y_i^4 - b_k^{7i} \right) + \frac{d_i}{\delta} \beta_k^{7i} \quad \forall i \in I,\ k \in K_7 \tag{34}
\]

3.2.3. Reformulation for the design problem considering route selection and equipment sizing (P4)

Similar to the model presented in Section 3.2.2, this model was built based on its MINLP counterpart, and most of the constraints remain the same.

The objective function is the same as that of model (P3), and for stage design the only differences are in the time constraints of chromatographic stages. Applying the inner or outer approximations to Eq. (20), constraint (35) is obtained:

\[
y_{ih}^5 + x_j^2 \ge \alpha_k^{6ihj} \left( y_{ih}^4 - x_j^1 - y_j^3 \right) + \left( \beta_k^{6ihj} - \alpha_k^{6ihj} b_k^{6ihj} \right) z_{ih}^1 \quad \forall (i, h, j) \in R,\ j \in E_3,\ k \in K_6 \tag{35}
\]

This set of equations, together with constraints of the type of (36) for variables $y_j^1$, $y_j^2$, $y_{ih}^4$, $y_{ih}^5$, $X_j^1$ and $X_j^2$, models the same situation as in (P2).

\[
y_j^{3,\mathrm{LO}} z_j^2 \le y_j^3 \le y_j^{3,\mathrm{UP}} z_j^2 \quad \forall j \in E_3 \tag{36}
\]

If $z_j^2 = 0$, constraint (35) is trivially satisfied. On the other hand, if $z_{ih}^1 = 0$, constraint (35) becomes:

\[
x_j^2 \ge \alpha_k^{6ihj} \left( -x_j^1 - y_j^3 \right) \quad \forall (i, h, j) \in R,\ j \in E_3,\ k \in K_6 \tag{37}
\]

Since $\alpha_k^{6ihj}$ is a positive parameter, constraint (37) is always satisfied only if $|x_j^1| > |y_j^3|$ or if both variables are greater than or equal to 0. To ensure this, data preprocessing is necessary. As $X_j^1$ is always bigger than 0, we normalized the variables $Y_j^3$ by their lower bound. More details are given in Section 4.3.3.

For the planning horizon constraint, little difference is found between (32) and (38). The latter takes host selection into account:

\[
\sum_{(i,h) \in U} v^{7ih} \le 1 \tag{38}
\]

Finally, the constraints for the auxiliary variables $v^{1j}$, $v^{2j}$, $v^{3j}$ and $v^{7ih}$ differ from those of problem (P3) to account for route selection:

\[
v^{1j} \ge \alpha_k^{1j} \left( x_j^1 + x_j^2 + \gamma_j^1 y_j^1 \right) + \left( \beta_k^{1j} - \alpha_k^{1j} b_k^{1j} \right) z_j^2 \quad \forall j \in E_1,\ k \in K_1 \tag{39}
\]

\[
v^{1j} \le z_j^2\, v_j^{1,\mathrm{UP}} \quad \forall j \in E_1 \tag{40}
\]

\[
v^{7ih} \ge \frac{d_i}{\delta} \alpha_k^{7ih} \left( y_{ih}^5 - y_{ih}^4 \right) + \frac{d_i}{\delta} \left( \beta_k^{7ih} - \alpha_k^{7ih} b_k^{7ih} \right) z_{ih}^1 \quad \forall (i, h) \in U,\ k \in K_7 \tag{41}
\]

\[
v^{7ih} \le z_{ih}^1\, v_{ih}^{7,\mathrm{UP}} \quad \forall (i, h) \in U \tag{42}
\]

If $z_j^2 = 0$, constraints (39)–(42) are trivially satisfied, and if $z_{ih}^1 = 0$, then constraints (41) and (42) are trivially satisfied. Finally, if $z_{ih}^1 = 1$ and $z_j^2 = 1$, then constraints (39)–(42) are the same as those in the MILP formulation without route selection.

4. Methods

4.1. Solvers and modelling language

For MINLP problems the open source BONMIN 1.5 and SCIP 3.0.1 solvers were studied. In our computational tests SCIP uses SoPlex

1.7.1 as the LP solver, and BONMIN (with its default algorithm, B-Hyb) uses Cbc 2.7.1 as the MIP solver and Ipopt 3.10.0 with MUMPS as the linear solver. For BONMIN we tested 3 of the 5 available algorithms: B-Hyb, the default algorithm; B-Ecp, a specific parameter setting of B-Hyb that can be faster in some cases (Bonami and Lee, 2013); and B-OA, using CPLEX as the MILP solver, which according to Mittelmann (2013) can be faster for convex instances. In preliminary studies, solvers such as KNITRO and COUENNE were also tested on our MINLP formulations, but their performance on our simplest instances was poorer than that of the selected solvers.

For MILP problems the commercial CPLEX solver, version 12.4.0.0, was used, as it is one of the top performers in the literature (Mittelmann, 2013).

All models were coded using the AMPL modelling language.

4.2. Execution environment

Each instance was executed using a single thread on an Intel(R) Xeon(R) CPU E5620@2.40GHz with a running time limit of 48 hours, a relative optimality gap of 0.1% for models (P3) and (P4) and 2% for (P1) and (P2), and a maximum memory usage of 6 GB of RAM.

The difference in the prescribed optimality gap for MILP and MINLP solvers comes from the fact that MINLP problems are solved to find the actual minimum of the cost function within a defined optimality gap, which is therefore an a priori optimality gap, while MILP models find true upper and lower bounds for the actual cost function, leading to an a posteriori optimality gap that is computed afterwards. As will be shown in Section 5.2, this difference ensures that our results are comparable.

4.3. Methodology

4.3.1. Instances

To compare different approaches, two sets of instances, with randomly generated data between given reasonable upper and lower bounds, were built: “sizing instances”, to compare the simpler models (P1) and (P3), and “routing instances”, to compare the more complex models (P2) and (P4). We considered different numbers of proteins to be produced (4–6), numbers of stages forming the process (11–65), numbers of routes to synthesize the product (20–65) and cost coefficient values (1–110% of nominal values).

4.3.2. Benchmarking

In order to compare the model-solver pairs studied in this work, we bring to process engineers a tool that was introduced in the optimization field by Dolan and Moré (2002) to compare different optimization software: the performance profile.

As stated by Dolan and Moré (2002), the performance profile for a solver is the “cumulative distribution for a performance metric”, for example computing time. In this way, things like how many instances a solver is able to solve given some stopping criteria, such as those shown in Section 4.2, or how fast it solves different instances of the same type of problem, can be seen graphically.

As an example of how to read these plots, in Fig. 4a it can be seen that when using 17 cutting points 40% of the instances were solved to an a posteriori optimality gap of up to 2%, while using 33 cutting points leads to an optimality gap under 0.5% for the same fraction of instances.

Fig. 4. Comparison of performance profiles of (a) relative optimality gap obtained a posteriori and (b) the logarithm of the running time of “sizing instances” solved with linear model (P3) using 17, 33 and 65 cutting points for lower and upper approximations with an optimality relative gap of 0.1%.

4.3.3. Data pre-processing

It is known that large-scale zero-one problems are hard combinatorial optimization problems (Crowder et al., 1983; Koch, 2004; Applegate et al., 2007), which is why data preprocessing is necessary to obtain reliable solutions. The use of tight bounds and the normalization of the variables are necessary to decrease numerical errors.

Although not all of our instances are big enough to need data preprocessing, all were subjected to the same treatment:

• All variable bounds and parameters associated with variables $Y_j^{\cdot}$ and $Y_i^{\cdot}$ were normalized by their respective lower bounds.
• Size and time factors were normalized and made dimensionless considering the respective associated units. For example, as size factors for tanks have units of volume divided by batch size, these parameters are made dimensionless by multiplying by the lower bound of the final batch size and dividing by the respective tank lower bound.
• Lower bounds for the cycle time were tightened using the time constraints, and upper bounds for the final batch size were tightened using the size constraints.

5. Results and discussion

In this section we show the robustness of our proposed MILP transformations and their superiority over classic MINLP formulations with Big-M constraints using performance profiles, a methodology borrowed from the optimization literature. Unlike MINLP formulations, our approach is able to find correct solutions in realistic situations, and it does so in a small fraction of the time required by those approaches. Major implications of these features are the exactness of the solutions, which makes this information reliable for decision-making, and, as the time reduction is significant, the possibility of testing numerous alternatives with the same formulation or with more complex models that may address the combination of different types of decisions.

This presentation is organized as follows: first, we describe the instances generated for comparison; then we discuss the selection of the cutting points for the proposed approach; and finally, we compare MINLP and MILP formulations in terms of their performance solving the sets of instances, using time as the metric.

5.1. Size of instances

To compare the most basic and easy to solve problems, (P1) and (P3), a set of 186 instances (“sizing instances”) was generated, varying the number of proteins to be produced (2–6), the number of stages forming each process (11–35) and the cost coefficient values (1–110% of nominal values). Sizes of these instances in terms

of number of variables and constraints are shown in Tables 2 and 3, where “Small” corresponds to an example of one of the smaller instances solved with the different models and “Medium1” to an example of the bigger instances solved for these two models. As can be seen in both tables, the new auxiliary variables and the sets of linear functions generated to replace the non-linear constraints make the problem 7 to 12 times bigger in terms of linear constraints when 33 cutting points are used for linearization, with an increase of about 50% in continuous variables. However, as we will see later, this increase in variables and constraints leads to smaller execution times and more accurate results.

Table 2
Sizes of sample instances solved using non-linear model (P1).

            Variables                 Constraints
Instance    Discrete    Continuous    Linear    Non-linear
Small       264         57            139       9
Medium1     840         157           407       5

Table 3
Sizes of sample instances solved using linear model (P3) with 33 cutting points for linear inner approximation.

            Variables                 Constraints
Instance    Discrete    Continuous    Linear    Non-linear
Small       264         87            1357      –
Medium1     840         237           3096      –

To test models (P2) and (P4), as they were posed to solve more complex scenarios, a set of 249 new and bigger instances (“routing instances”) was generated, varying the number of proteins to be produced (4–6), the number of stages forming the global process (18–65) and the number of routes available to produce the proteins (20–40). Sizes of these instances in terms of number of variables and constraints are shown in Tables 4 and 5, where “Medium2” corresponds to an example of one of the smaller “routing instances” solved with models (P2) and (P4) and “Large” to an example of the bigger instances solved in this work. As can be seen in both tables, in comparison with Tables 2 and 3, the number of discrete variables increases by 5% with the addition of the selection variables, $z$, and the number of linear constraints increases by 15% for (P4) and roughly doubles for (P2). As we will see later, this addition permits the resolution of more complex scenarios while not affecting execution time or optimality gap in comparison with the more basic formulation.

Table 4
Sizes of sample instances solved using non-linear model (P2).

            Variables                 Constraints
Instance    Discrete    Continuous    Linear    Non-linear
Small       279         57            251       9
Medium1     881         157           719       5
Medium2     488         152           1262      22
Large       1794        613           15766     462

Table 5
Sizes of sample instances solved using linear model (P4) with 33 cutting points for linear inner approximation.

            Variables                 Constraints
Instance    Discrete    Continuous    Linear    Non-linear
Small       279         87            1539      –
Medium1     881         237           3602      –
Medium2     488         229           4829      –
Large       1794        926           46489     –

5.2. Selection of cutting points

Contrary to the MINLP problems (P1) and (P2), which are solved to an a priori optimality gap, models (P3) and (P4) give true upper and lower bounds for the actual cost function of each instance and therefore an optimality gap that is computed a posteriori. Fig. 4a shows the performance profiles of the gaps obtained a posteriori for the “sizing instances” solved with 17, 33 and 65 cutting points, which generate 16, 32 and 64 linear functions for the inner approximations, respectively. Here we can see that our linear model is able to solve all instances with a maximum gap of 5% in less than 16 seconds when using 17 cutting points, and with a gap of less than 0.5% in less than 64 seconds when using 65 cutting points. The running time profiles can be seen in Fig. 4b. If 65 cutting points had been selected, the optimality gap for non-linear solvers would have been around 0.5%, as that is the worst gap obtained a posteriori with CPLEX (Fig. 4a). Given this, in our final experiments we use a set of 33 cutting points, since this option gives the largest improvement in gap relative to the increase in execution time, and a slightly bigger optimality gap criterion of 2% was selected for non-linear solvers.

Once the number of points is selected, the specific values of these points must be chosen. The most obvious choice is equispaced points, which, for the (relevant) exponential function $e^x$, generates small errors for low values of $x$ and large errors for high values. Another alternative is to use expression (43), where $N$ is the total number of cutting points, including $-\infty$ as $x_1$, and $\bar{x}$ is the upper bound of $x$. This is a good approximation in order to minimize the maximum value of the error (see Fig. 5).

\[
x_k = 2 \ln\left( \frac{k-1}{N-1} \right) + \bar{x} \quad \forall k \in 2, \ldots, N \tag{43}
\]

We can see in Fig. 5 that the choice of equispaced points leads to better approximations for low values of $x$, but much worse ones for high values. For our numerical experiments, equispaced points work better: while execution time remains the same for both approaches, a posteriori gaps were slightly worse for non-equispaced points (Fig. 6).

This leaves open important questions about the optimal point selection to improve the precision of upper and lower approximations. Our preliminary simulations seem to indicate that giving substantial attention to smaller values of $x$ could significantly improve the results, but this is left for further research.

Fig. 5. Comparison of absolute errors using 2 different sets of 33 cutting points, where f(x) are the linear functions used to approximate the exponential function between 2 cutting points.
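The two point-selection rules discussed in Section 5.2 can be compared with a short numerical experiment. This is a hedged sketch rather than the authors' procedure: it generates 33 equispaced points and 33 points from Eq. (43), then measures the largest absolute error of the chords of $e^x$ between consecutive points. The interval [-7, 0], the replacement of $x_1 = -\infty$ by the finite lower bound, and all function names are assumptions made for illustration.

```python
import math

def equispaced_points(lb, ub, n):
    """n equally spaced cutting points on [lb, ub]."""
    step = (ub - lb) / (n - 1)
    return [lb + k * step for k in range(n)]

def eq43_points(lb, ub, n):
    """Cutting points from Eq. (43): x_k = 2 ln((k-1)/(N-1)) + x_bar, k = 2..N.
    Eq. (43) puts x_1 at -infinity; here x_1 is replaced by the finite
    lower bound lb and points below lb are dropped (an assumption)."""
    pts = [2.0 * math.log((k - 1) / (n - 1)) + ub for k in range(2, n + 1)]
    return [lb] + [p for p in pts if p > lb]

def max_chord_error(points, samples=50):
    """Largest value of chord(x) - exp(x) over all consecutive point pairs."""
    worst = 0.0
    for a, b in zip(points, points[1:]):
        slope = (math.exp(b) - math.exp(a)) / (b - a)
        for s in range(1, samples):
            x = a + (b - a) * s / samples
            worst = max(worst, math.exp(a) + slope * (x - a) - math.exp(x))
    return worst

# Illustrative bounds (assumptions, not values from the paper).
LB, UB, N = -7.0, 0.0, 33
uniform = equispaced_points(LB, UB, N)
spaced43 = eq43_points(LB, UB, N)
err_uniform = max_chord_error(uniform)
err_eq43 = max_chord_error(spaced43)
print(f"equispaced: {err_uniform:.5f}  Eq. (43): {err_eq43:.5f}")
```

Under these assumptions the Eq. (43) points give the smaller maximum absolute error. This concerns only the approximation error of the exponential function; it says nothing about the a posteriori gaps of the full models, which the authors report to be slightly better with equispaced points.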

Fig. 6. Comparison of performance profiles of a posteriori gaps obtained using 33 cutting points to solve model (P4), where f(x) are the linear functions used to approximate the exponential function between 2 cutting points. Time limit was set to 12 h.

5.3. Equipment sizing: comparison of problems (P1) and (P3)

We tested three different combinations of solvers-models: the linear model (P3) was solved using CPLEX as solver, while the non-linear model, (P1), was solved using SCIP and 3 of the 5 algorithms that are available in BONMIN, which were chosen based on the BONMIN users' manual and Mittelmann's benchmarking information (Mittelmann, 2013). All of the instances were solved using the stopping criteria and conditions mentioned in Section 4.2.

Fig. 7 shows the performance profile of running time using BONMIN-Hyb, BONMIN-Ecp, BONMIN-OAcpx, SCIP and our CPLEX-based approach. From this, we can see that problem (P1) was solved faster using any BONMIN algorithm than using the SCIP solver. Moreover, SCIP only worked well in about 75% of the instances, while BONMIN is able to solve 85% of the instances using the OA algorithm and 100% of the studied instances using either the Ecp or the Hyb algorithm. On the other hand, model (P3) solved using CPLEX is able to solve all of the instances studied, as BONMIN does, but in much less time than problem (P1). As B-Ecp and B-Hyb seem to be equally good for those instances that take longer to be solved, we decided to use as the performance metric the ratio of the computing time of the model-solver versus the best time of all of the model-solvers, denoted by . Those performance profiles are plotted in Fig. 8, where we can see even more clearly that the CPLEX-based approach is always better than all of the other options and that the Ecp algorithm is always better than Hyb for the requested optimality gap.

Fig. 7. Comparison of performance profiles of the logarithm of running time of “sizing instances” solved using models (P1) and (P3) with an optimality relative gap of 0.1% for the linear solver and 2% for non-linear solvers.

Fig. 8. Comparison of performance profiles of the logarithm of the ratio of the computing time of the pair model-solver versus the best time of the pairs model-solvers for “sizing instances” solved with models (P1) and (P3) with an optimality relative gap of 0.1% for the linear solver and 2% for non-linear solvers.

5.4. Routes selection: comparison of problems (P2) and (P4)

As a test of correctness, models (P2) and (P4) were solved using the generated “sizing instances”, where just one route is available to produce each product. Contrasting these results with those obtained by (P1) and (P3), it can be seen in Fig. 9 that both formulations, for equipment sizing and considering routes, are consistent, solving the same number of instances in virtually the same amount of time. Performance profiles of the relative difference between the value of the objective function obtained with the simpler formulations, (P1) and (P3), and with the more complex ones, (P2) and (P4), are presented in Fig. 10, where it can be seen that the differences between the MILP formulations, as well as between the MINLP formulations solved using BONMIN, are at most the optimality gap requested from each solver. The case of SCIP is different because in almost 10% of the instances the solver gives results with a difference in the objective function between models (P1) and (P2) greater than the optimality gap, which shows that this solver is not reliable for this type of problem.

Fig. 9. Comparison of performance profiles of the logarithm of running time of “sizing instances” solved using models (P1), (P2), (P3) and (P4) with an optimality relative gap of 0.1% for the linear solver and 2% for non-linear solvers.
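The ratio-based performance profiles used in this comparison follow Dolan and Moré (2002): for each instance, a solver's time is divided by the best time among all solvers, and the profile at a threshold gives the fraction of instances whose ratio does not exceed it. A minimal Python sketch is given below; the solver labels, the run times and the thresholds are hypothetical and are not data from this work, and unsolved instances are encoded as infinite time by assumption.

```python
import math

def performance_profile(times, taus):
    """Ratio-based performance profile.

    times: dict mapping a solver label to its list of run times, one per
           instance, with math.inf marking an instance it could not solve.
    taus:  thresholds on the ratio (time / best time on that instance).
    Returns a dict mapping each solver to the fraction of instances whose
    ratio is at most tau, for every tau in taus.
    """
    solvers = list(times)
    n = len(times[solvers[0]])
    best = [min(times[s][p] for s in solvers) for p in range(n)]
    profile = {}
    for s in solvers:
        # inf / inf gives nan, which fails every comparison, so an instance
        # solved by nobody never counts as solved.
        ratios = [times[s][p] / best[p] for p in range(n)]
        profile[s] = [sum(r <= tau for r in ratios) / n for tau in taus]
    return profile

# Hypothetical run times in seconds for four instances.
times = {
    "MILP": [1.0, 2.0, 4.0, 8.0],
    "MINLP": [2.0, 2.0, 40.0, math.inf],
}
taus = [1, 2, 10, 100]
profile = performance_profile(times, taus)
for solver, fractions in profile.items():
    print(solver, fractions)
```

A curve that reaches 1 at a small threshold indicates a solver that is both fast and robust; a curve that levels off below 1 indicates instances the solver could not solve within the stopping criteria.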

The proposed method proved to be more numerically stable than alternative approaches for the same problem, giving true optimal solutions, and was in general faster than the other approaches tested. Our method takes advantage of two facts: the continuous relaxation of the feasible region is convex and bounded, which allows us to build, up front, inner or outer approximations of the feasible space and thus report true upper and lower bounds for each instance; and mixed-integer linear solvers are much more numerically stable and scalable in size than MINLP algorithms. This approach also relies on off-the-shelf optimization and modelling software, which makes it more amenable to practitioners.

To support our claims, we borrow algorithm comparison tools from the optimization community, which are an interesting way to test the quality of competing algorithms that tackle the same class of problems.
Fig. 10. Comparison of performance profiles of the relative difference between the simple and the more complex formulation for “sizing instances”. Models (P1) and (P2) were solved to an optimality gap of 2%, and (P3) and (P4) to an optimality gap of 0.1%.

Although in a real scenario semi-continuous units such as centrifuges and microfilters, among others, are available only in discrete sizes, unlike tanks, which can be built according to customer needs, we feel that the proposed approach is robust enough to handle such issues; this is left as a next step in our research. Additionally, to increase the precision of our results for real cases, it is possible to explore a two-step approach in which, after using the proposed method to obtain upper and lower approximations of the objective function, the upper and lower variable bounds are tightened and a re-optimization is performed.

Finally, if raw speed is the goal, we know that low-level implementations of dynamic inner/outer approximation can provide further time reductions; however, we feel that this is beyond the scope of this work.

Acknowledgements

This work was supported by a CONICYT scholarship for doctoral


studies, FONDECYT Grant 1110024, Núcleo Milenio Información y
Fig. 11. Comparison of performance profiles of the logarithm of running time of Coordinación en Redes P10-024-F, and CONICYT for funding of Basal
“routing” and “sizing instances” solved using models (P2) and (P4) with an optimality Centre, CeBiB, FB0001.
relative gap of 0.1% for the linear solver and 2% for non-linear solvers.

References
As a final step in this work we compare in Fig. 11 the per-
formance profiles of total running time obtained after solving all Applegate DL, Cook W, Dash S, Espinoza DG. Exact solutions to linear programming
instances generated for this work (435 in total). Here we can see problems. Oper Res Lett 2007];35:693–9.
that for “routing instances” (P4) is much more robust than (P2), Barbosa-Póvoa AP. A critical review on the design and retrofit of batch plants. Com-
put Chem Eng 2007];31:833–55.
that was not able to solve any of those cases. In average (geo- Bonami P, Lee J. BONMIN users’s manual; 2013].
metric average), the instances take about 40 s to be solved using Bosch R, Trick M. Integer programming. In: Burke EK, Kendall G, editors. Search
model (P4), which is about the 3% of the time required by (P2)- methodologies: introductory tutorials in optimization and decision support
techniques. Springer; 2005]. p. 67–92.
BONMIN and less than the 1% of the time required by (P2)-SCIP. Corsano G, Aguirre PA, Montagna JM. Multiperiod design and planning of multiprod-
Nevertheless, for some punctual instances that represent the 4.6% uct batch plants with mixedproduct campaigns. AIChE J 2009];55:2356–69.
of the instances studied, the time and/or memory usage were not Crowder H, Johnson EL, Padberg M. Solving large-scale zero-one linear programming
problems. Oper Res 1983];31:803–34.
enough to get the desired optimality gap. From this it can be
Czyzyk J, Mesnier MP, More JJ. The NEOS server. IEEE Comput Sci Eng 1998];5:68–75.
stated that for the solution of more realistic instances or even Dietrich BL, Escudero LF, Chance F. Efficient reformulation for 0-1 programs-
to solve real problems considering continuous equipment sizes, methods and computational results. Discret Appl Math 1993];42:147–75.
the formulation proposed in this work is much more reliable and Dolan ED, Moré JJ. Benchmarking optimization software with performance profiles.
Math Program 2002];91:201–13.
faster than the usual and widely studied standard MINLP formula- Dolan ED. NEOS Server 4.0 administrative guide, Technical Report, Technical
tion. Memorandum ANL/MCS-TM-250. Mathematics and Computer Science Division,
Argonne National Laboratory; 2001].
Durand GA, Mele FD, Bandoni JA. Determination of storage tanks location for optimal
6. Conclusions short-term scheduling in multipurpose/multiproduct batch-continuous plants
under uncertainties. Ann Oper Res 2012];199:225–47.
Durand GA, Moreno MS, Mele FD, Montagna JM, Bandoni A. Comparing the perfor-
In this work we present a scalable approach to solve, within
mances of two techniques for the optimization under parametric uncertainty of
reasonable running times and quality assurance requirements, the the simultaneous design and planning of a multiproduct batch plant. Iberoam J
problem of designing a biotechnological multi-product batch plant Ind Eng 2014];5:42–54.
that support continuous equipment sizes and discrete host and/or Floudas CA. Nonlinear and mixed-integer optimization: fundamentals and applica-
tions. Oxford University Press; 1995].
process selection, up to sizes of real instances and that can be appli- Fumero Y, Corsano G, Montagna JM. Detailed design of multiproduct batch plants
cable to any kind of multi-product batch plant. considering production scheduling. Ind Eng Chem Res 2011];50:6146–60.
G. Sandoval et al. / Computers and Chemical Engineering 84 (2016) 1–11 11

Fumero Y, Montagna JM, Corsano G. Simultaneous design and scheduling of a semi- Moreno MS, Montagna JM. Multiperiod production planning and design of batch
continuous/batch plant for ethanol and derivatives production. Comput Chem plants under uncertainty. Comput Chem Eng 2012];40:181–90.
Eng 2012a];36:342–57. Moreno MS, Montagna JM, Iribarren OA. Multiperiod optimization for the design and
Fumero Y, Corsano G, Montagna JM. Planning and scheduling of multistage mul- planning of multiproduct batch plants. Comput Chem Eng 2007];31:1159–73.
tiproduct batch plants operating under production campaigns. Ann Oper Res Moreno MS, Iribarren OA, Montagna JM. Optimal design of multiproduct batch plants
2012b];199:249–68. considering duplication of units in series. Chem Eng Res Des 2009];87:1497–508.
Geißler B, Martin A, Morsi A, Schewe L. Using piecewise linear functions for solving Moreno MS, Iribarren OA, Montagna JM. Design of multiproduct batch plants
MINLPs. In: Mixed Integer Nonlinear Programming. Springer; 2012]. p. 287–314. with units in series including process performance models. Ind Eng Chem Res
Goldberg D. What every computer scientist should know about floating-point arith- 2009];48:2634–45.
metic. ACM Comput Surv (CSUR) 1991];23:5–48. Moreno-Benito M, Espuña A, Puigjaner L. Flexible batch process and plant design
Gropp W, Moré J. Optimization environments and the NEOS server. Approx Theory using mixed-logic dynamic optimization: single-product plants. Ind Eng Chem
Opt 1997]:167–82. Res 2014];53:17182–99.
Grossmann IE, Guillén-Gosálbez G. Scope for the application of mathematical pro- Nikolopoulou A, Ierapetritou MG. Optimal design of sustainable chemical processes
gramming techniques in the synthesis and planning of sustainable processes. and supply chains: a review. Comput Chem Eng 2012];44:94–103.
Comput Chem Eng 2010];34:1365–76. Nowak I. Overview of global optimization methods. In: Relaxation and Decomposi-
Grossmann IE, Caballero JA, Yeomans H. Advances in mathematical program- tion Methods for Mixed Integer Nonlinear Programming; 2005]. p. 121–8.
ming for the synthesis of process systems. Latin Am Appl Res 2000];30: Pinto JM, Montagna JM, Vecchietti AR, Iribarren OA, Asenjo JA. Process performance
263–84. models in the optimization of multiproduct protein production plants. Biotech-
Gupta S, Karimi IA. An improved MILP formulation for scheduling multiproduct, nol Bioeng 2001];74:451–65.
multistage batch plants. Ind Eng Chem Res 2003];42:2365–80. Ponsich A, Azzaro-Pantel C, Domenech S, Pibouleau L. Mixed-integer nonlinear pro-
Iribarren OA, Montagna JM, Vecchietti AR, Andrews B, Asenjo JA, Pinto JM. Opti- gramming optimization strategies for batch plant design problems. Ind Eng
mal process synthesis for the production of multiple recombinant proteins. Chem Res 2007];46:854–63.
Biotechnol Prog 2004];20:1032–43. Ravemark DE, Rippin DWT. Optimal design of a multi-product batch plant. Comput
Koch T. The final NETLIB-LP results. Oper Res Lett 2004];32:138–42. Chem Eng 1998];22:177–83.
Kocis GR, Grossmann IE. Global Optimization of nonconvex mixed-integer nonlin- Rebennack S, Kallrath J, Pardalos PM. Optimal storage design for a multi-product
ear programming (MINLP) problems in process synthesis. Ind Eng Chem Res plant: a non-convex MINLP formulation. Comput Chem Eng 2011];35:255–71.
1988];27:1407–21. Reklaitis GV. Progress and issues in computer-aided batch process design. Found
Li X, Chen Y, Barton PI. Nonconvex generalized Benders decomposition with piece- Comput Aided Process Des 1990]:241–75.
wise convex relaxations for global optimization of integrated process design and Rippin D. Batch process systems engineering: A retrospective and prospective
operation problems. Ind Eng Chem Res 2012];51:7287–99. review. Comput Chem Eng 1993];17:S1–13.
Margot F. Testing cut generators for mixed-integer linear programming. Math Pro- Robinson JD, Loonkar YR. Minimizing capital investment for multi-product batch-
gram Comput 2009];1:69–95. plants. Process Technol 1972];17:861.
Mittelmann H. Decision Tree for Optimization Software; 2013]. Verderame PM, Elia JA, Li J, Floudas CA. Planning and scheduling under uncertainty:
Montagna JM, Vecchietti AR, Iribarren OA, Pinto JM, Asenjo JA. Optimal design of a review across multiple sectors. Ind Eng Chem Res 2010];49:3993–4017.
protein production plants with time and size factor process models. Biotechnol Vielma JP. Mixed integer linear programming formulation techniques; 2013].
Prog 2000];16:228–37. Voudouris VT, Grossmann IE. Mixed-integer linear programming reformulations
Montagna JM, Iribarren OA, Vecchietti AR. Synthesis of biotechnological pro- for batch process design with discrete equipment sizes. Ind Eng Chem Res
cesses using generalized disjunctive programming. Ind Eng Chem Res 1992];31:1315–25.
2004];43:4220–32. Voudouris VT, Grossmann IE. Optimal synthesis of multiproduct batch plants
Moreno MS, Montagna JM. New alternatives in the design and planning of with cyclic scheduling and inventory considerations. Ind Eng Chem Res
multiproduct batch plants in a multiperiod scenario. Ind Eng Chem Res 1993];32:1962–80.
2007];46:5645–58.
