Markov Decision Processes, stochastic modeling, and queueing theory
Production and inventory control, new product development portfolio management, and sustainable operations
Abstracts of Selected Papers
New functional characterizations and optimal structural results for assemble-to-order M-systems (with Mustafa Akan and Alan Scheller-Wolf)
We consider an assemble-to-order (ATO) M-system with multiple components and products. Specifically, the system involves a single product that uses multiple units from each component, and multiple individual products, each of which consumes multiple units from a different component. We model the system as an infinite-horizon Markov Decision Process (MDP) under the discounted cost criterion. Each component is produced in batches of fixed size in a make-to-stock fashion; batch sizes are determined by individual product sizes. Production times are independent and exponentially distributed. Demand for each product arrives as an independent Poisson process. If not satisfied immediately upon arrival, these demands are lost. A control policy specifies when a batch of components should be produced, and whether an arriving demand for each product should be satisfied. We introduce new functional characterizations for submodularity and supermodularity restricted to certain lattices of the state space. These enable us to characterize optimal inventory replenishment and allocation policies via a new type of policy: lattice-dependent base-stock and lattice-dependent rationing (LBLR). This implies that the state space of the problem can be partitioned into disjoint lattices such that, on each lattice, (a) it is optimal to produce a batch of a particular component if and only if the state vector is less than the base-stock level of that component, and (b) it is optimal to fulfill a demand of a particular product if and only if the state vector is greater than or equal to the rationing level for that product. We also show that LBLR remains optimal if the optimization criterion is modified to the average cost rate.
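For readers unfamiliar with the terms, the standard textbook definitions of submodularity and supermodularity are recalled below. This is only the usual definition on a lattice with the componentwise order; the symbols $f$, $\mathcal{L}$, $x$, and $y$ are notation introduced here for illustration, and the paper's new restricted characterizations are not reproduced.

```latex
% Standard definitions (not the paper's restricted characterizations).
% \mathcal{L} \subseteq \mathbb{Z}^n is a lattice under the componentwise order,
% with meet and join taken componentwise:
%   (x \wedge y)_i = \min(x_i, y_i), \qquad (x \vee y)_i = \max(x_i, y_i).
\[
  f \text{ is submodular on } \mathcal{L}
  \iff
  f(x \wedge y) + f(x \vee y) \le f(x) + f(y)
  \quad \text{for all } x, y \in \mathcal{L},
\]
\[
  f \text{ is supermodular on } \mathcal{L}
  \iff
  f(x \wedge y) + f(x \vee y) \ge f(x) + f(y)
  \quad \text{for all } x, y \in \mathcal{L}.
\]
```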
Performance evaluation of lattice-dependent base-stock and lattice-dependent rationing policies in assemble-to-order systems (with Mustafa Akan and Alan Scheller-Wolf)
We consider assemble-to-order (ATO) systems with multiple products, multiple components which may be demanded in different quantities by different products, batch ordering of components, random lead times, and lost sales. We model these systems as an infinite-horizon MDP under the average cost criterion, and evaluate the use of a lattice-dependent base-stock and lattice-dependent rationing (LBLR) policy as a heuristic replenishment and allocation policy when the M-system product structure is violated. We numerically compare the globally optimal policy to LBLR and two other heuristics from the literature: a state-dependent base-stock and state-dependent rationing (SBSR) policy, and a fixed base-stock and fixed rationing (FBFR) policy. Interestingly, LBLR yields the globally optimal cost in each of more than 1,800 compiled instances. LBLR and SBSR perform significantly better than FBFR when replenishment batch sizes imperfectly match the component requirements of the most valuable or most highly demanded product. In addition, LBLR substantially outperforms SBSR if it is crucial to hold a significant amount of inventory that must be rationed. We then modify our optimization criterion to total expected discounted cost over an infinite horizon, and show that submodularity and supermodularity, which are used to prove the optimality of LBLR for ATO M-systems, need not hold for general ATO systems. Thus, a proof of the optimality of LBLR for general ATO systems will likely require new methodologies.
Optimal portfolio strategies for new product development (with Mustafa Akan, Laurens Debo, and Alan Scheller-Wolf)
We study the problem of project selection and resource allocation in a multi-stage new product development (NPD) process with stage-dependent resource constraints. We model the problem as an infinite-horizon MDP under the discounted cost criterion. Each NPD project undergoes a different experiment in each stage of the NPD process; these experiments generate signals about the true nature of the project. Experimentation times are independent and exponentially distributed. Beliefs about the ultimate outcome of each project are updated after each experiment according to Bayes' rule. Projects thus become differentiated through their signals, and all available signals for a project determine its category. The state of the system is described by the number of projects in each category. A control policy specifies when and at what rate to utilize the resources at each stage, and on which projects. We characterize the optimal control policy via a new type of strategy, state-dependent non-congestive promotion, (a) when there are multiple uninformative experiments, or (b) when there is a single informative experiment with multiple signals. A non-congestive promotion policy implies that, at each stage, it is optimal to advance a project with the highest expected reward to the next stage if and only if the number of projects in each successor category is less than a state-dependent threshold. Furthermore, threshold values decrease in a non-strict sense as a later stage becomes more congested or as an earlier stage becomes less congested. We conjecture that the optimal control policy for the general problem is a non-congestive promotion policy.
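The belief update mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each project is either "good" or "bad" and that each experiment emits a signal with known likelihoods under each type; the function name `bayes_update`, its parameters, and the numbers used are hypothetical and are not taken from the paper.

```python
def bayes_update(prior_good, p_signal_given_good, p_signal_given_bad):
    """Posterior probability that a project is 'good' after one observed signal.

    Assumption (not from the paper): a project has a binary true type and the
    signal likelihoods under each type are known.
    """
    # Total probability of observing this signal under the current belief.
    evidence = (prior_good * p_signal_given_good
                + (1.0 - prior_good) * p_signal_given_bad)
    if evidence == 0.0:
        raise ValueError("observed signal has zero probability under the prior")
    # Bayes' rule: posterior = prior * likelihood / evidence.
    return prior_good * p_signal_given_good / evidence


# Hypothetical project passing through two stages, one signal per stage.
belief = 0.5
for lik_good, lik_bad in [(0.8, 0.2), (0.5, 0.5)]:
    belief = bayes_update(belief, lik_good, lik_bad)
```

In this sketch the second experiment has equal likelihoods under both types, so it leaves the belief unchanged; that is one simple way to picture an uninformative experiment.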