Algorithm Helps Managers Allocate Scarce Resources
Managers who rely on computer models to help with decision-making bump into a dilemma when it comes to allocation of scarce resources in complex environments with many moving parts.
Organizations want immediate economic benefits. But they also want to learn how to make better decisions in the future, which requires experimentation with strategies that might seem suboptimal in the present.
New research from the University of Maryland’s Robert H. Smith School of Business explores the tradeoffs of the “exploration vs. exploitation” dilemma and proposes a new approach for making decisions over time in situations where each choice influences the variables available for consideration in the future.
“We must strike a balance between making decisions based solely on existing approximations, and making decisions with uncertain outcomes to learn about the problem,” says Ilya O. Ryzhov, co-author of the study in Operations Research. “Decisions that seem suboptimal based on past observations may actually be much better than we believe.”
The reverse can also be true. Decisions that seem wise based on past observations might actually lead to disaster in real-world applications with shifting variables.
Computer scientists can mitigate the risks by conducting experiments in a simulator. But business moves fast in the 21st century, and organizations cannot always wait for clarity. They need real-time feedback when making decisions about things like market pricing and internet advertising.
Other examples where managers must allocate scare resources in dynamic environments include commodity storage, water management, call center operations and finance — where fund managers must decide how much cash to keep on hand and how much to invest.
Ryzhov and his co-authors develop their model using the Bayesian philosophy, which approximates the distribution of unknown quantities based on prior knowledge and belief.
They then test their model in three specific scenarios, including a transportation problem with nomadic truckers who must respond to demands that arise randomly in different locations. Drivers with a single truck must consider variables such as their current location, the day of the week and trailer type.
“The set of possible decisions depends on the current state,” the authors write. “For example, we can choose to accept a currently available load of a particular type in some location, or we can choose to move empty.”
Their algorithm allows anyone with scarce resources to compute the tradeoffs between exploration of new strategies and exploitation of immediate available benefits.
“Using several resource allocation problems, we demonstrate that the performance of our method is highly competitive against other exploration strategies, with the benefits being particularly valuable in online implementations where computation time is less restrictive,” the authors conclude.
Read more: Bayesian Exploration for Approximate Dynamic Programming is featured in Operations Research. Authors include Ilya O. Ryzhov at Maryland Smith; Martijn Mes at the University of Twente in the Netherlands; Warren B. Powell at Princeton University; and Gerald van den Berg, vice president of analytics and strategic finance at Etsy.