12/18/06
Welcome to
Thomas's Web site!
Thomas Shih is an Industrial Engineering Ph.D. at UTA. His dissertation, "Convex Versions of Multivariate Adaptive Regression Splines and Implementations for Complex Optimization Problems," is co-supervised by Dr. Chen and Dr. Kim. Below are frequently asked questions about Shih’s dissertation:

Q1: What are Multivariate Adaptive Regression Splines (MARS)?
Multivariate Adaptive Regression Splines (MARS) provide a flexible statistical modeling method that employs forward and backward search algorithms to identify the combination of basis functions that best fits the data.

Q2: How is MARS used in optimization?
In optimization, MARS has been used successfully to estimate the value function in stochastic dynamic programming, and MARS could potentially be useful in many real-world optimization problems where objective (or other) functions need to be estimated from data, such as in simulation optimization.

Q3: What is the convex version of MARS?
Many optimization methods depend on convexity, but the standard MARS algorithm does not guarantee a convex approximation. In my dissertation, convex versions of MARS are proposed.

Q4: What kind of modification is necessary to ensure convexity of MARS?
To ensure MARS convexity, two major modifications are made: (1) coefficients are constrained such that pairs of basis functions are guaranteed to jointly form convex functions; (2) the form of interaction terms is appropriately changed.
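
The coefficient constraint on pairs of basis functions can be illustrated with a small Python sketch. It assumes the usual univariate MARS hinge pair, max(0, x - t) and max(0, t - x), sharing one knot t. The condition beta_plus + beta_minus >= 0 is the elementary convexity condition for such a pair and is shown only as an illustration; the exact constraints and the modified interaction terms used in the dissertation are not reproduced here. All function names are hypothetical.

```python
def hinge_pair(x, knot, beta_plus, beta_minus):
    """Weighted pair of hinge functions sharing one knot (illustrative form)."""
    return beta_plus * max(0.0, x - knot) + beta_minus * max(0.0, knot - x)


def pair_is_convex(beta_plus, beta_minus):
    """The slope is -beta_minus left of the knot and beta_plus right of it.

    A piecewise-linear function is convex when its slope never decreases,
    which for this pair means beta_plus >= -beta_minus.
    """
    return beta_plus + beta_minus >= 0.0


def midpoint_convex_on_grid(f, grid, tol=1e-9):
    """Numerical sanity check: f((a+b)/2) <= (f(a)+f(b))/2 for all grid pairs."""
    return all(
        f((a + b) / 2.0) <= (f(a) + f(b)) / 2.0 + tol
        for a in grid
        for b in grid
    )


grid = [i / 20.0 for i in range(21)]  # 0.0, 0.05, ..., 1.0

# Coefficients that satisfy the constraint (2.0 + -1.0 >= 0).
convex_case = midpoint_convex_on_grid(
    lambda x: hinge_pair(x, 0.5, 2.0, -1.0), grid
)

# Coefficients that violate it (1.0 + -2.0 < 0).
nonconvex_case = midpoint_convex_on_grid(
    lambda x: hinge_pair(x, 0.5, 1.0, -2.0), grid
)
```

Here the coefficient test and the numerical midpoint check agree: the first pair passes both, and the second fails both.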
Q5: What are the applications in this dissertation?
This research studies applications to an inventory forecasting stochastic dynamic programming problem and an airline fleet assignment problem.

Q6: Why is data mining important to the problems in your dissertation?
The implementation of MARS for approximating complex optimization functions can involve thousands of state or decision variables. Although one can simply attempt a MARS approximation over all the variables, prior research on the fleet assignment application indicates that many variables have little effect on the objective. Thus, a data mining step to conduct variable selection is needed. This step separates potentially critical variables from clearly redundant ones.

Q7: What kind of data mining tool is used to select important variables in the dissertation?
In my dissertation, variants of two data mining tools are explored separately and in combination for variable selection: regression trees and multiple testing procedures based on false discovery rate.

Q8: What are your primary research interests?
My interests include using statistical approaches to develop new methods for operations research problems in engineering and science.

Q9: What are your strengths? Are there any particular applications?
My expertise lies in statistical modeling and data mining, particularly as applied to computer experiments and optimization. I have studied applications in inventory forecasting, airline optimization, and air quality.

Q10: What is the contribution of this dissertation?
Through statistics-based methodology, I have developed computationally tractable methods for complex optimization problems.

Q11: Have you worked on any project other than the dissertation?
The other project I worked on is sponsored by DFW International Airport. This funded project is joint with Dr. Victoria C. P. Chen, Dr. Seoung Bum Kim, Dr. Jay M. Rosenberger and other colleagues in the Center on Stochastic Modeling, Optimization, & Statistics.

Q12: What is your contribution to the DFW project?
An integrated relational database has been constructed, and various data sets can be obtained through relevant queries. Statistical data analyses such as multiple linear regression, regression tree models, and MARS models have been explored.

Q13: What kind of analysis have you done in the DFW project?
Data mining approaches such as regression trees have been explored to identify the important variables to predict dissolved oxygen in the receiving water of interest.
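
The way a regression tree can point to important variables can be sketched in Python. The sketch makes strong simplifying assumptions: it scores each variable only by the best single split (a depth-one tree), measured as the reduction in the sum of squared errors. It is a simplified stand-in, not the tree variants used in the dissertation or the DFW project. The variable names and the toy data are made up purely for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(xs, ys):
    """Largest SSE reduction achievable with one threshold split on xs."""
    total = sse(ys)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # splitting the data into x <= threshold and x > threshold.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        best = max(best, total - sse(left) - sse(right))
    return best


def rank_variables(columns, ys):
    """Order variable names from most to least useful single split."""
    gains = {name: best_split_gain(xs, ys) for name, xs in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Toy data (hypothetical): the response jumps when x_signal exceeds 3,
# while x_noise has no clean relationship with the response.
toy_columns = {
    "x_signal": [1, 2, 3, 4, 5, 6],
    "x_noise": [3, 1, 4, 1, 5, 9],
}
toy_response = [0, 0, 0, 10, 10, 10]

ranking = rank_variables(toy_columns, toy_response)
```

A full regression tree repeats this split search recursively within each resulting subset; variables that never yield a worthwhile split are candidates for removal.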
This site was last updated 12/18/06.