Summary: | High levels of automation in manufacturing industries are leading to data sets of increasing
size and dimension. The challenge facing statisticians and field professionals is to develop
methodology capable of analyzing such data.
Functional data is one example of high-dimensional data characterized by observations
recorded as a function of some continuous measure, such as time. An application considered
in this thesis comes from the automotive industry. It involves a production process in which
valve seats are force-fitted by a ram into cylinder heads of automobile engines. For each
insertion, the force exerted by the ram is automatically recorded every fraction of a second
for about two and a half seconds, generating a force profile. We can think of these profiles
as individual functions of time, grouped into collections of curves.
The focus of this thesis is the analysis of functional process data such as the valve seat
insertion example. A number of techniques are set forth. In the first part, two ways to
model a single curve are considered: a B-spline fit via linear regression, and a nonlinear
model based on differential equations. Each of these approaches is incorporated into a
mixed effects model for multiple curves, and multivariate process monitoring techniques
are applied to the predicted random effects in order to identify anomalous curves. In the
second part, a Bayesian hierarchical model is used to cluster low-dimensional summaries
of the curves into meaningful groups. The working assumption is that the clusters correspond to distinct
types of processes (e.g., various types of “good” or “faulty” assembly). New observations
can be assigned to one of these by calculating the probabilities of belonging to each cluster.
Mahalanobis distances are used to identify new observations not belonging to any of the
existing clusters. Synthetic and real data are used to validate the results.
|
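
The Mahalanobis-distance check described in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation. The cluster means and covariances, the function names, and the 1% cutoff are all hypothetical, and the two-dimensional summaries are assumed to be roughly Gaussian within each cluster. For simplicity the sketch picks the nearest cluster by distance rather than by the posterior membership probabilities used in the thesis.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from a cluster with the given mean and covariance."""
    diff = x - mean
    return float(diff @ np.linalg.solve(cov, diff))

def flag_novel(x, clusters, alpha=0.01):
    """Return (index of nearest cluster, whether x lies outside every cluster).

    Assumes 2-D summaries: for 2 degrees of freedom the upper-alpha
    chi-square quantile has the closed form -2 * ln(alpha).
    """
    threshold = -2.0 * np.log(alpha)
    d2 = [mahalanobis_sq(x, mean, cov) for mean, cov in clusters]
    nearest = int(np.argmin(d2))
    return nearest, bool(d2[nearest] > threshold)

# Hypothetical clusters (mean, covariance) of low-dimensional curve summaries.
clusters = [
    (np.array([0.0, 0.0]), np.eye(2)),
    (np.array([5.0, 5.0]), np.array([[1.0, 0.5], [0.5, 1.0]])),
]

print(flag_novel(np.array([0.5, -0.5]), clusters))    # close to the first cluster
print(flag_novel(np.array([20.0, -20.0]), clusters))  # far from both clusters
```

If the smallest squared distance exceeds the chi-square cutoff, the observation is treated as belonging to none of the existing clusters. In higher dimensions the cutoff would need a general chi-square quantile instead of the closed form used here.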