# Process segmentation and modelling applied to time series featuring the response of biological materials to toxic agents

in environmental and military fields. In this framework, the Department of Microbiology

and Biochemistry at Oregon State University has discovered that living fish cells

are promising indicators of the presence of a wide range of toxins. Thus, an interdisciplinary

project called "SOS Cytosensor" was launched to create an autonomous and

mobile device to detect such toxins using these living cells.

After exposing a cell culture to a specific biological or chemical agent, a sequence

of cell images is recorded. The extraction of features from the experimental sequences

of images results in time series that have to be modelled and classified in order to prove

useful in toxin detection. The chosen models should give a representation of time series

that supports accurate classification and clustering and that would also make storage

and transmission more efficient. There are many techniques for dimensionality reduction

of time series data in the literature, such as Fourier transforms, but segmentation is

the most popular technique for extracting structures from time series. Segmentation

algorithms can be classified as batch or online. The main idea is that given a time

series Y, segmentation produces the best representation using a number K of segments

that is not fixed in advance, such that the combined error of all segments is less than a user-specified threshold and the maximum error for any segment does not exceed a user-specified

local threshold. First, we modelled each time series using a single ARX model

with regularly spaced breakpoints. Then, we considered improving the result by placing

the breakpoints dynamically. As a pre-analysis of the curves, we performed a piecewise

linear segmentation, thus tracking changes in the behaviour of the time series and placing

breakpoints at those locations. Piecewise linear regression refers to the approximation of

a time series Y, of length N, with K straight lines. Because K is typically much smaller

than N, this representation makes the storage, transmission and computation of data

more efficient. Piecewise linear regression is commonly used for change point detection,

which is our goal in this study.
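
The piecewise linear pre-analysis described above can be illustrated with a bottom-up merging scheme. This is only a minimal sketch: the abstract does not name the segmentation algorithm that was used, so the bottom-up strategy, the function names `line_sse` and `bottom_up_segments`, and the parameters `max_error` and `init_len` are all assumptions made for illustration. The series is first cut into short segments, and the pair of neighbours that is cheapest to merge is merged repeatedly until the error of the best merge would exceed the local threshold.

```python
import numpy as np


def line_sse(y, start, end):
    """Sum of squared errors of a least-squares line fitted to y[start:end]."""
    seg = y[start:end]
    if len(seg) < 3:
        return 0.0  # one or two points are always fitted exactly
    x = np.arange(len(seg))
    coeffs = np.polyfit(x, seg, 1)
    resid = seg - np.polyval(coeffs, x)
    return float(resid @ resid)


def bottom_up_segments(y, max_error, init_len=2):
    """Return breakpoints [0, ..., N] of a piecewise linear approximation.

    Hypothetical helper: starts from segments of init_len samples and
    merges neighbours while the merged segment's error stays within
    max_error (the user-specified local threshold).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    bounds = list(range(0, n, init_len)) + [n]
    while len(bounds) > 2:
        # Cost of merging segment i with segment i + 1.
        costs = [line_sse(y, bounds[i], bounds[i + 2])
                 for i in range(len(bounds) - 2)]
        best = int(np.argmin(costs))
        if costs[best] > max_error:
            break
        del bounds[best + 1]  # remove the breakpoint between the two segments
    return bounds
```

Calling `bottom_up_segments(y, max_error=...)` on a series returns the indices at which the behaviour changes; a global error threshold or a fixed target number of segments could be added as further stopping conditions.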

As segmentation into several simple, adequate AR models did not yield a

satisfactory fit, we combined this concept with the piecewise linear segmentation

discussed above. Instead of modelling the time series by a single ARX model

using breakpoints determined by the segmentation algorithm or by several AR models,

we model each segment with a different ARX model. We use the sum of squared errors, or the

residual error, as a measure of the cost of merging segments. Computation speed has been

increased by presegmenting the time series with a fine piecewise linear approximation.

It also enables the user to predefine the number of final segments for classification and

clustering purposes. The final state can be detected by extracting the last segment from

the segmentation process.
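
To make the per-segment ARX modelling and the merge cost more concrete, the following sketch fits an ARX model to each segment by ordinary least squares and uses the sum of squared errors of a jointly fitted model as the cost of merging. It is written under several assumptions that the abstract does not confirm: the sign convention y[t] = sum a_i y[t-i] + sum b_j u[t-j] + e[t], the existence of a known input signal `u`, the model orders `na` and `nb`, and the helper names `fit_arx`, `fit_segments` and `merge_cost` are all hypothetical.

```python
import numpy as np


def fit_arx(y, u, na=2, nb=1):
    """Least-squares ARX fit on one segment.

    Assumed model: y[t] = sum_i a_i*y[t-i] + sum_j b_j*u[t-j] + e[t].
    Returns (theta, sse) with theta = [a_1..a_na, b_1..b_nb].
    The segment should be clearly longer than na + nb + max(na, nb).
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    p = max(na, nb)
    # Each regressor row holds the most recent outputs and inputs first.
    rows = [np.concatenate([y[t - na:t][::-1], u[t - nb:t][::-1]])
            for t in range(p, len(y))]
    phi = np.array(rows)
    target = y[p:]
    theta, *_ = np.linalg.lstsq(phi, target, rcond=None)
    resid = target - phi @ theta
    return theta, float(resid @ resid)


def fit_segments(y, u, bounds, na=2, nb=1):
    """Fit a separate ARX model between each pair of breakpoints."""
    return [fit_arx(y[s:e], u[s:e], na, nb)
            for s, e in zip(bounds[:-1], bounds[1:])]


def merge_cost(y, u, start, end, na=2, nb=1):
    """Cost of replacing the segments inside [start, end) by one ARX model."""
    _, sse = fit_arx(y[start:end], u[start:end], na, nb)
    return sse
```

In a bottom-up scheme, `merge_cost` would be evaluated for every pair of neighbouring segments, starting from the breakpoints of a fine piecewise linear presegmentation, and the cheapest pair merged until the desired number of segments remains.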

Finally, classification and clustering are essential steps in the analysis of the experimental

time series. The Cytosensor project required both a numerical and a non-numerical representation

of the experimental data. The approach adopted in this study is a soft

classification approach, which allows a better understanding and eases decision making,

thus complementing numerical features. Using calibration runs and the resulting model

parameters, we build a database of tight clusters representing scenarios. Then, we calculate

the probabilistic distances between an operational cluster and each of the calibration

clusters, which allows an operational run to be matched to a specific scenario.
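
The matching step can be sketched as follows. The abstract does not state which probabilistic distance was used, so this example assumes that each cluster of model parameters is summarised by a Gaussian (mean and covariance) and compares clusters with the Bhattacharyya distance. The function names, the small `ridge` term added to the covariance, and the scenario labels in the usage note are hypothetical.

```python
import numpy as np


def gaussian_summary(points, ridge=1e-9):
    """Summarise a cluster (rows = runs, columns = features) by mean and covariance."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.atleast_2d(np.cov(pts, rowvar=False))
    # A small ridge keeps the covariance invertible for very tight clusters.
    return mean, cov + ridge * np.eye(pts.shape[1])


def bhattacharyya(c1, c2):
    """Bhattacharyya distance between two Gaussian cluster summaries."""
    (m1, s1), (m2, s2) = c1, c2
    s = 0.5 * (s1 + s2)
    diff = m1 - m2
    term_mean = 0.125 * diff @ np.linalg.solve(s, diff)
    _, logdet = np.linalg.slogdet(s)
    _, logdet1 = np.linalg.slogdet(s1)
    _, logdet2 = np.linalg.slogdet(s2)
    term_cov = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return float(term_mean + term_cov)


def closest_scenario(operational, calibration):
    """Return the calibration label nearest to the operational cluster, plus all distances."""
    op = gaussian_summary(operational)
    dists = {name: bhattacharyya(op, gaussian_summary(pts))
             for name, pts in calibration.items()}
    return min(dists, key=dists.get), dists
```

Given a dictionary such as `{"scenario_A": runs_a, "scenario_B": runs_b}` of calibration feature arrays, `closest_scenario(operational_runs, calibration)` returns the nearest label together with the full set of distances, which can support a soft rather than a hard decision.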

Advisor: Temes, Gabor C.

School: Oregon State University

School Location: USA - Oregon

Source Type: Master's Thesis

Keywords: biosensors, computer programs, toxicity testing

Date of Publication: 07/16/2004