DMS Statistics Seminar Series

Department of Mathematical Sciences
and
Center for Applied Mathematics and Statistics

New Jersey Institute of Technology

Spring 2009

All seminars are held 4:00 - 5:00 p.m. in Cullimore Hall, Room 611 (Math Conference Room), unless noted otherwise. Refreshments are usually served at 3:30 p.m., and talks start at 4:00 p.m. If you have any questions about a particular seminar, please contact the person hosting the speaker.

Date	Speaker and Title	Host
Thursday, March 5, 2009, 4:00 PM	Yujun Wu, Sanofi-Aventis: Approaches to Handling Data When a Phase II Trial Deviates from the Pre-specified Simon’s Two-Stage Design (abstract)	Sunil Dhar
Thursday, April 30, 2009, 4:00 PM	Ganesh K. (Mani) Subramaniam, AT&T Labs - Research, Florham Park, NJ: Some Approaches to Mine Time Series Data (abstract)	Sunil Dhar

ABSTRACTS

Approaches to Handling Data When a Phase II Trial Deviates from the Pre-specified Simon’s Two-Stage Design:

Simon’s ‘optimal’ and ‘minimax’ two-stage designs are common methods for conducting phase IIA studies investigating new cancer therapies. However, these designs are rather rigid because of the pre-specified rejection rules and fixed sample sizes at each stage. In practice, a study is often unable to adhere to the event number and sample sizes of the original two-stage design. In this paper, we consider some approaches to handling situations where deviations from, or interruptions of, the original Simon’s two-stage design occur because recruitment of patients is slower than expected. We consider four scenarios and use conditional probabilities to address the issues commonly raised by the scientific review board. We also discuss how to report p-values in these situations.

Yujun Wu ~ March 5, 2009
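
For readers unfamiliar with the design discussed in this abstract, the following is a minimal sketch of how the operating characteristics of a standard (undeviated) Simon two-stage design can be computed. It is not taken from the talk and does not implement the speaker's methods for handling deviations. The function names and the example design (stop if at most 1 response in the first 10 patients, declare the therapy unpromising if at most 5 responses in 29 patients, with response rates 0.10 and 0.30) are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); zero when k is negative."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def two_stage_characteristics(r1, n1, r, n, p):
    """Operating characteristics of a two-stage design at true response rate p.

    Stage 1: enroll n1 patients and stop for futility if responses <= r1.
    Stage 2: enroll n - n1 more and call the therapy unpromising if the
    total number of responses is <= r.
    Returns (probability of early termination,
             probability of declaring the therapy promising,
             expected sample size).
    """
    n2 = n - n1
    pet = binom_cdf(r1, n1, p)
    # Trials that continue past stage 1 but still end with <= r responses.
    fail_after_stage2 = sum(
        binom_pmf(x, n1, p) * binom_cdf(r - x, n2, p)
        for x in range(r1 + 1, min(n1, r) + 1)
    )
    prob_promising = 1.0 - (pet + fail_after_stage2)
    expected_n = n1 + (1.0 - pet) * n2
    return pet, prob_promising, expected_n


# Hypothetical example design: r1/n1 = 1/10, r/n = 5/29.
for rate in (0.10, 0.30):
    pet, prob_promising, expected_n = two_stage_characteristics(1, 10, 5, 29, rate)
    print(f"p={rate:.2f}  PET={pet:.3f}  "
          f"P(promising)={prob_promising:.3f}  E[N]={expected_n:.1f}")
```

Evaluated at the null response rate, the second returned value is the type I error; at the alternative rate, it is the power. The rigidity described in the abstract comes from the fact that these quantities are only valid when the stage sizes and cutoffs are followed exactly.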

Some Approaches to Mine Time Series Data:

Business decisions and business process monitoring are often based on time series data that represent an aggregation of a large number of time series. Although the inferences are generally based on the aggregate data, significant insights lurk in the underlying time series that have been combined. The challenge that analysts currently face is the large number and complexity of the time series underlying an aggregate. This paper provides a framework that supports drill-down analysis and screening of large-scale time series data by developing feature extraction rules. We develop an exploratory method based on functional data analysis, in which we fit smooth functions to the series. One feature extraction rule involves estimating derivatives from these fitted models. These derivatives provide insight into the bumps and dips in the underlying time series.

A critical parameter in smoothing noisy data is the bandwidth. With a large number of curves, optimal bandwidth selection on a curve-by-curve basis is not practical, but some reasonable approximation of the optimal bandwidth using, for example, generalized cross-validation or a plug-in bandwidth might be acceptable. However, the main problem is that the optimal bandwidth for estimating the regression function is generally smaller than the optimal bandwidth for estimating its derivatives (Härdle 1985). Thus automatic smoothing for both function and derivative estimation in a data mining setting presents a unique challenge. We will present two approaches. One, proposed by Rice (2004), takes the idea of borrowing strength, commonly used in longitudinal data analysis (LDA; Diggle et al. 1994), and applies it in functional data analysis (FDA) settings. The other is a method based on lowess. We will also use simulated telecommunications network data to demonstrate how these techniques can be used in anomaly detection.

This is joint work with Dr. Ravi Varadhan at Johns Hopkins.

Ganesh K. (Mani) Subramaniam ~ April 30, 2009
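
As a rough illustration of the derivative-estimation idea in this abstract, the sketch below fits a Gaussian-weighted local linear regression at a single point and reads off both the fitted level and the fitted slope. It is not the speaker's method: it omits the robustness iterations and tricube weights of lowess, and it does not borrow strength across curves. The function name, the simulated sine series, and the two bandwidth values are assumptions chosen only to show how the slope estimate depends on the bandwidth.

```python
import math
import random


def local_linear(xs, ys, x0, bandwidth):
    """Gaussian-weighted local linear fit at x0.

    Minimizes sum_i w_i * (y_i - a - b * (x_i - x0))**2 and returns (a, b):
    a estimates the function value at x0, b estimates its first derivative.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1  # determinant of the 2x2 normal equations
    level = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return level, slope


# Simulated noisy series (hypothetical data): sin(x) plus Gaussian noise.
random.seed(0)
xs = [i / 100 for i in range(201)]
ys = [math.sin(x) + random.gauss(0, 0.1) for x in xs]

# Compare a narrow and a wide bandwidth at the same point.
for h in (0.05, 0.2):
    level, slope = local_linear(xs, ys, 1.0, h)
    print(f"h={h:.2f}  level={level:.3f}  slope={slope:.3f}")
```

Running the comparison on noisy data typically shows the slope estimate fluctuating much more than the level estimate at the narrow bandwidth, which is the practical face of the point that derivative estimation calls for a larger bandwidth than function estimation.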