[Cowystats] ASA Chapter Meeting - April 18th - More Details

Matt Pocernich mjpdenver at gmail.com
Mon Apr 14 23:37:54 MDT 2008

### Spring Chapter Meeting - Friday April 18th - Boulder Colorado

The Colorado/Wyoming Chapter of the ASA Spring meeting is only 4 days
away.  Great weather and a beautiful, enjoyable day are guaranteed.
The agenda and schedule are listed below.  Times are approximate.
Some more details follow.

++ Directions - The meeting will be held in the main seminar room of
the National Center for Atmospheric Research's Mesa Lab.  The Mesa Lab
is located at the western end of Table Mesa Drive in Boulder.  More
information can be found at xxxx.xxxxx.  As you walk in the front
door, there will be a sign-in sheet at the security desk.

++ Lunch - Lunch is available at NCAR's cafeteria.  The cafeteria
offers several entrees, a grill with sandwiches and a salad bar.  For
a special, slightly subsidized price of $5 you can have an entree or
sandwich, a drink, salad and dessert.  At the checkout, just tell the
cashier you are with the ASA.  The $5 will need to be paid to one of
the officers, so please bring cash.  Details will be
provided before the lunch break.  There will be several tables
reserved for us along the western end of the cafeteria so we can sit
together.

At 12:30, there will be an optional demonstration of NCAR's VisLab facility.
Alternatively, we will organize a discussion of possible future
chapter activities.
Another option is a hike in the beautiful open space surrounding the
Mesa Lab.  There have not been any mountain lion sightings this week!


 9:00 -  9:30   Coffee, snacks and registration
 9:30 -  9:40   Welcome
 9:40 - 10:00   Matt Benigni
10:00 - 10:20   Colin O'Donnell
10:20 - 10:40   Damian Wandler
10:40 - 11:00   Steven Anderson
11:00 - 11:40   Deb Glueck


1:00 - 1:10   Chapter Elections
1:10 - 1:50   Mary Meyer
1:50 - 2:10   Stacey Hancock
2:10 - 2:30   Alex Przedpelski
2:30 - 3:10   Alicia Karspeck

Please contact Matt with any questions.
mjpdenver at gmail.com

++ Extended Abstracts

Steven Anderson - Metropolitan State College of Denver
Statistical Methods for determining optimal rifle cartridge dimensions
We have designed and carried out a statistical study to determine the
optimal cartridge dimensions for a Savage 10FLP law enforcement grade
rifle.  Optimal performance is defined as minimal distance from the
target's bull's-eye, and minimal group diameter.  A full factorial
block design with two main factors and one blocking factor was used.
The two main factors were bullet seating depth and powder charge.  The
experimental units were individual shots taken from a bench-rest
position and fired into separate targets.  Additionally, thirteen
covariates describing various cartridge dimensions were recorded.  The
data analysis includes ANOVA and multiple regression.   We will
describe the experiment, the analysis, and some results.

Matt Benigni - Colorado School of Mines
"Analysis of Periodic Spatio-Temporal Improvised Explosive Device
Attack Patterns"
Roadside bombs are by far the number one killer of U.S. and Coalition
soldiers in Iraq.  This research extends standard non-parametric
spatio-temporal density fitting techniques, and attempts to assess
risk levels for differing routes of travel based on past attacks.

Deborah Glueck - University of Colorado Health Sciences Center
Why Two Mammograms May Be Better Than One: the Science and the Statistics
D.H. Glueck, M.M. Lamb, J.M. Lewin and K. E. Muller. Glueck et al.,
2007 compared the area under the ROC curve of full-field digital
mammography, screen-film mammography, and a combined technique that
allowed diagnosis if a finding was suspicious on film, on digital, or
on both. We used data on paired film and digital mammograms
performed in 4,489 women (Lewin et al., 2002). At the
Bonferroni-corrected 0.025 alpha level, there were significant
differences between the film and combined ROC curves (difference =
0.07, p = 0.01) and between the digital and combined ROC curves
(difference = 0.12, p = 0.001). Using two mammograms, one film and one
digital, significantly
increased the diagnostic accuracy. In general, a combined test can
have better diagnostic accuracy than either component test alone, as
occurred in the mammography example. However, the combined test can
also be worse. Characterizing the combined ROC curve analytically
demonstrates that the correlations between the two component tests for
women with cancer, and for women without cancer, govern the gain in
sensitivity and the loss in specificity when combining test results.
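
The trade-off described above can be illustrated with a small worked
sketch.  It is not the analysis from the talk.  It assumes each
component test is reduced to a single positive/negative call, that the
combined test is positive when either component is positive, and that
the dependence between the two calls is summarized by a correlation
coefficient within each group of women.  The function names and all
numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt


def joint_positive(p1, p2, rho):
    """P(both tests positive) for two binary tests with positive rates
    p1 and p2 and correlation rho between the two binary results."""
    joint = p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    # A given rho is only feasible if the joint probability stays
    # within the bounds implied by the two marginal rates.
    lower, upper = max(0.0, p1 + p2 - 1), min(p1, p2)
    if not (lower - 1e-12 <= joint <= upper + 1e-12):
        raise ValueError("rho is not feasible for these marginal rates")
    return joint


def combine_either_positive(se1, sp1, se2, sp2, rho_cancer, rho_no_cancer):
    """Sensitivity and specificity of the 'positive on either test' rule.

    All inputs are hypothetical illustration values, not study results.
    """
    # Women with cancer: detected if at least one test is positive.
    sensitivity = se1 + se2 - joint_positive(se1, se2, rho_cancer)
    # Women without cancer: a false positive on either test counts.
    fp1, fp2 = 1 - sp1, 1 - sp2
    false_positive = fp1 + fp2 - joint_positive(fp1, fp2, rho_no_cancer)
    return sensitivity, 1 - false_positive


# Hypothetical component tests with sensitivity 0.70 and specificity 0.90.
independent = combine_either_positive(0.70, 0.90, 0.70, 0.90, 0.0, 0.0)
identical = combine_either_positive(0.70, 0.90, 0.70, 0.90, 1.0, 1.0)
print(independent, identical)
```

With uncorrelated tests the combined rule gains sensitivity and loses
specificity; with perfectly correlated tests it gains and loses
nothing, which is the sense in which the correlations control the
trade-off.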

Stacey Hancock - Colorado State University
Estimating Structural Breaks in Nonstationary Time Series Data

Abstract: Many time series exhibit structural breaks in a variety of
ways, the most obvious being a mean level shift. In this case, the
mean level of the process is constant over periods of time, jumping to
different levels at times called "changepoints". These jumps may be
due to outside influences such as changes in government policy or
manufacturing regulations.  Structural breaks may also be a result of
changes in variability or changes in the spectrum of the process. The
goal of this research is to estimate where these structural breaks
occur and to provide a model for the data within each stationary
segment. The program AutoPARM (Automatic Piecewise AutoRegressive
Modeling procedure), developed by Davis, Lee, and Rodriguez-Yam
(2006), uses the minimum description length principle to estimate the
number and locations of changepoints in a time series by fitting
autoregressive models to each segment. This research shows that when
the true underlying model is segmented autoregressive, the estimates
obtained by AutoPARM are consistent. Under a more general time series
model exhibiting structural breaks, AutoPARM's estimate of the number
of changepoints is again consistent, and the segmented autoregressive
model provides a useful approximation to the true process.
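
For readers new to changepoints, the mean level shift case can be
sketched in a few lines.  This is not AutoPARM and does not use the
minimum description length principle.  It is a minimal least-squares
search that assumes exactly one changepoint and a constant mean within
each segment.  The function names, the min_seg parameter, and the
example data are hypothetical.

```python
def sum_sq_dev(values):
    """Sum of squared deviations of values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def estimate_mean_shift(series, min_seg=2):
    """Estimate a single mean-shift changepoint by least squares.

    Returns (k, left_mean, right_mean), where series[:k] and series[k:]
    are the two segments.  Assumes exactly one changepoint exists.
    """
    n = len(series)
    if n < 2 * min_seg:
        raise ValueError("series too short for two segments")
    best_k, best_sse = None, float("inf")
    # Try every admissible split and keep the one with the smallest
    # total within-segment sum of squares.
    for k in range(min_seg, n - min_seg + 1):
        sse = sum_sq_dev(series[:k]) + sum_sq_dev(series[k:])
        if sse < best_sse:
            best_k, best_sse = k, sse
    left, right = series[:best_k], series[best_k:]
    return best_k, sum(left) / len(left), sum(right) / len(right)


# Made-up series whose mean jumps from about 0 to about 5.
example = [0.0, 0.1, -0.1, 0.0, 0.05, 5.0, 5.1, 4.9, 5.0, 5.05]
print(estimate_mean_shift(example))
```

Estimating an unknown number of changepoints, or breaks in variability
or spectrum, needs a model selection criterion such as the one AutoPARM
uses; the sketch above deliberately leaves that out.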

Alicia Karspeck - National Center for Atmospheric Research
Ensemble data assimilation for estimation of "El Nino relevant" variability

This talk will include some background on the El Nino climatic pattern
in the tropical Pacific Ocean and atmosphere and a discussion of a
numerical model that is commonly used for simulation and prediction.
I will then discuss the use of an ensemble Kalman filter for
estimation of the variability, including some challenges this system
poses for ensemble Kalman filtering.

Mary Meyer - Colorado State University
Shape-Restricted Regression Splines in Action

Regression splines are smooth, flexible, and parsimonious
nonparametric function estimators.  They are known to be sensitive to
knot number and placement, but if assumptions such as monotonicity or
convexity may be imposed on the regression function, the
shape-restricted regression splines are robust to knot choices.
Monotone regression splines were introduced by Ramsay (1988), but were
limited to quadratic and lower order.  An algorithm for the cubic
monotone case is proposed, and the method is extended to convex
constraints and variants such as increasing-concave.  The restricted
versions have smaller squared error loss than the unrestricted splines,
although they have the same convergence rates.  The relatively small
degrees of freedom of the model and the insensitivity of the fits to
the knot choices allow for practical inference methods;  the
computational efficiency allows for back-fitting of additive models.
Tests of constant versus increasing and linear versus convex
regression functions, when implemented with shape-restricted regression
splines, have higher power than the standard version using ordinary
shape-restricted regression.

Colin O'Donnell - University of Colorado Health Sciences Center
A Likelihood Model That Accounts for Censoring Due to Fetal Loss Can
Accurately Test the Effects of Maternal and Fetal Genotype on the
Probability of Miscarriage

Heritable maternal and fetal thrombophilia and/or hypofibrinolysis are
important causes of miscarriage. Under the constraint that fetal
genotype is observed only after a live birth, estimating risk is
complicated. Censoring prevents use of published statistical
methodology. We propose techniques to determine whether increases in
miscarriage are due to the fetal genotype, maternal genotype, or both.
 We propose a study to estimate the risk of miscarriage contributed by
an allele, expressed in either dominant or recessive fashion. Using a
multinomial likelihood, we derive maximum likelihood estimates of risk
for different genotype groups. We describe likelihood ratio tests and
a planned hypothesis testing strategy.

Parameter estimation is accurate (bias < 0.0011, root mean squared
error < 0.0780, N = 500). We used simulation to estimate power for
studies of three gene mutations: the 4G hypofibrinolytic mutation in
the plasminogen activator inhibitor gene (PAI-1), the prothrombin
G20210A mutation, and the Factor V Leiden mutation. With 500 families,
our methods have approximately 90% power to detect an increase in the
miscarriage rate of 0.2, above a background rate of 0.2.  Our
statistical method can determine whether increases in miscarriage are
due to fetal genotype, maternal genotype, or both despite censoring.

Alex Przedpelski - Monarch High School, Louisville, CO
Predicting Home Attendance for the Colorado Rockies or something of the sort.

The final goal of my project is to be able to predict the attendance
of Colorado Rockies home games. To do this, I investigated the
attendance figures from 2007 and potential factors that might have an
effect on them.  Over 20 individual variables were considered, ranging
from characteristics of the opposing team to short and long-term
success of the Rockies to weather. Data from all of these variables
are entered into Microsoft Excel (dummy/indicator variables were used
for a few variables). Using the linear least-squares method, Excel
generates an equation using any combination of variables you select.
Although I have created equations with adjusted R-squared values in the
high .80s using variables that have p-values below .10, this is still a
work in progress.  There is much more to be done, including acquiring
the elusive precipitation forecast percentages and applying the
equation to different years to analyze how the variables and/or their
coefficients affecting attendance change over time.
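
As a rough illustration of the least-squares fit and the adjusted
R-squared mentioned above, here is a minimal sketch.  It uses Python
rather than Excel, a single indicator variable rather than the 20 or
so variables in the project, and made-up attendance numbers.  The
function names and the weekend indicator are assumptions for the
example only.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def adjusted_r_squared(y, fitted, n_predictors):
    """Adjusted R-squared: R-squared penalized for predictor count."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - sse / sst
    return 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)


# Made-up data: 1 marks a weekend game, attendance is in thousands.
weekend = [0, 0, 0, 1, 1, 1]
attendance = [20, 22, 24, 30, 32, 34]

intercept, slope = fit_line(weekend, attendance)
fitted = [intercept + slope * w for w in weekend]
print(intercept, slope, adjusted_r_squared(attendance, fitted, 1))
```

With an indicator variable the intercept is the average weekday
attendance and the slope is the average weekend bump, which makes the
fitted equation easy to read.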

Damian Wandler - Colorado State University
Fiducial Inference and The Generalized Pareto

This talk will serve as an introduction to fiducial inference.  The
fiducial idea defines a distribution of the parameters that uses all
the information that the data contains.  It is similar to a Bayesian
approach except that it does not need a prior distribution.  Like
Bayesian inference, the fiducial idea produces a "posterior"-like
density that we can use to calculate confidence intervals for the
parameters of our choice.  In this talk I will focus on fiducial
inference when dealing with the generalized Pareto distribution.
Specifically, it is of interest to calculate confidence intervals for
the quantiles of the generalized Pareto and compare this method with
the current MLE methods to see if we compare favorably.
