Hi wonderful MET help team!

I told you it wouldn't be long before you heard from me! I hope you're all
doing well.

I'm running into a few issues with StatAnalysis, which I run on an HPC
system via Singularity, starting from the DTC Docker image. I am using
version 10.0.0.

1. I run into a performance issue when using multiple *-by* options. For
example, I run this command with a single *-by* option:
"stat_analysis -lookin <stat_file_directory> \
-job aggregate_stat -line_type MPR -out_line_type CNT \
-out_stat <out_put_file>  \
-fcst_var TMP -obs_var TMP -fcst_lead 06 -fcst_init_beg 2021060112 \
-fcst_init_end 2021060712 -by OBS_SID -set_hdr VX_MASK OBS_SID \
-set_hdr DESC CASE -out_bin_size 1 -v 3"

It runs in about 1 minute and 20 seconds (7000 lines). If I instead use *-by
FCST_VAR, OBS_SID*, the job consumes all 125 GB of available compute node
RAM before crashing. Do you know why this would occur? I am following the
example in the NRL tutorial StatAnalysis presentation (slide 14), which uses
multiple -by statements. I tried turning on debugging (-v 4), but I don't
see any related messages.

2. I run into a second performance issue when running any command with the
following flags and settings: *-job aggregate_stat -line_type MPR.* For each
matched pair, a CDF is calculated with the default of 20 thresholds.
However, for each matched pair after the first, the previous matched
pair's CDF thresholds are added again, like so:

"DEBUG 4: ClimoCDFInfo::set_cdf_ta() -> For "cdf_bins" (20) and
"center_bins" (false), defined climatology CDF thresholds:
DEBUG 4: ClimoCDFInfo::set_cdf_ta() -> For "cdf_bins" (20) and
"center_bins" (false), defined climatology CDF thresholds:"

The growth is cumulative, so when running over 5000 matched pairs, the last
one has 100,000 thresholds attached to it (5000 x 20). The job becomes so
expensive that it never finishes. I was able to circumvent this problem by
adding the flag *-out_bin_size 1*, but I still figured you would want to
know about it.
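
To put rough numbers on that growth, here is a small sketch of the
arithmetic. It assumes that each matched pair simply appends another 20
thresholds to the running list, which is my reading of the debug output
rather than something I have confirmed in the MET source. The function
names are made up for illustration and are not part of MET.

```python
# Assumption: every matched pair appends cdf_bins more thresholds to the
# running list, as the repeated DEBUG 4 messages suggest.
CDF_BINS = 20  # default "cdf_bins" value reported in the DEBUG 4 output


def thresholds_on_pair(n, cdf_bins=CDF_BINS):
    """Thresholds attached to the n-th matched pair (1-based)."""
    return n * cdf_bins


def thresholds_total(n_pairs, cdf_bins=CDF_BINS):
    """Thresholds summed over all pairs: 20 + 40 + ... + 20 * n_pairs."""
    return cdf_bins * n_pairs * (n_pairs + 1) // 2


print(thresholds_on_pair(5000))  # 100000
print(thresholds_total(5000))    # 250050000
```

If that assumption holds, the total work grows with the square of the
number of matched pairs, which would be consistent with the job never
finishing.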

That's all I have for now. Thank you in advance for your help!


Lindsay Blank

Forecast Verification and Analytics Developer


Spire Global, Inc

San Francisco | Boulder | Washington D.C. | Singapore | Glasgow | Luxembourg

