<p><b>dmbarker</b> 2008-06-03 15:25:23 -0600 (Tue, 03 Jun 2008)</p><p>Modified var chapter. Dale Barker<br>
</p><hr noshade><pre><font color="gray">Modified: trunk/wrf/technote/var.tex
===================================================================
--- trunk/wrf/technote/var.tex        2008-06-03 21:21:13 UTC (rev 79)
+++ trunk/wrf/technote/var.tex        2008-06-03 21:25:23 UTC (rev 80)
@@ -80,10 +80,10 @@
\vspace{0.5cm}
b) Observations ${\bf y^{o}}$--- In the current version of WRF-Var, observations may be
-supplied either in a text (MM5 3D-Var) format or BUFR format (but not a
-combination of the two). An observation preprocessor (3DVAR$\_$OBSPROC)
-is supplied with the code release to perform basic quality control, assign
-observation errors, and reformat observations from the MM5 {\it little$\_$r} text
+supplied either in PREPBUFR format ({\it ob\_format=1}) or an ASCII ``little\_r'' format
+({\it ob\_format=2}). An observation preprocessor (3DVAR$\_$OBSPROC)
+is supplied with the code release to perform basic quality control, assign ``total''
+observation errors (${\bf R = E+F}$ in Fig. \ref{var-sketch}), and reformat observations from the MM5 {\it little$\_$r} text
format into 3D-Var's own text format. Details can be found in \citet{barker03, barker04}.
\vspace{0.5cm}
@@ -105,10 +105,9 @@
variational tuning approaches \citep{desroziers01}.
Following assimilation of all data, an analysis ${\bf x^{a}}$ is produced that must be
-merged with the existing lateral boundary conditions ${\bf x^{lbc}}$ (described in
-\citet{barker03}). Note: In cycling mode, only the {\it wrfbdy} lateral boundary condition
-files (${\bf x^{lbc}}$) output of SI/real are used, and not the {\it wrfinput} initial condition
-files (${\bf x^{b}}$). In cold-start mode, both are required.
+merged with the existing lateral boundary conditions ${\bf x^{lbc}}$ in the {\it WRF\_BC}
+utility \citep{barker03}. At this stage, the {\it wrfbdy} lateral boundary condition
+files (${\bf x^{lbc}}$) output by WPS/real are updated to make the lateral boundaries consistent with the analysis, and surface fields (e.g., SST) are also updated in the {\it wrfinput} analysis file.
\section{Improvements to the WRF-Var Algorithm}
\label{var-upgrade}
@@ -131,23 +130,19 @@
\subsection{Improved minimization and ``outer loop"}
-The default WRF-Var cost function minimization uses a modified version of the limited
-memory Quasi-Newton Method (QNM). Recently, an alternative Conjugate Gradient
-Method (CGM) has been implemented. Unlike the QNM technique, the CGM method
-restricts 3D-Var's inner loop to be completely linear. This limitation is dealt with through
-the inclusion of an outer loop in WRF-Var, the purpose of which is to iterate towards
+Prior to WRF-Var V3.0, the default WRF-Var cost function minimization used a modified
+version of the limited memory Quasi-Newton Method (QNM). In V3.0, an alternative
+Conjugate Gradient Method (CGM) has been implemented. Unlike the QNM technique,
+the CGM method restricts WRF-Var's inner loop to be completely linear. This limitation is dealt
+with through the inclusion of an outer loop in WRF-Var, the purpose of which is to iterate towards
nonlinear solutions (e.g., observation operators, balance constraints, and the forecast itself in
-4D-Var) using the WRF-Var analysis from the previous iteration as new background. The
+4D-Var) using the WRF-Var analysis from the previous iteration as the new first guess. The
outer loop is also used as a form of variational quality control as follows: observations are
-rejected if their O-B values are outside a prescribed range (typically several times the
-observation error standard deviation). This {\it errormax} test implicitly assumes the rejected
-large O-B values are due to a bad observation (O) rather than poor background (B).
-However, if it is the background B that is incorrect then the system will reject the most
-useful observations available to the assimilation system, i.e., those in areas where the
-first-guess is poor. The outer loop alleviates this effect by allowing observations
-rejected in previous iterations to be accepted if their new O-B falls within the required range
-in subsequent outer loops. The assimilation of nearby observations in previous iterations
-essentially provides a ``buddy check" to the observation in question.
+rejected if the magnitude of the observation minus first guess difference is larger than a
+specified threshold (typically several times the observation error standard deviation). This {\it errormax} test implicitly assumes the first guess is accurate. However, in cases when this assumption breaks
+down (i.e., in areas of large forecast error), there is a danger that good observations might be rejected in areas where they are most valuable. The outer loop alleviates this effect by allowing observations
+rejected in previous iterations to be accepted if their updated observation minus analysis differences
+pass the {\it errormax} QC check in subsequent outer loops. The assimilation of nearby observations in previous iterations essentially provides a ``buddy check'' to the observation in question.
\subsection{Choice of control variables}
\label{var-cvs}
@@ -180,8 +175,7 @@
(observation minus first guess difference), and hence a better use of observations when
their valid time differs from that of the analysis.
FGAT is most effective for the analysis of observations from
-asynoptic, moving platforms (e.g., aircraft and satellite data). Surface observations with
-high temporal resolution also benefit from the use of FGAT.
+asynoptic, moving platforms (e.g., aircraft and satellite data).
\subsection{Radar Data Assimilation}
@@ -213,7 +207,7 @@
boundary conditions. Of course, there are also scientific questions
concerning the optimal mix of observations required for
global/regional models, and the choice of control variables and
-balance constraints. A unified global/regional 3D-Var system should
+balance constraints. A unified global/regional data assimilation system should
therefore be flexible to a variety of thinning/quality-control
algorithms and also to alternative formulations of the background
error covariance matrix. This flexibility has been a key design
@@ -232,8 +226,7 @@
correlation defined in spectral space is also a weakness---
anisotropies need to be defined in an alternative manner. One solution
to this problem is to replace the spectral correlations with
-grid-point correlations (e.g., in the Gridpoint Statistical
-Interpolation scheme under development at NCEP). An alternative
+grid-point correlations \citep{purser03}. An alternative
technique is to supplement the isotropic spectral correlations with an
anisotropic component derived via grid transformations, additional
control variables or 4D-Var. Research using the latter techniques is
@@ -275,8 +268,8 @@
completely define the analysis response away from observations. The latter impact is
particularly important in data-sparse areas of the globe. Unlike ensemble filter data
assimilation techniques (e.g., the Ensemble Adjustment Kalman Filter, the Ensemble
-Transform Kalman Filter), 3/4D-Var systems do not implicitly evolve forecast error
-covariances in real-time. Instead, climatologic statistics are usually estimated offline.
+Transform Kalman Filter), 3/4D-Var systems do not explicitly evolve forecast error
+covariances in real-time (although both 4D-Var and hybrid variational/ensemble data assimilation techniques currently being developed within WRF-Var implement flow-dependent covariances implicitly). Instead, climatologic statistics are usually estimated offline.
The ``NMC-method", in which forecast error covariances are approximated using
forecast difference (e.g., T+48 minus T+24) statistics, is a commonly used approach
\citep{parrish92}. Experiments at ECMWF \citep{fisher03} indicate superior statistics may
@@ -294,17 +287,12 @@
required to specify and implement flow-dependent error covariances in 3/4D-Var is
significant.
-The NMC-method code developed for MM5 3D-Var \citet{barker04} is nearing the
-end of its useful life. The development of a unified global/regional WRF-Var system, and
-its application to a variety of models (e.g., ARW, MM5, KMA global model,
-Taiwan's Nonhydrostatic Forecast System [NFS]) has
-required a new, efficient, portable forecast background error covariance calculation
-code to be written. There is also a demand for such a capability to be available and
-supported for the wider 3/4D-Var research community for application to their own
-geographic areas of interest (the default statistics supplied with the WRF-Var
-release are designed only as a starting point). In this section, the new {\it gen$\_$be} code
-developed by NCAR/MMM to generate forecast error statistics for use with the
-WRF-Var system is described.
+The development of a unified global/regional WRF-Var system, and its widespread use
+in the WRF community, have necessitated the development of a new, efficient, portable forecast background error covariance calculation code. Numerous applications have also indicated
+that superior results are obtained if one invests effort in calculating domain-specific
+error covariances, instead of using the default statistics supplied with the WRF-Var
+release. In this section, the new {\it gen$\_$be} code developed by NCAR/MMM to generate
+forecast error statistics for use with the WRF-Var system is described.
The background error covariance matrix is defined as
@@ -325,15 +313,6 @@
result is an ensemble of model perturbation vectors from which estimates of
background error may be derived. The new {\it gen$\_$be} utility has been designed to work with
either forecast difference, or ensemble-based, perturbations.
-
-As described above, the WRF-Var background error covariances are specified not in
-model space ${\bf x'}$, but in a control variable space ${\bf v}$, which is related to the model variables
-(e.g., wind components, temperature, humidity, and surface pressure) via the control
-variable transform defined in (\ref{var-cv}). Both (\ref{var-cv}) and
-its adjoint are required in WRF-Var. In contrast, the background error code performs the
-inverse control variable transform ${\bf v}={\rm U}_{h}^{-1} {\rm U}_{v}^{-1} {\rm U}_{p}^{-1}{\bf x'}$ in order to
-accumulate statistics for each component of the control vector ${\bf v}$.
-
Using the NMC-method, ${\bf x}'={\bf x_{T2}}-{\bf x_{T1}}$ where $T2$ and $T1$
are the forecast difference times (e.g., 48h minus 24h for global, 24h minus 12h for regional).
Alternatively, for an ensemble-based approach, ${\bf x_{k}}'={\bf x_{k}}-\bar{\bf
@@ -343,6 +322,14 @@
Using the NMC-method, $n_e=1$ (1 forecast difference per time). For ensemble-based
statistics, $n_e$ is the number of ensemble members.
+As described above, the WRF-Var background error covariances are specified not in
+model space ${\bf x'}$, but in a control variable space ${\bf v}$, which is related to the model variables
+(e.g., wind components, temperature, humidity, and surface pressure) via the control
+variable transform defined in (\ref{var-cv}). Both (\ref{var-cv}) and
+its adjoint are required in WRF-Var. To enable this, the (offline) background error utility is used
+to compute components of the forecast error covariance matrix modeled within the
+${\rm U}$ transform. This process is described in the following subsections.
+
The background error covariance generation code {\it gen$\_$be} is designed to process
data from a variety of regional/global models (e.g., ARW, MM5, KMA global model,
NFS, etc.), and process it in order to provide error
@@ -390,12 +377,6 @@
\subsection{Multivariate Covariances: Regression coefficients and unbalanced variables}
-The WRF-Var system permits a variety of background error covariance
-models to be employed, as described in Section \ref{var-cvs}
-above.
-The utility {\it gen$\_$be} is used to provide background error
-statistics only for cv$\_$options 4 and 5.
-
The second stage
of {\it gen$\_$be (gen$\_$be$\_$stage2)} provides statistics for the
unbalanced fields $\chi_u$, $T_u$, and $P_{su}$ used as control
@@ -408,33 +389,33 @@
\citet{wu02} for further details). The resulting regression coefficients
are output for use
in WRF-Var's ${\rm U}_p$ transform. Currently, three regression analyses are
-performed resulting in three sets of regression coefficients (Note:
+performed resulting in three sets of regression coefficients (note:
the perturbation notation has been dropped for the
remainder of this chapter for clarity):
\begin{itemize}\setlength{\parskip}{-4pt}
-\item Velocity potential/streamfunction regression: $\chi_b=c\psi$;
-\item        Temperature/streamfunction regression: $T_{b,k1}=\sum_{k2}G_{k1,k2}\psi_{k2}$; and
-\item        Surface pressure/streamfunction regression: $p_{sb}=\sum{k}W_{k}\psi_{k}$.
+\item Velocity potential/streamfunction regression: $\chi_b(k)=c(k)\psi(k)$;
+\item        Temperature/streamfunction regression: $T_b(k)=\sum_{k1}G(k1,k)\psi(k1)$; and
+\item        Surface pressure/streamfunction regression: $p_{sb}=\sum_{k1}W(k1)\psi(k1)$.
\end{itemize}
-Data is read from all $n_f \times n_e$ files and sorted into bins defined via the namelist
-option {\it bin$\_$type}. Regression coefficients $G(k1,k2)$ and $W(k)$ are computed
-individually for each bin (bin$\_$type=1 is used here, representing latitudinal dependence)
-in order to allow representation of differences between, for example, polar, mid-latitude, and
-tropical dynamical and physical processes. In addition, the scalar coefficient $c$ used to
-estimate velocity potential errors from those of streamfunction is calculated as a function
-of height to represent, for example, the impact of boundary-layer physics. Latitudinal/height
+The summation over the vertical index $k1$ relates to the integral (hydrostatic) relationship between
+mass fields and the wind field. By default, the regression coefficients $c$, $G$, and $W$ do
+not vary horizontally; however, options exist to relax this assumption via the {\it bin\_type}
+namelist variable in order to allow representation of differences between, for example, polar, mid-latitude, and tropical dynamical and physical processes. The scalar coefficient $c$ used to
+estimate velocity potential errors from those of streamfunction is permitted to vary with model
+level in order to represent, for example, the impact of boundary-layer physics. Latitudinal/height
smoothing of the resulting coefficients may be optionally performed to avoid artificial
-discontinuities at the edges of latitude/height boxes.
+discontinuities at the edges of latitude/height boxes (see the future WRF-Var technical note for
+details of these ``expert'' features).
Having computed regression coefficients, the unbalanced components of the fields are
-calculated as $\chi_u=\chi-c\psi$, $T_{u,k1}=T_{k1}-\sum_{k2}G_{k1,k2}\psi_{k2}$,
-and $p_{su}=p_s - \sum_{k} W_{k}\psi_{k}$. These fields are output for the
+calculated as $\chi_{u}(k)=\chi(k)-c(k)\psi(k)$, $T_{u}(k)=T(k)-\sum_{k1}G(k1,k)\psi(k1)$,
+and $p_{su}=p_s - \sum_{k1} W(k1)\psi(k1)$. These fields are output for the
subsequent calculation of the spatial covariances as described below.
\subsection{Vertical Covariances: Eigenvectors/eigenvalues and
-control variable projections}
+control variable projections}
The third stage ({\it gen$\_$be$\_$stage3}) of {\it gen$\_$be}
calculates the statistics required for the vertical component of the
@@ -448,12 +429,8 @@
The {\it gen$\_$be} code calculates both domain-averaged and local
values of the vertical component of the background error covariance
-matrix. The definition of local again depends on the value of the
-namelist variable bin$\_$type chosen. For example, for bin$\_$type=1,
-a $kz \times kz$ (where $kz$ is the number of vertical levels) vertical
-component of $\bf B$ is produced at every latitude (data is averaged
-over time and longitude) for each control variable. Eigendecomposition
-of the resulting climatological vertical error covariances ${\bf
+matrix. Eigendecomposition of the resulting $K\times K$ ($K$ is the number of
+vertical levels) climatological vertical error covariance matrix ${\bf
B}={\bf E}{\Lambda}{\bf E}^{T}$ results in both domain-averaged and
local eigenvectors $\bf E$ and eigenvalues $\Lambda$. Both sets of
statistics are included in the dataset supplied to WRF-Var, allowing
@@ -471,12 +448,12 @@
The last aspect of the climatological component of background error
covariance data required for WRF-Var is the horizontal error
correlations, the representation of which forms the largest difference
-between running WRF-Var in regional and global mode. (It is however,
-still a fairly local change.)
+between running WRF-Var in regional and global mode; the rest of
+{\it gen\_be} is essentially the same for both regional and global models.
In a global application ({\it gen\_be\_stage4\_global}), power spectra
-are computed for each of the $kz$ vertical modes of the 3D control
-variables $\psi$, $\chi_u$, $T_u$, and $r$, and for the 2D control
+are computed for each of the $K$ vertical modes of the 3D control
+variables $\psi$, $\chi_u$, $T_u$, and relative humidity $r$, and for the 2D control
variable $p_{su}$ data. In contrast, in regional mode, horizontal
correlations are computed between grid-points of each 2D field, binned
as a function of distance. A Gaussian curve is then fitted to the data
</font>
</pre>
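<p>The velocity potential/streamfunction regression described in the revised text, chi_b(k) = c(k) psi(k) with unbalanced part chi_u(k) = chi(k) - c(k) psi(k), can be illustrated with a minimal sketch. This is not the gen_be implementation: the function names, the array layout (samples by levels), the least-squares estimator for c(k), and the synthetic data are all assumptions made for illustration only.</p>

```python
import numpy as np

def chi_regression(psi, chi):
    """Per-level least-squares coefficient c(k) for chi(k) ~ c(k) * psi(k).

    psi, chi: perturbation arrays of shape (n_samples, n_levels).
    Hypothetical helper; assumes an ordinary least-squares fit at each level.
    """
    num = np.sum(psi * chi, axis=0)   # sum over samples of psi * chi
    den = np.sum(psi * psi, axis=0)   # sum over samples of psi^2
    return num / den

def unbalanced_chi(psi, chi, c):
    """Unbalanced velocity potential: chi_u(k) = chi(k) - c(k) * psi(k)."""
    return chi - c * psi

# Synthetic perturbations (assumed values, not real forecast differences).
rng = np.random.default_rng(0)
n_samples, n_levels = 500, 4
psi = rng.standard_normal((n_samples, n_levels))
c_true = np.array([0.5, 0.3, 0.1, 0.0])   # coefficient varies with model level
chi = c_true * psi + 0.1 * rng.standard_normal((n_samples, n_levels))

c = chi_regression(psi, chi)
chi_u = unbalanced_chi(psi, chi, c)
```

<p>With a least-squares fit of this kind, the resulting chi_u is uncorrelated with psi at each level over the sample, which is the property that lets the unbalanced field be treated as a separate control variable.</p>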