<p><b>dmbarker</b> 2008-06-03 15:25:23 -0600 (Tue, 03 Jun 2008)</p><p>Modified var chapter. Dale Barker<br>

</p><hr noshade><pre><font color="gray">Modified: trunk/wrf/technote/var.tex

===================================================================

--- trunk/wrf/technote/var.tex        2008-06-03 21:21:13 UTC (rev 79)

+++ trunk/wrf/technote/var.tex        2008-06-03 21:25:23 UTC (rev 80)

@@ -80,10 +80,10 @@

 \vspace{0.5cm}

 b) Observations ${\bf y^{o}}$--- In the current version of WRF-Var, observations may be 

-supplied either in a text (MM5 3D-Var) format or BUFR format (but not a 

-combination of the two). An observation preprocessor (3DVAR$\_$OBSPROC) 

-is supplied with the code release to perform basic quality control, assign 

-observation errors, and reformat observations from the MM5 {\it little$\_$r} text 

+supplied either in PREPBUFR format ({\it ob\_format=1}) or an ASCII &quot;little\_r&quot; format

+({\it ob\_format=2}). An observation preprocessor (3DVAR$\_$OBSPROC) 

+is supplied with the code release to perform basic quality control, assign &quot;total&quot; 

+observation errors (${\bf R = E+F}$ in Fig. \ref{var-sketch}), and reformat observations from the MM5 {\it little$\_$r} text 

 format into 3D-Var's own text format. Details can be found in \citet{barker03, barker04}.

 \vspace{0.5cm}

@@ -105,10 +105,9 @@

 variational tuning approaches \citep{desroziers01}.

 Following assimilation of all data, an analysis ${\bf x^{a}}$ is produced that must be 

-merged with the existing lateral boundary conditions ${\bf x^{lbc}}$ (described in 

-\citet{barker03}). Note: In cycling mode, only the {\it wrfbdy} lateral boundary condition 

-files (${\bf x^{lbc}}$) output of SI/real are used, and not the {\it wrfinput} initial condition 

-files (${\bf x^{b}}$). In cold-start mode, both are required.

+merged with the existing lateral boundary conditions ${\bf x^{lbc}}$ in the {\it WRF\_BC} 

+utility (\citet{barker03}). At this stage, the {\it wrfbdy} lateral boundary condition 

+files (${\bf x^{lbc}}$) output of WPS/real are updated to make the lateral boundaries consistent with the analysis, and surface fields (e.g. SST) are also updated in the {\it wrfinput} analysis file.

 \section{Improvements to the WRF-Var Algorithm}

 \label{var-upgrade}

@@ -131,23 +130,19 @@

 \subsection{Improved minimization and ``outer loop&quot;}

-The default WRF-Var cost function minimization uses a modified version of the limited 

-memory Quasi-Newton Method (QNM). Recently, an alternative Conjugate Gradient 

-Method (CGM) has been implemented. Unlike the QNM technique, the CGM method 

-restricts 3D-Var's inner loop to be completely linear. This limitation is dealt with through 

-the inclusion of an outer loop in WRF-Var, the purpose of which is to iterate towards 

+Prior to WRF-Var V3.0, the default WRF-Var cost function minimization used a modified 

+version of the limited memory Quasi-Newton Method (QNM). In V3.0, an alternative 

+Conjugate Gradient Method (CGM) has been implemented. Unlike the QNM technique, 

+the CGM method restricts WRF-Var's inner loop to be completely linear. This limitation is dealt 

+with through the inclusion of an outer loop in WRF-Var, the purpose of which is to iterate towards 

 nonlinear solutions (e.g., observation operators, balance constraints, and the forecast itself in 

-4D-Var) using the WRF-Var analysis from the previous iteration as new background. The 

+4D-Var) using the WRF-Var analysis from the previous iteration as new first guess. The 

 outer loop is also used as a form of variational quality control as follows: observations are 

-rejected if their O-B values are outside a prescribed range (typically several times the 

-observation error standard deviation). This {\it errormax} test implicitly assumes the rejected 

-large O-B values are due to a bad observation (O) rather than poor background (B). 

-However, if it is the background B that is incorrect then the system will reject the most 

-useful observations available to the assimilation system, i.e., those in areas where the 

-first-guess is poor. The outer loop alleviates this effect by allowing observations 

-rejected in previous iterations to be accepted if their new O-B falls within the required range 

-in subsequent outer loops. The assimilation of nearby observations in previous iterations 

-essentially provides a ``buddy check&quot; to the observation in question.

+rejected if the magnitude of the observation minus first guess difference is larger than a 

+specified threshold (typically several times the observation error standard deviation). This {\it errormax} test implicitly assumes the first guess is accurate. However, in cases when this assumption breaks

+down (i.e. in areas of large forecast error), there is a danger that good observations might be rejected in areas where they are most valuable. The outer loop alleviates this effect by allowing observations 

+rejected in previous iterations to be accepted if their updated observation minus analysis differences

+pass the errormax QC check in subsequent outer loops. The assimilation of nearby observations in previous iterations essentially provides a ``buddy check&quot; to the observation in question.

 \subsection{Choice of control variables}

 \label{var-cvs}

@@ -180,8 +175,7 @@

 (observation minus first guess difference), and hence a better use of observations when 

 their valid time differs from that of the analysis. 

 FGAT is most effective for the analysis of observations from 

-asynoptic, moving platforms (e.g., aircraft and satellite data). Surface observations with 

-high temporal resolution also benefit from the use of FGAT.

+asynoptic, moving platforms (e.g., aircraft and satellite data).

 \subsection{Radar Data Assimilation}

@@ -213,7 +207,7 @@

 boundary conditions. Of course, there are also scientific questions

 concerning the optimal mix of observations required for

 global/regional models, and the choice of control variables and

-balance constraints. A unified global/regional 3D-Var system should

+balance constraints. A unified global/regional data assimilation system should

 therefore be flexible to a variety of thinning/quality-control

 algorithms and also to alternative formulations of the background

 error covariance matrix. This flexibility has been a key design

@@ -232,8 +226,7 @@

 correlation defined in spectral space is also a weakness---

 anisotropies need to be defined in an alternative manner. One solution

 to this problem is to replace the spectral correlations with

-grid-point correlations (e.g., in the Gridpoint Statistical

-Interpolation scheme under development at NCEP). An alternative

+grid-point correlations \citep{purser03}. An alternative

 technique is to supplement the isotropic spectral correlations with an

 anisotropic component derived via grid transformations, additional

 control variables or 4D-Var. Research using the latter techniques is

@@ -275,8 +268,8 @@

 completely define the analysis response away from observations. The latter impact is 

 particularly important in data-sparse areas of the globe. Unlike ensemble filter data 

 assimilation techniques (e.g., the Ensemble Adjustment Kalman Filter, the Ensemble 

-Transform Kalman Filter), 3/4D-Var systems do not implicitly evolve forecast error 

-covariances in real-time. Instead, climatologic statistics are usually estimated offline. 

+Transform Kalman Filter), 3/4D-Var systems do not explicitly evolve forecast error 

+covariances in real-time (although both 4D-Var and hybrid variational/ensemble data assimilation techniques currently being developed within WRF-Var implement flow-dependent covariances implicitly). Instead, climatologic statistics are usually estimated offline. 

 The ``NMC-method&quot;, in which forecast error covariances are approximated using 

 forecast difference (e.g., T+48 minus T+24) statistics, is a commonly used approach 

 \citep{parrish92}. Experiments at ECMWF \citep{fisher03} indicate superior statistics may 

@@ -294,17 +287,12 @@

 required to specify and implement flow-dependent error covariances in 3/4D-Var is 

 significant.

-The NMC-method code developed for MM5 3D-Var \citet{barker04} is nearing the 

-end of its useful life. The development of a unified global/regional WRF-Var system, and 

-its application to a variety of models (e.g., ARW, MM5, KMA global model, 

-Taiwan's Nonhydrostatic Forecast System [NFS]) has 

-required a new, efficient, portable forecast background error covariance calculation 

-code to be written. There is also a demand for such a capability to be available and 

-supported for the wider 3/4D-Var research community for application to their own 

-geographic areas of interest (the default statistics supplied with the WRF-Var 

-release are designed only as a starting point). In this section, the new {\it gen$\_$be} code 

-developed by NCAR/MMM to generate forecast error statistics for use with the 

-WRF-Var system is described.

+The development of a unified global/regional WRF-Var system, and its widespread use

+in the WRF community have necessitated the development of a new, efficient, portable forecast background error covariance calculation code. Numerous applications have also indicated

+that superior results are obtained if one invests effort in calculating domain-specific 

+error covariances, instead of using the default statistics supplied with the WRF-Var 

+release. In this section, the new {\it gen$\_$be} code developed by NCAR/MMM to generate 

+forecast error statistics for use with the WRF-Var system is described.

 The background error covariance matrix is defined as 

@@ -325,15 +313,6 @@

 result is an ensemble of model perturbation vectors from which estimates of 

 background error may be derived. The new {\it gen$\_$be} utility has been designed to work with 

 either forecast difference, or ensemble-based, perturbations.

-

-As described above, the WRF-Var background error covariances are specified not in 

-model space ${\bf x'}$, but in a control variable space ${\bf v}$, which is related to the model variables 

-(e.g., wind components, temperature, humidity, and surface pressure) via the control 

-variable transform defined in (\ref{var-cv}). Both (\ref{var-cv}) and 

-its adjoint are required in WRF-Var. In contrast, the background error code performs the 

-inverse control variable transform ${\bf v}={\rm U}_{h}^{-1} {\rm U}_{v}^{-1} {\rm U}_{p}^{-1}{\bf x'}$ in order to 

-accumulate statistics for each component of the control vector ${\bf v}$.

-

 Using the NMC-method, ${\bf x}'={\bf x_{T2}}-{\bf x_{T1}}$ where $T2$ and $T1$ 

 are the forecast difference times (e.g., 48h minus 24h for global, 24h minus 12h for regional). 

 Alternatively, for an ensemble-based approach, ${\bf x_{k}}'={\bf x_{k}}-\bar{\bf 

@@ -343,6 +322,14 @@

 Using the NMC-method, $n_e=1$ (1 forecast difference per time). For ensemble-based 

 statistics, $n_e$ is the number of ensemble members.

+As described above, the WRF-Var background error covariances are specified not in 

+model space ${\bf x'}$, but in a control variable space ${\bf v}$, which is related to the model variables 

+(e.g., wind components, temperature, humidity, and surface pressure) via the control 

+variable transform defined in (\ref{var-cv}). Both (\ref{var-cv}) and 

+its adjoint are required in WRF-Var. To enable this, the (offline) background error utility is used

+to compute components of the forecast error covariance matrix modeled within the 

+${\rm U}$ transform. This process is described in the following subsections.

+

 The background error covariance generation code {\it gen$\_$be} is designed to process

 data from a variety of regional/global models (e.g., ARW, MM5, KMA global model, 

 NFS, etc.), and process it in order to provide error 

@@ -390,12 +377,6 @@

 \subsection{Multivariate Covariances: Regression coefficients and unbalanced variables}

-The WRF-Var system permits a variety of background error covariance

-models to be employed, as described in Section \ref{var-cvs}

-above. 

-The utility {\it gen$\_$be} is used to provide background error

-statistics only for cv$\_$options 4 and 5.  

-

 The second stage

 of {\it gen$\_$be (gen$\_$be$\_$stage2)} provides statistics for the

 unbalanced fields $\chi_u$, $T_u$, and $P_{su}$ used as control

@@ -408,33 +389,33 @@

 \citet{wu02} for further details). The resulting regression coefficients 

 are output for use 

 in WRF-Var's ${\rm U}_p$ transform. Currently, three regression analyses are

-performed resulting in three sets of regression coefficients (Note:

+performed resulting in three sets of regression coefficients (note:

 The perturbation notation has been dropped for the 

 remainder of this chapter for clarity.):

 \begin{itemize}\setlength{\parskip}{-4pt}

-\item   Velocity potential/streamfunction regression: $\chi_b=c\psi$;

-\item        Temperature/streamfunction regression: $T_{b,k1}=\sum_{k2}G_{k1,k2}\psi_{k2}$; and

-\item        Surface pressure/streamfunction regression: $p_{sb}=\sum{k}W_{k}\psi_{k}$.

+\item   Velocity potential/streamfunction regression: $\chi_b(k)=c(k)\psi(k)$;

+\item        Temperature/streamfunction regression: $T_b(k)=\sum_{k1}G(k1,k)\psi(k1)$; and

+\item        Surface pressure/streamfunction regression: $p_{sb}=\sum_{k1}W(k1)\psi(k1)$.

 \end{itemize}

-Data is read from all $n_f \times n_e$ files and sorted into bins defined via the namelist 

-option {\it bin$\_$type}. Regression coefficients $G(k1,k2)$ and $W(k)$ are computed 

-individually for each bin (bin$\_$type=1 is used here, representing latitudinal dependence) 

-in order to allow representation of differences between, for example, polar, mid-latitude, and 

-tropical dynamical and physical processes. In addition, the scalar coefficient $c$ used to 

-estimate velocity potential errors from those of streamfunction is calculated as a function 

-of height to represent, for example, the impact of boundary-layer physics. Latitudinal/height 

+The summation over the vertical index $k1$ relates to the integral (hydrostatic) relationship between

+mass fields and the wind field. By default, the regression coefficients $c$, $G$, and $W$ do 

+not vary horizontally; however, options exist to relax this assumption via the {\it bin\_type} 

+namelist variable in order to allow representation of differences between, for example, polar, mid-latitude, and tropical dynamical and physical processes. The scalar coefficient $c$ used to 

+estimate velocity potential errors from those of streamfunction is permitted to vary with model

+level in order to represent, for example, the impact of boundary-layer physics. Latitudinal/height 

 smoothing of the resulting coefficients may be optionally performed to avoid artificial 

-discontinuities at the edges of latitude/height boxes.

+discontinuities at the edges of latitude/height boxes (see the future WRF-Var technical note for

+details of these &quot;expert&quot; features).

 Having computed regression coefficients, the unbalanced components of the fields are 

-calculated as $\chi_u=\chi-c\psi$, $T_{u,k1}=T_{k1}-\sum_{k2}G_{k1,k2}\psi_{k2}$, 

-and $p_{su}=p_s - \sum_{k} W_{k}\psi_{k}$. These fields are output for the 

+calculated as $\chi_{u}(k)=\chi(k)-c(k)\psi(k)$, $T_{u}(k)=T(k)-\sum_{k1}G(k1,k)\psi(k1)$, 

+and $p_{su}=p_s - \sum_{k1} W(k1)\psi(k1)$. These fields are output for the 

 subsequent calculation of the spatial covariances as described below.

 \subsection{Vertical Covariances: Eigenvectors/eigenvalues and 

-control variable projections}

+control variable projections} 

 The third stage ({\it gen$\_$be$\_$stage3}) of {\it gen$\_$be}

 calculates the statistics required for the vertical component of the

@@ -448,12 +429,8 @@

 The {\it gen$\_$be} code calculates both domain-averaged and local

 values of the vertical component of the background error covariance

-matrix. The definition of local again depends on the value of the

-namelist variable bin$\_$type chosen. For example, for bin$\_$type=1,

-a $kz \times kz$ (where $kz$ is the number of vertical levels) vertical

-component of $\bf B$ is produced at every latitude (data is averaged

-over time and longitude) for each control variable. Eigendecomposition

-of the resulting climatological vertical error covariances ${\bf

+matrix. Eigendecomposition of the resulting $K\times K$ ($K$ is the number of 

+vertical levels) climatological vertical error covariance matrix ${\bf

 B}={\bf E}{\Lambda}{\bf E}^{T}$ results in both domain-averaged and

 local eigenvectors $\bf E$ and eigenvalues $\Lambda$. Both sets of

 statistics are included in the dataset supplied to WRF-Var, allowing

@@ -471,12 +448,12 @@

 The last aspect of the climatological component of background error

 covariance data required for WRF-Var is the horizontal error

 correlations, the representation of which forms the largest difference

-between running WRF-Var in regional and global mode. (It is however,

-still a fairly local change.)

+between running WRF-Var in regional and global mode; the rest of 

+{\it gen\_be} is essentially the same for both regional and global models.

 In a global application ({\it gen\_be\_stage4\_global}), power spectra

-are computed for each of the $kz$ vertical modes of the 3D control

-variables $\psi$, $\chi_u$, $T_u$, and $r$, and for the 2D control

+are computed for each of the $K$ vertical modes of the 3D control

+variables $\psi$, $\chi_u$, $T_u$, and relative humidity $r$, and for the 2D control

 variable $p_{su}$ data. In contrast, in regional mode, horizontal

 correlations are computed between grid-points of each 2D field, binned

 as a function of distance. A Gaussian curve is then fitted to the data

</font>

</pre>