# Data Compression

A large number of stochastic realisations of a random field requires a large amount of memory and powerful computational resources. To decrease the memory requirements and the computing time, we propose using a low-rank approximation for all realisations of the solution [13]. This low-rank approximation allows an efficient postprocessing (computation of the mean value, the variance, the exceedance probability) with drastically reduced memory requirements (see Table 4). For each new realisation only the corresponding low-rank update is computed (see, e.g., [1]). This is practical when, e.g., the results of many thousands of Monte Carlo simulations must be computed and stored. Let $v_i \in \mathbb{R}^n$, $i = 1,\dots,Z$, be the (already centred) solution vectors, where $Z$ is the number of stochastic realisations of the solution. We collect all these vectors into the matrix $W = (v_1,\dots,v_Z) \in \mathbb{R}^{n\times Z}$ and consider the factorisation

$$W = AB^T, \quad \text{where } A \in \mathbb{R}^{n\times k} \text{ and } B \in \mathbb{R}^{Z\times k}. \qquad (15)$$

We say that $W$ is a rank-$k$ matrix if the representation in Eq. 15 is given. We denote the class of all rank-$k$ matrices for which the factors $A$ and $B^T$ in Eq. 15 exist by $\mathcal{R}(k, n, Z)$. If $W \in \mathcal{R}(k, n, Z)$, we say that $W$ has a low-rank representation. The first aim is to compute a rank-$k$ approximation $W_k$ of $W$ such that

$$\|W - W_k\| < \varepsilon, \qquad k \ll \min\{n, Z\}.$$

The second aim is to compute, with linear complexity, an update of the approximation $W_k$ for every newly arriving vector $v_{Z+1}$. Below, in Sect. 2.1, we present the algorithm which performs this.

To get the reduced singular value decomposition we omit all singular values that are smaller than a given level $\varepsilon$ or, alternatively, we keep a fixed number of the largest singular values. After the truncation we speak of a reduced singular value decomposition (denoted by rSVD) $W_k = U_k \Sigma_k V_k^T$, where $U_k \in \mathbb{R}^{n\times k}$ contains the first $k$ columns of $U$, $V_k \in \mathbb{R}^{Z\times k}$ contains the first $k$ columns of $V$, and $\Sigma_k \in \mathbb{R}^{k\times k}$ contains the $k$ largest singular values of $\Sigma$.

Basic statistics such as the mean value, the variance, and the exceedance probability can be computed with linear complexity. The following examples illustrate the computation of the mean value and of the variance.

Let us take $A := U_k \Sigma_k$ and $B^T := V_k^T \in \mathbb{R}^{k\times Z}$. Denote the $j$-th row of the matrix $A$ by $a_j \in \mathbb{R}^k$ and the $i$-th column of the matrix $B^T$ by $b_i \in \mathbb{R}^k$. It is evident that if $W$ is given explicitly, one can compute the mean value and the variance while keeping only two vectors in memory, namely the running mean (variance) and the current value. Below we show how to compute the mean and the variance if only $A$ and $B$ are given.

1. One can compute the mean solution $\bar{v} \in \mathbb{R}^n$ as follows:

$$\bar{v} = \frac{1}{Z}\sum_{i=1}^{Z} A b_i = A \bar{b}, \qquad \bar{b} := \frac{1}{Z}\sum_{i=1}^{Z} b_i. \qquad (16)$$
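The identity $\bar{v} = A\bar{b}$ can be checked directly: averaging the $k$-dimensional columns of $B^T$ and multiplying by $A$ once reproduces the dense row-wise mean. A small numpy sketch with hypothetical sizes ($W$ is formed densely only for verification):

```python
import numpy as np

rng = np.random.default_rng(1)
n, Z, k = 100, 40, 4  # illustrative sizes

A = rng.standard_normal((n, k))    # plays the role of Uk @ Sigma_k
Bt = rng.standard_normal((k, Z))   # plays the role of Vk^T
W = A @ Bt                         # dense ensemble, only used to check

b_bar = Bt.mean(axis=1)            # mean of the b_i, cost O(kZ)
v_bar = A @ b_bar                  # cost O(nk)

# Agrees with the dense row-wise mean of W.
assert np.allclose(v_bar, W.mean(axis=1))
```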

| rank $k$ | 5 | 20 |
|---|---|---|
| $\max_x \lvert \bar{p}(x) - \bar{p}_k(x)\rvert$ | 1.7e-6 | 4.2e-10 |
| $\max_x \lvert \mathrm{var}(p)(x) - \mathrm{var}(p)_k(x)\rvert$ | 6.7e-5 | 2.3e-8 |

Table 2: Rank-$k$ approximation errors of the mean and of the variance of the density in Case 1.

The computational complexity is $O(k(Z + n))$, in contrast to $O(nZ)$ for the usual dense data format. The computation of the mean above demonstrates this saving.

2. One can compute the variance of the solution $\mathrm{var}(v) \in \mathbb{R}^n$ by computing the covariance matrix and taking its diagonal. First, one computes the centred matrix

$$W_c := W - \bar{v}\,\mathbf{1}_Z^T, \quad \text{where } \bar{v} = \frac{1}{Z} W \mathbf{1}_Z \text{ and } \mathbf{1}_Z = (1,\dots,1)^T \in \mathbb{R}^Z. \qquad (17)$$

Computing $W_c$ costs $O(k^2(n + Z))$ (addition and truncation of rank-$k$ matrices). By definition, the covariance matrix is $C = \frac{1}{Z-1} W_c W_c^T$. The reduced singular value decomposition of $W_c$, $(W_c)_k = U_k \Sigma_k V_k^T$ with $U_k \in \mathbb{R}^{n\times k}$, $\Sigma_k \in \mathbb{R}^{k\times k}$ and $V_k \in \mathbb{R}^{Z\times k}$, can be computed via the QR algorithm [5, 13]. Now the covariance matrix can be approximated by

$$C_k = \frac{1}{Z-1}\,(W_c)_k (W_c)_k^T = \frac{1}{Z-1}\, U_k \Sigma_k^2 U_k^T.$$

The variance of the solution vector (i.e. the diagonal of the covariance matrix $C$) can then be computed with complexity $O(k^2(Z + n))$.
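The variance computation in step 2 can be sketched as follows: centre the ensemble inside the factor $B^T$, compute the rSVD of the centred low-rank matrix via a QR decomposition of $A$ plus a small SVD, and read the variance off the rows of $U_k\Sigma_k$. All sizes below are made up, and $W$ is formed densely only to verify the result:

```python
import numpy as np

rng = np.random.default_rng(2)
n, Z, k = 100, 40, 3  # illustrative sizes

A = rng.standard_normal((n, k))
Bt = rng.standard_normal((k, Z))
W = A @ Bt  # dense ensemble, only used for checking

# Centre in the factors: subtracting the column mean of B^T gives W_c = A @ Bt_c.
Bt_c = Bt - Bt.mean(axis=1, keepdims=True)

# rSVD of W_c without ever forming it densely: QR of A, then a small k x Z SVD.
Q, R = np.linalg.qr(A)
Us, s, Vt = np.linalg.svd(R @ Bt_c, full_matrices=False)
Uk = Q @ Us  # left singular vectors of W_c

# Variance = diag(C) = diag(Uk Sigma^2 Uk^T) / (Z - 1), only O(nk) work.
var_lr = ((Uk * s) ** 2).sum(axis=1) / (Z - 1)

# Agrees with the dense sample variance of the rows of W.
assert np.allclose(var_lr, W.var(axis=1, ddof=1))
```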

Table 2 shows the rank-5 and rank-20 approximation errors of the mean and of the variance of the density. One can see that both rank-$k$ approximation errors are very small, much smaller than, e.g., the discretisation error or the Monte Carlo error in the computation of the mean value.

Lemma 0.1. Let $\|W - W_k\|_2 \le \varepsilon$ and let $\bar{v}_k$ be the rank-$k$ approximation of the mean $\bar{v}$. Then a) $\|\bar{v} - \bar{v}_k\|_2 \le \varepsilon/\sqrt{Z}$, b) $\|W_c - (W_c)_k\|_2 \le \varepsilon$, c) $\|C - C_k\|_2 \le \frac{1}{Z-1}\varepsilon^2$.

Proof: Since $\bar{v} = \frac{1}{Z} W \mathbf{1}_Z$ and $\bar{v}_k = \frac{1}{Z} W_k \mathbf{1}_Z$, we have

$$\|\bar{v} - \bar{v}_k\|_2 = \frac{1}{Z}\,\|(W - W_k)\mathbf{1}_Z\|_2 \le \frac{1}{Z}\,\|W - W_k\|_2 \cdot \|\mathbf{1}_Z\|_2 \le \frac{\varepsilon}{\sqrt{Z}}.$$

Let $I \in \mathbb{R}^{Z\times Z}$ be the identity matrix; then

$$\|W_c - (W_c)_k\|_2 \le \|W - W_k\|_2 \cdot \Big\|\,I - \frac{1}{Z}\mathbf{1}_Z\mathbf{1}_Z^T\,\Big\|_2 \le \varepsilon,$$

since the centring matrix is an orthogonal projector with spectral norm 1. Finally,

$$\|C - C_k\|_2 \le \frac{1}{Z-1}\,\big\|W_c W_c^T - (W_c)_k (W_c)_k^T\big\|_2 = \frac{1}{Z-1}\,\sigma_{k+1}(W_c)^2 \le \frac{1}{Z-1}\,\varepsilon^2,$$

where $(W_c)_k$ is the rSVD truncation of $W_c$, so that $\sigma_{k+1}(W_c) = \|W_c - (W_c)_k\|_2 \le \varepsilon$. $\square$
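The three bounds of the lemma can also be checked numerically. The sketch below uses SVD truncations of a random test matrix with a decaying spectrum; all names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n, Z, k = 80, 30, 6  # illustrative sizes

# Random test matrix with exponentially decaying singular values.
On, _ = np.linalg.qr(rng.standard_normal((n, Z)))
Oz, _ = np.linalg.qr(rng.standard_normal((Z, Z)))
sing = 2.0 ** -np.arange(Z, dtype=float)
W = On @ np.diag(sing) @ Oz.T

def truncate(M, k):
    """Best rank-k approximation via the reduced SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

Wk = truncate(W, k)
eps = np.linalg.norm(W - Wk, 2)  # = sigma_{k+1}(W)

# a) error of the low-rank mean
v_bar, v_bar_k = W.mean(axis=1), Wk.mean(axis=1)
assert np.linalg.norm(v_bar - v_bar_k) <= eps / np.sqrt(Z) + 1e-12

# b) error of the truncated centred matrix
Wc = W - np.outer(v_bar, np.ones(Z))
Wck = truncate(Wc, k)
assert np.linalg.norm(Wc - Wck, 2) <= eps + 1e-12

# c) error of the low-rank covariance
C = Wc @ Wc.T / (Z - 1)
Ck = Wck @ Wck.T / (Z - 1)
assert np.linalg.norm(C - Ck, 2) <= eps**2 / (Z - 1) + 1e-12
```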

## 2.1 Concatenation of Two Low-Rank Matrices

Let $A$ and $B$ be given such that $W_k = AB^T$. Suppose also that the matrix $W' \in \mathbb{R}^{n\times m}$ contains $m$ new solution vectors. For a small $m$, computing factors $C \in \mathbb{R}^{n\times k}$ and $D \in \mathbb{R}^{m\times k}$ such that $W' \approx CD^T$ is not expensive. Now our purpose is to compute, with linear complexity, the rank-$k$ approximation of $W_{\mathrm{new}} := [W\; W'] \in \mathbb{R}^{n\times (Z+m)}$. To do this, we build the two concatenated matrices $A_{\mathrm{new}} := [A\; C] \in \mathbb{R}^{n\times 2k}$ and $B_{\mathrm{new}}^T := \mathrm{blockdiag}[B^T, D^T] \in \mathbb{R}^{2k\times (Z+m)}$. The difficulty now is that the product $A_{\mathrm{new}} B_{\mathrm{new}}^T$ has rank up to $2k$. The rank-$k$ approximation of the new matrix $W_{\mathrm{new}}$ is computed with linear complexity $O((n + Z)k^2 + k^3)$ (for details see [13]).
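The concatenation step can be sketched with a QR-based recompression, which is one standard way to realise the $O((n+Z)k^2 + k^3)$ truncation: QR-factorise both concatenated factors and take the SVD of the small $2k\times 2k$ core. The helper `truncate_factors` and all sizes below are hypothetical, not the routine from [13]; the dense $W_{\mathrm{new}}$ is built only to verify the result:

```python
import numpy as np

rng = np.random.default_rng(4)
n, Z, m, k = 120, 60, 10, 5  # illustrative sizes

def truncate_factors(A, B, k):
    """Given W ~ A @ B.T, return rank-k factors via QR + a small SVD.
    Cost O((n + Z) k^2 + k^3); W is never formed densely here."""
    Qa, Ra = np.linalg.qr(A)
    Qb, Rb = np.linalg.qr(B)
    U, s, Vt = np.linalg.svd(Ra @ Rb.T, full_matrices=False)
    return Qa @ (U[:, :k] * s[:k]), Qb @ Vt[:k, :].T

# Existing rank-k ensemble W ~ A B^T and m new solution vectors W' ~ C D^T.
A, B = rng.standard_normal((n, k)), rng.standard_normal((Z, k))
C, D = rng.standard_normal((n, k)), rng.standard_normal((m, k))

# Concatenate the factors: A_new is n x 2k, B_new is (Z + m) x 2k.
A_new = np.hstack([A, C])
B_new = np.block([[B, np.zeros((Z, k))],
                  [np.zeros((m, k)), D]])

Ak, Bk = truncate_factors(A_new, B_new, k)

# Verify against the dense concatenation: the error equals the best
# rank-k approximation error sigma_{k+1}(W_new).
W_new = np.hstack([A @ B.T, C @ D.T])
err = np.linalg.norm(W_new - Ak @ Bk.T, 2)
assert err <= np.linalg.svd(W_new, compute_uv=False)[k] + 1e-10
```

Because the SVD is applied only to the $2k\times 2k$ core $R_A R_B^T$, the cost stays linear in $n$ and $Z$.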