Fast solvers for tridiagonal Toeplitz linear systems

Let A be a tridiagonal Toeplitz matrix denoted by A = Tritoep(β, α, γ). The matrix A is said to be: strictly diagonally dominant if |α| > |β| + |γ|, weakly diagonally dominant if |α| ≥ |β| + |γ|, subdiagonally dominant if |β| ≥ |α| + |γ|, and superdiagonally dominant if |γ| ≥ |α| + |β|.
In this paper, we consider the solution of a tridiagonal Toeplitz system Ax = b, where A is subdiagonally dominant, superdiagonally dominant, or weakly diagonally dominant. We first consider the case of A being subdiagonally dominant. We transform A into a block 2 × 2 matrix by an elementary transformation and then solve the resulting linear system using the block LU factorization. Compared with the LU factorization method with pivoting, our algorithm takes fewer flops and needs less memory storage and data transmission. In particular, our algorithm outperforms the LU factorization method with pivoting in terms of computing efficiency. We then deal with the superdiagonally dominant and weakly diagonally dominant cases, respectively. Numerical experiments are finally given to illustrate the effectiveness of our algorithms.


Introduction
Consider the direct solution of the following nonsingular linear system:

Ax = b, (1.1)

where A is an n × n tridiagonal Toeplitz matrix of the following form:

A = Tritoep(β, α, γ) =
⎡ α  γ             ⎤
⎢ β  α  γ          ⎥
⎢    ⋱  ⋱  ⋱       ⎥
⎢       β  α  γ    ⎥
⎣          β  α    ⎦. (1.2)

Toeplitz matrices are a class of special structured matrices, which arise from a variety of applications in mathematics, scientific computing, and engineering, for instance, image restoration problems in signal processing, algebraic differential equations, time series, and control theory. Exploiting their structure, some fast and/or accurate algorithms for Toeplitz systems and eigen-problems have been developed, see for instance (Chan and Jin 2007; Chan and Ng 1996; Golub and Van Loan 1996; Liu et al. 2010a, b) and references therein.
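For illustration, the matrix (1.2) and the four dominance classes can be coded directly from the three scalars β, α, γ. The following is a minimal sketch (the helper names `tritoep` and `classify` are ours, not from the paper; a banded or sparse representation would be used in practice instead of a dense one):

```python
def tritoep(n, beta, alpha, gamma):
    """Dense n x n tridiagonal Toeplitz matrix Tritoep(beta, alpha, gamma):
    beta on the subdiagonal, alpha on the diagonal, gamma on the superdiagonal."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = alpha
        if i > 0:
            A[i][i - 1] = beta
        if i < n - 1:
            A[i][i + 1] = gamma
    return A

def classify(beta, alpha, gamma):
    """Dominance class of Tritoep(beta, alpha, gamma).
    Strict dominance is tested first, since it implies weak dominance."""
    if abs(alpha) > abs(beta) + abs(gamma):
        return "strictly diagonally dominant"
    if abs(beta) >= abs(alpha) + abs(gamma):
        return "subdiagonally dominant"
    if abs(gamma) >= abs(alpha) + abs(beta):
        return "superdiagonally dominant"
    if abs(alpha) >= abs(beta) + abs(gamma):
        return "weakly diagonally dominant"
    return "not diagonally dominant"
```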
There are two main types of methods for solving Toeplitz systems: direct methods and iterative methods. Iterative methods consist of classical splitting iteration methods and Krylov subspace iteration methods. Classical splitting iteration methods for Toeplitz systems require efficient splittings which depend on the structure and properties of the coefficient matrices, for example, Gauss-Seidel and SOR splittings (Saad 2003) for H-matrices and Hermitian positive definite matrices, circulant and skew-circulant splitting for positive definite matrices (Liu et al. 2017; Ng 2003), trigonometric transform splitting for real symmetric positive definite matrices (Liu et al. 2018), real symmetric and skew-symmetric splitting for real positive definite matrices (Chen and Jiang 2010; Gu and Tian 2009), which is a tailored version of the Hermitian and skew-Hermitian splitting for general positive definite matrices (Bai et al. 2003), and so on. Krylov subspace iteration methods include the CG method for Hermitian positive definite matrices and the GMRES method for non-Hermitian positive definite matrices, as well as their variants (Chan and Jin 2007; Chan and Ng 1996; Saad 2003). One important property of iteration methods is that only matrix-vector multiplications are involved at each iteration, which can be computed in O(n log n) operations by the FFT or by the DCT and DST, see for instance (Chan and Jin 2007; Chan and Ng 1996; Golub and Van Loan 1996; Liu et al. 2019; Ng 2003). Another important property is that iteration methods are numerically efficient and robust, and well suited for solving large sparse linear systems. Direct methods (see Golub and Van Loan (1996)) are mainly applicable to small and medium model problems, but are often too expensive to be practical for large sparse problems. Furthermore, direct methods may suffer from numerical instability and loss of accuracy in the solutions (Chan and Ng 1996).
Tridiagonal Toeplitz linear systems have their own applications in the mathematical sciences and engineering (Noschese et al. 2013; Rojo 1990; Saad 2003). Due to their special structure, fast and/or accurate solvers have intrigued researchers for decades, see for example (Yan and Chung 1994; Terekhov 2015; Du et al. 2017; Rojo 1990) for serial algorithms and (Garey and Shaw 2001; Kim 1990; McNally et al. 2000) for parallel algorithms. Those algorithms are all numerically reliable if the coefficient matrix A is strictly diagonally dominant. Otherwise, a pivoting strategy is necessary for the LU factorization of A, see for example (Bunch and Marcia 2000). It is known that a pivoting strategy may cause many nonzero fill-ins, which require more memory storage and more floating-point operations. This motivates us to consider new methods for solving (1.1), where A is assumed to be subdiagonally dominant, superdiagonally dominant, or weakly diagonally dominant. Such classes of matrices frequently appear in the numerical solution of one-dimensional convection-diffusion equations via finite difference schemes, see for instance (Saad 2003).
This paper is organized as follows. In the next section, we develop fast algorithms for solving (1.1) with the coefficient matrix A being strictly subdiagonally dominant, strictly superdiagonally dominant, or weakly diagonally dominant. Numerical tests are presented in Sect. 3 to show the effectiveness of our algorithms. A brief conclusion is drawn in Sect. 4, and the acknowledgements are given in the last section.

Fast algorithms
We begin with the strictly subdiagonally dominant case. Let: be the shifted circulant matrix. It is easy to verify that Â = CA has the following block 2 × 2 structure: and admits the following block 2 × 2 LU factorization: (2.2) where −w^T A_11^{−1} p is the so-called Schur complement of Â corresponding to A_11. Equivalently, the problem of solving (1.1) becomes solving: Partitioning x and b̂ into the following forms: and using the block LU factorization (2.2), we can get the solution to (1.1) by solving: (2.4) To solve (2.4), we first solve: (2.5) by the following algorithm.
Algorithm 1 Backward substitution for solving A 11 z = p.
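The body of Algorithm 1 is not reproduced here. Since A_11 is an upper triangular tridiagonal (i.e., upper bidiagonal) Toeplitz matrix, its backward substitution takes the following shape; this is a sketch under the assumption that A_11 has a constant diagonal d and a constant superdiagonal e (the names `d` and `e` are ours, as the explicit entries of A_11 are not shown above):

```python
def backward_substitution(d, e, rhs):
    """Solve T z = rhs, where T is upper bidiagonal Toeplitz with constant
    diagonal d (assumed nonzero) and constant superdiagonal e.
    Runs in O(m) flops for a system of order m."""
    m = len(rhs)
    z = [0.0] * m
    z[m - 1] = rhs[m - 1] / d          # last unknown solved directly
    for i in range(m - 2, -1, -1):     # sweep upward through the rows
        z[i] = (rhs[i] - e * z[i + 1]) / d
    return z
```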
Once we get the solutions u and v to the systems in Eq. (2.5), we can form the solution to (1.1) given by: (2.6) We remark here that to obtain the solution (2.6), we need to solve two auxiliary linear systems in (2.5), where A_11 is an upper triangular tridiagonal matrix. Obviously, the vectors u and v can be easily obtained by backward substitution. Also, due to the strict diagonal dominance of A_11, both computed vectors u and v are reliable. Based on the above analysis, we can now state our algorithm for solving (1.1) as follows.
Algorithm 2 An algorithm for solving tridiagonal Toeplitz linear system Ax = b.
1: Input β, α, γ, w, p, b_2, b_1; 2: Call Algorithm 1 to solve A_11 u = p and A_11 v = b_2 as defined in (2.5); 3: Compute x_n and x_1 according to (2.6). In Algorithm 2, when computing x_1 and x_n, u and v are used twice, while w^T u and w^T v are used only once, so the influence of error propagation can be neglected; therefore, x_n and x_1 are reliable. Thus, we can conclude that our algorithm for solving such a class of linear systems is numerically stable and the computed solutions are reliable.
For the computational complexity, our algorithm takes about 12n flops, fewer than the 13n flops required by the LU factorization method with pivoting. For memory storage, our algorithm needs to store only two n-vectors, fewer than the five n-vectors required by the LU factorization method with pivoting. In particular, our algorithm needs less data transmission: it only reads one vector (the right-hand side vector) and writes one vector (the solution), whereas the LU factorization method with pivoting needs to read five vectors. As modern computers have a multi-level memory hierarchy, with storage levels ranging from smaller high-speed caches to larger low-speed disk storage, data are transferred between different levels of memory during computations. Thus, algorithms with less data transmission may show better computing performance. This makes our algorithm more efficient than the LU factorization method with pivoting. The efficiency of our algorithm is illustrated by numerical results in the next section.

Now, we consider the strictly superdiagonally dominant case. Let J be the exchange matrix with ones on the cross diagonal (bottom left to top right) and zeros elsewhere. Because A is strictly superdiagonally dominant, Ã = JAJ = Tritoep(γ, α, β) becomes strictly subdiagonally dominant. Therefore, we can transform the original linear system (1.1) into the following new one: Ãx̃ = b̃, (2.7) where x̃ = Jx and b̃ = Jb. Thus, applying Algorithm 2 to (2.7) yields the following Algorithm 3 for solving (1.1).
Algorithm 3 An algorithm for solving tridiagonal Toeplitz linear system Ax = b.
Since J is a permutation matrix, the stability, computational complexity, and memory storage of Algorithm 3 are the same as Algorithm 2.
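Since applying J to a vector is just a reversal, the reduction behind Algorithm 3 can be sketched as follows; `subdiag_solver` is a placeholder standing in for Algorithm 2 (which is not reproduced in full above):

```python
def flip_solve(beta, alpha, gamma, b, subdiag_solver):
    """Solve the superdiagonally dominant system Tritoep(beta, alpha, gamma) x = b
    by reducing it to the subdiagonally dominant system
    Tritoep(gamma, alpha, beta) x~ = J b, with J realized as list reversal."""
    b_flipped = b[::-1]                                    # b~ = J b
    x_flipped = subdiag_solver(gamma, alpha, beta, b_flipped)
    return x_flipped[::-1]                                 # x = J x~
```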
Finally, we turn to the case of A being weakly diagonally dominant. It is known that in this case, A is an H-matrix which admits an LU factorization without pivoting. Obviously, this LU factorization does not cause any nonzero fill-ins, but it needs more memory units, see Du et al. (2017) for a similar analysis when A is strictly diagonally dominant. To avoid this problem, we adopt the following strategy: if |β| > |γ|, then we call Algorithm 2 for solving (1.1); if |β| < |γ|, then we call Algorithm 3 for solving (1.1).
Algorithm 4 An algorithm for solving tridiagonal Toeplitz linear system Ax = b.
The computational complexity and memory storage of Algorithm 4 are about the same as Algorithm 2 or 3.
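The dispatch in Algorithm 4 can be sketched as below, comparing β and γ in absolute value; `algorithm2` and `algorithm3` are placeholders for the two solvers described above:

```python
def solve_weakly_dominant(beta, alpha, gamma, b, algorithm2, algorithm3):
    """Weakly diagonally dominant case: the larger off-diagonal entry (in
    modulus) decides which variant is called (a sketch of the strategy)."""
    if abs(beta) > abs(gamma):
        return algorithm2(beta, alpha, gamma, b)   # subdiagonal entry dominates
    return algorithm3(beta, alpha, gamma, b)       # superdiagonal entry dominates
```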

Numerical experiments
In this section, we use some examples to illustrate the effectiveness of our algorithms.
All the numerical tests were done on a Founder desktop PC with an Intel(R) Core(TM) i3-2310M CPU @ 2.10 GHz, using Matlab R2009b with a machine precision of 10^(-16).
For convenience, throughout our numerical experiments, we denote by R = ||b − Ax||/||b||, CPU, NEW, PLU, and LU the relative residual error, the computing time (in seconds), our new algorithm, the LU factorization method with pivoting, and the LU factorization method, respectively. In all tables, CPU is the average of the computing times over 10 runs of the corresponding algorithm, and the right-hand side vector b is taken to be Ae or Ax* with x* = rand(n, 1).
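The reported residual R can be computed in O(n) flops using the tridiagonal structure of A; a minimal sketch (the 2-norm is assumed, as the paper does not specify the norm):

```python
def relative_residual(beta, alpha, gamma, x, b):
    """R = ||b - A x|| / ||b|| for A = Tritoep(beta, alpha, gamma),
    using an O(n) tridiagonal matrix-vector product."""
    n = len(x)
    r = []
    for i in range(n):
        ax = alpha * x[i]                       # diagonal contribution
        if i > 0:
            ax += beta * x[i - 1]               # subdiagonal contribution
        if i < n - 1:
            ax += gamma * x[i + 1]              # superdiagonal contribution
        r.append(b[i] - ax)
    norm = lambda v: sum(t * t for t in v) ** 0.5
    return norm(r) / norm(b)
```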
All tested matrices arise from the following one-dimensional convection-diffusion equation: −au″ + bu′ = 0, x ∈ (0, 1), where the coefficients a and b are constant. When we use centered difference schemes for both the second-order and the first-order derivatives, we get the following example: Example 1 A = Tritoep(−1 − c, 2, −1 + c). When we use the centered difference scheme for the second-order derivative and the backward difference scheme for the first-order derivative, we get the following example: Example 2 A = Tritoep(−1 − c, 2 + c, −1). When we use the centered difference scheme for the second-order derivative and the forward difference scheme for the first-order derivative, we get the following example: Example 3 A = Tritoep(−1, 2 − c, −1 + c). In the numerical experiments, we tested numerous matrices from Examples 1-3 with different orders n and various values of the parameter c. We found that all experiments show similar numerical behavior; we therefore report a selection of them in the following tables.
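The three discretizations can be summarized by their scaled stencils (β, α, γ). The following sketch assumes the stencils are scaled by h²/a, with c = bh/(2a) for the fully centered scheme and c = bh/a for the one-sided schemes; these scalings are our reconstruction, chosen to be term-by-term consistent with Examples 2 and 3:

```python
def stencil(scheme, a, b, h):
    """Tridiagonal Toeplitz stencil (beta, alpha, gamma) for -a u'' + b u' = 0
    on a uniform mesh of width h, scaled by h^2/a (a hedged sketch)."""
    if scheme == "centered":        # centered differences for u'' and u'
        c = b * h / (2 * a)
        return (-1 - c, 2, -1 + c)
    if scheme == "backward":        # centered u'', backward difference for u'
        c = b * h / a
        return (-1 - c, 2 + c, -1)
    if scheme == "forward":         # centered u'', forward difference for u'
        c = b * h / a
        return (-1, 2 - c, -1 + c)
    raise ValueError("unknown scheme: %s" % scheme)
```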
Numerical results are illustrated in Tables 1 and 2 for Example 1 with c = 12.5 and c = 2.5, in Tables 3 and 4 for Example 2 with c = −6.5 and c = −9.5 (where those matrices are subdiagonally dominant), and in Tables 5 and 6 for Example 3 with c = 5.5 and c = 7.5 (in which both matrices A are superdiagonally dominant). In those six cases, we also test the LU factorization method with pivoting for comparison. We observe that our method for Examples 1-2 and its variant for Example 3 perform much faster than the LU factorization method with pivoting and produce more accurate solutions for b = Ae than the latter in all cases. As for b = Ax* (x* = rand(n, 1)), both methods produce solutions with almost the same accuracy. Numerical results for fixed n = 2^22 and various c are illustrated in Table 7 for Example 1 and Table 8 for Example 2 (where all those matrices are weakly diagonally dominant). From Tables 7 and 8, we can see that our method works much faster than the LU factorization method, while the solutions produced by our new method are about as accurate as those obtained by the LU factorization method.

Conclusions
We have explored the special structure of tridiagonal Toeplitz matrices to develop an effective algorithm for solving tridiagonal Toeplitz linear systems. We first transformed the original system into a new one by an elementary transformation. The coefficient matrix of the new system becomes a block 2 × 2 matrix whose principal leading block is an upper triangular tridiagonal Toeplitz matrix of order n − 1. Based on this block 2 × 2 structure, we then proposed a new algorithm.
Compared with the LU factorization method, our algorithm has some significant advantages. One is that it takes fewer flops. Another is that less memory storage and data transmission are required to perform it.
Numerical tests showed that our algorithm fully outperforms the LU factorization method with pivoting in terms of computing efficiency and accuracy of the produced solution when solving a tridiagonal Toeplitz system whose coefficient matrix is subdiagonally or superdiagonally dominant. In addition, we found numerically that our method performs as well as the LU factorization method when solving a tridiagonal Toeplitz system whose coefficient matrix is weakly diagonally dominant.