The following is a brief summary of the methodology and results. Please refer to the manuscript for full details.
This paper proposes and studies two extensions of applying hp-variational physics-informed neural networks, more precisely the FastVPINNs framework, to convection-dominated convection-diffusion-reaction problems. First, a term in the spirit of a SUPG stabilization is included in the loss functional and a network architecture is proposed that predicts spatially varying stabilization parameters. Having observed that the selection of the indicator function in hard-constrained Dirichlet boundary conditions has a big impact on the accuracy of the computed solutions, the second novelty is the proposal of a network architecture that learns good parameters for a class of indicator functions. Numerical studies show that both proposals lead to noticeably more accurate results than approaches that can be found in the literature.
Below are the two loss functionals that we used for solving singularly perturbed PDEs (SPPDEs).
The SUPG finite element method adds an additional term to the standard Galerkin finite element discretization. This term essentially introduces numerical diffusion in the streamline direction. The global SUPG stabilization term has the form
$$ \begin{equation} \mathcal{L_{SUPG}} = \int_{\Omega} \tau(\mathbf{x})\left(\mathbf{b}\cdot\nabla u(\mathbf{x}) + cu(\mathbf{x})-f(\mathbf{x})\right)\left(\mathbf{b}\cdot\nabla v(\mathbf{x})\right)\ \mathrm{d}\mathbf{x} \end{equation} $$where \(\tau(\mathbf{x})\) is called the stabilization parameter, \(v\) denotes a test function, and the diffusive term is neglected in the residual (the first factor), which is appropriate in the convection-dominated regime. In the framework of neural networks, the SUPG stabilization loss functional is given by
$$ \begin{equation} \mathcal{L^{SUPG}_\tau} = \mathcal{L}^{\text{hard}}_{\text{var}} + \mathcal{L_{SUPG}} \end{equation} $$where \(\mathcal{L}^{\text{hard}}_{\text{var}}\) is the variational loss of the PDE with hard-constrained Dirichlet boundary conditions.
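The sketch below illustrates how the SUPG term can be evaluated by quadrature on a single element and combined with the variational loss. The tensor names, shapes, and the TensorFlow-style implementation are assumptions for illustration only and do not reflect the FastVPINNs internals.

```python
import tensorflow as tf

def supg_term(tau_q, u_q, grad_u_q, f_q, grad_v_q, b, c, w_q):
    """Quadrature evaluation of the SUPG integral on one element.
    tau_q, u_q, f_q, w_q: (n_quad,); grad_u_q: (n_quad, 2);
    grad_v_q: (n_test, n_quad, 2); b: (2,); c: float."""
    # strong residual with the diffusive term dropped (convection-dominated regime)
    residual = tf.linalg.matvec(grad_u_q, b) + c * u_q - f_q        # (n_quad,)
    # streamline derivative b . grad(v) of every test function
    b_grad_v = tf.einsum("tqd,d->tq", grad_v_q, b)                  # (n_test, n_quad)
    # sum_q w_q * tau_q * residual_q * (b . grad v)_q, one value per test function
    return tf.einsum("q,q,tq->t", w_q * tau_q, residual, b_grad_v)  # (n_test,)

# One possible assembly into the total loss (the exact combination follows the
# manuscript): square the per-test-function contributions, sum over elements,
# and add the result to the hard-constrained variational loss, e.g.
# loss = loss_var_hard + tf.reduce_sum(tf.square(supg_term(...)))
```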
The training of a neural network requires the solution of a large-scale non-convex optimization problem. For such problems, a regularization term is usually included in the loss functional. In addition, this term may counteract overfitting and thus enhance the generalization capacity of the network. A standard approach consists in adding an \(L^2\)-type regularization, so that the loss functional becomes
$$ \begin{equation} \mathcal{L}_{\lambda}^{\text{reg}} = \mathcal{L}^{\text{hard}}_{\text{var}} + \frac{\lambda}{N}\sum_jw_j^2 \end{equation} $$where \(N\) is the total number of entries in the weight matrices of the network and \(\{w_j\}_{j=1}^N\) is the set of all weights in the network. The \(L^2\) weight decay regularization parameter \(\lambda\) needs to be tuned.
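A minimal Keras-style sketch of this weight-decay term, assuming that only the entries of the weight matrices (dense-layer kernels) are penalized, as in the formula above:

```python
import tensorflow as tf

def l2_regularization(model: tf.keras.Model, lam: float) -> tf.Tensor:
    # collect the weight matrices (kernels) of all dense layers; biases are excluded
    kernels = [v for v in model.trainable_variables if "kernel" in v.name]
    n = sum(int(tf.size(w)) for w in kernels)                        # total number of weights N
    sq_sum = tf.add_n([tf.reduce_sum(tf.square(w)) for w in kernels])
    return (lam / n) * sq_sum                                        # (lambda / N) * sum_j w_j^2
```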
In initial experiments, we observed that the loss functional with the SUPG stabilization term yields noticeably more accurate solutions than the loss functional with \(L^2\) regularization, even when constant parameters are used; see the table below.
loss | search range | optimal value | \(L^2_{\text{err}}\) |
---|---|---|---|
\(\mathcal{L}_{\lambda}^{\text{reg}}\) | \([10^{-8}, 10^{-2}]\) | \(4 \cdot 10^{-6}\) \((\lambda)\) | \(4.888 \cdot 10^{-1}\) |
\(\mathcal{L^{SUPG}_\tau}\) | \([10^{-5}, 5]\) | \(1.2\) \((\tau)\) | \(5.384 \cdot 10^{-2}\) |
Table: Problem \(\text{P}_{\text{para}}\), results for using constant parameters in the SUPG loss or the regularization loss
We propose a novel NN architecture that outputs both the solution and the stabilization parameter at each spatial location. We observed that these networks perform better than those using a single constant stabilization parameter throughout the domain; a sketch of such a two-output network is given below the figure.
Figure: NN architecture for predicting a spatially varying \(\tau\)
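A minimal Keras-style sketch of such a two-output network; the layer widths, activations, and the softplus output used to keep \(\tau\) non-negative are assumptions for illustration, not the architecture used in the manuscript.

```python
import tensorflow as tf

def build_u_tau_network(hidden=(30, 30, 30)):
    # input: the spatial coordinates (x, y)
    inputs = tf.keras.Input(shape=(2,))
    h = inputs
    for width in hidden:
        h = tf.keras.layers.Dense(width, activation="tanh")(h)
    u = tf.keras.layers.Dense(1, name="u")(h)                             # predicted solution
    tau = tf.keras.layers.Dense(1, activation="softplus", name="tau")(h)  # non-negative tau(x, y)
    return tf.keras.Model(inputs=inputs, outputs=[u, tau])

# usage: both outputs are evaluated at the same quadrature points, so the
# predicted tau can be plugged directly into the SUPG term sketched above
# u_q, tau_q = build_u_tau_network()(xy_quad)
```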
loss | \(\mathcal{L}^{\text{hard}}_{\text{var}}\) | \(\begin{array}{c}\mathcal{L^{SUPG}_\tau} \\ \text{constant }\tau\end{array}\) | \(\begin{array}{c}\mathcal{L^{SUPG}_\tau} \\ \text{learnt }\tau\end{array}\) |
---|---|---|---|
best \(L^2_{\text{err}}\) | \(1.693 \cdot 10^{-4}\) | \(1.340 \cdot 10^{-4}\) | \(1.037 \cdot 10^{-4}\) |
Indicator functions are used to enforce hard Dirichlet boundary constraints in VPINNs. While these functions can be chosen in many ways that satisfy the boundary conditions, they must also be compatible with the solution's gradient near the boundaries. This matching becomes particularly important for problems with boundary layers. Therefore, we propose an adaptive indicator function whose slope near the boundaries is controlled by learnable parameters of the neural network. We observed that this adaptive approach achieves better accuracy than non-adaptive indicator functions.
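As a reminder of the mechanism, a hard Dirichlet constraint multiplies the network output by an indicator function that vanishes on the boundary. The minimal sketch below assumes homogeneous boundary data (otherwise an extension of the boundary data would be added) and a generic `indicator` callable; a learnable indicator is sketched after the next equation.

```python
# Hard-constrained ansatz: u(x, y) = h(x, y) * N(x, y), where the indicator h
# vanishes on the boundary, so homogeneous Dirichlet conditions hold exactly.
def hard_constrained_solution(network, indicator, xy):
    return indicator(xy) * network(xy)
```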
We introduce an adaptive indicator function for \(\text{P}_{\text{para}}\) with three learnable parameters: \(\alpha\), \(\beta\), and \(\gamma\), namely
$$ \begin{equation} \begin{split} h(x,y) &= \left( 1 - \text{e}^{-\kappa_1 x} \right) \left( 1 - \text{e}^{-\kappa_2 y} \right) \left( 1 - \text{e}^{-\kappa_3 (1 - x)} \right) \left( 1 - \text{e}^{-\kappa_2 (1 - y)} \right), \\ \kappa_1 &= 10^{\alpha},\;\kappa_2 = 10^{\beta},\;\kappa_3 = 10^{\gamma}. \end{split} \end{equation} $$In this formulation, \(\alpha\) affects the inlet boundary, \(\beta\) the characteristic boundaries, and \(\gamma\) the outflow boundary.
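A sketch of this adaptive indicator as a layer with trainable exponents, initialized with the values from the table below (\(\alpha = 1\), \(\beta = 4\)); the initial value for \(\gamma\) and the Keras-layer packaging are assumptions for illustration.

```python
import tensorflow as tf

class AdaptiveIndicator(tf.keras.layers.Layer):
    """Indicator h(x, y) from the equation above, with exponents alpha, beta,
    gamma trained jointly with the network weights."""

    def __init__(self, alpha0=1.0, beta0=4.0, gamma0=1.0, **kwargs):
        super().__init__(**kwargs)
        self.alpha = tf.Variable(alpha0, dtype=tf.float32)  # inlet boundary (x = 0)
        self.beta = tf.Variable(beta0, dtype=tf.float32)    # characteristic boundaries (y = 0, 1)
        self.gamma = tf.Variable(gamma0, dtype=tf.float32)  # outflow boundary (x = 1)

    def call(self, xy):
        x, y = xy[:, 0:1], xy[:, 1:2]
        k1 = tf.pow(10.0, self.alpha)
        k2 = tf.pow(10.0, self.beta)
        k3 = tf.pow(10.0, self.gamma)
        return ((1.0 - tf.exp(-k1 * x)) * (1.0 - tf.exp(-k2 * y))
                * (1.0 - tf.exp(-k3 * (1.0 - x))) * (1.0 - tf.exp(-k2 * (1.0 - y))))

# usage with the hard-constrained ansatz sketched earlier:
# u = AdaptiveIndicator()(xy) * network(xy)
```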
\(\varepsilon\) | \(\begin{array}{c}\text{initial}\\ \alpha\end{array}\) | \(\begin{array}{c}\text{final}\\ \alpha\end{array}\) | \(\begin{array}{c}\text{initial}\\ \beta\end{array}\) | \(\begin{array}{c}\text{final}\\ \beta\end{array}\) | \(\begin{array}{c}\text{initial}\\L^2_{\text{err}}\end{array}\) | \(\begin{array}{c}\text{final}\\L^2_{\text{err}}\end{array}\) |
---|---|---|---|---|---|---|
\(10^{-8}\) | \(1\) | \(0.360\) | \(4\) | \(4.051\) | \(9.936\cdot 10^{-5}\) | \(5.887\cdot 10^{-5}\) |
This work was presented as a poster at the Indo-German Conference on Hardware Aware Scientific Computing (IGHASC-2024), held in Heidelberg, Germany, from October 28-30, 2024. We received one of the Best Poster Presentation Awards for this work.
Figure: Best Poster Presentation award at IGHASC-2024 with Prof. Peter Bastian and Prof. Sashikumaar Ganesan