
Improving hp-Variational Physics-Informed Neural Networks for Steady-State Convection-Dominated Problems

Preprint available. (Accepted for publication in Computer Methods in Applied Mechanics and Engineering [CMAME])

Github arXiv

The following is a brief summary of the methodology and results. Please refer to the manuscript for full details.

Project Abstract

This paper proposes and studies two extensions of applying hp-variational physics-informed neural networks, more precisely the FastVPINNs framework, to convection-dominated convection-diffusion-reaction problems. First, a term in the spirit of a SUPG stabilization is included in the loss functional and a network architecture is proposed that predicts spatially varying stabilization parameters. Having observed that the selection of the indicator function in hard-constrained Dirichlet boundary conditions has a big impact on the accuracy of the computed solutions, the second novelty is the proposal of a network architecture that learns good parameters for a class of indicator functions. Numerical studies show that both proposals lead to noticeably more accurate results than approaches that can be found in the literature.

Key Features

Loss Functionals

Below are the two loss functionals that we used for solving singularly perturbed PDEs (SPPDEs).

SUPG Loss Functional

The SUPG finite element method adds a term to the standard Galerkin finite element discretization. This term essentially introduces numerical diffusion in the streamline direction. The global SUPG stabilization term has the form

$$ \begin{equation} \mathcal{L_{SUPG}} = \int_{\Omega} \tau(\mathbf{x})\left(\mathbf{b}\cdot\nabla u(\mathbf{x}) + cu(\mathbf{x})-f(\mathbf{x})\right)\left(\mathbf{b}\cdot\nabla v(\mathbf{x})\right)\ \mathrm{d}\mathbf{x} \end{equation} $$

where \(\tau(\mathbf{x})\) is called the stabilization parameter and the diffusive term is neglected in the residual (the first factor), which is appropriate in the convection-dominated regime. In the framework of neural networks, the SUPG stabilization loss functional is given by

$$ \begin{equation} \mathcal{L^{SUPG}_\tau} = \mathcal{L}^{\text{hard}}_{\text{var}} + \mathcal{L_{SUPG}} \end{equation} $$

where \(\mathcal{L}^{\text{hard}}_{\text{var}}\) is the variational loss of the PDE with hard-constrained Dirichlet boundary conditions.
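
The SUPG term above can be illustrated with a small quadrature sketch. This is not the FastVPINNs implementation: it assumes a 1D domain \([0,1]\) (so that \(\mathbf{b}\cdot\nabla u\) reduces to \(b\,u'\)), a single test function \(v\), and Gauss-Legendre quadrature. The function name `supg_term_1d` and its arguments are hypothetical and chosen only for illustration.

```python
import numpy as np

def supg_term_1d(tau, u, du, dv, f, b, c, n_quad=20):
    """Approximate int_0^1 tau(x) * (b u'(x) + c u(x) - f(x)) * (b v'(x)) dx.

    All callables take and return NumPy arrays. The 1D setting and the
    single test function v are simplifying assumptions of this sketch.
    """
    # Gauss-Legendre rule on [-1, 1], mapped affinely to [0, 1].
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    x = 0.5 * (nodes + 1.0)
    w = 0.5 * weights

    # Residual without the diffusive term (first factor of the integrand).
    residual = b * du(x) + c * u(x) - f(x)
    # Streamline derivative of the test function (second factor).
    streamline_test = b * dv(x)

    return float(np.sum(w * tau(x) * residual * streamline_test))

# Example: u(x) = x, v(x) = x^2, b = 1, c = 0, f = 0, constant tau = 1.
# The integrand is 1 * 1 * 2x, so the exact value is 1.
value = supg_term_1d(
    tau=lambda x: np.ones_like(x),
    u=lambda x: x,
    du=lambda x: np.ones_like(x),
    dv=lambda x: 2.0 * x,
    f=lambda x: np.zeros_like(x),
    b=1.0,
    c=0.0,
)
```

In a neural-network setting, `u` and `du` would come from the network and automatic differentiation, and `tau` could be either a constant or a second network output.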

Regularization-based Loss Functional

The training of a neural network requires the solution of a large-scale non-convex optimization problem. For such problems, a regularization term is usually included in the loss functional. In addition, this term might counteract overfitting and thus enhance the generalization capacity of the network. A standard approach consists in adding an \(L^2\)-type regularization, so that the loss functional becomes

$$ \begin{equation} \mathcal{L}_{\lambda}^{\text{reg}} = \mathcal{L}^{\text{hard}}_{\text{var}} + \frac{\lambda}{N}\sum_jw_j^2 \end{equation} $$

where \(N\) is the total number of entries in the weight matrices of the network and \(\{w_j\}_{j=1}^N\) is the set of all weights in the network. The \(L^2\) weight decay regularization parameter \(\lambda\) needs to be tuned.
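
As a minimal sketch of the formula above, the regularized loss can be computed from a list of weight matrices as follows. The function name `l2_regularized_loss` is hypothetical, and the value of the variational loss is assumed to be supplied by the caller.

```python
import numpy as np

def l2_regularized_loss(variational_loss, weight_matrices, lam):
    """Return variational_loss + (lam / N) * sum_j w_j^2.

    N is the total number of entries in all weight matrices.
    """
    n_weights = sum(W.size for W in weight_matrices)
    sum_of_squares = sum(float(np.sum(W ** 2)) for W in weight_matrices)
    return variational_loss + lam / n_weights * sum_of_squares

# Example: two weight matrices with 6 + 3 = 9 entries in total.
weights = [np.ones((2, 3)), 2.0 * np.ones((3, 1))]
loss = l2_regularized_loss(variational_loss=0.25, weight_matrices=weights, lam=0.5)
```

Dividing by \(N\) makes the penalty a mean over all weights, so a given \(\lambda\) has a comparable effect for networks of different sizes.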

Initial Results

In initial experiments, we observed that the SUPG stabilization loss yields a smaller error than the regularization-based loss.

| loss | search range | optimal value | \(L^2_{\text{err}}\) |
|---|---|---|---|
| \(\mathcal{L}_{\lambda}^{\text{reg}}\) | \([10^{-8}, 10^{-2}]\) | \(4 \cdot 10^{-6}\) \((\lambda)\) | \(4.888 \cdot 10^{-1}\) |
| \(\mathcal{L^{SUPG}_\tau}\) | \([10^{-5}, 5]\) | \(1.2\) \((\tau)\) | \(5.384 \cdot 10^{-2}\) |

Problem \(\text{P}_{\text{para}}\): results for using constant parameters in the SUPG loss or the regularization loss.

Results - Spatially varying Tau

We proposed a novel NN architecture that outputs both the solution and the stabilization parameter at each location. We observed that these networks perform better than those using a constant stabilization parameter throughout the domain.


Figure: NN architecture for predicting the spatially varying \(\tau\)
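
The idea of a network with two outputs per location can be sketched as follows. This is not the architecture from the manuscript: the shared hidden layers, the `tanh` activation, the layer sizes, and the softplus transform that keeps \(\tau\) positive are all assumptions made for illustration, and the names `init_params` and `forward_u_tau` are hypothetical.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(scale=np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward_u_tau(params, xy):
    """Map points xy of shape (n_points, 2) to (u_raw, tau), each of shape (n_points,)."""
    a = xy
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)          # shared hidden layers (assumption)
    W, b = params[-1]
    out = a @ W + b                      # two output neurons per point
    u_raw = out[:, 0]                    # solution output, before any hard constraint
    tau = np.logaddexp(0.0, out[:, 1])   # softplus so that tau > 0 (assumption)
    return u_raw, tau

# Example: evaluate the untrained network at five random points in the unit square.
params = init_params([2, 16, 16, 2])
points = np.random.default_rng(1).uniform(size=(5, 2))
u_raw, tau = forward_u_tau(params, points)
```

With hard-constrained Dirichlet conditions, `u_raw` would additionally be multiplied by an indicator function that vanishes on the boundary, while `tau` would enter the SUPG term as \(\tau(\mathbf{x})\).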

\(L^2_{\text{err}}\) for problem \(\text{P}_{\text{out}}\) using different loss functionals. Here "learnt \(\tau\)" denotes the SUPG loss with the stabilization parameter predicted by the proposed architecture.

| loss | \(\mathcal{L}^{\text{hard}}_{\text{var}}\) | \(\mathcal{L^{SUPG}_\tau}\), constant \(\tau\) | \(\mathcal{L^{SUPG}_\tau}\), learnt \(\tau\) |
|---|---|---|---|
| best \(L^2_{\text{err}}\) | \(1.693 \cdot 10^{-4}\) | \(1.340 \cdot 10^{-4}\) | \(1.037 \cdot 10^{-4}\) |

Results - Adaptive Indicator functions

Indicator functions are used to enforce hard boundary constraints when solving PDEs with VPINNs. While these functions can be chosen in multiple ways to satisfy the boundary conditions, they must also match the solution's gradient near the boundaries. This matching becomes particularly important for problems with boundary layers. Therefore, we propose an adaptive indicator function whose slope is controlled by parameters that are learned together with the network. We observed that this adaptive method achieves better accuracy than non-adaptive indicator functions.

We introduce an adaptive indicator function for \(\text{P}_{\text{para}}\) with three learnable parameters: \(\alpha\), \(\beta\), and \(\gamma\), namely

$$ \begin{equation} \begin{split} h(x,y) &= \left( 1 - \text{e}^{-\kappa_1 x} \right) \left( 1 - \text{e}^{-\kappa_2 y} \right) \left( 1 - \text{e}^{-\kappa_3 (1 - x)} \right) \left( 1 - \text{e}^{-\kappa_2 (1 - y)} \right), \\ \kappa_1 &= 10^{\alpha},\;\kappa_2 = 10^{\beta},\;\kappa_3 = 10^{\gamma}. \end{split} \end{equation} $$

In this formulation, \(\alpha\) affects the inlet boundary, \(\beta\) the characteristic boundaries, and \(\gamma\) the outflow boundary.
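
A minimal NumPy sketch of this indicator function is given below. It assumes the unit square \([0,1]^2\) as the domain, as suggested by the factors \(x\), \(1-x\), \(y\), and \(1-y\), and treats \(\alpha\), \(\beta\), \(\gamma\) as plain floats rather than trainable variables. The function name `adaptive_indicator` is hypothetical.

```python
import numpy as np

def adaptive_indicator(x, y, alpha, beta, gamma):
    """Evaluate h(x, y) with kappa_1 = 10^alpha, kappa_2 = 10^beta, kappa_3 = 10^gamma."""
    k1, k2, k3 = 10.0 ** alpha, 10.0 ** beta, 10.0 ** gamma
    return (
        (1.0 - np.exp(-k1 * x))            # vanishes at x = 0 (inlet)
        * (1.0 - np.exp(-k2 * y))          # vanishes at y = 0 (characteristic)
        * (1.0 - np.exp(-k3 * (1.0 - x)))  # vanishes at x = 1 (outflow)
        * (1.0 - np.exp(-k2 * (1.0 - y)))  # vanishes at y = 1 (characteristic)
    )

# h is zero on all four edges of the unit square.
s = np.linspace(0.0, 1.0, 11)
edges = [
    adaptive_indicator(0.0, s, 1.0, 4.0, 1.0),
    adaptive_indicator(1.0, s, 1.0, 4.0, 1.0),
    adaptive_indicator(s, 0.0, 1.0, 4.0, 1.0),
    adaptive_indicator(s, 1.0, 1.0, 4.0, 1.0),
]

# A larger alpha gives a steeper rise away from the inlet boundary x = 0.
h_gentle = adaptive_indicator(0.01, 0.5, 1.0, 4.0, 1.0)
h_steep = adaptive_indicator(0.01, 0.5, 3.0, 4.0, 1.0)
```

Parameterizing the slopes as powers of ten lets a gradient-based optimizer move \(\kappa_i\) across several orders of magnitude with moderate changes in \(\alpha\), \(\beta\), and \(\gamma\).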

Adaptive indicator function parameters and \(L^2_{\text{err}}\) for \(\text{P}_{\text{out}}\). Here "initial" refers to the non-adaptive indicator function and "final" to the adaptive indicator function.

| \(\varepsilon\) | initial \(\alpha\) | final \(\alpha\) | initial \(\beta\) | final \(\beta\) | initial \(L^2_{\text{err}}\) | final \(L^2_{\text{err}}\) |
|---|---|---|---|---|---|---|
| \(10^{-8}\) | \(1\) | \(0.360\) | \(4\) | \(4.051\) | \(9.936\cdot 10^{-5}\) | \(5.887\cdot 10^{-5}\) |

IGHASC-2024

This work was presented as a poster at the Indo-German Conference on Hardware Aware Scientific Computing, held in Heidelberg, Germany, from October 28 to 30. We received one of the Best Poster Presentation Awards for this work.


Figure: Best Poster Presentation award at IGHASC-2024 with Prof. Peter Bastian and Prof. Sashikumaar Ganesan
