Astrophysical Journal: The National Science Foundation's Daniel K. Inouye Solar Telescope (DKIST) will provide high-resolution, multiline spectropolarimetric observations that are poised to revolutionize our understanding of the Sun. Given the massive data volume, novel inference techniques are required to unlock the full potential of these observations. Here, we provide an overview of our "SPIn4D" project, which aims to develop deep convolutional neural networks (CNNs) for estimating the physical properties of the solar photosphere from DKIST spectropolarimetric observations. We describe the magnetohydrodynamic (MHD) modeling and the Stokes profile synthesis pipeline that produce the simulated output and input data, respectively. These data will be used to train a set of CNNs that can rapidly infer the four-dimensional MHD state vectors by exploiting the spatiotemporally coherent patterns in the Stokes profile time series. Specifically, our radiative MHD model simulates the small-scale dynamo action that is prevalent in quiet-Sun and plage regions. Six cases with different mean magnetic fields have been explored; each case covers six solar hours, totaling 109 TB in data volume. The simulation domain covers at least 25 × 25 × 8 Mm, with 16 × 16 × 12 km spatial resolution, extending from the upper convection zone up to the temperature minimum region. The outputs are stored at a 40 s cadence. We forward model the Stokes profiles of two sets of Fe I lines at 630 and 1565 nm, which will be simultaneously observed by DKIST and can better constrain the parameter variations along the line of sight. The MHD model output and the synthetic Stokes profiles are publicly available, with 13.7 TB in the initial release.
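
The figures quoted above can be turned into a rough sense of scale. The following is a back-of-the-envelope sketch in Python, not part of the SPIn4D pipeline: the constant and function names are hypothetical, the grid counts are lower bounds derived from the "at least" domain size, and the per-snapshot average assumes that the 109 TB total is spread evenly over all stored snapshots and that TB means 1000 GB.

```python
# Back-of-the-envelope bookkeeping from the figures quoted in the abstract.
# Derived values are estimates, not numbers reported by the authors.

HOURS_PER_CASE = 6                    # each case covers six solar hours
CADENCE_S = 40                        # outputs are stored at a 40 s cadence
N_CASES = 6                           # six mean-field cases
TOTAL_TB = 109                        # total quoted data volume

DOMAIN_KM = (25_000, 25_000, 8_000)   # lower bound: "at least 25 x 25 x 8 Mm"
CELL_KM = (16, 16, 12)                # quoted spatial resolution


def snapshots_per_case(hours=HOURS_PER_CASE, cadence_s=CADENCE_S):
    """Number of stored snapshots in one case at the given cadence."""
    return hours * 3600 // cadence_s


def min_grid_cells(domain_km=DOMAIN_KM, cell_km=CELL_KM):
    """Lower bound on the cell count per axis (floor of extent / cell size)."""
    return tuple(d // c for d, c in zip(domain_km, cell_km))


n_per_case = snapshots_per_case()
n_total = n_per_case * N_CASES
# Assumption: the 109 TB is shared evenly by all snapshots (decimal units).
avg_gb_per_snapshot = TOTAL_TB * 1000 / n_total
grid = min_grid_cells()

print(n_per_case, n_total, grid, round(avg_gb_per_snapshot, 1))
```

Under these assumptions each case holds 540 snapshots, for 3240 in total, and the domain spans at least roughly 1562 × 1562 × 666 cells. The actual grid dimensions and the breakdown of the 109 TB are not stated in this section.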

SPIn4D model workflow

Schematic representation of the SPIn4D model workflow. The core of the model is the deep-learning (DL) neural network, highlighted in blue in the middle of the diagram. The network training step is outlined by the broken green line and uses data derived from the MURaM simulations, both the MHD variables themselves and the Stokes profiles (I, Q, U, V) synthesized from the MHD data cubes (green lines). Once trained on the simulated data, observed Stokes data can be input into the network (red arrow) to produce the most likely 3D MHD state as output (labeled "Predicted MHD Variables"). The network may be trained to receive single-time input Stokes data to produce a reduced-dimensional output MHD state (B, v_z, P, T) or to receive multitime input Stokes data to produce a full-dimensional MHD output, including additional derived outputs, such as vector velocities, Poynting flux, and so on. The network may also be trained to produce the derived outputs directly.
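
The difference between the single-time and multitime training modes can be illustrated by how input and target snapshots might be paired. The Python sketch below works only with snapshot indices and is an assumption for illustration: the caption does not say how multitime inputs are assembled, so the centered window, the `half_width` parameter, and both function names are hypothetical rather than the authors' method.

```python
def single_time_pairs(n_snapshots):
    """Pair each Stokes snapshot index with the MHD snapshot at the same time."""
    return [((t,), t) for t in range(n_snapshots)]


def multitime_pairs(n_snapshots, half_width=1):
    """Pair a centered window of Stokes snapshot indices with the MHD index
    at the window center. Windows that would run past either end of the
    time series are skipped (an assumed edge policy)."""
    pairs = []
    for t in range(half_width, n_snapshots - half_width):
        window = tuple(range(t - half_width, t + half_width + 1))
        pairs.append((window, t))
    return pairs


# Example with a 5-snapshot series and a 3-snapshot window.
print(single_time_pairs(5))
print(multitime_pairs(5, half_width=1))
```

In this sketch a series of 5 snapshots yields 5 single-time pairs but only 3 multitime pairs with a 3-snapshot window, since the first and last snapshots lack a full set of neighbors. Each index would stand for a full Stokes (I, Q, U, V) cube on the input side and an MHD data cube on the target side.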