IAIFI Theoretical Physics Papers

View high energy physics IAIFI papers on INSPIRE

Theoretical Physics

Pre-prints

Learning Galaxy Intrinsic Alignment Correlations
Sneh Pandya, Yuanyuan Yang, Nicholas Van Alfen, Jonathan Blazek, Robin Walters
[ arXiv:2404.13702 ]

Abstract

The intrinsic alignments (IA) of galaxies, regarded as a contaminant in weak lensing analyses, represents the correlation of galaxy shapes due to gravitational tidal interactions and galaxy formation processes. As such, understanding IA is paramount for accurate cosmological inferences from weak lensing surveys; however, one limitation to our understanding and mitigation of IA is expensive simulation-based modeling. In this work, we present a deep learning approach to emulate galaxy position-position (ξ), position-orientation (ω), and orientation-orientation (η) correlation function measurements and uncertainties from halo occupation distribution-based mock galaxy catalogs. We find strong Pearson correlation values with the model across all three correlation functions and further predict aleatoric uncertainties through a mean-variance estimation training procedure. ξ(r) predictions are generally accurate to ≤10%. Our model also successfully captures the underlying signal of the noisier correlations ω(r) and η(r), although with a lower average accuracy. We find that the model performance is inhibited by the stochasticity of the data, and will benefit from correlations averaged over multiple data realizations. Our code will be made open source upon journal publication.

TENG: Time-Evolving Natural Gradient for Solving PDEs with Deep Neural Net
Zhuo Chen, Jacob McCarran, Esteban Vizcaino, Marin Soljačić, Di Luo
[ arXiv:2404.10771 ]

Abstract

Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the Time-Evolving Natural Gradient (TENG), generalizing time-dependent variational principles and optimization-based time integration, leveraging natural gradient optimization to obtain high accuracy in neural-network-based PDE solutions. Our comprehensive development includes algorithms like TENG-Euler and its high-order variants, such as TENG-Heun, tailored for enhanced precision and efficiency. TENG's effectiveness is further validated through its performance, surpassing current leading methods and achieving machine precision in step-by-step optimizations across a spectrum of PDEs, including the heat equation, Allen-Cahn equation, and Burgers' equation.

The Frozen Phase of Heterotic F-theory Duality
Paul-Konstantin Oehlmann, Fabian Ruehle, Benjamin Sung
[ arXiv:2404.02191 ]

Abstract

We study the duality between the Spin(32)/ℤ2 heterotic string without vector structure and F-theory with frozen singularities. We give a complete description in theories with 6d =(1,0) supersymmetry and identify the duals of Spin(32)/ℤ2-instantons on ADE singularities without vector structure in the frozen phase of F-theory using an ansatz introduced by Bhardwaj, Morrison, Tachikawa, and Tomasiello. As a consequence, we obtain a strongly coupled description of orbifold phases of type I string theory without vector structure, substantially expanding the list of known examples of 6d F-theory compactifications with frozen singularities. Supergravity theories can be fused from these instanton theories, in a way that commutes with switching off vector structure, which we use to propose new consistency checks via neutral hypermultiplet counting. Finally, we describe various Higgsings of this duality, and comment on constraints on higher form symmetries.

PAPERCLIP: Associating Astronomical Observations and Natural Language with Multi-Modal Models
Siddharth Mishra-Sharma, Yiding Song, Jesse Thaler
[ arXiv:2403.08851 ]

Abstract

We present PAPERCLIP (Proposal Abstracts Provide an Effective Representation for Contrastive Language-Image Pre-training), a method which associates astronomical observations imaged by telescopes with natural language using a neural network model. The model is fine-tuned from a pre-trained Contrastive Language-Image Pre-training (CLIP) model using successful observing proposal abstracts and corresponding downstream observations, with the abstracts optionally summarized via guided generation using large language models (LLMs). Using observations from the Hubble Space Telescope (HST) as an example, we show that the fine-tuned model embodies a meaningful joint representation between observations and natural language through tests targeting image retrieval (i.e., finding the most relevant observations using natural language queries) and description retrieval (i.e., querying for astrophysical object classes and use cases most relevant to a given observation). Our study demonstrates the potential for using generalist foundation models rather than task-specific models for interacting with astronomical data by leveraging text as an interface.

Moments of Clarity: Streamlining Latent Spaces in Machine Learning using Moment Pooling
Rikab Gambhir, Athis Osathapan, Jesse Thaler
[ arXiv:2403.08854 ]

Abstract

Many machine learning applications involve learning a latent representation of data, which is often high-dimensional and difficult to directly interpret. In this work, we propose "Moment Pooling", a natural extension of Deep Sets networks which drastically decrease latent space dimensionality of these networks while maintaining or even improving performance. Moment Pooling generalizes the summation in Deep Sets to arbitrary multivariate moments, which enables the model to achieve a much higher effective latent dimensionality for a fixed latent dimension. We demonstrate Moment Pooling on the collider physics task of quark/gluon jet classification by extending Energy Flow Networks (EFNs) to Moment EFNs. We find that Moment EFNs with latent dimensions as small as 1 perform similarly to ordinary EFNs with higher latent dimension. This small latent dimension allows for the internal representation to be directly visualized and interpreted, which in turn enables the learned internal jet representation to be extracted in closed form.

On classical de Sitter solutions and parametric control
David Andriot, Fabian Ruehle
[ arXiv:2403.07065 ]

Abstract

Finding string backgrounds with de Sitter spacetime, where all approximations and corrections are controlled, is an open problem. We revisit the search for de Sitter solutions in the classical regime for specific type IIB supergravity compactifications on group manifolds, an under-explored corner of the landscape that offers an interesting testing ground for swampland conjectures. While the supergravity de Sitter solutions we obtain numerically are ambiguous in terms of their classicality, we find an analytic scaling that makes four out of six compactification radii, as well as the overall volume, arbitrarily large. This potentially provides parametric control over corrections. If we could show that these solutions, or others to be found, are fully classical, they would constitute a counterexample to conjectures stating that asymptotic de Sitter solutions do not exist. We discuss this point in great detail.

Photonic probabilistic machine learning using quantum vacuum noise
Seou Choi, Yannick Salamin, Charles Roques-Carmes, Rumen Dangovski, Di Luo, Zhuo Chen, Michael Horodynski, Jamison Sloan, Shiekh Zia Uddin, Marin Soljacic
[ arXiv:2403.04731 ]

Abstract

Probabilistic machine learning utilizes controllable sources of randomness to encode uncertainty and enable statistical modeling. Harnessing the pure randomness of quantum vacuum noise, which stems from fluctuating electromagnetic fields, has shown promise for high speed and energy-efficient stochastic photonic elements. Nevertheless, photonic computing hardware which can control these stochastic elements to program probabilistic machine learning algorithms has been limited. Here, we implement a photonic probabilistic computer consisting of a controllable stochastic photonic element - a photonic probabilistic neuron (PPN). Our PPN is implemented in a bistable optical parametric oscillator (OPO) with vacuum-level injected bias fields. We then program a measurement-and-feedback loop for time-multiplexed PPNs with electronic processors (FPGA or GPU) to solve certain probabilistic machine learning tasks. We showcase probabilistic inference and image generation of MNIST-handwritten digits, which are representative examples of discriminative and generative models. In both implementations, quantum vacuum noise is used as a random seed to encode classification uncertainty or probabilistic generation of samples. In addition, we propose a path towards an all-optical probabilistic computing platform, with an estimated sampling rate of ~ 1 Gbps and energy consumption of ~ 5 fJ/MAC. Our work paves the way for scalable, ultrafast, and energy-efficient probabilistic machine learning hardware.

Operator Learning Renormalization Group
Xiu-Zhe Luo, Di Luo, Roger G. Melko
[ arXiv:2403.03199 ]

Abstract

n this paper, we present a general framework for quantum many-body simulations called the operator learning renormalization group (OLRG). Inspired by machine learning perspectives, OLRG is a generalization of Wilsons numerical renormalization group and Whites density matrix renormalization group, which recursively builds a simulatable system to approximate a target system of the same number of sites via operator maps. OLRG uses a loss function to minimize the error of a target property directly by learning the operator map in lieu of a state ansatz. This loss function is designed by a scaling consistency condition that also provides a provable bound for real-time evolution. We implement two versions of the operator maps for classical and quantum simulations. The former, which we call the Operator Matrix Map, can be implemented via neural networks on classical computers. The latter, which we call the Hamiltonian Expression Map, generates device pulse sequences to leverage the capabilities of quantum computing hardware. We illustrate the performance of both maps for calculating time-dependent quantities in the quantum Ising model Hamiltonian.

Multi-particle interpolating operators in quantum field theories with cubic symmetry
William Detmold, William I. Jay, Gurtej Kanwar, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2403.00672 ]

Abstract

Numerical studies of lattice quantum field theories are conducted in finite spatial volumes, typically with cubic symmetry in the spatial coordinates. Motivated by these studies, this work presents a general algorithm to construct multi-particle interpolating operators for quantum field theories with cubic symmetry. The algorithm automates the block diagonalization required to combine multiple operators of definite linear momentum into irreducible representations of the appropriate little group. Examples are given for distinguishable and indistinguishable particles including cases with both zero and non-zero spin. An implementation of the algorithm is publicly available at this https URL.

Long-Distance Nuclear Matrix Elements for Neutrinoless Double-Beta Decay from Lattice QCD
Zohreh Davoudi, William Detmold, Zhenghao Fu, Anthony V. Grebe, William Jay, David Murphy, Patrick Oare, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2402.09362 ]

Abstract

Neutrinoless double-beta (0νββ) decay is a heretofore unobserved process which, if observed, would imply that neutrinos are Majorana particles. Interpretations of the stringent experimental constraints on 0νββ-decay half-lives require calculations of nuclear matrix elements. This work presents the first lattice quantum-chromodynamics (LQCD) calculation of the matrix element for 0νββ decay in a multi-nucleon system, specifically the nn→ppee transition, mediated by a light left-handed Majorana neutrino propagating over nuclear-scale distances. This calculation is performed with quark masses corresponding to a pion mass of mπ=806 MeV at a single lattice spacing and volume. The statistically cleaner Σ−→Σ+ee transition is also computed in order to investigate various systematic uncertainties. The prospects for matching the results of LQCD calculations onto a nuclear effective field theory to determine a leading-order low-energy constant relevant for 0νββ decay with a light Majorana neutrino are investigated. This work, therefore, sets the stage for future calculations at physical values of the quark masses that, combined with effective field theory and nuclear many-body studies, will provide controlled theoretical inputs to experimental searches of 0νββ decay.

Real-time Dynamics of the Schwinger Model as an Open Quantum System with Neural Density Operators
Joshua Lin, Di Luo, Xiaojun Yao, Phiala E. Shanahan
[ arXiv:2402.06607 ]

Abstract

Ab-initio simulations of multiple heavy quarks propagating in a Quark-Gluon Plasma are computationally difficult to perform due to the large dimension of the space of density matrices. This work develops machine learning algorithms to overcome this difficulty by approximating exact quantum states with neural network parametrisations, specifically Neural Density Operators. As a proof of principle demonstration in a QCD-like theory, the approach is applied to solve the Lindblad master equation in the 1+1d lattice Schwinger Model as an open quantum system. Neural Density Operators enable the study of in-medium dynamics on large lattice volumes, where multiple-string interactions and their effects on string-breaking and recombination phenomena can be studied. Thermal properties of the system at equilibrium can also be probed with these methods by variationally constructing the steady state of the Lindblad master equation. Scaling of this approach with system size is studied, and numerical demonstrations on up to 32 spatial lattice sites and with up to 3 interacting strings are performed.

Applications of flow models to the generation of correlated lattice QCD ensembles
Ryan Abbott, Aleksandar Botev, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2401.10874 ]

Abstract

Machine-learned normalizing flows can be used in the context of lattice quantum field theory to generate statistically correlated ensembles of lattice gauge fields at different action parameters. This work demonstrates how these correlations can be exploited for variance reduction in the computation of observables. Three different proof-of-concept applications are demonstrated using a novel residual flow architecture: continuum limits of gauge theories, the mass dependence of QCD observables, and hadronic matrix elements based on the Feynman-Hellmann approach. In all three cases, it is shown that statistical uncertainties are significantly reduced when machine-learned flows are incorporated as compared with the same calculations performed with uncorrelated ensembles or direct reweighting.

Anomaly Detection in Collider Physics via Factorized Observables
Eric M. Metodiev, Jesse Thaler, Raymond Wynne
[ arXiv:2312.00119 ]

Abstract

To maximize the discovery potential of high-energy colliders, experimental searches should be sensitive to unforeseen new physics scenarios. This goal has motivated the use of machine learning for unsupervised anomaly detection. In this paper, we introduce a new anomaly detection strategy called FORCE: factorized observables for regressing conditional expectations. Our approach is based on the inductive bias of factorization, which is the idea that the physics governing different energy scales can be treated as approximately independent. Assuming factorization holds separately for signal and background processes, the appearance of non-trivial correlations between low- and high-energy observables is a robust indicator of new physics. Under the most restrictive form of factorization, a machine-learned model trained to identify such correlations will in fact converge to the optimal new physics classifier. We test FORCE on a benchmark anomaly detection task for the Large Hadron Collider involving collimated sprays of particles called jets. By teasing out correlations between the kinematics and substructure of jets, our method can reliably extract percent-level signal fractions. This strategy for uncovering new physics adds to the growing toolbox of anomaly detection methods for collider physics with a complementary set of assumptions.

Safe but Incalculable: Energy-weighting is not all you need
Samuel Bright-Thonney, Benjamin Nachman, Jesse Thaler
[ arXiv:2311.07652 ]

Abstract

Infrared and collinear (IRC) safety has long been used a proxy for robustness when developing new jet substructure observables. This guiding philosophy has been carried into the deep learning era, where IRC-safe neural networks have been used for many jet studies. For graph-based neural networks, the most straightforward way to achieve IRC safety is to weight particle inputs by their energies. However, energy-weighting by itself does not guarantee that perturbative calculations of machine-learned observables will enjoy small non-perturbative corrections. In this paper, we demonstrate the sensitivity of IRC-safe networks to non-perturbative effects, by training an energy flow network (EFN) to maximize its sensitivity to hadronization. We then show how to construct Lipschitz Energy Flow Networks (L-EFNs), which are both IRC safe and relatively insensitive to non-perturbative corrections. We demonstrate the performance of L-EFNs on generated samples of quark and gluon jets, and showcase fascinating differences between the learned latent representations of EFNs and L-EFNs.

T-Duality and Flavor Symmetries in Little String Theories
Hamza Ahmed, Paul-Konstantin Oehlmann, Fabian Ruehle
[ arXiv:2311.02168 ]

Abstract

We explore the T-duality web of 6D Heterotic Little String Theories, focusing on flavor algebra reducing deformations. A careful analysis of the full flavor algebra, including Abelian factors, shows that the flavor rank is preserved under T-duality. This suggests a new T-duality invariant in addition to the Coulomb branch dimension and the two-group structure constants. We also engineer Little String Theories with non-simply laced flavor algebras, whose appearance we attribute to certain discrete 3-form fluxes in M-theory. Geometrically, these theories are engineered in F-theory with non-Kähler favorable K3 fibers. This geometric origin leads us to propose that freezing fluxes are preserved across T-duality. Along the way, we discuss various exotic models, including two inequivalent Spin(32)/ℤ2 models that are dual to the same E8×E8 theory, and a family of self-T-dual models.

Lattice QCD Constraints on the Fourth Mellin Moment of the Pion Light Cone Distribution Amplitude using the HOPE method
William Detmold, Anthony V. Grebe, Issaku Kanamori, C.-J. David Lin, Robert J. Perry, Yong Zhao
[ arXiv:2311.01322 ]

Abstract

The light-cone distribution amplitude (LCDA) of the pion contains information about the parton momentum carried by the quarks and is an important theoretical input for various predictions of exclusive processes at high energy, including the pion electromagnetic form factor. Progress towards constraining the fourth Mellin moment of the LCDA using the heavy-quark operator product expansion (HOPE) method is presented.

Metric Flows with Neural Networks
James Halverson, Fabian Ruehle
[ arXiv:2310.19870 ]

Abstract

We develop a theory of flows in the space of Riemannian metrics induced by neural network gradient descent. This is motivated in part by recent advances in approximating Calabi-Yau metrics with neural networks and is enabled by recent advances in understanding flows in the space of neural networks. We derive the corresponding metric flow equations, which are governed by a metric neural tangent kernel, a complicated, non-local object that evolves in time. However, many architectures admit an infinite-width limit in which the kernel becomes fixed and the dynamics simplify. Additional assumptions can induce locality in the flow, which allows for the realization of Perelman's formulation of Ricci flow that was used to resolve the 3d Poincaré conjecture. We apply these ideas to numerical Calabi-Yau metrics, including a discussion on the importance of feature learning.

Signal-to-noise improvement through neural network contour deformations for 3D SU(2) lattice gauge theory
William Detmold, Gurtej Kanwar, Yin Lin, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2309.00600 ]

Abstract

Complex contour deformations of the path integral have been demonstrated to significantly improve the signal-to-noise ratio of observables in previous studies of two-dimensional gauge theories with open boundary conditions. In this work, new developments based on gauge fixing and a neural network definition of the deformation are introduced, which enable an effective application to theories in higher dimensions and with generic boundary conditions. Improvements of the signal-to-noise ratio by up to three orders of magnitude for Wilson loop measurements are shown in SU(2) lattice gauge theory in three spacetime dimensions.

Reconstructing S-matrix Phases with Machine Learning
Aurélien Dersy, Matthew D. Schwartz, Alexander Zhiboedov
[ arXiv:2308.09451 ]

Abstract

An important element of the S-matrix bootstrap program is the relationship between the modulus of an S-matrix element and its phase. Unitarity relates them by an integral equation. Even in the simplest case of elastic scattering, this integral equation cannot be solved analytically and numerical approaches are required. We apply modern machine learning techniques to studying the unitarity constraint. We find that for a given modulus, when a phase exists it can generally be reconstructed to good accuracy with machine learning. Moreover, the loss of the reconstruction algorithm provides a good proxy for whether a given modulus can be consistent with unitarity at all. In addition, we study the question of whether multiple phases can be consistent with a single modulus, finding novel phase-ambiguous solutions. In particular, we find a new phase-ambiguous solution which pushes the known limit on such solutions significantly beyond the previous bound.

Score-based Diffusion Models for Generating Liquid Argon Time Projection Chamber Images
Zeviel Imani, Shuchin Aeron, Taritree Wongjirad
[ arXiv:2307.13687 ]

Abstract

We show for the first time, high-fidelity generation of LArTPC-like data using a generative neural network. This demonstrates that methods developed for natural images do transfer to LArTPC-produced images which in contrast to natural images are globally sparse, but locally dense. We present the method we employ, which is a variant of score-based generative diffusion models. We evaluate the fidelity of the generated images using several different approaches that include using a variant of measures used to evaluate natural images, comparisons between high-dimensional distributions, and comparisons relevant to LArTPC experiments.

Neural Network Field Theories: Non-Gaussianity, Actions, and Locality
Mehmet Demirtas, James Halverson, Anindita Maiti, Matthew D. Schwartz, Keegan Stoner
[ arXiv:2307.03223 ]

Abstract

Both the path integral measure in field theory and ensembles of neural networks describe distributions over functions. When the central limit theorem can be applied in the infinite-width (infinite-N) limit, the ensemble of networks corresponds to a free field theory. Although an expansion in 1/N corresponds to interactions in the field theory, others, such as in a small breaking of the statistical independence of network parameters, can also lead to interacting theories. These other expansions can be advantageous over the 1/N-expansion, for example by improved behavior with respect to the universal approximation theorem. Given the connected correlators of a field theory, one can systematically reconstruct the action order-by-order in the expansion parameter, using a new Feynman diagram prescription whose vertices are the connected correlators. This method is motivated by the Edgeworth expansion and allows one to derive actions for neural network field theories. Conversely, the correspondence allows one to engineer architectures realizing a given field theory by representing action deformations as deformations of neural network parameter densities. As an example, ϕ4 theory is realized as an infinite-N neural network field theory.

Hierarchical Neural Simulation-Based Inference Over Event Ensembles
Lukas Heinrich, Siddharth Mishra-Sharma, Chris Pollard, Philipp Windischhofer
[ arXiv:2306.12584 ]

Abstract

When analyzing real-world data it is common to work with event ensembles, which comprise sets of observations that collectively constrain the parameters of an underlying model of interest. Such models often have a hierarchical structure, where "local" parameters impact individual events and "global" parameters influence the entire dataset. We introduce practical approaches for optimal dataset-wide probabilistic inference in cases where the likelihood is intractable, but simulations can be realized via forward modeling. We construct neural estimators for the likelihood(-ratio) or posterior and show that explicitly accounting for the model's hierarchical structure can lead to tighter parameter constraints. We ground our discussion using case studies from the physical sciences, focusing on examples from particle physics (particle collider data) and astrophysics (strong gravitational lensing observations).

SN2023ixf in Messier 101: A Variable Red Supergiant as the Progenitor Candidate to a Type II Supernova
Charles D. Kilpatrick, Ryan J. Foley, Wynn V. Jacobson-Galán, Anthony L. Piro, Stephen J. Smartt, Maria R. Drout, Alexander Gagliano, Christa Gall, Jens Hjorth, David O. Jones, Kaisey S. Mandel, Raffaella Margutti, Conor L. Ransome, V. Ashley Villar, David A. Coulter, Hua Gao, David Jacob Matthews, Yossef Zenati
[ arXiv:2306.04722 ]

Abstract

We present pre-explosion optical and infrared (IR) imaging at the site of the type II supernova (SN II) 2023ixf in Messier 101 at 6.9 Mpc. We astrometrically registered a ground-based image of SN 2023ixf to archival Hubble Space Telescope (HST), Spitzer Space Telescope (Spitzer), and ground-based near-IR images. A single point source is detected at a position consistent with the SN at wavelengths ranging from HST R-band to Spitzer 4.5 μm. Fitting to blackbody and red supergiant (RSG) spectral-energy distributions (SEDs), we find that the source is anomalously cool with a significant mid-IR excess. We interpret this SED as reprocessed emission in a 8600 R⊙ circumstellar shell of dusty material with a mass ∼5×10−5M⊙ surrounding a log(L/L⊙)=4.74±0.07 and Teff=3920+200−160 K RSG. This luminosity is consistent with RSG models of initial mass 11 M⊙, depending on assumptions of rotation and overshooting. In addition, the counterpart was significantly variable in pre-explosion Spitzer 3.6 μm and 4.5 μm imaging, exhibiting ∼70% variability in both bands correlated across 9 yr and 29 epochs of imaging. The variations appear to have a timescale of 2.8 yr, which is consistent with κ-mechanism pulsations observed in RSGs, albeit with a much larger amplitude than RSGs such as α Orionis (Betelgeuse).

Quantum Computation and Simulation using Fermion-Pair Registers
Xiangkai Sun, Di Luo, Soonwon Choi
[ arXiv:2306.03905 ]

Abstract

We propose and analyze an approach to realize quantum computation and simulation using fermionic particles under quantum gas microscopes. Our work is inspired by a recent experimental demonstration of large-scale quantum registers, where tightly localized fermion pairs are used to encode qubits exhibiting long coherence time and robustness against laser intensity noise. We describe how to engineer the SWAP gate and high-fidelity controlled-phase gates by adjusting the fermion hopping as well as Feshbach interaction strengths. Combined with previously demonstrated single-qubit rotations, these gates establish the computational universality of the system. Furthermore, we show that 2D quantum Ising Hamiltonians with tunable transverse and longitudinal fields can be efficient simulated by modulating Feshbach interaction strengths. We present a sample-efficient protocol to characterize engineered gates and Hamiltonian dynamics based on an improved classical shadow process tomography that requires minimal experimental controls. Our work opens up new opportunities to harness existing ultracold quantum gases for quantum information sciences.

First Impressions: Early-Time Classification of Supernovae using Host Galaxy Information and Shallow Learning
Alexander Gagliano, Gabriella Contardo, Daniel Foreman-Mackey, Alex I. Malz, Patrick D. Aleo
[ arXiv:2305.08894 ]

Abstract

Substantial effort has been devoted to the characterization of transient phenomena from photometric information. Automated approaches to this problem have taken advantage of complete phase-coverage of an event, limiting their use for triggering rapid follow-up of ongoing phenomena. In this work, we introduce a neural network with a single recurrent layer designed explicitly for early photometric classification of supernovae. Our algorithm leverages transfer learning to account for model misspecification, host galaxy photometry to solve the data scarcity problem soon after discovery, and a custom weighted loss to prioritize accurate early classification. We first train our algorithm using state-of-the-art transient and host galaxy simulations, then adapt its weights and validate it on the spectroscopically-confirmed SNe Ia, SNe II, and SNe Ib/c from the Zwicky Transient Facility Bright Transient Survey. On observed data, our method achieves an overall accuracy of 82±2% within 3 days of an events discovery, and an accuracy of 87±5% within 30 days of discovery. At both early and late phases, our method achieves comparable or superior results to the leading classification algorithms with a simpler network architecture. These results help pave the way for rapid photometric and spectroscopic follow-up of scientifically-valuable transients discovered in massive synoptic surveys.

Constraint of pionless EFT using two-nucleon spectra from lattice QCD
William Detmold, Fernando Romero-López, Phiala E. Shanahan
[ arXiv:2305.06313 ]

Abstract

Finite-volume pionless effective field theory (FVEFTπ/) at next-to-leading order (NLO) is used to analyze the two-nucleon lattice QCD spectrum of Ref.~\cite{Amarasinghe:2021lqa}, performed at quark masses corresponding to a pion mass of approximately 800 MeV. Specifically, the effective theory is formulated in finite volume, and variational sets of wave functions are optimized using differential programming. Using these wave functions projected to the appropriate finite-volume symmetry group, variational bounds from FVEFTπ/ are obtained for the ground state, as well as excited states. By comparison with the lattice QCD GEVP spectrum, different low energy constants (LECs) are constrained. Relativistic corrections are incorporated, allowing for the extractions of NLO LECs, as well as the leading s-d-wave mixing term in the deuteron channel.

Normalizing flows for lattice gauge theory in arbitrary space-time dimension
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Alexander G.D.G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2305.02402 ]

Abstract

Applications of normalizing flows to the sampling of field configurations in lattice gauge theory have so far been explored almost exclusively in two space-time dimensions. We report new algorithmic developments of gauge-equivariant flow architectures facilitating the generalization to higher-dimensional lattice geometries. Specifically, we discuss masked autoregressive transformations with tractable and unbiased Jacobian determinants, a key ingredient for scalable and asymptotically exact flow-based sampling algorithms. For concreteness, results from a proof-of-principle application to SU(3) lattice gauge theory in four space-time dimensions are reported.

Searching for ribbons with machine learning
Sergei Gukov, James Halverson, Ciprian Manolescu, Fabian Ruehle
[ arXiv:2304.09304 ]

Abstract

We apply Bayesian optimization and reinforcement learning to a problem in topology: the question of when a knot bounds a ribbon disk. This question is relevant in an approach to disproving the four-dimensional smooth Poincaré conjecture; using our programs, we rule out many potential counterexamples to the conjecture. We also show that the programs are successful in detecting many ribbon knots in the range of up to 70 crossings.

Correlation function distributions for O(N) lattice field theories in the disordered phase
Cagin Yunus, William Detmold
[ arXiv:2304.03820 ]

Abstract

Numerical computations in strongly-interacting quantum field theories are often performed using Monte-Carlo sampling methods. A key task in these calculations is to estimate the value of a given physical quantity from the distribution of stochastic samples that are generated using the Monte-Carlo method. Typically, the sample mean and sample variance are used to define the expectation values and uncertainties of computed quantities. However, the Monte-Carlo sample distribution contains more information than these basic properties and it is useful to investigate it more generally. In this work, the exact form of the probability distributions of two-point correlation functions at zero momentum in O(N) lattice field theories in the disordered phase and in infinite volume are determined. These distributions allow for a robust investigation of the efficacy of the Monte-Carlo sampling procedure and are shown also to allow for improved estimators of the target physical quantity to be constructed. The theoretical expectations are shown to agree with numerical calculations in the O(2) model.

Autoregressive Neural TensorNet: Bridging Neural Networks and Tensor Networks for Quantum Many-Body Simulationg
Zhuo Chen, Laker Newhouse, Eddie Chen, Di Luo, Marin Soljačić
[ arXiv:2304.01996 ]

Abstract

Quantum many-body physics simulation has important impacts on understanding fundamental science and has applications to quantum materials design and quantum technology. However, due to the exponentially growing size of the Hilbert space with respect to the particle number, a direct simulation is intractable. While representing quantum states with tensor networks and neural networks are the two state-of-the-art methods for approximate simulations, each has its own limitations in terms of expressivity and optimization. To address these challenges, we develop a novel architecture, Autoregressive Neural TensorNet (ANTN), which bridges tensor networks and autoregressive neural networks. We show that Autoregressive Neural TensorNet parameterizes normalized wavefunctions with exact sampling, generalizes the expressivity of tensor networks and autoregressive neural networks, and inherits a variety of symmetries from autoregressive neural networks. We demonstrate our approach on the 2D J1-J2 Heisenberg model with different systems sizes and coupling parameters, outperforming both tensor networks and autoregressive neural networks. Our work opens up new opportunities for both scientific simulations and machine learning applications.

Artificial intelligence for artificial materials: moiré atom
Di Luo, Aidan P. Reddy, Trithep Devakul, Liang Fu
[ arXiv:2303.08162 ]

Abstract

Moiré engineering in atomically thin van der Waals heterostructures creates artificial quantum materials with designer properties. We solve the many-body problem of interacting electrons confined to a moiré superlattice potential minimum (the moiré atom) using a 2D fermionic neural network. We show that strong Coulomb interactions in combination with the anisotropic moiré potential lead to striking ``Wigner molecule" charge density distributions observable with scanning tunneling microscopy.

Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows
Owen Dugan, Peter Y. Lu, Rumen Dangovski, Di Luo, Marin Soljačić
[ arXiv:2302.12235 ]

Abstract

Studying the dynamics of open quantum systems holds the potential to enable breakthroughs both in fundamental physics and applications to quantum engineering and quantum computation. Due to the high-dimensional nature of the problem, customized deep generative neural networks have been instrumental in modeling the high-dimensional density matrix ρ, which is the key description for the dynamics of such systems. However, the complex-valued nature and normalization constraints of ρ, as well as its complicated dynamics, prohibit a seamless connection between open quantum systems and the recent advances in deep generative modeling. Here we lift that limitation by utilizing a reformulation of open quantum system dynamics to a partial differential equation (PDE) for a corresponding probability distribution Q, the Husimi Q function. Thus, we model the Q function seamlessly with off-the-shelf deep generative models such as normalizing flows. Additionally, we develop novel methods for learning normalizing flow evolution governed by high-dimensional PDEs, based on the Euler method and the application of the time-dependent variational principle. We name the resulting approach Q-Flow and demonstrate the scalability and efficiency of Q-Flow on open quantum system simulations, including the dissipative harmonic oscillator and the dissipative bosonic model. Q-Flow is superior to conventional PDE solvers and state-of-the-art physics-informed neural network solvers, especially in high-dimensional systems.

Geometry of contact: contact planning for multi-legged robots via spin models duality
Baxi Chong, Di Luo, Tianyu Wang, Gabriel Margolis, Juntao He, Pulkit Agrawal, Marin Soljačić, Daniel I. Goldman
[ arXiv:2302.03019 ]

Abstract

Contact planning is crucial in locomoting systems.Specifically, appropriate contact planning can enable versatile behaviors (e.g., sidewinding in limbless locomotors) and facilitate speed-dependent gait transitions (e.g., walk-trot-gallop in quadrupedal locomotors). The challenges of contact planning include determining not only the sequence by which contact is made and broken between the locomotor and the environments, but also the sequence of internal shape changes (e.g., body bending and limb shoulder joint oscillation). Most state-of-art contact planning algorithms focused on conventional robots (e.g.biped and quadruped) and conventional tasks (e.g. forward locomotion), and there is a lack of study on general contact planning in multi-legged robots. In this paper, we show that using geometric mechanics framework, we can obtain the global optimal contact sequence given the internal shape changes sequence. Therefore, we simplify the contact planning problem to a graph optimization problem to identify the internal shape changes. Taking advantages of the spatio-temporal symmetry in locomotion, we map the graph optimization problem to special cases of spin models, which allows us to obtain the global optima in polynomial time. We apply our approach to develop new forward and sidewinding behaviors in a hexapod and a 12-legged centipede. We verify our predictions using numerical and robophysical models, and obtain novel and effective locomotion behaviors.

Simulating 2+1D Lattice Quantum Electrodynamics at Finite Density with Neural Flow Wavefunctions
Zhuo Chen, Di Luo, Kaiwen Hu, Bryan K. Clark
[ arXiv:2212.06835 ]

Abstract

We present a neural flow wavefunction, Gauge-Fermion FlowNet, and use it to simulate 2+1D lattice compact quantum electrodynamics with finite density dynamical fermions. The gauge field is represented by a neural network which parameterizes a discretized flow-based transformation of the amplitude while the fermionic sign structure is represented by a neural net backflow. This approach directly represents the U(1) degree of freedom without any truncation, obeys Guass's law by construction, samples autoregressively avoiding any equilibration time, and variationally simulates Gauge-Fermion systems with sign problems accurately. In this model, we investigate confinement and string breaking phenomena in different fermion density and hopping regimes. We study the phase transition from the charge crystal phase to the vacuum phase at zero density, and observe the phase seperation and the net charge penetration blocking effect under magnetic interaction at finite density. In addition, we investigate a magnetic phase transition due to the competition effect between the kinetic energy of fermions and the magnetic energy of the gauge field. With our method, we further note potential differences on the order of the phase transitions between a continuous U(1) system and one with finite truncation. Our state-of-the-art neural network approach opens up new possibilities to study different gauge theories coupled to dynamical matter in higher dimensions.

Gauge Equivariant Neural Networks for 2+1D U(1) Gauge Theory Simulations in Hamiltonian Formulation
Di Luo, Shunyue Yuan, James Stokes, Bryan K. Clark
[ arXiv:2211.03198 ]

Abstract

Gauge Theory plays a crucial role in many areas in science, including high energy physics, condensed matter physics and quantum information science. In quantum simulations of lattice gauge theory, an important step is to construct a wave function that obeys gauge symmetry. In this paper, we have developed gauge equivariant neural network wave function techniques for simulating continuous-variable quantum lattice gauge theories in the Hamiltonian formulation. We have applied the gauge equivariant neural network approach to find the ground state of 2+1-dimensional lattice gauge theory with U(1) gauge group using variational Monte Carlo. We have benchmarked our approach against the state-of-the-art complex Gaussian wave functions, demonstrating improved performance in the strong coupling regime and comparable results in the weak coupling regime.

QuACK: Accelerating Gradient-Based Quantum Optimization with Koopman Operator Learning
Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić
[ arXiv:2211.01365 ]

Abstract

Finding efficient optimization methods plays an important role for quantum optimization and quantum machine learning on near-term quantum computers. While backpropagation on classical computers is computationally efficient, obtaining gradients on quantum computers is not, because the computational complexity usually scales with the number of parameters and measurements. In this paper, we connect Koopman operator theory, which has been successful in predicting nonlinear dynamics, with natural gradient methods in quantum optimization. We propose a data-driven approach using Koopman operator learning to accelerate quantum optimization and quantum machine learning. We develop two new families of methods: the sliding window dynamic mode decomposition (DMD) and the neural DMD for efficiently updating parameters on quantum computers. We show that our methods can predict gradient dynamics on quantum computers and accelerate the variational quantum eigensolver used in quantum optimization, as well as quantum machine learning. We further implement our Koopman operator learning algorithm on a real IBM quantum computer and demonstrate their practical effectiveness.

Learning to Optimize Quasi-Newton Methods
Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić
[ arXiv:2210.06171 ]

Abstract

We introduce a novel machine learning optimizer called LODO, which online meta-learns an implicit inverse Hessian of the loss as a subroutine of quasi-Newton optimization. Our optimizer merges Learning to Optimize (L2O) techniques with quasi-Newton methods to learn neural representations of symmetric matrix vector products, which are more flexible than those in other quasi-Newton methods. Unlike other L2O methods, ours does not require any meta-training on a training task distribution, and instead learns to optimize on the fly while optimizing on the test task, adapting to the local characteristics of the loss landscape while traversing it. Theoretically, we show that our optimizer approximates the inverse Hessian in noisy loss landscapes and is capable of representing a wide range of inverse Hessians. We experimentally verify our algorithm's performance in the presence of noise, and show that simpler alternatives for representing the inverse Hessians worsen performance. Lastly, we use our optimizer to train a semi-realistic deep neural network with 95k parameters, and obtain competitive results against standard neural network optimizers.

On the Importance of Calibration in Semi-supervised Learning
Charlotte Loh, Rumen Dangovski, Shivchander Sudalairaj, Seungwook Han, Ligong Han, Leonid Karlinsky, Marin Soljacic, Akash Srivastava
[ arXiv:2210.04783 ]

Abstract

State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data by combining techniques of consistency regularization and pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and thus, model calibration is important in mitigating confirmation bias. Yet, many SOTA methods are optimized for model performance, with little focus directed to improve model calibration. In this work, we empirically demonstrate that model calibration is strongly correlated with model performance and propose to improve calibration via approximate Bayesian techniques. We introduce a family of new SSL models that optimizes for calibration and demonstrate their effectiveness across standard vision benchmarks of CIFAR-10, CIFAR-100 and ImageNet, giving up to 15.9% improvement in test accuracy. Furthermore, we also demonstrate their effectiveness in additional realistic and challenging problems, such as class-imbalanced datasets and in photonics science.

Sampling QCD field configurations with gauge-equivariant flow models
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2208.03832 ]

Abstract

Machine learning methods based on normalizing flows have been shown to address important challenges, such as critical slowing-down and topological freezing, in the sampling of gauge field configurations in simple lattice field theories. A critical question is whether this success will translate to studies of QCD. This Proceedings presents a status update on advances in this area. In particular, it is illustrated how recently developed algorithmic components may be combined to construct flow-based sampling algorithms for QCD in four dimensions. The prospects and challenges for future use of this approach in at-scale applications are summarized.

Strong Lensing Source Reconstruction Using Continuous Neural Fields
Siddharth Mishra-Sharma, Ge Yang
[ arXiv:2206.14820 ]

Abstract

From the nature of dark matter to the rate of expansion of our Universe, observations of distant galaxies distorted through strong gravitational lensing have the potential to answer some of the major open questions in astrophysics. Modeling galaxy-galaxy strong lensing observations presents a number of challenges as the exact configuration of both the background source and foreground lens galaxy is unknown. A timely call, prompted by a number of upcoming surveys anticipating high-resolution lensing images, demands methods that can efficiently model lenses at their full complexity. In this work, we introduce a method that uses continuous neural fields to non-parametrically reconstruct the complex morphology of a source galaxy while simultaneously inferring a distribution over foreground lens galaxy configurations. We demonstrate the efficacy of our method through experiments on simulated data targeting high-resolution lensing images similar to those anticipated in near-future astrophysical surveys.

Simplifying Polylogarithms with Machine Learning
Aurélien Dersy, Matthew D. Schwartz, Xiaoyuan Zhang
[ arXiv:2206.04115 ]

Abstract

Polylogrithmic functions, such as the logarithm or dilogarithm, satisfy a number of algebraic identities. For the logarithm, all the identities follow from the product rule. For the dilogarithm and higher-weight classical polylogarithms, the identities can involve five functions or more. In many calculations relevant to particle physics, complicated combinations of polylogarithms often arise from Feynman integrals. Although the initial expressions resulting from the integration usually simplify, it is often difficult to know which identities to apply and in what order. To address this bottleneck, we explore to what extent machine learning methods can help. We consider both a reinforcement learning approach, where the identities are analogous to moves in a game, and a transformer network approach, where the problem is viewed analogously to a language-translation task. While both methods are effective, the transformer network appears more powerful and holds promise for practical use in symbolic manipulation tasks in mathematical physics.

Identifying equivalent Calabi–Yau topologies: A discrete challenge from math and physics for machine learning
Vishnu Jejjala, Washington Taylor, Andrew Turner
[ arXiv:2202.07590 ]

Abstract

We review briefly the characteristic topological data of Calabi–Yau threefolds and focus on the question of when two threefolds are equivalent through related topological data. This provides an interesting test case for machine learn- ing methodology in discrete mathematics problems motivated by physics.

Finite-Volume Pionless Effective Field Theory for Few-Nucleon Systems with Differentiable Programming
Xiangkai Sun, William Detmold, Di Luo, Phiala E. Shanahan
[ arXiv:2202.03530 ]

Abstract

Finite-volume pionless effective field theory provides an efficient framework for the extrapolation of nuclear spectra and matrix elements calculated at finite volume in lattice QCD to infinite volume, and to nuclei with larger atomic number. In this work, it is demonstrated how this framework may be implemented via a set of correlated Gaussian wavefunctions optimised using differentiable programming and via solution of a generalised eigenvalue problem. This approach is shown to be significantly more efficient than a stochastic implementation of the variational method based on the same form of correlated Gaussian wavefunctions, yielding comparably accurate representations of the ground-state wavefunctions with an order of magnitude fewer terms. The efficiency of representation allows such calculations to be extended to larger systems than in previous work. The method is demonstrated through calculations of the binding energies of nuclei with atomic number A∈{2,3,4} in finite volume, matched to lattice QCD calculations at quark masses corresponding to mπ=806 MeV, and infinite-volume effective field theory calculations of A∈{2,3,4,5,6} systems based on this matching.

Building Quantum Field Theories Out of Neurons
James Halverson
[ arXiv:2112.04527 ]

Abstract

An approach to field theory is studied in which fields are comprised of N constituent random neurons. Gaussian theories arise in the infinite-N limit when neurons are independently distributed, via the Central Limit Theorem, while interactions arise due to finite-N effects or non-independently distributed neurons. Euclidean-invariant ensembles of neurons are engineered, with tunable two-point function, yielding families of Euclidean-invariant field theories. Some Gaussian, Euclidean invariant theories are reflection positive, which allows for analytic continuation to a Lorentz-invariant quantum field theory. Examples are presented that yield dual theories at infinite-N, but have different symmetries at finite-N. Landscapes of classical field configurations are determined by local maxima of parameter distributions. Predictions arise from mixed field-neuron correlators. Near-Gaussianity is exhibited at large-N, potentially explaining a feature of field theories in Nature.

Quantum reservoir computing using arrays of Rydberg atoms
Rodrigo Araiza Bravo, Khadijeh Najafi, Xun Gao, Susanne F. Yelin
[ arXiv:2111.10956 ]

Abstract

Quantum computing promises to provide machine learning with computational advantages. However, noisy intermediate-scale quantum (NISQ) devices pose engineering challenges to realizing quantum machine learning (QML) advantages. Recently, a series of QML computational models inspired by the noise-tolerant dynamics on the brain have emerged as a means to circumvent the hardware limitations of NISQ devices. In this article, we introduce a quantum version of a recurrent neural network (RNN), a well-known model for neural circuits in the brain. Our quantum RNN (qRNN) makes use of the natural Hamiltonian dynamics of an ensemble of interacting spin-1/2 particles as a means for computation. In the limit where the Hamiltonian is diagonal, the qRNN recovers the dynamics of the classical version. Beyond this limit, we observe that the quantum dynamics of the qRNN provide it quantum computational features that can aid it in computation. To this end, we study a qRNN based on arrays of Rydberg atoms, and show that the qRNN is indeed capable of replicating the learning of several cognitive tasks such as multitasking, decision making, and long-term memory by taking advantage of several key features of this platform such as interatomic species interactions, and quantum many-body scars.

Inferring dark matter substructure with astrometric lensing beyond the power spectrum
Siddharth Mishra-Sharma
[ arXiv:2110.01620 ]

Abstract

Astrometry -- the precise measurement of positions and motions of celestial objects -- has emerged as a promising avenue for characterizing the dark matter population in our Galaxy. By leveraging recent advances in simulation-based inference and neural network architectures, we introduce a novel method to search for global dark matter-induced gravitational lensing signatures in astrometric datasets. Our method based on neural likelihood-ratio estimation shows significantly enhanced sensitivity to a cold dark matter population and more favorable scaling with measurement noise compared to existing approaches based on two-point correlation statistics, establishing machine learning as a powerful tool for characterizing dark matter using astrometric data.

Flow-based sampling for multimodal distributions in lattice field theory
Daniel C. Hackett, Chung-Chun Hsieh, Michael S. Albergo, Denis Boyda, Jiunn-Wei Chen, Kai-Feng Chen, Kyle Cranmer, Gurtej Kanwar, Phiala E. Shanahan
[ arXiv:2107.00734 ]

Abstract

Recent results have demonstrated that samplers constructed with flow-based generative models are a promising new approach for configuration generation in lattice field theory. In this paper, we present a set of methods to construct flow models for targets with multiple separated modes (i.e. theories with multiple vacua). We demonstrate the application of these methods to modeling two-dimensional real scalar field theory in its symmetry-broken phase. In this context we investigate the performance of different flow-based sampling algorithms, including a composite sampling algorithm where flow-based proposals are occasionally augmented by applying updates using traditional algorithms like HMC.

Symmetry-via-Duality: Invariant Neural Network Densities from Parameter-Space Correlators
Anindita Maiti, Keegan Stoner, James Halverson
[ arXiv:2106.00694 ]

Abstract

Parameter-space and function-space provide two different duality frames in which to study neural networks. We demonstrate that symmetries of network densities may be determined via dual computations of network correlation functions, even when the density is unknown and the network is not equivariant. Symmetry-via-duality relies on invariance properties of the correlation functions, which stem from the choice of network parameter distributions. Input and output symmetries of neural network densities are determined, which recover known Gaussian process results in the infinite width limit. The mechanism may also be utilized to determine symmetries during training, when parameters are correlated, as well as symmetries of the Neural Tangent Kernel. We demonstrate that the amount of symmetry in the initialization density affects the accuracy of networks trained on Fashion-MNIST, and that symmetry breaking helps only when it is in the direction of ground truth.

Introduction to Normalizing Flows for Lattice Field Theory
Michael S. Albergo, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Kyle Cranmer, Sébastien Racanière, Danilo Jimenez Rezende, and Phiala E. Shanahan
[ arXiv:2101.08176 ]

Abstract

This notebook tutorial demonstrates a method for sampling Boltzmann distributions of lattice field theories using a class of machine learning models known as normalizing flows. The ideas and approaches proposed in arXiv:1904.12072, arXiv:2002.02428, and arXiv:2003.06413 are reviewed and a concrete implementation of the framework is presented. We apply this framework to a lattice scalar field theory and to U(1) gauge theory, explicitly encoding gauge symmetries in the flow-based approach to the latter. This presentation is intended to be interactive and working with the attached Jupyter notebook is recommended.

Published

Rigor with Machine Learning from Field Theory to the Poincaré Conjecture
Sergei Gukov, James Halverson, Fabian Ruehle
Nature Reviews Physics 2024 [ arXiv:2402.13321 ]

Abstract

Machine learning techniques are increasingly powerful, leading to many breakthroughs in the natural sciences, but they are often stochastic, error-prone, and blackbox. How, then, should they be utilized in fields such as theoretical physics and pure mathematics that place a premium on rigor and understanding? In this Perspective we discuss techniques for obtaining rigor in the natural sciences with machine learning. Non-rigorous methods may lead to rigorous results via conjecture generation or verification by reinforcement learning. We survey applications of these techniques-for-rigor ranging from string theory to the smooth 4d Poincaré conjecture in low-dimensional topology. One can also imagine building direct bridges between machine learning theory and either mathematics or theoretical physics. As examples, we describe a new approach to field theory motivated by neural network theory, and a theory of Riemannian metric flows induced by neural network gradient descent, which encompasses Perelmans formulation of the Ricci flow that was utilized to resolve the 3d Poincaré conjecture.

Advances in machine-learning-based sampling motivated by lattice quantum chromodynamics
Kyle Cranmer, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Phiala E. Shanahan
Nature Reviews Physics, 2023, Volume 5 [ arXiv:2309.01156 ]

Abstract

Sampling from known probability distributions is a ubiquitous task in computational science, underlying calculations in domains from linguistics to biology and physics. Generative machine-learning (ML) models have emerged as a promising tool in this space, building on the success of this approach in applications such as image, text, and audio generation. Often, however, generative tasks in scientific domains have unique structures and features -- such as complex symmetries and the requirement of exactness guarantees -- that present both challenges and opportunities for ML. This Perspective outlines the advances in ML-based sampling motivated by lattice quantum field theory, in particular for the theory of quantum chromodynamics. Enabling calculations of the structure and interactions of matter from our most fundamental understanding of particle physics, lattice quantum chromodynamics is one of the main consumers of open-science supercomputing worldwide. The design of ML algorithms for this application faces profound challenges, including the necessity of scaling custom ML architectures to the largest supercomputers, but also promises immense benefits, and is spurring a wave of development in ML-based sampling more broadly. In lattice field theory, if this approach can realize its early promise it will be a transformative step towards first-principles physics calculations in particle, nuclear and condensed matter physics that are intractable with traditional approaches.

Subhalo effective density slope measurements from HST strong lensing data with neural likelihood-ratio estimation
Gemma Zhang, Atınç Çağan Şengül, Cora Dvorkin
Monthly Notices of the Royal Astronomical Society, 2024, Volume 527, Issue 2 [ arXiv:2308.09739 ]

Abstract

Examining the properties of subhalos with strong gravitational lensing images can shed light on the nature of dark matter. From upcoming large-scale surveys, we expect to discover orders of magnitude more strong lens systems that can be used for subhalo studies. To optimally extract information from a large number of strong lensing images, machine learning provides promising avenues for efficient analysis that is unachievable with traditional analysis methods, but application of machine learning techniques to real observations is still limited. We build upon previous work, which uses a neural likelihood-ratio estimator, to constrain the effective density slopes of subhalos and demonstrate the feasibility of this method on real strong lensing observations. To do this, we implement significant improvements to the forward simulation pipeline and undertake careful model evaluation using simulated images. Ultimately, we use our trained model to predict the effective subhalo density slope from combining a set of strong lensing images taken by the extit{Hubble Space Telescope}. We found the subhalo slope measurement of this set of observations to be steeper than the slope predictions of cold dark matter subhalos. Our result adds to several previous works that also measured high subhalo slopes in observations. Although a possible explanation for this is that subhalos with steeper slopes are easier to detect due to selection effects and thus contribute to statistical bias, our result nevertheless points to the need for careful analysis of more strong lensing observations from future surveys.

Data Compression and Inference in Cosmology with Self-Supervised Machine Learning
Aizhan Akhmetzhanova, Siddharth Mishra-Sharma, Cora Dvorkin
Monthly Notices of the Royal Astronomical Society, 2023, Volume 527, Issue 3 [ arXiv:2308.09751 ]

Abstract

The influx of massive amounts of data from current and upcoming cosmological surveys necessitates compression schemes that can efficiently summarize the data with minimal loss of information. We introduce a method that leverages the paradigm of self-supervised machine learning in a novel manner to construct representative summaries of massive datasets using simulation-based augmentations. Deploying the method on hydrodynamical cosmological simulations, we show that it can deliver highly informative summaries, which can be used for a variety of downstream tasks, including precise and accurate parameter inference. We demonstrate how this paradigm can be used to construct summary representations that are insensitive to prescribed systematic effects, such as the influence of baryonic physics. Our results indicate that self-supervised machine learning techniques offer a promising new approach for compression of cosmological data as well its analysis.

Gravitational action for a massive Majorana fermion in 2d quantum gravity
Corinne de Lacroix, Harold Erbin, Vincent Lahoche
Journal of High Energy Physics, 2024, Volume 2024, Article number 68 [ arXiv:2308.08342 ]

Abstract

We compute the gravitational action of a free massive Majorana fermion coupled to two-dimensional gravity on compact Riemann surfaces of arbitrary genus. The structure is similar to the case of the massive scalar. The small-mass expansion of the gravitational yields the Liouville action at zeroth order, and we can identify the Mabuchi action at first order. While the massive Majorana action is a conformal deformation of the massless Majorana CFT, we find an action different from the one given by the David-Distler-Kawai (DDK) ansatz.

Lattice quantum chromodynamics at large isospin density: 6144 pions in a box
Ryan Abbott, William Detmold, Fernando Romero-López, Zohreh Davoudi, Marc Illa, Assumpta Parreño, Robert J. Perry, Phiala E. Shanahan, Michael L. Wagman
APS Journals 2023, Volume 108, Issue 11 [ arXiv:2307.15014 ]

Abstract

We present an algorithm to compute correlation functions for systems with the quantum numbers of many identical mesons from lattice quantum chromodynamics (QCD). The algorithm is numerically stable and allows for the computation of n-pion correlation functions for n∈{1,…,N} using a single N×N matrix decomposition, improving on previous algorithms. We apply the algorithm to calculations of correlation functions with up to 6144 π+s using two ensembles of gauge field configurations generated with quark masses corresponding to a pion mass mπ=170 MeV and spacetime volumes of (4.43×8.8) fm4 and (5.83×11.6) fm4. We also discuss statistical techniques for the analysis of such systems, in which the correlation functions vary over many orders of magnitude. In particular, we observe that the many-pion correlation functions are well approximated by log-normal distributions, allowing the extraction of the energies of these systems. Using these energies, the large-isospin-density, zero-baryon-density region of the QCD phase diagram is explored. A peak is observed in the energy density at an isospin chemical potential μI∼1.5mπ, signalling the transition into a Bose-Einstein condensed phase. The isentropic speed of sound in the medium is seen to exceed the ideal-gas (conformal) limit (c2s≤1/3) over a wide range of chemical potential before falling towards the asymptotic expectation at μI∼15mπ. These, and other thermodynamic observables, indicate that the isospin chemical potential must be large for the system to be well described by an ideal gas or perturbative QCD.

A Spectral Metric for Collider Geometry
Andrew J. Larkoski, Jesse Thaler
Journal of High Energy Physics 2023, Volume 2023, article number 107 [ arXiv:2305.03751 ]

Abstract

By quantifying the distance between two collider events, one can triangulate a metric space and reframe collider data analysis as computational geometry. One popular geometric approach is to first represent events as an energy flow on an idealized celestial sphere and then define the metric in terms of optimal transport in two dimensions. In this paper, we advocate for representing events in terms of a spectral function that encodes pairwise particle angles and products of particle energies, which enables a metric distance defined in terms of one-dimensional optimal transport. This approach has the advantage of automatically incorporating obvious isometries of the data, like rotations about the colliding beam axis. It also facilitates first-principles calculations, since there are simple closed-form expressions for optimal transport in one dimension. Up to isometries and event sets of measure zero, the spectral representation is unique, so the metric on the space of spectral functions is a metric on the space of events. At lowest order in perturbation theory in electron-positron collisions, our metric is simply the summed squared invariant masses of the two event hemispheres. Going to higher orders, we present predictions for the distribution of metric distances between jets in fixed-order and resummed perturbation theory as well as in parton-shower generators. Finally, we speculate on whether the spectral approach could furnish a useful metric on the space of quantum field theories.

Level Crossings, Attractor Points and Complex Multiplication
Hamza Ahmed, Fabian Ruehle
Journal of High Energy Physics, 2023, Volume 2023, Article number 164 [ arXiv:2304.00027 ]

Abstract

We study the complex structure moduli dependence of the scalar Laplacian eigenmodes for one-parameter families of Calabi-Yau n-folds in P^{n+1}. It was previously observed that some eigenmodes get lighter while others get heavier as a function of these moduli, which leads to eigenvalue crossing. We identify the cause for this behavior for the torus. We then show that at points in a sublocus of complex structure moduli space where Laplacian eigenmodes cross, the torus has complex multiplication. We speculate that the generalization to arbitrary Calabi-Yau manifolds could be that level crossing is related to rank one attractor points. To test this, we compute the eigenmodes numerically for the quartic K3 and the quintic threefold, and match crossings to CM and attractor points in these varieties. To quantify the error of our numerical methods, we also study the dependence of the numerical spectrum on the quality of the Calabi-Yau metric approximation, the number of points sampled from the Calabi-Yau variety, the truncation of the eigenbasis, and the the distance from degeneration points in complex structure moduli space.

Exploring the CP-violating Dashen phase in the Schwinger model with tensor networks
Lena Funcke, Karl Jansen, Stefan Kühn
Physical Review D, 2023, Volume 108, Issue 1 [ arXiv:2303.03799 ]

Abstract

We numerically study the phase structure of the two-flavor Schwinger model with matrix product states, focusing on the (1+1)-dimensional analog of the CP-violating Dashen phase in QCD. We simulate the two-flavor Schwinger model around the point where the positive mass of one fermion flavor corresponds to the negative mass of the other fermion flavor, which is a sign-problem afflicted regime for conventional Monte Carlo techniques. Our results indicate that the model undergoes a CP-violating Dashen phase transition at this point, which manifests itself in abrupt changes of the average electric field and the analog of the pion condensate in the model. Studying the scaling of the bipartite entanglement entropy as a function of the volume, we find clear indications that this transition is not of first order.

Computational Mirror Symmetry
Mehmet Demirtas, Manki Kim, Liam McAllister, Jakob Moritz, Andres Rios-Tascon
Journal of High Energy Physics, 2024, Volume 2024, Article number 184 [ arXiv:2303.00757 ]

Abstract

We present an efficient algorithm for computing the prepotential in compactifications of type II string theory on mirror pairs of Calabi-Yau threefolds in toric varieties. Applying this method, we exhibit the first systematic computation of genus-zero Gopakumar-Vafa invariants in compact threefolds with many moduli, including examples with up to 491 vector multiplets.

SHAPER: Can You Hear the Shape of a Jet?
Demba Ba, Akshunna S. Dogra, Rikab Gambhir, Abiy Tasissa, Jesse Thaler
Journal of High Energy Physics, 2023, Volume 2023, Article 195 [ arXiv:2302.12266 ]

Abstract

The identification of interesting substructures within jets is an important tool for searching for new physics and probing the Standard Model at colliders. Many of these substructure tools have previously been shown to take the form of optimal transport problems, in particular the Energy Mover's Distance (EMD). In this work, we show that the EMD is in fact the natural structure for comparing collider events, which accounts for its recent success in understanding event and jet substructure. We then present a Shape Hunting Algorithm using Parameterized Energy Reconstruction (SHAPER), which is a general framework for defining and computing shape-based observables. SHAPER generalizes N-jettiness from point clusters to any extended, parametrizable shape. This is accomplished by efficiently minimizing the EMD between events and parameterized manifolds of energy flows representing idealized shapes, implemented using the dual-potential Sinkhorn approximation of the Wasserstein metric. We show how the geometric language of observables as manifolds can be used to define novel observables with built-in infrared-and-collinear safety. We demonstrate the efficacy of the SHAPER framework by performing empirical jet substructure studies using several examples of new shape-based observables.

EPiC-GAN: Equivariant Point Cloud Generation for Particle Jets
Erik Buhmann, Gregor Kasieczka, Jesse Thaler
SciPost Physics, 2023, Volume 15, Issue 4 [ arXiv:2301.08128 ]

Abstract

With the vast data-collecting capabilities of current and future high-energy collider experiments, there is an increasing demand for computationally efficient simulations. Generative machine learning models enable fast event generation, yet so far these approaches are largely constrained to fixed data structures and rigid detector geometries. In this paper, we introduce EPiC-GAN - equivariant point cloud generative adversarial network - which can produce point clouds of variable multiplicity. This flexible framework is based on deep sets and is well suited for simulating sprays of particles called jets. The generator and discriminator utilize multiple EPiC layers with an interpretable global latent vector. Crucially, the EPiC layers do not rely on pairwise information sharing between particles, which leads to a significant speed-up over graph- and transformer-based approaches with more complex relation diagrams. We demonstrate that EPiC-GAN scales well to large particle multiplicities and achieves high generation fidelity on benchmark jet generation tasks.

Comparing Point Cloud Strategies for Collider Event Classification
Peter Onyisi, Delon Shen, Jesse Thaler
Physical Review D, 2023, Volume 108, Issue 1 [ arXiv:2212.10659 ]

Abstract

In this paper, we compare several event classification architectures defined on the point cloud representation of collider events. These approaches, which are based on the frameworks of deep sets and edge convolutions, circumvent many of the difficulties associated with traditional feature engineering. To benchmark our architectures against more traditional event classification strategies, we perform a case study involving Higgs boson decays to tau leptons. We find a 2.5 times increase in performance compared to a baseline ATLAS analysis with engineered features. Our point cloud architectures can be viewed as simplified versions of graph neural networks, where each particle in the event corresponds to a graph node. In our case study, we find the best balance of performance and computational cost for simple pairwise architectures, which are based on learned edge features.

Characterizing 4-string contact interaction using machine learning
Harold Erbin, Atakan Hilmi Fırat
Journal of High Energy Physics, 2024, Article 16 [ arXiv:2211.09129 ]

Abstract

The geometry of 4-string contact interaction of closed string field theory is characterized using machine learning. We obtain Strebel quadratic differentials on 4-punctured spheres as a neural network by performing unsupervised learning with a custom-built loss function. This allows us to solve for local coordinates and compute their associated mapping radii numerically. We also train a neural network distinguishing vertex from Feynman region. As a check, 4-tachyon contact term in the tachyon potential is computed and a good agreement with the results in the literature is observed. We argue that our algorithm is manifestly independent of number of punctures and scaling it to characterize the geometry of n-string contact interaction is feasible.

Aspects of scaling and scalability for flow-based sampling of lattice QCD
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
The European Physical Journal A 2023, Volume 59, Article Number 257 [ arXiv:2211.07541 ]

Abstract

Recent applications of machine-learned normalizing flows to sampling in lattice field theory suggest that such methods may be able to mitigate critical slowing down and topological freezing. However, these demonstrations have been at the scale of toy models, and it remains to be determined whether they can be applied to state-of-the-art lattice quantum chromodynamics calculations. Assessing the viability of sampling algorithms for lattice field theory at scale has traditionally been accomplished using simple cost scaling laws, but as we discuss in this work, their utility is limited for flow-based approaches. We conclude that flow-based approaches to sampling are better thought of as a broad family of algorithms with different scaling properties, and that scalability must be assessed experimentally.

Large-time correlation functions in bosonic lattice field theories
Cagin Yunus, William Detmold
Physics Letter B, 2023, Volume 840, 137890 [ arXiv:2210.15789 ]

Abstract

Large-time correlation functions have a pivotal role in extracting particle masses from Euclidean lattice field theory calculations, however little is known about the statistical properties of these quantities. In this work, the asymptotic form of the distributions of the correlation functions at vanishing momentum is determined for bosonic interacting lattice field theories with a unique gapped vacuum. It is demonstrated that the deviations from the asymptotic form at large Euclidean times can be utilized to determine the spectrum of the theory.

Deep Learning for Bayesian Optimization of Scientific Problems with High-Dimensional Structure
Samuel Kim, Peter Y. Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, Marin Soljačić
Transactions on Machine Learning Research 2022 [ ]

Abstract

Bayesian optimization (BO) is a popular paradigm for global optimization of expensive black-box functions, but there are many domains where the function is not completely a black-box. The data may have some known structure (e.g. symmetries) and/or the data generation process may be a composite process that yields useful intermediate or auxiliary information in addition to the value of the optimization objective. However, surrogate models traditionally employed in BO, such as Gaussian Processes (GPs), scale poorly with dataset size and do not easily accommodate known structure. Instead, we use Bayesian neural networks, a class of scalable and flexible surrogate models with inductive biases, to extend BO to complex, structured problems with high dimensionality. We demonstrate BO on a number of realistic problems in physics and chemistry, including topology optimization of photonic crystal materials using convolutional neural networks, and chemical property optimization of molecules using graph neural networks. On these complex tasks, we show that neural networks often outperform GPs as surrogate models for BO in terms of both sampling efficiency and computational cost.

Symmetries of Calabi-Yau Prepotentials with Isomorphic Flops
Andre Lukas, Fabian Ruehle
Journal of High Energy 2023, Article 175 [ arXiv:2210.09369 ]

Abstract

Calabi-Yau threefolds with infinitely many flops to isomorphic manifolds have an extended Kahler cone made up from an infinite number of individual Kahler cones. These cones are related by reflection symmetries across flop walls. We study the implications of this cone structure for mirror symmetry, by considering the instanton part of the prepotential in Calabi-Yau threefolds. We show that such isomorphic flops across facets of the Kahler cone boundary give rise to symmetry groups isomorphic to Coxeter groups. In the dual Mori cone, non-flopping curve classes that are identified under these groups have the same Gopakumar-Vafa invariants. This leads to instanton prepotentials invariant under Coxeter groups, which we make manifest by introducing appropriate invariant functions. For some cases, these functions can be expressed in terms of theta functions whose appearance can be linked to an elliptic fibration structure of the Calabi-Yau manifold.

Electric-Magnetic Duality in a Class of G2-Compactifications of M-theory
James Halverson, Benjamin Sung, Jiahua Tian
Journal of High Energy Physics, 2023, Volume 2023, Article 89 [ arXiv:2210.08628 ]

Abstract

We study electric-magnetic duality in compactifications of M-theory on twisted connected sum (TCS) G2 manifolds via duality with F-theory. Specifically, we study the physics of the D3-branes in F-theory compactified on a Calabi-Yau fourfold Y, dual to a compactification of M-theory on a TCS G2 manifold X. =2 supersymmetry is restored in an appropriate geometric limit. In that limit, we demonstrate that the dual of D3-branes probing seven-branes corresponds to the shrinking of certain surfaces and curves, yielding light particles that may carry both electric and magnetic charges. We provide evidence that the Minahan-Nemeschansky theories with En flavor symmetry may be realized in this way. The SL(2,ℤ) monodromy of the 3/7-brane system is dual to a Fourier-Mukai transform of the dual IIA/M-theory geometry in this limit, and we extrapolate this monodromy action to the global compactification. Away from the limit, the theory is broken to =1 supersymmetry by a D-term.

Degeneracy Engineering for Classical and Quantum Annealing: A Case Study of Sparse Linear Regression in Collider Physics
Eric R. Anschuetz, Lena Funcke, Patrick T. Komiske, Serhii Kryhin, Jesse Thaler
Physical Review D, Volume 106, Article 056008 [ arXiv:2205.10375 ]

Abstract

Classical and quantum annealing are computing paradigms that have been proposed to solve a wide range of optimization problems. In this paper, we aim to enhance the performance of annealing algorithms by introducing the technique of degeneracy engineering, through which the relative degeneracy of the ground state is increased by modifying a subset of terms in the objective Hamiltonian. We illustrate this novel approach by applying it to the example of ℓ0-norm regularization for sparse linear regression, which is in general an NP-hard optimization problem. Specifically, we show how to cast ℓ0-norm regularization as a quadratic unconstrained binary optimization (QUBO) problem, suitable for implementation on annealing platforms. As a case study, we apply this QUBO formulation to energy flow polynomials in high-energy collider physics, finding that degeneracy engineering substantially improves the annealing performance. Our results motivate the application of degeneracy engineering to a variety of regularized optimization problems.

Discovering Conservation Laws using Optimal Transport and Manifold Learning
Peter Y. Lu, Rumen Dangovski, Marin Soljačić
Nature Communications [ arXiv:2208.14995 ]

Abstract

Conservation laws are key theoretical and practical tools for understanding, characterizing, and modeling nonlinear dynamical systems. However, for many complex dynamical systems, the corresponding conserved quantities are difficult to identify, making it hard to analyze their dynamics and build efficient, stable predictive models. Current approaches for discovering conservation laws often depend on detailed dynamical information, such as the equation of motion or fine-grained time measurements, with many recent proposals also relying on black box parametric deep learning methods. We instead reformulate this task as a manifold learning problem and propose a non-parametric approach, combining the Wasserstein metric from optimal transport with diffusion maps, to discover conserved quantities that vary across trajectories sampled from a dynamical system. We test this new approach on a variety of physical systems—including conservative Hamiltonian systems, dissipative systems, and spatiotemporal systems—and demonstrate that our manifold learning method is able to both identify the number of conserved quantities and extract their values. Using tools from optimal transport theory and manifold learning, our proposed method provides a direct geometric approach to identifying conservation laws that is both robust and interpretable without requiring an explicit model of the system nor accurate time information.

Inferring subhalo effective density slopes from strong lensing observations with neural likelihood-ratio estimation
Gemma Zhang, Siddharth Mishra-Sharma, Cora Dvorkin
Monthly Notices of the Royal Astronomical Society, 2022, Volume 517, Issue 3 [ arXiv:2208.13796 ]

Abstract

Strong gravitational lensing has emerged as a promising approach for probing dark matter models on sub-galactic scales. Recent work has proposed the subhalo effective density slope as a more reliable observable than the commonly used subhalo mass function. The subhalo effective density slope is a measurement independent of assumptions about the underlying density profile and can be inferred for individual subhalos through traditional sampling methods. To go beyond individual subhalo measurements, we leverage recent advances in machine learning and introduce a neural likelihood-ratio estimator to infer an effective density slope for populations of subhalos. We demonstrate that our method is capable of harnessing the statistical power of multiple subhalos (within and across multiple images) to distinguish between characteristics of different subhalo populations. The computational efficiency warranted by the neural likelihood-ratio estimator over traditional sampling enables statistical studies of dark matter perturbers and is particularly useful as we expect an influx of strong lensing systems from upcoming surveys.

Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions
Ryan Abbott, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Betsy Tian, Julian M. Urban
Physical REview D, 2022, Volume 106, Issue 7 [ arXiv:2207.08945 ]

Abstract

This work presents gauge-equivariant architectures for flow-based sampling in fermionic lattice field theories using pseudofermions as stochastic estimators for the fermionic determinant. This is the default approach in state-of-the-art lattice field theory calculations, making this development critical to the practical application of flow models to theories such as QCD. Methods by which flow-based sampling approaches can be improved via standard techniques such as even/odd preconditioning and the Hasenbusch factorization are also outlined. Numerical demonstrations in two-dimensional U(1) and SU(3) gauge theories with Nf=2 flavors of fermions are provided.

Radio excess from stimulated dark matter decay
Andrea Caputo,Hongwan Liu, Siddharth Mishra-Sharma,Maxim Pospelov, Joshua T. Ruderman
Physical REview D, 2023, Volume 107, Issue 12 [ ]

Abstract

Despite an intense theoretical and experimental effort over the past decade, observations of the extragalactic radio background at multiple frequencies below 10 GHz are not understood in terms of known radio sources and may represent a sign of new physics. In this paper, we identify a new class of dark sector models with feebly interacting particles, where dark photons oscillate into ordinary photons that contribute to the radio background. Our scenario can explain both the magnitude and the spectral index of the radio background, while being consistent with other cosmological and astrophysical constraints. These models predict new relativistic degrees of freedom and spectral distortions of the cosmic microwave background, which could be detected in the next generation of experiments.

Power Counting Energy Flow Polynomials
Pedro Cal, Jesse Thaler, Wouter J. Waalewijn
Journal of High Energy Physics, 2022, Article 21 [ arXiv:2205.06818 ]

Abstract

Power counting is a systematic strategy for organizing collider observables and their associated theoretical calculations. In this paper, we use power counting to characterize a class of jet substructure observables called energy flow polynomials (EFPs). EFPs provide an overcomplete linear basis for infrared-and-collinear safe jet observables, but it is known that in practice, a small subset of EFPs is often sufficient for specific jet analysis tasks. By applying power counting arguments, we obtain linear relationships between EFPs that hold for quark and gluon jets to a specific order in the power counting. We test these relations in the parton shower generator Pythia, finding excellent agreement. Power counting allows us to truncate the basis of EFPs without affecting performance, which we corroborate through a study of quark-gluon tagging and regression.

Disentangling Quarks and Gluons with CMS Open Data
Patrick T. Komiske, Serhii Kryhin, Jesse Thaler
Physical Review D, 2022, Volume 106 Article 094021 [ arXiv:2205.04459 ]

Abstract

We study quark and gluon jets separately using public collider data from the CMS experiment. Our analysis is based on 2.3/fb of proton-proton collisions at 7 TeV, collected at the Large Hadron Collider in 2011. We define two non-overlapping samples via a pseudorapidity cut -- central jets with |eta| < 0.65 and forward jets with |eta| > 0.65 -- and employ jet topic modeling to extract individual distributions for the maximally separable categories. Under certain assumptions, such as sample independence and mutual irreducibility, these categories correspond to "quark" and "gluon" jets, as given by a recently proposed operational definition. We consider a number of different methods for extracting reducibility factors from the central and forward datasets, from which the fractions of quark jets in each sample can be determined. The greatest stability and robustness to statistical uncertainties is achieved by a novel method based on parametrizing the endpoints of a receiver operating characteristic (ROC) curve. To mitigate detector effects, which would otherwise induce unphysical differences between central and forward jets, we use the OmniFold method to perform central value unfolding. As a demonstration of the power of this method, we extract the intrinsic dimensionality of the quark and gluon jet samples, which exhibit Casimir scaling, as expected from the strongly-ordered limit. To our knowledge, this work is the first application of full phase space unfolding to real collider data, and one of the first applications of topic modeling to extract separate quark and gluon distributions at the LHC.

Infinite Variance in Monte Carlo Sampling of Lattice Field Theories
Cagin Yunus, William Detmold
Physical Review D, Volume 106, Article 094506 [ arXiv:2205.01001 ]

Abstract

In Monte Carlo calculations of expectation values in lattice quantum field theories, the stochastic variance of the sampling procedure that is used defines the precision of the calculation for a fixed number of samples. If the variance of an estimator of a particular quantity is formally infinite, or in practice very large compared to the square of the mean, then that quantity can not be reliably estimated using the given sampling procedure. There are multiple scenarios in which this occurs, including in Lattice Quantum Chromodynamics, and a particularly simple example is given by the Gross-Neveu model where Monte Carlo calculations involve the introduction of auxiliary bosonic variables through a Hubbard-Stratonovich (HS) transformation. Here, it is shown that the variances of HS estimators for classes of operators involving fermion fields are divergent in this model and an even simpler zero-dimensional analogue. To correctly estimate these observables, two alternative sampling methods are proposed and numerically investigated.

Going Beyond the Galaxy Power Spectrum: an Analysis of BOSS Data with Wavelet Scattering Transforms
Georgios Valogiannis, Cora Dvorkin
Physical Review D, 2022, Volume 106, Article 103509 [ arXiv:2204.13717 ]

Abstract

We perform the first application of the wavelet scattering transform (WST) on actual galaxy observations, through a WST analysis of the BOSS DR12 CMASS dataset. We lay out the detailed procedure on how to capture all necessary layers of realism for an application on data obtained from a spectroscopic survey, including the effects of redshift-space anisotropy, non-trivial survey geometry, the shortcomings of the dataset through a set of systematic weights and the Alcock-Paczynski distortion effect. In order to capture the cosmological dependence of the WST, we use galaxy mocks obtained from the state-of-the-art ABACUSSUMMIT simulations, tuned to match the anisotropic correlation function of the BOSS CMASS sample in the redshift range 0.46<z<0.60. Using our theory model for the WST coefficients, as well as for the first 2 multipoles of the galaxy power spectrum, that we use as reference, we perform a likelihood analysis of the CMASS data and obtain the posterior probability distributions of 4 cosmological parameters, {ω_b,ω_c,n_s,σ₈}, as well as the Hubble constant, derived from a fixed value of the angular size of the sound horizon at last scattering measured by the Planck satellite, all of which are marginalized over the 7 nuisance parameters of the Halo Occupation Distribution model. The WST is found to deliver a substantial improvement in the values of the predicted 1σ errors compared to the regular power spectrum, which are tighter by a factor in the range 3−6 in the case of flat and uninformative priors and by a factor of 4−28, when a Big Bang Nucleosynthesis prior is applied on the value of ω_b. Furthermore, in the latter case, we obtain a 0.6% measurement of the Hubble constant. Our results are investigative and subject to certain approximations in our analysis, that we discuss in the text.

Creating Simple, Interpretable Anomaly Detectors for New Physics in Jet Substructure
Layne Bradshaw, Spencer Chang, Bryan Ostdiek
Physical Review D, 2022, Volume 106, Article 035014 [ arXiv:2203.01343 ]

Abstract

Anomaly detection with convolutional autoencoders is a popular method to search for new physics in a model-agnostic manner. These techniques are powerful, but they are still a "black box," since we do not know what high-level physical observables determine how anomalous an event is. To address this, we adapt a recently proposed technique by Faucett this http URL, which maps out the physical observables learned by a neural network classifier, to the case of anomaly detection. We propose two different strategies that use a small number of high-level observables to mimic the decisions made by the autoencoder on background events. Despite the underlying differences in their approach, we find that both strategies have similar ordering performance as the autoencoder and independently use the same five high-level observables. From there, we compare the performance of these networks as anomaly detectors. We find that both strategies perform similarly to the autoencoder across a variety of signals, giving a nontrivial demonstration that learning to order background events transfers to ordering a variety of signal events.

Flow-based sampling in the lattice Schwinger model at criticality
Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
Physical Review D, 2022, Volume 106, Article 014514 [ arXiv:2202.11712 ]

Abstract

Recent results suggest that flow-based algorithms may provide efficient sampling of field distributions for lattice field theory applications, such as studies of quantum chromodynamics and the Schwinger model. In this work, we provide a numerical demonstration of robust flow-based sampling in the Schwinger model at the critical value of the fermion mass. In contrast, at the same parameters, conventional methods fail to sample all parts of configuration space, leading to severely underestimated uncertainties.

Parton physics from a heavy-quark operator product expansion: Lattice QCD calculation of the second moment of the pion distribution amplitude
William Detmold, Anthony Grebe, Issaku Kanamori, C.-J. David Lin, Santanu Mondal, Robert Perry, Yong Zhao
Physical Review D, Volume 105, Article 034506 [ arXiv:2109.15241 ]

Abstract

The pion light-cone distribution amplitude (LCDA) is a central non-perturbative object of interest for high-energy exclusive processes in quantum chromodynamics. In this article, the second Mellin moment of the pion LCDA is determined as a proof-of-concept calculation for the first numerical implementation of the heavy-quark operator product expansion (HOPE) method. The resulting value for the second Mellin moment, determined in quenched QCD at a pion mass of mπ=550 MeV at a factorization scale of 2 GeV is ⟨ξ2⟩=0.210±0.013 (stat.)±0.034 (sys.). This result is compatible with those from previous determinations of this quantity.

Strictification and gluing of Lagrangian distributions on derived schemes with shifted symplectic forms
Dennis Borisov, Ludmil Katzarkov, Artan Sheshmani, Shing-Tung Yau
Science Direct Journals, 2024, Volume 438 [ arXiv:1908.00651 ]

Abstract

A strictification result is proved for isotropic distributions on derived schemes equipped with negatively shifted homotopically closed 2-forms. It is shown that any derived scheme over ℂ equipped with a −2-shifted symplectic structure, and having a Hausdorff space of classical points, admits a globally defined Lagrangian distribution as a dg ℂ∞-manifold.

Towards Quantum Simulations in Particle Physics and Beyond on Noisy Intermediate-Scale Quantum Devices
Lena Funcke, Tobias Hartung, Karl Jansen, Stefan Kühn, Manuel Schneider, Paolo Stornati, Xiaoyang Wang
Philosophical Transactions of the Royal Society A [ arXiv:2110.03809 ]

Abstract

We review two algorithmic advances that bring us closer to reliable quantum simulations of model systems in high energy physics and beyond on noisy intermediate-scale quantum (NISQ) devices. The first method is the dimensional expressivity analysis of quantum circuits, which allows for constructing minimal but maximally expressive quantum circuits. The second method is an efficient mitigation of readout errors on quantum devices. Both methods can lead to significant improvements in quantum simulations, e.g., when variational quantum eigensolvers are used.

SymmetryGAN: Symmetry Discovery with Deep Learning
Krish Desai, Benjamin Nachman, Jesse Thaler
Physical. Rev. D, 2022, 105:096031 [ arXiv:2112.05722 ]

Abstract

What are the symmetries of a dataset? Whereas the symmetries of an individual data element can be characterized by its invariance under various transformations, the symmetries of an ensemble of data elements are ambiguous due to Jacobian factors introduced while changing coordinates. In this paper, we provide a rigorous statistical definition of the symmetries of a dataset, which involves inertial reference densities, in analogy to inertial frames in classical mechanics. We then propose SymmetryGAN as a novel and powerful approach to automatically discover symmetries using a deep learning method based on generative adversarial networks (GANs). When applied to Gaussian examples, SymmetryGAN shows excellent empirical performance, in agreement with expectations from the analytic loss landscape. SymmetryGAN is then applied to simulated dijet events from the Large Hadron Collider (LHC) to demonstrate the potential utility of this method in high energy collider physics applications. Going beyond symmetry discovery, we consider procedures to infer the underlying symmetry group from empirical data.

PQ Axiverse
Mehmet Demirtas, Naomi Gendler, Cody Long, Liam McAllister, Jakob Moritz
Journal of High Energy Physics 2023, Volume 2023, Article number 92 [ arXiv:2112.04503 ]

Abstract

We show that the strong CP problem is solved in a large class of compactifications of string theory. The Peccei-Quinn mechanism solves the strong CP problem if the CP-breaking effects of the ultraviolet completion of gravity and of QCD are small compared to the CP-preserving axion potential generated by low-energy QCD instantons. We characterize both classes of effects. To understand quantum gravitational effects, we consider an ensemble of flux compactifications of type IIB string theory on orientifolds of Calabi-Yau hypersurfaces in the geometric regime, taking a simple model of QCD on D7-branes. We show that the D-brane instanton contribution to the neutron electric dipole moment falls exponentially in N4, with N the number of axions. In particular, this contribution is negligible in all models in our ensemble with N>17. We interpret this result as a consequence of large N effects in the geometry that create hierarchies in instanton actions and also suppress the ultraviolet cutoff. We also compute the CP breaking due to high-energy instantons in QCD. In the absence of vectorlike pairs, we find contributions to the neutron electric dipole moment that are not excluded, but that could be accessible to future experiments if the scale of supersymmetry breaking is sufficiently low. The existence of vectorlike pairs can lead to a larger dipole moment. Finally, we show that a significant fraction of models are allowed by standard cosmological and astrophysical constraints.

Machine Learning in Nuclear Physics
Amber Boehnlein, Markus Diefenthaler, Nobuo Sato, Malachi Schram, Veronique Ziegler, Cristiano Fanelli, Morten Hjorth-Jensen, Tanja Horn, Michelle P. Kuchera, Dean Lee, Witold Nazarewicz, Peter Ostroumov, Kostas Orginos, Alan Poon, Xin-Nian Wang, Alexander Scheinker, Michael S. Smith, and Long-Gang Pang
Reviews of Modern Physics, 2022, Volume 94, Article 031003 [ arXiv:2112.02309 ]

Abstract

Advances in machine learning methods provide tools that have broad applicability in scientific research. These techniques are being applied across the diversity of nuclear physics research topics, leading to advances that will facilitate scientific discoveries and societal applications. This Colloquium provides a snapshot of nuclear physics research, which has been transformed by machine learning techniques.

Infinite Neural Network Quantum States
Di Luo, James Halverson
Machine Learning: Science and Technology, 2023, Volume 4, Number 2 [ arXiv:2112.00723 ]

Abstract

We study infinite limits of neural network quantum states (∞-NNQS), which exhibit representation power through ensemble statistics, and also tractable gradient descent dynamics. Ensemble averages of Renyi entropies are expressed in terms of neural network correlators, and architectures that exhibit volume-law entanglement are presented. A general framework is developed for studying the gradient descent dynamics of neural network quantum states (NNQS), using a quantum state neural tangent kernel (QS-NTK). For ∞-NNQS the training dynamics is simplified, since the QS-NTK becomes deterministic and constant. An analytic solution is derived for quantum state supervised learning, which allows an ∞-NNQS to recover any target wavefunction. Numerical experiments on finite and infinite NNQS in the transverse field Ising model and Fermi Hubbard model demonstrate excellent agreement with theory. ∞-NNQS opens up new opportunities for studying entanglement and training dynamics in other physics applications, such as in finding ground states.

Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science
Charlotte Loh, Thomas Christensen, Rumen Dangovski, Samuel Kim, Marin Soljačić
Nature Communications, 2022, Volume 13, Article 4223 [ arXiv:2110.08406 ]

Abstract

Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labelled data needed to train the model; this poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Here, we introduce surrogate- and invariance-boosted contrastive learning (SIB-CL), a deep learning framework which incorporates three ``inexpensive'' and easily obtainable auxiliary information sources to overcome data scarcity. Specifically, these are: 1)~abundant unlabeled data, 2)~prior knowledge of symmetries or invariances and 3)~surrogate data obtained at near-zero cost. We demonstrate SIB-CL's effectiveness and generality on various scientific problems, e.g., predicting the density-of-states of 2D photonic crystals and solving the 3D time-independent Schrodinger equation. SIB-CL consistently results in orders of magnitude reduction in the number of labels needed to achieve the same network accuracies.

Pruning a restricted Boltzmann machine for quantum state reconstruction
Anna Golubeva, Roger G. Melko
Physical Review B, 2022, Volume 105, Article 125124 [ arXiv:2110.03676 ]

Abstract

Restricted Boltzmann machines (RBMs) have proven to be a powerful tool for learning quantum wavefunction representations from qubit projective measurement data. Since the number of classical parameters needed to encode a quantum wavefunction scales rapidly with the number of qubits, the ability to learn efficient representations is of critical importance. In this paper we study magnitude-based pruning as a way to compress the wavefunction representation in an RBM, focusing on RBMs trained on data from the transverse-field Ising model in one dimension. We find that pruning can reduce the total number of RBM weights, but the threshold at which the reconstruction accuracy starts to degrade varies significantly depending on the phase of the model. In a gapped region of the phase diagram, the RBM admits pruning over half of the weights while still accurately reproducing relevant physical observables. At the quantum critical point however, even a small amount of pruning can lead to significant loss of accuracy in the physical properties of the reconstructed quantum state. Our results highlight the importance of tracking all relevant observables as their sensitivity varies strongly with pruning. Finally, we find that sparse RBMs are trainable and discuss how a successful sparsity pattern can be created without pruning.

Classical Shadows for Quantum Process Tomography on Near-term Quantum Computers
Ryan Levy, Di Luo, Bryan K. Clark
Physical Review Research, 2024, Volume 6, Issue 1 [ arXiv:2110.02965 ]

Abstract

Quantum process tomography is a powerful tool for understanding quantum channels and characterizing properties of quantum devices. Inspired by recent advances using classical shadows in quantum state tomography [H.-Y. Huang, R. Kueng, and J. Preskill, Nat. Phys. 16, 1050 (2020).], we have developed ShadowQPT, a classical shadow method for quantum process tomography. We introduce two related formulations with and without ancilla qubits. ShadowQPT stochastically reconstructs the Choi matrix of the device allowing for an a-posteri classical evaluation of the device on arbitrary inputs with respect to arbitrary outputs. Using shadows we then show how to compute overlaps, generate all k-weight reduced processes, and perform reconstruction via Hamiltonian learning. These latter two tasks are efficient for large systems as the number of quantum measurements needed scales only logarithmically with the number of qubits. A number of additional approximations and improvements are developed including the use of a pair-factorized Clifford shadow and a series of post-processing techniques which significantly enhance the accuracy for recovering the quantum channel. We have implemented ShadowQPT using both Pauli and Clifford measurements on the IonQ trapped ion quantum computer for quantum processes up to n=4 qubits and achieved good performance.

Deep Set Auto Encoders for Anomaly Detection in Particle Physics
Bryan Ostdiek
SciPost Physics, 2022, Vol. 12, Issue 1 [ arXiv:2109.01695 ]

Abstract

There is an increased interest in model agnostic search strategies for physics beyond the standard model at the Large Hadron Collider. We introduce a Deep Set Variational Autoencoder and present results on the Dark Machines Anomaly Score Challenge. We find that the method attains the best anomaly detection ability when there is no decoding step for the network, and the anomaly score is based solely on the representation within the encoded latent space. This method was one of the top-performing models in the Dark Machines Challenge, both for the open data sets as well as the blinded data sets.

A variational study of two-nucleon systems with lattice QCD
Saman Amarasinghe, Riyadh Baghdadi, Zohreh Davoudi, William Detmold, Marc Illa, Assumpta Parreno, Andrew V. Pochinsky, Phiala E. Shanahan, Michael L. Wagman
Physical Review D, Volume 107, Issue 9 [ arXiv:2103.02602 ]

Abstract

The low-energy spectrum and scattering of two-nucleon systems are studied with lattice quantum chromodynamics using a variational approach. A wide range of interpolating operators are used: dibaryon operators built from products of plane-wave nucleons, hexaquark operators built from six localized quarks, and quasi-local operators inspired by two-nucleon bound-state wavefunctions in low-energy effective theories. Sparsening techniques are used to compute the timeslice-to-all quark propagators required to form correlation-function matrices using products of these operators. Projection of these matrices onto irreducible representations of the cubic group, including spin-orbit coupling, is detailed. Variational methods are applied to constrain the low-energy spectra of two-nucleon systems in a single finite volume with quark masses corresponding to a pion mass of 806 MeV. Results for S- and D-wave phase shifts in the isospin singlet and triplet channels are obtained under the assumption that partial-wave mixing is negligible. Tests of interpolating-operator dependence are used to investigate the reliability of the energy spectra obtained and highlight both the strengths and weaknesses of variational methods. These studies and comparisons to previous studies using the same gauge-field ensemble demonstrate that interpolating-operator dependence can lead to significant effects on the two-nucleon energy spectra obtained using both variational and non-variational methods, including missing energy levels and other discrepancies. While this study is inconclusive regarding the presence of two-nucleon bound states at this quark mass, it provides robust upper bounds on two-nucleon energy levels that can be improved in future calculations using additional interpolating operators and is therefore a step toward reliable nuclear spectroscopy from the underlying Standard Model of particle physics.

Real-time lattice gauge theory actions: unitarity, convergence, and path integral contour deformations
Gurtej Kanwar, Michael L. Wagman
Physical Review D, Volume 104, Article 014513 [ arXiv:2103.02602 ]

Abstract

The Wilson action for Euclidean lattice gauge theory defines a positive-definite transfer matrix that corresponds to a unitary lattice gauge theory time-evolution operator if analytically continued to real time. Hoshina, Fujii, and Kikukawa (HFK) recently pointed out that applying the Wilson action discretization to continuum real-time gauge theory does not lead to this, or any other, unitary theory and proposed an alternate real-time lattice gauge theory action that does result in a unitary real-time transfer matrix. The character expansion defining the HFK action is divergent, and in this work we apply a path integral contour deformation to obtain a convergent representation for U(1) HFK path integrals suitable for numerical Monte Carlo calculations. We also introduce a class of real-time lattice gauge theory actions based on analytic continuation of the Euclidean heat-kernel action. Similar divergent sums are involved in defining these actions, but for one action in this class this divergence takes a particularly simple form, allowing construction of a path integral contour deformation that provides absolutely convergent representations for U(1) and SU(N) real-time lattice gauge theory path integrals. We perform proof-of-principle Monte Carlo calculations of real-time U(1) and SU(3) lattice gauge theory and verify that exact results for unitary time evolution of static quark-antiquark pairs in (1 + 1)D are reproduced.

Towards an Optimal Estimation of Cosmological Parameters with the Wavelet Scattering Transform
Georgios Valogiannis, Cora Dvorkin
Physical Review D, 2022, 105, 103534 [ arXiv:2108.07821 ]

Abstract

Optimal extraction of the non-Gaussian information encoded in the Large-Scale Structure (LSS) of the universe lies at the forefront of modern precision cosmology. We propose achieving this task through the use of the Wavelet Scattering Transform (WST), which subjects an input field to a layer of non-linear transformations that are sensitive to non-Gaussianity in spatial density distributions through a generated set of WST coefficients. In order to assess its applicability in the context of LSS surveys, we apply the WST on the 3D overdensity field obtained by the Quijote simulations, out of which we extract the Fisher information in 6 cosmological parameters. It is subsequently found to deliver a large improvement in the marginalized errors on all parameters, ranging between 1.2−4× tighter than the corresponding ones obtained from the regular 3D cold dark matter + baryon power spectrum, as well as a 50% improvement over the neutrino mass constraint given by the marked power spectrum. Through this first application on 3D cosmological fields, we demonstrate the great promise held by this novel statistic and set the stage for its future application to actual galaxy observations.

Deep multi-task mining Calabi-Yau four-folds
Harold Erbin, Riccardo Finotello, Robin Schneider, Mohamed Tamaazousti
Machine Learning: Science and Technology, 2021, Volume 3, Number 1 [ arXiv:2108.02221 ]

Abstract

We continue earlier efforts in computing the dimensions of tangent space cohomologies of Calabi-Yau manifolds using deep learning. In this paper, we consider the dataset of all Calabi-Yau four-folds constructed as complete intersections in products of projective spaces. Employing neural networks inspired by state-of-the-art computer vision architectures, we improve earlier benchmarks and demonstrate that all four non-trivial Hodge numbers can be learned at the same time using a multi-task architecture. With 30% (80%) training ratio, we reach an accuracy of 100% for h^(1,1) and 97% for h^(2,1) (100% for both), 81% (96%) for h^(3,1), and 49% (83%) for h^(2,2). Assuming that the Euler number is known, as it is easy to compute, and taking into account the linear constraint arising from index computations, we get 100% total accuracy.

Nonperturbative renormalization for the neural network–QFT correspondence
Harold Erbin, Vincent Lahoche, Dine Ousmane Samary
Machine Learning Science and Technology, 2022, Volume 3, Number 1, Article 015027 [ arXiv:2108.01403 ]

Abstract

In a recent work~[1], Halverson, Maiti and Stoner proposed a description of neural networks in terms of a Wilsonian effective field theory. The infinite-width limit is mapped to a free field theory, while finite N corrections are taken into account by interactions (non-Gaussian terms in the action). In this paper, we study two related aspects of this correspondence. First, we comment on the concepts of locality and power-counting in this context. Indeed, these usual space-time notions may not hold for neural networks (since inputs can be arbitrary), however, the renormalization group provides natural notions of locality and scaling. Moreover, we comment on several subtleties, for example, that data components may not have a permutation symmetry: in that case, we argue that random tensor field theories could provide a natural generalization. Second, we improve the perturbative Wilsonian renormalization from~[1] by providing an analysis in terms of the nonperturbative renormalization group using the Wetterich-Morris equation. An important difference with usual nonperturbative RG analysis is that only the effective (IR) 2-point function is known, which requires setting the problem with care. Our aim is to provide a useful formalism to investigate neural networks behavior beyond the large-width limit (i.e.~far from Gaussian limit) in a nonperturbative fashion. A major result of our analysis is that changing the standard deviation of the neural network weight distribution can be interpreted as a renormalization flow in the space of networks. We focus on translations invariant kernels and provide preliminary numerical results.

Neural Conditional Reweighting
Benjamin Nachman, Jesse Thaler
Physical Review D, Volume 105, Article 076015 [ arXiv:2107.08979 ]

Abstract

There is a growing use of neural network classifiers as unbinned, high-dimensional (and variable-dimensional) reweighting functions. To date, the focus has been on marginal reweighting, where a subset of features are used for reweighting while all other features are integrated over. There are some situations, though, where it is preferable to condition on auxiliary features instead of marginalizing over them. In this paper, we introduce neural conditional reweighting, which extends neural marginal reweighting to the conditional case. This approach is particularly relevant in high-energy physics experiments for reweighting detector effects conditioned on particle-level truth information. We leverage a custom loss function that not only allows us to achieve neural conditional reweighting through a single training procedure, but also yields sensible interpolation even in the presence of phase space holes. As a specific example, we apply neural conditional reweighting to the energy response of high-energy jets, which could be used to improve the modeling of physics objects in parametrized fast simulation packages.

Single electrons on solid neon as a solid-state quit platform
Xianjing Zhou, Gerwin Koolstra, Xufeng Zhang, Ge Yang, Xu Han, Brennan Dizdar, Divan Ralu, Wei Guo, Kater W. Murch, David I. Shuster, Dafei Jin
Nature, 2022, 605, 46-50 [ arXiv:2106.10326 ]

Abstract

Progress toward the realization of quantum computers requires persistent advances in their constituent building blocks - qubits. Novel qubit platforms that simultaneously embody long coherence, fast operation, and large scalability offer compelling advantages in the construction of quantum computers and many other quantum information systems. Electrons, ubiquitous elementary particles of nonzero charge, spin, and mass, have commonly been perceived as paradigmatic local quantum information carriers. Despite superior controllability and configurability, their practical performance as qubits via either motional or spin states depends critically on their material environment. Here we report our experimental realization of a new qubit platform based upon isolated single electrons trapped on an ultraclean solid neon surface in vacuum. By integrating an electron trap in a circuit quantum electrodynamics architecture, we achieve strong coupling between the motional states of a single electron and a single microwave photon in an on-chip superconducting resonator. Qubit gate operations and dispersive readout are implemented to measure the energy relaxation time T1 of 15 μs and phase coherence time T2 over 200 ns. These results indicate that the electron-on-solid-neon qubit already performs near the state of the art as a charge qubit.

Flow-based sampling for fermionic lattice field theories
Michael S. Albergo, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Julian M. Urban, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Phiala E. Shanahan
Physical Review D, 2021, Vol. 104, Iss. 11 – 1 [ arXiv:2106.05934 ]

Abstract

Algorithms based on normalizing flows are emerging as promising machine learning approaches to sampling complicated probability distributions in a way that can be made asymptotically exact. In the context of lattice field theory, proof-of-principle studies have demonstrated the effectiveness of this approach for scalar theories, gauge theories, and statistical systems. This work develops approaches that enable flow-based sampling of theories with dynamical fermions, which is necessary for the technique to be applied to lattice field theory studies of the Standard Model of particle physics and many condensed matter systems. As a practical demonstration, these methods are applied to the sampling of field configurations for a two-dimensional theory of massless staggered fermions coupled to a scalar field via a Yukawa interaction.

Preserving New Physics while Simultaneously Unfolding All Observables
Patrick Komiske, W. Patrick McCormack, Benjamin Nachman
Physical Review D, Volume 104, Article 076027 [ arXiv:2105.09923 ]

Abstract

Direct searches for new particles at colliders have traditionally been factorized into model proposals by theorists and model testing by experimentalists. With the recent advent of machine learning methods that allow for the simultaneous unfolding of all observables in a given phase space region, there is a new opportunity to blur these traditional boundaries by performing searches on unfolded data. This could facilitate a research program where data are explored in their natural high dimensionality with as little model bias as possible. We study how the information about physics beyond the Standard Model is preserved by full phase space unfolding using an important physics target at the Large Hadron Collider (LHC): exotic Higgs boson decays involving hadronic final states. We find that if the signal cross section is high enough, information about the new physics is visible in the unfolded data. We will show that in some cases, quantifiably all of the information about the new physics is encoded in the unfolded data. Finally, we show that there are still many cases when the unfolding does not work fully or precisely, such as when the signal cross section is small. This study will serve as an important benchmark for enhancing unfolding methods for the LHC and beyond.

A Compound Poisson Generator approach to Point-Source Inference in Astrophysics
Gabriel H. Collin, Nicholas L. Rodd, Tyler Erjavec, Kerstin Perez
The Astrophysical Journal, 2022, Volume 260, Number 2 [ arXiv:2104.04529 | code ]

Abstract

The identification and description of point sources is one of the oldest problems in astronomy; yet, even today the correct statistical treatment for point sources remains as one of the field's hardest problems. For dim or crowded sources, likelihood based inference methods are required to estimate the uncertainty on the characteristics of the source population. In this work, a new parametric likelihood is constructed for this problem using Compound Poisson Generator (CPG) functionals which incorporate instrumental effects from first principles. We demonstrate that the CPG approach exhibits a number advantages over Non-Poissonian Template Fitting (NPTF) - an existing parametric likelihood method - in a series of test scenarios in the context of X-ray astronomy. These demonstrations show that the effect of the point-spread function, effective area, and choice of point-source spatial distribution cannot, in general, be factorised as they are in the NPTF construction, while the new CPG construction is validated in these scenarios. Separately, an examination of the diffuse-flux emission limit is used to show that most simple choices of priors on the standard parameterisation of the population model can result in unexpected biases: when a model comprising both a point-source population and diffuse component is applied to this limit, nearly all observed flux will be assigned to either the population or to the diffuse component. A new parametrisation is presented for these priors which is demonstrated to properly estimate the uncertainties in this limit. In this choice of priors, the CPG correctly identifies that the fraction of flux assigned to the population model cannot be constrained by the data.

Modern Machine Learning and Particle Physics
Matthew D. Schwartz
Harvard Data Science Review, 2021, Issue 3.2, 13 May [ arXiv:2103.12226 ]

Abstract

Over the past five years, modern machine learning has been quietly revolutionizing particle physics. Old methodology is being outdated and entirely new ways of thinking about data are becoming commonplace. This article will review some aspects of the natural synergy between modern machine learning and particle physics, focusing on applications at the Large Hadron Collider. A sampling of examples is given, from signal/background discrimination tasks using supervised learning to direct data-driven approaches. Some comments on persistent challenges and possible future directions for the field are included at the end.

Topological obstructions to autoencoding
Joshua Batson, C. Grace Haaf, Yonatan Kahn, Daniel A. Roberts
Journal of High Energy Physics, 2021, Issue 4, Article 280 [ arXiv:2102.08380 ]

Abstract

Autoencoders have been proposed as a powerful tool for model-independent anomaly detection in high-energy physics. The operating principle is that events which do not belong to the space of training data will be reconstructed poorly, thus flagging them as anomalies. We point out that in a variety of examples of interest, the connection between large reconstruction error and anomalies is not so clear. In particular, for data sets with nontrivial topology, there will always be points that erroneously seem anomalous due to global issues. Conversely, neural networks typically have an inductive bias or prior to locally interpolate such that undersampled or rare events may be reconstructed with small error, despite actually being the desired anomalies. Taken together, these facts are in tension with the simple picture of the autoencoder as an anomaly detector. Using a series of illustrative low-dimensional examples, we show explicitly how the intrinsic and extrinsic topology of the dataset affects the behavior of an autoencoder and how this topology is manifested in the latent space representation during training. We ground this analysis in the discussion of a mock "bump hunt" in which the autoencoder fails to identify an anomalous "signal" for reasons tied to the intrinsic topology of n-particle phase space.

Few-nucleon matrix elements in pionless effective field theory in a finite volume
W. Detmold and P. E. Shanahan
Physical Review D, Volume 103, Article 074503 [ arXiv:2102.04329 ]

Abstract

Pionless effective field theory in a finite volume (FVEFTπ/) is investigated as a framework for the analysis of multi-nucleon spectra and matrix elements calculated in lattice QCD (LQCD). By combining FVEFTπ/ with the stochastic variational method, the spectra of nuclei with atomic number A∈{2,3} are matched to existing finite-volume LQCD calculations at heavier-than-physical quark masses corresponding to a pion mass mπ=806 MeV, thereby enabling infinite-volume binding energies to be determined using infinite-volume variational calculations. Based on the variational wavefunctions that are constructed in this approach, the finite-volume matrix elements of various local operators are computed in FVEFTπ/ and matched to LQCD calculations of the corresponding QCD operators in the same volume, thereby determining the relevant one and two-body EFT counterterms and enabling an extrapolation of the LQCD matrix elements to infinite volume. As examples, the scalar, tensor, and axial matrix elements are considered, as well as the magnetic moments and the isovector longitudinal momentum fraction.

Path integral contour deformations for observables in SU(N) gauge theory
William Detmold, Gurtej Kanwar, Henry Lamm, Michael L. Wagman, Neill C. Warrington
Physical Review D, 2021, Vol. 103, Issue 9, Article 094517 [ arXiv:2101.12668 ]

Abstract

Path integral contour deformations have been shown to mitigate sign and signal-to-noise problems associated with phase fluctuations in lattice field theories. We define a family of contour deformations applicable to SU(N) lattice gauge theory that can reduce sign and signal-to-noise problems associated with complex actions and complex observables. For observables, these contours can be used to define deformed observables with identical expectation value but different variance. As a proof-of-principle, we apply machine learning techniques to optimize the deformed observables associated with Wilson loops in two dimensional SU(2) and SU(3) gauge theory. We study loops consisting of up to 64 plaquettes and achieve variance reduction of up to 4 orders of magnitude.

Learning to Unknot
Sergei Gukov, James Halverson, Fabian Ruehle, and Piotr Sułkowski
Machine Learning - Science and Technology, 2021, Volume 2, Number 2, Article 025035 [ arXiv:2010.16263 ]

Abstract

We introduce natural language processing into the study of knot theory, as made natural by the braid word representation of knots. We study the UNKNOT problem of determining whether or not a given knot is the unknot. After describing an algorithm to randomly generate $N$-crossing braids and their knot closures and discussing the induced prior on the distribution of knots, we apply binary classification to the UNKNOT decision problem. We find that the Reformer and shared-QK Transformer network architectures outperform fully-connected networks, though all perform well. Perhaps surprisingly, we find that accuracy increases with the length of the braid word, and that the networks learn a direct correlation between the confidence of their predictions and the degree of the Jones polynomial. Finally, we utilize reinforcement learning (RL) to find sequences of Markov moves and braid relations that simplify knots and can identify unknots by explicitly giving the sequence of unknotting actions. Trust region policy optimization (TRPO) performs consistently well for a wide range of crossing numbers and thoroughly outperformed other RL algorithms and random walkers. Studying these actions, we find that braid relations are more useful in simplifying to the unknot than one of the Markov moves.

Elliptic stable envelopes and hypertoric loop spaces
Michael McBreen, Artan Sheshmani, Shing-Tung Yau
Selecta Mathematica, 2023, Volume 29, Article number 73 [ arXiv:2010.0067 ]

Abstract

This paper relates the elliptic stable envelopes of a hypertoric variety X with the K-theoretic stable envelopes of the loop hypertoric space, ℒ˜X. It thus points to a possible categorification of elliptic stable envelopes.