New detectors on the block

Article title: “Toward Machine Learning Optimization of Experimental Design”

Authors: MODE Collaboration

Reference: https://inspirehep.net/literature/1850892 (pdf)

In a previous post we wondered if (machine learning) algorithms can replace the entire simulation of detectors and reconstruction of particles. But meanwhile some experimentalists have gone one step further – and wondered if algorithms can design detectors.

Indeed, the MODE collaboration stands for Machine-learning Optimized Design of Experiments and in its first paper promises nothing less than that.

The idea here is that the choice of characteristics that an experiment can have is vast (think number of units, materials, geometry, dimensions and so on), but its ultimate goal can still be described by a single “utility function”. For instance, the precision of the measurement on specific data can be thought of as a utility function.

Then, the whole process that leads to obtaining that function can be decomposed into a number of conceptual blocks: normally there are incoming particles, which move through and interact with detectors, resulting in measurements; from them, the characteristics of the particles are reconstructed; these are eventually analyzed to get relevant useful quantities, the utility function among them. Ultimately, chaining together these blocks creates a pipeline that models the experiment from one end to the other.

Now, another central notion is differentiation or, rather, the ability to be differentiated; if all the components of this model are differentiable, then the gradient of the utility function can be calculated. This leads to the holy grail: finding its extreme values, i.e. optimize the experiment’s design as a function of its numerous components.

Before we see whether the components are indeed differentiable and how the gradient gets calculated, here is an example of this pipeline concept for a muon radiography detector.

Discovering a hidden space in the Great Pyramid by using muons. ( Financial Times)

Muons are not just the trendy star of particle physics (as of April 2021), but they also find application in scanning closed volumes and revealing details about the objects in them. And yes, the Great Pyramid has been muographed successfully.

In terms of the pipeline described above, a muon radiography device could be modeled in the following way: Muons from cosmic rays are generated in the form of 4-vectors. Those are fed to a fast-simulation of the scanned volume and the detector. The interactions of the particles with the materials and the resulting signals on the electronics are simulated. This output goes into a reconstruction module, which recreates muon tracks. From them, an information-extraction module calculates the density of the scanned material. It can also produce a loss function for the measurement, which here would be the target quantity.

Conceptual layout of the optimization pipeline. (MODE collaboration)

This whole ritual is a standard process in experimental work, although the steps are usually quite separate from one another. In the MODE concept, however, not only are they linked together but also run iteratively. The optimization of the detector design proceeds in steps and in each of them the parameters of the device are changed in the simulation. This affects directly the detector module and indirectly the downstream modules of the pipeline. The loop of modification and validation can be constrained appropriately to keep everything within realistic values, and also to make the most important consideration of all enter the game – that is of course cost and the constraints that it brings along.

Descending towards the minimum. (Dezhi Yu)

As mentioned above, the proposed optimization proceeds in steps by optimizing the parameters along the gradient of the utility function. The most famous incarnation of gradient-based optimization is gradient descent which is customarily used in neural networks. Gradient descent guides the network towards the minimum value of the error that it produces, through the possible “paths” of its parameters.

In the MODE proposal the optimization is achieved through automatic differentiation (AD), the latest word in the calculation of derivatives in computer programs. To shamefully paraphrase Wikipedia, AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations and functions. By applying the chain rule repeatedly to these operations, derivatives can be computed automatically, accurately and efficiently.

Also, something was mentioned above about whether the components of the pipeline are “indeed differentiable”. It turns out that one isn’t. This is the simulation of the processes during the passage of particles through the detector, which is stochastic by nature. However, machine learning can learn how to mimic it, take its place, and provide perfectly fine and differentiable modules. (The brave of heart can follow the link at the end to find out about local generative surrogates.)

This method of designing detectors might sound like a thought experiment on steroids. But the point of MODE is that it’s the realistic way to take full advantage of the current developments in computation. And maybe to feel like we have really entered the third century of particle experiments.

Further reading:

The MODE website: https://mode-collaboration.github.io/

A Beginner’s Guide to Differentiable Programming: https://wiki.pathmind.com/differentiableprogramming

Black-Box Optimization with Local Generative Surrogates: https://arxiv.org/abs/2002.04632

A symphony of data

Article title: “MUSiC: a model unspecific search for new physics in
proton-proton collisions at \sqrt{s} = 13 TeV”

Authors: The CMS Collaboration

Reference: https://arxiv.org/abs/2010.02984

First of all, let us take care of the spoilers: no new particles or phenomena have been found… Having taken this concern away, let us focus on the important concept behind MUSiC.

ATLAS and CMS, the two largest experiments using collisions at the LHC, are known as “general purpose experiments” for a good reason. They were built to look at a wide variety of physical processes and, up to now, each has checked dozens of proposed theoretical extensions of the Standard Model, in addition to checking the Model itself. However, in almost all cases their searches rely on definite theory predictions and focus on very specific combinations of particles and their kinematic properties. In this way, the experiments may still be far from utilizing their full potential. But now an algorithm named MUSiC is here to help.

MUSiC takes all events recorded by CMS that comprise of clean-cut particles and compares them against the expectations from the Standard Model, untethering itself from narrow definitions for the search conditions.

We should clarify here that an “event” is the result of an individual proton-proton collision (among the many happening each time the proton bunches cross), consisting of a bouquet of particles. First of all, MUSiC needs to work with events with particles that are well-recognized by the experiment’s detectors, to cut down on uncertainty. It must also use particles that are well-modeled, because it will rely on the comparison of data to simulation and, so, wants to be sure about the accuracy of the latter.

Display of an event with two muons at CMS. (Source: CMS experiment)

All this boils down to working with events with combinations of specific, but several, particles: electrons, muons, photons, hadronic jets from light-flavour (=up, down, strange) quarks or gluons and from bottom quarks, and deficits in the total transverse momentum (typically the signature of the uncatchable neutrinos or perhaps of unknown exotic particles). And to make things even more clean-cut, it keeps only events that include either an electron or a muon, both being well-understood characters.

These particles’ combinations result in hundreds of different “final states” caught by the detectors. However, they all correspond to only a dozen combos of particles created in the collisions according to the Standard Model, before some of them decay to lighter ones. For them, we know and simulate pretty well what we expect the experiment to measure.

MUSiC proceeded by comparing three kinematic quantities of these final states, as measured by CMS during the year 2016, to their simulated values. The three quantities of interest are the combined mass, combined transverse momentum and combined missing transverse momentum. It’s in their distributions that new particles would most probably show up, regardless of which theoretical model they follow. The range of values covered is pretty wide. All in all, the method extends the kinematic reach of usual searches, as it also does with the collection of final states.

An example distribution from MUSiC: Transverse mass for the final state comprising of one muon and missing transverse momentum. Color histograms: Simulated Standard Model processes. Red line: Signal from a hypothetical W’ boson with mass of 3TeV. (Source: paper)

So the kinematic distributions are checked against the simulated expectations in an automatized way, with MUSiC looking for every physicist’s dream: deviations. Any deviation from the simulation, meaning either fewer or more recorded events, is quantified by getting a probability value. This probability is calculated by also taking into account the much dreaded “look elsewhere effect”. (Which comes from the fact that, statistically, in a large number of distributions a random fluctuation that will mimic a genuine deviation is bound to appear sooner or later.)

When all’s said and done the collection of probabilities is overviewed. The MUSiC protocol says that any significant deviation will be scrutinized with more traditional methods – only that this need never actually arose in the 2016 data: all the data played along with the Standard Model, in all 1,069 examined final states and their kinematic ranges.

For the record, the largest deviation was spotted in the final state comprising three electrons, two generic hadronic jets and one jet coming from a bottom quark. Seven events were counted whereas the simulation gave 2.7±1.8 events (mostly coming from the production of a top plus an anti-top quark plus an intermediate vector boson from the collision; the fractional values are due to extrapolating to the amount of collected data). This excess was not seen in other related final states, “related” in that they also either include the same particles or have one less. Everything pointed to a fluctuation and the case was closed.

However, the goal of MUSiC was not strictly to find something new, but rather to demonstrate a method for model un-specific searches with collisions data. The mission seems to be accomplished, with CMS becoming even more general-purpose.

Read more:

Another generic search method in ATLAS: Going Rogue: The Search for Anything (and Everything) with ATLAS

And a take with machine learning: Letting the Machines Seach for New Physics

Fancy checking a good old model-specific search? Uncovering a Higgs Hiding Behind Backgrounds

A shortcut to truth

Article title: “Automated detector simulation and reconstruction
parametrization using machine learning”

Authors: D. Benjamin, S.V. Chekanov, W. Hopkins, Y. Li, J.R. Love

Reference: https://arxiv.org/abs/2002.11516 (https://iopscience.iop.org/article/10.1088/1748-0221/15/05/P05025)

Demonstration of probability density function as the output of a neural network. (Source: paper)

The simulation of particle collisions at the LHC is a pharaonic task. The messy chromodynamics of protons must be modeled; the statistics of the collision products must reflect the Standard Model; each particle has to travel through the detectors and interact with all the elements in its path. Its presence will eventually be reduced to electronic measurements, which, after all, is all we know about it.

The work of the simulation ends somewhere here, and that of the reconstruction starts; namely to go from electronic signals to particles. Reconstruction is a process common to simulation and to the real world. Starting from the tangle of statistical and detector effects that the actual measurements include, the goal is to divine the properties of the initial collision products.

Now, researchers at the Argonne National Laboratory looked into going from the simulated particles as produced in the collisions (aka “truth objects”) directly to the reconstructed ones (aka “reco objects”): bypassing the steps of the detailed interaction with the detectors and of the reconstruction algorithm could make the studies that use simulations much more speedy and efficient.

Display of a collision event involving hadronic jets at ATLAS. Each colored block corresponds to interaction with a detector element. (Source: ATLAS experiment)

The team used a neural network which it trained on simulations of the full set. The goal was to have the network learn to produce the properties of the reco objects when given only the truth objects. The process succeeded in producing the transverse momenta of hadronic jets, and looks suitable for any kind of particle and for other kinematic quantities.

More specifically, the researchers began with two million simulated jet events, fully passed through the ATLAS experiment and the reconstruction algorithm. For each of them, the network took the kinematic properties of the truth jet as input and was trained to achieve the reconstructed transverse momentum.

The network was taught to perform multi-categorization: its output didn’t consist of a single node giving the momentum value, but of 400 nodes, each corresponding to a different range of values. The output of each node was the probability for that particular range. In other words, the result was a probability density function for the reconstructed momentum of a given jet.

The final step was to select the momentum randomly from this distribution. For half a million of test jets, all this resulted in good agreement with the actual reconstructed momenta, specifically within 5% for values above 20 GeV. In addition, it seems that the training was sensitive to the effects of quantities other than the target one (e.g. the effects of the position in the detector), as the neural network was able to pick up on the dependencies between the input variables. Also, hadronic jets are complicated animals, so it is expected that the method will work on other objects just as well.

Comparison of the reconstructed transverse momentum between the full simulation and reconstruction (“Delphes”) and the neural net output. (Source: paper)

All in all, this work showed the perspective for neural networks to imitate successfully the effects of the detector and the reconstruction. Simulations in large experiments typically take up loads of time and resources due to their size, intricacy and frequent need for updates in the hardware conditions. Such a shortcut, needing only small numbers of fully processed events, would speed up studies such as optimization of the reconstruction and detector upgrades.

More reading:

Argonne Lab press release: https://www.anl.gov/article/learning-more-about-particle-collisions-with-machine-learning

Intro to neural networks: https://physicsworld.com/a/neural-networks-explained/

Crystals are dark matter’s best friends

Article title: “Development of ultra-pure NaI(Tl) detector for COSINE-200 experiment”

Authors: B.J. Park et el.

Reference: arxiv:2004.06287

The landscape of direct detection of dark matter is a perplexing one; all experiments have so far come up with deafening silence, except for a single one which promises a symphony. This is the DAMA/LIBRA experiment in Gran Sasso, Italy, which has been seeing an annual modulation in its signal for two decades now.

Such an annual modulation is as dark-matter-like as it gets. First proposed by Katherine Freese in 1987, it would be the result of earth’s motion inside the galactic halo of dark matter in the same direction as the sun for half of the year and in the opposite direction during the other half. However, DAMA/LIBRA’s results are in conflict with other experiments – but with the catch that none of those used the same setup. The way to settle this is obviously to build more experiments with the DAMA/LIBRA setup. This is an ongoing effort which ultimately focuses on the crystals at its heart.

Cylindrical crystals wrapped in reflector, bounded by photomultipliers (PMTs) and surrounded by scintillators. (COSINE-100)

The specific crystals are made of the scintillating material thallium-doped sodium iodide, NaI(Tl). Dark matter particles, and particularly WIMPs, would collide elastically with atomic nuclei and the recoil would give off photons, which would eventually be captured by photomultiplier tubes at the ends of each crystal.

Right now a number of NaI(Tl)-based experiments are at various stages of preparation around the world, with COSINE-100 at the Yangyang mountain, S.Korea, already producing negative results. However, these are still not on equal footing with DAMA/LIBRA’s because of higher backgrounds at COSINE-100. What is the collaboration to do, then? The answer is focus even more on the crystals and how they are prepared.

Setup of the COSINE-100 experiment. (COSINE-100)

Over the last couple of years some serious R&D went into growing better crystals for COSINE-200, the planned upgrade of COSINE-100. Yes, a crystal is something that can and does grow. A seed placed inside the raw material, in this case NaI(Tl) powder, leads it to organize itself around the seed’s structure over the next hours or days.

In COSINE-100 the most annoying backgrounds came from within the crystals themselves because of the production process, because of natural radioactivity, and because of cosmogenically induced isotopes. Let’s see how each of these was tackled during the experiment’s mission towards a radiopure upgrade.

Improved techniques of growing and preparing the crystals reduced contamination from the materials of the grower device and from the ambient environment. At the same time different raw materials were tried out to put the inherent contamination under control.

Among a handful of naturally present radioactive isotopes particular care was given to 40K. 40K can decay characteristically to an X-ray of 3.2keV and a γ-ray of 1,460keV, a combination convenient for tagging it to a large extent. The tagging is done with the help of 2,000 liters of liquid scintillator surrounding the crystals. However, if the γ-ray escapes the crystal then the left-behind X-ray will mimic the expected signal from WIMPs… Eventually the dangerous 40K was brought down to levels comparable to those in DAMA/LIBRA through the investigation of various techniques and first materials.

But the main source of radioactive background in COSINE-100 was isotopes such as 3H or 22Na created inside the crystals by cosmic ray muons, after their production. Now, their abundance was reduced significantly by two simple moves: the crystals were grown locally at a very low altitude and installed underground within a few weeks (instead of being transported from a lab at 1,400 meters above sea in Colorado). Moreover, most of the remaining cosmogenic background is to decay away within a couple of years.

Components of the background, and temporal evolution of the cosmogenic radioactivity. (Source)

Where are these efforts standing? The energy range of interest for testing the DAMA/LIBRA signal is 1-6keV. This corresponds to a background target of 1 count/kg/day/keV. After the crystals R&D, the achieved contamination was less than about 0.34 counts. In short, everything is ready for COSINE-100 to upgrade to COSINE-200 and test the annual modulation without the previous ambiguities that stood in the way.

Learn more:

More on DAMA/LIBRA in ParticleBites.

Cross-checking the modulation.

The COSINE-100 experiment.

First COSINE-100 results.

Muon to electron conversion

Presenting: Section 3.2 of “Charged Lepton Flavor Violation: An Experimenter’s Guide”
Authors: R. Bernstein, P. Cooper
Reference1307.5787 (Phys. Rept. 532 (2013) 27)

Not all searches for new physics involve colliding protons at the the highest human-made energies. An alternate approach is to look for deviations in ultra-rare events at low energies. These deviations may be the quantum footprints of new, much heavier particles. In this bite, we’ll focus on the decay of a muon to an electron in the presence of a heavy atom.

Muons decay
Muons conversion into an electron in the presence of an atom, aluminum.

The muon is a heavy version of the electron.There  are a few properties that make muons nice systems for precision measurements:

  1. They’re easy to produce. When you smash protons into a dense target, like tungsten, you get lots of light hadrons—among them, the charged pions. These charged pions decay into muons, which one can then collect by bending their trajectories with magnetic fields. (Puzzle: why don’t pions decay into electrons? Answer below.)
  2. They can replace electrons in atoms.  If you point this beam of muons into a target, then some of the muons will replace electrons in the target’s atoms. This is very nice because these “muonic atoms” are described by non-relativistic quantum mechanics with the electron mass replaced with ~100 MeV. (Muonic hydrogen was previous mentioned in this bite on the proton radius problem.)
  3. They decay, and the decay products always include an electron that can be detected.  In vacuum it will decay into an electron and two neutrinos through the weak force, analogous to beta decay.
  4. These decays are sensitive to virtual effects. You don’t need to directly create a new particle in order to see its effects. Potential new particles are constrained to be very heavy to explain their non-observation at the LHC. However, even these heavy particles can leave an  imprint on muon decay through ‘virtual effects’ according (roughly) to the Heisenberg uncertainty principle: you can quantum mechanically violate energy conservation, but only for very short times.
Reach of muon conversion experiments from 1303.4097. The y axis is the energy scale that can be probed, the x axis parameterizes how new physics is spread between different CLFV parameters.
Reach of muon conversion experiments from 1303.4097. The y axis is the energy scale that can be probed and the x axis parameterizes different ways that lepton flavor violation can appear in a theory.

One should be surprised that muon conversion is even possible. The process \mu \to e cannot occur in vacuum because it cannot simultaneously conserve energy and momentum. (Puzzle: why is this true? Answer below.) However, this process is allowed in the presence of a heavy nucleus that can absorb the additional momentum, as shown in the comic at the top of this post.

Muon  conversion experiments exploit this by forming muonic atoms in the 1state and waiting for the muon to convert into an electron which can then be detected. The upside is that all electrons from conversion have a fixed energy because they all come from the same initial state: 1s muonic aluminum at rest in the lab frame. This is in contrast with more common muon decay modes which involve two neutrinos and an electron; because this is a multibody final state, there is a smooth distribution of electron energies. This feature allows physicists to distinguish between the \mu \to e conversion versus the more frequent muon decay \mu \to e \nu_\mu \bar \nu_e in orbit or muon capture by the nucleus (similar to electron capture).

The Standard Model prediction for this rate is miniscule—it’s weighted by powers of the neutrino to the W boson mass ratio  (Puzzle: how does one see this? Answer below.). In fact, the current experimental bound on muon conversion comes from the Sindrum II experiment  looking at muonic gold which constrains the relative rate of muon conversion to muon capture by the gold nucleus to be less than 7 \times 10^{-13}. This, in turn, constrains models of new physics that predict some level of charged lepton flavor violation—that is, processes that change the flavor of a charged lepton, say going from muons to electrons.

The plot on the right shows the energy scales that are indirectly probed by upcoming muonic aluminum experiments: the Mu2e experiment at Fermilab and the COMET experiment at J-PARC. The blue lines show bounds from another rare muon decay: muons decaying into an electron and photon. The black solid lines show the reach for muon conversion in muonic aluminum. The dashed lines correspond to different experimental sensitivities (capture rates for conversion, branching ratios for decay with a photon). Note that the energy scales probed can reach 1-10 PeV—that’s 1000-10,000 TeV—much higher than the energy scales direclty probed by the LHC! In this way, flavor experiments and high energy experiments are complimentary searches for new physics.

These “next generation” muon conversion experiments are currently under construction and promise to push the intensity frontier in conjunction with the LHC’s energy frontier.

 

 

Solutions to exercises:

  1. Why do pions decay into muons and not electrons? [Note: this requires some background in undergraduate-level particle physics.] One might expect that if a charged pion can decay into a muon and a neutrino, then it should also go into an electron and a neutrino. In fact, the latter should dominate since there’s much more phase space. However, the matrix element requires a virtual W boson exchange and thus depends on an [axial] vector current. The only vector available from the pion system is its 4-momentum. By momentum conservation this is $p_\pi = p_\mu + p_\nu$. The lepton momenta then contract with Dirac matrices on the leptonic current to give a dominant piece proportional to the lepton mass. Thus the amplitude for charged pion decay into a muon is much larger than the amplitude for decay into an electron.
  2. Why can’t a muon decay into an electron in vacuum? The process \mu \to e cannot simultaneously conserve energy and momentum. This is simplest to see in the reference frame where the muon is at rest. Momentum conservation requires the electron to also be at rest. However, a particle has rest energy equal to its mass, but now there’s now way a muon at rest can pass on all of its energy to an electron at rest.
  3. Why is muon conversion in the Standard Model suppressed by the ration of the neutrino to W masses? This can be seen by drawing the Feynman diagram (fig below from 1401.6077). Flavor violation in the Standard Model requires a W boson. Because the W is much heavier than the muon, this must be virtual and appear only as an internal leg. Further, W‘s couple charged leptons to neutrinos, so there must also be a virtual neutrino. The evaluation of this diagram into an amplitude gives factors of the neutrino mass in the numerator (required for the fermion chirality flip) and the W mass in the denominator. For some details, see this post.
    Screen Shot 2015-03-05 at 4.08.58 PM

Further Reading:

  • 1205.2671: Fundamental Physics at the Intensity Frontier (section 3.2.2)
  • 1401.6077: Snowmass 2013 Report, Intensity Frontier chapter