How to find invisible particles in a collider

 You might have heard that one of the big things we are looking for in collider experiments are ever elusive dark matter particles. But given that dark matter particles are expected to interact very rarely with regular matter, how would you know if you happened to make some in a collision? The so called ‘direct detection’ experiments have to operate giant multi-ton detectors in extremely low-background environments in order to be sensitive to an occasional dark matter interaction. In the noisy environment of a particle collider like the LHC, in which collisions producing sprays of particles happen every 25 nanoseconds, the extremely rare interaction of the dark matter with our detector is likely to be missed. But instead of finding dark matter by seeing it in our detector, we can instead find it by not seeing it. That may sound paradoxical, but its how most collider based searches for dark matter work. 

The trick is based on every physicists favorite principle: the conservation of energy and momentum. We know that energy and momentum will be conserved in a collision, so if we know the initial momentum of the incoming particles, and measure everything that comes out, then any invisible particles produced will show up as an imbalance between the two. In a proton-proton collider like the LHC we don’t know the initial momentum of the particles along the beam axis, but we do that they were traveling along that axis. That means that the net momentum in the direction away from the beam axis (the ‘transverse’ direction) should be zero. So if we see a momentum imbalance going away from the beam axis, we know that there is some ‘invisible’ particle traveling in the opposite direction.

A sketch of what the signature of an invisible particle would like in a detector. Note this is a 2D cross section of the detector, with the beam axis traveling through the center of the diagram. There are two signals measured in the detector moving ‘up’ away from the beam pipe. Momentum conservation means there must have been some particle produced which is traveling ‘down’ and was not measured by the detector. Figure borrowed from here  

We normally refer to the amount of transverse momentum imbalance in an event as its ‘missing momentum’. Any collisions in which an invisible particle was produced will have missing momentum as tell-tale sign. But while it is a very interesting signature, missing momentum can actually be very difficult to measure. That’s because in order to tell if there is anything missing, you have to accurately measure the momentum of every particle in the collision. Our detectors aren’t perfect, any particles we miss, or mis-measure the momentum of, will show up as a ‘fake’ missing energy signature. 

A picture of a particularly noisy LHC collision, with a large number of tracks
Can you tell if there is any missing energy in this collision? Its not so easy… Figure borrowed from here

Even if you can measure the missing energy well, dark matter particles are not the only ones invisible to our detector. Neutrinos are notoriously difficult to detect and will not get picked up by our detectors, producing a ‘missing energy’ signature. This means that any search for new invisible particles, like dark matter, has to understand the background of neutrino production (often from the decay of a Z or W boson) very well. No one ever said finding the invisible would be easy!

However particle physicists have been studying these processes for a long time so we have gotten pretty good at measuring missing energy in our events and modeling the standard model backgrounds. Missing energy is a key tool that we use to search for dark matter, supersymmetry and other physics beyond the standard model.

Read More:

What happens when energy goes missing?” ATLAS blog post by Julia Gonski

How to look for supersymmetry at the LHC“, blog post by Matt Strassler

“Performance of missing transverse momentum reconstruction with the ATLAS detector using proton-proton collisions at √s = 13 TeV” Technical Paper by the ATLAS Collaboration

“Search for new physics in final states with an energetic jet or a hadronically decaying W or Z boson and transverse momentum imbalance at √s= 13 TeV” Search for dark matter by the CMS Collaboration

Measuring the Tau’s g-2 Too

Title : New physics and tau g2 using LHC heavy ion collisions

Authors: Lydia Beresford and Jesse Liu

Reference: https://arxiv.org/abs/1908.05180

Since April, particle physics has been going crazy with excitement over the recent announcement of the muon g-2 measurement which may be our first laboratory hint of physics beyond the Standard Model. The paper with the new measurement has racked up over 100 citations in the last month. Most of these papers are theorists proposing various models to try an explain the (controversial) discrepancy between the measured value of the muon’s magnetic moment and the Standard Model prediction. The sheer number of papers shows there are many many models that can explain the anomaly. So if the discrepancy is real,  we are going to need new measurements to whittle down the possibilities.

Given that the current deviation is in the magnetic moment of the muon, one very natural place to look next would be the magnetic moment of the tau lepton. The tau, like the muon, is a heavier cousin of the electron. It is the heaviest lepton, coming in at 1.78 GeV, around 17 times heavier than the muon. In many models of new physics that explain the muon anomaly the shift in the magnetic moment of a lepton is proportional to the mass of the lepton squared. This would explain why we are a seeing a discrepancy in the muon’s magnetic moment and not the electron (though there is a actually currently a small hint of a deviation for the electron too). This means the tau should be 280 times more sensitive than the muon to the new particles in these models. The trouble is that the tau has a much shorter lifetime than the muon, decaying away in just 10-13 seconds. This means that the techniques used to measure the muons magnetic moment, based on magnetic storage rings, won’t work for taus. 

Thats where this new paper comes in. It details a new technique to try and measure the tau’s magnetic moment using heavy ion collisions at the LHC. The technique is based on light-light collisions (previously covered on Particle Bites) where two nuclei emit photons that then interact to produce new particles. Though in classical electromagnetism light doesn’t interact with itself (the beam from two spotlights pass right through each other) at very high energies each photon can split into new particles, like a pair of tau leptons and then those particles can interact. Though the LHC normally collides protons, it also has runs colliding heavier nuclei like lead as well. Lead nuclei have more charge than protons so they emit high energy photons more often than protons and lead to more light-light collisions than protons. 

Light-light collisions which produce tau leptons provide a nice environment to study the interaction of the tau with the photon. A particles magnetic properties are determined by its interaction with photons so by studying these collisions you can measure the tau’s magnetic moment. 

However studying this process is be easier said than done. These light-light collisions are “Ultra Peripheral” because the lead nuclei are not colliding head on, and so the taus produced generally don’t have a large amount of momentum away from the beamline. This can make them hard to reconstruct in detectors which have been designed to measure particles from head on collisions which typically have much more momentum. Taus can decay in several different ways, but always produce at least 1 neutrino which will not be detected by the LHC experiments further reducing the amount of detectable momentum and meaning some information about the collision will lost. 

However one nice thing about these events is that they should be quite clean in the detector. Because the lead nuclei remain intact after emitting the photon, the taus won’t come along with the bunch of additional particles you often get in head on collisions. The level of background processes that could mimic this signal also seems to be relatively minimal. So if the experimental collaborations spend some effort in trying to optimize their reconstruction of low momentum taus, it seems very possible to perform a measurement like this in the near future at the LHC. 

The authors of this paper estimate that such a measurement with a the currently available amount of lead-lead collision data would already supersede the previous best measurement of the taus anomalous magnetic moment and further improvements could go much farther. Though the measurement of the tau’s magnetic moment would still be far less precise than that of the muon and electron, it could still reveal deviations from the Standard Model in realistic models of new physics. So given the recent discrepancy with the muon, the tau will be an exciting place to look next!

Read More:

An Anomalous Anomaly: The New Fermilab Muon g-2 Results

When light and light collide

Another Intriguing Hint of New Physics Involving Leptons

New detectors on the block

Article title: “Toward Machine Learning Optimization of Experimental Design”

Authors: MODE Collaboration

Reference: https://inspirehep.net/literature/1850892 (pdf)

In a previous post we wondered if (machine learning) algorithms can replace the entire simulation of detectors and reconstruction of particles. But meanwhile some experimentalists have gone one step further – and wondered if algorithms can design detectors.

Indeed, the MODE collaboration stands for Machine-learning Optimized Design of Experiments and in its first paper promises nothing less than that.

The idea here is that the choice of characteristics that an experiment can have is vast (think number of units, materials, geometry, dimensions and so on), but its ultimate goal can still be described by a single “utility function”. For instance, the precision of the measurement on specific data can be thought of as a utility function.

Then, the whole process that leads to obtaining that function can be decomposed into a number of conceptual blocks: normally there are incoming particles, which move through and interact with detectors, resulting in measurements; from them, the characteristics of the particles are reconstructed; these are eventually analyzed to get relevant useful quantities, the utility function among them. Ultimately, chaining together these blocks creates a pipeline that models the experiment from one end to the other.

Now, another central notion is differentiation or, rather, the ability to be differentiated; if all the components of this model are differentiable, then the gradient of the utility function can be calculated. This leads to the holy grail: finding its extreme values, i.e. optimize the experiment’s design as a function of its numerous components.

Before we see whether the components are indeed differentiable and how the gradient gets calculated, here is an example of this pipeline concept for a muon radiography detector.

Discovering a hidden space in the Great Pyramid by using muons. ( Financial Times)

Muons are not just the trendy star of particle physics (as of April 2021), but they also find application in scanning closed volumes and revealing details about the objects in them. And yes, the Great Pyramid has been muographed successfully.

In terms of the pipeline described above, a muon radiography device could be modeled in the following way: Muons from cosmic rays are generated in the form of 4-vectors. Those are fed to a fast-simulation of the scanned volume and the detector. The interactions of the particles with the materials and the resulting signals on the electronics are simulated. This output goes into a reconstruction module, which recreates muon tracks. From them, an information-extraction module calculates the density of the scanned material. It can also produce a loss function for the measurement, which here would be the target quantity.

Conceptual layout of the optimization pipeline. (MODE collaboration)

This whole ritual is a standard process in experimental work, although the steps are usually quite separate from one another. In the MODE concept, however, not only are they linked together but also run iteratively. The optimization of the detector design proceeds in steps and in each of them the parameters of the device are changed in the simulation. This affects directly the detector module and indirectly the downstream modules of the pipeline. The loop of modification and validation can be constrained appropriately to keep everything within realistic values, and also to make the most important consideration of all enter the game – that is of course cost and the constraints that it brings along.

Descending towards the minimum. (Dezhi Yu)

As mentioned above, the proposed optimization proceeds in steps by optimizing the parameters along the gradient of the utility function. The most famous incarnation of gradient-based optimization is gradient descent which is customarily used in neural networks. Gradient descent guides the network towards the minimum value of the error that it produces, through the possible “paths” of its parameters.

In the MODE proposal the optimization is achieved through automatic differentiation (AD), the latest word in the calculation of derivatives in computer programs. To shamefully paraphrase Wikipedia, AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations and functions. By applying the chain rule repeatedly to these operations, derivatives can be computed automatically, accurately and efficiently.

Also, something was mentioned above about whether the components of the pipeline are “indeed differentiable”. It turns out that one isn’t. This is the simulation of the processes during the passage of particles through the detector, which is stochastic by nature. However, machine learning can learn how to mimic it, take its place, and provide perfectly fine and differentiable modules. (The brave of heart can follow the link at the end to find out about local generative surrogates.)

This method of designing detectors might sound like a thought experiment on steroids. But the point of MODE is that it’s the realistic way to take full advantage of the current developments in computation. And maybe to feel like we have really entered the third century of particle experiments.

Further reading:

The MODE website: https://mode-collaboration.github.io/

A Beginner’s Guide to Differentiable Programming: https://wiki.pathmind.com/differentiableprogramming

Black-Box Optimization with Local Generative Surrogates: https://arxiv.org/abs/2002.04632

A symphony of data

Article title: “MUSiC: a model unspecific search for new physics in
proton-proton collisions at \sqrt{s} = 13 TeV”

Authors: The CMS Collaboration

Reference: https://arxiv.org/abs/2010.02984

First of all, let us take care of the spoilers: no new particles or phenomena have been found… Having taken this concern away, let us focus on the important concept behind MUSiC.

ATLAS and CMS, the two largest experiments using collisions at the LHC, are known as “general purpose experiments” for a good reason. They were built to look at a wide variety of physical processes and, up to now, each has checked dozens of proposed theoretical extensions of the Standard Model, in addition to checking the Model itself. However, in almost all cases their searches rely on definite theory predictions and focus on very specific combinations of particles and their kinematic properties. In this way, the experiments may still be far from utilizing their full potential. But now an algorithm named MUSiC is here to help.

MUSiC takes all events recorded by CMS that comprise of clean-cut particles and compares them against the expectations from the Standard Model, untethering itself from narrow definitions for the search conditions.

We should clarify here that an “event” is the result of an individual proton-proton collision (among the many happening each time the proton bunches cross), consisting of a bouquet of particles. First of all, MUSiC needs to work with events with particles that are well-recognized by the experiment’s detectors, to cut down on uncertainty. It must also use particles that are well-modeled, because it will rely on the comparison of data to simulation and, so, wants to be sure about the accuracy of the latter.

Display of an event with two muons at CMS. (Source: CMS experiment)

All this boils down to working with events with combinations of specific, but several, particles: electrons, muons, photons, hadronic jets from light-flavour (=up, down, strange) quarks or gluons and from bottom quarks, and deficits in the total transverse momentum (typically the signature of the uncatchable neutrinos or perhaps of unknown exotic particles). And to make things even more clean-cut, it keeps only events that include either an electron or a muon, both being well-understood characters.

These particles’ combinations result in hundreds of different “final states” caught by the detectors. However, they all correspond to only a dozen combos of particles created in the collisions according to the Standard Model, before some of them decay to lighter ones. For them, we know and simulate pretty well what we expect the experiment to measure.

MUSiC proceeded by comparing three kinematic quantities of these final states, as measured by CMS during the year 2016, to their simulated values. The three quantities of interest are the combined mass, combined transverse momentum and combined missing transverse momentum. It’s in their distributions that new particles would most probably show up, regardless of which theoretical model they follow. The range of values covered is pretty wide. All in all, the method extends the kinematic reach of usual searches, as it also does with the collection of final states.

An example distribution from MUSiC: Transverse mass for the final state comprising of one muon and missing transverse momentum. Color histograms: Simulated Standard Model processes. Red line: Signal from a hypothetical W’ boson with mass of 3TeV. (Source: paper)

So the kinematic distributions are checked against the simulated expectations in an automatized way, with MUSiC looking for every physicist’s dream: deviations. Any deviation from the simulation, meaning either fewer or more recorded events, is quantified by getting a probability value. This probability is calculated by also taking into account the much dreaded “look elsewhere effect”. (Which comes from the fact that, statistically, in a large number of distributions a random fluctuation that will mimic a genuine deviation is bound to appear sooner or later.)

When all’s said and done the collection of probabilities is overviewed. The MUSiC protocol says that any significant deviation will be scrutinized with more traditional methods – only that this need never actually arose in the 2016 data: all the data played along with the Standard Model, in all 1,069 examined final states and their kinematic ranges.

For the record, the largest deviation was spotted in the final state comprising three electrons, two generic hadronic jets and one jet coming from a bottom quark. Seven events were counted whereas the simulation gave 2.7±1.8 events (mostly coming from the production of a top plus an anti-top quark plus an intermediate vector boson from the collision; the fractional values are due to extrapolating to the amount of collected data). This excess was not seen in other related final states, “related” in that they also either include the same particles or have one less. Everything pointed to a fluctuation and the case was closed.

However, the goal of MUSiC was not strictly to find something new, but rather to demonstrate a method for model un-specific searches with collisions data. The mission seems to be accomplished, with CMS becoming even more general-purpose.

Read more:

Another generic search method in ATLAS: Going Rogue: The Search for Anything (and Everything) with ATLAS

And a take with machine learning: Letting the Machines Seach for New Physics

Fancy checking a good old model-specific search? Uncovering a Higgs Hiding Behind Backgrounds

Machine Learning The LHC ABC’s

Article Title: ABCDisCo: Automating the ABCD Method with Machine Learning

Authors: Gregor Kasieczka, Benjamin Nachman, Matthew D. Schwartz, David Shih

Reference: arxiv:2007.14400

When LHC experiments try to look for the signatures of new particles in their data they always apply a series of selection criteria to the recorded collisions. The selections pick out events that look similar to the sought after signal. Often they then compare the observed number of events passing these criteria to the number they would expect to be there from ‘background’ processes. If they see many more events in real data than the predicted background that is evidence of the sought after signal. Crucial to whole endeavor is being able to accurately estimate the number of events background processes would produce. Underestimate it and you may incorrectly claim evidence of a signal, overestimate it and you may miss the chance to find a highly sought after signal.

However it is not always so easy to estimate the expected number of background events. While LHC experiments do have high quality simulations of the Standard Model processes that produce these backgrounds they aren’t perfect. Particularly processes involving the strong force (aka Quantum Chromodynamics, QCD) are very difficult to simulate, and refining these simulations is an active area of research. Because of these deficiencies we don’t always trust background estimates based solely on these simulations, especially when applying very specific selection criteria.

Therefore experiments often employ ‘data-driven’ methods where they estimate the amount background events by using control regions in the data. One of the most widely used techniques is called the ABCD method.

An illustration of the ABCD method. The signal region, A, is defined as the region in which f and g are greater than some value. The amount of background in region A is estimated using regions B C and D which are dominated by background.

The ABCD method can applied if the selection of signal-like events involves two independent variables f and g. If one defines the ‘signal region’, A,  (the part of the data in which we are looking for a signal) as having f  and g each greater than some amount, then one can use the neighboring regions B, C, and D to estimate the amount of background in region A. If the number of signal events outside region A is small, the number of background events in region A can be estimated as N_A = N_B * (N_C/N_D).

In modern analyses often one of these selection requirements involves the score of a neural network trained to identify the sought after signal. Because neural networks are powerful learners one often has to be careful that they don’t accidentally learn about the other variable that will be used in the ABCD method, such as the mass of the signal particle. If two variables become correlated, a background estimate with the ABCD method will not be possible. This often means augmenting the neural network either during training or after the fact so that it is intentionally ‘de-correlated’ with respect to the other variable. While there are several known techniques to do this, it is still a tricky process and often good background estimates come with a trade off of reduced classification performance.

In this latest work the authors devise a way to have the neural networks help with the background estimate rather than hindering it. The idea is rather than training a single network to classify signal-like events, they simultaneously train two networks both trying to identify the signal. But during this training they use a groovy technique called ‘DisCo’ (short for Distance Correlation) to ensure that these two networks output is independent from each other. This forces the networks to learn to use independent information to identify the signal. This then allows these networks to be used in an ABCD background estimate quite easily.

The authors try out this new technique, dubbed ‘Double DisCo’, on several examples. They demonstrate they are able to have quality background estimates using the ABCD method while achieving great classification performance. They show that this method improves upon the previous state of the art technique of decorrelating a single network from a fixed variable like mass and using cuts on the mass and classifier to define the ABCD regions (called ‘Single Disco’ here).

Using the task of identifying jets containing boosted top quarks, they compare the classification performance (x-axis) and quality of the ABCD background estimate (y-axis) achievable with the new Double DisCo technique (yellow points) and previously state of the art Single DisCo (blue points). One can see the Double DisCo method is able to achieve higher background rejection with a similar or better amount of ABCD closure.

While there have been many papers over the last few years about applying neural networks to classification tasks in high energy physics, not many have thought about how to use them to improve background estimates as well. Because of their importance, background estimates are often the most time consuming part of a search for new physics. So this technique is both interesting and immediately practical to searches done with LHC data. Hopefully it will be put to use in the near future!

Further Reading:

Quanta Magazine Article “How Artificial Intelligence Can Supercharge the Search for New Particles

Recent ATLAS Summary on New Machine Learning Techniques “Machine learning qualitatively changes the search for new particles

CERN Tutorial on “Background Estimation with the ABCD Method

Summary of Paper of Previous Decorrelation Techniques used in ATLAS “Performance of mass-decorrelated jet substructure observables for hadronic two-body decay tagging in ATLAS

A shortcut to truth

Article title: “Automated detector simulation and reconstruction
parametrization using machine learning”

Authors: D. Benjamin, S.V. Chekanov, W. Hopkins, Y. Li, J.R. Love

Reference: https://arxiv.org/abs/2002.11516 (https://iopscience.iop.org/article/10.1088/1748-0221/15/05/P05025)

Demonstration of probability density function as the output of a neural network. (Source: paper)

The simulation of particle collisions at the LHC is a pharaonic task. The messy chromodynamics of protons must be modeled; the statistics of the collision products must reflect the Standard Model; each particle has to travel through the detectors and interact with all the elements in its path. Its presence will eventually be reduced to electronic measurements, which, after all, is all we know about it.

The work of the simulation ends somewhere here, and that of the reconstruction starts; namely to go from electronic signals to particles. Reconstruction is a process common to simulation and to the real world. Starting from the tangle of statistical and detector effects that the actual measurements include, the goal is to divine the properties of the initial collision products.

Now, researchers at the Argonne National Laboratory looked into going from the simulated particles as produced in the collisions (aka “truth objects”) directly to the reconstructed ones (aka “reco objects”): bypassing the steps of the detailed interaction with the detectors and of the reconstruction algorithm could make the studies that use simulations much more speedy and efficient.

Display of a collision event involving hadronic jets at ATLAS. Each colored block corresponds to interaction with a detector element. (Source: ATLAS experiment)

The team used a neural network which it trained on simulations of the full set. The goal was to have the network learn to produce the properties of the reco objects when given only the truth objects. The process succeeded in producing the transverse momenta of hadronic jets, and looks suitable for any kind of particle and for other kinematic quantities.

More specifically, the researchers began with two million simulated jet events, fully passed through the ATLAS experiment and the reconstruction algorithm. For each of them, the network took the kinematic properties of the truth jet as input and was trained to achieve the reconstructed transverse momentum.

The network was taught to perform multi-categorization: its output didn’t consist of a single node giving the momentum value, but of 400 nodes, each corresponding to a different range of values. The output of each node was the probability for that particular range. In other words, the result was a probability density function for the reconstructed momentum of a given jet.

The final step was to select the momentum randomly from this distribution. For half a million of test jets, all this resulted in good agreement with the actual reconstructed momenta, specifically within 5% for values above 20 GeV. In addition, it seems that the training was sensitive to the effects of quantities other than the target one (e.g. the effects of the position in the detector), as the neural network was able to pick up on the dependencies between the input variables. Also, hadronic jets are complicated animals, so it is expected that the method will work on other objects just as well.

Comparison of the reconstructed transverse momentum between the full simulation and reconstruction (“Delphes”) and the neural net output. (Source: paper)

All in all, this work showed the perspective for neural networks to imitate successfully the effects of the detector and the reconstruction. Simulations in large experiments typically take up loads of time and resources due to their size, intricacy and frequent need for updates in the hardware conditions. Such a shortcut, needing only small numbers of fully processed events, would speed up studies such as optimization of the reconstruction and detector upgrades.

More reading:

Argonne Lab press release: https://www.anl.gov/article/learning-more-about-particle-collisions-with-machine-learning

Intro to neural networks: https://physicsworld.com/a/neural-networks-explained/

Crystals are dark matter’s best friends

Article title: “Development of ultra-pure NaI(Tl) detector for COSINE-200 experiment”

Authors: B.J. Park et el.

Reference: arxiv:2004.06287

The landscape of direct detection of dark matter is a perplexing one; all experiments have so far come up with deafening silence, except for a single one which promises a symphony. This is the DAMA/LIBRA experiment in Gran Sasso, Italy, which has been seeing an annual modulation in its signal for two decades now.

Such an annual modulation is as dark-matter-like as it gets. First proposed by Katherine Freese in 1987, it would be the result of earth’s motion inside the galactic halo of dark matter in the same direction as the sun for half of the year and in the opposite direction during the other half. However, DAMA/LIBRA’s results are in conflict with other experiments – but with the catch that none of those used the same setup. The way to settle this is obviously to build more experiments with the DAMA/LIBRA setup. This is an ongoing effort which ultimately focuses on the crystals at its heart.

Cylindrical crystals wrapped in reflector, bounded by photomultipliers (PMTs) and surrounded by scintillators. (COSINE-100)

The specific crystals are made of the scintillating material thallium-doped sodium iodide, NaI(Tl). Dark matter particles, and particularly WIMPs, would collide elastically with atomic nuclei and the recoil would give off photons, which would eventually be captured by photomultiplier tubes at the ends of each crystal.

Right now a number of NaI(Tl)-based experiments are at various stages of preparation around the world, with COSINE-100 at the Yangyang mountain, S.Korea, already producing negative results. However, these are still not on equal footing with DAMA/LIBRA’s because of higher backgrounds at COSINE-100. What is the collaboration to do, then? The answer is focus even more on the crystals and how they are prepared.

Setup of the COSINE-100 experiment. (COSINE-100)

Over the last couple of years some serious R&D went into growing better crystals for COSINE-200, the planned upgrade of COSINE-100. Yes, a crystal is something that can and does grow. A seed placed inside the raw material, in this case NaI(Tl) powder, leads it to organize itself around the seed’s structure over the next hours or days.

In COSINE-100 the most annoying backgrounds came from within the crystals themselves because of the production process, because of natural radioactivity, and because of cosmogenically induced isotopes. Let’s see how each of these was tackled during the experiment’s mission towards a radiopure upgrade.

Improved techniques of growing and preparing the crystals reduced contamination from the materials of the grower device and from the ambient environment. At the same time different raw materials were tried out to put the inherent contamination under control.

Among a handful of naturally present radioactive isotopes particular care was given to 40K. 40K can decay characteristically to an X-ray of 3.2keV and a γ-ray of 1,460keV, a combination convenient for tagging it to a large extent. The tagging is done with the help of 2,000 liters of liquid scintillator surrounding the crystals. However, if the γ-ray escapes the crystal then the left-behind X-ray will mimic the expected signal from WIMPs… Eventually the dangerous 40K was brought down to levels comparable to those in DAMA/LIBRA through the investigation of various techniques and first materials.

But the main source of radioactive background in COSINE-100 was isotopes such as 3H or 22Na created inside the crystals by cosmic ray muons, after their production. Now, their abundance was reduced significantly by two simple moves: the crystals were grown locally at a very low altitude and installed underground within a few weeks (instead of being transported from a lab at 1,400 meters above sea in Colorado). Moreover, most of the remaining cosmogenic background is to decay away within a couple of years.

Components of the background, and temporal evolution of the cosmogenic radioactivity. (Source)

Where are these efforts standing? The energy range of interest for testing the DAMA/LIBRA signal is 1-6keV. This corresponds to a background target of 1 count/kg/day/keV. After the crystals R&D, the achieved contamination was less than about 0.34 counts. In short, everything is ready for COSINE-100 to upgrade to COSINE-200 and test the annual modulation without the previous ambiguities that stood in the way.

Learn more:

More on DAMA/LIBRA in ParticleBites.

Cross-checking the modulation.

The COSINE-100 experiment.

First COSINE-100 results.

Listening for axions

If dark matter actually consists of a new kind of particle, then the most up-and-coming candidate is the axion. The axion is a consequence of the Peccei-Quinn mechanism, a plausible solution to the “strong CP problem,” or why the strong nuclear force conserves the CP-symmetry although there are no reasons for it to. It is a very light neutral boson, named by Frank Wilczek after a detergent brand (in a move that obviously dates its introduction in the ’70s).

Axion decay in a magnetic field: the result is a photon. (Source.)

Most experiments that try to directly detect dark matter have looked for WIMPs (weakly interacting massive particles). However, as those searches have not borne fruit, the focus started turning to axions, which make for good candidates given their properties and the fact that if they exist, then they exist in multitudes throughout the galaxies. Axions “speak” to the QCD part of the Standard Model, so they can appear in interaction vertices with hadronic loops. The end result is that axions passing through a magnetic field will convert to photons.

In practical terms, their detection boils down to having strong magnets, sensitive electronics and an electromagnetically very quiet place at one’s disposal. One can then sit back and wait for the hypothesized axions to pass through the detector as earth moves through the dark matter halo surrounding the Milky Way. Which is precisely why such experiments are known as “haloscopes.”

Now, the most veteran haloscope of all published significant new results. Alas, it is still empty-handed, but we can look at why its update is important and how it was reached.

ADMX (Axion Dark Matter eXperiment) of the University of Washington has been around for a quarter-century. By listening for signals from axions, it progressively gnaws away at the space of allowed values for their mass and coupling to photons, focusing on an area of interest:

ADMX_results_2020
Latest exclusion limits on the axion mass and coupling to photons.

Unlike higher values, this area is not excluded by astrophysical considerations (e.g. stars cooling off through axion emission) and other types of experiments (such as looking for axions from the sun). In addition, the bands above the lines denoted “KSVZ” and “DFSZ” are special. They correspond to the predictions of two models with favorable theoretical properties. So, ADMX is dedicated to scanning this parameter space. And the new analysis added one more year of data-taking, making a significant dent in this ballpark.

As mentioned, the presence of axions would be inferred from a stream of photons in the detector. The excluded mass range was scanned by “tuning” the experiment to different frequencies, while at each frequency step longer observation times probed smaller values for the axion-photon coupling.

Two things that this search needs is a lot of quiet and some good amplification, as the signal from a typical axion is expected to be as weak as the signal from a mobile phone left on the surface of Mars (around 10-23W). The setup is indeed stripped of noise by being placed in a dilution refrigerator, which keeps its temperature at a few tenths of a degree above absolute zero. This is practically the domain governed by quantum noise, so advantage can be taken of the finesse of quantum technology: for the first time ADMX used SQUIDs, superconducting quantum interference devices, for the amplification of the signal.

The heart of the experiment inside the refrigerator. The resonant frequency of the cavity is tuned to match the photons -hopefully- given off by axions. (Source.)




In the end, a good chunk of the parameter space which is favored by the theory might have been excluded, but the haloscope is ready to look at the rest of it. Just think of how, one day, a pulse inside a small device in a university lab might be a messenger of the mysteries unfolding across the cosmos.

References:

Publication by the ADMX collaboration. (arXiv)

Learn more:

  1. The theory behind axions.
  2. The hitchhiker’s guide to the dilution refrigerator.
  3. Intro to KSVZ and DFSZ axions (and more).
  4. Resonant cavities.

Making Smarter Snap Judgments at the LHC

Collisions at the Large Hadron Collider happen fast. 40 million times a second, bunches of 1011 protons are smashed together. The rate of these collisions is so fast that the computing infrastructure of the experiments can’t keep up with all of them. We are not able to read out and store the result of every collision that happens, so we have to ‘throw out’ nearly all of them. Luckily most of these collisions are not very interesting anyways. Most of them are low energy interactions of quarks and gluons via the strong force that have been already been studied at previous colliders. In fact, the interesting processes, like ones that create a Higgs boson, can happen billions of times less often than the uninteresting ones.

The LHC experiments are thus faced with a very interesting challenge, how do you decide extremely quickly whether an event is interesting and worth keeping or not? This what the ‘trigger’ system, the Marie Kondo of LHC experiments, are designed to do. CMS for example has a two-tiered trigger system. The first level has 4 microseconds to make a decision and must reduce the event rate from 40 millions events per second to 100,000. This speed requirement means the decision has to be made using at the hardware level, requiring the use of specialized electronics to quickly to synthesize the raw information from the detector into a rough idea of what happened in the event. Selected events are then passed to the High Level Trigger (HLT), which has 150 milliseconds to run versions of the CMS reconstruction algorithms to further reduce the event rate to a thousand per second.

While this system works very well for most uses of the data, like measuring the decay of Higgs bosons, sometimes it can be a significant obstacle. If you want to look through the data for evidence of a new particle that is relatively light, it can be difficult to prevent the trigger from throwing out possible signal events. This is because one of the most basic criteria the trigger uses to select ‘interesting’ events is that they leave a significant amount of energy in the detector. But the decay products of a new particle that is relatively light won’t have a substantial amount of energy and thus may look ‘uninteresting’ to the trigger.

In order to get the most out of their collisions, experimenters are thinking hard about these problems and devising new ways to look for signals the triggers might be missing. One idea is to save additional events from the HLT in a substantially reduced size. Rather than saving the raw information from the event, that can be fully processed at a later time, instead the only the output of the quick reconstruction done by the trigger is saved. At the cost of some precision, this can reduce the size of each event by roughly two orders of magnitude, allowing events with significantly lower energy to be stored. CMS and ATLAS have used this technique to look for new particles decaying to two jets and LHCb has used it to look for dark photons. The use of these fast reconstruction techniques allows them to search for, and rule out the existence of, particles with much lower masses than otherwise possible. As experiments explore new computing infrastructures (like GPU’s) to speed up their high level triggers, they may try to do even more sophisticated analyses using these techniques. 

But experimenters aren’t just satisfied with getting more out of their high level triggers, they want to revamp the low-level ones as well. In order to get these hardware-level triggers to make smarter decisions, experimenters are trying get them to run machine learning models. Machine learning has become very popular tool to look for rare signals in LHC data. One of the advantages of machine learning models is that once they have been trained, they can make complex inferences in a very short amount of time. Perfect for a trigger! Now a group of experimentalists have developed a library that can translate the most popular types machine learning models into a format that can be run on the Field Programmable Gate Arrays used in lowest level triggers. This would allow experiments to quickly identify events from rare signals that have complex signatures that the current low-level triggers don’t have time to look for. 

The LHC experiments are working hard to get the most out their collisions. There could be particles being produced in LHC collisions already but we haven’t been able to see them because of our current triggers, but these new techniques are trying to cover our current blind spots. Look out for new ideas on how to quickly search for interesting signatures, especially as we get closer the high luminosity upgrade of the LHC.

Read More:

CERN Courier article on programming FPGA’s

IRIS HEP Article on a recent workshop on Fast ML techniques

CERN Courier article on older CMS search for low mass dijet resonances

ATLAS Search using ‘trigger-level’ jets

LHCb Search for Dark Photons using fast reconstruction based on a high level trigger

Paper demonstrating the feasibility of running ML models for jet tagging on FPGA’s

Letting the Machines Search for New Physics

Article: “Anomaly Detection for Resonant New Physics with Machine Learning”

Authors: Jack H. Collins, Kiel Howe, Benjamin Nachman

Reference : https://arxiv.org/abs/1805.02664

One of the main goals of LHC experiments is to look for signals of physics beyond the Standard Model; new particles that may explain some of the mysteries the Standard Model doesn’t answer. The typical way this works is that theorists come up with a new particle that would solve some mystery and they spell out how it interacts with the particles we already know about. Then experimentalists design a strategy of how to search for evidence of that particle in the mountains of data that the LHC produces. So far none of the searches performed in this way have seen any definitive evidence of new particles, leading experimentalists to rule out a lot of the parameter space of theorists favorite models.

A summary of searches the ATLAS collaboration has performed. The left columns show model being searched for, what experimental signature was looked at and how much data has been analyzed so far. The color bars show the regions that have been ruled out based on the null result of the search. As you can see, we have already covered a lot of territory.

Despite this extensive program of searches, one might wonder if we are still missing something. What if there was a new particle in the data, waiting to be discovered, but theorists haven’t thought of it yet so it hasn’t been looked for? This gives experimentalists a very interesting challenge, how do you look for something new, when you don’t know what you are looking for? One approach, which Particle Bites has talked about before, is to look at as many final states as possible and compare what you see in data to simulation and look for any large deviations. This is a good approach, but may be limited in its sensitivity to small signals. When a normal search for a specific model is performed one usually makes a series of selection requirements on the data, that are chosen to remove background events and keep signal events. Nowadays, these selection requirements are getting more complex, often using neural networks, a common type of machine learning model, trained to discriminate signal versus background. Without some sort of selection like this you may miss a smaller signal within the large amount of background events.

This new approach lets the neural network itself decide what signal to  look for. It uses part of the data itself to train a neural network to find a signal, and then uses the rest of the data to actually look for that signal. This lets you search for many different kinds of models at the same time!

If that sounds like magic, lets try to break it down. You have to assume something about the new particle you are looking for, and the technique here assumes it forms a resonant peak. This is a common assumption of searches. If a new particle were being produced in LHC collisions and then decaying, then you would get an excess of events where the invariant mass of its decay products have a particular value. So if you plotted the number of events in bins of invariant mass you would expect a new particle to show up as a nice peak on top of a relatively smooth background distribution. This is a very common search strategy, and often colloquially referred to as a ‘bump hunt’. This strategy was how the Higgs boson was discovered in 2012.

A histogram showing the invariant mass of photon pairs. The Higgs boson shows up as a bump at 125 GeV. Plot from here

The other secret ingredient we need is the idea of Classification Without Labels (abbreviated CWoLa, pronounced like koala). The way neural networks are usually trained in high energy physics is using fully labeled simulated examples. The network is shown a set of examples and then guesses which are signal and which are background. Using the true label of the event, the network is told which of the examples it got wrong, its parameters are updated accordingly, and it slowly improves. The crucial challenge when trying to train using real data is that we don’t know the true label of any of data, so its hard to tell the network how to improve. Rather than trying to use the true labels of any of the events, the CWoLA technique uses mixtures of events. Lets say you have 2 mixed samples of events, sample A and sample B, but you know that sample A has more signal events in it than sample B. Then, instead of trying to classify signal versus background directly, you can train a classifier to distinguish between events from sample A and events from sample B and what that network will learn to do is distinguish between signal and background. You can actually show that the optimal classifier for distinguishing the two mixed samples is the same as the optimal classifier of signal versus background. Even more amazing, this technique actually works quite well in practice, achieving good results even when there is only a few percent of signal in one of the samples.

An illustration of the CWoLa method. A classifier trained to distinguish between two mixed samples of signal and background events learns can learn to classify signal versus background. Taken from here

The technique described in the paper combines these two ideas in a clever way. Because we expect the new particle to show up in a narrow region of invariant mass, you can use some of your data to train a classifier to distinguish between events in a given slice of invariant mass from other events. If there is no signal with a mass in that region then the classifier should essentially learn nothing, but if there was a signal in that region that the classifier should learn to separate signal and background. Then one can apply that classifier to select events in the rest of your data (which hasn’t been used in the training) and look for a peak that would indicate a new particle. Because you don’t know ahead of time what mass any new particle should have, you scan over the whole range you have sufficient data for, looking for a new particle in each slice.

The specific case that they use to demonstrate the power of this technique is for new particles decaying to pairs of jets. On the surface, jets, the large sprays of particles produced when quark or gluon is made in a LHC collision, all look the same. But actually the insides of jets, their sub-structure, can contain very useful information about what kind of particle produced it. If a new particle that is produced decays into other particles, like top quarks, W bosons or some a new BSM particle, before decaying into quarks then there will be a lot of interesting sub-structure to the resulting jet, which can be used to distinguish it from regular jets. In this paper the neural network uses information about the sub-structure for both of the jets in event to determine if the event is signal-like or background-like.

The authors test out their new technique on a simulated dataset, containing some events where a new particle is produced and a large number of QCD background events. They train a neural network to distinguish events in a window of invariant mass of the jet pair from other events. With no selection applied there is no visible bump in the dijet invariant mass spectrum. With their technique they are able to train a classifier that can reject enough background such that a clear mass peak of the new particle shows up. This shows that you can find a new particle without relying on searching for a particular model, allowing you to be sensitive to particles overlooked by existing searches.

Demonstration of the bump hunt search. The shaded histogram is the amount of signal in the dataset. The different levels of blue points show the data remaining after applying tighter and tighter selection based on the neural network classifier score. The red line is the predicted amount of background events based on fitting the sideband regions. One can see that for the tightest selection (bottom set of points), the data forms a clear bump over the background estimate, indicating the presence of a new particle

This paper was one of the first to really demonstrate the power of machine-learning based searches. There is actually a competition being held to inspire researchers to try out other techniques on a mock dataset. So expect to see more new search strategies utilizing machine learning being released soon. Of course the real excitement will be when a search like this is applied to real data and we can see if machines can find new physics that us humans have overlooked!

Read More:

  1. Quanta Magazine Article “How Artificial Intelligence Can Supercharge the Search for New Particles”
  2. Blog Post on the CWoLa Method “Training Collider Classifiers on Real Data”
  3. Particle Bites Post “Going Rogue: The Search for Anything (and Everything) with ATLAS”
  4. Blog Post on applying ML to top quark decays “What does Bidirectional LSTM Neural Networks has to do with Top Quarks?”
  5. Extended Version of Original Paper “Extending the Bump Hunt with Machine Learning”