The simulation of particle collisions at the LHC is a pharaonic task. The messy chromodynamics of protons must be modeled; the statistics of the collision products must reflect the Standard Model; each particle has to travel through the detectors and interact with all the elements in its path. Its presence will eventually be reduced to electronic measurements, which, after all, is all we know about it.
The work of the simulation ends somewhere here, and that of the reconstruction starts; namely to go from electronic signals to particles. Reconstruction is a process common to simulation and to the real world. Starting from the tangle of statistical and detector effects that the actual measurements include, the goal is to divine the properties of the initial collision products.
Now, researchers at the Argonne National Laboratory looked into going from the simulated particles as produced in the collisions (aka “truth objects”) directly to the reconstructed ones (aka “reco objects”): bypassing the steps of the detailed interaction with the detectors and of the reconstruction algorithm could make the studies that use simulations much more speedy and efficient.
The team used a neural network which it trained on simulations of the full set. The goal was to have the network learn to produce the properties of the reco objects when given only the truth objects. The process succeeded in producing the transverse momenta of hadronic jets, and looks suitable for any kind of particle and for other kinematic quantities.
More specifically, the researchers began with two million simulated jet events, fully passed through the ATLAS experiment and the reconstruction algorithm. For each of them, the network took the kinematic properties of the truth jet as input and was trained to achieve the reconstructed transverse momentum.
The network was taught to perform multi-categorization: its output didn’t consist of a single node giving the momentum value, but of 400 nodes, each corresponding to a different range of values. The output of each node was the probability for that particular range. In other words, the result was a probability density function for the reconstructed momentum of a given jet.
The final step was to select the momentum randomly from this distribution. For half a million of test jets, all this resulted in good agreement with the actual reconstructed momenta, specifically within 5% for values above 20 GeV. In addition, it seems that the training was sensitive to the effects of quantities other than the target one (e.g. the effects of the position in the detector), as the neural network was able to pick up on the dependencies between the input variables. Also, hadronic jets are complicated animals, so it is expected that the method will work on other objects just as well.
All in all, this work showed the perspective for neural networks to imitate successfully the effects of the detector and the reconstruction. Simulations in large experiments typically take up loads of time and resources due to their size, intricacy and frequent need for updates in the hardware conditions. Such a shortcut, needing only small numbers of fully processed events, would speed up studies such as optimization of the reconstruction and detector upgrades.
Article: “Anomaly Detection for Resonant New Physics with Machine Learning”
Authors: Jack H. Collins, Kiel Howe, Benjamin Nachman
Reference : https://arxiv.org/abs/1805.02664
One of the main goals of LHC experiments is to look for signals of physics beyond the Standard Model; new particles that may explain some of the mysteries the Standard Model doesn’t answer. The typical way this works is that theorists come up with a new particle that would solve some mystery and they spell out how it interacts with the particles we already know about. Then experimentalists design a strategy of how to search for evidence of that particle in the mountains of data that the LHC produces. So far none of the searches performed in this way have seen any definitive evidence of new particles, leading experimentalists to rule out a lot of the parameter space of theorists favorite models.
Despite this extensive program of searches, one might wonder if we are still missing something. What if there was a new particle in the data, waiting to be discovered, but theorists haven’t thought of it yet so it hasn’t been looked for? This gives experimentalists a very interesting challenge, how do you look for something new, when you don’t know what you are looking for? One approach, which Particle Bites has talked about before, is to look at as many final states as possible and compare what you see in data to simulation and look for any large deviations. This is a good approach, but may be limited in its sensitivity to small signals. When a normal search for a specific model is performed one usually makes a series of selection requirements on the data, that are chosen to remove background events and keep signal events. Nowadays, these selection requirements are getting more complex, often using neural networks, a common type of machine learning model, trained to discriminate signal versus background. Without some sort of selection like this you may miss a smaller signal within the large amount of background events.
This new approach lets the neural network itself decide what signal to look for. It uses part of the data itself to train a neural network to find a signal, and then uses the rest of the data to actually look for that signal. This lets you search for many different kinds of models at the same time!
If that sounds like magic, lets try to break it down. You have to assume something about the new particle you are looking for, and the technique here assumes it forms a resonant peak. This is a common assumption of searches. If a new particle were being produced in LHC collisions and then decaying, then you would get an excess of events where the invariant mass of its decay products have a particular value. So if you plotted the number of events in bins of invariant mass you would expect a new particle to show up as a nice peak on top of a relatively smooth background distribution. This is a very common search strategy, and often colloquially referred to as a ‘bump hunt’. This strategy was how the Higgs boson was discovered in 2012.
The other secret ingredient we need is the idea of Classification Without Labels (abbreviated CWoLa, pronounced like koala). The way neural networks are usually trained in high energy physics is using fully labeled simulated examples. The network is shown a set of examples and then guesses which are signal and which are background. Using the true label of the event, the network is told which of the examples it got wrong, its parameters are updated accordingly, and it slowly improves. The crucial challenge when trying to train using real data is that we don’t know the true label of any of data, so its hard to tell the network how to improve. Rather than trying to use the true labels of any of the events, the CWoLA technique uses mixtures of events. Lets say you have 2 mixed samples of events, sample A and sample B, but you know that sample A has more signal events in it than sample B. Then, instead of trying to classify signal versus background directly, you can train a classifier to distinguish between events from sample A and events from sample B and what that network will learn to do is distinguish between signal and background. You can actually show that the optimal classifier for distinguishing the two mixed samples is the same as the optimal classifier of signal versus background. Even more amazing, this technique actually works quite well in practice, achieving good results even when there is only a few percent of signal in one of the samples.
The technique described in the paper combines these two ideas in a clever way. Because we expect the new particle to show up in a narrow region of invariant mass, you can use some of your data to train a classifier to distinguish between events in a given slice of invariant mass from other events. If there is no signal with a mass in that region then the classifier should essentially learn nothing, but if there was a signal in that region that the classifier should learn to separate signal and background. Then one can apply that classifier to select events in the rest of your data (which hasn’t been used in the training) and look for a peak that would indicate a new particle. Because you don’t know ahead of time what mass any new particle should have, you scan over the whole range you have sufficient data for, looking for a new particle in each slice.
The specific case that they use to demonstrate the power of this technique is for new particles decaying to pairs of jets. On the surface, jets, the large sprays of particles produced when quark or gluon is made in a LHC collision, all look the same. But actually the insides of jets, their sub-structure, can contain very useful information about what kind of particle produced it. If a new particle that is produced decays into other particles, like top quarks, W bosons or some a new BSM particle, before decaying into quarks then there will be a lot of interesting sub-structure to the resulting jet, which can be used to distinguish it from regular jets. In this paper the neural network uses information about the sub-structure for both of the jets in event to determine if the event is signal-like or background-like.
The authors test out their new technique on a simulated dataset, containing some events where a new particle is produced and a large number of QCD background events. They train a neural network to distinguish events in a window of invariant mass of the jet pair from other events. With no selection applied there is no visible bump in the dijet invariant mass spectrum. With their technique they are able to train a classifier that can reject enough background such that a clear mass peak of the new particle shows up. This shows that you can find a new particle without relying on searching for a particular model, allowing you to be sensitive to particles overlooked by existing searches.
This paper was one of the first to really demonstrate the power of machine-learning based searches. There is actually a competition being held to inspire researchers to try out other techniques on a mock dataset. So expect to see more new search strategies utilizing machine learning being released soon. Of course the real excitement will be when a search like this is applied to real data and we can see if machines can find new physics that us humans have overlooked!