Too Massive? New measurement of the W boson’s mass sparks intrigue

This is part one of our coverage of the CDF W mass result, focusing on its implications. Read about the details of the measurement in a sister post here!

Last week the physics world was abuzz with the latest results from an experiment that stopped running a decade ago. Some heralded this as the beginning of a breakthrough in fundamental physics, with headlines reading “Shock result in particle experiment could spark physics revolution” (BBC). So what exactly is all the fuss about?

The result itself is an ultra-precise measurement of the mass of the W boson. The W boson is one of the carriers of the weak force, and this measurement pegged its mass at 80,433 MeV with an uncertainty of 9 MeV. The excitement comes from the fact that this value disagrees with the prediction of our current best theory of particle physics, the Standard Model. In the theoretical structure of the Standard Model, the masses of the gauge bosons are all interrelated: the mass of the W boson can be computed from the mass of the Z along with a few other parameters of the theory (like the weak mixing angle). In a first approximation (i.e. to lowest order in perturbation theory), the mass of the W boson is equal to the mass of the Z boson times the cosine of the weak mixing angle. Based on other measurements that have been performed, including the Z mass, the Higgs mass, the lifetime of the muon, and others, the Standard Model predicts that the mass of the W boson should be 80,357 MeV (with an uncertainty of 6 MeV). So the two numbers disagree quite strongly, at the level of 7 standard deviations.
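As a rough sanity check, both the lowest-order relation and the quoted 7-standard-deviation tension can be reproduced in a few lines. This is a sketch with approximate illustrative inputs, not the full Standard Model calculation:

```python
import math

# Lowest-order relation: m_W = m_Z * cos(theta_W). The inputs below are
# approximate illustrative values; the full Standard Model prediction
# includes loop corrections that bring it up to the quoted 80,357 MeV.
m_Z = 91187.6          # Z mass in MeV
sin2_theta_W = 0.2312  # approximate weak mixing angle
m_W_tree = m_Z * math.sqrt(1.0 - sin2_theta_W)  # roughly 80,000 MeV

# Significance of the measurement-vs-prediction tension, adding the
# two quoted uncertainties in quadrature.
m_W_cdf, sigma_cdf = 80433.0, 9.0  # MeV
m_W_sm, sigma_sm = 80357.0, 6.0    # MeV
significance = (m_W_cdf - m_W_sm) / math.hypot(sigma_cdf, sigma_sm)  # ~7
```

The tree-level value lands a few hundred MeV below either quoted number, which is exactly why the loop corrections (and the parameters entering them) matter so much here.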

If the measurement and the Standard Model prediction are both correct, this would imply that there is some deficiency in the Standard Model: some new particle interacting with the W boson whose effects haven’t been accounted for. This would be welcome news to particle physicists, as we know the Standard Model is an incomplete theory but have been lacking direct experimental confirmation of its deficiencies. The size of the discrepancy would also mean that whatever new particle is causing the deviation may be directly detectable at current or near-future colliders.

If this discrepancy is real, exactly what new particles would this entail? Judging based on the 30+ (and counting) papers released on the subject in the last week, there are a good number of possibilities. Some examples include extra Higgs bosons, extra Z-like bosons, and vector-like fermions. It would take additional measurements and direct searches to pick out exactly what the culprit was. But it would hopefully give experimenters definite targets of particles to look for, which would go a long way in advancing the field.

But before everyone starts proclaiming the Standard Model dead and popping champagne bottles, it’s important to take stock of this new CDF measurement in the larger context. Measurements of the W mass are hard; that’s why it has taken the CDF collaboration over 10 years to publish this result since they stopped taking data. And although this measurement is the most precise one to date, several other W mass measurements have been performed by other experiments.

The Other Measurements

A summary of all the W mass measurements performed to date (black dots) with their uncertainties (blue bars), as compared to the Standard Model prediction (yellow band). One can see that this new CDF result is in tension with the previous measurements. (source)

Previous measurements of the W mass have come from experiments at the Large Electron-Positron collider (LEP), another experiment at the Tevatron (D0), and experiments at the LHC (ATLAS and LHCb). Though none of these were as precise as this new CDF result, they had been painting a consistent picture of a value in agreement with the Standard Model prediction. If you take the average of these other measurements, their value differs from the CDF measurement at the level of about 4 standard deviations, which is quite significant. This discrepancy seems too large to arise from a purely random fluctuation, and likely means that either some uncertainties have been underestimated or something has been overlooked, in either the previous measurements or this new one.
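Averages like this are typically formed as inverse-variance weighted combinations, which assume the measurements are independent with Gaussian uncertainties. A minimal sketch, using made-up round numbers purely for illustration (not the actual measurement inputs):

```python
import math

def combine(measurements):
    """Inverse-variance weighted average of (value, uncertainty) pairs,
    assuming independent, Gaussian uncertainties."""
    weights = [1.0 / u ** 2 for _, u in measurements]
    avg = sum(w * v for (v, _), w in zip(measurements, weights)) / sum(weights)
    unc = math.sqrt(1.0 / sum(weights))
    return avg, unc

# Hypothetical round numbers for illustration only:
avg, unc = combine([(80370.0, 20.0), (80380.0, 20.0)])
```

Note that more precise inputs get more weight, and the combined uncertainty shrinks as measurements are added; in reality correlated systematic uncertainties between experiments complicate the combination considerably.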

What one would like are additional, independent, high-precision measurements that could confirm either the CDF value or the average of the previous measurements. Unfortunately, such a measurement is unlikely to come in the near future. The only currently running facility capable of one is the LHC, but it will be difficult for experiments at the LHC to rival the precision of this CDF result.

W mass measurements are somewhat harder at the LHC than they were at the Tevatron for a few reasons. First of all, the LHC is a proton-proton collider while the Tevatron was a proton-antiproton collider, and the LHC also operates at a higher collision energy. Both differences cause W bosons produced at the LHC to have more momentum than those produced at the Tevatron. Modeling of the W boson’s momentum distribution can be a significant uncertainty in its mass measurement, and the extra momentum of W’s at the LHC makes this a larger effect. Additionally, the LHC has a higher collision rate, meaning that each time a W boson is produced there are actually tens of other collisions laid on top (rather than only a few, as at the Tevatron). These extra collisions are called pileup and can make precision measurements like this one harder. In particular, the neutrino’s momentum has to be inferred from the momentum imbalance in the event, and this becomes harder when there are many collisions on top of each other. Of course W mass measurements are possible at the LHC, as evidenced by ATLAS’s and LHCb’s already published results. And we can look forward to improved results from ATLAS and LHCb as well as a first result from CMS. But it may be very difficult for them to reach the precision of this CDF result.
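The neutrino-inference step can be made concrete with a highly simplified sketch: an idealized event with a single lepton, no pileup, and no detector effects (real reconstructions are far more involved):

```python
import math

def missing_pt(visible):
    """Infer the neutrino's transverse momentum as the negative vector sum
    of all visible transverse momenta (the 'missing' momentum).
    visible: list of (px, py) pairs."""
    px = -sum(p[0] for p in visible)
    py = -sum(p[1] for p in visible)
    return (px, py)

def transverse_mass(lep, nu):
    """W transverse mass from the lepton and inferred neutrino pT vectors."""
    pt_l, pt_n = math.hypot(*lep), math.hypot(*nu)
    dphi = math.atan2(lep[1], lep[0]) - math.atan2(nu[1], nu[0])
    return math.sqrt(2.0 * pt_l * pt_n * (1.0 - math.cos(dphi)))

# Idealized event: a 40 GeV lepton, nothing else visible, so the inferred
# neutrino is exactly back-to-back and mT comes out at the W mass scale.
lep = (40.0, 0.0)
nu = missing_pt([lep])
mT = transverse_mass(lep, nu)
```

Pileup degrades exactly this inference: the extra collisions add visible momentum to the sum, smearing the inferred neutrino vector and hence the transverse mass.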

A plot of the transverse mass (one of the variables used in the measurement) of the W from the ATLAS measurement. The red and yellow lines show how little the distribution changes if the W mass changes by 50 MeV, which is around two and a half times the uncertainty of the ATLAS result. These shifts change the distribution by only a few tenths of a percent, illustrating the difficulty involved. (source)

The Future

A future electron-positron collider would be able to measure the W mass extremely precisely using an alternative method. Instead of looking at the W’s decay, the mass could be measured through its production, by scanning the energy of the electron beams very close to the threshold for producing two W bosons. This method should offer precision significantly better than even this CDF result. However, any measurement from a possible future electron-positron collider won’t come for at least a decade.

In the coming months, expect this new CDF measurement to receive a lot of buzz. Experimentalists will be poring over the details, trying to figure out why it is in tension with previous measurements, and working hard to produce new measurements from LHC data. Meanwhile, theorists will write a bunch of papers detailing what new particles could explain the discrepancy and whether there is a connection to other outstanding anomalies (like the muon g-2). But the big question, whether we are seeing the first real crack in the Standard Model or there is some mistake in one or more of the measurements, is unlikely to be answered for a while.

If you want to learn about how the measurement actually works, check out this sister post!

Read More:

CERN Courier, “CDF sets W mass against the Standard Model”

Blog post on the CDF result from an (ATLAS) expert on W mass measurements “[Have we] finally found new physics with the latest W boson mass measurement?”

PDG Review, “Electroweak Model and Constraints on New Physics”

Measuring the Tau’s g-2 Too

Title: New physics and tau g-2 using LHC heavy ion collisions

Authors: Lydia Beresford and Jesse Liu

Reference: https://arxiv.org/abs/1908.05180

Since April, particle physics has been going crazy with excitement over the recent announcement of the muon g-2 measurement, which may be our first laboratory hint of physics beyond the Standard Model. The paper with the new measurement has racked up over 100 citations in the last month. Most of these papers are theorists proposing various models to try and explain the (controversial) discrepancy between the measured value of the muon’s magnetic moment and the Standard Model prediction. The sheer number of papers shows there are many, many models that can explain the anomaly. So if the discrepancy is real, we are going to need new measurements to whittle down the possibilities.

Given that the current deviation is in the magnetic moment of the muon, one very natural place to look next is the magnetic moment of the tau lepton. The tau, like the muon, is a heavier cousin of the electron. It is the heaviest lepton, coming in at 1.78 GeV, around 17 times heavier than the muon. In many models of new physics that explain the muon anomaly, the shift in the magnetic moment of a lepton is proportional to the mass of the lepton squared. This would explain why we are seeing a discrepancy in the muon’s magnetic moment and not the electron’s (though there is actually currently a small hint of a deviation for the electron too). It also means the tau should be around 280 times more sensitive than the muon to the new particles in these models. The trouble is that the tau has a much shorter lifetime than the muon, decaying away in just 10⁻¹³ seconds. This means that the techniques used to measure the muon’s magnetic moment, based on magnetic storage rings, won’t work for taus.
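The quoted factor of roughly 280 follows directly from the mass-squared scaling:

```python
# Mass-squared scaling of the magnetic-moment shift: the tau-to-muon
# sensitivity enhancement is just the squared mass ratio.
m_tau = 1776.86  # tau mass in MeV
m_mu = 105.66    # muon mass in MeV
enhancement = (m_tau / m_mu) ** 2  # roughly 280
```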

That’s where this new paper comes in. It details a new technique to try and measure the tau’s magnetic moment using heavy ion collisions at the LHC. The technique is based on light-light collisions (previously covered on Particle Bites), where two nuclei emit photons that then interact to produce new particles. Though in classical electromagnetism light doesn’t interact with itself (the beams from two spotlights pass right through each other), at very high energies each photon can split into new particles, like a pair of tau leptons, and then those particles can interact. Though the LHC normally collides protons, it also has runs colliding heavier nuclei like lead. Lead nuclei have more charge than protons, so they emit high energy photons more often and produce more light-light collisions.
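The gain from lead can be made concrete. In the equivalent-photon picture, each nucleus's photon flux grows roughly with the square of its charge, so the rate of photon-photon collisions grows roughly as Z to the fourth power, a huge factor for lead (the exact enhancement depends on kinematics; this is the rough scaling only):

```python
# In the equivalent-photon picture, each nucleus's photon flux scales
# roughly as its charge squared, so the photon-photon collision rate
# scales roughly as Z**4. For lead (Z = 82) relative to a proton (Z = 1):
Z_lead = 82
enhancement = Z_lead ** 4  # about 4.5e7
```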

Light-light collisions which produce tau leptons provide a nice environment in which to study the interaction of the tau with the photon. A particle’s magnetic properties are determined by its interactions with photons, so by studying these collisions you can measure the tau’s magnetic moment.

However, studying this process is easier said than done. These light-light collisions are “ultra-peripheral” because the lead nuclei are not colliding head on, and so the taus produced generally don’t have a large amount of momentum away from the beamline. This can make them hard to reconstruct in detectors, which have been designed to measure particles from head-on collisions that typically have much more momentum. Taus can decay in several different ways, but they always produce at least one neutrino, which will not be detected by the LHC experiments, further reducing the amount of detectable momentum and meaning some information about the collision will be lost.

However, one nice thing about these events is that they should be quite clean in the detector. Because the lead nuclei remain intact after emitting the photons, the taus won’t come along with the bunch of additional particles you often get in head-on collisions. The level of background processes that could mimic this signal also seems to be relatively minimal. So if the experimental collaborations spend some effort optimizing their reconstruction of low-momentum taus, it seems very possible to perform a measurement like this in the near future at the LHC.

The authors of this paper estimate that such a measurement with the currently available amount of lead-lead collision data would already supersede the previous best measurement of the tau’s anomalous magnetic moment, and further improvements could go much farther. Though the measurement of the tau’s magnetic moment would still be far less precise than those of the muon and electron, it could still reveal deviations from the Standard Model in realistic models of new physics. So given the recent discrepancy with the muon, the tau will be an exciting place to look next!

Read More:

An Anomalous Anomaly: The New Fermilab Muon g-2 Results

When light and light collide

Another Intriguing Hint of New Physics Involving Leptons

Letting the Machines Search for New Physics

Article: “Anomaly Detection for Resonant New Physics with Machine Learning”

Authors: Jack H. Collins, Kiel Howe, Benjamin Nachman

Reference: https://arxiv.org/abs/1805.02664

One of the main goals of the LHC experiments is to look for signals of physics beyond the Standard Model: new particles that may explain some of the mysteries the Standard Model doesn’t answer. The typical way this works is that theorists come up with a new particle that would solve some mystery and spell out how it interacts with the particles we already know about. Then experimentalists design a strategy for searching for evidence of that particle in the mountains of data that the LHC produces. So far none of the searches performed in this way have seen any definitive evidence of new particles, leading experimentalists to rule out a lot of the parameter space of theorists’ favorite models.

A summary of the searches the ATLAS collaboration has performed. The left columns show the model being searched for, what experimental signature was looked at, and how much data has been analyzed so far. The colored bars show the regions that have been ruled out based on the null result of each search. As you can see, we have already covered a lot of territory.

Despite this extensive program of searches, one might wonder if we are still missing something. What if there were a new particle in the data, waiting to be discovered, but theorists haven’t thought of it yet, so it hasn’t been looked for? This gives experimentalists a very interesting challenge: how do you look for something new when you don’t know what you are looking for? One approach, which Particle Bites has talked about before, is to look at as many final states as possible, compare what you see in data to simulation, and look for any large deviations. This is a good approach, but it may be limited in its sensitivity to small signals. When a normal search for a specific model is performed, one usually makes a series of selection requirements on the data, chosen to remove background events and keep signal events. Nowadays these selection requirements are getting more complex, often using neural networks, a common type of machine learning model, trained to discriminate signal from background. Without some sort of selection like this you may miss a small signal within the large number of background events.

This new approach lets the neural network itself decide what signal to look for. It uses part of the data to train a neural network to find a signal, and then uses the rest of the data to actually look for that signal. This lets you search for many different kinds of models at the same time!

If that sounds like magic, let’s try to break it down. You have to assume something about the new particle you are looking for, and the technique here assumes it forms a resonant peak. This is a common assumption in searches: if a new particle were being produced in LHC collisions and then decaying, you would get an excess of events in which the invariant mass of its decay products has a particular value. So if you plotted the number of events in bins of invariant mass, you would expect a new particle to show up as a nice peak on top of a relatively smooth background distribution. This is a very common search strategy, often colloquially referred to as a ‘bump hunt’, and it was how the Higgs boson was discovered in 2012.
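A toy version of a bump hunt can be sketched in a few lines. Everything here is illustrative (a flat toy background, a hypothetical resonance at a made-up mass of 125, and a crude sideband interpolation); real searches fit the background shape and compute significances far more carefully:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a flat background plus a hypothetical resonance at 125.
background = rng.uniform(100.0, 180.0, size=80000)
signal = rng.normal(125.0, 2.0, size=800)
mass = np.concatenate([background, signal])

# Count events in a search window and compare to the background level
# interpolated from the sidebands on either side of the window.
lo, hi = 119.0, 131.0
observed = np.sum((mass >= lo) & (mass < hi))
left = np.sum((mass >= lo - 12.0) & (mass < lo))
right = np.sum((mass >= hi) & (mass < hi + 12.0))
expected = 0.5 * (left + right)  # sideband-interpolated background
naive_significance = (observed - expected) / np.sqrt(expected)
```

The point of the selection discussed above is to shrink `expected` while keeping most of the signal, which is what turns an invisible excess into a clear bump.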

A histogram showing the invariant mass of photon pairs. The Higgs boson shows up as a bump at 125 GeV. Plot from here

The other secret ingredient we need is the idea of Classification Without Labels (abbreviated CWoLa, pronounced like koala). The way neural networks are usually trained in high energy physics is with fully labeled simulated examples. The network is shown a set of examples and guesses which are signal and which are background. Using the true label of each event, the network is told which examples it got wrong, its parameters are updated accordingly, and it slowly improves. The crucial challenge when trying to train on real data is that we don’t know the true label of any of the data, so it’s hard to tell the network how to improve. Rather than trying to use true labels, the CWoLa technique uses mixtures of events. Let’s say you have two mixed samples of events, sample A and sample B, but you know that sample A has more signal events in it than sample B. Then, instead of trying to classify signal versus background directly, you can train a classifier to distinguish between events from sample A and events from sample B, and what that network will learn to do is distinguish between signal and background. You can actually show that the optimal classifier for distinguishing the two mixed samples is the same as the optimal classifier of signal versus background. Even more amazing, this technique works quite well in practice, achieving good results even when one of the samples contains only a few percent signal.
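A minimal toy illustration of the CWoLa idea, with one made-up discriminating feature and a tiny logistic regression standing in for the neural network (all numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: one feature, with "signal" centered at 2, "background" at 0.
def make_sample(n, sig_frac):
    n_sig = int(n * sig_frac)
    sig = rng.normal(2.0, 1.0, n_sig)
    bkg = rng.normal(0.0, 1.0, n - n_sig)
    return np.concatenate([sig, bkg])

sample_a = make_sample(5000, 0.10)  # mixed sample with 10% signal
sample_b = make_sample(5000, 0.00)  # background-like sample

# Train a tiny 1D logistic regression to separate sample A from
# sample B -- no true signal/background labels are ever used.
x = np.concatenate([sample_a, sample_b])
y = np.concatenate([np.ones(len(sample_a)), np.zeros(len(sample_b))])
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # sigmoid prediction
    w -= 0.1 * np.mean((p - y) * x)          # gradient descent step
    b -= 0.1 * np.mean(p - y)

# The A-vs-B classifier also separates *true* signal from background:
true_sig = rng.normal(2.0, 1.0, 2000)
true_bkg = rng.normal(0.0, 1.0, 2000)
score_sig = w * true_sig + b
score_bkg = w * true_bkg + b
# AUC as the fraction of signal/background pairs ranked correctly
auc = np.mean(score_sig[:, None] > score_bkg[None, :])
```

Even though the training labels only said "sample A" or "sample B", the learned score ends up ranking true signal above true background, which is the whole trick.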

An illustration of the CWoLa method. A classifier trained to distinguish between two mixed samples of signal and background events can learn to classify signal versus background. Taken from here

The technique described in the paper combines these two ideas in a clever way. Because we expect the new particle to show up in a narrow region of invariant mass, you can use some of your data to train a classifier to distinguish events in a given slice of invariant mass from other events. If there is no signal with a mass in that region, the classifier should essentially learn nothing, but if there is a signal there, the classifier should learn to separate signal and background. Then you can apply that classifier to select events in the rest of your data (which wasn’t used in the training) and look for a peak that would indicate a new particle. Because you don’t know ahead of time what mass a new particle should have, you scan over the whole range you have sufficient data for, looking for a new particle in each slice.
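The scan itself can be sketched schematically with toy data. In the real technique each window would get its own CWoLa classifier, trained on one part of the data and applied to the rest; here we simply count events per window to show how the sliding-window logic picks out the (hypothetical, made-up) resonance:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: flat background plus a hypothetical resonance at 125.
mass = np.concatenate([
    rng.uniform(100.0, 180.0, size=40000),
    rng.normal(125.0, 2.0, size=600),
])

# Slide a window across the full mass range and look for a local excess.
windows = [(float(m), float(m + 10)) for m in range(100, 180, 10)]
counts = np.array([np.sum((mass >= lo) & (mass < hi)) for lo, hi in windows])
excess = counts - np.median(counts)  # crude estimate of the flat background
best_window = windows[int(np.argmax(excess))]  # window with largest excess
```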

The specific case they use to demonstrate the power of this technique is new particles decaying to pairs of jets. On the surface, jets, the large sprays of particles produced when a quark or gluon is made in an LHC collision, all look the same. But the insides of jets, their sub-structure, can contain very useful information about what kind of particle produced them. If a newly produced particle decays into other particles, like top quarks, W bosons, or a new BSM particle, before decaying into quarks, then there will be a lot of interesting sub-structure in the resulting jet, which can be used to distinguish it from regular jets. In this paper the neural network uses information about the sub-structure of both of the jets in an event to determine whether the event is signal-like or background-like.

The authors test out their new technique on a simulated dataset containing some events where a new particle is produced and a large number of QCD background events. They train a neural network to distinguish events in a window of invariant mass of the jet pair from other events. With no selection applied, there is no visible bump in the dijet invariant mass spectrum. With their technique they are able to train a classifier that rejects enough background that a clear mass peak of the new particle shows up. This shows that you can find a new particle without relying on a search for a particular model, allowing you to be sensitive to particles overlooked by existing searches.

Demonstration of the bump hunt search. The shaded histogram is the amount of signal in the dataset. The different levels of blue points show the data remaining after applying tighter and tighter selections based on the neural network classifier score. The red line is the predicted number of background events based on fitting the sideband regions. One can see that for the tightest selection (bottom set of points), the data forms a clear bump over the background estimate, indicating the presence of a new particle.

This paper was one of the first to really demonstrate the power of machine-learning-based searches. There is currently a competition being held to inspire researchers to try out other techniques on a mock dataset, so expect to see more new search strategies utilizing machine learning released soon. Of course the real excitement will come when a search like this is applied to real data, and we can see whether machines can find new physics that we humans have overlooked!

Read More:

  1. Quanta Magazine Article “How Artificial Intelligence Can Supercharge the Search for New Particles”
  2. Blog Post on the CWoLa Method “Training Collider Classifiers on Real Data”
  3. Particle Bites Post “Going Rogue: The Search for Anything (and Everything) with ATLAS”
  4. Blog Post on applying ML to top quark decays “What does Bidirectional LSTM Neural Networks has to do with Top Quarks?”
  5. Extended Version of Original Paper “Extending the Bump Hunt with Machine Learning”