The weather and climate science AI revolution isn’t revolutionary

I get the potential for these tools to model numerous different scenarios more efficiently. But doesn't that still require an infinite stream of new data to help with those predictions?

So how is any of this going to help when the US has decided to regress back to the stone age* -at least when it comes to weather reporting- and the official stance is tantamount to a toddler covering their eyes and thinking themselves invisible.

*That AI surveillance though. So hot right now.
 
22 (26 / -4)
Thanks for a sober look without the hysteria that's sweeping tech forums right now in actually using more precise terms than conflating "AI" with the subset of "language models". AI/ML has been around in various forms with various subcategories for decades. It's just that the LLM evangelists/demonizers are currently sucking all the oxygen out of the room, with no middle ground. Indeed, many advanced signal processing systems incorporate some form of machine learning; it just hasn't historically been called that because of the backlash from the late-'80s cycle of AI hype. We'll probably end up in a similar marketing down cycle at some point, where the marketing shifts away from the current conflation trend and calls the technologies and techniques that survive the current cycle something else yet again.

But yeah, that ML caveat is the same one that's always been there: a learning algorithm learns from what it's seen before, but usually can't intuit an outlying event's severity. The tendency is to underestimate the problem or, in some cases where there's a circuit breaker somewhere in the code, bail out in the computer algorithmic equivalent of a wail of "I DON'T KNOW! DADDY HELP!" (aka they exit simulation with an error equivalence).
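
A toy illustration of that underestimation, as a sketch only: it assumes a hand-rolled nearest-neighbour learner (`knn_predict`, a hypothetical helper) and made-up data, not anything a real forecasting system uses. A learner that averages what it has seen cannot predict anything more extreme than its most extreme training example.

```python
def knn_predict(train, x, k=3):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Hypothetical training data: the true relationship is y = x**2,
# but the learner has only ever seen x between 0 and 10.
train = [(float(x), float(x) ** 2) for x in range(0, 11)]

inside = knn_predict(train, 5.0)     # within the range seen in training
outlier = knn_predict(train, 20.0)   # far outside anything seen before
true_outlier = 20.0 ** 2             # what actually happens at x = 20

print(f"inside={inside:.1f} outlier={outlier:.1f} truth={true_outlier:.1f}")
```

The outlier prediction is capped by the largest target in the training set, so the severity of the unseen event is badly underestimated.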
 
42 (45 / -3)
Thanks for a sober look without the hysteria that's sweeping tech forums right now in actually using more precise terms than conflating "AI" with the subset of "language models". AI/ML has been around in various forms with various subcategories for decades.

In many cases it has been around substantially longer than the 'machines'. Obviously being able to do math in bulk makes certain techniques more attractive and/or increases the amount of spam you want to classify as such; but there's a reason why a lot of techniques are named after statisticians/high-functioning gambling addicts who were dead before the turn of the 19th century.

It's not the most pernicious product of the craze; but the marketing redefinition of 'AI' to mean "LLM chatbots; and some image classifiers when we are doing surveillance" has been a very much deliberate disaster for cogent discussion of the area. It has also been shamelessly exploited by botherds who cry out for the future of AI science when anything threatens their plan to cash in on conversational chatbots.
 
35 (35 / 0)

Demios

Smack-Fu Master, in training
12
Predicting the future with accuracy…aka computational prophesy is HARD.
Predicting the future with accuracy isn't hard when it comes to ML. Predicting the future with HIGH accuracy isn't "hard" either. What is hard is predicting with near perfect accuracy (which would worry a researcher anyway), that said high and near perfect accuracy can still be catastrophically bad depending on the context.
 
-14 (0 / -14)
Good article, but I'm surprised it didn't mention Functional Generative Network (FGN) models such as Google's WeatherNext 2.

WeatherNext 2 is a transformer-based FGN that uses an attention-based architecture closely related to those that underpin modern LLMs!

Per Google, it outperforms its predecessor on 99.9% of variables/lead times while generating large ensembles in a fraction of the time required by traditional numerical weather prediction systems.

"Our results show FGN offers substantial improvements over previous ML-based probabilistic weather models, and sets a new state-of-the-art in ensemble forecasting."
https://arxiv.org/html/2506.10772v1

Of course researchers have been applying ML in weather forecasting for decades - but that's not the breakthrough: the breakthrough is that large attention-based architectures, originally developed for language, turned out to be quite good at learning the dynamics of chaotic physical systems to the point where they outperform existing methods rather handily.

EDIT: I missed linking the study - no idea why this is being downvoted without any replies - but I guess this was read as a pro-LLM comment hence warranting a reflexive downvote.
 
-2 (9 / -11)
Predicting the future with accuracy isn't hard when it comes to ML. Predicting the future with HIGH accuracy isn't "hard" either. What is hard is predicting with near perfect accuracy (which would worry a researcher anyway), that said high and near perfect accuracy can still be catastrophically bad depending on the context.
With something like the weather--it is hard.

What weather happens in the future is dependent on what is happening now. And while we do have "a lot" of data about now, we really don't--not on a global scale for something like weather forecasting. The US weather balloon program is only launching balloons from selected sites every 12 hours (or less) from many miles apart. (Oh, and 2/3 of Earth is ocean, from which we have basically no weather balloon measurements at all.) We have satellites and radar--but the resolution still isn't very good--especially for isolated storms.

And even if we did get a lot of weather data--we'd be drowning in so much of it that doing compute with it gets unwieldy.

The only kind of forecasting more methodologically problematic...is the econometrics work the Federal Open Market Committee needs to do to pilot interest rates. All the markers they're basing decisions off of are 3-12 months out of date. And are prone to massive revisions. Imagine trying to fly a plane through the Grand Canyon with a GPS that only told you where you were a half hour ago with +/-100% accuracy. Which made the "soft landing" scenario under Biden supremely unlikely IMHO but Powell basically did it.
 
19 (19 / 0)

JaneDoe

Ars Tribunus Militum
1,535
Subscriptor
“If you have a model that’s now 100x faster and requires much less compute to run, how might one use it differently from the models that require hours on a big supercomputer to run?”

You could run multiple simulations with noise (not random, but smart to account for correlations and error bars) added to the inputs to get more robust predictions. At least that is one way my company makes sure no CPU core stays idle.
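
A minimal sketch of that kind of perturbed-input ensemble, under loud assumptions: `toy_forecast` is a hypothetical stand-in for a fast emulator, and the error bars (`sigma_t`, `sigma_p`) and correlation (`rho`) are invented for illustration. The "smart" part here is only that the two input errors are drawn with a specified correlation, using the 2x2 Cholesky factor written out by hand.

```python
import math
import random
import statistics

def correlated_noise(rng, sigma_t, sigma_p, rho):
    """Draw one (temperature, pressure) perturbation with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    d_temp = sigma_t * z1
    d_pres = sigma_p * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return d_temp, d_pres

def toy_forecast(temp_c, pressure_hpa):
    """Hypothetical stand-in for a fast ML emulator (not a real weather model)."""
    return temp_c + 0.05 * (1013.0 - pressure_hpa) + 0.01 * temp_c ** 2

def run_ensemble(temp_c, pressure_hpa, n_members=500, seed=0):
    """Run the model once per perturbed input and summarize the spread."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_members):
        d_temp, d_pres = correlated_noise(rng, sigma_t=0.5, sigma_p=2.0, rho=-0.6)
        outputs.append(toy_forecast(temp_c + d_temp, pressure_hpa + d_pres))
    return statistics.mean(outputs), statistics.stdev(outputs)

mean, spread = run_ensemble(20.0, 1005.0)
print(f"ensemble mean={mean:.2f}, spread={spread:.2f}")
```

The spread is the useful part: it gives an error bar on the forecast that a single run cannot, and a model that is 100x cheaper makes hundreds of members affordable.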
 
2 (2 / 0)

Troper1138

Wise, Aged Ars Veteran
152
Subscriptor
Given the very real concerns about the huge power needs of "A.I." (and therefore the huge water needs, for cooling) it was interesting to me to see some cases where "A.I."/machine learning can actually use less energy than more conventional forms of computing to produce (possibly) comparable results. In spite of all the hype about "A.I.", maybe there is some there there.
 
19 (19 / 0)

Lexus Lunar Lorry

Ars Scholae Palatinae
929
Subscriptor++
Given the very real concerns about the huge power needs of "A.I." (and therefore the huge water needs, for cooling) it was interesting to me to see some cases where "A.I."/machine learning can actually use less energy than more conventional forms of computing to produce (possibly) comparable results. In spite of all the hype about "A.I.", maybe there is some there there.
That's the saddest part of the LLM bubble to me. We have a technology that could revolutionize natural language parsing and data retrieval and scientific modeling, yet we choose to use it to fill the Internet with AI slop and turn white collar workers into reverse centaurs.
 
16 (20 / -4)
With something like the weather--it is hard.

What weather happens in the future is dependent on what is happening now. And while we do have "a lot" of data about now, we really don't--not on a global scale for something like weather forecasting. The US weather balloon program is only launching balloons from selected sites every 12 hours (or less) from many miles apart. (Oh, and 2/3 of Earth is ocean, from which we have basically no weather balloon measurements at all.) We have satellites and radar--but the resolution still isn't very good--especially for isolated storms.

And even if we did get a lot of weather data--we'd be drowning in so much of it that doing compute with it gets unwieldy.

The only kind of forecasting more methodologically problematic...is the econometrics work the Federal Open Market Committee needs to do to pilot interest rates. All the markers they're basing decisions off of are 3-12 months out of date. And are prone to massive revisions. Imagine trying to fly a plane through the Grand Canyon with a GPS that only told you where you were a half hour ago with +/-100% accuracy. Which made the "soft landing" scenario under Biden supremely unlikely IMHO but Powell basically did it.
I'm still not entirely convinced he did do it. Ironically, I think the hype and hysteria over AI that this article is discussing is the reason he "did it" - the sheer quantity of money flying around the market over AI and AI-related advancements is really hard to fathom and has arguably buoyed the economy during a time when we should be seeing some disruption.
 
0 (0 / 0)

danan

Ars Scholae Palatinae
698
Subscriptor
That's the saddest part of the LLM bubble to me. We have a technology that could revolutionize natural language parsing and data retrieval and scientific modeling, yet we choose to use it to fill the Internet with AI slop and turn white collar workers into reverse centaurs.
I was under the impression that technology was being used for natural language parsing, et al., and providing revolutionary results. Granted, a lot of the energy and expense seems to be going to the AI slop, but the other uses are there as well.
 
1 (1 / 0)

linnen

Ars Tribunus Militum
2,867
Subscriptor
With something like the weather--it is hard.

What weather happens in the future is dependent on what is happening now. And while we do have "a lot" of data about now, we really don't--not on a global scale for something like weather forecasting. The US weather balloon program is only launching balloons from selected sites every 12 hours (or less) from many miles apart. (Oh, and 2/3 of Earth is ocean, from which we have basically no weather balloon measurements at all.) We have satellites and radar--but the resolution still isn't very good--especially for isolated storms.

And even if we did get a lot of weather data--we'd be drowning in so much of it that doing compute with it gets unwieldy.

The only kind of forecasting more methodologically problematic...is the econometrics work the Federal Open Market Committee needs to do to pilot interest rates. All the markers they're basing decisions off of are 3-12 months out of date. And are prone to massive revisions. Imagine trying to fly a plane through the Grand Canyon with a GPS that only told you where you were a half hour ago with +/-100% accuracy. Which made the "soft landing" scenario under Biden supremely unlikely IMHO but Powell basically did it.
Cuts to science programs are already cutting into balloon readings. Ocean readings will be killed as the NOAA buoys are getting pulled out.
 
10 (11 / -1)

Major Major

Ars Praetorian
559
Subscriptor
Cuts to science programs are already cutting into balloon readings. Ocean readings will be killed as the NOAA buoys are getting pulled out.
Much easier to sharpie up some fake data, slop out politically acceptable forecasts, and threaten anyone who believes their lying eyes over dear leader’s preferred narrative. Lysenkoism is alive and well in the USA.
 
6 (8 / -2)

DDopson

Ars Praefectus
3,014
Subscriptor++
This part of the article is incorrect:
A common method is backpropagation, which identifies the data that had the most leverage on a given prediction.
Back propagation is the core mathematical foundation for how we train models, not just the “transformer” models behind today's LLMs, but also the simple neural networks and LSTMs that preceded transformers, convolutional image models, models that fold proteins, and pretty much everything that gets “trained”, with a notable exception for Geoffrey Hinton's research on forward-forward learning. The word "model" is enormously broad, able to accommodate plenty of things that don't train with backprop, such as the linear regression fits we all learned in high school, or their higher dimensional equivalents. Likewise, large parts of a climate model are going to be programmed directly based on first principles rather than being learned from a dataset.

Back propagation doesn’t tell us “[which] data had the most leverage on a given prediction”, not in the sense of looking at a test time prediction and determining which training examples influenced that outcome. Backpropagation runs on every training example and computes a “gradient” over the space of all learned parameters saying in which direction we should adjust the parameters to most quickly reduce the training loss on the example that was just evaluated. This gradient is scaled by a “learning rate” and we adjust all parameters in the model by the resulting amount before processing the next batch of examples (real optimizers are more sophisticated than this, with momentum, and sometimes second order derivatives, but the core intuition holds). These gradients are enormous, roughly the same size as the model being trained, so it would be prohibitively expensive to keep them around for every example in the training data, and anyways the gradients from earlier training examples become increasingly meaningless as the model evolves. If we recompute those examples later in training, we would get a different gradient. It’s not feasible to use back prop gradients to determine which training examples influenced a prediction. Not without enough additional cleverness that the technique would have its own name.
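
To make the update rule concrete, here is a rough sketch, assuming a one-feature linear model and a made-up dataset (all names are hypothetical). For a model this small the "backward pass" is just the chain rule written by hand; real frameworks automate it and real optimizers add momentum and more.

```python
def loss_and_grad(w, b, x, y):
    """Squared error of a one-feature linear model, plus its gradient."""
    err = (w * x + b) - y
    return err * err, 2.0 * err * x, 2.0 * err

# Hypothetical training set drawn from y = 3x + 1.
data = [(x, 3.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

w, b = 0.0, 0.0
learning_rate = 0.05

# Gradient for the first example before any training has happened.
first_grad = loss_and_grad(w, b, *data[0])[1]

for epoch in range(200):
    for x, y in data:
        _, grad_w, grad_b = loss_and_grad(w, b, x, y)
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

# The same example now gives a different gradient because the parameters moved.
later_grad = loss_and_grad(w, b, *data[0])[1]
print(f"w={w:.3f} b={b:.3f} first_grad={first_grad:.1f} later_grad={later_grad:.6f}")
```

Note that `first_grad` and `later_grad` come from the same training example, which is the point about stored gradients going stale as the model evolves.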

A weaker claim would be that the gradient does sort of tell you which parts of the input influenced the outcome in which direction, but it does this only for each dimension individually, and models are notoriously nonlinear so the gradient is only valid for small perturbations around the values that were computed. This means that even if a word from the input has dimensional values that almost exclusively have a negative gradient, suggesting that if you reduce the values of those dimensions slightly, then the output will move in the direction of the training goal, that doesn’t necessarily predict what happens if you remove that word from the input. The nonlinear effects could cause the output to go in the opposite direction of what the gradient suggested. Or you could overshoot and be wrong in the opposite direction. Anything could happen and you must compute that alternative input to know for sure.
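
A one-dimensional sketch of that failure mode, assuming a deliberately simple made-up function (`model` is hypothetical, not any real network): the gradient correctly predicts a small nudge, but extrapolating it all the way to "feature removed" gets the sign of the change wrong.

```python
def model(x):
    """Hypothetical nonlinear 'model' of a single input feature."""
    return (x - 1.5) ** 2

def grad(x):
    """Analytic derivative of model at x."""
    return 2.0 * (x - 1.5)

x0 = 2.0
baseline = model(x0)
g = grad(x0)                              # positive: lowering x a little lowers the output

small_step = model(x0 - 0.1)              # tiny perturbation, the gradient is trustworthy
linear_guess = baseline + g * (0.0 - x0)  # first-order prediction for x = 0
removed = model(0.0)                      # actually zeroing out the feature

print(f"baseline={baseline:.2f} small_step={small_step:.2f} "
      f"linear_guess={linear_guess:.2f} removed={removed:.2f}")
```

The linear guess says removal should push the output down; recomputing the model shows it goes up instead, which is why the alternative input has to be evaluated directly.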

The article was decent overall but there are a few parts that feel like “buzzword bingo” in that they are name checking technical terms of art, but using them in a way that is almost unrecognizable. The video linked in that quote is a great intro lecture by perhaps the best math communicator in the world; highly recommended.
 
12 (12 / 0)

PhaseShifter

Ars Tribunus Angusticlavius
8,143
Subscriptor++
Cuts to science programs are already cutting into balloon readings. Ocean readings will be killed as the NOAA buoys are getting pulled out.
I don't think you understand the current administration. My guess is, they won't be yanking the buoys out of the ocean. Based on my observations of this administration in action, they'll be weighting the buoys down with plastic bottles full of mercury to sink them to the bottom of the ocean.
 
-2 (1 / -3)

Navalia Vigilate

Ars Praefectus
3,169
Subscriptor++
I get the potential for these tools to model numerous different scenarios more efficiently. But doesn't that still require an infinite stream of new data to help with those predictions?

So how is any of this going to help when the US has decided to regress back to the stone age* -at least when it comes to weather reporting- and the official stance is tantamount to a toddler covering their eyes and thinking themselves invisible.

*That AI surveillance though. So hot right now.
The scientist in the article noting that as the climate warms the atmosphere will become thicker, and thus have cloud heights higher than ever before recorded, is a good indicator that if we reduce scientific measuring of the climate now, existing models will degrade in accuracy over time due to a lack of new data about these atmospheric changes.

No reason to pull out a soap box at this point in the comment, we all know the consequences.
 
2 (2 / 0)

PhaseShifter

Ars Tribunus Angusticlavius
8,143
Subscriptor++
The scientist in the article noting that as the climate warms the atmosphere will become thicker, and thus have cloud heights higher than ever before recorded, is a good indicator that if we reduce scientific measuring of the climate now, existing models will degrade in accuracy over time due to a lack of new data about these atmospheric changes.

No reason to pull out a soap box at this point in the comment, we all know the consequences.
For a certain subset of the population, it's a self-solving problem.

Step 1: Claim "if you can't predict the weather in a week, you can't predict the weather in 100 years."
Step 2: Step up greenhouse gas releases until current short-term weather models' failure rates drastically increase.
Step 3: Claim that failure as proof that climatologists don't know what they're talking about.
 
0 (1 / -1)
There are also techniques that can make the black box a bit more transparent—often described as “explainable AI.” A common method is backpropagation, which identifies the data that had the most leverage on a given prediction. To return to our bird identification model, backpropagation can work backward from its prediction that your photo contained a Northern Cardinal to highlight the specific pixels that clinched that classification.

Backpropagation isn't an explainability technique: it's the algorithm used to compute gradients during training.
 
8 (8 / 0)

Veritas super omens

Ars Legatus Legionis
26,734
Subscriptor++
NO. This is a blatant abuse of our precious compute resources. I demand Congress pass a law that prohibits this type of thing. We must make sure our compute is used for the purpose that God intended. Which is to sell each and every American an exponentially larger pile of plastic crap over time.


/s
 
-1 (1 / -2)
“If someone gave us fifty GPUs for two months, we could just make a huge amount of progress,”

Maybe ask a big tech company if they can borrow a few GPUs! I'm sure they'd be happy to help. They have hundreds of thousands sitting in warehouses because they bought up the entire market with nowhere to put them, and now they're all gathering dust while everyone else can't afford the equipment they need.
 
0 (2 / -2)

ScottJohnson

Ars Tribunus Militum
2,846
Subscriptor
This part of the article is incorrect:

Back propagation is the core mathematical foundation for how we train models, not just the “transformer” models behind today's LLMs, but also the simple neural networks and LSTMs that preceded transformers, convolutional image models, models that fold proteins, and pretty much everything that gets “trained”, with a notable exception for Geoffrey Hinton's research on forward-forward learning. The word "model" is enormously broad, able to accommodate plenty of things that don't train with backprop, such as the linear regression fits we all learned in high school, or their higher dimensional equivalents. Likewise, large parts of a climate model are going to be programmed directly based on first principles rather than being learned from a dataset.

Back propagation doesn’t tell us “[which] data had the most leverage on a given prediction”, not in the sense of looking at a test time prediction and determining which training examples influenced that outcome. Backpropagation runs on every training example and computes a “gradient” over the space of all learned parameters saying in which direction we should adjust the parameters to most quickly reduce the training loss on the example that was just evaluated. This gradient is scaled by a “learning rate” and we adjust all parameters in the model by the resulting amount before processing the next batch of examples (real optimizers are more sophisticated than this, with momentum, and sometimes second order derivatives, but the core intuition holds). These gradients are enormous, roughly the same size as the model being trained, so it would be prohibitively expensive to keep them around for every example in the training data, and anyways the gradients from earlier training examples become increasingly meaningless as the model evolves. If we recompute those examples later in training, we would get a different gradient. It’s not feasible to use back prop gradients to determine which training examples influenced a prediction. Not without enough additional cleverness that the technique would have its own name.

A weaker claim would be that the gradient does sort of tell you which parts of the input influenced the outcome in which direction, but it does this only for each dimension individually, and models are notoriously nonlinear so the gradient is only valid for small perturbations around the values that were computed. This means that even if a word from the input has dimensional values that almost exclusively have a negative gradient, suggesting that if you reduce the values of those dimensions slightly, then the output will move in the direction of the training goal, that doesn’t necessarily predict what happens if you remove that word from the input. The nonlinear effects could cause the output to go in the opposite direction of what the gradient suggested. Or you could overshoot and be wrong in the opposite direction. Anything could happen and you must compute that alternative input to know for sure.

The article was decent overall but there are a few parts that feel like “buzzword bingo” in that they are name checking technical terms of art, but using them in a way that is almost unrecognizable. The video linked in that quote is a great intro lecture by perhaps the best math communicator in the world; highly recommended.
Thanks for catching this, I was trying to refer to the layer-wise relevance propagation technique covered in the linked "explainable AI" paper and got my terms twisted:

"Given an input sample and an output pixel, LRP reveals which features in the input contributed the most in deriving the value of the output. This is accomplished by sequentially propagating backwards the relevance from the output pixel to the neurons of the previous layers and eventually to the input features."
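
For anyone curious what that backward relevance pass looks like, here is a stripped-down sketch. It assumes the epsilon rule (one common LRP variant), a tiny bias-free ReLU network, and invented weights; it is an illustration of the idea, not the implementation from the linked paper.

```python
def matvec(weights, vec):
    """weights[j][k] connects input j to output k."""
    n_out = len(weights[0])
    return [sum(vec[j] * weights[j][k] for j in range(len(vec))) for k in range(n_out)]

def relu(vec):
    return [max(0.0, v) for v in vec]

def lrp_layer(acts, weights, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs to its inputs (epsilon rule)."""
    z = matvec(weights, acts)  # pre-activations of the upper layer
    relevance_in = [0.0] * len(acts)
    for k, r_k in enumerate(relevance_out):
        denom = z[k] + (eps if z[k] >= 0 else -eps)  # stabilizer avoids dividing by zero
        for j, a_j in enumerate(acts):
            # Each input gets a share proportional to its contribution a_j * w_jk.
            relevance_in[j] += a_j * weights[j][k] / denom * r_k
    return relevance_in

# Hypothetical weights: 3 inputs -> 2 hidden units -> 1 output, no biases.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
W2 = [[1.5], [-0.5]]

x = [1.0, 2.0, 0.5]
hidden = relu(matvec(W1, x))
output = matvec(W2, hidden)

# Start with the output value as the total relevance and walk it back layer by layer.
r_hidden = lrp_layer(hidden, W2, output)
r_input = lrp_layer(x, W1, r_hidden)
print("output:", output, "input relevances:", r_input)
```

In this bias-free setup the input relevances sum (up to the tiny stabilizer) to the output value, which is the conservation property that distinguishes LRP from simply reading off raw gradients.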
 
4 (4 / 0)

Fatesrider

Ars Legatus Legionis
25,448
Subscriptor
It feels like there’s no escaping AI right now, whether you’re trying to type a sentence without being interrupted by a digital “assistant” or struggling to find a new refrigerator that doesn’t require a Wi-Fi connection for some reason. You’d be forgiven for wondering if we’re in the midst of a quantum leap in tech or whether people are just hyping up a heap of slop.
Thank you for this article. It's been needed for a while.

There's USEFUL AI and there's slop.

So it's both.

I generally categorize useful AI as ML or Machine Learning. It's an analysis tool, and a very useful one. It's been honed and refined for ages. Its only drawback comes when it's asked to be accurately predictive and has to rely on data that no longer reflects the established norm. So our weather reports (among other things) still are only 10% accurate 10 days out like they've been forever. But you probably will get the weather it predicts for the next day by about 80%, which is an improvement over the 50% it used to be not so long ago. (Recalling a weather report for a "light dusting of snow" in Portland, OR 30+ years ago and we got like 8 inches).

But "assistants" are not ML. They are LLM's, or variations on that theme. And we know how fucked up they can get.

So knowing the source of the analysis or prediction is a good thing for being able to judge the general reliability of that thing. ML, generally reliable. LLM's, generally unreliable.

Some of each are more reliable than others within their class, but overall, I think as a general assessment of each kind of "AI", each fits that paradigm.

The distinction is important for the trust factor. Knowing what you can and can't trust is getting harder to distinguish, mostly because the LLM slop (or variations on that theme) are being forced on a generally unwilling population. The ML thing has been "how predictive science is often done" for ages.

Granted, ML's predictive outcomes aren't always accurate, but it can only go on what we've seen. New shit will fuck it up for prediction every time. That's why the Paris Accords, when it was established, thought we'd be able to keep climate change from permanently going over 1.5 C by 2050. We didn't have enough data to know that so the predictive analysis there was WAY off.

Now they're predicting we'll do that this year, or next, as they discover El Nino is actually acting as a tipping point, and not just a cyclic global oceanic temperature phenomenon. So ML can improve with more recent data.

LLM's? I'm going to go with a standard "nope" on that.
 
-1 (2 / -3)

graylshaped

Ars Legatus Legionis
68,703
Subscriptor++
In many cases it has been around substantially longer than the 'machines'. Obviously being able to do math in bulk makes certain techniques more attractive and/or increases the amount of spam you want to classify as such; but there's a reason why a lot of techniques are named after statisticians/high-functioning gambling addicts who were dead before the turn of the 19th century.
I would argue with you over how gambling has influenced language, but I am enjoying a sandwich right now.

To the article, I appreciate the author taking a moment to add this simple sentence: "In all these models, “AI” refers to machine learning," to distinguish between weather forecasting and climate modeling, and to describe how the inherent data smoothing built into the technology can lead to dismissing potential for events that are wholly statistically valid and well within the capacity of the physical world to create.
 
2 (3 / -1)

linnen

Ars Tribunus Militum
2,867
Subscriptor
I don't think you understand the current administration. My guess is, they won't be yanking the buoys out of the ocean. Based on my observations of this administration in action, they'll be weighting the buoys down with plastic bottles full of mercury to sink them to the bottom of the ocean.
Is it one of those "They couldn't possibly plan/do that!" things?
Source.
Moving the objectives of Project 2025 one step closer to completion, the National Science Foundation is removing 900 ocean data collecting buoys that cost more than $370 million to install. If left in place, the buoys would have continued to provide climate-related data to scientific researchers for another 15 years - an outcome that NSF's plan will prevent, saving taxpayers nearly $50 million per year.
 
2 (3 / -1)

Purple Cow

Smack-Fu Master, in training
25
I get the potential for these tools to model numerous different scenarios more efficiently. But doesn't that still require an infinite stream of new data to help with those predictions?

So how is any of this going to help when the US has decided to regress back to the stone age* -at least when it comes to weather reporting- and the official stance is tantamount to a toddler covering their eyes and thinking themselves invisible.
It's going to help because the planet is a lot bigger than the United States, and a lot of scientists live and work elsewhere in the world. Some of the work described in the article is being done in Europe, and the US isn't the only country with satellite imaging, including weather satellites.

Yes, the current US administration is systematically destroying areas of science, and that's going to hurt research, especially in subjects that affect the US more than other parts of the world, or that the US was doing a lot of the work in, like tornado predictions. Climate change is a global issue, and everywhere in or near the Pacific or Indian Oceans needs to pay attention to tropical cyclones.
 
1 (1 / 0)
Backpropagation isn't an explainability technique: it's the algorithm used to compute gradients during training.

Yes it is.

The reason back propagation is used to train networks is because it is a mathematical way to measure how sensitive a given output is to changes in any given node/weight/input [1]. Knowing this sensitivity allows each weight to be adjusted during training but it can also sometimes provide some insight into which specific nodes/weights/inputs contribute to a given output. It may not, in general, provide great insight but it can sometimes be helpful.

[1] For those who've had calc III, it's just the multi-variable chain rule. The foundations of machine learning are remarkably simple.
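
A quick sanity-check sketch of the chain-rule point, assuming a made-up two-parameter network with a single tanh hidden unit (all names hypothetical): the hand-derived backward pass should agree with brute-force finite differences, for the weights and for the input alike.

```python
import math

def forward(w1, w2, x):
    h = math.tanh(w1 * x)   # hidden unit
    return w2 * h           # output

def backprop(w1, w2, x):
    """Gradients of the output w.r.t. w1, w2, and x via the chain rule."""
    h = math.tanh(w1 * x)
    dh_dpre = 1.0 - h * h          # derivative of tanh at the pre-activation
    d_w2 = h
    d_w1 = w2 * dh_dpre * x
    d_x = w2 * dh_dpre * w1
    return d_w1, d_w2, d_x

def finite_diff(f, args, i, step=1e-6):
    """Central-difference estimate of the partial derivative w.r.t. args[i]."""
    up = list(args)
    down = list(args)
    up[i] += step
    down[i] -= step
    return (f(*up) - f(*down)) / (2.0 * step)

args = (0.7, -1.3, 0.4)   # arbitrary values for w1, w2, x
analytic = backprop(*args)
numeric = tuple(finite_diff(forward, args, i) for i in range(3))
print("chain rule:", analytic)
print("finite diff:", numeric)
```

The third entry is the sensitivity of the output to the input, which is the quantity gradient-based saliency methods build on; the first two are what training uses.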
 
0 (2 / -2)
Yes it is.

The reason back propagation is used to train networks is because it is a mathematical way to measure how sensitive a given output is to changes in any given node/weight/input [1]. Knowing this sensitivity allows each weight to be adjusted during training but it can also sometimes provide some insight into which specific nodes/weights/inputs contribute to a given output. It may not, in general, provide great insight but it can sometimes be helpful.

[1] For those who've had calc III, it's just the multi-variable chain rule. The foundations of machine learning are remarkably simple.
As I wrote, backprop is the general algorithm for computing gradients through a network: those gradients can then be used for training, attribution/saliency maps etc.

That's a bit like saying 'the chain rule is an explainability technique' because some explainability methods rely on derivatives.
 
0 (0 / 0)
As I wrote, backprop is the general algorithm for computing gradients through a network: those gradients can then be used for training, attribution/saliency maps etc.

That's a bit like saying 'the chain rule is an explainability technique' because some explainability methods rely on derivatives.
Tell me you don't know what the chain rule is without telling me you don't know what the chain rule is.
 
0 (1 / -1)