In motorsport, there’s nowhere to hide as AI becomes new CFD tool

icosapode

Smack-Fu Master, in training
91
This isn't the LLM/image generator/agentic bullshit that we mostly have a problem with though, right? This is more what we used to just call "machine learning" - training specialized models on specific datasets for a particular purpose.

I have no problem with that. That's not what's pumping the nonsense bubble, except a little tangentially via the tendency to lump everything together under the broad heading of "AI".

I have had some very tiresome arguments with LLM boosters who enjoy conflating these things to construct strawmen.
 
Upvote
121 (126 / -5)
I just want to show my appreciation for the term "data hygiene", as a former medical research assistant, science teacher, and skeptic of all statistics (especially those used in advertising). Analysis is a waste of time without clean data.
[edit: That is, I like the implication that unclean data is kind of gross and embarrassing.]
 
Last edited:
Upvote
30 (32 / -2)

aposm

Smack-Fu Master, in training
96
I'm honestly a bit disappointed to see Ars covering stuff like this, which in many cases is deliberately muddying the waters, without properly clarifying the difference between these machine-learning-based simulations and "AI" as it now exists, primarily as a marketing term for the LLMs that a few doomsday-AGI cultists want us to believe are actually intelligent.
 
Upvote
3 (35 / -32)

BioRebel32

Seniorius Lurkius
43
Subscriptor++
So this LOOKS like they've used ML to create an advanced lookup table, something simulators have been using for years to replicate things like tire grip.
It still needs a LOT of ACTUAL data to be trained properly, though, and it's not super clear whether it would be able to understand a completely different part.
This does theoretically make it easier for teams to develop track-specific aero adjustments.
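The "advanced lookup table" idea above can be sketched very simply. This is a hypothetical illustration, not anything from the article: simulators have long stored quantities like tyre grip in a table and linearly interpolated between entries, and the grip numbers below are made up purely for demonstration.

```python
# Made-up tyre-grip lookup table: grip coefficient vs. slip angle (degrees).
slip_angles = [0.0, 2.0, 4.0, 6.0, 8.0]
grip_coeff  = [0.0, 0.6, 1.0, 1.1, 0.9]

def lookup_grip(angle):
    """Linear interpolation between the two nearest table entries,
    clamping to the table's ends outside its range."""
    if angle <= slip_angles[0]:
        return grip_coeff[0]
    if angle >= slip_angles[-1]:
        return grip_coeff[-1]
    for i in range(len(slip_angles) - 1):
        lo, hi = slip_angles[i], slip_angles[i + 1]
        if lo <= angle <= hi:
            t = (angle - lo) / (hi - lo)
            return grip_coeff[i] + t * (grip_coeff[i + 1] - grip_coeff[i])

print(lookup_grip(3.0))  # halfway between 0.6 and 1.0 -> 0.8
```

An ML surrogate is the same idea generalized to many input dimensions at once, with the "table entries" coming from full CFD runs.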
 
Upvote
-3 (8 / -11)

Qyygle

Ars Praetorian
493
Subscriptor
“It sounds magical, but the reality is that the accuracy of the model is only guaranteed within a specific range of situations that are not too far from what you have already explored,” Baqué told me. “So all the trick and the gap from the idea to the value is to find what are the right workflows, what kind of data do I need to generate to be able to explore what kind of configurations afterward in which type of setting, and how often do I need to retrain my model, all the data hygiene around the design workflow.”
Sounds like they're really using this for final tuning and fast comparisons on data gained from models that've run through the traditional means already.
Looks like a genuinely useful application of the technology for once.
Unless you go to a completely different design, which is probably not allowed within race spec, once you have a large enough dataset of the general shapes and situations needed, doing this will save hours of actual CFD modeling, where the majority of calculations essentially repeat work that's already been done, with the slight tweaks being what you're really after.

This being such a detail-oriented sport anyway, it's more than likely that teams would do full model runs on major design changes, reducing the potential downsides of running into circumstances not well represented in the existing data.
 
Upvote
20 (20 / 0)
Is this pattern matching for faster empirical results? In which case, if the underlying assumptions change (car -> truck), the data shape changes and the model is weaker.

Or did the AI actually generate faster CFD code/algorithms/optimizations that are inherently a lot faster and can generalize over all/many scenarios?
 
Upvote
0 (2 / -2)

TheOldChevy

Ars Tribunus Militum
1,566
Subscriptor
Before anyone starts complaining about AI using so much power:
The reason folks use AI here is because it is orders of magnitude less computationally expensive for these use cases.

Meaning faster results and more iterations.

AI is not always the heavier option :)
Because it stays in the "multi-dimensional" interpolation domain. And I agree that it is a good use of it. There are plenty of good uses of AI. Just not everything that is sold to us is a good use.
 
Upvote
18 (20 / -2)
Before anyone starts complaining about AI using so much power:
The reason folks use AI here is because it is orders of magnitude less computationally expensive for these use cases.

Meaning faster results and more iterations.

AI is not always the heavier option :)
I noticed that and I have a question: Why? What's special about this use case and this type of "AI" that makes it so much less computationally expensive than other options?
 
Upvote
2 (7 / -5)
”AI finds value in motorsport, multiplying limited computational fluid dynamics resources.”

Not addressing the point of the article, but my brain is stuck on what the difference would be with this subheading: “Motorsport finds value in AI, multiplying limited computational fluid dynamics resources.”
 
Upvote
3 (4 / -1)
I noticed that and I have a question: Why? What's special about this use case and this type of "AI" that makes it so much less computationally expensive than other options?
The cost is front-loaded in training with this tech, then each end result is very cheap.

The old way would be doing a full (heavy) simulation to get each end result.

With just a few full simulations, the old way might be cheaper, but the costs add up quickly, not just in energy usage but in time.
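The front-loaded-cost trade-off described above can be made concrete with some back-of-the-envelope arithmetic. All the numbers here are invented for illustration: an assumed per-run cost for full CFD, an assumed one-off training cost, and an assumed per-query cost for the trained model.

```python
# Hypothetical compute costs, in CPU-hours. None of these figures come from
# the article; they exist only to show where the break-even point falls.
FULL_SIM_HOURS = 10.0      # one traditional CFD run (assumed)
TRAINING_SET_RUNS = 500    # full sims needed to build the training dataset
TRAINING_HOURS = 2000.0    # one-off model training cost (assumed)
QUERY_HOURS = 0.01         # one surrogate-model prediction (assumed)

def cfd_only_cost(n_cases: int) -> float:
    """Run every design variation as a full simulation."""
    return n_cases * FULL_SIM_HOURS

def surrogate_cost(n_cases: int) -> float:
    """Pay the front-loaded dataset + training cost once, then query cheaply."""
    return TRAINING_SET_RUNS * FULL_SIM_HOURS + TRAINING_HOURS + n_cases * QUERY_HOURS

# Find the first case count where the front-loaded approach wins overall.
n = 1
while surrogate_cost(n) >= cfd_only_cost(n):
    n += 1
print(n)  # -> 701
```

Below the break-even point the old way is cheaper; well past it, each extra design variation costs almost nothing.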
 
Upvote
24 (24 / 0)

Berserk

Smack-Fu Master, in training
82
Subscriptor
Out of pure coincidence, I had a conversation with a colleague before leaving the office where he was doing exactly this: trying to replace a computationally expensive mechanistic model with a specifically trained AI model. If it's successful, then it becomes more useful for everyday modelling.
 
Upvote
20 (20 / 0)

rfcavity

Smack-Fu Master, in training
71
This is an interesting approach but in practice only works well when trying many small variations within an otherwise unchanged problem.

You need to generate a significant training set specific to the problem setup (which you do by running a bunch of actual CFD sims) and do the actual training. If you don’t plan on running significantly more sims than the size of the training set then you are actually much slower than just doing the CFD. Curiously, the paper does not report the time spent generating the dataset or on the training.

This is an issue because it’s really difficult to generalize the CFD outside of the training set geometries. They point to a slightly larger wing angle going outside the bounds of the data as being a big achievement here which is telling. A completely different body type would be a no go. If most of the wing angles are in the training set already, what are you gaining?
 
Upvote
13 (16 / -3)
I noticed that and I have a question: Why? What's special about this use case and this type of "AI" that makes it so much less computationally expensive than other options?
The other option is to iteratively evaluate the Navier-Stokes equations on every cell in a grid of a few million cells that describe the flow field around the car. Since the flow conditions in any one cell depend on the flow conditions in all the cells around it, you end up with millions of coupled equations and there's no way to solve that in one shot. You have to start with some assumed initial conditions, propagate that through the whole grid, then initialize again with those conditions, propagate that through the grid, repeat ad nauseam until you either (a) reach an iteration that has no significant difference from the previous iteration and is therefore probably close enough to what the real fluid flow would be, or (b) run out of compute credits and give up. The finer the grid, the more accurate the result, and the more computing power you need for each iteration.

For bonus fun, you can have spinny bits (turbomachinery), hot bits (combustors), changes in fluid composition (combustors again), compressible / supersonic flow, mixed-phase flow, several kinds of heat transfer..... a CFD sim of the guts of an airplane engine will tie up a team of engineers and a high-end supercomputer for an absurd amount of time.

CFD is one of a handful of problem domains that will happily consume ALL the computing power you are willing to throw at them, for as long as payments drawn on your bank account will clear. Weather forecasting and seismic data analysis are two others.

If you can give an AI model a few such simulations and say "now take your best guess at what happens if we try a case that's right here in the middle of the parameter space between them" and it can give you a close-enough answer, you're creating significant value for the engineers who need to do this kind of work.
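The iterate-until-converged loop described above has a classic toy analogue: Jacobi relaxation of Laplace's equation. Real CFD solves the far harder Navier-Stokes system, but the control flow is the same shape, so this sketch (my own illustration, not anything from the article) shows the mechanics: sweep the grid, compare with the previous iterate, and repeat until the change is negligible.

```python
import numpy as np

def jacobi_solve(n=32, tol=1e-6, max_iters=100_000):
    """Relax a scalar field on an n x n grid until successive sweeps
    stop changing meaningfully -- the '(a) no significant difference'
    exit condition from the comment above."""
    phi = np.zeros((n, n))
    phi[0, :] = 1.0  # fixed boundary condition along one edge
    for i in range(max_iters):
        new = phi.copy()
        # each interior cell becomes the average of its four neighbours
        new[1:-1, 1:-1] = 0.25 * (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                                  phi[1:-1, :-2] + phi[1:-1, 2:])
        if np.max(np.abs(new - phi)) < tol:
            return new, i  # converged
        phi = new
    return phi, max_iters   # the '(b) give up' exit

field, iters = jacobi_solve()
```

Even this trivial problem takes thousands of sweeps on a 32x32 grid; a real car-scale grid of millions of cells, with coupled velocity, pressure, and turbulence fields, is why the full simulations are so expensive.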
 
Last edited:
Upvote
52 (52 / 0)

Ben G

Ars Tribunus Militum
2,885
Subscriptor
The previous comment is an excellent one. Just an additional thought, though: I highly suspect that if these teams find something interesting using their ML models, they then verify it with a traditional CFD analysis.

A well-tuned ML model may let you explore the trade space quickly and narrow down to the changes that might give you an actual advantage on the track.
 
Upvote
17 (17 / 0)
The other option is to iteratively evaluate the Navier-Stokes equations on every cell in a grid of a few million cells that describe the flow field around the car. Since the flow conditions in any one cell depend on the flow conditions in all the cells around it, you end up with millions of coupled equations and there's no way to solve that in one shot. You have to start with some assumed initial conditions, propagate that through the whole grid, then initialize again with those conditions, propagate that through the grid, repeat ad nauseam until you either (a) reach an iteration that has no significant difference from the previous iteration and is therefore probably close enough to what the real fluid flow would be, or (b) run out of compute credits and give up. The finer the grid, the more accurate the result, and the more computing power you need for each iteration.

For bonus fun, you can have spinny bits (turbomachinery), hot bits (combustors), changes in fluid composition (combustors again), compressible / supersonic flow, mixed-phase flow, several kinds of heat transfer..... a CFD sim of the guts of an airplane engine will tie up a team of engineers and a high-end supercomputer for an absurd amount of time.

CFD is one of a handful of problem domains that will happily consume ALL the computing power you are willing to throw at them, for as long as payments drawn on your bank account will clear. Weather forecasting and seismic data analysis are two others.

If you can give an AI model a few such simulations and say "now take your best guess at what happens if we try a case that's right here in the middle of the parameter space between them" and it can give you a close-enough answer, you're creating significant value for the engineers who need to do this kind of work.
👍
And even transformer models (the architecture used by LLMs) can be useful for these tasks, for instance DeepMind's WeatherNext:
https://developers.google.com/weathernext/guides/models
 
Upvote
6 (6 / 0)
Before anyone starts complaining about AI using so much power:
The reason folks use AI here is because it is orders of magnitude less computationally expensive for these use cases.

Meaning faster results and more iterations.

AI is not always the heavier option :)
Using the model saves a lot of power for the constrained teams.
But training an accurate-enough model is extremely expensive, even when restricted to narrow use cases like these (likely less than the astronomical sums incurred by general LLMs, but by how many orders of magnitude?).
Which brings me to the important point: are the teams kinda cheating by outsourcing the hard computing cycles (training) so that they can run more sims under the cap?
 
Upvote
5 (5 / 0)
Before anyone starts complaining about AI using so much power:
The reason folks use AI here is because it is orders of magnitude less computationally expensive for these use cases.

Meaning faster results and more iterations.

AI is not always the heavier option :)
Dude, literally nobody is complaining about ML and other specialised and often quite helpful AI approaches using so much power. It's the generic useless LLMs that people complain about…
 
Upvote
-1 (4 / -5)
I noticed that and I have a question: Why? What's special about this use case and this type of "AI" that makes it so much less computationally expensive than other options?
@MMarsh's answer is spot on for the specific details of what this is replacing.
On a meta level, we have a process that the AI is replacing that we can use as a baseline energy cost, and the tighter constraints on what this AI is trying to do give it a more "reachable" endpoint than what LLMs and Image Generators are doing. And I say "reachable" because LLMs and Image generators are thrown at open-ended tasks on the basis of "The more compute we throw at this problem, the better!"

The engineers using this AI-powered simulation modeling are only trying to reach a target accuracy and error rate, whereas OpenAI is pumping energy into ChatGPT with a mantra of "Let's hook up another dozen datacenters and see what this baby can do!"
 
Upvote
4 (5 / -1)
This isn't the LLM/image generator/agentic bullshit that we mostly have a problem with though, right? This is more what we used to just call "machine learning" - training specialized models on specific datasets for a particular purpose.

I have no problem with that. That's not what's pumping the nonsense bubble, except a little tangentially via the tendency to lump everything together under the broad heading of "AI".

I have had some very tiresome arguments with LLM boosters who enjoy conflating these things to construct strawmen.
Yes, agreed, this is the old school deep learning stuff from early on before they slapped chat bots into it and told it to start writing and answering questions. This is the stuff it excels at, the stuff we saw early reports on that looks so promising. It's repeatedly mutating and iterating towards a solution and then saving the results.

In other words, this is exactly what this tech is best suited for, without an "agentic" in sight. (I'm purposefully misusing the stupid word the industry cooked up for their LLM agents, because it amuses me greatly.)
 
Upvote
16 (16 / 0)

Smeghead

Ars Praefectus
4,634
Subscriptor
Which brings me to the important point: are the teams kinda cheating by outsourcing the hard computing cycles (training) so that they can run more sims under the cap?
I can only find the sporting regulations from 2025, but Appendix 7 lays out the regulations for aero testing restrictions. Section 4 is titled "Restricted CFD (RCFD) Simulations".

It starts by talking about what is and isn't an RCFD, then goes into the details of how computing time is actually defined and measured.

This bit at the beginning stands out (emphasis mine):
For the avoidance of doubt, if any CFD simulation (other than the power unit simulation defined above) reveals information to a Competitor or to an Associate of the Competitor whether directly, via a contracted party or via an external entity working on behalf of a Competitor or for its own purposes and subsequently providing the results of its work to a Competitor, about flows that are gaseous on a F1 car then it is a RCFD simulation.

So, by definition, if it's not a CFD simulation, it can't be a restricted CFD, and so doesn't come under a team's limited CFD time.

Compute time for other tasks doesn't appear to be covered, and since taking the results of existing CFDs and then training models on them doesn't incur another CFD run, that would appear to be fair game. Training the models is undoubtedly very compute-intensive, but there's nothing in the (2025) rules that bans that.

So, typical behaviour in F1 - if the rules don't say you can't do that, then you damn well investigate whether doing that will result in an advantage. It might be this will be something that future rulesets will clamp down on, as it's probably going to become very widespread.
 
Upvote
13 (13 / 0)
I noticed that and I have a question: Why? What's special about this use case and this type of "AI" that makes it so much less computationally expensive than other options?
Gotta love Ars. You guys complain when the general public doesn't understand the nuances of AI, and then downvote a great question like this which serves as a perfect way to help explain some of those nuances to an interested party.

The answer, at a high level, is that CFD is already very computationally expensive at the fidelity you'd need for these sorts of simulations. The AI technique is essentially to learn a computational shortcut to achieve similar fidelity for at least a subset of simulations at much lower cost.

genAI in general is attempting to replicate something humans are already quite good at, just at massive scale. You're competing against the efficiency of a human brain at something it can generally do very well. And then there are all the moral implications.
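The "computational shortcut" framing above can be sketched in a few lines. This is a hypothetical illustration with a made-up stand-in function: treat `expensive_sim` as a placeholder for a full CFD run, sample it at some design points, fit a cheap model, and then query the model instead of re-running the "sim". A polynomial fit stands in for whatever model the teams actually use.

```python
import numpy as np

def expensive_sim(wing_angle):
    # Pretend this costs hours of compute per call; it's actually just a
    # smooth made-up curve standing in for a CFD result.
    return np.sin(wing_angle) + 0.1 * wing_angle**2

# Sample the "expensive" function at 20 design points in a narrow range.
train_x = np.linspace(0.0, 2.0, 20)
train_y = expensive_sim(train_x)

# Cheap surrogate: a degree-5 polynomial fitted to those samples.
coeffs = np.polyfit(train_x, train_y, deg=5)
surrogate = np.poly1d(coeffs)

# Inside the sampled range, the surrogate tracks the "sim" closely...
err_in = abs(surrogate(1.37) - expensive_sim(1.37))

# ...but extrapolating far outside the training range is unreliable.
err_out = abs(surrogate(6.0) - expensive_sim(6.0))
```

The two error values capture both halves of the argument: near-free accuracy inside the explored region, and no guarantees outside it, which is exactly the limitation Baqué describes in the article.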
 
Upvote
6 (8 / -2)

KingKrayola

Ars Tribunus Militum
1,648
Subscriptor
I can only find the sporting regulations from 2025, but Appendix 7 lays out the regulations for aero testing restrictions. Section 4 is titled "Restricted CFD (RCFD) Simulations".

It starts by talking about what is and isn't an RCFD, then goes into the details of how computing time is actually defined and measured.

This bit at the beginning stands out (emphasis mine):


So, by definition, if it's not a CFD simulation, it can't be a restricted CFD, and so doesn't come under a team's limited CFD time.

Compute time for other tasks doesn't appear to be covered, and since taking the results of existing CFDs and then training models on them doesn't incur another CFD run, that would appear to be fair game. Training the models is undoubtedly very compute-intensive, but there's nothing in the (2025) rules that bans that.

So, typical behaviour in F1 - if the rules don't say you can't do that, then you damn well investigate whether doing that will result in an advantage. It might be this will be something that future rulesets will clamp down on, as it's probably going to become very widespread.
By that logic, could someone also feed a bunch of wind tunnel test data into the same or similar transformer model and do the same work?

i.e. the innovation is using a transformer to interpolate/extrapolate from a set of data obtained in restrictive conditions, with a lot less CPU time than traditional methods?

Have I got that right? I read the abstract and all I could understand was that I could not understand the abstract.
 
Upvote
4 (4 / 0)
I am anti-LLM, but separately I've been anti-ML for less moral and more scientific reasons. It's because of stuff like this:

“It sounds magical, but the reality is that the accuracy of the model is only guaranteed within a specific range of situations that are not too far from what you have already explored,” Baqué told me. “So all the trick and the gap from the idea to the value is to find what are the right workflows, what kind of data do I need to generate to be able to explore what kind of configurations afterward in which type of setting, and how often do I need to retrain my model, all the data hygiene around the design workflow.”

The reason classical computational fluid dynamics works for almost any non-quantum fluid, even complex non-Newtonian fluids, is that the Navier-Stokes equation being solved is a theoretically robust model that truly describes the motion; it is very complicated mathematically but physically it is just F=dp/dt applied to fluid (not quite F=ma since "m" is changing).

That is not true of these AI interpolations. These are all ad hoc and empirical, with no theoretical basis. And it's probably true that for most cases it's good enough, especially if you're going to validate it against a wind tunnel. But you're always losing accuracy when you take these shortcuts, and you'll never know how wrong you are unless you do it properly.
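For reference, the "F = dp/dt applied to fluid" the comment above describes is the momentum equation of the Navier-Stokes system; in the incompressible form it reads:

```latex
% Incompressible Navier-Stokes momentum equation: Newton's second law
% per unit volume of fluid. \rho = density, \mathbf{u} = velocity field,
% p = pressure, \mu = dynamic viscosity, \mathbf{f} = body forces.
\rho \left( \frac{\partial \mathbf{u}}{\partial t}
          + (\mathbf{u} \cdot \nabla)\mathbf{u} \right)
  = -\nabla p + \mu \nabla^{2}\mathbf{u} + \mathbf{f}
```

The left side is mass times acceleration of a fluid parcel (including the nonlinear convective term that makes the system so hard); the right side is the net force from pressure gradients, viscosity, and body forces.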
 
Upvote
8 (10 / -2)

Free Thought

Seniorius Lurkius
13
Subscriptor
This is an interesting approach but in practice only works well when trying many small variations within an otherwise unchanged problem.

You need to generate a significant training set specific to the problem setup (which you do by running a bunch of actual CFD sims) and do the actual training. If you don’t plan on running significantly more sims than the size of the training set then you are actually much slower than just doing the CFD. Curiously, the paper does not report the time spent generating the dataset or on the training.

This is an issue because it’s really difficult to generalize the CFD outside of the training set geometries. They point to a slightly larger wing angle going outside the bounds of the data as being a big achievement here which is telling. A completely different body type would be a no go. If most of the wing angles are in the training set already, what are you gaining?
I think you gain the ability to try many permutations quickly and, importantly, to optimize very fast once you are "in the ballpark". The use case is limited for the reasons you highlight, but making these optimizations quickly is apparently valuable to this industry.
 
Upvote
5 (5 / 0)
I am anti-LLM, but separately I've been anti-ML for less moral and more scientific reasons. It's because of stuff like this:



That is not true of these AI interpolations. These are all ad hoc and empirical, with no theoretical basis. And it's probably true that for most cases it's good enough, especially if you're going to validate it against a wind tunnel. But you're always losing accuracy when you take these shortcuts, and you'll never know how wrong you are unless you do it properly.
But that's the entire point of loss-minimization gradient descent. When you have an incredibly large search space (massive number of dimensions and coefficients), estimation can be valuable to help you concentrate on "neighborhoods" of likely returns, rather than searching the entire space sequentially (or stochastically). On some level this was also what genetic algorithms gave us, but with far less impact and far more responsibility to encode solution sub-spaces to hybridize against. NNs are all empirical, but that doesn't inherently make their results less actionable.
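The "concentrate on promising neighbourhoods" idea above is exactly what gradient descent does: instead of scanning a huge search space point by point, follow the local slope of a loss function downhill. A minimal sketch, with a purely illustrative one-dimensional quadratic loss standing in for a real high-dimensional objective:

```python
def loss(x):
    # Illustrative loss surface with its minimum at x = 3.0.
    return (x - 3.0) ** 2 + 1.0

def grad(x):
    # Analytic derivative of the loss above.
    return 2.0 * (x - 3.0)

x = -10.0   # arbitrary starting point, far from the optimum
lr = 0.1    # learning rate (step size)
for _ in range(200):
    x -= lr * grad(x)  # step downhill against the gradient

# x has now converged into the neighbourhood of the minimum at 3.0
```

In a real network the same loop runs over millions of coefficients at once, with the gradient computed by backpropagation rather than by hand, but the principle of trading exhaustive search for slope-following is identical.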
 
Upvote
6 (6 / 0)
@Dr Gitlin, have you ever considered grand prix-level sailing? SailGP and the America's Cup seem to be right up your alley. The nerdom of the technical aspects of the sport is F1-level: completely at the bleeding edge of speed technology, with all of the technical rules weaving through the craft like a motorsport, without the motor but with more spectacular crashes. And if you decide to burn a few hours trying to understand it all, like F1 there are forums of former racers, builders, designers, and overly pedantic nerds diving deep into the details.

Good article, thank you as always!
 
Upvote
5 (5 / 0)

HoorayForEverything

Ars Scholae Palatinae
914
Subscriptor
It starts by talking about what is and isn't an RFCD, then goes into the details of how computing time is actually defined and measured.

This bit at the beginning stands out (emphasis mine):
The critical meaning of that bit at the beginning very much depends on how "CFD simulation" is defined though, so you've kind of skipped the important part there.

The actual first para of Appendix 7 (thanks for digging this out!) says "test environment or numerical simulation" and doesn't define either of THOSE terms, so it's essentially useless.

Because this is a simulation. It's a parametric simulation with an unusual internal model, but it is a means of modelling fluid dynamics computationally. Just... not numerically. Or at least not first-order numerically, but it is trained on numerical models.
So, by definition, if it's not a CFD simulation, it can't be a restricted CFD, and so doesn't come under a team's limited CFD time.
Yeah but... see above. It could be characterised as CFD. It's just not approved COTS CFD software.

There's another interesting definition in para 4a:
Solver refers to the program or programs that compute the solution of the equations describing the flow including any extension of the simulation or simulations involving additional numerical computation (for example but not limited, to adjoint computation).

I think the "including but not limited to" clause here is supposed to broaden the definition to catch second-order things. I agree it probably doesn't, and it is another example of horrifically poor drafting. But a broad reading of this could catch all sorts of methodologies and models, and would indeed catch parametric simulation, however implemented; and you could certainly argue that the IBM/Dallara doohickey is or includes a "solver".

I think if the intent was to specifically limit the computationally expensive "use Navier-Stokes or simplifications thereof across a high-resolution grid of cells and crunch it entirely numerically" approach, which was costing everyone an absolute fortune, they would have said so.
Compute time for other tasks doesn't appear to be covered
The drafters would have been fearful of accidentally banning spreadsheets I think, but as I said above, it's nowhere near as narrow as they could have made it.

...and since taking the results of existing CFDs and then training models on them doesn't incur another CFD run, that would appear to be fair game.
I'm not sure I'm actually suggesting "simulations involving additional numerical computation (for example but not limited, to adjoint computation)" blocks this but I'm definitely highlighting that this clause exists and is up for discussion...

So, typical behaviour in F1 - if the rules don't say you can't do that, then you damn well investigate whether doing that will result in an advantage. It might be this will be something that future rulesets will clamp down on, as it's probably going to become very widespread.
Yup. And of course we don't know what this year's rules say anyway, but good sleuthing on last year's.
 
Upvote
2 (2 / 0)
I am anti-LLM, but separately I've been anti-ML for less moral and more scientific reasons. It's because of stuff like this:



The reason classical computational fluid dynamics works for almost any non-quantum fluid, even complex non-Newtonian fluids, is that the Navier-Stokes equation being solved is a theoretically robust model that truly describes the motion; it is very complicated mathematically but physically it is just F=dp/dt applied to fluid (not quite F=ma since "m" is changing).

That is not true of these AI interpolations. These are all ad hoc and empirical, with no theoretical basis. And it's probably true that for most cases it's good enough, especially if you're going to validate it against a wind tunnel. But you're always losing accuracy when you take these shortcuts, and you'll never know how wrong you are unless you do it properly.
Next, you're gonna tell me that drinking green tea and cleansing my chakras isn't as scientific and precise as getting an mRNA vaccine, or that my snowball doesn't disprove climate change!
We've had it with you and your "science", Mr. Smartypants! Jesus said we'd be okay with fine-enough empirical data, and we've been listening to him for 6,000 glorious dino-riding years!

More seriously, those empirical-based ML solutions get to be checked in real life in controlled settings, so it's not a bad place to try to save computing power if the result is good enough.
 
Upvote
1 (2 / -1)

Dr Gitlin

Ars Legatus Legionis
24,871
Ars Staff
@Dr Gitlin, have you ever considered grand prix-level sailing? SailGP and the America's Cup seem to be right up your alley. The nerdom of the technical aspects of the sport is F1-level: completely at the bleeding edge of speed technology, with all of the technical rules weaving through the craft like a motorsport, without the motor but with more spectacular crashes. And if you decide to burn a few hours trying to understand it all, like F1 there are forums of former racers, builders, designers, and overly pedantic nerds diving deep into the details.

Good article, thank you as always!
I’ve written about yacht racing a couple of times but like motorbikes it’s pretty much outside my wheelhouse.
 
Upvote
1 (1 / 0)
This is an interesting approach but in practice only works well when trying many small variations within an otherwise unchanged problem.

You need to generate a significant training set specific to the problem setup (which you do by running a bunch of actual CFD sims) and do the actual training. If you don’t plan on running significantly more sims than the size of the training set then you are actually much slower than just doing the CFD. Curiously, the paper does not report the time spent generating the dataset or on the training.

This is an issue because it’s really difficult to generalize the CFD outside of the training set geometries. They point to a slightly larger wing angle going outside the bounds of the data as being a big achievement here which is telling. A completely different body type would be a no go. If most of the wing angles are in the training set already, what are you gaining?
People are really glossing over the most damning point here: "the reality is that the accuracy of the model is only guaranteed within a specific range of situations that are not too far from what you have already explored,"

My contention is, yes, these models will allow you to navigate the initial design space rapidly, but so would having a senior designer, or access to modern F1 car photos.

It will do nothing to permit the creation of novel design solutions, which is basically what all of modern professional racing is actually built upon.

For instance, the MGU-H. Only Mercedes could get it to work well. The AI could readily help other teams develop a functional unit, if they had Mercedes' test data. But if you had their data you wouldn't need the AI help now would you?
 
Upvote
-1 (0 / -1)

ewelch

Ars Tribunus Angusticlavius
9,356
Subscriptor++
The fastest plane in the world was designed in the 1950s on a slide rule. I'm sure AI can be helpful, but the human mind is still potentially the best cure for AI's weaknesses. The one area where AI legitimately pushes back is that it lacks much of what is worst in humanity.
 
Upvote
-1 (0 / -1)

doubleyewdee

Ars Scholae Palatinae
844
Subscriptor++
I just want to show my appreciation for the term "data hygiene", as a former medical research assistant, science teacher, and skeptic of all statistics (especially those used in advertising). Analysis is a waste of time without clean data.
[edit: That is, I like the implication that unclean data is kind of gross and embarrassing.]

Code hygiene and similar are quite common terms of art in software development. Indeed, people use "hygiene" and related words (clean, dirty, etc.) a great deal across the discipline.

For example, when doing a build/compilation, you want a clean working directory. Sometimes even pristine! Similarly, ugly code that technically works but is aesthetically unsatisfactory may be called a dirty hack. And then there are "code smells", a whole category of indicators of underlying problems with quality, architecture, etc.
 
Upvote
-1 (0 / -1)