CrowdStrike blames testing bugs for security update that took down 8.5M Windows PCs

cbreak · Jul 24, 2024

cyberfunk said:
As has been previously addressed, maybe they shouldn't be running certain parts in Ring 0, but AV software necessarily has to run with very high privileges to do the job. It's an unfortunately necessary evil.

What's indefensible is that their CI/CD mechanisms were so shoddy that they didn't catch this.

And that they wrote their crappy code to run in kernel space instead of doing at least the parsing of files and analysis in userland.
And that they obviously didn't even do basic testing / fuzzing of said parser code.
And that they didn't fail safely instead of crashing the whole system in an unrecoverable way.
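
To make the "validate, fuzz, fail safely" point concrete, here is a minimal sketch of a parser that rejects malformed input instead of trusting it. The file layout (`MAGIC`, the header and entry structs, `MAX_ENTRIES`) and the function names are made up for illustration. This is not CrowdStrike's actual channel file format, just the general shape of a defensive parser plus a trivial fuzz loop.

```python
import random
import struct

# Hypothetical layout, assumed for illustration only:
# 4-byte magic, 4-byte little-endian entry count, then (rule_id, flags) pairs.
MAGIC = b"CHNL"
HEADER = struct.Struct("<4sI")
ENTRY = struct.Struct("<II")
MAX_ENTRIES = 1024  # arbitrary sanity limit


def parse_channel_file(data: bytes):
    """Return a list of (rule_id, flags) tuples, or None if the input is malformed."""
    if len(data) < HEADER.size:
        return None  # too short to even hold a header
    magic, count = HEADER.unpack_from(data, 0)
    if magic != MAGIC or count > MAX_ENTRIES:
        return None  # wrong file type or implausible entry count
    if len(data) != HEADER.size + count * ENTRY.size:
        return None  # declared count does not match the actual size
    return [
        ENTRY.unpack_from(data, HEADER.size + i * ENTRY.size)
        for i in range(count)
    ]


def fuzz(iterations: int = 1000, seed: int = 0) -> bool:
    """Feed random blobs to the parser; it must reject them cleanly, never raise."""
    rng = random.Random(seed)
    for _ in range(iterations):
        size = rng.randrange(0, 64)
        blob = bytes(rng.getrandbits(8) for _ in range(size))
        result = parse_channel_file(blob)
        assert result is None or isinstance(result, list)
    return True
```

A file full of zeroes fails the magic check and comes back as `None`, so the caller can keep running with the previous known-good rules instead of taking the whole system down.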

cbreak · Jul 24, 2024

steelcobra said:
No, on this Microsoft is stuck, and beholden to their EU anti-trust agreements from the aughts forcing them to allow 3rd party software to operate like this.
https://www.tomshardware.com/softwa...oid-crowdstrike-like-calamities-in-the-future

Bullshit.

They could easily write a userland API for security software and provide these capabilities safely. Or at least they should try to; who knows if MS still has enough skilled devs to actually pull it off. Apparently they weren't able to, so they wanted to keep using kernel-level access themselves. And THAT is the problem.

cbreak · Jul 24, 2024

steelcobra said:
On that last line, it's a general OS safety issue. Windows hard crashes to prevent that kind of activity from causing damage to the system and OS from a bad system call.

And Linux and Mac do it too, they just weren't forced to allow third parties access to the kernel ring.

Neither is Windows. Either Microsoft or CrowdStrike could have easily provided a userland API (via a minimal kernel module), and used that in userland to do all the heavy lifting. No one forced CrowdStrike to parse untrusted and potentially (and obviously actually) malformed data in kernel space.

cbreak · Jul 24, 2024

steelcobra said:
Did you miss that this was an EU-mandated requirement, not a choice they were given?

It was absolutely a choice. And they chose poorly. Try reading next time.

cbreak · Jul 24, 2024

bifrost said:
Those user-space processes would be searched for and killed by the malware installer script before it downloads and runs the main payload.

That would mean the malware already has root-equivalent permissions, so it could also inject code into the kernel and kill it that way (for example by replacing the channel files used by CrowdStrike with even more malicious ones).
And it'd be a gigantic red flag for the kernel-side component of the detector that something's up. That's not what stealthy malware would want to do.

cbreak · Jul 24, 2024

Rosyna said:
Why are you under the impression there was a NULL dereference?

Especially since a lot of the versions of the borked file had different garbage in them.

View: https://youtu.be/pCxvyIx922A?t=169

cbreak · Jul 24, 2024

Rosyna said:
Yeah, that’s wrong. Only some people got a 291*.*32.sys file full of zeroes. Others just got garbage.

And the reverse engineering of the code that crashed shows that it explicitly checks for NULL before dereferencing.

Well, obviously it didn't check correctly, because the debugger clearly shows an attempt to dereference a pointer in the null page. Either that, or the debugger is lying.

cbreak · Jul 24, 2024

Rosyna said:
That’s where strong process isolation comes in. It’d have to be added to Windows first for Microsoft to make a non-kernel EDR API.

Can a normal user really kill system-level / other users' processes on Windows? That OS is even more garbage than I thought...

cbreak · Jul 24, 2024

Rosyna said:
The crash dump shows that it was trying to jump to an invalid pointer address and many people incorrectly assumed, because the address was so low, that it was an offset to a NULL pointer. However, CSAgent.sys clearly checks for NULL pointers before jumping to offsets.

Which means the small value that looked like an offset was loaded directly from something else, like a serialized instance.

... or by computing an offset from null... and checking afterwards.

Regardless of how that value was arrived at, I'd classify anything that tries to dereference an address in the null page as a null pointer dereference error.
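
A rough sketch of the "compute an offset from null, then check afterwards" pattern, simulated with plain integers so nothing actually gets dereferenced. `FIELD_OFFSET`, `NULL_PAGE_LIMIT` and the function names are hypothetical values picked for illustration. This is not taken from CSAgent.sys, it only shows how a NULL check can exist in the code and still let a null-page address through.

```python
# Hypothetical constants, assumed for illustration only.
NULL_PAGE_LIMIT = 0x1000  # addresses below this are treated as the unmapped null page
FIELD_OFFSET = 0x9C       # made-up offset of a field inside some structure


def field_address_buggy(base: int):
    """Apply the offset first, then compare against NULL. The check comes too late."""
    addr = base + FIELD_OFFSET
    if addr == 0:          # never true when base == 0, because addr is already 0x9C
        return None
    return addr


def field_address_checked(base: int):
    """Check the base pointer before doing any arithmetic on it."""
    if base == 0:
        return None
    return base + FIELD_OFFSET


def in_null_page(addr) -> bool:
    """True if addr is a small non-NULL-looking address that still lands in the null page."""
    return addr is not None and 0 <= addr < NULL_PAGE_LIMIT
```

With a zero base, the buggy variant hands back a small address that passes its own NULL check but still sits in the null page, while the checked variant returns `None`. A small value loaded from a serialized blob would look exactly the same in a crash dump.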

cbreak · Jul 24, 2024

steelcobra said:
Maybe you should read better, they were forced into it as part of the EU anti-trust agreements.

No.

They were forced to not abuse their position as OS manufacturer unfairly. They could have just provided an API AND USED IT THEMSELVES TOO. That way everyone would have the same access.

But no, they didn't.

And CrowdStrike isn't even bound by any agreement MS made; they could have created such an API too with a small kernel module, and used it from userland to more safely handle untrusted data.

cbreak · Jul 24, 2024

Rosyna said:
You really don’t want to know the answer to this…

A. Windows has a hardcoded list of executable names you’re never allowed to kill.

B. In a completely separate part of the code, Windows makes sure only Microsoft executables can have those names on disk and launch.

So Windows has no way to identify if a process is special based on metadata or data inside the process/executable, it has to look at metadata in the file system to do A.

Interesting. So if users can just manipulate the processes of other users, what's even the point of logging in with different users?

cbreak · Jul 24, 2024

Rosyna said:
Because Microsoft doesn’t believe there should be a security boundary between admin and kernel, part of Microsoft is trying to convince people to only log in as non-admin* users. However, Microsoft simultaneously says the majority of users log in as admin users.

*Microsoft does believe there should be a security boundary between different low privileged users, but not between processes running as the same user.

Microsoft Security Boundaries

So presumably CrowdStrike would run as some system user and not as the logged-in user, who therefore should not have the permissions to kill it. Unless the logged-in user is an admin, who could presumably also unload kernel modules.
