what did you learn today?

mqatrombone · Feb 4, 2010

Originally posted by padster:
I found out that Exchange 2003 on a Hyper-V VM is not (officially) supported.

Odd. I don't have any stats, but my gut says that Exch 2003 is still the most popularly implemented version of Exchange. You'd think Microsoft would ensure that such a crucial piece of the infrastructure would be supported in their enterprise virtualization product.

If 2007 and 2010 weren't so much better at I/O, they might. But the I/O improvements in 2007 and 2010, plus the fact that virtualization really didn't start taking off until after 2003 was finished, are the primary reasons why.

Spatula · Feb 4, 2010

I truly believe that modern hardware and software just isn't as reliable as the old stuff. That might be because these kinds of examples pop up now and again, helping to wipe out all the memories of the true horror of the vintage gear.

I would agree with you based on the complexity of modern hardware and software vs the days of olde. However, the only still running ancient computers you still find are... still running. Selection bias. If you would like to work with a more representative sample I would be happy to lend you the keys to our "inventory" room.

Danger Mouse · Feb 4, 2010

Originally posted by Spatula:

I truly believe that modern hardware and software just isn't as reliable as the old stuff. That might be because these kinds of examples pop up now and again, helping to wipe out all the memories of the true horror of the vintage gear.

I would agree with you based on the complexity of modern hardware and software vs the days of olde. However, the only still running ancient computers you still find are... still running. Selection bias. If you would like to work with a more representative sample I would be happy to lend you the keys to our "inventory" room.

Reduced complexity can result in slower, yet more reliable performance.

But I definitely agree that it's selection bias. We've got a pile of old systems that are horrible and unusable, but still in service. However, we've retired 10x to 20x that many that died or were partially dead when salvaged out of inventory. The sooner we get rid of the old crap, the happier I'll be.

We're still retiring P3s 1ghz down to 700mhz. I soon hope to have all S775 (less than 50 AMD systems, another 30 or so PowerPC G4 and G5 systems) or better as desktop platforms.

That's only until we migrate to thin clients only with bladePCs in a blade enclosure/chassis.

akro · Feb 4, 2010

I know this has been added to the EVA for a couple years. I am sure EVA isn't the only midrange that can do it...

Originally posted by M. Jones:
Last week I found out our Compellent SAN can only update drive firmware if all controllers come offline. They don't have an algorithm to drop out the drives that are being flashed. I realized that even if we didn't end up going in an unanticipated direction on that acquisition, that I hadn't verified that all of our contenders could update drive firmware without downtime. Do other midrange and lower-end arrays have the same limitation?

BajanDude · Feb 5, 2010

We're still retiring P3s 1ghz down to 700mhz. I soon hope to have all S775 (less than 50 AMD systems, another 30 or so PowerPC G4 and G5 systems) or better as desktop platforms.

I'm still retiring P-II based Proliants

Just had to do a spreadsheet of power, disk, ram, U-space, business function, security layer etc etc etc. I ran through some silly math with SPEC benchmarks, and concluded that compared to a Nehalem, any P2 is 50MHz. Remember that scene in Matrix where Neo realises his power and fights the agent with one hand behind his back? I get the feeling our VMware server will be doing the same thing when I virtualise our P2 and P3 systems.

And yes, it's definitely selection bias. We migrated data off of an aging Proliant 3000 last year (just IIS static pages). Rebooted the server to check something, and it died :>

If anyone really cares about the Nehalem thing:

Relative processor speeds were calculated by evaluating the SPECint base scores (not the rate scores) for the Nehalem processors, a representative sample of the Intel 5440 processors (G5s) and a representative sample of the Intel Xeon processors (G4s). These numbers were derived from the 2006 benchmark.

The G4 processor was then referenced in the SPECint 2000 benchmark, and a value was derived for the Pentium III processors.

Math - SPECint 2006 edition
G6 processor, base score 29.2
G5 processor, base score 21, ratio 0.72
G4 processor, base score 10, ratio 0.3

Math - SPECint 2000 edition
G4 processor, base score 1500, 2006 ratio 0.3
P3 processor, base score 318, 2006 ratio ~0.08

Real-world application
A Pentium III 933 is the equivalent of 75 MHz of Nehalem processor power. (933 * 0.08)
This probably isn't completely accurate, but it's probably accurate enough for our purposes.
Any processor that pre-dates the Pentium III will be classified as 50 MHz. It would be possible to derive the values using the SPEC95 benchmark, but there's no point.
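For anyone who wants to replay the arithmetic, here is a minimal Python sketch of the ratio chaining described above. It uses only the scores quoted in the post; the function names are made up for illustration. With unrounded ratios the P3 factor comes out nearer 0.07 than the ~0.08 quoted, so a Pentium III 933 lands at roughly 68 MHz rather than 75 MHz. Same ballpark, and the same caveat about accuracy applies.

```python
# Chain SPECint ratios to express an older CPU as a fraction of a Nehalem (G6).
# Scores are the ones quoted in the post; nothing here is re-measured.

SPECINT2006 = {"G6": 29.2, "G5": 21.0, "G4": 10.0}
SPECINT2000 = {"G4": 1500.0, "P3": 318.0}


def ratio_to_g6_2006(cpu: str) -> float:
    """Ratio of a CPU's SPECint 2006 base score to the G6 (Nehalem) score."""
    return SPECINT2006[cpu] / SPECINT2006["G6"]


def p3_ratio_to_g6() -> float:
    """Bridge the P3 to the G6 via the G4, which appears in both suites."""
    p3_vs_g4 = SPECINT2000["P3"] / SPECINT2000["G4"]
    return p3_vs_g4 * ratio_to_g6_2006("G4")


def nehalem_equivalent_mhz(clock_mhz: float, ratio: float) -> float:
    """Scale a clock speed by the ratio, as the post does (933 * ratio)."""
    return clock_mhz * ratio


if __name__ == "__main__":
    r = p3_ratio_to_g6()
    print(f"G5 ratio: {ratio_to_g6_2006('G5'):.2f}")
    print(f"G4 ratio: {ratio_to_g6_2006('G4'):.2f}")
    print(f"P3 ratio: {r:.3f}")
    print(f"P3 933 is about {nehalem_equivalent_mhz(933, r):.0f} MHz of Nehalem")
```

The G4 is the only processor with a score in both benchmark editions, which is why it serves as the bridge between the 2000 and 2006 numbers.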

erratick · Feb 5, 2010

Last week I found out our Compellent SAN can only update drive firmware if all controllers come offline.

NetApp clusters can update drive firmware hot, not sure about single controller.

erratick · Feb 5, 2010

when I virtualise our P2 and P3 systems.

Yup. You'll be RAM-limited at best. I have a couple of old converted ESX servers where a single ESX box is the entire datacentre of an acquired company, because they were running on old hardware: 40 VMs on one ESX host with headroom left, since it was mostly P2/P3 with 1 GB of RAM tops.

Danger Mouse · Feb 5, 2010

PII Proliants? Yeah, we still had a few of those. Luckily, we're down to just half a dozen of the PIII Proliants.

One of them died on the reboot after the data migration off of it. Prior to that, using ROBOCOPY would make it reboot. I actually had to resort to a Windows Explorer based drag and drop operation to prevent the system from copying too fast and overheating/overstressing the system.

When systems get old enough, it's more art than science to keep them alive long enough to get the data off.

It was "only" 400GB of data, but at one point was the largest RAID storage volume (500GB + 300GB) for the last few years, prior to our EMC Celerra install last December.

Danger Mouse · Feb 5, 2010

...that the geniuses that decided not to give us a genset or even a shed/cover/roof over the A/C units (rain has been falling heavily off and on for the last few weeks) or proper fire suppression,

did NOT install any UPS monitoring/server auto shutdown agents/reporting mechanisms to our UPS setup. So, it's been alarming for the last 2 or more weeks and we didn't know about it. (the vendors are the only ones in the new data center consistently, until the final sign off is done).

This is while knowing that our 100+ year old property has had and continues to have major electrical issues (campus localized brownouts, blackouts).

The IT contractor finally understood why we've got our coffee maker on a UPS: we've had 3 explode (switch cover flies off, sparks, burn on the person) over the last few years.

A few weeks ago, I found out that our campus became a superfund site for a few weeks (about a year ago), after the contractors discovered not one but TWO abandoned/heretofore unknown oil storage tanks. There was also a sub basement that was flooded with old oil.

This was separate from another sub basement that was found filled with sewage that had been there for who knows how long, due to a broken city sewer main shunt.

So, given those two tidbits, they're still trying to put another data center into a basement, on a campus known for flooding and other environmental issues.

Fulgan · Feb 5, 2010

One of them died on the reboot after the data migration off of it. Prior to that, using ROBOCOPY would make it reboot. I actually had to resort to a Windows Explorer based drag and drop operation to prevent the system from copying too fast and overheating/overstressing the system.

Did you know that the /IPG:XX parameter of Robocopy can force it to pause between two packets? It's usually used to conserve bandwidth but it would probably have worked in your situation too.
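A rough sketch of what that could look like, written as a small Python helper that only builds the argument list. The paths and the 50 ms gap are placeholders, not values from this thread. /E, /IPG:n, /R:n and /W:n are standard Robocopy switches, but whether throttling would really keep a dying server alive is a guess.

```python
from typing import List


def build_robocopy_args(source: str, dest: str, ipg_ms: int,
                        retries: int = 2, wait_s: int = 5) -> List[str]:
    """Build a throttled Robocopy command line (hypothetical helper).

    /E      copy subdirectories, including empty ones
    /IPG:n  inter-packet gap in milliseconds, slows the copy down
    /R:n    number of retries on failed copies
    /W:n    wait time in seconds between retries
    """
    if ipg_ms < 0:
        raise ValueError("ipg_ms must be zero or positive")
    return [
        "robocopy", source, dest,
        "/E",
        f"/IPG:{ipg_ms}",
        f"/R:{retries}",
        f"/W:{wait_s}",
    ]


if __name__ == "__main__":
    # Placeholder paths; print the command instead of running it.
    print(" ".join(build_robocopy_args(r"D:\data", r"\\newserver\share", 50)))
```

On a Windows box the list could be handed to subprocess.run; start with a large gap and lower it while watching the old server's temperature.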

Danger Mouse · Feb 5, 2010

Originally posted by Fulgan:

One of them died on the reboot after the data migration off of it. Prior to that, using ROBOCOPY would make it reboot. I actually had to resort to a Windows Explorer based drag and drop operation to prevent the system from copying too fast and overheating/overstressing the system.

Did you know that the /IPG:XX parameter of Robocopy can force it to pause between two packets? It's usually used to conserve bandwidth but it would probably have worked in your situation too.

Thanks. I know about the pausing, but I was pressed for time and didn't want to spend time experimenting when the server was on the verge of dying. I had previously used the same method (windows drag and drop) successfully on the same server to migrate the data TO it.

It's a dual P3 1Ghz (Compaq Proliant) server and was a frankenstein built from several different servers.

akro · Feb 5, 2010

I don't think your vendors will cover damage due to raw sewage...

=-D

Originally posted by Danger Mouse:
...that the geniuses that decided not to give us a genset or even a shed/cover/roof over the A/C units (rain has been falling heavily off and on for the last few weeks) or proper fire suppression,

did NOT install any UPS monitoring/server auto shutdown agents/reporting mechanisms to our UPS setup. So, it's been alarming for the last 2 or more weeks and we didn't know about it. (the vendors are the only ones in the new data center consistently, until the final sign off is done).

This is while knowing that our 100+ year old property has had and continues to have major electrical issues (campus localized brownouts, blackouts).

The IT contractor finally understood why we've got our coffee maker on a UPS: we've had 3 explode (switch cover flies off, sparks, burn on the person) over the last few years.

A few weeks ago, I found out that our campus became a superfund site for a few weeks (about a year ago), after the contractors discovered not one but TWO abandoned/heretofore unknown oil storage tanks. There was also a sub basement that was flooded with old oil.

This was separate from another sub basement that was found filled with sewage that had been there for who knows how long, due to a broken city sewer main shunt.

So, given those two tidbits, they're still trying to put another data center into a basement, on a campus known for flooding and other environmental issues.

Danger Mouse · Feb 5, 2010

Originally posted by akro:
I don't think your vendors will cover damage due to raw sewage...

=-D

Vendor no, however the building contractors are responsible for screwing up the construction of our new buildings such that the sewage pipes are all backing up. If it screws up terribly, they're going to be responsible for paying for the cleanup and replacement of any equipment/building materials damaged by the construction.

The HVAC/Plumbing supervisor was kind of sarcastic about it. As long as you don't use the bathrooms, there is no sewage backup.

The initial diagnosis is that construction crews left "stuff" inside the pipes. They're going to double check that they didn't do a zig zag style pipe layout (worthy of Polyester's house).

The vendor is responsible, as part of the data center buildout, for NOT fully integrating the UPS into the data center. It's one vendor with a few subcontractors, but I think the project manager missed something. It's a large project and I'm not surprised that something is missing because it's so big and not due to negligence/malice.

The negligence/malice was in the pizza party they threw in front of the onsite IT staff, while not inviting us and using our plates/cups/etc.

ostiguy · Feb 5, 2010

Originally posted by erratick:

Last week I found out our Compellent SAN can only update drive firmware if all controllers come offline.

NetApp clusters can update drive firmware hot, not sure about single controller.

We have a single controller 3050 that can do this. This was the case in both the 7.2.5 timeframe, and with 7.3 as well.

K0DE · Feb 5, 2010

I am building my ultimate admin workstation image and starting to track revisions etc. I am interested in how well it evolves.

brainchasm · Feb 5, 2010

Originally posted by Danger Mouse:
PII Proliants? Yeah, we still had a few of those. Luckily, we're down to just half a dozen of the PIII Proliants.

One of them died on the reboot after the data migration off of it. Prior to that, using ROBOCOPY would make it reboot. I actually had to resort to a Windows Explorer based drag and drop operation to prevent the system from copying too fast and overheating/overstressing the system.

When systems get old enough, it's more art than science to keep them alive long enough to get the data off.

It was "only" 400GB of data, but at one point was the largest RAID storage volume (500GB + 300GB) for the last few years, prior to our EMC Celerra install last December.

Not work-related, but I still have a PentiumPro ProLiant 2500 that I love...

inferno77 · Feb 6, 2010

Originally posted by akro:
I know this has been added to the EVA for a couple years. I am sure EVA isn't the only midrange that can do it...

Originally posted by M. Jones:
Last week I found out our Compellent SAN can only update drive firmware if all controllers come offline. They don't have an algorithm to drop out the drives that are being flashed. I realized that even if we didn't end up going in an unanticipated direction on that acquisition, that I hadn't verified that all of our contenders could update drive firmware without downtime. Do other midrange and lower-end arrays have the same limitation?

Sometimes you get what you pay for. I would classify EVA 4400 and above differently than a Compellent. I have witnessed the online drive firmware upgrades on an 8000 and was in awe the first time.

Brandon Kahler · Feb 6, 2010

I learned that, for servers acting as transparent bridges between firewalls and internal networks, Gigabit Bypass NICs are freaking awesome!
Just installed one in a box yesterday and the software vendor has support for it with their system services. If the service stops, the system hangs, or the machine powers off, the NIC's relays click over to bypass and the neighboring devices are none the wiser that the bridge machine is no longer there.

Sweet!

ronelson · Feb 6, 2010

Brandon, what kind of device did you install that in? Know of any firewalls that do that? Being without a firewall is bad; having it take down a gig+ of traffic is worse.

Brandon Kahler · Feb 6, 2010

Originally posted by ronelson:
Brandon, what kind of device did you install that in? Know of any firewalls that do that? Being without a firewall is bad; having it take down a gig+ of traffic is worse.

The content filter is by LightspeedSystems and runs on Windows. The new box is a Dell R510. The specific card is a PEG2BPi from Silicom. It's just a modified Intel Pro/1000 PT Dual Port card.

I don't know of any firewalls that use bypass NICs. I've worked with Allot Netenforcer packet shapers that use them though.

Laslow · Feb 6, 2010

I learned that no matter how long you've worked in an old building, when someone complains about random problems with their machine, the first thing to check is the network.

Just pulled one sixteen-port and two four-port hubs off our staff network and replaced them with a single twenty-four-port switch. Also, I replaced the old, literally crusty CAT-5 with nice new CAT-6. Like magic, the troubles with the machines that were connected to said hubs went away.

Brandon Kahler · Feb 6, 2010

Laslow:

I've encountered stuff like this. The most recent one I recall was a staff person who had their desk next to an exterior window. They kept potted plants in the window and their network outlets were on the wall below the window. Years of overwatering the plants had seeped down the wall and onto the faceplate of the network jacks. All exposed metal on the jacks and terminations was completely oxidized/corroded. The staff person had been complaining about slow network speeds for quite a while apparently. I cut the cable ends back an inch and terminated new jacks. Just like new.

Defenestrator · Feb 7, 2010

Originally posted by mkg:
There is nothing worse than Computer Associates eTrust.

I'm not sure about that. CA Unicenter is at least on par with it.

Incarnate · Feb 8, 2010

Originally posted by Danger Mouse:
The negligence/malice was in the pizza party they threw in front of the onsite IT staff, while not inviting us and using our plates/cups/etc.

Dude, go out and buy yourself a pizza already. You can get a large for $5. This is at least the 2nd time you have brought it up in this thread. Let it go.

I'm thinking some of the troubles you've posted in this thread are self-inflicted, or you take it too personally.

ronelson · Feb 8, 2010

Dude, go out and buy yourself a pizza already. You can get a large for $5. This is at least the 2nd time you have brought it up in this thread. Let it go.

Anyone who fills my basement with raw sewage, regardless of the reason, is not allowed to use my plates and cups. Not even if they wash their hands

Danger Mouse · Feb 9, 2010

Originally posted by Incarnate:

Originally posted by Danger Mouse:
The negligence/malice was in the pizza party they threw in front of the onsite IT staff, while not inviting us and using our plates/cups/etc.

Dude, go out and buy yourself a pizza already. You can get a large for $5. This is at least the 2nd time you have brought it up in this thread. Let it go.

I'm thinking some of the troubles you've posted in this thread are self-inflicted, or you take it too personally.

Eh, I was making a point about their behavior, and the pizza was the best example of their flaunting it. If you're a consultant onsite, don't make waves amongst the onsite staff.

It burns me (and the rest of my coworkers) that some of those people earn 4x to 10x the money we do, for doing the same damn job.

Actually no, for doing 1/4th to 1/10th as much work, they get 4x to 10x the money. And because of the way funding is done, the same money could not be used to increase our salaries or pay us overtime.

Some of them earn that money with all the work they do, the others, not so much. The CCIE that we've got doing the switch/router work is excellent. He has to deal with a heterogeneous mix of switches/hubs/media types that he likely has not encountered. The consultant we have out doing the bladepc deployment is likewise very good.

I'm not that impressed with the rest of them.

I get the feeling that between the bladepc consultant and the CCIE, that they could probably have done most if not all of the consulting work and probably done it quicker.

--

ronelson,

well, the rehab/remodeling contractor tries to do the right thing, but is stymied by things like finding two previously unknown/almost 100 year old oil storage tanks. Instant superfund site. I heard that for some parts of the project, it took them more than 1 year to get paid.

The architect that made the design and the building contractor that fucked up something as simple as a sewer main feed probably need to be taken to court.

I'll settle for the hq org accidentally killing the companies through nonpayment/late payment.

--

What I learned last Friday...

Trying to remotely remove Novell client from a totally crapped up computer, usually winds up with several visits to get it working and takes LONGER than it would have to reimage.

Remotely run defrag did resolve the performance issue. PSEXEC, FTW!

What I learned a few weeks ago: newer HP Business desktop models will let you crank up the fan speed. Cranking it from 1 to 11 (not really 11) will produce a nice cloud of dust in the room and a roar like a jet engine.

What I learned today...

The amount of time between an initial LDAP sync completing and you being able to enable an LDAP lookup to drop emails to invalid recipients is just enough time for a DHA-based spam/phishing attack to take place.

:facepalm:

Bonus points for having to fiddle with the Email AV Gateway's LDAP settings, so as not to allow the apparently built-in memory leaks to bring the server to its knees after running LDAP syncs every 3 hours as part of its default settings.

PaveHawk- · Feb 9, 2010

Originally posted by mqatrombone:

Originally posted by padster:
I found out that Exchange 2003 on a Hyper-V VM is not (officially) supported.

Odd. I don't have any stats, but my gut says that Exch 2003 is still the most popularly implemented version of Exchange. You'd think Microsoft would ensure that such a crucial piece of the infrastructure would be supported in their enterprise virtualization product.

If 2007 and 2010 weren't so much better at I/O, they might. But the I/O improvements in 2007 and 2010, plus the fact that virtualization really didn't start taking off until after 2003 was finished, are the primary reasons why.

Hah, I learnt that virtualised Exchange 2007 on Windows 2003 R2 isn't supported (in any way/shape/form). The only way Ex2007 is supported is if it is running on 2008. Makes the support case we're attempting to log rather hard...

socoj2 · Feb 9, 2010

Originally posted by ronelson:
Brandon, what kind of device did you install that in? Know of any firewalls that do that? Being without a firewall is bad; having it take down a gig+ of traffic is worse.

This is more of an IPS feature; a lot of the Proventia and TippingPoint devices do this.

Fulgan · Feb 9, 2010

Today, I learned that regedit can fail to load files created with regedit.

I have an app that stores string values in the registry using CR+LF inside the string. Regedit can perfectly save that, is fully capable of parsing the resulting file again (because it will load the registry keys that are after the multi-line entries) but will silently fail to load the relevant keys (as in reporting success but not having actually loaded the keys).

Rick25 · Feb 9, 2010

That Microsoft deployment tools (MDT2010, USMTv4) have gotten pretty good since the last time I had a look at them.

Darthkim · Feb 9, 2010

Originally posted by Danger Mouse:
So, given those two tidbits, they're still trying to put another data center into a basement, on a campus known for flooding and other environmental issues.

Dude, you got to admit, that is some iron clad resolve they got there. Or whoever has decision making authority really has it in for the IT department.

Paladin · Feb 9, 2010

Originally posted by socoj2:

Originally posted by ronelson:
Brandon, what kind of device did you install that in? Know of any firewalls that do that? Being without a firewall is bad; having it take down a gig+ of traffic is worse.

This is more of an IPS feature; a lot of the Proventia and TippingPoint devices do this.

Yeah, generally the firewall is a routing device so if it fails in some way that prevents it from functioning at a normal level (access control for traffic fails or completely brain dead) simply bridging the network won't do anything for you since you have lost NAT and gateway features anyway. There are bridging or transparent firewalls but I think most people would consider it worse to have your stuff suddenly exposed to the world than to go offline. That is why most firewalls of a certain size or larger support pretty good failover instead of 'fail-through' or whatever.

montegard · Feb 9, 2010

Lesson learned today from Danger Mouse: Find out where he works, be a consultant.

Danger Mouse · Feb 9, 2010

Originally posted by montegard:
Lesson learned today from Danger Mouse: Find out where he works, be a consultant.

I am so going to "spit" on your pizza, when you have your pizza party.

WingMan · Feb 9, 2010

Today I learned the greatness of RD Tabs. A little late to the party, I know. However, it has made my life easier.

Laslow · Feb 9, 2010

Today I learned that no matter how small and simple the change, don't futz around with VLANs on the primary switch during peak hours. /facepalm

scorp508 · Feb 9, 2010

LLR went from 1 in Exchange 2007 SCC, to 10 in 2007 CCR, and then right back to 1 in 2010 DAGs. lol... yay for page patching!!

Darthkim · Feb 10, 2010

I now have to redo my 2 hour helpdesk migration, because one of my #*&$^@( SENIOR engineers can't follow instructions and continues to attach HTML files to a Work Order Ticket. Apparently attaching an HTML file with folders containing images causes the migration to crap out. (It doesn't like any subfolders underneath the work order number folder.)

Now, I could just get mad at Numara and their craptacular development of Track-It, but this is one thing that could have been easily avoided by my staff.

Sigh...

Danger Mouse · Feb 10, 2010

....that as predicted, the fancy new cooling system in our NEW multimillion dollar Data Center would crap out before it was even fully loaded.

It's going to get interesting to see how much of the new equipment survived and how consistently it will function in the future.

With the new budget crisis looming ahead, I may be building servers out of desktops again.

EDIT: one of the Blade servers showed 140F internal temperature with 280F exhaust temp. I'm willing to bet the exhaust temp may be off a bit, because I don't think HP's engineers would have calibrated for a temperature that high.

jaericho · Feb 10, 2010

Originally posted by Laslow:
Today I learned that no matter how small and simple the change, don't futz around with VLANs on the primary switch during peak hours. /facepalm

Why? It's the quickest way to find out if the change will work (or not).
