what did you learn today?

Status
Not open for further replies.

Metzen

Ars Scholae Palatinae
1,042
ronelson":2u4ueohq said:
Then use NetSentry to inform you when the box doesn't come up so you can remote in via iLo to check it out at 2AM, fire an email off to your boss and not come in till noon :)
Bonus if it was the mailserver that went down, I guess? Heh.

Well... NetSentry has multiple methods of contacting you... Fax, email, MMS, pager, calling you...

We actually use two mail servers, our own Exchange server and the ISP's SMTP server. NetSentry does auto-failover should one fail to work.


*Edit* Had the program name wrong. It's IPSentry.
 
Sunner":1gfj6rir said:
nathaniel":1gfj6rir said:
no matter how many times I tell/beat/rape/yell at my developers about chmoding I still see this crap in push scripts:
cd <non-existing directory>
chown -R user1:nobody .

and if they ran as root well I'll be restoring a server from a backup over night.

Reminds me of an Indian developer I had the pleasure of "working with" at an old job. He was supposed to install some home grown application on a Solaris zone, this required root privileges for some reason, normally I really really don't want to give developers root, but he was supposedly a "Solaris technician" or some such, and someone higher up decided that giving him the root password was the way to go, so yeah...

I come in to work the next day and see a mail from said developer complaining that the zone is down. I wonder how the hell that happened. I try to log in via SSH, no dice. I try to log in via the zconsole, no dice. I check the file system from the global zone, and the messages log is full of warnings about file permissions for PAM modules, SSH keys, etc.
Turns out the guy had to install something in /opt/whatever, and he couldn't do this as his regular user. The solution?
su -
cd /
chmod -R 777 *

Funny thing is, after I restored the zone and kindly asked him not to ruin it again, he went ahead and did the same thing again, only this time he limited his chmod'ing to /opt. Unfortunately there was lots of other software living in /opt and all of it got completely hosed, so we had to do a second restore, and my project manager told his Indian counterpart that we would refuse to give this particular developer any access whatsoever ever again. The Indian counterpart was understanding and removed the offending developer (who was by then known as Mr Chmod) from the project as a whole. He didn't seem all that surprised; I guess Mr Chmod has a history of some sort.

I printed out on 11x17 the title "Chown -R & Chmod -R Associates, LLC" and hung it over two developers' desks, because I got a page from our OSSEC last night that a different developer, having problems with his SSH keys, ran chmod -R 777 on his home folder. That shit must be contagious.

We can take week-long learning sabbaticals here at the office. I'm going to demand they both take on the Advanced Bash-Scripting Guide: http://tldp.org/LDP/abs/html/
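The push-script pattern quoted above fails because a `cd` into a missing directory is ignored and the recursive change then runs wherever the script happens to be. A minimal sketch of the guard, assuming Python is available on the box: `checked_target`, `chmod_tree`, and the `PROTECTED` deny-list are hypothetical names for illustration, not anything from the scripts discussed in this thread.

```python
import os
from pathlib import Path

# Hypothetical deny-list: places where a recursive change is never intended.
PROTECTED = {Path("/"), Path("/opt"), Path("/etc"), Path("/usr"), Path("/var")}


def checked_target(path_str):
    """Resolve the target, or raise instead of silently using the current directory."""
    target = Path(path_str).resolve()
    if not target.is_dir():
        raise FileNotFoundError(f"refusing to continue: {target} is not a directory")
    if target in PROTECTED:
        raise PermissionError(f"refusing recursive change on protected path {target}")
    return target


def chmod_tree(path_str, dir_mode=0o755, file_mode=0o644):
    """Apply separate modes to directories and files below a validated target."""
    target = checked_target(path_str)
    os.chmod(target, dir_mode)
    for root, dirs, files in os.walk(target):
        for name in dirs:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # do not chmod whatever a symlink points at
                os.chmod(full, dir_mode)
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                os.chmod(full, file_mode)
    return target
```

The point is only that the script stops with an error when the directory is missing or obviously wrong; the modes and the deny-list would need to match the actual deployment.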
 

bigmikebrooklyn

Ars Centurion
345
Subscriptor
Metzen":7m42xtek said:
bigmikebrooklyn":7m42xtek said:
Dilbert":7m42xtek said:
reboot is 2am eastern, 11pm pacific, 3am was a typo.

I've learned not to reboot, or schedule any tasks, between 2 AM and 3 AM.

Because once a year the reboot won't happen at all, and once a year it will reboot twice. ;)
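The skipped and repeated hours can be checked for a given time zone. This is a rough sketch using Python's standard `zoneinfo` module (Python 3.9+, with system time zone data installed); `classify_wall_time` is a made-up helper name, and the zones and dates below are only examples. Whether a scheduled job actually gets skipped or run twice also depends on the scheduler, which this does not model.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def classify_wall_time(naive, tz_name):
    """Return 'skipped', 'repeated', or 'normal' for a naive local wall-clock time."""
    tz = ZoneInfo(tz_name)
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    # If both readings share one UTC offset, no transition touches this time.
    if first.utcoffset() == second.utcoffset():
        return "normal"
    # A wall time that does not survive a round trip through UTC never occurred.
    round_trip = first.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) != naive:
        return "skipped"
    return "repeated"


for zone, stamp in [
    ("America/New_York", datetime(2021, 3, 14, 2, 30)),
    ("America/New_York", datetime(2021, 11, 7, 1, 30)),
    ("Europe/Berlin", datetime(2021, 10, 31, 2, 30)),
]:
    print(zone, stamp, classify_wall_time(stamp, zone))
```

Note that which hour is affected depends on the zone: in US zones the skipped hour is 2 AM to 3 AM but the repeated one is 1 AM to 2 AM, while in Central European zones both fall between 2 AM and 3 AM.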

point well taken!
iLo fixes all.

Reboot without worry :)

Then use NetSentry to inform you when the box doesn't come up so you can remote in via iLo to check it out at 2AM, fire an email off to your boss and not come in till noon :)

isn't iLO an HP thing?
 

Brandon Kahler

Ars Praefectus
5,347
Moderator
bigmikebrooklyn":2txem4z5 said:
Metzen":2txem4z5 said:
bigmikebrooklyn":2txem4z5 said:
Dilbert":2txem4z5 said:
reboot is 2am eastern, 11pm pacific, 3am was a typo.

I've learned not to reboot, or schedule any tasks, between 2 AM and 3 AM.

Because once a year the reboot won't happen at all, and once a year it will reboot twice. ;)

point well taken!
iLo fixes all.

Reboot without worry :)

Then use NetSentry to inform you when the box doesn't come up so you can remote in via iLo to check it out at 2AM, fire an email off to your boss and not come in till noon :)

isn't iLO an HP thing?

Yes, Integrated Lights Out.

iLO, DRAC, IPMI, etc... They are all LOMs (Lights Out Management).
Any new box should absolutely be purchased with a LOM that provides console redirection. The basic DELL RAC doesn't do console redirection, just monitoring and power on/off.
 

Whittey

Ars Tribunus Militum
2,006
If you delete a datastore in ESX, you delete your partition table. You can get this back by recreating the partition (fdisk the device, create the partition, change partition type to fb (vmfs), align to 128 and call it a day):
http://kb.vmware.com/selfservice/mi...nguage=en_US&cmd=displayKC&externalId=1002281

HP support (cheaper than vmware's support, so it must be better) says to restore from backups, sorry for your loss.


-=Whittey=-
 

Brandon Kahler

Ars Praefectus
5,347
Moderator
ronelson":2in6hgnr said:
The basic DELL RAC doesn't do console redirection, just monitoring and power on/off.
Odd. I was just using the DRAC this morning to access the console and do an install.

The Enterprise DRAC does console redirection, not the standard. Note: this applies to the current-generation DRAC6 units.
If you have a physical DRAC interface then you've got the Enterprise card. If the DRAC is running off the built-in NICs you've got the standard one.
 

w00key

Ars Tribunus Angusticlavius
9,157
Subscriptor
Brandon Kahler":1212g935 said:
ronelson":1212g935 said:
The basic DELL RAC doesn't do console redirection, just monitoring and power on/off.
Odd. I was just using the DRAC this morning to access the console and do an install.

The Enterprise DRAC does console redirection, not the standard. Note: this applies to the current-generation DRAC6 units.
If you have a physical DRAC interface then you've got the Enterprise card. If the DRAC is running off the built-in NICs you've got the standard one.

Those features, console redirection and virtual media, used to be standard in DRAC5, but Dell decided to charge 200 euros extra for that privilege now.
 
I learned to never again trust a 3750 switch stack to keep the same switch as master after adding another stack to the switch and rebooting the stack.

startup-config : gone
vlan database: gone

Fortunately, I had backed up the config before starting the whole process and the vlan database was easily recreated after re-entering the VTP information and letting it propagate from another stack.

I will manually set the switch priority on all 3750 stacks from this point forward!
 

Paladin

Ars Legatus Legionis
33,629
Subscriptor
Dr. Xing":33ucgfyo said:
I learned to never again trust a 3750 switch stack to keep the same switch as master after adding another stack to the switch and rebooting the stack.

startup-config : gone
vlan database: gone

Fortunately, I had backed up the config before starting the whole process and the vlan database was easily recreated after re-entering the VTP information and letting it propagate from another stack.

I will manually set the switch priority on all 3750 stacks from this point forward!

I think I have yet to encounter a switch stack system that gets it right 100% of the time if you don't do it manually. I have been burned by a variety of manufacturers and models when it comes to stacking and I always do manual nowadays if I remember to/have the ability to do so.
 

PVO

Ars Scholae Palatinae
899
Subscriptor
I learned today that Microsoft just doesn't give a damn about their customers, or us poor admins.

Remove the simple copy functionality to tweak and set the local default profile in 7 and 2008 R2? Require you to sysprep the machine as the only "officially supported" way?!

Stupid, stupid, stupid.

Thank goodness for 3rd party workarounds. Of course, 5 years ago, I would never use a workaround in production. My, how times have changed.
 

meglet

Ars Scholae Palatinae
1,447
When SQL 2005 Enterprise fails during install with "unexpected error" what it really means is "the database services installed successfully, client tools are located on disc 2, please run the client tools install from disc 2." Of course, once I figured this out, it did sound vaguely familiar from the SQL 2005 class I sat in 2? years ago.

Disclaimer: not a DB or SQL admin, got stuck with the rebuild cause the previous SQL admin left the state. Lucky me.
 

Incarnate

Ars Tribunus Angusticlavius
9,004
Subscriptor++
Paladin":2vim6b0t said:
Dr. Xing":2vim6b0t said:
I learned to never again trust a 3750 switch stack to keep the same switch as master after adding another stack to the switch and rebooting the stack.

startup-config : gone
vlan database: gone

Fortunately, I had backed up the config before starting the whole process and the vlan database was easily recreated after re-entering the VTP information and letting it propagate from another stack.

I will manually set the switch priority on all 3750 stacks from this point forward!

I think I have yet to encounter a switch stack system that gets it right 100% of the time if you don't do it manually. I have been burned by a variety of manufacturers and models when it comes to stacking and I always do manual nowadays if I remember to/have the ability to do so.
I have never had a config lost when adding a switch or rebooting (but then again he said he was adding another stack, not sure if he meant switch).

We usually set the priority of the switch to the one that we always want to be the master. That way we always know which one will be the master in case of a reboot or power outage or something. I believe the command is "switch # priority #". Replacing # with the proper number, of course.
 

Paladin

Ars Legatus Legionis
33,629
Subscriptor
Yeah I should have been more specific. I have rarely lost config, but I have often encountered stack setups that will randomly pick a new master when a new switch comes online if you haven't manually prepared either the stack with a non-default priority for your desired master and secondary or prepped the new switch to be lower priority. Depending on the manufacturer or software, this usually seems to result in all the switches dropping their ports just long enough for people to get pissed. :mad: :rolleyes:
 

ronelson

Ars Legatus Legionis
21,399
Subscriptor
I have learned how frustrating trying to automate SSH is, especially when there is no support for pre-shared keys (IOS and ASAs). Net::SSH::Expect is a pretty nifty perl mod, but handling all the input is excessively annoying. On the upside, once I am done wasting my time on this, other people will have to waste less time. Er... Somehow it seems less helpful when I phrase it that way.
 

hutch85

Ars Scholae Palatinae
652
That no matter how many times you ask someone "Did you make sure you carried out all the steps to do ___?" before you do your part in the process, they invariably missed something and your day is screwed up.

(background: we run a proprietary application from a vendor, and it has a year-end fiscal 'rollover' process. Person X is responsible for defining the rules of the rollover in the client app, then I run the server-side scripts to carry out the actual rollover on the Oracle DB. They forgot to do one step in their process, and only noticed it after the rollover was done... now it's "oh shit" time. There are backups in place, because I'm not stupid :) but it's still a pain in the butt and worthy of a facepalm. Of course, it doesn't help that the rollover process doesn't check that all the rules are defined or in place before running - because why would you want your proprietary/expensive application to do any validity checking?? *sigh*...)
 
hutch85":2ppzwx2e said:
That no matter how many times you ask someone "Did you make sure you carried out all the steps to do ___?" before you do your part in the process, they invariably missed something and your day is screwed up.

It may be worth it to set up a meeting with the other guy's team (including boss) and work out a checklist (possibly with swim-lanes, if you swing that way) that defines the whole process. That might sound like needless complication, but clearly it's not needless for your situation. Put the pain back on his group to improve the process. Might even find some ways to streamline or otherwise improve it.

And I do mean a checklist. Excel sheet, or Visio, electronic or printed; or a workflow tool like Oracle MOP if you've already got something lying around. The doofus handing stuff off to you has to have initialed his steps.
 
ronelson":37cvahtj said:
I have learned how frustrating trying to automate SSH is, especially when there is no support for pre-shared keys (IOS and ASAs). Net::SSH::Expect is a pretty nifty perl mod, but handling all the input is excessively annoying. On the upside, once I am done wasting my time on this, other people will have to waste less time. Er... Somehow it seems less helpful when I phrase it that way.
Have you tried using expect?
 

ronelson

Ars Legatus Legionis
21,399
Subscriptor
Have you tried using expect?
It is more painful. As a limited subset of tcl, it is difficult to do things like run a "sh flash" and make calculations to ensure the free space is correct, much less to modify the boot statements properly so it does not try and boot the old AND new firmware.

However, I think I may have to combine some interactive statements with Net::SSH::Expect and Net::SCP::Expect; TFTP is too fucking slow. Of course, we have to upgrade some customers whose firmware does not allow scp uploads, so... Sometimes I love my job, but other times I really fucking hate it.
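The free-space check mentioned above is mostly text parsing once the command output has been captured. A small sketch in Python rather than Perl or Tcl, assuming the captured "sh flash" output ends with a summary line of the form "N bytes total (M bytes free)"; that format, the function names, and the sample numbers are assumptions, and the regex would need adjusting for platforms that print something different.

```python
import re

# Assumed summary-line format; other platforms and versions may word it differently.
SUMMARY = re.compile(r"(\d+) bytes total \((\d+) bytes free\)")


def flash_free_bytes(show_flash_output):
    """Extract the free byte count from captured 'sh flash' output."""
    match = SUMMARY.search(show_flash_output)
    if match is None:
        raise ValueError("no 'bytes total (... bytes free)' summary line found")
    return int(match.group(2))


def image_fits(show_flash_output, image_bytes, headroom_bytes=0):
    """True if the new image plus optional headroom fits in the reported free space."""
    return flash_free_bytes(show_flash_output) >= image_bytes + headroom_bytes


# Hypothetical captured output, for illustration only.
sample = "32514048 bytes total (20841472 bytes free)"
print(image_fits(sample, image_bytes=15000000))
```

Raising an error when the summary line is missing keeps an upgrade script from proceeding on output it did not understand.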
 

Whittey

Ars Tribunus Militum
2,006
Heresiarch":184mxo2v said:
That occasionally and against all expectations, the backhoe operator has a clue and phones to ask about "the orange cable I've just found"

Yes, really "found" as opposed to "severed with my trusty bucket of doom"
I've never felt the urge to post this before but...

Pics or it didn't happen.


-=Whittey=-
 

ronelson

Ars Legatus Legionis
21,399
Subscriptor
However, I think I may have to combine some interactive statements with Net::SSH::Expect and Net::SCP::Expect; TFTP is too fucking slow.
Yeah, this was it. I think TFTP is just so god awful slow that sometimes the "ssh timeout 15" would kill the connection...but other times it would go all the way through without a problem. I changed it around to use FTP instead, which has the downside of another user/pass to configure, but the upside that it does the same thing in 3.5 minutes instead of 35.

FYI, Net::SSH::Expect is a Perl module that implements an SSH object that includes Expect syntax.
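One likely reason for the gap: classic TFTP (RFC 1350) sends one block, 512 bytes by default, and waits for its acknowledgement before sending the next, so transfer time grows with round-trip time rather than link speed. A back-of-the-envelope sketch, where the image size and round-trip time are made-up illustration values and not measurements from this setup:

```python
def tftp_lockstep_seconds(size_bytes, rtt_seconds, block_size=512):
    """Rough lower bound: one round trip per block, ignoring serialization and retries."""
    # A transfer ends with a block shorter than block_size, so an exact
    # multiple of block_size still needs one extra (empty) block.
    blocks = size_bytes // block_size + 1
    return blocks * rtt_seconds


# Hypothetical numbers: a 20 MiB image over a link with 50 ms round-trip time.
seconds = tftp_lockstep_seconds(20 * 1024 * 1024, 0.05)
print(f"{seconds / 60:.1f} minutes")
```

Under those assumptions the estimate lands in the tens of minutes regardless of bandwidth, which is the same order as the TFTP figure quoted above. Servers and clients that negotiate a larger block size (RFC 2348) would do better.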
 

Incarnate

Ars Tribunus Angusticlavius
9,004
Subscriptor++
Soko":2se7ho3k said:
ronelson":2se7ho3k said:
Today I learned, for the umpteenth time, that no plan survives contact with the enemy. And when I say enemy, I mean the lusers!

T,FTFY.
I learned a long time ago that treating users as the "enemy" and calling them "lusers" makes your job infinitely more difficult. Your job is to listen to their needs, support them, and help them if they don't understand how something works.
 

Matt Wallis

Ars Scholae Palatinae
1,268
Subscriptor++
Incarnate":ag0isk3f said:
I learned a long time ago that treating users as the "enemy" and calling them "lusers" makes your job infinitely more difficult. Your job is to listen to their needs, support them, and help them if they don't understand how something works.

Of course, this is what we _have_ to do, and your response is something I've had to push onto other SysAdmins, particularly Linux admins who think Windows Administration is an extreme form of punishment. Even in my more moderate days, I'm starting to agree with them.

But sometimes, usually after explaining the no clicking on attachments thing, again, you have to vent, and places like this tend to be outlets.

Now I shall return to trying to figure out what step in the Windows Server 2008 R2 Active Directory Domain Services wizard gave me a Read Only Domain Controller.
 

ronelson

Ars Legatus Legionis
21,399
Subscriptor
I learned a long time ago that treating users as the "enemy" and calling them "lusers" makes your job infinitely more difficult. Your job is to listen to their needs, support them, and help them if they don't understand how something works.
If you cannot vent on an internet forum, where can you vent??? Of course they are customers, I know that, but the only reason I can call them that is because I can vent the frustration somewhere.
 

Soko

Ars Praefectus
4,068
Subscriptor++
Incarnate":1x0yb8or said:
Soko":1x0yb8or said:
ronelson":1x0yb8or said:
Today I learned, for the umpteenth time, that no plan survives contact with the enemy. And when I say enemy, I mean the lusers!

T,FTFY.
I learned a long time ago that treating users as the "enemy" and calling them "lusers" makes your job infinitely more difficult. Your job is to listen to their needs, support them, and help them if they don't understand how something works.

Very true. My attitude in a forum haunted by my peers is very different than what it is in a professional setting.

While on the clock they're my customers and I do anything and everything I can to help them - with a smile. :)


Here, they can be lusers if I'm in that mood. :p
 