Archive for the ‘Linux’ Category

EPEL NSD RPM and the missing PID file directory

Sunday, June 26th, 2011

NSD is a fantastic authoritative nameserver from NLnet Labs which was developed in conjunction with the RIPE NCC to be a highly scalable, secure authoritative nameserver which has no recursive features by design. In fact, it is such as good nameserver that it is used on three of the root namesevers (k.root-servers.net, h.root-servers.net and l.root-servers.net).

Thanks to the EPEL project run by the Fedora guys, you can quickly and easily install an up to date copy of NSD on CentOS/RHEL systems. The only problem that I have found so far is that the RPM doesn’t seem to create directory for the PID file specified in the /etc/nsd/nsd.conf and so the daemon won’t start out of the box.

Obviously it is easy enough to create the /var/run/nsd directory with mkdir, but remember to chown/chgrp this directory to the nsd user and group, otherwise and “nsdc restart” will fail with errors in /var/log/messages along the lines of “failed to unlink pidfile /var/run/nsd/nsd.pid: Permission denied

Recovering Logical Volumes deleted with lvremove

Sunday, March 20th, 2011

Need to recover an LVM Logical Volume that you deleted by mistake? No problem, luckily LVM archives all of the old Logical Volume and Volume Group metadata in /etc/lvm/archive/ whenever you use something like lvremove to make adjustments to the Logical Volumes.

Before you start, remember that any kind of operation like this is potentially dangerous and so backup anything of importance on the intact Logical Volumes. Also, as a safety net, backup the contents of /etc/lvm just in case and then run “vgcfgbackup” to put an up to date copy of the metadata in /etc/lvm/backup.

Now look for the appropriate file in /etc/lvm/archive. Each LVM operation will create a file in here with the name of the affected Volume Group and an incrementing number. By default on RHEL/CentOS this will be “VolGroup00″, so you will be looking for /etc/lvm/archive/VolGroup00_xxx.vg where xxx is the appropriate increment.

If you open up your chosen file in your favourite text editor you should see a line that starts “description” and has something like “Created *before* executing ‘lvremove -f /dev/VolGroup00/xxx’” on it. You can use this to verify that you have the right file.

The next thing that you need to do is verify that this file has valid metadata and thus will be useable. To do this, you can use the vgcfgrestore command in a test mode that will perform a dry run. Assuming you are trying to restore Volume Group VolGroup00 from VolGroup00_00161.vg, this would look like:

vgcfgrestore VolGroup00 –test -f /etc/lvm/archive/VolGroup00_00161.vg

Which will return something along the lines of:

Test mode: Metadata will NOT be updated.
Restored volume group VolGroup00

Assuming this all went well, you can now re-run the same command but without the “–test” option:

# vgcfgrestore VolGroup00 –test -f /etc/lvm/archive/VolGroup00_00161.vg
Restored volume group VolGroup00

Now run an “lvscan” and you should see your missing Logical Volume(s) have returned, but are inactive. Re-activate them with “lvchange” and you are back in business, just be more careful next time ;)

lvchange -a y /dev/VolGroup00/xxx

Using kill to display dd progress

Tuesday, November 9th, 2010

How long has that dd process been running for now? Is it even doing anything? How long is it going to take?

If you want dd to give you a progress update, then find out the process ID (PID) and then send the USR1 signal to it with

kill -USR1

And dd will then print out the same records in/out, bytes copied, time taken and overall speed to STDERR as it would when it finishes.

It doesn’t matter if you are re-directing STDOUT (such as to pipe the data stream to another machine via netcat or even compressing it with gzip) but make sure that you aren’t sending STDERR anywhere such as /dev/null

Make sure you specify the “-USR1” argument to kill as you don’t want to send SIGTERM to dd by mistake!
By default, kill will send SIGTERM (or SIGKILL if you use kill -9) to the specified process, but using “-USR1” you are telling kill to send a different signal, in this case one that causes dd to print the progress summary and so you aren’t actually going to “kill” the process.

You can even use have the progress refresh every few seconds with a command such as

watch -n 10 kill -USR1 $PID

Just replace $PID with the PID of the running dd process (or set the PID environment variable to the process ID).
If dd was the last command you ran, then you can get the PID with the special $! variable, otherwise you’ll have to use ps to find the PID.

The dreaded scrolling GRUB GRUB GRUB on startup

Thursday, September 16th, 2010

Rebooting a server is always stressful, particularly when you don’t have immediate physical access to it. Of course, when the server inevitably doesn’t come back up you need to either get directly on the local console or connect in via KVMoIP and one of the worst things you can see is “GRUB GRUB GRUB” scrolling past endlessly.

This is a sign that stage 1 of the GRUB bootloader which is stored in the Master Boot Record (MBR) has somehow become corrupted and do GRUB can’t start. There is no way to even get into the GRUB command line and boot the system manually or even troubleshoot further as the problem is with stage 1 and not stage 2.

As I ran into this on a CentOS machine, I used a netinstall CD with the virtual media feature on an IP KVM attached to the server to boot into rescue mode and chroot into the operating system installed on the drive in question. I could then identify the /boot hard drive number and partition from /boot/grub/menu.lst ready to re-install GRUB and point it at the stage 2 files.

Simply run /sbin/grub to get to a version of the GRUB command prompt and then (assuming /boot/grub/menu.lst references root {hd0,0) for each of the menu options) just run:

root (hd0,0)
setup (hd0)

You should see a series of messages about looking for stage 1.5 and 2 files and then that it has successfully embedded. Congratulations, GRUB has now been re-installed and simply rebooting your machine should take you straight into your operating system as normal.

Python setuptools and get_python_version is not defined

Sunday, September 12th, 2010

If you run into the below error when using setuptools (setup.py), then it’s quite possible that you’re using an outdated version of Python’s setuptools. In particular, the python-setuptools package in the CentOS yum repository is too old.

Traceback (most recent call last):
File “setup.py”, line 19, in ?
setup(**metadata)
File “/usr/lib64/python2.4/distutils/core.py”, line 149, in setup
dist.run_commands()
File “/usr/lib64/python2.4/distutils/dist.py”, line 946, in run_commands
self.run_command(cmd)
File “/usr/lib64/python2.4/distutils/dist.py”, line 966, in run_command
cmd_obj.run()
File “/usr/lib/python2.4/site-packages/setuptools/command/bdist_rpm.py”, line 28, in run
_bdist_rpm.run(self)
File “/usr/lib64/python2.4/distutils/command/bdist_rpm.py”, line 377, in run
self.move_file(rpm, self.dist_dir)
File “/usr/lib/python2.4/site-packages/setuptools/command/bdist_rpm.py”, line 20, in move_file
getattr(self.distribution,’dist_files’,[]).append(
NameError: global name ‘get_python_version’ is not defined

Luckily, this is quite easy to fix; simply remove the RPM and download the latest version from http://pypi.python.org/pypi/setuptools then just run it with “sh” as if it was a normal shell script.

Sysconfig ifcfg scripts and VLAN sub-interfaces

Monday, August 16th, 2010

If you are using the ifcfg scripts in /etc/sysconfig/network-scripts to bring up VLAN sub-interfaces on a NIC and you are getting messages such as:

Bringing up interface eth0.200: Device eth0.200 does not seem to be present, delaying initialization.

instead of

Bringing up interface eth0.200: Added VLAN with VID == 200 to IF -:eth0:-

as you would expect, then make sure that you have the vconfig RPM installed.

HyperVM and yum update Transaction Check Errors

Monday, August 16th, 2010

If you’re having file conflict problems when running “yum update” on servers with the lxlabsupdate repository for HyperVM (or Kloxo) installed then there’s a simple resolution:

cd /var/cache/yum/lxlabsupdate/packages/
rpm -Uvh *.rpm –replacefiles –replacepkgs

This should fix errors such as:

file /usr/share/man/man1/pcregrep.1.gz from install of pcre-8.02-1.el5_5.1.x86_64 conflicts with file from package pcre-6.6-2.el5_1.7.i386
file /usr/share/man/man1/pcretest.1.gz from install of pcre-8.02-1.el5_5.1.x86_64 conflicts with file from package pcre-6.6-2.el5_1.7.i386

Restoring the contents of /dev

Sunday, July 18th, 2010

Have you ever deleted everything out of /dev by accident (or even on purpose)? Although it may seem that all is lost or that you have a lot of work ahead of you, it’s actually quite easy to restore on a modern Linux system such as CentOS 5 (or the RHEL equivalent).

The first thing you need to know is that CentOS and Red Hat use udevd, which means that the entries in /dev are dynamically created by the udev daemon and restarting this daemon will force it to re-create everything in /dev, just as it would when you start your computer up. This daemon isn’t controller in the normal way through the /etc/init.d scripts though, all you need to run is:

/sbin/start_udev

This will kill any copies of udev running and then start it back up, re-creating the /dev entries in the process. This seems to be quite safe to do on a production system, but it might be wise to only do this if you really have to, as if you haven’t damaged the contents of /dev, then some of your applications may not take kindly to the contents disappearing.

This will have re-created most of your device nodes in /dev, but there are still a couple of important ones missing, namely those used by device-mapper and LVM. You can get these back with the following two commands:

dmsetup mknodes
vgmknodes

The first of which will re-create entries under /dev/mapper and the second of which will re-create LVM volume group entries under /dev/ such as /dev/VolGroup00/ by default on CentOS or Red Hat.

Helpfully this will save someone a real headache or even rebuilding/restoring from backup unnecessarily. Just be more careful with rm next time! ;)

Retrieve the Dell PowerEdge Service Tag remotely from Windows or Linux

Saturday, June 19th, 2010

Have you ever wanted to get the Dell Service Tag from a PowerEdge machine that you don’t have physical access to? Well it’s actually quite easy as Dell make this available through the standardised Desktop Management Interface (otherwise known as DMI) framework, so you don’t even have to install any of Dell’s OpenManage tools to view it!

On a Linux system, you just need to run the following as root:

/usr/sbin/dmidecode -s system-serial-number

On a Windows box, you can accomplish the same thing from the command prompt with:

wmic bios get serialnumber

Both of these tools should be installed by default on the respective operating system. If you have some kind of super stripped down installation, then they are available from the vendor’s original media.

Intel VT Virtualisation Technology on Dell PowerEdge servers

Saturday, June 19th, 2010

Somewhat annoyingly, Dell seem to like to disable Intel’s VT (Virtualisation Technology, sometimes called VMX) in the BIOS on their Dell PowerEdge servers, which means that you can’t use the Xen hypervisor to virtualise Microsoft Windows Server without changing this setting, which requires a reboot of the server to take effect.
You can use omreport from the Dell OpenManage Server Administrator software to check whether or not you have Intel Virtualisation Technology enabled.
If you haven’t got OpenManaged Server Administrator installed, then you can enable the Dell yum repository for CentOS/Red Hat systems and install it with:

wget -q -O – http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash
yum -y install srvadmin-base
/opt/dell/srvadmin/sbin/srvadmin-services.sh start

Once you’ve got the Dell OpenManage Server Administrator services running, you can take a look at what processor is installed in your system and what the current BIOS settings are with:

omreport chassis processors
omreport chassis biossetup

The two attributes that you’re looking for are Processor Virtualization Technology (which needs to be enabled) and Demand-Based Power Management (which needs to be disabled).

If you need to change them, then you can do this with:

omconfig chassis biossetup attribute=cpuvt setting=enabled
omconfig chassis biossetup attribute=dbs setting=disabled

Once that's done, then verify the new settings by running omreport chassis biossetup again and then once you’ve rebooted the server you can start taking advantage of the hardware virtualisation provided by Intel’s Virtualisation Technology.