Linux remote-boot quick guide

Introduction

A remote-boot computer is a computer that does not rely on local resources (such as its hard disk) to start, but uses centralized remote resources (through the network) instead.

In the context of remote-booting, Linux can be used at both ends: as a remote-boot server or as a remote-boot client. This document describes each of these two alternatives, beginning with the client side.

Linux as a remote-boot client

The simplest way to remote-boot a Linux client is to let the bootstrap program download the kernel from a file server and immediately start it. This is the traditional configuration for diskless X terminals for instance.

Of course, this is not the most efficient way of doing it. If your Linux client is disk-based (i.e., if there is a hard disk in your Linux client), you can perform many operations on the disk before starting the kernel, and use it as a cache to reduce network load. But let's start with the simple case: just booting the kernel.

Remote-booting the kernel

Booting a Linux kernel basically means loading the file into the computer's RAM at the appropriate place and jumping into it. This is so simple that early Linux bootroms (etherboot, netboot) did not even need to load a bootstrap program to do it: they had the loader code built in. However, as the kernel format occasionally changes and new features are offered to the loader, it is wiser to leave this code out of the bootrom, and keep it in a place where it is easier to upgrade.

Starting the kernel is not the whole story. If you want your client to do more than display the kernel banner, you need to provide it with a root filesystem. Assuming you do not want to rely on any local resource, your choice is between:

  1. using a ramdisk (initrd)
  2. using a remote filesystem (NFS-root)

Using a ramdisk is the preferred way, because NFS-root is very inefficient and generates a lot of unnecessary network traffic. NFS-root is still widely used, mainly for historical reasons, as it was the only available solution for primitive loaders that did not support initrd.

If you use a ramdisk as the root filesystem, you will have to decide carefully what to put on it. Its size is limited, and it is not as convenient to maintain as a live shared filesystem. Basically, you will put in it the fundamental files needed to start a decent client, and add some of your most frequently used files. The rest (especially configuration files, when possible) should be mounted through NFS.

At this point, we come to an interesting problem: how to handle host-specific configuration? For network parameters, the traditional solution used by older loaders is to let the kernel discover them itself using the RARP or BOOTP protocol; modern loaders are able to pass them to the kernel as command-line arguments, avoiding some unnecessary network traffic. For the rest of the configuration, host customization is typically handled by smart startup scripts on the basis of the host's unique network parameters.
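As an illustration of passing network parameters on the command line, here is a minimal sketch that builds an `ip=` argument in the field order documented for the kernel's IP autoconfiguration (client, server, gateway, netmask, hostname, device, autoconf). Whether a given loader emits exactly this argument is an assumption; the function name and the default device `eth0` are hypothetical.

```python
def kernel_ip_argument(client_ip, server_ip, gateway_ip, netmask, hostname,
                       device="eth0", autoconf="off"):
    """Build an ip= kernel argument from parameters the loader already knows.

    Field order follows the kernel's IP autoconfiguration syntax:
    ip=<client>:<server>:<gateway>:<netmask>:<hostname>:<device>:<autoconf>
    With autoconf set to "off", the kernel does not query RARP/BOOTP itself.
    """
    fields = [client_ip, server_ip, gateway_ip, netmask, hostname,
              device, autoconf]
    return "ip=" + ":".join(fields)


# Example values taken from the sample network used later in this guide.
argument = kernel_ip_argument("192.168.1.100", "192.168.1.2",
                              "192.168.1.1", "255.255.255.0", "pxetest1")
print(argument)
```

A startup script can then read the hostname or address from `/proc/cmdline` and select host-specific files accordingly.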

Several tools are available for remote-booting Linux clients. The older ones are Etherboot and Netboot, which are programs to burn into an EPROM to make your own bootrom for some common network cards. This bootrom will let you download a kernel and start an NFS-root client.

More recent remote-boot tools are not burned into an EPROM, but are bootstrap programs designed to be used with some standard types of bootroms. Intel provides, free of charge, a full-featured Linux bootstrap loader, including ramdisk support, to be used with any PXE-compliant bootrom. This package is known as the Intel PXE PDK for Linux. BpBatch can also be used as a Linux bootstrap loader with PXE-compliant bootroms or with the Incom/Bootix TCP/IP bootprom. It has many additional features (such as caching), but in contrast with the loader from Intel, it does not make use of a multicast protocol, and is therefore less robust when the cache feature is not used. Finally, Beoboot is a commercial Linux bootstrap loader from Rembo Technology, especially designed for large clusters of remote-boot clients, for which MTFTP-based loaders (such as Intel's loader) are not robust enough. Beoboot works with PXE-compliant bootroms only.

Disk-based remote-boot

When the client computer has a hard disk (as is almost always the case nowadays), there is much to gain from it in the context of remote-booting. And contrary to a widespread belief, a properly configured disk-based remote-boot client is as safe and robust as a diskless client.

There are three ways to safely use a hard disk for remote-booting Linux, which can be freely combined:

  1. as a cache for the kernel and ramdisk images

  2. as a cache for a remote (NFS-mounted) filesystem

  3. as a giant "ramdisk" (that is, as a volatile storage medium that is entirely refreshed at each boot).

To be safe, a disk-based cache has to be validated by some kind of hash function, in order to ensure that the data it holds is valid and up-to-date. The only remote-boot package that currently supports caching kernel and ramdisk images for Linux is BpBatch.
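The validation step can be sketched as follows. This is only an illustration: SHA-256 is an assumed choice of hash (the hash actually used by BpBatch is not stated here), and the function names and the idea that the server publishes an expected digest are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_is_valid(cached_path, expected_digest):
    """True only if the cached image exists and matches the expected digest.

    expected_digest is assumed to come from the server, so a stale or
    corrupted local copy is detected before it is booted.
    """
    if not os.path.isfile(cached_path):
        return False
    return file_digest(cached_path) == expected_digest
```

When `cache_is_valid` returns False, the client would download the image again and overwrite the cached copy; only the small digest has to cross the network on every boot.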

Remote filesystem caching is not part of the standard kernel distribution, but was contributed some years ago by Unifix GmbH under the name filecache. Their implementation consists of a kernel patch and a daemon, and has been successfully ported up to the last 2.0 kernels. Unfortunately, a system call conflict appeared with 2.2 kernels, and as the company seems to have disappeared, there is very little chance that the filecache will be ported to more recent kernels. We are still waiting for someone to write a new filecache.

Using a hard disk as a volatile storage medium involves either blindly rewriting it completely at each boot, or verifying each file by a hash function. BpBatch uses the first approach, and is able to completely rewrite a ready-to-use ext2 filesystem at a rate of up to 3 MB/second. We do not know of any other package that has this capability.
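For comparison, the second approach (verifying each file) could look like the sketch below. It is not how BpBatch works, since BpBatch rewrites the whole filesystem; the manifest format (relative path mapped to a SHA-256 digest) and the function name are assumptions made for the example.

```python
import hashlib
import os


def stale_files(root, manifest):
    """Return the relative paths that must be refreshed from the server.

    manifest maps a relative path to the expected SHA-256 hex digest.
    A file is stale if it is missing, unreadable, or has another digest.
    """
    stale = []
    for relative_path, expected in sorted(manifest.items()):
        full_path = os.path.join(root, relative_path)
        try:
            with open(full_path, "rb") as handle:
                actual = hashlib.sha256(handle.read()).hexdigest()
        except OSError:
            actual = None  # missing or unreadable counts as stale
        if actual != expected:
            stale.append(relative_path)
    return stale
```

The trade-off is that verification reads every file on every boot but writes only what changed, while a blind rewrite needs no manifest and always leaves the disk in a known state.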

Linux as a remote-boot server

Linux can be used as the server for remote-boot clients by providing DHCP and TFTP services to bootroms. Installation and configuration of services for operation with PXE bootroms will be discussed in this section.

BOOTP/DHCP server

We strongly recommend using DHCP instead of BOOTP on your Linux server. DHCP is easier to install and has more features than BOOTP. Our preferred DHCP server on Linux is the ISC DHCP daemon, version 2.0 or version 3.0. At the time of this writing, version 3.0 is still in beta stage.

The first step of the installation is to get the binaries of the ISC DHCP server. RedHat distributions (at least 5.x and 6.0) come with the ISC DHCP 2.0 server preinstalled. If the DHCP daemon is not installed on your Linux server, download it from www.isc.org. Once it is installed, you can change DHCP parameters by editing the file /etc/dhcpd.conf.

A PXE bootrom requires specific DHCP parameters. If these parameters are not present in the DHCP reply, the PXE bootrom will display an error message and halt. Here is the list of required DHCP parameters:

  • Basic IP information: IP address, default gateway, subnet mask.
  • Bootrom information: the server IP address and a boot filename.
    The bootrom will load the specified filename using TFTP on the given server IP address. This filename will then be executed.
  • PXE-specific information: vendor class set to "PXEClient" and a valid set of encapsulated options (TAG 43)
    Encapsulated vendor options describe PXE-specific parameters. You can find a list of these parameters in the Intel PXE PDK documentation.

Additionally, you may want to define parameters used by the operating system started by the PXE bootrom: a valid hostname, DNS server and DNS domain, NIS server and domain, etc.

Here is a sample DHCP v2.0 configuration file. In this example, our server's IP address is 192.168.1.2 and the default gateway is 192.168.1.1. We do not use dynamic addressing; instead, we allocate a static IP address to each PXE host. PXE hosts are identified by their hardware address.

#
# DHCP configuration file. ISC DHCP server v2.0
#

#
# Global parameters
#
# Use declaration identifier as hostname
use-host-decl-names on;


#
# Shared-network definition
#
shared-network companynet {
  #
  # Company-wide parameters
  #
  option domain-name "company.com";

  #
  # Subnet definition
  #
  subnet 192.168.1.0 netmask 255.255.255.0 {
    #
    # Subnet-specific information
    #
    # Default gateway
    option routers 192.168.1.1;
    # DNS server
    option domain-name-servers 192.168.1.2;

    #
    # PXE group declaration
    #
    group {
      #
      # PXE specific parameters
      #
      # Infinite lease time
      default-lease-time -1;
      # TFTP server IP address
      next-server 192.168.1.2;
      # Name of the bootstrap program
      filename "bpbatch";
      # Vendor class setup for PXE
      option dhcp-class-identifier "PXEClient";
      # Vendor-specific parameters
      # Since we do not use PXE parameters in
      # this example, we set this option to
      # 01:04:00:00:00:00 which means 'NULL parameter'
      option vendor-encapsulated-options 01:04:00:00:00:00;
      # BpBatch specific parameters
      option option-135 "bpbscript";
      # User-level parameters (opt 128 to 135 free for use)
      option option-132 "workgroup";

      #
      # PXE hosts
      #
      host pxetest1 {
        hardware ethernet 00:54:55:56:67:68;
        fixed-address 192.168.1.100;
      }
      host pxetest2 {
        hardware ethernet 00:54:55:56:67:69;
        fixed-address 192.168.1.101;
      }
    }
  }
}
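The value 01:04:00:00:00:00 given to vendor-encapsulated-options in the v2.0 configuration can be read as one code/length/value entry: sub-option code 1 (the multicast IP address of the bootfile), length 4, value 0.0.0.0. The following minimal sketch shows that encoding; the helper names are hypothetical, and the sketch emits no end-of-options marker, mirroring the sample value rather than a full PXE option set.

```python
import ipaddress


def encode_suboption(code, payload):
    """Encode one encapsulated sub-option: code byte, length byte, payload."""
    if not 0 < code < 255:
        raise ValueError("sub-option code must be between 1 and 254")
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    return bytes([code, len(payload)]) + payload


def as_dhcpd_hex(data):
    """Format bytes as colon-separated hex, as written in dhcpd.conf."""
    return ":".join(f"{byte:02x}" for byte in data)


# Sub-option 1 carrying the address 0.0.0.0, as in the sample configuration.
mtftp_ip = ipaddress.IPv4Address("0.0.0.0").packed
print(as_dhcpd_hex(encode_suboption(1, mtftp_ip)))  # 01:04:00:00:00:00
```

Several sub-options would simply be concatenated before formatting.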

Here is the same configuration, but for ISC DHCP server v3.0:

#
# DHCP configuration file. ISC DHCP server v3.0
#

# Global options
option subnet-mask 255.255.255.0;
default-lease-time -1;

# Definition of PXE-specific options
# Code 1: Multicast IP address of bootfile
# Code 2: UDP port that client should monitor for MTFTP responses
# Code 3: UDP port that MTFTP servers are using to listen
#         for MTFTP requests
# Code 4: Number of seconds a client must listen for activity before
#         trying to start a new MTFTP transfer
# Code 5: Number of seconds a client must listen before trying to
#         restart an MTFTP transfer
option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;  
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;

# Subnet-specific options
subnet 192.168.1.0 netmask 255.255.255.0 {
  option routers 192.168.1.1;
  
  # Host specific options
  host pxetest1 {
    hardware ethernet 00:01:02:03:04:05;
    filename "bpbatch";
    next-server 192.168.1.2;
    fixed-address 192.168.1.100;
    # PXE specific options
    class "pxeclients" {
      match if substring (option vendor-class-identifier, 0, 9) =
        "PXEClient";
      option vendor-class-identifier "PXEClient";
      # At least one of the vendor-specific options must be set.
      # We set the MCAST IP address to 0.0.0.0 to be PXE compliant
      option PXE.mtftp-ip 0.0.0.0;
      vendor-option-space PXE;
    }
  }
}
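Because every PXE host gets a static address keyed on its hardware address, the host declarations are repetitive and can be generated from an inventory. Here is a minimal sketch, assuming a hypothetical list of (name, hardware address, IP address) tuples; it only emits the `hardware ethernet` and `fixed-address` statements used in the samples above.

```python
import ipaddress


def host_declaration(name, mac, ip):
    """Return one dhcpd.conf host block for a statically addressed client."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )


# Hypothetical inventory, reusing the example hosts of this guide.
inventory = [
    ("pxetest1", "00:54:55:56:67:68", "192.168.1.100"),
    ("pxetest2", "00:54:55:56:67:69", "192.168.1.101"),
]
print("".join(host_declaration(*entry) for entry in inventory))
```

The generated text is meant to be pasted inside the group or subnet declaration, so that the hosts inherit the PXE parameters defined there.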

Note for BpBatch users: BpBatch gets its arguments from DHCP option 135. If you plan to use DHCP site-specific options in your script, note that you can only use options 128 to 134. Other options are not processed by PXE bootroms.

TFTP server

TFTP is the protocol used by bootroms to get files from the server. Any TFTP server can be used with PXE bootroms, but we recommend using an enhanced server that supports large block transfers. This will speed up large TFTP downloads. Enhanced servers also support MTFTP, a multicast variant of the TFTP protocol, used to reduce the amount of traffic generated.

Two enhanced TFTP servers are available: the Incom/Bootix TFTP server and the Intel TFTP server for Linux. At this time, Intel's server is still in beta stage, and should not be used for production.

You can find the Incom TFTP server in our distribution directory. This server supports three modes of operation: standard TFTP service on port 69, large packets service on port 59, and MTFTP. Since BpBatch does not use MTFTP, we will focus on the first two modes of operation.
The setup of the TFTP server is very easy: create a TFTP directory for storing the files available by TFTP and run the TFTP daemon. Here is an example of command-line options (the Incom TFTP server cannot be started from inetd):
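To see why larger blocks help, recall that TFTP sends one data packet at a time and waits for its acknowledgement, and that a transfer ends with a packet shorter than the block size (possibly empty). A rough sketch of the packet count follows; the 1 MiB file size is an arbitrary example, and the function name is hypothetical.

```python
def tftp_data_packets(file_size, block_size=512):
    """Number of data packets needed to transfer file_size bytes.

    Full blocks are followed by one final short block, which is empty
    when the size is an exact multiple of the block size.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("sizes must be positive")
    return file_size // block_size + 1


size = 1024 * 1024  # a 1 MiB image, chosen only for illustration
print(tftp_data_packets(size, 512))   # 2049 packets with standard blocks
print(tftp_data_packets(size, 1408))  # 745 packets with 1408-byte blocks
```

Since each packet costs a network round trip, cutting the packet count by almost two thirds reduces the time spent waiting for acknowledgements accordingly.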

  tftpd.incom -c 64 -d /tftpboot -h -i 0 -k 5 -l /var/log/tftp.log -r \
    -s 1408 59 -v 2

The above example sets up a TFTP daemon with files in /tftpboot, large packets (1408 bytes) service on port 59 and maximum verbosity (-v 2).

If you plan to use the Intel TFTP server, download it from the Intel website and install it as an inetd service. The Intel TFTP server supports large packets service through the blksize TFTP option and supports MTFTP as well. You can find additional information in the PXE Linux PDK documentation.
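For reference, the blksize option is negotiated in the read request itself: the option name and its value are appended as NUL-terminated strings after the filename and the transfer mode. The sketch below only builds such a request packet; the function name is hypothetical, and the value 1408 is borrowed from the Incom example rather than being something the Intel server is known to require.

```python
import struct


def build_rrq(filename, block_size=None, mode="octet"):
    """Build a TFTP read request, optionally carrying a blksize option."""
    packet = struct.pack("!H", 1)  # opcode 1 = read request (RRQ)
    packet += filename.encode("ascii") + b"\0"
    packet += mode.encode("ascii") + b"\0"
    if block_size is not None:
        # Valid block sizes for the blksize option range from 8 to 65464.
        if not 8 <= block_size <= 65464:
            raise ValueError("block size out of range")
        packet += b"blksize\0" + str(block_size).encode("ascii") + b"\0"
    return packet


print(build_rrq("bpbatch", 1408))
```

A server that does not understand the option ignores it and answers with ordinary 512-byte blocks, which is why a client can fall back to standard TFTP.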

Note for BpBatch users: the boot filename DHCP parameter is used by BpBatch to determine which kind of TFTP is installed. If the filename is "bpbatch", BpBatch will use standard TFTP services on port 69. If the filename is "bpbatch.P", then BpBatch will use large packets service on port 59. Finally, if the filename is "bpbatch.B", then BpBatch will use large packets service on port 69 using the blksize option.
The following list will help you choose the name of the boot file depending on your TFTP server:

  • Incom/Bootix TFTP server: bpbatch.P
  • Intel (and Bootware) TFTP server: bpbatch.B
  • Any other TFTP server not supporting the blksize option: bpbatch
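The list above amounts to a small lookup, sketched here for scripts that generate the DHCP configuration. The server labels used as keys ("incom", "intel", "bootware") are hypothetical identifiers chosen for this example; only the filenames come from this guide.

```python
# Boot filename per TFTP server type; keys are illustrative labels.
BOOT_FILENAMES = {
    "incom": "bpbatch.P",     # large packets service on port 59
    "intel": "bpbatch.B",     # large packets via the blksize option
    "bootware": "bpbatch.B",
}


def boot_filename(server_kind):
    """Return the boot filename BpBatch expects for a TFTP server type.

    Unknown servers fall back to plain "bpbatch" (standard TFTP, port 69).
    """
    return BOOT_FILENAMES.get(server_kind.lower(), "bpbatch")


print(boot_filename("Incom"))
print(boot_filename("tftpd-hpa"))
```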

Links and related documentation

Visit the Rembo site and the Beoboot site.