Ed's tech corner: June 2012

This is a how-to for easily creating a very small Linux box that boots over PXE and runs purely from RAM. The topic is introductory (and not new at all); it sets the stage for further posts about scalability, load balancing and high availability. That's why I mention clustering so often, and why I start simple and small by preparing a single node that boots smoothly in a controlled environment.

Why RAM-only & diskless?

Such a configuration can bring many advantages if you plan to assemble a cheap cluster with low maintenance costs and no need for persistence. Consider the following:

ram availability: RAM is cheaper and faster than it was in past years
modern hardware, big ram: a lot of hardware supports large amounts of installed RAM, from 1 GB to 128 GB and beyond
Linux rocks: 64-bit Linux systems are able to manage a lot of RAM efficiently
HDD & the planet: electromechanical HDDs have serious implications in energy consumption, recycling and NOISE, and are more susceptible to failures
SSD & your wallet: SSDs are more advanced than their electromechanical counterparts (less susceptible to physical shock, silent, with lower access time and latency) but, at present market prices, more expensive per unit of storage. So if your problem is not storage, just processing, caching and networking, you are in the right place!
less is sometimes cheaper: it's not a bad idea if you have a good chance to buy cheaper nodes, in parts or complete, w/o HDD
crashing doesn't matter: if a misbehaving node crashes you just need to restart it and it'll wake up again in a healthy state. A single node's state doesn't drift over time
scaling better: adding a node to the cluster is easy, just connect it, enable PXE boot and add an entry in DHCP config
network congestion is reduced: the RAM filesystem is copied once per boot to the target node

Life is easy and the cluster's maintenance costs are reduced, but remember that this only holds if you don't need persistence on every single node, just CPU power, networking and RAM.

When don't I need persistence?

I don't have a full inventory of persistence-less, memory-and-network-only scenarios, just a short practical list. I'm sure you can see the benefits:

cryptographic stuff, privacy: you need to run a cryptographic algorithm and ensure a full cleanup of private keys after the execution is complete. Formatting an HDD is sometimes not enough, whereas recovering data from RAM after a full power-off is very difficult if not impossible. An encrypted filesystem on top of the RAM should also be challenging for attackers
caching efficiently: if you have enough RAM and your backend cluster is under a constantly growing demand for static content, you can delegate all your caching needs to a dedicated frontend cluster running purely in RAM and relieve the backend servers, which then only process dynamic content
time only algorithms: many algorithms only need processing power and have a low/medium memory footprint; some of them only need volatile (non-persistent) memory for allocating data structures
display only apps: some software solutions only need to display incoming data via graphs, video streaming, etc. So a good display, a RAM-only system and a network are enough

What will I obtain at the end of this guide? A Linux box, named rambox, running purely in RAM, meaning that the root (/) filesystem is mounted in RAM. That's why saving memory is a priority, and why we avoid a filesystem full of never-used files, which would only increase memory usage.

We'll also compile a customized Kernel to shrink it, with a "minimal" set of features incorporated. Keep it simple and small! At this point you should be careful not to omit mandatory kernel features. There's another set of features that are not mandatory but useful to obtain the best performance. They mainly depend on your hardware, so take care of them.

What's a RAM filesystem? A filesystem mounted in RAM isn't a new invention; it's an awesome Kernel feature mostly used to load firmware/modules before starting the normal boot process. It's called initrd or initramfs. There are differences between the two (see references) and we'll be using initramfs.

What do I need?

For this guide I use two KVM-virtualized computers, running in a CentOS 6.2 host with bridged networking. For simplicity, the host and the two guests are in the same subnetwork

pxe: a server computer with CentOS 6.2 amd64 installed, w/ 16 GB HDD, 1 GB RAM, no GUI, networking. With DHCP and TFTP role. Static IP = 192.168.24.202, subnet =192.168.24.0/24
rambox: a RAM-only computer, w/o HDD installed, w/ networking. With cluster node role. Dynamic DHCP-designated IP = 192.168.24.203, subnet =192.168.24.0/24

NOTE: The BIOS used for QEMU and KVM virtual machines (SeaBIOS) supports an open source implementation of PXE, named gPXE, so a KVM-based virtual machine is able to boot via the network. Nowadays almost any motherboard should have a BIOS with PXE support. Ensure that your rambox supports it by checking the BIOS setup.

How does it work?

In summary, when the rambox wakes up with PXE boot activated:

the BIOS PXE boot loader requests an address from the DHCP server
the DHCP server offers an IP address, the TFTP server's IP address (itself, in this setup), and the Linux PXE boot loader's location on the TFTP server
the BIOS PXE boot loader downloads the Linux PXE boot loader from the TFTP server
the Linux PXE boot loader takes control and uses the same IP configuration to connect to the TFTP server and fetch two files: the kernel and the ramdisk
the Kernel takes control and configures its network interface, either statically or by performing a second round of DHCP requests, depending on its parameters
the Kernel uncompresses the ramdisk in memory
the RAM disk is mounted on / and the /init script gets invoked

What do we have to configure and where? Everything takes place on the pxe server computer:

Install and configure a DHCP server with support for PXE extensions
Install and configure a TFTP server
Create a reduced ramdisk with a minimal set of utils and programs
Compile and optionally shrink the Kernel to include support for Kernel-level IP configuration, including NIC drivers
Locate all the stuff in the correct place and wake up the rambox!
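
As a quick way to see how far along these steps you are, here is a minimal sketch of a checker for the TFTP root. The function name check_tftp_root is my own invention; the file names are the ones used later in this guide, and the default directory is the tftp-server default on CentOS.

```shell
# Sketch: report which of the files that this guide places in the TFTP root
# are still missing. $1 = TFTP root (default: /var/lib/tftpboot).
check_tftp_root() {
  local root="${1:-/var/lib/tftpboot}" missing=0 f
  for f in pxelinux.0 bzImage initramfs.cpio.gz; do
    if [ ! -f "$root/$f" ]; then
      echo "missing file: $root/$f"
      missing=1
    fi
  done
  if [ ! -d "$root/pxelinux.cfg" ]; then
    echo "missing directory: $root/pxelinux.cfg"
    missing=1
  fi
  # Exit status 0 means everything is in place.
  return "$missing"
}
```

Run it without arguments on the pxe server; it prints nothing and returns 0 once the boot loader, its config directory, the Kernel and the ramdisk are all there.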

There are several detailed explanations of the Linux boot process; some of them are outdated but still useful. For now, I won't give a full description of every single step of the boot process, ramdisk, PXE, Kernel-level IP, etc. (see references)

Hands on Bash

Installing phase

Install the dhcp, tftp-server and syslinux packages; syslinux contains the Linux PXE boot loader:

$ sudo yum install dhcp tftp-server syslinux

Additionally, install some tools:

$ sudo yum install bc wget

Finally install kernel packages for kernel compilation. These packages ensure that you have all the required tools for the build:

$ sudo yum install kernel-devel
$ sudo yum groupinstall "Development Tools"

# This is required to enable a make *config command to execute correctly. 
$ sudo yum install ncurses-devel

# These are required when building a CentOS-6 kernel. 
$ sudo yum install hmaccalc zlib-devel binutils-devel elfutils-libelf-devel 

# These are required when working with the full Kernel source
$ sudo yum install rpm-build redhat-rpm-config unifdef

# These are needed by kernel-2.6.32-220.el6
$ sudo yum install xmlto asciidoc newt-devel python-devel perl-ExtUtils-Embed

Configure DHCP

Ensure that the dhcpd starts at boot time:

$ sudo chkconfig --level 35 dhcpd on
$ chkconfig --list dhcpd
dhcpd              0:off    1:off    2:off    3:on    4:off    5:on    6:off

Edit dhcpd.conf, adding the PXE-specific options:

$ sudo nano /etc/dhcp/dhcpd.conf

# dhcpd.conf
#
# DHCP configuration file for ISC dhcpd
#

# Use this to enable / disable dynamic dns updates globally.
ddns-update-style none;

# Definition of PXE-specific options
# Code 1: Multicast IP address of boot file server
# Code 2: UDP port that client should monitor for MTFTP responses
# Code 3: UDP port that MTFTP servers are using to listen for MTFTP requests
# Code 4: Number of seconds a client must listen for activity before trying
#         to start a new MTFTP transfer
# Code 5: Number of seconds a client must listen before trying to restart
#         a MTFTP transfer
option space PXE;
option PXE.mtftp-ip               code 1 = ip-address;  
option PXE.mtftp-cport            code 2 = unsigned integer 16;
option PXE.mtftp-sport            code 3 = unsigned integer 16;
option PXE.mtftp-tmout            code 4 = unsigned integer 8;
option PXE.mtftp-delay            code 5 = unsigned integer 8;
option PXE.discovery-control      code 6 = unsigned integer 8;
option PXE.discovery-mcast-addr   code 7 = ip-address;

subnet 192.168.24.0 netmask 255.255.255.0 {

  class "pxeclients" {
    match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
    option vendor-class-identifier "PXEClient";
    vendor-option-space PXE;

    # At least one of the vendor-specific PXE options must be set in
    # order for the client boot ROMs to realize that we are a PXE-compliant
    # server.  We set the MCAST IP address to 0.0.0.0 to tell the boot ROM
    # that we can't provide multicast TFTP (address 0.0.0.0 means no
    # address).
    option PXE.mtftp-ip 0.0.0.0;

    # This is the name of the file the boot ROMs should download.
    filename "pxelinux.0";

    # This is the name of the server they should get it from.
    next-server 192.168.24.202;
  }

  pool {
    max-lease-time 86400;
    default-lease-time 86400;
    range 192.168.24.203 192.168.24.203;
    deny unknown-clients;
  }

  host rambox {
    hardware ethernet 08:00:07:26:c0:a5;
    fixed-address 192.168.24.203;
    option host-name "rambox01.home.dev";
  }

}
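
Adding a node to the cluster means adding one more host declaration inside the subnet block. A minimal sketch of a generator for such a stanza follows; the function name dhcp_host_entry and the second node (rambox02, with its MAC and IP) are made up for illustration.

```shell
# Sketch: print a dhcpd.conf host declaration for one more node.
# $1 = node name, $2 = MAC address, $3 = fixed IPv4 address.
dhcp_host_entry() {
  local name="$1" mac="$2" ip="$3"
  printf '  host %s {\n' "$name"
  printf '    hardware ethernet %s;\n' "$mac"
  printf '    fixed-address %s;\n' "$ip"
  printf '  }\n'
}

# Hypothetical second node; name, MAC and IP are invented for this example.
dhcp_host_entry rambox02 08:00:07:26:c0:a6 192.168.24.204
```

Paste the output inside the subnet block. Before restarting the service, "sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf" asks ISC dhcpd to check the file's syntax without starting the server.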

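NOTE: with "deny unknown-clients" the pool only serves clients that have a host declaration, such as rambox above. Replace it with "allow unknown-clients" if any client on the subnet should be able to get a lease from the pool.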

Configuring TFTP

To enable the TFTP server, edit /etc/xinetd.d/tftp replacing the word yes on the disable line with the word no. Then save the file and exit the editor:

$ sudo nano /etc/xinetd.d/tftp

# default: off
# description: The tftp server serves files using the trivial file transfer \
# protocol.  The tftp protocol is often used to boot diskless \
# workstations, download configuration files to network-aware printers, \
# and to start the installation process for some operating systems.
service tftp
{
        socket_type             = dgram
        protocol                = udp
        wait                    = yes
        user                    = root
        server                  = /usr/sbin/in.tftpd
        server_args             = -s /var/lib/tftpboot
        disable                 = no
        per_source              = 11
        cps                     = 100 2
        flags                   = IPv4
}
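
If you prefer not to open an editor, the same change can be scripted. This is only a sketch: the function name enable_xinetd_service is hypothetical, and it assumes the file contains a line of the form "disable = yes" as shipped by the tftp-server package.

```shell
# Sketch: set "disable = no" in an xinetd service file without an editor.
# $1 = path to the service file (e.g. /etc/xinetd.d/tftp).
enable_xinetd_service() {
  local file="$1" tmp rc
  tmp=$(mktemp) || return 1
  # Rewrite only the value of the "disable" attribute, keeping its indentation.
  if sed 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes[[:space:]]*$/\1no/' "$file" > "$tmp"; then
    # Write back through the existing file to keep its owner and permissions.
    cat "$tmp" > "$file"
    rc=$?
  else
    rc=1
  fi
  rm -f "$tmp"
  return "$rc"
}
```

The function needs write access to the file, so call it from a root shell, for instance as enable_xinetd_service /etc/xinetd.d/tftp.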

Restart the xinetd daemon to reload configuration files:

$ sudo service xinetd restart

Verify that xinetd is started at boot time. It should be; if not, use chkconfig as in the previous step:

$ chkconfig --list xinetd
xinetd             0:off    1:off    2:off    3:on    4:on    5:on    6:off

Concerning the firewall

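Allow access to TFTP on its standard port, UDP 69:

$ sudo iptables -I INPUT -p udp --dport 69 -j ACCEPT

# TCP port 21 is the FTP control port. TFTP itself doesn't use it, so this
# rule is only needed if you also serve files over FTP.
$ sudo iptables -I INPUT -m state --state NEW -m tcp -p tcp --dport 21 -j ACCEPT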
$ sudo service iptables save
$ sudo service iptables restart

Configuring the PXE environment

Copy the Linux PXE boot loader pxelinux.0 to the published tftpboot root directory:

$ sudo cp /usr/share/syslinux/pxelinux.0 /var/lib/tftpboot

Create the PXE config directory in the TFTP root. This directory will contain a single configuration file per node or per subnet:

$ sudo mkdir -p /var/lib/tftpboot/pxelinux.cfg

The Linux PXE boot loader uses its own IP address in upper-case hexadecimal format to look for a configuration file under the pxelinux.cfg directory. If it's not found, it removes the last hex digit and tries again, repeating until it runs out of digits. That's why I define a helper function to convert a dotted-decimal IPv4 address to a hexadecimal string:

#/**
# * converts an IPv4 address to hexadecimal format completing the missing 
# * leading zero
# * 
# * @example:
# *   $ hxip 10.10.24.203
# *   0A0A18CB
# *
# * @param $1: the IPv4 address 
# */
hxip() {
  ( bc | sed 's/^\([[:digit:]]\|[A-F]\)$/0\1/' | tr -d '\n' ) <<< "obase=16; ${1//./;}"
}

$ hxip 192.168.24.203
C0A818CB
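
To make the lookup order concrete, here is a small sketch that lists the file names pxelinux.0 tries, in order. The function name pxelinux_candidates is my own; the order (MAC-based name, then the hexadecimal IP shortened one digit at a time, then "default") is what the PXELINUX documentation describes. The client UUID, which is tried first when the BIOS provides one, is left out.

```shell
# Sketch: list the config file names pxelinux.0 looks for under pxelinux.cfg/,
# in order, for an IPv4 address ($1) and an optional MAC address ($2).
pxelinux_candidates() {
  local ip="$1" mac="${2:-}" hex
  # MAC-based name: ARP hardware type 01 (Ethernet) + lower-case MAC with dashes.
  if [ -n "$mac" ]; then
    echo "01-$(echo "$mac" | tr 'A-F:' 'a-f-')"
  fi
  # Upper-case hexadecimal IP, two digits per octet (same result as hxip, w/o bc).
  hex=$(printf '%02X%02X%02X%02X' $(echo "$ip" | tr '.' ' '))
  # pxelinux drops one hex digit at a time.
  while [ -n "$hex" ]; do
    echo "$hex"
    hex=${hex%?}
  done
  echo default
}

pxelinux_candidates 192.168.24.203 08:00:07:26:c0:a5
```

A file named C0A818 would therefore match every node in 192.168.24.0/24, which is how a per-subnet configuration works.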

Create PXE Linux config file using the designated IPv4 address in hexadecimal format:
```
$ sudo nano /var/lib/tftpboot/pxelinux.cfg/$(hxip 192.168.24.203)
```
with the following content:
```
DEFAULT bzImage
APPEND initrd=initramfs.cpio.gz rw ip=dhcp shell
```
or if you prefer to avoid the second round of DHCP issued by the Kernel:
```
DEFAULT bzImage
APPEND initrd=initramfs.cpio.gz rw ip=192.168.24.203:192.168.24.202:192.168.24.1:255.255.255.0:rambox:eth0:off shell
```
where DEFAULT provides the Kernel image and APPEND the Kernel parameters passed on boot:
- bzImage: the name of the compressed Kernel image
- initrd=initramfs.cpio.gz: tells the Linux PXE boot loader to download this file and pass it to the Kernel, which will interpret it as a compressed ramdisk filesystem image
- rw: the Kernel mounts the ramdisk filesystem in read-write mode
- ip=dhcp: a Kernel-level IP parameter telling the Kernel to perform a DHCP request to obtain valid network parameters. Alternatively, you can use a fixed network configuration:
- ip=192.168.24.203:192.168.24.202:192.168.24.1:255.255.255.0:rambox:eth0:off: the fields are, in order, client IP, server IP, gateway IP, netmask, hostname, device and autoconfiguration method (off means no DHCP)
- shell: a custom parameter added by me to run a shell
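
The static ip= parameter is easy to get wrong by hand, so here is a minimal sketch that assembles it from its seven fields and prints a complete config file. The function names kernel_ip_param and pxe_config are hypothetical; the netmask 255.255.255.0 is the one that matches the 192.168.24.0/24 subnet of this guide.

```shell
# Sketch: assemble the kernel "ip=" parameter from its seven fields:
#   ip=<client-ip>:<server-ip>:<gateway-ip>:<netmask>:<hostname>:<device>:<autoconf>
kernel_ip_param() {
  local client="$1" server="$2" gw="$3" mask="$4" host="$5" dev="$6" auto="${7:-off}"
  printf 'ip=%s:%s:%s:%s:%s:%s:%s\n' \
    "$client" "$server" "$gw" "$mask" "$host" "$dev" "$auto"
}

# Print a pxelinux config that boots bzImage with a static IP and the shell flag.
pxe_config() {
  echo "DEFAULT bzImage"
  echo "APPEND initrd=initramfs.cpio.gz rw $(kernel_ip_param "$@") shell"
}

pxe_config 192.168.24.203 192.168.24.202 192.168.24.1 255.255.255.0 rambox eth0 off
```

To install the result, pipe the output through "sudo tee" into the file under /var/lib/tftpboot/pxelinux.cfg/ named after the node's hexadecimal IP.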

Creating a compressed root filesystem

The Kernel's support for initramfs allows us to create a customizable boot process to load modules and provide a minimalistic shell that runs in RAM. An initramfs disk is nothing more than a compressed cpio archive that is either embedded directly into your kernel image or stored as a separate file which can be loaded by the Linux PXE boot loader. Embedded or not, it should always contain at least:

a minimum set of directories:
- /sbin -- Critical system binaries
- /bin -- Essential binaries considered part of the system
- /dev -- Device files, required to perform I/O
- /etc -- System configuration files
- /lib, /lib32, /lib64 -- Shared libraries to provide run-time support
- /mnt -- A mount point for maintenance and use after the boot/root system is running
- /proc -- Directory stub under which the proc filesystem is mounted
- /root -- the root's home directory
- /sys -- Directory stub under which the sysfs filesystem is mounted
- /tmp -- Temporary directory
- /usr -- Additional utilities and applications
- /var -- Variable files whose content is expected to continually change during normal operation of the system—such as logs, spool files, and temporary e-mail files.
basic set of utilities: sh, ls, cp, mv, etc
minimum set of config files: rc, inittab, fstab, etc
devices: /dev/hd*, /dev/tty*, etc
runtime libraries to provide basic functions used by utilities

Is there any other simple method to create the RAM disk? An initramfs can also be created by copying the content of an already installed Linux distro into an empty directory and then packaging it, but beware of carrying along undesired and/or useless files. There are other methods, some of them simple, some of them not, but they are outside the scope of this guide, which aims to show you a handy approach to obtain a lightweight RAM disk and Kernel.

Use the following steps to create the initramfs:

Creating a download cache & working zone. Also defining a helper command to download and cache archives:

$ mkdir -p /tmp/cache
$ mkdir /tmp/wrk
$ pushd /tmp/wrk

#/** 
# * Downloads a file to the cache if it doesn't exist there yet
# *
# * @param $1 the file name to use in the cache
# * @param $2 the full URL of the file
# */
get() {
  [ -f "/tmp/cache/$1" ] || wget -t inf -w 5 -c "$2" -O "/tmp/cache/$1"
}

Creating and entering the initramfs root directory:

$ mkdir initramfs
$ pushd initramfs

Creating filesystem's base directories:

$ mkdir -p -m 0755 dev etc/{,init,sysconfig} mnt sys usr/{,local} var/{,www,log,lib,cache} run
$ mkdir -p -m 0555 {,s}bin lib{,32,64} proc usr/{,s}bin
$ mkdir -p -m 0700 root
$ mkdir -p -m 1777 tmp
$ pushd var
$ ln -s ../run run
$ popd

Creating /etc/profile to export environment variables:

$ dd of=etc/profile << EOT
## /etc/profile

export PATH="/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin"
EOT

Creating /etc/fstab with various mount points:

$ dd of=etc/fstab << EOT
devpts  /dev/pts  devpts  nosuid,noexec,gid=5,mode=0620  0 0
tmpfs   /dev/shm  tmpfs   nosuid,nodev,mode=0755  0 0
sysfs   /sys      sysfs   nosuid,nodev,noexec  0 0
proc    /proc     proc    nosuid,nodev,noexec  0 0
EOT
$ chmod 0644 etc/fstab
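
A typo in fstab only shows up at boot time on the rambox, so a quick check on the pxe server can save a reboot cycle. The following is a minimal sketch (check_fstab is a name I made up) that only verifies the field count, not the mount options themselves.

```shell
# Sketch: check that every non-comment line of an fstab has the six fields
# device, mount point, type, options, dump and pass. $1 = fstab path.
check_fstab() {
  awk '
    /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
    NF != 6 { printf "line %d: expected 6 fields, got %d\n", NR, NF; bad = 1 }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}
```

Calling check_fstab etc/fstab from the initramfs directory prints nothing when the file is well formed; a stray space inside the options column is the typical mistake it catches.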

Configure passwd & group settings:

$ dd of=etc/passwd << EOT
root:x:0:0:root:/root:/bin/sh
nobody:x:99:99:NoBody:/none:/bin/false
www:x:33:33:HTTP Server:/var/www:/bin/false
EOT
$ dd of=etc/group << EOT
root:x:0:
nobody:x:99:
www:x:33:
EOT

Configure some host related settings:

$ dd of=etc/host.conf <<< "multi on"
$ dd of=etc/hostname <<< "rambox"
$ dd of=etc/hosts << EOT
127.0.0.1 localhost.localdomain localhost
127.0.1.1 $(cat etc/hostname)
EOT

Configure timezone:

$ dd of=etc/timezone <<< "America/New_York"
$ cp /usr/share/zoneinfo/$(cat etc/timezone) etc/localtime

Busybox is a handy tool used very often in ramdisks and small devices with very limited resources, providing a self-contained and minimal set of POSIX-compatible unix tools in a single executable file. I'll be using busybox in this guide. Get busybox and create the sh symbolic link:

$ pushd bin
$ chmod +w .
$ bb=busybox-x86_64 && get $bb http://www.busybox.net/downloads/binaries/latest/busybox-x86_64
$ cp /tmp/cache/$bb busybox && chmod +x busybox
$ ln -s busybox sh
$ chmod -w . 
$ popd

Additionally we MAY need a DHCP configuration script, so we'll use busybox's udhcpc and its simple.script. Then we'll create a script named renew_ip that does the whole job:

$ pushd bin
$ chmod +w .
$ ss=simple.script
$ get $ss http://git.busybox.net/busybox/plain/examples/udhcp/$ss
$ cp /tmp/cache/$ss . && chmod +x $ss 

$ dd of=renew_ip << EOT
#!/bin/sh

ifconfig eth0 up
udhcpc -t 5 -q -s /bin/simple.script
EOT

$ chmod +x renew_ip
$ chmod -w . 
$ popd

One of the most important phases is the execution of the /init script, a simple shell script that performs the whole initialization process on the ramdisk. It usually mounts all filesystems listed in fstab, creates device nodes (like the udev device manager does), loads device firmware and finally mounts the real root (/) from another device and launches the /sbin/init found there. This is the point where we intervene, by just launching the shell or by executing our own /sbin/init w/o switching the root (/). So edit the init script and add the following content:

$ nano init

#!/bin/sh

# Make all core utils reachable 
. /etc/profile

# Create all busybox's symb links
/bin/busybox --install -s

# Create some devices statically

# pts: pseudoterminal slave 
mkdir dev/pts

# shm
mkdir dev/shm
chmod 1777 dev/shm

# Mount the fstab's filesystems.
mount -av

# Some things don't work properly without /etc/mtab.
ln -sf /proc/mounts /etc/mtab

# mdev is a suitable replacement of the udev device node creator for loading 
# firmware
touch /etc/mdev.conf 
echo /sbin/mdev > /proc/sys/kernel/hotplug
mdev -s

# Only renew the IP address via DHCP if you need it. Not needed if Kernel-level 'ip=...' 
# was used. 
#renew_ip 

# set hostname
hostname $(cat /etc/hostname)

# shell launcher 
shell() {
 echo "${1}Launching shell..." && exec /bin/sh
}

# launch the shell if the 'shell' parameter was supplied
grep -q 'shell' /proc/cmdline && shell

# parse kernel command and obtain the init & root parameters
# if not then use default values
for i in $(cat /proc/cmdline); do
 par=$(echo $i | cut -d "=" -f 1)
 val=$(echo $i | cut -d "=" -f 2-)
 case $par in
  root)
   root=$val
   ;;
  init)
   init=$val
   ;;
 esac
done
init=${init:-/sbin/init}
root=${root:-/dev/hda1}

# if rambox parameter is supplied then keep the ramdisk mounted, ignore root parameter 
# and run the other init script. Located at /sbin/init by default
if grep -q 'rambox' /proc/cmdline ; then
  [ -e ${init} ] || shell "Not init found on ramdisk at '${init}'... " 

  echo "Keeping the ramdisk since rambox param was supplied & executing init... "
  exec ${init}
  
  #This will only be run if the exec above failed
  shell "Failed keeping the ramdisk and executing '${init}'... "
fi

# Neither shell nor rambox parameters were supplied then, try to switch to the new
# root and launch init 
mkdir /newroot
mount ${root} /newroot || shell "An error ocurred mounting '${root}' at /newroot... "
[ -e /newroot${init} ] || shell "Not init found at '${init}'... " 

echo "Resetting kernel hotplugging... "
:> /proc/sys/kernel/hotplug
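echo "Umounting the pseudo filesystems... "
# Unmount only what fstab mounted; 'umount -a' would also unmount /newroot
umount /dev/pts /dev/shm /sys /proc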
echo "Switching to the new root and executing init... "
exec switch_root /newroot ${init}

#This will only be run if the exec above failed
mount -av
mdev -s
shell "Failed to switch_root... "

The script works as follows:

/etc/profile is sourced to export the PATH variable and make all executables reachable
All busybox's symbolic links are created
Some special devices are created by hand
All /etc/fstab filesystems are mounted
The rest of the devices are discovered and created by busybox's mdev
The Kernel command line located at /proc/cmdline is parsed to see if the shell parameter was supplied. If so, the shell is immediately launched, replacing the current process, hence everything else is ignored
The Kernel command line is checked again to see if the rambox parameter was supplied, indicating that we want to keep the ramdisk mounted at / and launch the normal /sbin/init process
If neither the shell nor the rambox parameter was supplied, try to mount the new root (/) and launch the /sbin/init found on it
Finally, if the new root cannot be mounted or /sbin/init cannot be executed, a shell is launched reporting the situation
If any of these shell launches fails, a Kernel panic is issued
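
The command line handling can be tried out on any machine before it goes into the ramdisk. Below is a minimal sketch of the two lookups as standalone functions; cmdline_get and cmdline_has are hypothetical names, not part of the init script above. Unlike a plain grep, the flag check matches whole words only, so the host name "rambox" inside an ip=... parameter doesn't count as the rambox flag.

```shell
# Sketch: look up a key=value parameter in a kernel command line.
# $1 = key, $2 = default value, $3 = command line text (default: /proc/cmdline).
# As in the init loop, the last occurrence wins.
cmdline_get() {
  local key="$1" value="$2" line="${3:-$(cat /proc/cmdline)}" word
  for word in $line; do
    case "$word" in
      "$key"=*)
        # Strip only the leading "key=", so values such as UUID=... survive.
        value="${word#*=}"
        ;;
    esac
  done
  echo "$value"
}

# Succeed if a bare flag (e.g. "shell" or "rambox") is present as a whole word.
cmdline_has() {
  local flag="$1" line="${2:-$(cat /proc/cmdline)}" word
  for word in $line; do
    [ "$word" = "$flag" ] && return 0
  done
  return 1
}
```

For example, cmdline_get init /sbin/init prints the init= value of the running system, or /sbin/init when none was given.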

Add execute permission to /init:

$ chmod +x init
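
Before packing the archive, it can help to confirm that the pieces the Kernel and /init rely on are really there. This is a minimal sketch under my own assumptions about what counts as essential; the function name check_initramfs_tree is made up.

```shell
# Sketch: sanity-check an initramfs staging directory before packing it.
# $1 = staging directory (default: current directory).
check_initramfs_tree() {
  local root="${1:-.}" bad=0 d
  [ -x "$root/init" ]        || { echo "init is missing or not executable"; bad=1; }
  [ -x "$root/bin/busybox" ] || { echo "bin/busybox is missing or not executable"; bad=1; }
  [ -L "$root/bin/sh" ]      || { echo "bin/sh symlink is missing"; bad=1; }
  [ -f "$root/etc/fstab" ]   || { echo "etc/fstab is missing"; bad=1; }
  for d in dev proc sys tmp; do
    [ -d "$root/$d" ] || { echo "directory $d is missing"; bad=1; }
  done
  return "$bad"
}
```

Run check_initramfs_tree from inside the initramfs directory; no output means the tree has the minimum needed to reach the shell.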

Change the ownership of everything to root:

$ sudo chown -R root:root *

Create the initramfs.cpio.gz compressed archive and copy it to tftp's root directory:

$ sudo find . -print0 | sudo cpio --null -ov --format=newc | gzip -9 > ../initramfs.cpio.gz
$ sudo cp ../initramfs.cpio.gz /var/lib/tftpboot

Go back to working directory:

$ popd

Now the Kernel stuff:

What I am about to do with the Kernel is very simple: compile it using a minimal set of features that makes it boot and recognize MY hardware, mainly the NIC. Hence, depending on your hardware, you will probably need a different selection of features. So I recommend first doing a one-time installation of any modern Linux distribution (like I did) such as CentOS, Gentoo, Fedora, Debian or Ubuntu with a modern Kernel version, and checking the modules loaded on boot using /sbin/lsmod. Then, using this module list, look for the corresponding Kernel options and INCLUDE them all in the Kernel, making it rock solid! That's what I did.
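
Going from a module name to its Kernel option can be partly automated. The sketch below is a heuristic of mine, not an official tool: it greps the Kernel Makefiles for lines of the form "obj-$(CONFIG_FOO) += foo.o", so it misses modules that are linked from several objects under a different name. Both function names are hypothetical.

```shell
# Sketch: print the names of the loaded modules from `lsmod` output on stdin.
loaded_modules() {
  awk 'NR > 1 { print $1 }'
}

# Sketch: search the kernel Makefiles under $1 for the CONFIG_ symbol that
# builds module $2 (lines of the form "obj-$(CONFIG_FOO) += foo.o").
config_for_module() {
  local src="$1" pat
  # lsmod prints underscores where the object file name may use dashes.
  pat=$(echo "$2" | sed 's/_/[_-]/g')
  grep -rhE --include=Makefile "[[:space:]]${pat}\.o([[:space:]]|\$)" "$src" |
    sed -n 's/^obj-\$(\(CONFIG_[A-Z0-9_]*\)).*/\1/p' | sort -u
}
```

On the reference installation, something like "lsmod | loaded_modules" gives the list, and config_for_module /path/to/linux virtio_net would then be expected to print CONFIG_VIRTIO_NET. Searching for the symbol with "/" inside make menuconfig shows where the option lives in the menus.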

NOTE: In our journey to make the Kernel simple and small, we should be careful not to omit critical Kernel features and lose hardware advantages, for example SMP. So if we really want to use it in a production environment, deeper research and customization must be done first.

Start Kerneling...

Download the Kernel sources from the sky:

$ krn=linux-2.6.39
$ get $krn.tar.xz http://www.kernel.org/pub/linux/kernel/v2.6/$krn.tar.xz

Uncompress it into a working directory named linux:

$ tar xvf /tmp/cache/$krn.tar.xz -C .
$ mv $krn linux
$ pushd linux

Clean all configuration settings and enter the menu:

$ make clean
$ make allnoconfig
$ make menuconfig

An ncurses menu dialog should open. Now check a "minimal" set of features and uncheck the unneeded ones. I'll only list what changes wrt the clean configuration. [*] means explicitly checked, to be EMBEDDED into the Kernel, and [ ] means explicitly unchecked, not to be included.

General Setup (here the RAM filesystem is the most important feature)

[*] Prompt for development and/or incomplete code/drivers
(-minimal) Local version - append to kernel release
[*] Initial RAM filesystem and RAM disk (initramfs/initrd) support

Bus options (PCI etc.) ---> (Enable support for PCI devices, you may add support for your PCI hardware here)

[*] PCI support

Executable file formats / Emulations ---> (An important piece!, you won't be able to execute almost anything if you don't check it)

[*] Kernel support for ELF binaries

[*] Networking support (Beside of enabling TCP/IP and disable Wireless, IPSec, etc. The most important feature to check here is the IP-Kernel level auto configuration with DHCP support)

  [ ]   Wireless  ---> 
 Networking options 
    [*] Packet socket                                                                                 
    [*]   Packet socket: mmapped IO                                                                     
    [*] Unix domain sockets
    [*] Transformation sub policy support (EXPERIMENTAL)
    [*] Transformation migrate database (EXPERIMENTAL)
    [*] PF_KEY sockets
    [*]   PF_KEY MIGRATE (EXPERIMENTAL)                                                                                  
    [*] TCP/IP networking
    [*]   IP: multicasting                              
    [*]   IP: advanced router                           
        Choose IP: FIB lookup algorithm (choose FIB_HASH if unsure) (FIB_HASH  
    [*]   IP: policy routing                            
    [*]   IP: equal cost multipath                      
    [*]   IP: verbose route monitoring                                                                               
    [*]   IP: kernel level autoconfiguration                                                           
    [*]     IP: DHCP support
    [*]   IP: tunneling                                 
    [*]   IP: GRE tunnels over IP                       
    [*]     IP: broadcast GRE over IP                   
    [*]   IP: multicast routing                         
    [*]     IP: PIM-SM version 1 support                
    [*]     IP: PIM-SM version 2 support                
    [*]   IP: ARP daemon support                        
    [*]   IP: TCP syncookie support (disabled per default)                         
    [*]   IP: AH transformation                         
    [*]   IP: ESP transformation                        
    [*]   IP: IPComp transformation                                                                            
    [ ]   IP: IPsec transport mode                                                                     
    [ ]   IP: IPsec tunnel mode                                                                        
    [ ]   IP: IPsec BEET mode
    [*]   TCP: advanced congestion control  --->                                                                           
     [*]   CUBIC TCP (NEW) (only cubic)
    [*]   TCP: MD5 Signature Option support (RFC2385) (EXPERIMENTAL) 
    [ ]   The IPv6 protocol  --->

Device Drivers ---> (RAM block device support and Network device support + Ethernet are the most important things, the remaining stuff is related to my current hardware)

  [*] Block devices  --->
    [*]   RAM block device support
  [*] Multiple devices driver support (RAID and LVM)  --->
    [*]   Device mapper support
  [*] Network device support  ---> 
    [*]   Ethernet (10 or 100Mbit)  --->
    [ ]   Wireless LAN  ---> 
 Character devices  --->
    [*] /dev/kmem virtual device support
    [*] Hardware Random Number Generator Core support
  [*] I2C support  --->
    [*]   I2C device interface
    I2C Hardware Bus support  ---> 
      [*] Intel PIIX4 and compatible (ATI/AMD/Serverworks/Broadcom/SMSC)
 Serial ATA (prod) and Parallel ATA (experimental) drivers (ATA [=n])
    [*]   ATA SFF support (NEW) 
      [*]    Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support
      [*]    Generic ATA support

File systems ---> (File systems are very important; which ones to support depends on your final goal: mounting a remote NFS export for shared storage? using a GlusterFS / Ceph filesystem on top of a NAS? The configuration I used is the simplest one, with support only for initramfs and other pseudo filesystems. I recommend starting with this one, then gradually embedding your filesystems)

  [ ] Network File Systems  --->
  Pseudo filesystems  --->
    [*] Virtual memory file system support (former shm fs)
    [*]   Tmpfs POSIX Access Control Lists

[*] Virtualization ---> (As I mentioned earlier, I'm using KVM-virtualized hardware that makes wide use of Virtio paravirtualization. Virtio adds support for a paravirtual Ethernet card, a paravirtual disk I/O controller, a balloon device for adjusting guest memory usage, and a VGA graphics interface using SPICE drivers. Virtio drivers for guest machines are included in Kernel >= 2.6.25)

  [*]   PCI driver for virtio devices (EXPERIMENTAL)
  [*]   Virtio balloon driver (EXPERIMENTAL)

Device Drivers ---> [for virtualization]

 
  [*] Block devices  --->
    [*]   Virtio block driver (EXPERIMENTAL)
  [*] Network device support  ---> 
    [*]   Virtio network driver (EXPERIMENTAL)
  Character devices  --->
    [*] Virtio console
    [*] Hardware Random Number Generator Core support
    [*]   VirtIO Random Number Generator support

Exit the Kernel configuration menu and don't forget to save the settings file.
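
After saving, the resulting .config can be checked for the options this guide depends on. The sketch below is only that: check_kconfig and RAMBOX_OPTS are names I made up, and the symbol list is my own mapping of the menu entries above to their CONFIG_ names, so verify it against your Kernel version.

```shell
# Sketch: verify that the given options are built in (=y) in a kernel .config.
# $1 = path to .config, remaining arguments = option names without CONFIG_.
check_kconfig() {
  local cfg="$1" opt missing=0
  shift
  for opt in "$@"; do
    if ! grep -q "^CONFIG_${opt}=y\$" "$cfg"; then
      echo "CONFIG_${opt} is not built in"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed symbol names for: initramfs, ELF, PCI, TCP/IP, kernel-level IP
# autoconfiguration with DHCP, tmpfs and the virtio PCI/network drivers.
RAMBOX_OPTS="BLK_DEV_INITRD BINFMT_ELF PCI INET IP_PNP IP_PNP_DHCP TMPFS VIRTIO_PCI VIRTIO_NET"
```

From the linux directory, check_kconfig .config $RAMBOX_OPTS lists every option that is missing or only built as a module; on physical hardware, swap the virtio entries for your NIC driver.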

Compile the Kernel (-j4 runs 4 parallel build jobs) and copy it to the TFTP root directory:

$ make -j4 bzImage
$ sudo cp arch/x86/boot/bzImage /var/lib/tftpboot/

Do cleanup:

$ popd
$ sudo rm -rf /tmp/wrk

Power on the rambox and enjoy it! It should boot smoothly and launch the busybox's shell.

You will find the basic tools in /bin, /sbin, /usr/bin, /usr/sbin and /usr/local/sbin; all of these directories are in the PATH environment variable. To renew your IP address just run renew_ip. Finally, notice that no Kernel module is loaded, since everything you need is embedded.

Enjoy it!

Post install

Perform some checks after install to ensure that everything is OK and measure for resource consumption:

Free memory. As you may notice, only about 12 MB are used:

$ free -m
                total    used    free    shared    buffers
Mem:             1255      12    1243         0          0 
-/+ buffers:               12    1243
Swap:               0       0       0
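
If you want the same figure in a script, for instance to compare nodes, it can be read from /proc/meminfo. A minimal sketch, assuming that "used" simply means MemTotal minus MemFree; the function name mem_used_mib is hypothetical.

```shell
# Sketch: print used memory in MiB from /proc/meminfo-style input on stdin,
# counting "used" as MemTotal - MemFree (both are given in kB).
mem_used_mib() {
  awk '
    $1 == "MemTotal:" { total = $2 }
    $1 == "MemFree:"  { free = $2 }
    END { printf "%d\n", (total - free) / 1024 }
  '
}
```

On the rambox it would be called as mem_used_mib < /proc/meminfo.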

Network configuration/connectivity, both interfaces should be listed, eth0 and lo:

$ ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:07:26:c0:a5  
          inet addr:192.168.24.203  Bcast:192.168.24.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:378 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:43745 (42.7 KiB)  TX bytes:1180 (1.1 KiB)
          
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0 B)  TX bytes:0 (0 B)

$ ping -c 3 192.168.24.202
PING 192.168.24.202 (192.168.24.202) 56(84) bytes of data.
64 bytes from 192.168.24.202: icmp_req=1 ttl=63 time=0.774 ms
64 bytes from 192.168.24.202: icmp_req=2 ttl=63 time=0.639 ms
64 bytes from 192.168.24.202: icmp_req=3 ttl=63 time=0.574 ms

--- 192.168.24.202 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.574/0.662/0.774/0.085 ms

Mounted partitions:

$ mount
rootfs on / type rootfs (rw)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,nosuid,nodev,relatime,mode=0755)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

Disk usage, about 1.1 MB:

$ du -chs /*
1.1M    /bin
0       /dev
28.0K   /etc
4.0K    /init
0       /lib
0       /lib32
0       /lib64
0       /linuxrc
0       /mnt
0       /proc
0       /root
36.0K   /sbin
0       /sys
0       /tmp
20.0K   /usr
0       /var
1.1M    total

Loaded modules. Since no module was loaded, an empty list or a 'No such file or directory' message is shown:

$ lsmod
lsmod: can't open '/proc/modules': No such file or directory

Device nodes created by the device manager (mdev/udev). The result depends on many factors:

$ find /dev | wc -l
106


Thursday, June 7, 2012

RAM-only PXE boot & the "smallest" diskless Linux box