Developing software with GNU

An introduction to the GNU development tools

This is edition 0.1.5

Last updated, 26 March 1999

Eleftherios Gkioulekas
Department of Applied Mathematics
University of Washington
lf@amath.washington.edu



This edition of the manual is consistent with:
Autoconf 2.13, Automake 1.4, Libtool 1.3,
Autotools 0.11, Texinfo 3.12b, Emacs 20.3.
Published on the Internet
http://www.amath.washington.edu/~lf/tutorials/autoconf/

Copyright (C) 1998, 1999 Eleftherios Gkioulekas. All rights reserved.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that they are marked clearly as modified versions, that the authors' names and title are unchanged (though subtitles and additional authors' names may be added), and that other clearly marked sections held under separate copyright are reproduced under the conditions given within them, and that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation.

Preface

The GNU project was founded in 1984 by Richard Stallman in response to the increasing obstacles to cooperation imposed on the computing community by the owners of proprietary software. The goal of the GNU project is to remove these obstacles by developing a complete software system, named GNU (1) and distributing it as free software. GNU is not about software that costs $0. It is about software that gives to all its users the freedom to use, modify and redistribute it. These freedoms are essential to building a community based on cooperation and the open sharing of ideas.

Today, millions of people use GNU/Linux, a combination of the GNU system and the popular Linux kernel that has been developed since 1991 by Linus Torvalds and a group of volunteers. The GNU project's kernel, the Hurd, is also in service but it is not yet sufficiently developed for widespread use. Technically, Unix and GNU have many similarities, and it is very easy to port software from Unix to GNU or use GNU software on Unix systems.

Because GNU is a community effort, it provides very powerful development tools that enable every user to contribute to the community by writing free software. The GNU development tools include the GNU compilers, the GNU build system and Emacs. Proprietary systems often do not bundle such tools with their distributions because their developers regard the users as a market that buys software licenses and treats the computer as an appliance. (2)

This manual will introduce you to the development tools that are used in the GNU system. These tools can also be used to develop software with GNU/Linux and Unix. This manual will not teach you how to use C, or any other programming language. It is assumed that you are already familiar with C. This manual will also not cover every detail about the tools that we discuss. Each tool has its own reference manual, and you should also read these manuals, sooner or later, if you want to learn more. This manual aims to be a practical introduction to the GNU development tools that will show you how to use them together to accomplish specific common tasks. The intended audience is a programmer who has learned programming in C, and would now like to learn everything else that person needs to know to develop software that conforms to the GNU coding standards. So, we will tell you what you need to know, and then you can read the specific reference manuals to learn everything that you can possibly learn.

Note on terminology

There is a growing concern among womyn that there are important gender issues with the English language. As a result, it became common to use terms such as "chairperson" instead of "chairman". In this manual we will use the words person, per, pers and perself. These words are used just like the words she, her, hers, herself. For example, we will say: "person wrote a manual to feel good about perself, and to encourage per potential significant other's heart to become pers". These terms were introduced, and perhaps invented, by Marge Piercy, and were first used in software documentation and email correspondence by Richard Stallman. By using these terms, we hope to make this manual less threatening to womyn and to encourage our womyn readers to join the free software community.

Roadmap to manual

This manual was written as a tutorial and not a reference manual, so in general, it works to read the chapters in the order in which they are presented. If you came fresh from your CS courses with a good knowledge of C, but have learned nothing about the GNU development tools, reading all the chapters in order is probably what you should do. However, if you are already familiar with some of the topics that we discuss, you might want to skip a few chapters to get to the material that is new to you.

For example, many readers are already familiar with Emacs and Makefiles, and they just want to get started with learning about Autoconf and Automake. In that case, you can skip to section The GNU build system, and start reading from there. If you are a vi user and are not interested in learning Emacs, please reconsider (see section Using vi emulation). You will find some of the other development tools, especially the Texinfo documentation system, much easier to use with Emacs than without it.

Here's a brief outline of the chapters in this manual, and what is covered by each chapter.

Copying

This book that you are now reading is actually free. The information in it is freely available to anyone. The machine readable source code for the book is freely distributed on the internet and anyone may take this book and make as many copies as they like. (Take a moment to check the copying permissions on the Copyright page.) If you paid money for this book, what you actually paid for was the book's nice printing and binding, and the publisher's associated costs to produce it.

The GNU development tools include Automake, Autoconf, Libtool, Make, Emacs, Texinfo and the GNU C and C++ compilers. These programs are "free"; this means that everyone is free to use them and free to redistribute them on a free basis. These programs are not in the public domain; they are copyrighted and there are restrictions on their distribution, but these restrictions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of these programs that they might get from you.

Specifically, we want to make sure that you have the right to give away copies of the programs and documents that relate to them, that you receive source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.

To make sure that everyone has such rights, we don't allow you to deprive anyone else of these rights. For example, if you distribute copies of the code related to the GNU development tools, you must give the recipients all the rights that you have. You must make sure that they, too, can get the source code. And you must tell them their rights.

Also for our own protection, we must make certain that everyone finds out that there is no warranty for the programs that relate to the GNU development tools. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.

The precise conditions of the licenses for the GNU development tools are found in the General Public Licenses that accompany them.

Acknowledgements

This manual was written and is being maintained by Eleftherios Gkioulekas. Many people have contributed to this effort in various ways. Here is a list of these contributions. Please help me keep it complete and free of errors.

Installing GNU software

Free software is distributed in source code distributions. Many of these programs are difficult to install because they use system dependent features, and they require the user to edit makefiles and configuration headers. By contrast, the software distributed by the GNU project is autoconfiguring; it is possible to compile it from source code and install it automatically, without any tedious user intervention.

In this chapter we discuss how to compile and install autoconfiguring software written by others. In the subsequent chapters we discuss how to use the development tools that allow you to make your software autoconfiguring as well.

Installing a GNU package

Autoconfiguring software is distributed with packaged source code distributions. These are big files with filenames of the form:

package-version.tar.gz

For example, the file `autoconf-2.13.tar.gz' contains version 2.13 of GNU Autoconf. We often call these files source distributions; sometimes we simply call them packages.

The steps for installing an autoconfiguring source code distribution are simple, and if the distribution is not buggy, can be carried out without substantial user intervention.

  1. First, you have to unpack the package to a directory:
    % gunzip foo-1.0.tar.gz
    % tar xf foo-1.0.tar
    
    This will create the directory `foo-1.0' which contains the package's source code and documentation. Look for the files `README' to see if there's anything that you should do next. The `README' file might suggest that you need to install other packages before installing this one, or it might suggest that you have to do unusual things to install this package. If the source distribution conforms to the GNU coding standards, you will find many other documentation files like `README'. See section Maintaining the documentation files, for an explanation of what these files mean.
  2. Configure the source code. Once upon a time that used to mean that you had to edit makefiles and header files. In the wonderful world of Autoconf, source distributions provide a `configure' script that will do that for you automatically. To run the script, change to the source directory and type:
    % cd foo-1.0
    % ./configure
    
  3. Now you can compile the source code. Type:
    % make
    
    and if the program is big, you can make some coffee. After the program compiles, you can run its regression test-suite, if it has one, by typing
    % make check
    
  4. If everything is okay, you can install the compiled distribution with:
    % su
    # make install
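
The unpacking in step 1 can also be done with a single pipeline that leaves the compressed file in place. The sketch below builds a throwaway `foo-1.0.tar.gz' in a temporary directory purely so that it can run anywhere; the package contents are made up for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Build a stand-in package (hypothetical contents) so the example is self-contained.
mkdir foo-1.0
echo "Read me first." > foo-1.0/README
tar cf foo-1.0.tar foo-1.0
gzip foo-1.0.tar
rm -rf foo-1.0

# Decompress to standard output and let tar read the archive from standard input.
gunzip -c foo-1.0.tar.gz | tar xf -

# The source directory now exists, and the .tar.gz file is still there.
ls foo-1.0
cat foo-1.0/README
```

Because `gunzip -c' writes to standard output instead of replacing the file, the original archive remains available if you need to unpack it again.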
    

The `make' program launches the shell commands necessary for compiling, testing and installing the package from source code. However, `make' has no knowledge of what it is really doing. It takes its orders from makefiles, files called `Makefile' that have to be present in every subdirectory of your source code directory tree. From the installer's perspective, the makefiles define a set of targets that correspond to things that the installer wants to do. The default target is always compiling the source code, which is what gets invoked when you simply run make. Other targets, such as `install' and `check', need to be mentioned explicitly. Because `make' takes its orders from the makefile in the current directory, it is important to run it from the correct directory. See section Compiling with Makefiles, for the full story behind `make'.

The `configure' program is a shell script that probes your system through a set of tests to determine things that it needs to know, and then uses the results to generate `Makefile' files from templates stored in files called `Makefile.in'. In the early days of the GNU project, developers used to write `configure' scripts by hand. Now, no-one ever does that any more. Instead, `configure' scripts are automatically generated by GNU Autoconf from an input file `configure.in'. GNU Autoconf is part of the GNU build system and we first introduce it in section The GNU build system.
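
To make the idea of generating a `Makefile' from a `Makefile.in' template concrete, here is a toy stand-in for a `configure' script. It is not what Autoconf generates, and a real script runs many more tests. The sketch assumes the convention that placeholders of the form `@name@' in the template are replaced with the detected values; the template contents and the `hello' program are hypothetical.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# A template with two placeholders (hypothetical package that builds "hello").
cat > Makefile.in <<'EOF'
CC = @CC@
prefix = @prefix@

hello: hello.c ; $(CC) -o hello hello.c
EOF

# Probe the system: prefer gcc, fall back to cc.
if command -v gcc >/dev/null 2>&1; then
  CC=gcc
else
  CC=cc
fi
prefix=/usr/local

# Substitute the results into the template to produce the Makefile.
sed -e "s|@CC@|$CC|g" -e "s|@prefix@|$prefix|g" Makefile.in > Makefile

cat Makefile
```

The template stays the same on every system; only the substituted values differ, which is why the installer never has to edit a makefile by hand.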

As it turns out, you don't have to write the `Makefile.in' templates by hand either. Instead you can use another program, GNU Automake, to generate `Makefile.in' templates from higher-level descriptions stored in files called `Makefile.am'. In these files you describe what is being created by your source code, and Automake computes the makefile targets for compiling, installing and uninstalling it. Automake also computes targets for compiling and running test suites, and targets for recursively calling make in subdirectories. The details about Automake are first introduced in section Using Automake.

The Makefile standards

The GNU coding standards are a document that describes the requirements that must be satisfied by all GNU programs. These requirements are driven mainly by technical considerations, and they are excellent advice for writing good software. The makefile standards, a part of the GNU coding standards, require that your makefiles do a lot more than simply compile and install the software.

One requirement is cleaning targets; these targets remove the files that were generated while installing the package and restore the source distribution to a previous state. There are three cleaning targets that correspond to three levels of cleaning: clean, distclean, maintainer-clean.

clean
Cleans up all the files that were generated by make and make check, but not the files that were generated by running configure. This target cleans the build, but does not undo the source configuration by the configure script.
distclean
Cleans up all the files generated by make and make check, but also cleans the files that were generated by running configure. As a result, you can not invoke any other make targets until you run the configure script again. This target reverts your source directory tree back to the state in which it was when you first unpacked it.
maintainer-clean
Cleans up all the files that distclean cleans. However it also removes files that the developers have automatically generated with the GNU build system. Because users shouldn't need the entire GNU build system to install a package, these files should not be removed in the final source distribution. However, it is occasionally useful for the maintainer to remove and regenerate these files.

Another type of cleaning that is required is erasing the package itself from the installation directory; uninstalling the package. To uninstall the package, you must call

% make uninstall

from the toplevel directory of the source distribution. This will work only if the source distribution is configured first. It will work best only if you do it from the same source distribution, with the same configuration, that you've used to install the package in the first place.

When you install GNU software, archive the source code to all the packages that you install in a directory like `/usr/src' or `/usr/local/src'. To do that, first run make clean on the source distribution, and then use a recursive copy to copy it to `/usr/src'. The presence of a source distribution in one of these directories should be a signal to you that the corresponding package is currently installed.

Francois Pinard came up with a cute rule for remembering what the cleaning targets do:

GNU standard compliant makefiles also have a target for generating tags. Tags are files, called `TAGS', that are used by GNU Emacs to allow you to navigate your source distribution more efficiently. More specifically, Emacs uses tags to take you from a place where a C function is being used in a file, to the file and line number where the function is defined. To generate the tags call:

% make tags

Tags are particularly useful when you are not the original author of the code you are working on, and you haven't yet memorized where everything is. See section Navigating source code, for all the details about navigating large source code trees with Emacs.

Finally, in the spirit of free redistributable code, there must be targets for cutting a source code distribution. If you type

% make dist

it will rebuild the `foo-1.0.tar.gz' file that you started with. If you modified the source, the modifications will be included in the distribution (and you should probably change the version number). Before putting a distribution up on FTP, you can test its integrity with:

% make distcheck

This makes the distribution, then unpacks it in a temporary subdirectory and tries to configure it, build it, run the test-suite, and check if the installation script works. If everything is okay then you're told that your distribution is ready.

Writing reliable makefiles that support all of these targets is a very difficult undertaking. This is why we prefer to generate our makefiles instead with GNU Automake.

Configuration options

The `configure' script accepts many command-line flags that modify its behaviour and the configuration of your source distribution. To obtain a list of all the options that are available type

% ./configure --help

on the shell prompt.

The most useful parameter that the installer controls during configuration is the directory where they want the package to be installed. During installation, the following files go to the following directories:

Executables   ==> /usr/local/bin
Libraries     ==> /usr/local/lib
Header files  ==> /usr/local/include
Man pages     ==> /usr/local/man/man?
Info files    ==> /usr/local/info

The `/usr/local' directory is called the prefix. The default prefix is always `/usr/local' but you can set it to anything you like when you call `configure' by adding a `--prefix' option. For example, suppose that you are not a privileged user, so you can not install anything in `/usr/local', but you would still like to install the package for your own use. Then you can tell the `configure' script to install the package in your home directory `/home/username':

% ./configure --prefix=/home/username
% make
% make check
% make install

The `--prefix' argument tells `configure' where you want to install your package, and `configure' will take that into account and build the proper makefile automatically.

If you are installing the package on a filesystem that is shared by computers that run variations of GNU or Unix, you need to install the files that are independent of the operating system in a shared directory, but separate the files that are dependent on the operating systems in different directories. Header files and documentation can be shared. However, libraries and executables must be installed separately. Usually the scheme used to handle such situations is:

Executables   ==> /usr/local/system/bin
Libraries     ==> /usr/local/system/lib
Header files  ==> /usr/local/include
Man pages     ==> /usr/local/man/man?
Info files    ==> /usr/local/info

The directory `/usr/local/system' is called the executable prefix, and it is usually a subdirectory of the prefix. In general, it can be any directory. If you don't specify the executable prefix, it defaults to being equal to the prefix. To change that, use the `--exec-prefix' flag. For example, to configure for a GNU/Linux system, you would run:

% ./configure --exec-prefix=/usr/local/linux

To configure for GNU/Hurd, you would run:

% ./configure --exec-prefix=/usr/local/hurd

In general, there are many directories where a package may want to install files. Some of these directories are controlled by the prefix, where others are controlled by the executable prefix. See section Installation standard directories, for a complete discussion of what these directories are, and what they are for.
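
The relationship between the two prefixes can be summarized in a few lines of shell. This is only a sketch of the defaults described above, not code taken from a real `configure' script. The variable names follow the usual GNU directory names, but treat them here as illustrative, and note that the example values are simply the ones used in this section.

```shell
#!/bin/sh
# Values the installer might pass with --prefix and --exec-prefix.
prefix=/usr/local
exec_prefix=/usr/local/linux

# If no executable prefix is given, it falls back to the prefix.
: "${exec_prefix:=$prefix}"

# System-dependent files follow the executable prefix ...
bindir=$exec_prefix/bin
libdir=$exec_prefix/lib
# ... while shareable files follow the prefix.
includedir=$prefix/include
mandir=$prefix/man
infodir=$prefix/info

printf '%s\n' \
  "bindir=$bindir" \
  "libdir=$libdir" \
  "includedir=$includedir" \
  "mandir=$mandir" \
  "infodir=$infodir"
```

Clearing the `exec_prefix' assignment at the top makes every directory fall under `/usr/local', which is the single-system layout shown first.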

Some packages allow you to enable or disable certain features while you configure the source code. They do that with flags of the form:

   --with-package   --enable-feature
--without-package  --disable-feature

The --enable flags usually control whether to enable certain optional features of the package. Support for international languages, debugging features, and shared libraries are features that are usually controlled by these options. The --with flags instead control whether to compile and install certain optional components of the package. The specific flags that are available for a particular source distribution should be documented in the `README' file.
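
How a `configure' script might interpret these flags can be sketched with a small `case' statement. The function `parse_flag' is hypothetical and far simpler than the real option parser; it only mirrors the convention of recording each choice in a shell variable named `enable_feature' or `with_package'. The feature and package names in the usage lines are made up.

```shell
#!/bin/sh
# parse_flag: record one --enable/--disable/--with/--without flag
# in a shell variable (hypothetical helper, for illustration only).
parse_flag() {
  case $1 in
    --enable-*)  var=enable_${1#--enable-};  val=yes ;;
    --disable-*) var=enable_${1#--disable-}; val=no ;;
    # --without-* must be tested first, because --with-* would match it too.
    --without-*) var=with_${1#--without-};   val=no ;;
    --with-*)    var=with_${1#--with-};      val=yes ;;
    *) echo "unrecognized flag: $1" >&2; return 1 ;;
  esac
  # Shell variable names cannot contain dashes.
  var=$(printf '%s' "$var" | sed 's/-/_/g')
  eval "$var=\$val"
}

parse_flag --enable-shared
parse_flag --disable-nls
parse_flag --without-x
echo "enable_shared=$enable_shared enable_nls=$enable_nls with_x=$with_x"
```

The later tests in such a script can then consult these variables to decide which optional parts of the package to build.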

Finally, configure scripts can be passed parameters via environment variables. One of the things that configure does is decide what compiler to use and what flags to pass to that compiler. You can overrule the decisions that configure makes by setting the environment variables CC and CFLAGS. For example, to specify that you want the package to compile with full optimization and without any debugging symbols (which is a bad idea, yet people want to do it):

% export CFLAGS="-O3"
% ./configure

To tell configure to use the system's native compiler instead of gcc, and compile without optimization and with debugging symbols:

% export CC="cc"
% export CFLAGS="-g"
% ./configure

This assumes that you are using the bash shell as your default shell. If you use the csh or tcsh shells, you need to assign environment variables with the setenv command instead. For example:

% setenv CFLAGS "-O3"
% ./configure 

Similarly, the variables CXX and CXXFLAGS control the C++ compiler.
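
In Bourne-compatible shells such as bash, a variable can also be set for a single command by writing the assignment in front of it. This avoids leaving CC or CFLAGS exported for the rest of the session. The sketch below uses a stand-in script, here called `show-env.sh', in place of a real `configure' so that it can run anywhere.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for ./configure (hypothetical): report the compiler settings it sees.
cat > "$work/show-env.sh" <<'EOF'
#!/bin/sh
echo "CC=${CC:-unset} CFLAGS=${CFLAGS:-unset}"
EOF

# Make sure nothing is inherited from the calling environment.
unset CC CFLAGS

# The assignments apply to this one command only.
CC=cc CFLAGS=-g sh "$work/show-env.sh"

# Afterwards the variables are still not set in the current shell.
sh "$work/show-env.sh"
```

With a real package the equivalent command would be `CC=cc CFLAGS=-g ./configure'. This form does not work in csh or tcsh.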

Doing a VPATH build

Autoconfiguring source distributions also support vpath builds. In a vpath build, the source distribution is stored in one directory, which may be read-only, and the actual building takes place in a different directory where all the generated files are stored. We call the first directory the source tree, and the second directory the build tree. The build tree may be a subdirectory of the source tree, but it is better if it is a completely separate directory.

If you, the developer, use the standard features of the GNU build system, you don't need to do anything special to allow your packages to support vpath builds. The only exception to this is when you define your own make rules (see section General Automake principles). Then you have to follow certain conventions to allow vpath to work correctly.

You, the installer, however do need to do something special. You need to install and use GNU make. Most Unix make utilities do not support vpath builds, or their support doesn't work. GNU make is extremely portable, and if vpath is important to you, there is no excuse for not installing it.

Suppose that `/sources/foo-0.1' contains a source distribution, and you want to build it in the directory `/build/foo-0.1'. Assuming that both directories exist, all you have to do is:

% cd /build/foo-0.1
% /sources/foo-0.1/configure ...options...
% make
% make check
% su
# make install

The configure script and the generated makefiles will take care of the rest.
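
What the `configure' script does differently in a vpath build can be sketched with a toy script: it works out where the sources live from the path it was invoked with, and writes its output into the current directory. This is an illustration only, far simpler than a script generated by Autoconf, and the file contents are made up.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/sources/foo-0.1" "$work/build/foo-0.1"

# Toy configure script (hypothetical): find the source tree from $0
# and generate a Makefile in the directory it was run from.
cat > "$work/sources/foo-0.1/configure" <<'EOF'
#!/bin/sh
srcdir=$(cd "$(dirname "$0")" && pwd)
sed -e "s|@srcdir@|$srcdir|g" "$srcdir/Makefile.in" > Makefile
echo "configured in $(pwd), sources in $srcdir"
EOF

cat > "$work/sources/foo-0.1/Makefile.in" <<'EOF'
srcdir = @srcdir@
VPATH = @srcdir@
EOF

# Run configure from the build tree, as in the commands above.
cd "$work/build/foo-0.1"
sh "$work/sources/foo-0.1/configure"

# The generated file lands in the build tree; the source tree is untouched.
ls "$work/build/foo-0.1" "$work/sources/foo-0.1"
```

The `VPATH' line in the generated makefile is what tells `make' to look for source files in the source tree while writing its output in the build tree.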

vpath builds are preferred by some people for the following reasons:

  1. They prevent the build process from cluttering your source directory with all sorts of build files.
  2. To remove a build, all you have to do is remove the build directory.
  3. You can build the same source multiple times using different options. This is very useful if you would like to write a script that will run the test suite for a package while the package is configured in many different ways (e.g. different features, different compiler optimization, and so on). It is also useful if you would like to do the same with releasing binary distributions of the source.

Some developers like to use vpath builds all the time. Others use them only when necessary. In general, if a source distribution builds with a vpath build, it also builds under the ordinary build. The opposite is not true however. This is why the distcheck target checks if your distribution is correct by attempting a vpath build.

Making a binary distribution

After compiling a source distribution, instead of installing it, you can make a snapshot of the files that it would install and package that snapshot in a tarball. It is often convenient for installers to install from such snapshots rather than compile from source, especially when the source is extremely large, or when the number of packages that they need to install is large.

To create a binary distribution run the following commands as root:

# make install DESTDIR=/tmp/dist
# tar -C /tmp/dist -cvf package-version.tar .
# gzip -9 package-version.tar

The variable DESTDIR specifies a directory, alternative to root, for installing the compiled package. The directory tree under that directory is the exact same tree that would have normally been installed. Why not just specify a different prefix? Because very often, the prefix that you use to install the software affects the contents of the files that actually get installed.
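
The effect of DESTDIR can be imitated by hand: the staging directory is simply put in front of every installation path, while the prefix itself stays unchanged. The sketch below stages a single made-up script named `hello' and packages the result. It assumes a `tar' that supports `-C', as in the commands above, and uses a temporary directory instead of `/tmp/dist' so that it does not need root.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

prefix=/usr/local
DESTDIR=$work/dist

# Stand-in for a compiled program (hypothetical).
printf '#!/bin/sh\necho hello\n' > hello
chmod +x hello

# What "make install DESTDIR=..." amounts to for each file:
# the staging directory is prepended to the normal destination.
mkdir -p "$DESTDIR$prefix/bin"
cp hello "$DESTDIR$prefix/bin/hello"

# Package the staged tree; the trailing "." names the files to archive.
tar -C "$DESTDIR" -cf "$work/package-version.tar" .
gzip -9 "$work/package-version.tar"

# List the archive: paths are relative to the staging directory.
gunzip -c "$work/package-version.tar.gz" | tar tf -
```

Because the prefix is still `/usr/local', any path that the build embedded in the installed files remains correct once the tarball is unpacked at the root of the target system.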

Please note that under the terms of the GNU General Public License, if you distribute your software as a binary distribution, you also need to provide the corresponding source distribution. The simplest way to comply with this requirement is to distribute both distributions together.

Using GNU Emacs

Emacs is an environment for running Lisp programs that manipulate text interactively. To call Emacs merely an editor does not do it justice, unless you redefine the word "editor" to the broadest meaning possible. Emacs is so extensive, powerful and flexible, that you can almost think of it as a self-contained "operating system" in its own right.

Emacs is a very important part of the GNU development tools because it provides an integrated environment for software development. The simplest thing you can do with Emacs is edit your source code. However, you can do a lot more than that. You can run a debugger, and step through your program while Emacs shows you the corresponding sources that you are stepping through. You can browse on-line Info documentation and man pages, download and read your email off-line, and follow discussions on newsgroups. Emacs is particularly helpful with writing documentation with the Texinfo documentation system. You will find it harder to use Texinfo, if you don't use Emacs. It is also very helpful with editing files on remote machines over FTP, especially when your connection to the internet is over a slow modem. Finally, and most importantly, Emacs is programmable. You can write Emacs functions in Emacs Lisp to automate any chore that you find particularly useful in your own work. Because Emacs Lisp is a full programming language, there is no practical limit to what you can do with it.

If you already know a lot about Emacs, you can skip this chapter and move on. If you are a "vi" user, then we will assimilate you: See section Using vi emulation, for details. (3) This chapter will be most useful to the novice user who would like to get per Emacs up and running for software development; however, it is not by any means comprehensive. See section Further reading on Emacs, for references to more comprehensive Emacs documentation.

Installing GNU Emacs

If Emacs is not installed on your system, you will need to get a source code distribution and compile it yourself. Installing Emacs is not difficult. If Emacs is already installed on your GNU/Linux system, you might still need to reinstall it: you might not have the most recent version, you might have XEmacs instead, you might not have support for internationalization, or your Emacs might not have compiled support for reading mail over POP (a feature very useful to developers that hook up over modem). If any of these is the case, then uninstall that version of Emacs, and reinstall Emacs from a source code distribution.

The entire Emacs source code is distributed in three separate files:

`emacs-20.3.tar.gz'
This is the main Emacs distribution. If you do not care about international language support, you can install this by itself.
`leim-20.3.tar.gz'
This supplements the Emacs distribution with support for multiple languages. If you develop internationalized software, it is likely that you will need this.
`intlfonts-1.1.tar.gz'
This file contains the fonts that Emacs uses to support international languages. If you want international language support, you will definitely need this.

Get a copy of these three files, place them under the same directory and unpack them with the following commands:

% gunzip emacs-20.3.tar.gz
% tar xf emacs-20.3.tar
% gunzip leim-20.3.tar.gz
% tar xf leim-20.3.tar

Both tarballs will unpack under the `emacs-20.3' directory. When this is finished, configure the source code with the following commands:

% cd emacs-20.3
% ./configure --with-pop --with-gssapi

The `--with-pop' flag is almost always a good idea, especially if you are running Emacs from a home computer that is connected to the internet over modem. It will let you use Emacs to download your email from your internet provider and read it off-line (see section Using Emacs as an email client). Most internet providers use GSSAPI-authenticated POP. If you need to support other authentication protocols however, you may also want to add one of the following flags:

--with-kerberos
support Kerberos-authenticated POP
--with-kerberos5
support Kerberos version 5 authenticated POP
--with-hesiod
support Hesiod to get the POP server host

Then compile and install Emacs with:

% make
% su
# make install

Emacs is a very large program, so this will take a while.

To install `intlfonts-1.1.tar.gz' unpack it, and follow the instructions in the `README' file. Alternatively, you may find it more straightforward to install it from a Debian package. Packages for `intlfonts' exist as of Debian 2.1.

Basic Emacs concepts

In this section we describe what Emacs is and what it does. We will not yet discuss how to make Emacs work. That discussion is taken up in the subsequent sections, starting with section Configuring GNU Emacs. This section instead covers the fundamental ideas that you need to understand in order to make sense out of Emacs.

You can run Emacs from a text terminal, such as a vt100 terminal, but it is usually nicer to run Emacs under the X Window System. To start Emacs type

% emacs &

on your shell prompt. The seasoned GNU developer usually sets up per X configuration such that it starts Emacs when person logs in. Then, person uses that Emacs process for all of per work until person logs out. To quit Emacs press C-x C-c, or select

Files ==> Exit Emacs

from the menu. The notation C-c means CTRL-c. The separating dash `-' means that you press the key after the dash while holding down the key before the dash. Be sure to quit Emacs before logging out, to ensure that your work is properly saved. If there are any files that you haven't yet saved, Emacs will prompt you and ask you if you want to save them, before quitting. If at any time you want Emacs to stop doing what it's doing, press C-g.

Under the X Window System, Emacs controls multiple X windows, which are called frames. Each frame has a menubar and the main editing area. The editing area is divided into windows (4) by horizontal bars, called status bars. Every status bar contains concise information about the status of the window above the status bar. The minimal editing area has at least one big window, where editing takes place, and a small one-line window called the minibuffer. Emacs uses the minibuffer to display brief messages and to prompt the user to enter commands or other input. The minibuffer has no status bar of its own.

Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you actually have on your screen.

A buffer can be visiting a file. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press RET while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window with that buffer. From the user's perspective, by pressing RET person "opened" the file for editing. If the file has already been "opened" then Emacs simply rebinds the existing buffer for that file.

Sometimes Emacs will divide a frame into two or more windows. You can switch from one window to another by clicking the 1st mouse button, while the mouse is inside the destination window. To resize these windows, grab the status bar with the 1st mouse button and move it up or down. Pressing the 2nd mouse button, while the mouse is on a status bar, will bury the window below the status bar. Pressing the 3rd mouse button will bury the window above the status bar, instead. Buried windows are not killed; they still exist and you can get back to them by selecting them from the menu bar, under:

Buffers ==> name-of-buffer

Buffers, with some exceptions, are usually named after the filenames of the files that they correspond to.

Once you visit a file for editing, then all you need to do is to edit it! The best way to learn how to edit files using the standard Emacs editor is by working through the on-line Emacs tutorial. To start the on-line tutorial type C-h t or select:

Help ==> Emacs Tutorial

If you are a vi user, or you simply prefer to use `vi' keybindings, then read section Using vi emulation.

In Emacs, every event causes a Lisp function to be executed. An event can be any keystroke, mouse movement, mouse clicking or dragging, or a menu bar selection. The function implements the appropriate response to the event. Almost all of these functions are written in a variant of Lisp called Emacs Lisp. The actual Emacs program, the executable, is an Emacs Lisp interpreter with the implementation of frames, buffers, and so on. However, the actual functionality that makes Emacs usable is implemented in Emacs Lisp.

Sometimes, Emacs will bind a few words of text to an Emacs function. For example, when you use Emacs to browse Info documentation, certain words that correspond to hyperlinks to other nodes are bound to a function that makes Emacs follow the hyperlink. When such a binding is actually installed, moving the mouse over the bound text highlights it momentarily. While the text is highlighted, you can invoke the binding by clicking the 2nd mouse button.

Sometimes, an Emacs function might go into an infinite loop, or it might start doing something that you want to stop. You can always make Emacs abort (5) the function it is currently running by pressing C-g.

Emacs functions are usually spawned by Emacs itself in response to an event. However, the user can also spawn an Emacs function by typing:

ALT-x function-name RET

These functions can also be aborted with C-g.

It is standard in Emacs documentation to refer to the ALT key with the letter `M'. So, in the future, we will be referring to function invocations as:

M-x function-name
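
As an illustration, here is a minimal sketch of a function that can be invoked this way. The name `my-insert-date' is hypothetical; the `(interactive)' form is what makes a function available through M-x:

```elisp
;; A small command that inserts today's date at the cursor position.
;; The function name is made up for this example.
(defun my-insert-date ()
  "Insert the current date at point."
  (interactive)
  (insert (format-time-string "%Y-%m-%d")))
```

After evaluating this definition, typing `M-x my-insert-date RET' runs the function.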

Because Emacs functionality is implemented in an event-driven fashion, the Emacs developer has to write Lisp functions that implement functionality, and then bind these functions to events. Tables of such bindings are called keymaps.
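
You can ask Emacs which function a key sequence is bound to. The following sketch assumes the default bindings have not been changed; evaluate each expression, for example in the `*scratch*' buffer, to see the name of the bound function:

```elisp
;; Look up a key sequence in the keymaps that are currently active
;; (the global keymap plus the keymaps of the current mode).
(key-binding "\C-x\C-c")

;; Look up a key sequence in the global keymap only.
(lookup-key global-map "\C-x\C-f")
```

Interactively, `C-h k' followed by the key sequence answers the same question and also shows the documentation of the function.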

Emacs has a global keymap, which is in effect at all times, and then it has specialized keymaps depending on what editing mode you use. Editing modes are selected when you visit a file depending on the name of the file. So, for example, if you visit a C file, Emacs goes into the C mode. If you visit `Makefile', Emacs goes into makefile mode. The reason for associating different modes with different types of files is that the user's editing needs depend on the type of file that person is editing.
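
The association between filenames and modes is kept in the variable `auto-mode-alist', which pairs regular expressions with mode functions. As a sketch, assuming a hypothetical file extension `.foo' that you would like to edit in `text-mode', you could add to your `.emacs':

```elisp
;; Visit files whose names end in ".foo" in text-mode.
;; The ".foo" extension is a made-up example.
(setq auto-mode-alist
      (cons '("\\.foo\\'" . text-mode) auto-mode-alist))
```

New entries are added to the front of the list so that they take precedence over the defaults.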

You can also enter a mode by running the Emacs function that initializes the mode. Here are the most commonly used modes:

M-x c-mode
Mode for editing C programs according to the GNU coding standards.
M-x c++-mode
Mode for editing C++ programs
M-x sh-mode
Mode for editing shell scripts.
M-x m4-mode
Mode for editing Autoconf macros.
M-x texinfo-mode
Mode for editing documentation written in the Texinfo formatting language. See section Introduction to Texinfo.
M-x makefile-mode
Mode for editing makefiles.

As a user you shouldn't have to worry too much about the modes. The defaults do the right thing. However, you might want to enhance Emacs to suit your needs better.

Configuring GNU Emacs

To use Emacs effectively for software development you need to configure it. Part of the configuration needs to be done in your X-resources file. On a Debian GNU/Linux system, the X-resources can be configured by editing

/etc/X11/Xresources

In many systems, you can configure X-resources by editing a file called `.Xresources' or `.Xdefaults' in your home directory, but that is system-dependent. The configuration that I use on my system is:

! Emacs defaults
emacs*Background: Black
emacs*Foreground: White
emacs*pointerColor: White
emacs*cursorColor: White
emacs*bitmapIcon: on
emacs*font: fixed
emacs*geometry: 80x40

In general I favor dark backgrounds and `fixed' fonts. Dark backgrounds make it easier to sit in front of the monitor for a prolonged period of time. The `fixed' font looks nice and is small enough to make efficient use of your screen space. Some people might prefer larger fonts, however.
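
If you would rather not depend on X-resources, most of these settings can also be approximated from Emacs Lisp through the variable `default-frame-alist'. This is only a sketch of a roughly equivalent setup, not a replacement recommended by the author; the `bitmapIcon' resource is left out:

```elisp
;; Frame parameters applied to every new frame.
;; The values mirror the X-resources shown above.
(setq default-frame-alist
      '((background-color . "black")
        (foreground-color . "white")
        (mouse-color      . "white")
        (cursor-color     . "white")
        (font             . "fixed")
        (width            . 80)     ; columns
        (height           . 40)))   ; lines
```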

When Emacs starts up, it looks for a file called `.emacs' in the user's home directory, and evaluates its contents through the Emacs Lisp interpreter. You can customize and modify Emacs' behaviour by adding commands, written in Emacs Lisp, to this file. Here's a brief outline of the ways in which you can customize Emacs:

  1. A common change to the standard configuration is assigning global variables to non-default values. Many Emacs features and behaviours can be controlled and customized this way. This is done with the `setq' command, which accepts the following syntax:
    (setq variable value)
    
    For example:
    (setq viper-mode t)
    
    You can access on-line documentation for global variables by running:
    M-x describe-variable
    
  2. In some cases, Emacs depends on the values of shell environment variables. These can be manipulated with `setenv':
    (setenv "variable" "value")
    
    For example:
    (setenv "INFOPATH" "/usr/info:/usr/local/info")
    
    `setenv' does not affect the shell that invoked Emacs, but it does affect Emacs itself, and shells that are run under Emacs.
  3. Another way to enhance your Emacs configuration is by modifying the global keymap. This can be done with the `global-set-key' command, which follows the following syntax:
    (global-set-key [key sequence] 'function)
    
    For example, adding:
    (global-set-key [f12 d] 'doctor)
    
    to `.emacs' makes the key sequence F12 d equivalent to running `M-x doctor'. Emacs has many functions that provide all sorts of features. To find out about specific functions, consult the Emacs user manual. Once you know that a function exists, you can also get on-line documentation for it by running:
    M-x describe-function
    
    You can also write your own functions in Emacs Lisp.
  4. It is not always good to introduce bindings to the global map. Any bindings that are useful only within a certain mode should be added only to the local keymap of that mode. Consider for example the following Emacs Lisp function:
    (defun texi-insert-@example ()
      "Insert an @example @end example block"
      (interactive)
      (beginning-of-line)
      (insert "\n@example\n")
      (save-excursion 
        (insert "\n")
        (insert "@end example\n")
        (insert "\n@noindent\n")))
    
    We would like to bind this function to the key `F9', however we would like this binding to be in effect only when we are within `texinfo-mode'. To do that, first we must define a hook function that establishes the local bindings using `define-key':
    (defun texinfo-elef-hook ()
      (define-key texinfo-mode-map [f9] 'texi-insert-@example))
    
    The syntax of `define-key' is similar to `global-set-key' except it takes the name of the local keymap as an additional argument. The local keymap of any `name-mode' is `name-mode-map'. Finally, we must ask `texinfo-mode' to call the function `texinfo-elef-hook'. To do that use the `add-hook' command:
    (add-hook 'texinfo-mode-hook 'texinfo-elef-hook)
    
    In some cases, Emacs itself will provide you with a few optional hooks that you can attach to your modes.
  5. You can write your own modes! If you write a program whose use involves editing some type of input files, it is very much appreciated by the community if you also write an Emacs mode for that file and distribute it with your program.
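
To give an idea of what the last item involves, here is a minimal sketch of a new mode built on top of `text-mode' with `define-derived-mode'. The mode name `foo-input-mode', the `.foo' extension and the `#' comment syntax are all hypothetical; a real mode would add its own keymap entries and indentation rules:

```elisp
;; A hypothetical major mode for editing ".foo" input files.
(define-derived-mode foo-input-mode text-mode "Foo-Input"
  "Major mode for editing hypothetical `.foo' input files."
  ;; Assume, for illustration, that comments in these files start with "#".
  (make-local-variable 'comment-start)
  (setq comment-start "# "))

;; Enter the mode automatically when visiting a ".foo" file.
(setq auto-mode-alist
      (cons '("\\.foo\\'" . foo-input-mode) auto-mode-alist))
```

Deriving from an existing mode means that the new mode inherits the parent's keymap and hooks, so only the differences need to be written.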

With the exception of simple customizations, most of the more complicated ones require that you write new Emacs Lisp functions, distribute them with your software and somehow make them visible to the installer's Emacs when person installs your software. See section Emacs Lisp with Automake, for more details on how to include Emacs Lisp packages to your software.

Here are some simple customizations that you might want to add to your `.emacs' file:

Emacs now has a graphical user interface to customization that will write `.emacs' for you automatically. To use it, select:

Help ==> Customize ==> Browse Customization Groups

from the menu bar. You can also manipulate some common settings from:

Help ==> Options

Using vi emulation

Many hackers prefer to use the `vi' editor. The `vi' editor is the standard editor on Unix. It is also always available on GNU/Linux. Many system administrators find it necessary to use vi, especially when they are in the middle of setting up a system in which Emacs has not been installed yet. Besides that, there are many compelling reasons why people like vi.

Because most rearrangements of finger habits are not as optimal as the vi finger habits, most vi users react very unpleasantly to other editors. For the benefit of these users, in this section we describe how to run a vi editor under the Emacs system. Similarly, users of other editors find the vi finger habits strange and unintuitive. For the benefit of these users we describe briefly how to use the vi editor, so they can try it out if they like.

The vi emulation package for the Emacs system is called Viper. To use Viper, add the following lines in your `.emacs':

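;; Ask for Viper to be turned on; set this before Viper is loaded.
(setq viper-mode t)
;; Skip the introductory message that Viper shows at startup.
(setq viper-inhibit-startup-message t)
;; Select expert level 3.
(setq viper-expert-level 3)
(require 'viper)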

We recommend expert level 3, as the most balanced blend of the vi editor with the Emacs system. Most editing modes are aware of Viper, and when you begin editing the text you are immediately thrown into Viper. Some modes however do not do that. In some modes, like the Dired mode, this is very appropriate. In other modes however, especially custom modes that you have added to your system, Viper does not know about them, so it does not configure them to enter Viper mode by default. To tell a mode to enter Viper by default, add a line like the following to your `.emacs' file:

(add-hook 'm4-mode-hook 'viper-mode)

The modes that you are most likely to use during software development are

c-mode  , c++-mode , texinfo-mode
sh-mode , m4-mode  , makefile-mode
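
If several of these modes do not enter Viper on their own in your installation, the `add-hook' line shown above can be repeated for each of them. A compact sketch of that, following the same approach and assuming you want Viper in all six modes:

```elisp
;; Add `viper-mode' to the hook of each listed mode.
;; `mapcar' is used here only for its side effects.
(mapcar (lambda (hook) (add-hook hook 'viper-mode))
        '(c-mode-hook c++-mode-hook texinfo-mode-hook
          sh-mode-hook m4-mode-hook makefile-mode-hook))
```

Remove from the list any mode that already behaves the way you want.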

Sometimes, Emacs will enter Viper mode by default in modes where you prefer to get Emacs modes. In some versions of Emacs, the compilation-mode is such a mode. To tell a mode not to enter Viper by default, add a line like the following to your `.emacs' file:

(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)

The Emacs distribution has a Viper manual. For more details on setting Viper up, you should read that manual.

The vi editor has these things called editing modes. An editing mode defines how the editor responds to your keystrokes. Vi has three editing modes: insert mode, replace mode and command mode. If you run Viper, there is also the Emacs mode. Emacs indicates which mode you are in by showing one of `<I>', `<R>', `<V>', `<E>' on the status bar, respectively for the Insert, Replace, Command and Emacs modes. Emacs also shows you the mode by the color of the cursor. This makes it easy for you to keep track of which mode you are in.

While you are in Command mode, you can prepend keystrokes with a number. Then the subsequent keystroke will be executed as many times as the number. We now list the most important keystrokes that are available to you, while you are in Viper's command mode:

These are enough to get you started. Getting used to dealing with the modes and learning the commands is a matter of building finger habits. It may take you a week or two before you become comfortable with Viper. When Viper becomes second nature to you however, you won't want to tolerate what you used to use before.

Navigating source code

When you develop software, you need to edit many files at the same time, and you need an efficient way to switch from one file to another. The most general solution in Emacs is by going through Dired, the Emacs Directory Editor.

To use Dired effectively, we recommend that you add the following customizations to your `.emacs' file: First, add

(add-hook 'dired-load-hook (function (lambda () (load "dired-x"))))
(setq dired-omit-files-p t)

to activate the extended features of Dired. Then add the following key-bindings to the global keymap:

(global-set-key [f1] 'dired)
(global-set-key [f2] 'dired-omit-toggle)
(global-set-key [f3] 'shell)
(global-set-key [f4] 'find-file)
(global-set-key [f5] 'compile)
(global-set-key [f6] 'visit-tags-table)
(global-set-key [f8] 'add-change-log-entry-other-window)
(global-set-key [f12] 'make-frame)

If you use viper (see section Using vi emulation), you should also add the following customization to your `.emacs':

(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)

With these bindings, you can navigate from file to file or switch between editing and the shell simply by pressing the right function keys. Here's what these key bindings do:

f1
Enter the directory editor.
f2
Toggle the omission of boring files.
f3
Get a shell at the current Emacs window.
f4
Jump to a file, by filename.
f5
Run a compilation job.
f6
Load a `TAGS' file.
f8
Update the `ChangeLog' file.
f12
Make a new frame.

When you first start Emacs, you should create a few frames with f12 and move them around on your screen. Then press f1 to enter the directory editor and begin navigating the file system. To select a file for editing, move the cursor over the filename and press enter. You can select the same file from more than one Emacs window, and edit different parts of it in every different window, or use the mouse to cut and paste text from one part of the file to another. If you want to take a direct jump to a specific file, and you know the filename of that file, it may be faster to press f4 and enter the filename rather than navigate your way there through the directory editor.

To go down a directory, move the cursor over the directory filename and press enter. To go up a few directories, press f1 and when you are prompted for the new directory, with the current directory as the default choice, erase your way up the hierarchy and press RET. To take a jump to a substantially different directory that you have visited recently, press f1 and then when prompted for the destination directory name, use the cursor keys to select the directory that you want among the list of directories that you have recently visited.

While in the directory navigator, you can use the cursor keys to move to another file. Pressing <RET> will bring that file up for editing. However there are many other things that Dired will let you do instead:

Z
Compress the file. If already compressed, uncompress it.
L
Parse the file through the Emacs Lisp interpreter. Use this only on files that contain Emacs Lisp code.
I, N
Visit the current file as an Info file, or as a man page. See section Browsing documentation.
d
Mark the file for deletion
u
Remove a mark on the file for deletion
x
Delete all the files marked for deletion
C destination <RET>
Copy the file to destination.
R filename <RET>
Rename the file to filename.
+ directoryname <RET>
Create a directory with name directoryname.

Dired has many other features. See the GNU Emacs User Manual, for more details.

Emacs provides another method for jumping from file to file: tags. Suppose that you are editing a C program whose source code is distributed in many files, and while editing the source for the function foo, you note that it is calling another function gleep. If you move your cursor on gleep, then Emacs will let you jump to the file where gleep is defined by pressing M-.. You can also jump to other occurrences in your code where gleep is invoked by pressing M-,. In order for this to work, you need to do two things: you need to generate a tags file, and you need to tell Emacs to load the file. If your source code is maintained with the GNU build system, you can create that tags file by typing:

% make tags

from the top-level directory of your source tree. Then load the tags file in Emacs by navigating Dired to the toplevel directory of your source code, and pressing f6.
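
If you work on the same source tree most of the time, you may prefer to load its tags file from `.emacs' instead of pressing f6 in every session. A minimal sketch, where the path is a hypothetical placeholder for the top-level directory of your own project:

```elisp
;; Load the tags table of a project at startup.
;; "~/src/myproject/TAGS" is a made-up path; use your own.
(visit-tags-table "~/src/myproject/TAGS")
```

The `TAGS' file must already exist when this line is evaluated, so run `make tags' first.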

While editing a file, you may want to hop to the shell prompt to run a program. You can do that at any time, on any frame, by pressing f3. To get out of the shell, and back into the file that you were editing, enter the directory editor by pressing f1, and then press <RET> repeatedly. The default selections will take you back to the file that you were most recently editing on that frame.

One very nice feature of Emacs is that it understands tar files. If you have a tar file `foo.tar' and you select it under Dired, then Emacs will load the entire file, parse it, and let you edit the individual files that it includes directly. This only works, however, when the tar file is not compressed. Usually tar files are distributed compressed, so you should uncompress them first with Z before entering them. Also, be careful not to load an extremely huge tar file. Emacs may mean "eating memory and constantly swapping" to some people, but don't push it!

Another very powerful feature of Emacs is the Ange-FTP package: it allows you to edit files on other computers, remotely, over an FTP connection. From a user perspective, remote files behave just like local files. All you have to do is press f1 or f4 and request a directory or file with filename following this form:

/username@host:/pathname

Then Emacs will access for you the file `/pathname' on the remote machine host by logging in over FTP as username. You will be prompted for a password, but that will happen only once per host. Emacs will then download the file that you want to edit and let you make your changes locally. When you save your changes, Emacs will use an FTP connection again to upload the new version back to the remote machine, replacing the older version of the file there. When you develop software on a remote computer, this feature can be very useful, especially if your connection to the Net is over a slow modem line. This way you can edit remote files just like you do with local files. You will still have to telnet to the remote computer to get a shell prompt. In Emacs, you can do this with M-x telnet. An advantage to telneting under Emacs is that it records your session, and you can save it to a file to browse it later.
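
The same remote filenames can be used from Emacs Lisp. The following sketch assumes a hypothetical account `karl' on a hypothetical host `ftp.example.org'; neither name comes from a real setup:

```elisp
;; User name to assume when a remote filename does not specify one.
(setq ange-ftp-default-user "karl")

;; Visit a remote file, exactly as f4 would after typing the name.
;; Host and path are made up for illustration.
(find-file "/karl@ftp.example.org:/home/karl/project/README")
```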

While you are making changes to your files, you should also be keeping a diary of these changes in a `ChangeLog' file (see section Maintaining the documentation files). Whenever you are done with a modification that you would like to log, press f8, while the cursor is still at the same file, and preferably near the modification (for example, if you are editing a C program, be inside the same C function). Emacs will split the frame into two windows. The new window brings up your `ChangeLog' file. Record your changes and click on the status bar that separates the two windows with the 2nd mouse button to get rid of the `ChangeLog' file. Because updating the log is a frequent chore, this Emacs help is invaluable.

If you would like to compile your program, you can use the shell prompt to run `make'. However, the Emacs way is to use the M-x compile command. Press f5. Emacs will prompt you for the command that you would like to run. You can enter something like: `configure', `make', `make dvi', and so on (see section Installing a GNU package). The directory on which this command will run is the current directory of the current buffer. If your current buffer is visiting a file, then your command will run on the same directory as the file. If your current buffer is the directory editor, then your command will run on that directory. When you press <RET>, Emacs will split the frame into another window, and it will show you the command's output on that window. If there are error messages, then Emacs converts these messages to hyperlinks and you can follow them by pressing <RET> while the cursor is on them, or by clicking on them with the 2nd mouse button. When you are done, click on the status bar with the 2nd mouse button to get the compilation window off your screen.
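
A few settings in `.emacs' can make compilation jobs more convenient. This is a sketch with values chosen only for illustration; in particular, the choice of f7, which is unused by the bindings above, is an assumption of ours:

```elisp
;; Command offered as the default when you press f5.
(setq compile-command "make -k ")

;; Height, in lines, of the window that shows the compiler output.
(setq compilation-window-height 12)

;; Jump to the source location of the next error message.
(global-set-key [f7] 'next-error)
```

With the last binding, pressing f7 repeatedly walks through the error messages one at a time, without having to move the cursor into the compilation window.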

Using Emacs as an email client

You can use Emacs to read your email. If you maintain free software, or in general maintain a very active internet life, you will get a lot of email. The Emacs mail readers have been designed to address the needs of software developers who get endless tons of email every day.

Emacs has two email programs: Rmail and Gnus. Rmail is simpler to learn, and it is similar to many other mail readers. The philosophy behind Rmail is that instead of separating messages to different folders, you attach labels to each message but leave the messages on the same folder. Then you can tell Rmail to browse only messages that have specific labels. Gnus, on the other hand, has a rather eccentric approach to email. It is a news-reader, so it makes your email look like another newsgroup! This is actually very nice if you are subscribed to many mailing lists and want to sort your email messages automatically. To learn more about Gnus, read the excellent Gnus manual. In this manual, we will only describe Rmail.

When you start Rmail, it moves any new mail from your mailboxes to the file `~/RMAIL' in your home directory. So, the first thing you need to tell Rmail is where your mailboxes are. To do that, add the following to your `.emacs':

(require 'rmail)
(setq rmail-primary-inbox-list
      (list "mailbox1" "mailbox2" ...))

If your mailboxes are on a filesystem that is mounted to your computer, then you just have to list the corresponding filenames. If your mailbox is on a remote computer, then you have to use the POP protocol to download it to your own computer. In order for this to work, the remote computer must support POP. Many hobbyist developers receive their email on an internet provider computer that is connected to the network 24/7 and download it on their personal computer whenever they dial up.

For example, if karl@whitehouse.gov is your email address at your internet provider, and they support POP, you would have to add the following to your `.emacs':

(require 'rmail)
(setq rmail-primary-inbox-list
      (list "po:karl"))
(setenv "MAILHOST" "whitehouse.gov")
(setq rmail-pop-password-required t)
(setq user-mail-address "karl@whitehouse.gov")
(setq user-full-name "President Karl Marx")

The string `"po:username"' is used to tell the POP daemon which mailbox you want to download. The environment variable MAILHOST tells Emacs which machine to connect to, to talk with a POP daemon. We also tell Emacs to prompt in the minibuffer to request the password for logging in with the POP daemon. The alternative is to hardcode the password into the `.emacs' file, but doing so is not a very good idea: if the security of your home computer is compromised, the cracker also gets your password for another system. Emacs will remember the password however, after the first time you enter it, so you won't have to enter it again later, during the same Emacs session. Finally, we tell Emacs our internet provider's email address and our "real name" in the internet provider's account. This way, when you send email from your home computer, Emacs will spoof it to make it look like it was sent from the internet provider's computer.

In addition to telling Rmail where to find your email, you may also want to add the following configuration options:

  1. Quote messages that you respond to with the > prefix:
    (setq mail-yank-prefix ">")
    
  2. Send yourself a blind copy of every message
    (setq mail-self-blind t)
    
  3. Alternatively, archive all your outgoing messages to a separate file:
    (setq mail-archive-file-name "/home/username/mail/sent-mail")
    
  4. To have Rmail insert your signature in every message that you send:
    (setq mail-signature t)
    
    and add the actual contents of your signature to `.signature' at your home directory.

Once Rmail is configured, to start downloading your email, run M-x rmail in Emacs. Emacs will load your mail, prompt you for your POP password if necessary, and download your email from the internet provider. Then, Emacs will display the first new message. You may quickly navigate by pressing n to go to the next message or p to go to the previous message. It is much better however to tell Emacs to compile a summary of your messages and let you navigate your mailbox using the summary. To do that, press h. Emacs will split your frame into two windows: one window will display the current message, and the other window the summary. A highlighted bar in the summary indicates what the current message is. Emacs will also display any labels that you have associated with your messages. While the current buffer is the summary, you can navigate from message to message with the cursor keys (up and down in particular). You can also run any of the following commands:

h
display a summary of all the messages
s
save any changes made to the mail box
<
go to the first message in the summary
>
go to the last message in the summary
g
download any new email
r
reply to a message
f
forward a message
m
compose a new message
d
delete the current message
u
undelete the current message
x
expunge messages marked for deletion
a label <RET>
add the label label to the current message
k label <RET>
remove the label label from the current message
l label <RET>
display a summary only of the messages with label label
o folder <RET>
add the current message to another folder
w filename <RET>
write the body of the current message to a file

Other than browsing email, here are some things that you will want to do:

In every one of these three cases you may need to edit the message's headers. The most commonly used header entries that Emacs recognizes are:

`To:'
list address of the recipient to whom the message is directed
`CC:'
list addresses of other recipients that need to receive courtesy copies of the message
`BCC:'
list addresses of other recipients to send a copy to, without showing their email address on the actual message
`FCC:'
list folders (filenames) where you would like the outgoing message to be appended to
`Subject:'
the subject field for the message

The fields `To:', `CC:', `BCC:' and `FCC:' can also have continuation lines: any subsequent lines that begin with a space are considered part of the field.

Handling patches

Believe it or not, I really don't know how to do that. I need a volunteer to explain this to me so that I can then explain it in this section.

Inserting copyright notices with Emacs

When you develop free software, you must place copyright notices at every file that invokes the General Public License. If you don't place any notice whatsoever, then the legal meaning is that you refuse to give any permissions whatsoever, and the software consequently is not free. For more details see section Applying the GPL. Many hackers, who don't take the law seriously, complain that adding the copyright notices takes too much typing. Some of these people live in countries where copyright is not really enforced. Others simply ignore it.

There is an Emacs package, called `gpl', which is currently distributed with Autotools, that makes it possible to insert and maintain copyright notices with minimal work. To use this package, in your `.emacs' you must declare your identity by adding the following commands:

(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")

Then you must require the packages to be loaded:

(require 'gpl)
(require 'gpl-copying)

This package introduces the following commands:

gpl
Insert the standard GPL copyright notice using appropriate commenting.
gpl-fsf
Toggle FSF mode. Causes the gpl command to insert a GPL notice for software that is assigned to the Free Software Foundation. The gpl command autodetects what type of file you are editing, from the filename, and uses the appropriate commenting.
gpl-personal
Toggle personal mode. Causes the gpl command to insert a GPL notice for software in which you keep the copyright.

If you are routinely assigning your software to an organization other than the Free Software Foundation, then insert:

(setq gpl-organization "name")

after the `require' statements in your `.emacs'.

Hacker sanity with Emacs

Every once in a while, after long heroic efforts in front of the computer monitor, a software developer will need some counseling to feel better about perself. In RL (real life) counseling is very expensive and it also involves getting up from your computer and transporting yourself to another location, which decreases your productivity. Emacs can help you. Run M-x doctor, and you will talk to a psychiatrist for free.

Many people say that hackers work too hard and they should go out for a walk once in a while. In Emacs, it is possible to do that without getting up from your chair. To enter an alternate universe, run M-x dunnet. Aside from being a refreshing experience, it is also a very effective way to procrastinate away work that you don't want to do. Why do today, what you can postpone for tomorrow?

Further reading on Emacs

This chapter should be enough to get you going with GNU Emacs. This is really all you need to know to use Emacs to develop software. However, the more you learn about Emacs, the more effectively you will be able to use it, and there is still a lot to learn; a lot more than we can fit in this one chapter. In this section we refer to other manuals that you can read to learn more about Emacs. Unlike many proprietary manuals that you are likely to find in bookstores, these manuals are free (see section Why free software needs free documentation). Whenever possible, please contribute to the GNU project by ordering a bound copy of the free documentation from the Free Software Foundation, or by contributing a donation.

The Free Software Foundation publishes the following manuals on Emacs:

The Emacs Editor
This manual tells you all there is to know about all the spiffy things that Emacs can do, except for a few things here and there that are so spiffy that they get to have their own separate manual. The printed version, published by the Free Software Foundation, features our hero, Richard Stallman, riding a gnu. It also includes the GNU Manifesto. The machine readable source for the manual is distributed with GNU Emacs.
Programming in Emacs Lisp
A wonderful introduction to Emacs Lisp, written by Robert Chassell. If you want to learn programming in Emacs Lisp, start by reading this manual. You can order this manual as a bound book from the Free Software Foundation. You can also download a machine readable copy of the manual from any GNU ftp site. Look for `emacs-lisp-intro-1.05.tar.gz'.
The GNU Emacs Lisp Reference Manual
This is a comprehensive reference manual for the Emacs Lisp language. You can order this manual as a bound book from the Free Software Foundation. You can also download a machine readable copy of the manual from any GNU ftp site. Look for `elisp-manual-20-2.5.tar.gz'.

The following manuals are also distributed with the GNU Emacs source code and they make for some very fun reading:

Gnus Manual
Gnus is the Emacs newsreader. You can also use it to sort out your email, especially if you are subscribed to twenty mailing lists and receive tons of email every day. This manual will tell you all you need to know about Gnus to use it effectively. (`gnus.dvi')
CC Mode
The Emacs C editing mode will help you write C code that is beautifully formatted and consistent with the GNU coding standards. If you develop software for an organization that follows different coding standards, you will have to customize Emacs to use their standards instead. If they are lame and haven't given you Elisp code for their standards, then this manual will show you how to roll your own. (`cc-mode.dvi')
Common Lisp Extensions
Emacs has a package that introduces many Common Lisp extensions to Emacs Lisp. This manual describes what extensions are available and how to use them. (`cl.dvi')
Writing Customization Definitions
Recent versions of Emacs have an elaborate user-friendly customization interface that will let users customize Emacs and update their `.emacs' files automatically for them. If you are writing large Emacs packages, it is very easy to add a customization interface to them. This manual explains how to do it. (`customize.dvi')
The Emacs Widget Library
It is possible to insert actual widgets in an Emacs buffer that are bound to Emacs Lisp functions. This feature of Emacs is used, for example, in the newly introduced customization interface. This manual documents the Elisp API for using these widgets in your own Elisp packages. (`widget.dvi')
RefTeX User Manual
If you are writing large documents with LaTeX that contain a lot of crossreferences, then the RefTeX package will make your life easier. (`reftex.dvi')
Ediff User's Manual
Ediff is a comprehensive package for dealing with patches under Emacs. If you receive a lot of patches to your software projects from contributors, you can use Ediff to apply them to your source code. (`ediff.dvi')
Supercite User's Manual
If you think that quoting your responses to email messages with `>' is for lamers and you want to be elite, then use Supercite. (`sc.dvi')
Viper Is a Package for Emacs Rebels
This manual has more than you will ever need to know about Viper, the Emacs vi emulation. See section Using vi emulation, which actually describes all the features of Viper that you will ever really need. But still, it makes good reading for a long airplane trip. (`viper.dvi')

Compiling with Makefiles

In this chapter we describe how to use the compiler to compile simple software and libraries, and how to use makefiles.

Compiling simple programs

It is very easy to compile simple C programs on the GNU system. For example, consider the famous "Hello world" program:

`hello.c'
#include <stdio.h>
int
main ()
{
  printf ("Hello world\n");
  return 0;
}

The simplest way to compile this program is to type:

% gcc hello.c

on your shell. The resulting executable file is called `a.out' and you can run it from the shell like this:

% ./a.out
Hello world

To cause the executable to be stored under a different filename pass the `-o' flag to the compiler:

% gcc hello.c -o hello
% ./hello
Hello world

Even with simple one-file hacks like this, the GNU compiler can accept many options that modify its behaviour:

`-g'
The `-g' flag causes the compiler to output debugging information to the executable. This way, you can step your program through a debugger if it crashes. (FIXME: Crossreference)
`-O, -O2, -O3'
The `-O', `-O2', `-O3' flags activate optimization. The numbers are called optimization levels. When you compile your program with optimization enabled, the compiler applies certain algorithms to the machine code output to make it go faster. The cost is that your program compiles much more slowly and that, although you can still step it through a debugger if you used the `-g' flag, things will be a little strange. During development the programmer usually uses no optimization, and only activates it when he or she is about to run the program for a production run. A piece of good advice: always test your code with optimization activated as well. If optimization breaks your code, then this is most likely telling you that you have a memory bug. Good luck finding it.
`-Wall'
The `-Wall' flag tells the compiler to issue warnings when it sees bad programming style. Some of these warnings catch actual bugs, but occasionally some of the warnings complain about something correct that you did on purpose. For this reason this feature is not activated by default.

Here are some variations of the above example:

% gcc -g -O3 hello.c -o hello
% gcc -g -Wall hello.c -o hello
% gcc -g -Wall -O3 hello.c -o hello
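As an illustration of a warning about "something correct that you did on purpose", here is a minimal sketch. The function name count_spaces is hypothetical. The loop deliberately uses an assignment as its condition. With GCC, `-Wall' includes a check that suggests parentheses around an assignment used as a truth value, because such code often turns out to be a mistyped `=='. The extra pair of parentheses below is the usual way to tell the compiler that the assignment is intended.

```c
#include <stddef.h>

/* Count the space characters in the string S.  */
size_t
count_spaces (const char *s)
{
  size_t n = 0;
  char c;

  /* Written as `while (c = *s++)' this would still be correct C,
     but GCC with -Wall would suggest parentheses around the
     assignment.  The doubled parentheses mark it as intentional.  */
  while ((c = *s++))
    if (c == ' ')
      n++;
  return n;
}
```

Other compilers, and other versions of GCC, may word the warning differently or not issue it at all.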

To run a compiled executable in the current directory just type its name, prepended by `./'. In general, once you compile a useful program, you should install it so that it can be run from any current directory, simply by typing its name without prepending `./'. To install an executable, you need to move it to a standard directory such as `/usr/bin' or `/usr/local/bin'. If you don't have permissions to install files there, you can instead install them in your home directory at `/home/username/bin' where username is your username. When you type the name of an executable, the shell looks for the executable in a set of directories listed in the environment variable `PATH'. To add a nonstandard directory to your path do

% export PATH="$PATH:/home/username/bin"

if you are using the Bash shell, or

% setenv PATH "$PATH:/home/username/bin"

if you are using a C shell such as csh or tcsh.
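The lookup that the shell performs on `PATH' can be sketched in C as follows. This is only an illustration under simplifying assumptions, not the code of any actual shell: the function name find_in_path is hypothetical, the directory list is passed in as an argument rather than read from the environment, and empty entries in the list are skipped.

```c
#include <string.h>
#include <unistd.h>

/* Search the colon-separated directory list PATH for an executable
   called NAME.  On success store its full filename in OUT (which
   has room for SIZE bytes) and return 1; otherwise return 0.  */
int
find_in_path (const char *path, const char *name, char *out, size_t size)
{
  const char *start = path;

  while (*start != '\0')
    {
      /* Find the end of the current directory entry.  */
      const char *end = strchr (start, ':');
      size_t len = end ? (size_t) (end - start) : strlen (start);

      /* Build "directory/name" and ask whether we may execute it.
         Entries that are empty or too long for OUT are skipped.  */
      if (len > 0 && len + 1 + strlen (name) + 1 <= size)
        {
          memcpy (out, start, len);
          out[len] = '/';
          strcpy (out + len + 1, name);
          if (access (out, X_OK) == 0)
            return 1;
        }

      if (end == NULL)
        break;
      start = end + 1;
    }
  return 0;
}
```

A program could call it as find_in_path (getenv ("PATH"), "hello", buffer, sizeof buffer). The first directory in the list that contains the executable wins, which is why the order of the directories in `PATH' matters.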

Programs with many source files

Now let's consider the case where you have a much larger program made of source files `foo1.c', `foo2.c', `foo3.c' and header files `header1.h' and `header2.h'. One way to compile the program is like this:

% gcc foo1.c foo2.c foo3.c -o foo

This is fine when you have only a few files to deal with. Eventually, when you have more than a few dozen files, it becomes wasteful to compile all of the files, all the time, every time you make a change in only one of the files. For this reason, the compiler allows you to compile every file separately into an intermediate file called an object file, and link all the object files together at the end. This can be done with the following commands:

% gcc -c foo1.c
% gcc -c foo2.c
% gcc -c foo3.c
% gcc foo1.o foo2.o foo3.o -o foo

The first three commands generate the object files `foo1.o', `foo2.o', `foo3.o' and the last command links them together into the final executable file `foo'. The `*.o' suffix is reserved for use only by object files.

If you make a change only in `foo1.c', then you can rebuild `foo' like this:

% gcc -c foo1.c
% gcc foo1.o foo2.o foo3.o -o foo

The object files `foo2.o' and `foo3.o' do not need to be rebuilt since only `foo1.c' changed, so it is not necessary to recompile them.
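To see why this works, it helps to look at how such a program might be divided. The following is only a sketch with hypothetical function names. It is shown as a single listing so that it compiles as written, with comments marking where each file would begin. In the real layout each `*.c' file would contain `#include "header1.h"' instead of relying on the declarations appearing earlier in the same file.

```c
/* --- header1.h: declarations shared by the source files --- */
int foo2_square (int x);
int foo3_add (int a, int b);

/* --- foo2.c: compiles to foo2.o --- */
int
foo2_square (int x)
{
  return x * x;
}

/* --- foo3.c: compiles to foo3.o --- */
int
foo3_add (int a, int b)
{
  return a + b;
}

/* --- foo1.c: compiles to foo1.o and would also hold main () --- */
int
foo1_sum_of_squares (int a, int b)
{
  /* Only the declarations from header1.h are needed to compile
     these calls; the definitions are found at link time.  */
  return foo3_add (foo2_square (a), foo2_square (b));
}
```

Because `foo1.c' depends only on the declarations in the header file and not on the bodies of the functions defined in the other files, editing `foo1.c' leaves `foo2.o' and `foo3.o' valid. If you change a header file, on the other hand, every source file that includes it should be recompiled.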

Object files `*.o' contain definitions of variables and subroutines written out in assembly (machine language "pseudocode"). Most of these definitions will eventually be embedded in the final executable program at a specific address. At this stage, however, these memory addresses are not known, so they are referred to symbolically. These symbolic references are called symbols. It is possible to list the symbols defined in an object file with the `nm' command. For example:

% nm xmalloc.o
         U error
         U malloc
         U realloc
00000000 D xalloc_exit_failure
00000000 t xalloc_fail
00000004 D xalloc_fail_func
00000014 R xalloc_msg_memory_exhausted
00000030 T xmalloc
00000060 T xrealloc

The first column lists the symbol's address within the object file, when the symbol is actually defined in that object file. The second column lists the symbol type. The third column is the symbolic name of the symbol. In the final executable, these names become irrelevant. The following types commonly occur:

`T'
A function definition
`t'
A private function definition. Such functions are defined in C with the keyword static.
`D'
A global variable
`R'
A read-only (const) global variable
`U'
A symbol used but not defined in this object file.

For more details, see the Binutils manual.
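To connect these letters with C source code, here is a small sketch of a source file in which each definition is annotated with the symbol type that we would expect `nm' to report for it. All the names are hypothetical. The exact output depends on the compiler, the target system and the optimization level. For example, an optimizing compiler may inline the static function, in which case no `t' entry appears for it.

```c
#include <stdio.h>

int calls_left = 3;             /* initialized global variable: D */
const int first_line = 1;       /* read-only global variable: R */

static int                      /* private function: t */
exhausted (void)
{
  return calls_left <= 0;
}

int                             /* public function: T */
report (const char *message)
{
  if (exhausted ())
    return 0;
  /* printf is used here but defined in the C library: U */
  printf ("%d: %s\n", first_line + 3 - calls_left, message);
  calls_left--;
  return 1;
}
```

If you save this in a file, compile it with `gcc -c' and run `nm' on the resulting object file, you can compare the listing with the annotations.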

The job of the compiler is to translate all the C source files to object files containing a corresponding set of symbolic definitions. It is the job of another program, the linker, to put the object files together, resolve and evaluate all the symbolic addresses, and build a complete machine language program that can actually be executed. When you ask `gcc' to link the object files into an executable, the compiler is actually running the linker to do the job.

During the process of linking, all the machine language instructions that refer to a specific memory address need to be modified to use the correct addresses within the executable, as opposed to the addresses within their object file. This becomes an issue when you want your program to load and link compiled object files during run-time instead of compile-time. To make such dynamic linking possible, your symbols need to be relocatable. This means that your symbol definitions must be correct no matter where you place them in memory. There should be no memory addresses that need to be modified. One way to do this is by referring to memory addresses within the object file by giving an offset from the referring address. Memory addresses outside the object file must be treated as interlibrary dependencies and you must tell the compiler what you expect them to be when you attempt to build relocatable machine code. Unfortunately some flavours of Unix do not handle interlibrary dependencies correctly. Fortunately, all of this mess can be dealt with in a uniform way, to the extent that this is possible, by using GNU Libtool. See section Using Libtool, for more details.

On GNU and Unix, all compiled languages compile to object files, and it is possible, in principle, to link object files that have originated from source files written in different programming languages. For example it is possible to link source code written in Fortran together with source code written in C or C++. In such cases, you need to know how the compiler converts the names with which the programming language calls its constructs (such as variables, subroutines, etc.) to symbol names. Such conversions, when they actually happen, are called name-mangling. Both C++ and Fortran do name-mangling. C however is a very nice language, because it does absolutely no name-mangling. This is why when you want to write code that you want to export to many programming languages, it is best to write it in C. See section Using Fortran effectively, for more details on how to deal with the name-mangling done by Fortran compilers.
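As a small illustration, here is a sketch of a C function written so that it could be called from Fortran. It relies on assumptions that hold for many Fortran compilers but are not guaranteed: that the compiler converts the subroutine name to lowercase and appends one underscore, that all arguments are passed by reference, and that INTEGER and DOUBLE PRECISION correspond to the C types int and double. The name vscale is hypothetical.

```c
/* Under the assumptions stated above, Fortran code could call this
   as:  call vscale (n, alpha, x)
   It multiplies the first N elements of X by ALPHA.  */
void
vscale_ (const int *n, const double *alpha, double *x)
{
  int i;

  for (i = 0; i < *n; i++)
    x[i] *= *alpha;
}
```

Running `nm' on the object files produced by your Fortran compiler is a practical way to find out which convention it actually follows.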

Building libraries

In many cases a collection of object files forms a logical unit that is used by more than one executable. On both GNU and Unix systems, it is possible to collect such object files and form a library. On the GNU system, to create a library, you use the `ar' command:

% ar cru libfoo.a foo1.o foo2.o foo3.o

This will create a file `libfoo.a' from the object files `foo1.o', `foo2.o' and `foo3.o'. The suffix `*.a' is reserved for object file libraries. Before using the library, it needs to be "blessed" by a program called `ranlib':

% ranlib libfoo.a

The GNU system, and most Unix systems require that you run `ranlib', but there have been some Unix systems where doing so is not necessary. In fact there are Unix systems, like some versions of SGI's Irix, that don't even have the `ranlib' command!

The reason for this is historical. Originally ar was meant to be used merely for packaging files together. The more well known program tar is a descendant of ar that was designed to handle making such archives on a tape device. Now that tape devices are more or less obsolete, tar is playing the role that was originally meant for ar. As for ar, way back, some people thought to use it to package *.o files. However the linker wanted a symbol table to be passed along with the archive. So the ranlib program was written to generate that table and add it to the *.a file. Then some Unix vendors thought that if they incorporated ranlib into ar then users wouldn't have to worry about forgetting to call ranlib. So they provided ranlib but it did nothing. Some of the more evil ones dropped it altogether, breaking many people's scripts.

Once you have a library, you can link it with other object files just as if it were an object file itself. For example

% gcc bar.o libfoo.a -o foo

using `libfoo.a' as defined above, is equivalent to writing

% gcc bar.o foo1.o foo2.o foo3.o -o foo

Libraries are particularly useful when they are installed. To install a library you need to move the file `libfoo.a' to a standard directory. The actual location of that directory depends on your compiler. The GNU compiler looks for installed libraries in `/usr/lib' and `/usr/local/lib'. Because many Unix systems also use the GNU compiler, it is safe to say that both of these directories are standard in these systems too. However there are some Unix compilers that don't look at `/usr/local/lib' by default. Once a library is installed, it can be used in any project from any current directory to compile an executable that uses the subroutines that the library provides. You can direct the compiler to link an installed library with a set of object files to form an executable by using the `-l' flag like this:

% gcc -o foo bar.o -lfoo

Note that if the filename of the library is `libfoo.a', the corresponding argument to the `-l' flag must be only the substring `foo', hence `-lfoo'. Libraries must be named with names that have the form `lib*.a'. If you have installed the `libfoo.a' library in a non-standard directory, you can tell the linker to look for the library in that directory as well by using the `-L' flag. For example, if the library was installed in `/home/lf/lib' then we would have to invoke the linking like this:

% gcc -o bar bar.o -L/home/lf/lib -lfoo

The `-L' flag must appear before the `-l' flag.

Some people like to pass `-L.' to the compiler so they can link uninstalled libraries in the current working directory using the `-l' flag instead of typing in their full filenames. The idea is that they think "it looks better" that way. Actually this is considered bad style. You should use the `-l' flag to link only libraries that have already been installed and use the full pathnames to link in uninstalled libraries. The reason why this is important is because, even though it makes no difference when dealing with ordinary libraries, it makes a lot of difference when you are working with shared libraries. (FIXME: Crossreference). It makes a difference whether or not you are linking to an uninstalled or installed shared library, and in that case the `-l' semantics mean that you are linking an installed shared library. Please stick to this rule, even if you are not using shared libraries, to make it possible to switch to using shared libraries without too much hassle.

Also, if you are linking in more than one library, please pay attention to the order in which you link your libraries. When the linker links a library, it does not embed into the executable code the entire library, but only the symbols that are needed from the library. In order for the linker to know what symbols are really needed from any given library, it must have already parsed all the other libraries and object files that depend on that library! This implies that you first link your object files, then you link the higher-level libraries, then the lower-level libraries. If you are the author of the libraries, you must write your libraries in such a manner that the dependency graph of your libraries is a tree. If two libraries depend on each other bidirectionally, then you may have trouble linking them in. This suggests that they should be one library instead!
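The effect of the link order can be imitated with a toy model. The sketch below is not how any real linker is implemented: it treats every library as a single unit (real linkers consider each object file inside the archive separately), it uses small fixed-size tables without bounds checking, and all the names are hypothetical. It makes one pass from left to right, always takes object files, and takes a library only if that library defines a symbol that is undefined at that point.

```c
#include <string.h>

#define MAX_SYMS 16     /* small fixed limit, enough for a sketch */

/* One item on the link command line.  Both lists end with a
   null pointer.  */
struct item
{
  int is_library;
  const char *defines[4];
  const char *needs[4];
};

static int
contains (const char **set, int n, const char *sym)
{
  int i;

  for (i = 0; i < n; i++)
    if (strcmp (set[i], sym) == 0)
      return 1;
  return 0;
}

/* Return the number of symbols that are still undefined after one
   left-to-right pass over the COUNT entries of ITEMS.  */
int
link_pass (const struct item *items, int count)
{
  const char *defined[MAX_SYMS];
  const char *undefined[MAX_SYMS];
  int ndef = 0, nundef = 0, i, j, k;

  for (i = 0; i < count; i++)
    {
      const struct item *it = &items[i];
      int wanted = !it->is_library;

      /* A library is pulled in only if it defines something that
         is undefined so far.  */
      for (j = 0; it->defines[j] && !wanted; j++)
        wanted = contains (undefined, nundef, it->defines[j]);
      if (!wanted)
        continue;

      /* Its definitions resolve pending references...  */
      for (j = 0; it->defines[j]; j++)
        {
          defined[ndef++] = it->defines[j];
          for (k = 0; k < nundef; k++)
            if (strcmp (undefined[k], it->defines[j]) == 0)
              {
                undefined[k] = undefined[--nundef];
                break;
              }
        }

      /* ...and its own references may become new pending ones.  */
      for (j = 0; it->needs[j]; j++)
        if (!contains (defined, ndef, it->needs[j])
            && !contains (undefined, nundef, it->needs[j]))
          undefined[nundef++] = it->needs[j];
    }
  return nundef;
}
```

Suppose an object file needs a symbol from a higher-level library, which in turn needs a symbol from a lower-level library. Listing the items in the order object file, higher-level library, lower-level library gives 0 unresolved symbols in this model. Listing the lower-level library before the higher-level one leaves 1 symbol unresolved, because the lower-level library is passed over before anything has asked for it.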

Dealing with header files

In general libraries are composed of many `*.c' files that compile to object files, and a few header files (`*.h'). The header files declare the resources that are defined by the library and need to be included by any source files that use the library's resources. In general a library comes with two types of header files: public and private. The public header files declare resources that you want to make accessible to other software. The private header files declare resources that are meant to be used only for developing the library itself. To make an installed library useful, it is also necessary to install the corresponding public header files. The standard directory for installing header files is `/usr/include'. The GNU compiler also understands `/usr/local/include' as an alternative directory. When the compiler encounters the directive

#include <foo.h>

it searches these standard directories for `foo.h'. If you have installed the header files in a non-standard directory, you can tell the compiler to search for them in that directory by using the `-I' flag. For example, to build a program `bar' from a source file `bar.c' that uses the libfoo library installed under `/home/lf' you would need to do the following:

% gcc -c -I/home/lf/include bar.c
% gcc -o bar bar.o -L/home/lf/lib -lfoo

You can also do it in one step:

% gcc -I/home/lf/include -o bar bar.c -L/home/lf/lib -lfoo

For portability, it is better that the `-I' appear before the filenames of the source files that we want to compile.

A good coding standard is to distinguish private from public header files in your source code by including private header files like

#include "private.h"

and public header files like

#include <public.h>

in your implementation of the library, even when the public header files are not yet installed while building the library. This way source code can be moved in or out of the library without needing to change the header file inclusion semantics from `<..>' to `".."' back and forth. In order for this to work however, you must tell the compiler to search for "installed" header files in the current directory too. To do that you must pass the `-I' flag with the current directory `.' as argument (`-I.').

In many cases a header file needs to include other header files, and it is very easy for some header files to be included more than once. When this happens, the compiler will complain about multiple declarations of the same symbols and throw an error. To prevent this from happening, please surround the contents of your header files with a C preprocessor conditional like this:

#ifndef __defined_foo_h
#define __defined_foo_h
[...contents...]
#endif

The defined macro __defined_foo_h is used as a flag to indicate that the contents of this header file have been included. To make sure that each one of these macros is unique to only one header file, please combine the prefix __defined with the pathname of the header file when it gets installed. If your header file is meant to be installed as `/usr/local/include/foo.h' or `/usr/include/foo.h' then use __defined_foo_h. If your header file is meant to be installed in a subdirectory like `/usr/include/dir/foo.h' then please use __defined_dir_foo_h instead.
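Here is a minimal sketch of a complete guarded header file, followed by the source file that implements it. The type and function names are hypothetical, and the two files are shown in one listing so that it compiles as written; a real `foo.c' would include the header file instead. The struct definition is the kind of content that makes the guard necessary, because the compiler rejects a second definition of the same struct. Note also that the C standard reserves identifiers beginning with two underscores for the implementation, so some projects prefer flag macros without leading underscores; the sketch keeps the naming convention described above.

```c
/* ---- foo.h, meant to be installed as /usr/include/foo.h ---- */
#ifndef __defined_foo_h
#define __defined_foo_h

/* Seen a second time without the guard, this definition would be
   an error.  */
struct foo_point
{
  int x;
  int y;
};

/* Declared here, defined in foo.c.  */
int foo_manhattan (const struct foo_point *p);

#endif /* __defined_foo_h */

/* ---- foo.c ---- */
#include <stdlib.h>

/* Return the sum of the absolute values of the coordinates.  */
int
foo_manhattan (const struct foo_point *p)
{
  return abs (p->x) + abs (p->y);
}
```

When the header file is included for the second time, the macro is already defined and the preprocessor skips everything up to the matching #endif.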

In principle, every library can be implemented using only one public header file and perhaps only one private header file. There are problems with this approach however:

For small libraries, these problems are not very serious. For large libraries however, you may need to split the one large header file into many smaller files. Sometimes a good approach is to have a matching header file for each source file, meaning that if there is a `foo.c' there should be a `foo.h'. Some other times it is better to distribute declarations among header files by splitting the library's provided resources into various logical categories and declaring each category in a separate header file. It is up to the developer to decide how to do this best.

Once this decision is made, a few issues still remain:

One way of preventing the filename conflicts is to install the library's header files in a subdirectory below the standard directory for installing header files. Then we install one header file in the standard directory itself that includes all the header files in the subdirectory.

For example, if the Foo library wants to install headers `foo1.h', `foo2.h' and `foo3.h', it can install them under `/usr/include/foo' and install in `/usr/include/' only one header file `foo.h' containing only:

#include <foo/foo1.h>
#include <foo/foo2.h>
#include <foo/foo3.h>

Please name this "central" header and the directory for the subsidiary headers consistently after the corresponding library. So the `libfoo.a' library should install a central header named `foo.h' and all subsidiary headers under the subdirectory `foo'.

The subsidiary header files should be guarded with preprocessor conditionals, but it is not necessary to also guard the central header file that includes them. To make the flag macros used in these preprocessor conditionals unique, you should include the directory name in the flag macro's name. For example, `foo/foo1.h' should be guarded with

#ifndef __defined_foo_foo1_h
#define __defined_foo_foo1_h
[...contents...]
#endif

and similarly with `foo/foo2.h' and `foo/foo3.h'.

This approach creates yet another problem that needs to be addressed. If you recall, we suggested that you use the include "..." semantics for private header files and the include <...> semantics for public header files. This means that when you include the public header file `foo1.h' from one of the source files of the library itself, you should write:

#include <foo/foo1.h>

Unfortunately, if you place the `foo1.h' in the same directory as the file that attempts to include it, using these semantics, it will not work, because there is no subdirectory `foo' during compile time.

The simplest way to resolve this is by placing all of the source code for a given library under a directory and all such header files in a subdirectory named `foo'. The GNU build system in general requires that all the object files that build a specific library be under the same directory. This means that the C files must be in the same directory. It is okay however to place header files in a subdirectory.

This will also work if you have many directories, each containing the sources for a separate library, and a source file in directory `bar', for example, tries to include the header file `<foo/foo1.h>' from a directory `foo' below the directory containing the source code for the library libfoo. To make it work, just pass `-I' flags to the compiler for every directory containing the source code of a library in the package. See section Libraries with Automake, for more details. It will also work even if there are already old versions of `foo/foo1.h' installed in a standard directory like `/usr/include', because the compiler will first search under the directories mentioned in the `-I' flags before trying the standard directories.

The GPL and libraries

A very common point of contention is whether or not using a software library in your program, makes your program derived work from that library. For example, suppose that your program uses the readline () function which is defined in the library `libreadline.a'. To do this, your program needs to link with this library. Whether or not this makes the program derived work makes a big difference. The readline library is free software published under the GNU General Public License, which requires that any derived work must also be free software and published under the same terms. So, if your program is derived work, you have to free it; if not, then you are not required to by the law.

When you link the library with your object files to create an executable, you are copying code from the library and combining it with code from your object files to create a new work. As a result, the executable is derived work. It doesn't matter if you create the executable by hand by running an assembler and putting it together manually, or if you automate the process by letting the compiler do it for you. Legally, you are doing the same thing.

Some people feel that linking to the library dynamically avoids making the executable derived work of the library. A dynamically linked executable does not embed a copy of the library. Instead, it contains code for loading the library from the disk during run-time. However, the executable is still derived work. The law makes no distinction between static linking and dynamic linking. So, when you compile an executable and you link it dynamically to a GPLed library, the executable must be distributed as free software with the library. This also means that you can not link dynamically both to a GPLed library and a proprietary library because the licenses of the two libraries conflict. The best way to resolve such conflicts is by replacing the proprietary library with a free one, or by convincing the owners of the proprietary library to license it as free software.

The law is actually pretty slimy about what is derived work. In the entertainment industry, if you write an original story that takes place in the established universe of a Hollywood serial, like Star Trek, in which you use characters from that serial, like Captain Kirk, your story is actually derived work, according to the law, and Paramount can claim rights to it. Similarly, a dynamically linked executable does not contain a copy of the library itself, but it does contain code that refers to the library, and it is not self-contained without the library.

Note that there is no conflict when a GPLed utility is invoked by a proprietary program or vice versa via a system () call. There is a very specific reason why this is allowed: When you were given a copy of the invoked program, you were given permission to run it. As a technical matter, on Unix systems and the GNU system, using a program means forking some process that is already running to create a new process and loading up the program to take over the new process, until it exits. This is exactly what the system () call does, so permission to use a program implies that you have permission to call it from any other program via system (). This way, you can run GNU programs under a proprietary sh shell on Unix, and you can invoke proprietary programs from a GNU program. However, a free program that depends on a proprietary program for its operation can not be included in a free operating system, because the proprietary program would also have to be distributed with the system.
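The mechanism described above, forking a running process and loading another program into the new process until it exits, can be sketched with the POSIX calls fork, execl and waitpid. This is only an approximation of what system () does: the function name run_command is hypothetical, the handling of signals that a real system () performs is left out, and the return value is the exit status of the command rather than the raw wait status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run COMMAND through the shell in a new process and wait for it.
   Return its exit status, or -1 if it could not be run or did not
   exit normally.  */
int
run_command (const char *command)
{
  int status;
  pid_t pid = fork ();          /* duplicate the running process */

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* In the child: let the shell take over this process.  */
      execl ("/bin/sh", "sh", "-c", command, (char *) 0);
      _exit (127);              /* reached only if execl failed */
    }

  /* In the parent: wait until the invoked program exits.  */
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The invoked program runs in its own process and shares no code with the caller, which is the technical fact behind the argument made above.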

Because any program that uses a library becomes derived work of that library, the GNU project occasionally uses another license, the Lesser GPL, (often called LGPL) to copyleft libraries. The LGPL protects the freedom of the library, just like the GPL does, but allows proprietary executables to link and use LGPLed libraries. However, this permission should only be given when it benefits the free software community, and not to be nice to proprietary software developers. There's no moral reason why you should let them use your code if they don't let you use theirs. See section The LGPL vs the GPL, for a detailed discussion of this issue.

The language runtime libraries

When you compile ordinary programs, like the hello world program, the compiler will automatically link to your program a library called `libc.a'. So when you type

% gcc -c hello.c
% gcc -o hello hello.o

what is actually going on behind the scenes is:

% gcc -c hello.c
% gcc -o hello hello.o -lc

To see why this is necessary, try `nm' on `hello.o':

% nm hello.o
00000000 t gcc2_compiled.
00000000 T main
         U printf

The file `hello.o' defines the symbol `main', but it marks the symbol `printf' as undefined. The reason for this is that `printf' is not a built-in keyword of the C programming language, but a function call that is defined by the `libc.a' library. Most of the facilities of the C programming language are defined by this library. The include files `stdio.h', `stdlib.h', and so on are only header files that declare parts of the C library. You can read all about the C library in the Libc manual.
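One way to convince yourself that `stdio.h' only declares what the library defines is to write the declaration by hand. The following sketch does not include any header file at all, yet it compiles and links, because the definition of `printf' comes from the C library at link time. The function name greet is hypothetical. In real code you should of course include the header file rather than copy declarations out of it.

```c
/* The declaration that <stdio.h> would normally provide.  */
int printf (const char *format, ...);

/* Print a greeting and return the number of characters written,
   which is what printf itself returns.  */
int
greet (void)
{
  return printf ("Hello world\n");
}
```

Running `nm' on the object file compiled from this code should again show an undefined reference into the C library, although the compiler is free to substitute a closely related library function for the call.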

The catch is that there are many functions that you may consider standard features of C that are not included in the `libc.a' library itself. For example, all the math functions that are declared in `math.h' are defined in a library called `libm.a' which is not linked by default. So if your program is using math functions and including `math.h', then you need to explicitly link the math library by passing the `-lm' flag. The reason for this particular separation is that mathematicians are very picky about the way their math is being computed and they may want to use their own implementation of the math functions instead of the standard implementation. If the math functions were lumped into `libc.a' it wouldn't be possible to do that.

For example, consider the following program that prompts for a number and prints its square root:

`dude.c'
#include <stdio.h>
#include <math.h>

int
main ()
{
  double a;
  printf ("a = ");
  scanf ("%lf", &a);
  printf ("sqrt(a) = %f\n", sqrt (a));
  return 0;
}

To compile this program you will need to do:

% gcc -o dude dude.c -lm

otherwise you will get an error message from the linker about sqrt being an unresolved symbol.

On GNU, the `libc.a' library is very comprehensive. On many Unix systems however, when you use system-level features you may need to link additional system libraries such as `libbsd.a', `libsocket.a', `libnsl.a', etc. If you are linking C++ code, the C++ compiler will link both `libc.a' and the C++ standard library `libstdc++.a'. If you are also using GNU C++ features however, you will explicitly need to link `libg++.a' yourself. Also if you are linking Fortran and C code together you must also link the Fortran run-time libraries. These libraries have non-standard names and depend on the Fortran compiler that you use (see section Using Fortran effectively). Finally, a very common problem is encountered when you are writing X applications. The X libraries and header files like to be placed in non-standard locations so you must provide system-dependent -I and -L flags so that the compiler can find them. Also the most recent version of X requires you to link in some additional libraries on top of libX11.a and some rare systems require you to link some additional system libraries to access networking features (recall that X is built on top of the sockets interface and it is essentially a communications protocol between the computer running the program and the computer that controls the screen on which the X program is displayed). FIXME: Crossreferences, if we explain all this in more details.

Because it is necessary to link system libraries to form an executable, under copyright law, the executable is derived work from the system libraries. This means that you must pay attention to the license terms of these libraries. The GNU `libc' library is under the LGPL license which allows you to link and distribute both free and proprietary executables. The `stdc++' library is also under terms that permit the distribution of proprietary executables. The `libg++' library however only permits you to build free executables. If you are on a GNU system, including Linux-based GNU systems, the legalese is pretty straightforward. If you are on a proprietary Unix system, you need to be more careful. The GNU GPL does not allow GPLed code to be linked against a proprietary library. Because on Unix systems the system libraries are proprietary, their terms also may not allow you to distribute executables derived from them. In practice they do, however, since proprietary Unix systems do want to attract proprietary applications. In the same spirit, the GNU GPL also makes an exception and explicitly permits the linking of GPLed code with proprietary system libraries, provided that these libraries are a major component of the operating system (i.e. they are part of the compiler, or the kernel, and so on), unless the copy of the library itself accompanies the executable!

This includes proprietary `libc.a' libraries, the `libdxml.a' library in Digital Unix, proprietary Fortran system libraries like `libUfor.a', and the X11 libraries.

Basic Makefile concepts

To build a very large program, you need an extended set of invocations to the `gcc' compiler and utilities like `ar' and `ranlib'. As we explained (see section Programs with many source files), if you make changes only to a few files in your source code, it is not necessary to rebuild everything; you only need to rebuild the object files that change because of your modifications and link those together with all the other object files to form an updated executable. The `make' utility was written mainly to automate rebuilding software by determining the minimum set of commands that need to be called to do this, and invoking them for you in the right order. It can also handle many other tasks. For example, yo