Software Development in the Unix Environment

Prof. Brian L. Evans
Department of Electrical and Computer Engineering
The University of Texas at Austin
bevans@ece.utexas.edu

http://www.ece.utexas.edu/~bevans/talks/software_development/

Table of Contents

Books to Help Get Started - Setting Up Your Unix Account - Getting It Done - Making It Portable - More Information - Documentation Extraction Tools

This information serves as a starting point for people who want to develop software for the Unix environment. I present this material regularly in departmental digital signal processing seminars. The first four seminars took place on March 26, 1999; January 23, 1998; February 28, 1997; and October 18, 1996.

Books to Help Get Started

Developing software under Unix can be daunting at first. Unix shell commands are cryptic, as if Unix had been developed by professional programmers for professional programmers. In demystifying Unix, I have found the following textbooks and reference books useful.

High-level compiled languages

Unix operating system

Setting Up Your Unix Account

It takes some time to set up a Unix account for software development. When you log in to a Unix system, the system executes a personal startup file in your home directory; which file it reads depends on your login shell. To find out what shell you are running, type finger -l username. You can change your shell by typing chsh.

In the appropriate login file, you will need to define a path and other environment variables. For an example, see my login files, which are available on the machines in the Learning Resource Center cluster in my ~bevans account. Some of these settings are explained below. Note that many shells will also evaluate a second file, but this is not guaranteed from machine to machine. For example, this second login file is not evaluated in Common Desktop Environment terminal windows. Your setup will be more portable if you do not rely on the second login file.

Environment Variables

When you log in, the shell is only guaranteed to define the HOME environment variable, which is set to the full path name of your home directory. The use of ~ to represent your home directory is not supported by every shell; the traditional Bourne shell, for example, does not expand it. When you type commands in an interactive shell, the shell will generally expand the ~ character properly. When referring to your home directory in your login files or other shell scripts, use $HOME to be portable.

Another useful setting is the name of the user, which is generally maintained by the LOGNAME environment variable. An alternate way of retrieving this information is to run the whoami program. The whoami program, however, does not work properly in all instances. Some platforms define the USER environment variable to hold the user name.
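The fallbacks described above can be combined in one small shell function. This is a minimal sketch in Bourne-shell syntax. The function name current_user is hypothetical, and the use of id -un as the last resort (in place of whoami) is an assumption about what is installed on the machine.

```shell
# current_user: print the login name, trying each source in turn.
# The function name is illustrative and not part of any standard tool.
current_user() {
    if [ -n "${LOGNAME:-}" ]; then
        printf '%s\n' "$LOGNAME"
    elif [ -n "${USER:-}" ]; then
        printf '%s\n' "$USER"
    else
        # id -un is specified by POSIX; used here instead of whoami
        id -un
    fi
}
```

A login script could then write me=$(current_user) once and use $me afterwards.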

For software development, you will want the dynamic library path set properly. This path lists the directories to search for dynamic or shared libraries. The name of the environment variable varies by platform.

Shared libraries are used for X window routines and system calls. On the Solaris machines in the Learning Resource Center, for example, I set LD_LIBRARY_PATH to
/usr/openwin/lib:/usr/local/X11/lib:/usr/lib:/usr/ucblib:/usr/local/gnu/lib
When you distribute binary programs that use shared libraries, tell the other users how to set these environment variables properly.
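Instructions to other users are easier to follow if they add a directory to the existing setting rather than overwrite it. The following is a sketch in Bourne-shell syntax. The helper name prepend_dir is hypothetical, and /usr/local/gnu/lib is just one of the directories from the Solaris example above.

```shell
# prepend_dir LIST DIR: print DIR followed by LIST, colon-separated,
# unless DIR is already one of the entries in LIST.
# The helper name is illustrative only.
prepend_dir() {
    case ":$1:" in
        *":$2:"*)
            printf '%s\n' "$1" ;;
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$2:$1"
            else
                printf '%s\n' "$2"
            fi ;;
    esac
}

# Add one directory while keeping whatever the user already had.
LD_LIBRARY_PATH=$(prepend_dir "${LD_LIBRARY_PATH:-}" /usr/local/gnu/lib)
export LD_LIBRARY_PATH
```

Checking for an existing entry keeps the variable from growing each time the login file is read again.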

Path Setting

The path setting, which lists the directories to search for executable programs, is machine dependent. For software development, you will want the directories of the GNU tools (usually /usr/local/bin and /usr/local/gnu/bin) at the beginning of the path in your login file, as shown below:

# Part of the path setting independent of specific tools and operating systems
set basicpath = (/usr/ucb /usr/sbin /usr/bin /bin)
set iopath = (/usr/dt/bin)
set localpath = (/usr/local/bin /usr/local/gnu/bin /usr/local/packages/mh)
set path = ($localpath $basicpath $iopath)
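The lines above use C shell syntax. For users whose login shell is a Bourne-type shell (sh, ksh, bash), a rough equivalent is sketched below. It assumes the same directories and would go in the login file for that shell.

```shell
# Bourne-shell counterpart of the csh path lists above.
# Directories are joined with colons instead of being held in a list.
basicpath=/usr/ucb:/usr/sbin:/usr/bin:/bin
iopath=/usr/dt/bin
localpath=/usr/local/bin:/usr/local/gnu/bin:/usr/local/packages/mh

# Local (GNU) tools come first so they are found before vendor tools.
PATH=$localpath:$basicpath:$iopath
export PATH
```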

Getting It Done

Software development in the Unix environment is greatly simplified by a variety of freely distributable tools from the Free Software Foundation. Their tools are consistent across platforms. Complementing these tools are commercial toolsets. For the 1997-1998 academic year, we have a departmental license for the Parasoft toolset (Insure++, CodeWizard) installed on the Solaris machines on the Learning Resource Center cluster. Their tools run on Solaris, HP, Linux, and AIX operating systems, among others. During the 1997-1998 academic year, we had a departmental license for the Rational toolset (Purify, Quantify, Pure Coverage). Demonstration versions of these tools may be downloaded.

Compilation

A simple example using the GNU C compiler to compile x.c into x:

gcc -o x x.c
As the program calls more libraries, add them to the link line:
gcc -o x x.c -lm -lbsd
We can automate the commands by using makefiles.

Makefiles

Makefiles have two parts: definitions and commands.

Here is a simple makefile to build a C file x.c into an object module x.o and an executable program x.
# x.o must be rebuilt if x.c changes
# note that the white space in front of gcc is a TAB
x.o: x.c
	gcc -c x.c

# x must be rebuilt if x.o changes
x: x.o
	gcc -o x x.o
Note that the white space before the gcc is a TAB. Using spaces instead of a TAB will result in an error. To build (make) the executable x, type
make x
which will execute the following commands:
gcc -c x.c
gcc -o x x.o
If you make the executable again,
unix> make x
make: `x' is up to date.
No commands were executed because neither x.c nor x.o changed.

The above makefile is not very general. We have hard-coded the compiler, which varies from machine to machine. Here is a more flexible makefile.

# Definitions
CC = gcc
LINKER = gcc
CFLAGS = -c
OBJS = x.o

# Commands

# GNU make pattern rule that
# defines how to convert a C file into an object module
# (the white space in front of each command is a TAB)
%.o:	%.c
	$(CC) $(CFLAGS) $<

# Explicit make rule for x
x:	$(OBJS)
	$(LINKER) $(OBJS) -o x
Now, we can configure the values of CC and CFLAGS based on the machine and operating system we are using.
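One way to supply machine-specific values without editing the makefile is to pass them on the make command line, where they override the definitions inside the makefile. The sketch below is in Bourne-shell syntax. The function names pick_cc and build_x are hypothetical, and the rule "use gcc if present, otherwise cc" is only an illustrative policy.

```shell
# pick_cc: print gcc if it is on the PATH, otherwise the vendor's cc.
# The function name and the selection policy are illustrative.
pick_cc() {
    if command -v gcc >/dev/null 2>&1; then
        echo gcc
    else
        echo cc
    fi
}

# build_x: build the target x with the chosen compiler.
# Variables given on the make command line take precedence over the
# CC and LINKER definitions in the makefile.
build_x() {
    make CC="$(pick_cc)" LINKER="$(pick_cc)" x
}
```

Running build_x in the directory that holds the makefile then builds x with whichever compiler was found.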

Debuggers

The GNU debugger is useful for tracking down run-time errors. It can also be used in conjunction with Purify to investigate run-time Purify warnings and errors. The GNU debugger is perhaps most commonly used to track down the cause of segmentation faults and bus errors in programs. Follow this procedure:

The backtrace will list the call stack, which is the list of all of the functions (in order) that were called leading up to the core dump.

If the faulty program produces a segmentation fault or a bus error but does not produce a file called 'core', then your login scripts have told Unix not to produce a core file. Under Solaris, you can undo this setting by typing unlimit coredumpsize.
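For users of a Bourne-type shell, the sketch below shows a counterpart to unlimit coredumpsize and a way to print a backtrace from a core file without an interactive session. The function names are hypothetical, and the session in the comments assumes that the program is called x, that it was compiled with -g, and that the core file is named core; some systems name it differently.

```shell
# enable_core_dumps: allow core files in a Bourne-type shell.
# Raising the limit can fail if a hard limit has been set, so report
# the problem instead of aborting the script.
enable_core_dumps() {
    ulimit -c unlimited 2>/dev/null ||
        echo "could not raise the core file size limit" >&2
}

# backtrace_core PROGRAM COREFILE: print the call stack recorded in a
# core file; -batch makes gdb exit after running the bt command.
backtrace_core() {
    gdb -batch -ex bt "$1" "$2"
}

# A possible session (assumed names, see the paragraph above):
#   gcc -g -o x x.c
#   enable_core_dumps
#   ./x                    # crashes and writes the core file
#   backtrace_core x core
```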

Automated Debuggers: Purify

Purify is an automated run-time debugger in that it tracks and reports common programming errors. Specifically, Purify detects memory errors such as out-of-range array accesses and reads of uninitialized memory.

The GNU make rule to build Purify into your program follows. For an executable named program, it produces a duplicate copy called program.purify:
%.purify: %.o $(PT_DEPEND) $(VERSION)
	$(PURIFY) $(LINKER) $(LINKFLAGS_D) $< $(OBJFILES) $(LIBS) -o $(@F)
Here, you must define what linker you are using. To make sure that you are running GNU make, type which make; it should return a path in either /usr/local/bin or /usr/local/gnu/bin. Prof. Craig Chase has developed an alternate makefile for his EE380L course to handle Purify.
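Where GNU make is installed in some other directory, the location test is inconclusive. A more direct check is to ask make for its version, since GNU make identifies itself in the first line of its --version output. This is a sketch in Bourne-shell syntax, and the function name is_gnu_make is hypothetical.

```shell
# is_gnu_make: succeed if the make found on the PATH is GNU make.
# Other make programs either reject --version or print a different
# banner, so the grep fails for them.
is_gnu_make() {
    make --version 2>/dev/null | grep -q 'GNU Make'
}
```

A build script might call is_gnu_make before relying on pattern rules such as the one above.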

For more information about Purify, please see the

Managing Versions of Source Code

Keeping track of different versions of your code is critical for its long-term use and maintainability. Many source code management systems exist on Unix. These are useful for keeping track of changes made to any text file, e.g. C, LaTeX, HTML, and makefiles. The two most common systems are SCCS and RCS.

These two systems have similar functionality, but RCS is more flexible. Yes, this means that the freely distributable tool is better than the commercial tool. This happens often in the Unix world.

To get started, create a directory called SCCS to store the versions of the files (create an RCS directory for RCS). Next, create a text file such as the x.c C source file below. The source file contains tags that the source code management system will replace with the file name, version number, and date last modified.

/* SCCS Version: %W% %G% */
/* RCS Version: $Id$ */

#include <stdio.h>
#include <stdlib.h>
#define BLOCK_SIZE 64

int main(void) {
   char* mem;

   /* Allocate a block of memory BLOCK_SIZE bytes long */
   mem = malloc(BLOCK_SIZE);

   /* Index out of range (deliberate error) */
   printf("%c\n", mem[BLOCK_SIZE]);

   /* Reading uninitialized memory (deliberate error) */
   printf("%c\n", mem[BLOCK_SIZE - 1]);

   exit(0);
}
Source code control systems act like a library. Once you check out a file, no one else can write to it. This allows multiple people to develop the same source code without clobbering each other's changes. Here are several useful commands:
Function                                  SCCS                 RCS
Initialize an entry                       sccs create -fi x.c  ci -i x.c
Check out a file for editing              sccs edit x.c        co -l x.c
Check in a file                           sccs delget x.c      ci -u x.c
To see what changes have been made        sccs prs -e x.c      rlog x.c
List changes since the last version       sccs diffs x.c       rcsdiff x.c
List all files that are checked out       sccs info            see below
List the files that you have checked out  sccs tell -u         n/a
To list all of the files that are checked out, use sccs info for SCCS, and for RCS, use
rlog -L -R RCS/* | sed s/,v// | sed s+RCS/++
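In the RCS pipeline above, rlog -L -R prints only the names of the RCS files that have locks, and the two sed commands strip the ,v suffix and the RCS/ prefix. The second sed command uses + as the delimiter so that the slash does not need escaping. The sketch below isolates the sed part so it can be tried on sample input without RCS installed. The function name strip_rcs_names is hypothetical, and the patterns are anchored with $ and ^, which the original pipeline does not do.

```shell
# strip_rcs_names: turn RCS file names such as RCS/x.c,v into x.c.
# Reads names on standard input, one per line.
strip_rcs_names() {
    sed -e 's/,v$//' -e 's+^RCS/++'
}

# Sample input standing in for the output of rlog -L -R RCS/*
printf 'RCS/x.c,v\nRCS/makefile,v\n' | strip_rcs_names
```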
If you are using SCCS and you want to back out of the changes you've made to a file you have checked out, then use
sccs unedit x.c

One advantage to using GNU make in conjunction with source code control systems is that GNU make will automatically check out files under source code control if they are newer than the corresponding files in the current directory. This feature guarantees that your code is always up-to-date. In addition, Emacs works seamlessly with RCS, since both are developed by the Free Software Foundation. You can use a single Meta command in Emacs to check files in and out of RCS.

Making It Portable

Changes in processor architecture, operating system, X Window System version, and compiler are captured in three different ways. A makefile can include other makefiles. This allows the programmer to write general-purpose makefiles, while the machine-dependent make definitions are provided by make include files. In the Ptolemy Project, we define a series of make configuration files named config-$PTARCH.mk, where PTARCH is the environment variable set to the name of the computer architecture you are using (e.g. sol2 for Solaris 2.4 machines). A collection of these make include files is available. An alternative is to use the GNU autoconf utility.

It turns out that developing software using Microsoft tools can easily make the code unportable. As Prof. Michael Ogg points out:

Visual C++, while regarded as something of a standard for the NT world, is having more "vendor-specific" features put into it. e.g. with the evolution from DCOM to COM+ all the COM+ features will be automagically generated by the (Visual C++) compiler. You could argue that this is a "good thing" because it hides the mess from the programmer. I would argue this is a "bad thing" because it makes code less and less portable. For a vendor-neutral (and free) C++, there is always the Win32 port of g++ from Cygnus. Some of the dept's C++ courses have been using g++ on Unix platforms, so this would give a cross-platform commonality.
Prof. Ogg notes similar problems with Microsoft's Java language and development tools.

Management of object code

Put the source code and object code in separate parallel directory trees. There should be one object directory tree per platform on which you will compile. For example, if you have a src/common directory for your source code, then you might have an obj.sol2/common directory to contain the Solaris 2.4 object files. In the object directory, you should create a symbolic link to the makefile in the equivalent source directory:

cd obj.sol2/common
ln -s ../../src/common/makefile .
The idea is to separate source code from object code so that you can support multiple platforms. In order to determine the architecture, one can parse the string returned by uname. The uname program is portable among Unix operating systems. For example, the ptarch script uses uname and returns sol2 for Solaris 2.4, sol2.5 for Solaris 2.5, hppa for HP-UX 10, and so forth.
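The two steps above, naming the architecture and setting up its object directory, can be sketched in Bourne-shell syntax as follows. This is an illustration and not the actual ptarch script. The function names arch_name and setup_objdir are hypothetical, only the three platforms named in the text are recognized, and the relative link assumes a one-level subdirectory such as common.

```shell
# arch_name SYSNAME RELEASE: map the output of `uname -s` and
# `uname -r` to a short architecture tag. Solaris 2.x reports itself
# as SunOS 5.x; HP-UX 10 reports a release such as B.10.20.
arch_name() {
    case "$1:$2" in
        SunOS:5.4*)   echo sol2 ;;
        SunOS:5.5*)   echo sol2.5 ;;
        HP-UX:*.10.*) echo hppa ;;
        *)            echo unknown ;;
    esac
}

# setup_objdir TOP ARCH SUBDIR: create TOP/obj.ARCH/SUBDIR and link
# the makefile from the parallel source directory into it.
setup_objdir() {
    mkdir -p "$1/obj.$2/$3" &&
        ln -s "../../src/$3/makefile" "$1/obj.$2/$3/makefile"
}

# Possible use on the build machine:
#   PTARCH=$(arch_name "$(uname -s)" "$(uname -r)")
#   setup_objdir . "$PTARCH" common
```

Passing the uname strings as arguments, rather than calling uname inside the function, makes the mapping easy to check for platforms other than the one at hand.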

More Information

For more information, see the Guide for software developers at http://www.ece.utexas.edu/~bevans/talks/software_development/developer.html. Also, see Programming hints at http://www.ece.utexas.edu/~bevans/talks/software_development/Programming.html.

Documentation Extraction Tools

Documentation extraction tools extract comments, function prototypes, and class definitions from code to produce a programmer's reference manual. Extracting the documentation from the code helps keep the documentation up-to-date. Several documentation extraction tools are available. These tools have been installed on the Solaris and AIX machines on the Learning Resource Center cluster.


Last Updated 03/09/03.