Linux.com

Feature

What you need to know to write man pages

By Peter Seebach on February 10, 2004 (8:00:00 AM)

Despite the frequent attempts at competing formats, manual pages remain the core of standard Unix documentation. Users often prefer manual pages to other forms of documentation, and a well-written manual page is a valuable addition to any open-source project. This article discusses the issues faced in developing and writing manual pages, and a few quirks that you may encounter along the way.

Why man pages?

Man pages, some assert, are obsolete. They're old-fashioned, they don't have a lot of neat modern features, and they're written in nroff, an old and strange markup language.

All of this is true; nonetheless, man pages are the primary source of documentation for Unix-like systems, and have been for a long time. The man program is installed as standard on pretty much every Unix-like system ever shipped; competing formats may require users to install and learn new, unfamiliar tools. Man pages are well-suited to being printed without substantial effort; you don't have to format them, you can just feed them to a printer. Furthermore, everyone knows how to use more to read files.

A user having trouble with a command is very likely to start with man command when looking for help. A man page that directs the user toward the format-of-the-week is unhelpful. Even if there's a compelling reason to use another documentation format (such as better markup or support for hyperlinking), providing man pages is a good thing.

Nroff, troff, and macro packages

Man pages are written in a markup language generally referred to as nroff; there are other processors for it (such as troff or groff), but it's all the same language. It's a bit more primitive than HTML or SGML. On the other hand, it's a macro language, so arbitrarily complex things can be done in it. There are sets of macros designed to support certain document types. The flag given to the traditional nroff program to specify a macro set is "-mNAME", so many names are chosen to look natural with this; for instance, old-style man pages were written using the "tmac.an" set of macros, invoked with nroff -man.

There's a newer set of macros called doc (tmac.doc, invoked with -mdoc and so often called mdoc), which comes with groff, and a strange superset called mandoc (tmac.andoc), found on 4.4BSD systems, which magically guesses whether you really want the old "man" macros or the new "doc" macros. We'll cover the older macro set in passing, simply because it's arcane and poorly documented, and it is therefore the one you are most likely to be suddenly tasked with writing in. On systems using groff, there is usually a man page for these macros, installed under the name groff_man. The doc macros are probably a lot cleaner, but not everyone has them. Such is the price of progress.

In general, troff macros are introduced by starting a line with a period. For instance, an old man page might start out like this:

	.TH C 1 local
	.SH NAME
	c \- columnize input or files

This takes a bit of interpretation. The TH macro introduces a manual page. TH probably stands for "title header." It indicates the title of the man page ("C"), the section ("1"), and up to three extra flags; in this case, the "local" flag is used to indicate a local command, as opposed to a standard part of the system. (As a trivia point, on a system supporting the mandoc macro set, the TH macro may also load the "old" macros from the traditional man macro set.) The manual section is referenced elsewhere by referring to a command as name(section); for instance, ls(1), or printf(3).
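
As a sketch of a fuller TH line: in groff's version of the man macros, the three optional arguments after the section are conventionally a date, a source or version string, and a manual title, which end up in the page footer and header. The date, version string, and manual title below are made up for illustration, and other implementations may treat these arguments differently:

	.\" title, section, then date, source, and manual title (all three optional)
	.TH C 1 "February 10, 2004" "c 1.0" "Local Commands"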

In the doc macro set, you see introductions like this:

	.Dd September 22, 2003
	.Dt LS 1
	.Os
	.Sh NAME
	.Nm ls
	.Nd list directory contents

Macro names still start with a period at the beginning of a line. The Dd macro is the magic cookie the mandoc macros use to identify a file formatted using the doc macros. Note that the doc macros have specific sub-macros for the name and description of a command. In both macro sets, the name of the man page, and its section, are introduced very early on.

Pages within sections

The online manual is divided into sections. The exact scope of manual sections varies slightly from one system to another; a typical layout is this:

	1	commands
	2	system calls
	3	library functions
	4	device drivers
	5	file formats
	6	games
	7	miscellaneous
	8	system utilities

There are often additional subsections; for instance, programs installed in /usr/local might get man pages in section 1l. There may also be additional sections; NetBSD documents kernel internals in section 9, for instance.

Sections within a page

Confusingly, just as each manual page is in a "section" of the manual (e.g., section 1 for command-line utilities, or section 3 for library functions), each manual page consists of several named sections.

The SH macro (Sh in the doc macro set) introduces a section header. The exact set of section headers varies from one system to another. A reasonably complete set is NAME, SYNOPSIS, DESCRIPTION, OPTIONS, RETURN VALUE, ERRORS, DIAGNOSTICS, EXAMPLES, ENVIRONMENT, FILES, CAVEATS, BUGS, RESTRICTIONS, NOTES, SEE ALSO, AUTHOR, and HISTORY.
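
To make the overall shape concrete, here is a bare skeleton in the old man macros, built around the "c" example used in this article. It is only a sketch: the choice of sections is an assumption, the DESCRIPTION text is paraphrased from the NAME line, and the empty sections are placeholders to fill in or delete.

	.\" Skeleton only; keep the sections your page needs.
	.TH C 1 local
	.SH NAME
	c \- columnize input or files
	.SH SYNOPSIS
	.B c
	[
	.I "name \&..."
	]
	.SH DESCRIPTION
	.B c
	arranges its input in columns.
	.SH OPTIONS
	.SH EXAMPLES
	.SH "SEE ALSO"
	.SH BUGS

A header of more than one word, such as SEE ALSO, is conventionally quoted so that it is passed as a single argument.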

The NAME section is the one used for the whatis database, and also for the man -k or apropos commands. It should have a one-line summary of what the thing described is -- concise, but informative. Made-up words like "columnize" are probably bad style.
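
For instance, the NAME line of the "c" page shown earlier could be reworded to avoid the made-up word. This is just one possible phrasing, not the wording of any shipped page:

	.SH NAME
	c \- arrange input lines or files in columns

The "\-" between the name and the summary is worth keeping exactly as it is, since the whatis tools typically look for that separator.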

The SYNOPSIS section describes the usage of the command. In general, it should look very much like a traditional usage message. Here's how it might look in source, with the old man macro set:

	.SH SYNOPSIS
	.B c
	.RB [ " \-hV123456789 " ]
	.RB [ \-w\ width ]
	.RB [ \-c\ columns ]
	.RB [ \-n\ spacing ]
	[
	.I "name \&..."
	]

By convention, optional arguments are surrounded by square brackets. Options which take arguments are traditionally separated out and given meaningful names. The RB macro alternates between "roman" (plain) and bold styling. For instance, this line:

	.RB [ \-n\ spacing ]

prints a square bracket in plain font, the text "-n spacing" in bold, and then the closing bracket plain again. The backslash before the space is used to make the whole "-n spacing" into a single argument; it would also work to write it this way:

	.RB [ "\-n spacing" ]

There is a related macro, BR, which also alternates between bold and roman, starting with bold.
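
A common use of BR, sketched here, is a reference to another manual page, where the name is bold and the section number is plain:

	.BR ls (1),
	.BR printf (3)

Because the arguments are joined without intervening space, the first line comes out as a bold "ls" immediately followed by a roman "(1),".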

The doc macro set improves substantially on this:

	.Sh SYNOPSIS
	.Nm
	.Op Fl FINZ
	.Op Fl a Ar maxcontig
	.Op Fl B Ar byte-order
	.Op Fl b Ar block-size

The Nm macro prints the name of the command; it doesn't need to be repeated here, because an earlier macro saved it. The Op macro surrounds its arguments in square brackets, the Fl macro indicates flags, and the Ar macro indicates a named argument. This is a lot easier to write. The essential content remains the same.
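
For comparison, here is roughly how the earlier "c" synopsis might be translated into the doc macros. This is a hand translation for illustration, not taken from any real page, and it leaves out the digit flags for brevity. The NAME section is included because that is where Nm records the command name that the bare Nm in the SYNOPSIS reuses:

	.Sh NAME
	.Nm c
	.Nd columnize input or files
	.Sh SYNOPSIS
	.Nm
	.Op Fl hV
	.Op Fl w Ar width
	.Op Fl c Ar columns
	.Op Fl n Ar spacing
	.Op Ar name ...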

The DESCRIPTION section gives a brief summary of what the man page describes; for instance, for a command or library function, it would be a summary of functionality. For a data structure or file format, it would be a summary of the structure of data and kind of data stored. Some man pages include descriptions of options in this section; others put them in a separate section titled OPTIONS. Either way, the DESCRIPTION should start with a clear summary of what the manual page describes.

The formatting for options is itself a little weird. A typical OPTIONS section might look a bit like this:

	.TP
	.B \-h \-\-horizontal
	Items go left to right, then top to bottom, rather than
	top to bottom, then left to right.  For instance, the 2nd
	item will be at the beginning of the 2nd column, rather
	than being the 2nd item in the 1st column.
	.TP
	.B \-V \-\-version
	Print version info and exit.

The TP macro introduces an indented paragraph with a label, suitable for option lists. The B macro puts its arguments in bold print. With the doc macros, it's formatted like this:

	.Bl -tag -width indent
	.It Fl A
	List all entries except for
	.Ql \&.
	and
	.Ql \&.. .
	Always set for the super-user.
	.It Fl a
	Include directory entries whose names begin with a
	dot
	.Pq Sq \&. .
	.El

The list is introduced with a Bl macro, and ended with El. The It macro introduces an item, and the Fl macro introduces a flag, just as it did in the SYNOPSIS section.

Every option should be described. Avoid the mistake of not telling someone what the option really does; give some idea of when an option is useful and what its effects are. Merely saying that the "-f" option toggles the "foo" setting doesn't help the reader.
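
As a sketch of that advice in the old man macros, here is an entry for an option that takes an argument. The "-w width" option appears in the "c" synopsis above, but the description of what it does is an assumption made for the sake of the example:

	.TP
	.BI \-w " width"
	Assume the display is
	.I width
	characters wide, instead of trying to detect it.
	Useful when output is going to a file or a printer
	rather than the terminal.

The BI macro alternates bold and italic, so the flag is bold and the argument name is italic, matching the convention used in the SYNOPSIS.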

Functions and command-line utilities should generally describe their return values or exit statuses; this is what the RETURN VALUE section is used for. This section is often omitted for command-line utilities that return 0 on success. Some man pages merge this into the DIAGNOSTICS section.

ERRORS and DIAGNOSTICS should describe any possible error indications a program or function can yield. By convention, programs have diagnostics, but functions have errors. Error messages should be explained in reasonable detail. For library functions or system calls, return codes should be described, and so should any changes that may be made to errno.
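
A DIAGNOSTICS section for a command might be sketched like this. The exit status wording follows a phrasing common on BSD systems; the error message and its explanation are hypothetical, invented for the "c" example:

	.SH DIAGNOSTICS
	The
	.B c
	utility exits 0 on success, and >0 if an error occurs.
	.TP
	.B "c: invalid width"
	The argument to
	.B \-w
	was not a positive integer.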

The ENVIRONMENT section should describe any way in which environment variables affect the behavior of a program. For instance, does it care about PATH, or TMPDIR? Many GNU utilities, for instance, follow the POSIX specification completely only when the environment variable POSIXLY_CORRECT has been set. This is where such behaviors should be documented.
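
In the old man macros, such a section is usually a labeled list with one entry per variable. In this sketch, the idea that "c" consults COLUMNS is an assumption for illustration only:

	.SH ENVIRONMENT
	.TP
	.B COLUMNS
	If set to a positive integer, taken as the screen width
	when no
	.B \-w
	option is given.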

Similarly, the FILES section should describe any files a program or function interacts with, especially any that are likely to be modified.

The SEE ALSO section should cross-reference other man pages that may be relevant to a user reading this man page. For instance, on NetBSD, the man page for ls(1) has SEE ALSO references for:

chflags(1), chmod(1), stat(2), getbsize(3), dir(5), symlink(7), sticky(8)
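
In the doc macros, cross-references like these are written with the Xr macro, one per line, with trailing punctuation passed as a separate argument. A sketch of source that would produce the list above:

	.Sh SEE ALSO
	.Xr chflags 1 ,
	.Xr chmod 1 ,
	.Xr stat 2 ,
	.Xr getbsize 3 ,
	.Xr dir 5 ,
	.Xr symlink 7 ,
	.Xr sticky 8

The convention is to sort the entries by section number, and alphabetically within a section.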

BUGS should include not just crashing problems but general limitations. For instance, the "c" utility described assumes an 80-column screen; it tries to get the right value, but it may fail, and this is a limitation, so it's documented in BUGS. Related would be RESTRICTIONS, which are, to quote the pod2man man page, "bugs you don't plan to fix." Also related are CAVEATS, sometimes called WARNINGS: behaviors that are part of how the program is designed, but which may not be what a user wants.

If present, a HISTORY section should describe where a command comes from; for instance, what Unix-like system, and what version, it first appeared in. The AUTHOR section should be used to identify the author or authors of the command.

The STANDARDS section should describe what standards, if any, something complies with. For instance, the manual page for printf(3) should say which version of the C standard the printf routine is compliant with. This section should also indicate whether any functionality provided is an extension to such standards; users may wish to avoid extensions when writing portable programs.

Most man pages benefit a lot from an EXAMPLES section. Whatever you're documenting, show a couple of sample usages. For programs with many options, show how some common ones interact. [Editor's note: This is a huge pet peeve of mine. Too few man pages provide any examples at all. Man writers, take this advice to heart!]
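
Here is one way an EXAMPLES section might look in the old man macros. It assumes that the "-c columns" option from the "c" synopsis sets the number of columns, which the synopsis suggests but does not actually say. The nf and fi requests turn line filling off and back on, so the command line is printed exactly as typed, and RS and RE indent it:

	.SH EXAMPLES
	Arrange the output of
	.BR ls (1)
	in three columns:
	.PP
	.RS
	.nf
	ls | c \-c 3
	.fi
	.RE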

Finally, if you really have to say something but it doesn't fit anywhere else, you can make a section called NOTES.

Weird macros

Some of the macros used are a bit confusing, or may have unusual limitations. For instance, on some systems, the RB and BR macros may take a maximum of 6 arguments. The same may apply to the BI, IB, IR, and RI macros (which alternate bold or roman text with italicized text). This can require special considerations when writing descriptions of C functions which take a number of arguments. One convention used in a lot of man pages is to have the function name and argument types in bold, and argument names in italics. This can result in needing to split a line up. For instance:

	.BI "int szncmp(void *" "s1" ", void *" "s2"\c
	.BI ", size_t " "len" );

The "\c" at the end of the first line tells the macro processor that the newline should not be treated as introducing whitespace between the last argument on the first line, and the first argument on the second line.

There are additional macros for some systems. For instance, on old SunOS systems, there's an IX macro, used something like this:

	.IX "mem2sz()" "" "makes sz from mem"

This was used in section 3 man pages to help populate an index; it doesn't appear to have any effect in current man page systems. The pod2man utility generates these for section headings.

In addition to the font-selection macros, there are font escapes you can use within a line. For instance, these two lines produce the same output:

	.RB [ \-s\ step ]
	[ \fB\-s step\fR ]

The escapes you're most likely to use are \fB (bold), \fI (italic), and \fR (roman, or plain).
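
As a small sketch, a line of body text using these escapes might look like this (the sentence itself is invented for illustration):

	Print at most \fIcolumns\fR columns; compare \fBls\fR(1).

Inline escapes are handy in the middle of a sentence, where breaking the text onto separate macro lines would be clumsy. Remember to switch back to roman afterward, or the rest of the paragraph stays in the last font selected.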

Alternatives to nroff

It's pretty easy to imagine not wanting to write in nroff, especially not in the archaic man macro set. Some people swear off producing man pages at all. This is very annoying -- don't do it. Here are two alternatives to consider:

1. The doc macro set used with groff is widely available, and fairly friendly.
2. Perl's POD documentation format converts reasonably well to manual pages.

Even if you're stuck writing in the old man macro set, it's still quite possible to produce good, readable, solid documentation. Don't be afraid to copy some bits from existing manuals!

Many open source developers seem a little shy about documentation. Documentation can be a bit hard to write, and harder to write well. Putting out documentation sometimes seems like a way to ask people to waste your time with typo reports. It's still worth it. Remember that bug reports for well-documented code will avoid the things you put in the CAVEATS section; furthermore, users will understand what you thought this widget did in the first place.

Don't think of documentation as time taken away from developing a product; think of it as time spent figuring out what exactly you're developing. Documentation is as much part of the final product as anything else; without documentation, a product is inaccessible to users.

Comments

on What you need to know to write man pages

Note: Comments are owned by the poster. We are not responsible for their content.

Re:Getting from XML to man

Posted by: Anonymous Coward on April 17, 2007 09:38 AM
A good method to convert from docbook xml refentry items to a man page is to use docbook2X (specifically docbook2man). Search for it with google. Otherwise you can look at using stylesheets and xsltproc but the former is a bit easier to use i think.

#

Re:Political correctness?

Posted by: Anonymous Coward on May 19, 2007 10:07 PM
The man command has nothing to do with gender. It's short for manual.

#

examples are good

Posted by: Administrator on February 10, 2004 08:59 PM
Why are examples of usage so rare in man pages? Of the man pages I use, fewer than half of them have examples. On my system, of sed and gawk, two of the oldest utilities in Unix, only one (gawk) has examples in its man page. The sed examples are buried in its info page tree.

#

Re:examples are good

Posted by: Administrator on February 11, 2004 12:25 PM
Sounds like you are now armed with the knowledge to add examples to your man pages, and then share with the rest of us. :) I think I might sit down and write a couple now. Thanks for the great article linux.com!

#

Re:examples are good

Posted by: Administrator on February 15, 2004 02:57 AM
HEAR HEAR. I can use cdrecord and mkisofs because of the examples.. but a lot of man pages do not include examples..

Or EXHAUSTIVE examples of it.

Code man Pages need to include some more example code.

If we could incorporate Richard Stevens books into the man pages, coding would be much easier :)

#

Alternative

Posted by: Administrator on February 11, 2004 08:24 PM
I recommend ManEdit (http://wolfpack.twu.net/ManEdit/)

#

The ultramodern alternative...

Posted by: Administrator on February 13, 2004 07:16 PM
Another good way to generate man pages is to write in DocBook XML (or the older SGML variant), which is designed for technical documentation. There are de-facto-standard stylesheets to generate nroff man page source from DocBook documents, as well as a wide variety of other formats (so it becomes very easy to generate HTML versions, for instance).


(It's a vastly better solution than the abominable "info"!)

#

Re:The ultramodern alternative...

Posted by: Administrator on February 19, 2004 04:22 PM
I totally agree with this statement.

docbook is SGML/XML
so can be edited with any XML aware editor.
Some of them are even WYSIWYG (provided a CSS).

then, this format can generate from man pages to PDF books. really professional.

I used xmlmind editor which was quite good. and has a CSS for docbook.

On Solaris, all man pages are written in a derivative of docbook.

#

Political correctness?

Posted by: Administrator on February 14, 2004 03:38 AM
In view of the unfortunate attitude of a small number of people, it might be best to make a symlink from "man" to "woman", or re-name the command "person", or something entirely neutral. You can never be too careful.....

Seriously though, a good article, and man pages will be around for a long time yet. It was the correct thing to do 1970-ish, and is still useful. It may well be an area where non-programmers who want to contribute can do so, as a lot of the man pages could do with expansion or clarification. The software advances, but the documentation tends to lag behind.

#

Getting from XML to man

Posted by: Administrator on January 12, 2006 01:50 AM
I'm a tech writer who's been charged with creating man pages for a Linux product. I can create XML files. What I don't know is how to translate the XML into man pages. Can anyone tell me what the process is? Or maybe refer a good resource?

Thanks.

#
