Documentation Galore

John F. Moore

 
           
Mail to: John F. Moore from WPCUG

        

Last Revised: 2006-01-03 11:28

Revision History
Revision 1.1 Mon Jan 2 14:19:53 EST 2006
Added more copyright years and the CC copyright.
Revision 1.0 Tue Jun 8 20:45:21 EDT 2004

Abstract

This talk will focus on the different types of documentation available on and for the Linux system.


Table of Contents

1. Help, Info, Howto, and more
2. The Linux Documentation Project
3. Man Pages
4. Texinfo Pages
5. Program Documentation
6. Frequently Asked Questions (FAQ)
7. Network News Transfer Protocol (NNTP)
8. HOWTOs
9. Guides
10. On-line Tutorials
11. Paper Books
12. Conclusions

1.  Help, Info, Howto, and more

Before there was help for DOS, before there was help in Windows, there were man pages. The original documentation tool available on the Unix system was the Roff formatter. This tool allowed the creation of on-line manual pages for the commands. As we saw last week, manual pages are only one of the formatting options available to the document writer today.

But man pages are not the only type of documentation on the Linux system. One of the most common help systems, used by the GNU project and the FSF (Free Software Foundation), is known as Texinfo. The Texinfo system was designed to be a hyperlinked system before web pages became common. The output of this system can be viewed using the Info program or a help mode which is integrated into Emacs. Additionally, the Texinfo pages can be converted to print using the TeX formatter.

One of the information sources available for many programs is the documentation which comes with a particular application. This documentation is commonly stored in /usr/share/doc. It often takes the form of release notes, copyright information, and readme files. In the past the information was held in /usr/doc.

Since Linux was born on the Internet, one form of communication common to most everyone involved in Linux is email. Often when a program becomes used by more than a small group of people, someone, usually the author, creates a mailing list for the application. This is a good way for users to swap questions, tips and tricks on using the application in question. Generally after an email list has been in use for a while, some questions come up repeatedly. Once someone notices this repetition, there is an effort to create a FAQ (Frequently Asked Questions) list. This list tends to cut down on repeated requests for the same information. It also tends to limit the amount of searching of the email list archive.

Closely related to email is the Network News Transfer Protocol (NNTP). This has gone somewhat out of favor due to the amount of traffic involved. The system works like a news service: there is a server, and you connect to that server to read and post articles.

The next type of help which I know about is the HOWTOs. These are step-by-step explanations designed to provide a more complete understanding of how a particular topic works and how to configure the tools involved. In addition to the step-by-step information, they usually contain discussions about how a particular system works.

The Linux Documentation Project also creates longer, more general documents, which they conveniently call guides. Unlike the HOWTOs, these cover more than a single subject.

At a slightly higher level than the Guides, we come to the On-line Tutorials. These are often complete books available on the Internet. Many are aimed at the beginner, although there are more complex ones available on-line. A good example is Rute User's Tutorial and Exposition which is available in lessons.

And not to be forgotten is the extensive list of books available from places like Amazon or Borders.

2.  The Linux Documentation Project

Before we start discussing the specific types of documentation, I would like to take a minute to talk about The Linux Documentation Project.

 

The Linux Documentation Project is working on developing free, high quality documentation for the GNU/Linux operating system. The overall goal of the LDP is to collaborate in all of the issues of Linux documentation. This includes the creation of "HOWTOs" and "Guides". We hope to establish a system of documentation for Linux that will be easy to use and search. This includes the integration of the manual pages, info docs, HOWTOs, and other documents.

LDP's goal is to create the canonical set of free Linux documentation. While on-line (and downloadable) documentation can be frequently updated in order to stay on top of the many changes in the Linux world, we also like to see the same docs included on CDs and printed in books. If you are interested in publishing any of the LDP works, see the section "Publishing LDP Documents", below.

The LDP is essentially a loose team of volunteers with minimal central organization. Anyone who would like to help is welcome to join in this effort. We feel that working together informally and discussing projects on our mailing lists is the best way to go. When we disagree on things, we try to reason with each other until we reach an informed consensus.

 
  -- Linux Documentation Project Manifesto

I want to say a couple of words about the LDP since it represents one of the strengths of Linux. Linux came into being and continues today to be a community of volunteers who spend some of their free time working on software and documentation for the benefit of others. This was best discussed by Eric Raymond in his famous piece, The Cathedral and the Bazaar.

Here is a quote from the beginning of the book: "Linux is subversive. Who would have thought even five years ago (1991) that a world-class operating system could coalesce as if by magic out of part-time hacking by several thousand developers scattered all over the planet, connected only by the tenuous strands of the Internet?"

The significance of this, to my mind, is the concept that people might give away a valuable resource, skilled time. We are not discussing simple community service by kids. We are talking about highly skilled developers and computer professionals who contribute a portion of their time and skill to develop and maintain a tool for use by everyone.

I know I have spoken about this subject before, but I think it bears repeating. The community of people involved in Linux is truly awesome. Awesome enough that some companies are willing to spend a considerable amount of capital to stop or subvert this community. I hope I do not need to remind you of the DMCA, or the extended copyright laws, or the effort to create a Trusted Computing Environment. Many of these are aimed directly at crippling or eliminating people's ability to use the Linux system.

The problem is that without a great deal of support from people like you and me, the Linux system could become history. At a time when much of the rest of the world is trying to move into the information age, some companies are trying to shut down this threat to their profits. I hate to sound like a prophet of doom, but as Linux continues to take revenue or even perceived revenue from some companies, there is more and more pressure on legislators to pass laws to restrict our use of Linux on our own computers.

Let me close this gloomy prediction by urging you to get involved in the Electronic Frontier Foundation. If you would like to see some of the lies on the Internet, have a look at this commentary, The Cathedral and the Bizarre, or this commentary on Open Source and the Mac: The Cathedral and the Bizarre. I found these on Google when I misspelled Bazaar as Bizarre.

3.  Man Pages

Since the original formatting tool used on the Unix system was Roff, let's have a look at the parts of a man page to see how they are created. In these early documents, it was common for the author to place the formatting commands into the text stream as it was typed. Some of you might remember WordStar, which used a similar system.
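To make the idea of formatting commands placed in the text stream concrete, here is a minimal sketch in Python that assembles the Roff source for a tiny manual page. It assumes the conventional man macros (.TH for the title line, .SH for a section heading, .B for bold text). The command name frob and its description are made up for illustration.

```python
def build_man_page(name, section, summary, description):
    """Return roff source for a minimal man page using the man macros."""
    lines = [
        # .TH sets the page title and the manual section number.
        f".TH {name.upper()} {section}",
        # .SH starts a new section heading.
        ".SH NAME",
        # An escaped hyphen separates the command name from its summary.
        f"{name} \\- {summary}",
        ".SH SYNOPSIS",
        # .B prints its argument in bold.
        f".B {name}",
        ".SH DESCRIPTION",
        description,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "frob" is a hypothetical command used only for this example.
    print(build_man_page("frob", 1, "frobnicate a file",
                         "Frobnicates the named file."), end="")
```

Notice that every line starting with a period is a command to the formatter, and everything else is ordinary text. If you save the output as frob.1, many systems should be able to display it with man ./frob.1, although that depends on the man implementation installed.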

 

A man (Manual) page is a standard form of help that is available for many Linux applications and utilities. You can view man pages by using the man command. Many of the GNU utilities have a more detailed form of help, called info pages. You can view info pages by using the info command.

 
  -- TLDP FAQ
 

On any UNIX system the on-line documentation available from the man pages covers all the commands available, including the script languages. For Bourne it's usually between 10 and 15 pages whereas for C Shell or Korne it's about 30 pages or more. ...

The other thing you must do is cultivate an ability to read through these man pages and understand the meaning behind the words. This is a trick that takes time, don't think for a moment that what is written in a man page is actually correct. It is simply someone else's idea of how to explain what is supposed to happen under known circumstances. Don't forget that it's rare for the code writer to be the documentation writer too. This ability will put you in good stead for negotiating the most troublesome passages. If it doesn't make sense the first time, change your view and look at it from another angle. Develop this skill well, it will pay dividends later.

 
  -- Read That Manual!

This form of document is designed for a quick lookup of parameters and/or command line arguments. Although these documents are not always easy for the casual user to read, they are often the most accurate available, since they are generally written by the developer who wrote the program. I find it best to think of them as a quick reference. One useful command for man pages is whereis <program>. This command will show you where the man page and the command live.
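As a rough illustration of the kind of lookup whereis performs, the following Python sketch finds the executable on your PATH and then searches some man directories for matching pages. This is not how whereis itself is implemented. The default directories are assumptions about a typical Linux layout, and the function names are made up for this example.

```python
import shutil
from pathlib import Path

# Assumed locations; the man directories on your system may differ.
DEFAULT_MAN_ROOTS = ("/usr/share/man", "/usr/local/share/man")


def find_man_pages(program, man_roots=DEFAULT_MAN_ROOTS):
    """Return man page files for `program` under the given roots, sorted."""
    found = []
    for root in man_roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        # Pages live in section directories such as man1, man5, man8,
        # and are named like ls.1 or ls.1.gz.
        found.extend(root_path.glob(f"man*/{program}.*"))
    return sorted(found)


def locate(program, man_roots=DEFAULT_MAN_ROOTS):
    """Return (path of the executable or None, list of man page paths)."""
    return shutil.which(program), find_man_pages(program, man_roots)


if __name__ == "__main__":
    print(locate("ls"))
```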

4.  Texinfo Pages

The Texinfo pages were designed as a way of combining text for on-line viewing and printed documentation. Now I know that sounds like what you have today with web pages, but Texinfo predates web pages.

 

A bit of history: in the 1970's at CMU, Brian Reid developed a program and format named Scribe to mark up documents for printing. It used the `@' character to introduce commands, as Texinfo does. Much more consequentially, it strived to describe document contents rather than formatting, an idea wholeheartedly adopted by Texinfo.

Meanwhile, people at MIT developed another, not too dissimilar format called Bolio. This then was converted to using TeX as its typesetting language: BoTeX. The earliest BoTeX version seems to have been 0.02 on October 31, 1984.

BoTeX could only be used as a markup language for documents to be printed, not for on-line documents. Richard Stallman (RMS) worked on both Bolio and BoTeX. He also developed a nifty on-line help format called Info, and then combined BoTeX and Info to create Texinfo, a mark up language for text that is intended to be read both on-line and as printed hard copy.

 
  -- GNU Texinfo 4.7

The Texinfo format is interesting in that it contains a hypertext style of linking which predates the web. I know you may think, what's the big deal, since we deal with web pages so much today. But it is still impressive in its own way. Let's look at the beginning of a sample source file.

Example 1.  Sample Texinfo File Beginning

     \input texinfo   @c -*-texinfo-*-
     @c %**start of header
     @setfilename infoname.info
     @settitle name-of-manual version
     @c %**end of header
     
     @copying
     This manual is for program, version version.
     
     Copyright @copyright{} years copyright-owner.
     
     @quotation
     Permission is granted to ...
     @end quotation
     @end copying
     
     @titlepage
     @title name-of-manual-when-printed
     @subtitle subtitle-if-any
     @subtitle second-subtitle
     @author author
     
     @c  The following two commands
     @c  start the copyright page.
     @page
     @vskip 0pt plus 1filll
     @insertcopying
     
     Published by ...
     @end titlepage
     
     @c So the toc is printed at the start.
     @contents
     
     @ifnottex
     @node Top
     @top title
     
     @insertcopying
     @end ifnottex
     
     @menu
     * First Chapter::    Getting started ...
     * Second Chapter::          ...
      ...
     * Copying::          Your rights and freedoms.
     @end menu
     
     @node First Chapter
     @chapter First Chapter
     
     @cindex first chapter
     @cindex chapter, first
     ...
      

The strength of this format is that it can be transformed into a man page, an Info page, an HTML page, and/or a TeX format. We will see later another format which allows this type of translation, namely XML. But from the author's point of view Texinfo is a much simpler format to work with. To see what Info pages look like on the Web, have a look at Info (Dir), which is the Directory or top page in Info terms.
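The hypertext links in the sample file come from the @node lines, which name a destination, and the @menu entries, which point at those destinations. The Python sketch below pulls both out of a piece of Texinfo source so you can see the link structure. It only handles the simple forms shown in the sample and is in no way a full Texinfo parser. The function name is made up for this example.

```python
import re


def texinfo_links(source):
    """Return (nodes, menu_entries) found in Texinfo source text."""
    nodes = []
    menu_entries = []
    in_menu = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "@menu":
            in_menu = True
        elif stripped == "@end menu":
            in_menu = False
        elif stripped.startswith("@node "):
            # Only the first comma-separated field is the node's own name.
            nodes.append(stripped[len("@node "):].split(",")[0].strip())
        elif in_menu:
            # Menu entries look like "* Node Name::  description".
            match = re.match(r"\*\s+(.+?)::", stripped)
            if match:
                menu_entries.append(match.group(1))
    return nodes, menu_entries
```

Each menu entry that matches a node name is, in effect, a link which the Info reader can follow.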

5.  Program Documentation

The documentation which comes with a program can vary from only release notes to full-blown documentation. As an example, let's look at two programs: apmd and aspell.

Example 2.  Directory list of apmd-3.0.2

        apmd-3.0.2
        |-- ANNOUNCE
        |-- ChangeLog
        |-- LSM
        |-- README
        `-- README.transfer

        0 directories, 5 files
    

Example 3.  Directory listing for aspell-0.33.7.1

        aspell-0.33.7.1
        |-- README
        |-- TODO
        |-- man-html
        |   |-- 1_Introduction.html
        |   |-- 2_Getting.html
        |   |-- 3_Basic.html
        |   |-- 4_Managing.html
        |   |-- 5_Customizing.html
        |   |-- 6_Writing.html
        |   |-- 7_Adding.html
        |   |-- 8_How.html
        |   |-- A_Changelog.html
        |   |-- About_this.html
        |   |-- B_Do.html
        |   |-- C_Support.html
        |   |-- Contents.html
        |   |-- D_Credits.html
        |   |-- E_Glossary.html
        |   |-- F_Copyright.html
        |   |-- contents.png
        |   |-- crossref.png
        |   |-- index.html
        |   |-- manual.css
        |   |-- manual.html
        |   |-- next.png
        |   |-- next_g.png
        |   |-- prev.png
        |   |-- prev_g.png
        |   |-- up.png
        |   `-- up_g.png
        |-- man-text
        |   |-- 1_Introduction.txt
        |   |-- 2_Getting.txt
        |   |-- 3_Basic.txt
        |   |-- 4_Managing.txt
        |   |-- 5_Customizing.txt
        |   |-- 6_Writing.txt
        |   |-- 7_Adding.txt
        |   |-- 8_How.txt
        |   |-- A_Changelog.txt
        |   |-- About_this.txt
        |   |-- B_Do.txt
        |   |-- C_Support.txt
        |   |-- Contents.txt
        |   |-- D_Credits.txt
        |   |-- E_Glossary.txt
        |   |-- F_Copyright.txt
        |   |-- index.txt
        |   `-- manual.txt
        |-- manual.dvi
        |-- manual.tex
        `-- manual2.lyx

        2 directories, 50 files
      

You will notice that for apmd there are only a few files. This is considered a minimal list of files, with the exception of README.transfer. LSM is basically the information you would get from doing a package info query on an RPM package.

Aspell, on the other hand, contains a complete copy of a manual in TeX, LyX, and DVI as well as the same manual in both text and HTML.

I have noticed this documentation is often overlooked because people do not realize it even exists. As an example I went to the directory /usr/share/doc on this computer and tried to see how many HTML files exist. It came back with 4357 pages. On my server at home I tried the same command and came up with 9913 pages. So you see what I mean about it being overlooked. Even I did not realize how much good information was present in these directories.
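If you would like to try the same count on your own machine, here is one way to do it in Python. This is only a sketch of the idea, not the command I used for the numbers above, and the function name is made up for this example.

```python
from pathlib import Path


def count_files(root, suffix=".html"):
    """Count regular files under `root` whose names end with `suffix`."""
    return sum(
        1
        for path in Path(root).rglob(f"*{suffix}")
        if path.is_file()
    )


if __name__ == "__main__":
    # The result depends entirely on which packages are installed.
    print(count_files("/usr/share/doc"))
```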

6.  Frequently Asked Questions (FAQ)

Internet FAQ Archives

The FAQ is one of the most useful tools when you don't know where to look for answers to a particular question. The reason to use an FAQ instead of emailing a project group is that some groups can be openly hostile to people asking the same questions over and over. Often you will get the response RTFM (Read The F.... Manual), where the F is defined as fine or just f...in, depending on the author.

So the real question with an FAQ is where to find one. Some sources are Frequently Asked Questions (FAQs) and the Linux Administrators FAQ List, but probably the best is to go to the Internet FAQ Archives.

7.  Network News Transfer Protocol (NNTP)

The network news is an interesting application of the network using an email-like interface. The way network news works is that each server sets up an NNTP service. This service then communicates with all other sites it has on its list and exchanges posts with them.

 

One of the most astounding facts about Usenet is that it isn't part of any organization, nor does it have any sort of centralized network management authority. In fact, it's part of Usenet lore that except for a technical description, you cannot define what it is; at the risk of sounding stupid, one might define Usenet as a collaboration of separate sites that exchange Usenet news. To be a Usenet site, all you have to do is find another Usenet site and strike an agreement with its owners and maintainers to exchange news with you. Providing another site with news is called feeding it, whence another common axiom of Usenet philosophy originates: Get a feed, and you're on it.

The basic unit of Usenet news is the article. This is a message a user writes and posts to the net. In order to enable news systems to deal with it, it is prepended with administrative information, the so-called article header. It is very similar to the mail header format laid down in the Internet mail standard RFC-822, in that it consists of several lines of text, each beginning with a field name terminated by a colon, which is followed by the field's value.[1]

Articles are submitted to one or more newsgroup. One may consider a newsgroup a forum for articles relating to a common topic. All newsgroups are organized in a hierarchy, with each group's name indicating its place in the hierarchy. This often makes it easy to see what a group is all about. For example, anybody can see from the newsgroup name that comp.os.linux.announce is used for announcements concerning a computer operating system named Linux.

These articles are then exchanged between all Usenet sites that are willing to carry news from this group. When two sites agree to exchange news, they are free to exchange whatever newsgroups they like, and may even add their own local news hierarchies. For example, groucho.edu might have a news link to barnyard.edu, which is a major news feed, and several links to minor sites which it feeds news. Now Barnyard College might receive all Usenet groups, while GMU only wants to carry a few major hierarchies like sci, comp, or rec. Some of the downstream sites, say a UUCP site called brewhq, will want to carry even fewer groups, because they don't have the network or hardware resources. On the other hand, brewhq might want to receive newsgroups from the fj hierarchy, which GMU doesn't carry. It therefore maintains another link with gargleblaster.com, which carries all fj groups and feeds them to brewhq.

 
  --What Is Usenet, Anyway?
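The article format and the newsgroup naming described in the quote are simple enough to take apart with a few lines of Python. The sketch below splits an article into its header fields and body, and lists each level of a newsgroup name. It ignores details such as header lines continued over several lines, and the function names are made up for this example.

```python
def parse_article(text):
    """Split a news article into a header dict and a body string."""
    # A blank line separates the header from the body.
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        # Each header line is a field name, a colon, then the value.
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers, body


def hierarchy(newsgroup):
    """Return each level of a newsgroup name, broadest first."""
    parts = newsgroup.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    # A made-up article used only for illustration.
    article = (
        "From: reader@example.org\n"
        "Newsgroups: comp.os.linux.announce\n"
        "Subject: Test post\n"
        "\n"
        "Hello, Usenet.\n"
    )
    headers, body = parse_article(article)
    print(headers["Newsgroups"])
    print(hierarchy(headers["Newsgroups"]))
```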

This service allows non-interactive exchanges of ideas or views using threads. If you are familiar with an email list where one question is followed by answers, which sometimes lead to entire discussions, you have a picture of what a news thread looks like.

In the past I used a program called slrn to read network news. But one of the best interfaces today is Google Groups.

8.  HOWTOs

 

A HOWTO is usually a step-by-step guide that describe, in detail, how to perform a specific task. For example, you can use the Linux Installation HOWTO to help you install Linux on a system, but it does not cover how to set up a Web server so that you can focus on a particular task.

 
  --Linux Documentation Project (LDP) FAQ

The HOWTOs are one of the most useful information sources, in my opinion. The reason is that they not only explain how to configure some software, they often explain how it works. For example, the HOWTO Network-Introduction includes explanations of the following protocols:

  • 3. Networking protocols

    • 3.1 TCP/IP

    • 3.2 TCP/IP version 6

    • 3.3 IPX/SPX

    • 3.4 AppleTalk Protocol Suite

    • 3.5 WAN Networking: X.25, Frame-relay, etc...

    • 3.6 ISDN

    • 3.7 PPP, SLIP, PLIP

    • 3.8 Amateur Radio

    • 3.9 ATM

These explanations are designed to lay the foundation for the application to be set up. Now these guides are good, but remember that the writers are not professional writers. Some are good and clear but others are not. Do not be discouraged if the explanation is not great, but contribute to the process by offering constructive criticism to the author. The contact information is usually provided at the beginning of the document.

9.  Guides

 

A guide is typically a longer book with broader coverage of a subject; for instance, the Network Administration or User Guide. The intent is to understand the whole subject, as opposed to performing only one task. If you want to have a broader look at some aspect of Linux, then the guides should be very handy.

 
  --Linux Documentation Project (LDP) FAQ

The guides more closely resemble a book than anything else. These documents cover a range of topics of interest to one class of users. For example, the quote above about Usenet came from the Network Administrator's Guide.

Currently the Linux Documentation Project lists 22 Guides, which can be obtained here.

10.  On-line Tutorials

I suppose one question would be: what is the difference between a Guide, such as the one mentioned above, and an on-line Tutorial? The only answer I can provide is that the on-line tutorials are often written for a specific group of users.

As an interesting sample, let's list four on-line tutorials. These came from the listings at Google Linux when searching for Beginners Tutorial:

  • The Linux Terminal - a Beginners' Bash

    This tutorial presents the Linux terminal and the "bash" shell to people who have never used a command line to give commands to an operating system before, or who have never done so in Linux/Unix. People who have already used a Unix shell before might find it a bit simple.

  • FreeBSD Handbook

    The FreeBSD newcomer will find that the first section of this book guides the user through the FreeBSD installation process and gently introduces the concepts and conventions that underpin UNIX®. Working through this section requires little more than the desire to explore, and the ability to take on board new concepts as they are introduced.

  • Getting Started with Linux - Introduction

    Welcome to Linux Online's Getting Started with Linux beginner level course. If you're new to Linux and want to find out how to use the fastest growing operating system today, all you have to do is follow these lessons and you'll be using Linux efficiently in no time.

    Getting Started with Linux is designed as a self-study course. We're afraid that due to the numbers of people who follow this course, we cannot answer any specific questions or clear up any doubts you may have about the material. In short, there is no extra help available. You are on your own.

  • Start Linux - Beginners

    A short introduction to what Linux is.

11.  Paper Books

So we have finally come to the printed word. Even though we have discussed a great many on-line sources of information, this should not prejudice you against printed books.

One of my favorite sources for books is O'Reilly Unix/Linux. O'Reilly has made a business of publishing some of the best technical reference books available anywhere. They have gained the trust of computer developers by a combination of good reference information, knowledgeable authors, and a willingness to work with their users. A good example is this document. It is written in DocBook XML, which O'Reilly helped develop and which was then made available to the world at large. I have found that many of the developers I know own several O'Reilly books.

The real point is that despite the quantity of on-line documentation available, there is still a market for the professional writers who produce books. The paper book is not less in demand; if anything, it may be in more demand. The paper book is still the standard since it is portable, allows you to make notes in the margins, and gives you something to place beside your computer when you work.

Now I know I have been talking up O'Reilly because I think they are the best, but there are good books from other publishers. For example, I went to the books section of Amazon and did a search for Linux books. The search returned 3907 hits. Now I know that some of them will be trivial, some aimed at specific user groups, and some worthless. But if even 10% are good, that is about 390 titles to choose from. I suppose the real test is which ones are right for you.

So how do you determine which books are best for you? Well, the first thing I often do is go to Borders and read parts of the books. Since they allow browsing, it gives me a better idea of how the book treats a topic. I find I pick some subject I am familiar with and read what they say about it. If you don't already have knowledge of one of the subjects, read the summary of a topic you are interested in and see if it is clear to you.

When it comes to books which are not available at my local Borders, I try to read the reviews, and if I am lucky, parts of the book, especially the table of contents. I can often get a feel for the contents of the book by seeing what topics the author has put in the table of contents.

There is no good yardstick for what is best for you. Unfortunately the only true way to find out how good some books are is to buy them and see.

12.  Conclusions

Well, I think this about summarizes the documentation for Linux that I know about. This does not mean that I have exhausted all the possible sources of information on Linux.

Next month we will look into how to find help for questions in Linux. Even though I have tried to point out all the sources of information, there will always be questions which are not covered. For those questions we will find out how to search out answers.

This write-up was done using DocBook XML instead of writing directly in HTML as in the past. It made it more difficult to write, but hopefully it will be more flexible when it comes to building a search engine or maintaining the web site. For those of you who are curious, you can see the original source here: Documentation Galore Source. And the shell script process.sh is here for you to examine.

One last thing: this document is now using XHTML instead of the traditional HTML. If your browser has trouble displaying it, please let me know. As with many new experiments, it might not work for everyone, and I can fix this in the future.