Desktop Publishing
Linux Style

Objective of this class

  1. Words to Writing to Printing An opening discussion of what we are about.
    1. What do we mean by Format? Discusses the different types of formatting.
    2. Format Tagging vs Usage Tagging What is the difference between usage tagging and format tagging?
      1. Formatting for Looks Here is some text which is formatted by how it looks.
      2. Formatting for Structure Here is the same text formatted by structure.
      3. What is the difference? How are these two sections different?
      4. Why should we care? Is this just an academic exercise or is there some reasonable use for this?
  2. Man Page Format What type of formatting is used for man pages?
    1. The role of Roff How does the troff program convert from the input file to the output?
  3. Formatter for Math and Books When a college math professor did not like how mathematics books looked, he decided to write his own formatter.
    1. TeX source files What does the input for the TeX system look like?
    2. What is LaTeX? A new face for an old dog.
  4. HTML, XML, SGML An international standard for markup languages, and its siblings.
    1. DocBook An organized way to use SGML for publishing books from O’Reilly.
    2. Output formats from XML A new standard for writing SGML type documents.
    3. So how do you write the XML instance? What tools are available for writing XML?
  5. Word Processing Tools OK, so we have text formatting tools available on Linux. What about the person who only wants to write a simple letter to print and mail?
    1. Can I get the advantages of XML and the Ease of WYSIWYG I want to do structured writing, but I need to see what I am doing.
    2. Which type of Preparation is Right for Me OK, I understand how the different systems work, and how to use them. But how can I choose one over the other?
  6. Writing Tools of the Trade What is available to help me write better documents?
  7. OK, it’s a Wrap Parting thoughts and ideas on where to go.

Words to Writing to Printing

When we talk about words, we are discussing the most fundamental need of man to communicate. The written word is considered one of the major achievements of man. Through the written word we communicate ideas from those instructions we all throw away that are included in almost everything we purchase these days, to the Bible and the great books of literature.

Books as we know them were copied by hand during the Middle Ages. Since the scribes spent all their time copying books, the books were considered precious and were often decorative.

Before Gutenberg, printed pages were carved whole into blocks of wood. This allowed almost infinite variety in layout and typefaces, but every page had to be cut by hand. You can find more depth in the web page “Manuscripts, Books, and Maps: The Printing Press and a Changing World”.

Gutenberg's printing press, introduced around 1450, used movable type. This sped up the process of arranging the letters on the page to prepare for printing. Unfortunately, movable type also limited the originality of typefaces and layouts. Yes, it was easier to arrange, but it was also more boring, since the typeface was whatever was available in the box of type. For more interesting reading on printing, let me recommend Resources for the History of Books and Printing.

With the invention of photography and the development of a chemical etching process, printing was able to recover some of the design freedom of the hand-carved page. A page could now be created by whatever means the author wanted, photographed, and transferred to a plate for a printing press. This is what the printing term “Camera Ready” refers to.

Into this history we now come to the computer and Unix. One of the first uses of the Unix system was the production of manuals. You can imagine how the writers at AT&T must have salivated at seeing programmers typing words into a computer using an editor, however crude, when compared to using a typewriter. Of course, once the words were in a computer file, they needed to get them into print. Thus the runoff, or roff, program was created. Information taken from Unix History.

What do we mean by Format?

Let's start by looking at a group of words used to express an idea. We can start by saying that they are organized into sentences, then paragraphs, then pages, then chapters, then books, etc.

But the idea of formatting includes more than the words, sentences, etc. It involves how the page looks. Here I am discussing the arrangement of the sentences into a layout that is pleasing to the eye. Let's look at this web page as an example.

This web page has a fairly simple organization. It consists of headings, paragraphs, lists, and links. Now, I don’t expect to win any awards for this organization; it is mostly for the utilitarian purpose of presenting ideas. There are certainly more interesting web pages around, but this one was designed simply to communicate. I use few pictures, not because I don’t like them, but because I find I mostly communicate with words. I also tend to use font effects, like bold face, italics, etc., sparingly.

The reason I use this structure is that it is easy for me to write in and requires little thought about the formatting. This simple layout allows me to concentrate on the verbiage as opposed to the format.

Format Tagging vs Usage Tagging

Let me start by saying that this is one of my pet peeves when it comes to writing. I find that people put too much emphasis on how something looks instead of how it is structured. I tend to view this as the difference between writing in Word as opposed to HTML. Let me explain.

When I refer to formatting for how something looks, I am referring to marking up the text in a document by font, color, and organization, as opposed to by how the words are used. Clear? Probably not. Let's see if I can give you an example.

Formatting for Looks


Where to Start


At the beginning


This is the best place to have a conversation.


Formatting for Structure


Where to Start

At the beginning

This is the best place to have a conversation.


What is the difference?

In the section which was formatted for looks, the text is marked by font size and bold facing. In the section formatted for structure, the text is marked up using tags indicating what it does.

For example, in the first example the heading uses an explicit font size and bold face type. This looks OK to the reader, but the computer has no idea what that piece of text means. In the second example the heading is sandwiched between a set of <h1> and </h1> tags. These tags indicate that it is a level 1 section title. In the first example a program would not know what the purpose of the text was.
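To make the difference concrete, here is a minimal Python sketch of a program trying to find the headings in both versions. The two markup strings are my own guesses at what such pages might contain (the font sizes in particular are assumptions), and the names HeadingCollector and outline are made up for this illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h1>..<h6> element along with its level."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text) tuples
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is all we need to match.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def outline(html):
    """Return the (level, text) headings a program can find in the markup."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


# Assumed markup: the same three lines, tagged by looks and tagged by structure.
LOOKS = (
    '<p><font size="6"><b>Where to Start</b></font></p>'
    '<p><font size="4"><b>At the beginning</b></font></p>'
    "<p>This is the best place to have a conversation.</p>"
)
STRUCTURE = (
    "<h1>Where to Start</h1>"
    "<h2>At the beginning</h2>"
    "<p>This is the best place to have a conversation.</p>"
)

print(outline(LOOKS))       # no headings found: big bold text carries no meaning
print(outline(STRUCTURE))   # both headings found, with their levels
```

Both versions can look identical in a browser, but only the structural one gives a program something to build an outline from.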

Why should we care?

If all your writing were for human readers and not to be saved, it might not matter. But if the documents are something you are writing for a business, or something worth keeping, structural tagging makes it possible to index and retrieve information mechanically with a computer program, instead of relying on a human’s memory.

For example, notice how at the top of the page I have an outline of what we will cover in the presentation. The link portion of that can be automatically generated with a script that captures the section headings from the web page itself. In fact, we will see later how a DocBook article or a LaTeX article can do just that.

If you think this is a lot of trouble for nothing, consider that most businesses have about 85% of their knowledge in computer documents. Can you picture how much easier it would be to find the correct documents if a computer program could index the documents based on subject, author, date, and overview? If each document contained a set of XML-like tags identifying these sections, it would be relatively easy to create a Google-like index of the documentation. Also, realize that Microsoft Word, OpenOffice, or even WordPerfect could do this today if the users made use of the existing tags.
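As a rough sketch of that indexing idea, here is a small Python example. The tag names (document, subject, author, date, overview), the file names, and the sample contents are all made up for illustration; a real collection would use whatever tags its document format defines.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents, each carrying its own descriptive tags.
DOCS = {
    "memo-printing.xml": (
        "<document><subject>Printing</subject><author>A. Writer</author>"
        "<date>2004-05-12</date><overview>Printer setup notes.</overview></document>"
    ),
    "memo-fonts.xml": (
        "<document><subject>Fonts</subject><author>A. Writer</author>"
        "<date>2004-06-01</date><overview>Installing fonts.</overview></document>"
    ),
    "guide-xml.xml": (
        "<document><subject>XML</subject><author>B. Editor</author>"
        "<date>2004-07-20</date><overview>Tagging guide.</overview></document>"
    ),
}


def build_index(docs, field):
    """Map each value of one tagged field to the documents that contain it."""
    index = defaultdict(list)
    for name, text in docs.items():
        value = ET.fromstring(text).findtext(field)
        if value is not None:
            index[value.strip().lower()].append(name)
    return {key: sorted(names) for key, names in index.items()}


by_author = build_index(DOCS, "author")
print(by_author["a. writer"])   # the two memos
print(build_index(DOCS, "subject")["xml"])
```

The program never has to guess which words are the author or the subject, because the tags say so.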

Man Page Format

Since the original formatting used on the Unix system was roff, let's have a look at the parts of a man page to see how they are created. In these early documents, it was common for the author to place the formatting commands into the text stream as it was typed. Some of you might remember WordStar, which used a similar system. Much of this discussion about man pages will be taken from the HOWTO named Man-Page, which normally lives in /usr/share/doc/HOWTO/Man-Page if you installed the HOWTO documents on your Linux system.

Now, instead of copying portions of the HOWTO into this document, I have included the HTML version on this website. So without further ado, let's look at the Linux Man Page Howto.

OK, what did we observe about the man page format that we just reviewed?

The man page does not have a WYSIWYG interface. This is not a big issue for a man page. Besides, realize that viewing the man page is similar to what a programmer does with code. In code they refer to the process as edit / compile / debug. This means edit the source code, compile it into a program, then test and debug the resulting program. Once you debug, you start the loop again for the next version or iteration. With man pages, you can write in one window and view the results in another, so we could call it an edit / view cycle.

Let's consider the advantages and disadvantages of this method of preparing documentation.

Advantages:

  1. The format of the document is defined in advance. Since there is a standard way of formatting man pages, you know in advance which sections are required and which are optional.
  2. You can use any editor to do your editing, since the file is only ASCII text.
  3. The source files are portable across most operating systems.
  4. You can always read the source on any computer which supports ASCII characters, so there is no lock-in.
  5. Since the format is fairly straightforward, it is possible to transform the source file to formatted ASCII, HTML, PS, or PDF for viewing.

Disadvantages:

  1. You do not see changes immediately. Each time you want to see what you typed, you need to invoke man to format and display the results.
  2. Man pages themselves have a fairly rigid format. This is not the fault of the roff formatter, only of the conventional layout of man pages.
  3. This method requires the author to think about the formatting while entering the text. For a man page this is not too difficult, but for a larger work, say a book, it might be more difficult.

By the way, so you don’t think it is too difficult, I might point out that there are editors which make the formatting of a man page easier. One suggestion would be to have a look at ManEdit if you are planning on writing lots of man pages. We will discuss editors later. My earlier point about using any editor still stands: you can use a specialized editor or editor mode to create man pages, but you are not required to.

Lastly, the source file for a man page has its contents tagged according to both function and format. What I mean is that some of the words are marked as section headings, meaning that they are tagged according to function, while other words are tagged according to their look, such as bold face.
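To show that mix of function tags and format tags, here is a minimal Python sketch that writes out the source for a tiny man page. The helper man_source and the hello command are made up for this illustration; the .TH (title), .SH (section heading), and .B (bold) requests are the conventional man macros, but this is only a sketch of the layout and not a complete page.

```python
def man_source(name, section, summary, synopsis, description):
    """Build a minimal man page source using the conventional man macros."""
    lines = [
        f".TH {name.upper()} {section}",   # title line: page name and manual section
        ".SH NAME",                        # tagged by function: a section heading
        f"{name} \\- {summary}",
        ".SH SYNOPSIS",
        f".B {name}",                      # tagged by look: set the name in bold
        synopsis,
        ".SH DESCRIPTION",
        description,
    ]
    return "\n".join(lines) + "\n"


source = man_source(
    name="hello",                          # hypothetical command
    section=1,
    summary="print a greeting",
    synopsis="[name]",
    description="Prints a friendly greeting, optionally addressed to name.",
)
print(source)
```

If you save the result as hello.1, a command along the lines of groff -man -Tascii hello.1 should format it for viewing, which is the edit / view cycle described above.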

The role of Roff

Even though the man page is written in a text editor, it is the post-processing by the roff program that converts the commands and the text into the final presentation. So we should talk about what roff does and where it came from. Again, instead of copying over text, I am going to simply refer to its man page.

Roff Man Page

One thing to remember about roff is that it uses macro files to help define how its output looks. It is also designed to use a number of other programs to do portions of the formatting, such as eqn, tbl, and others.

Formatter for Math and Books

One of the next tools which came along for formatted output was a program known as TeX. This program was created by a mathematician at Stanford University by the name of Donald Knuth. He was disappointed in how math books looked after a publisher typeset them, so instead of whining he did something about it and wrote his own formatting program. It did not use structured tags as we have discussed before, but it had some interesting features.

Like roff, TeX is a post-processing system. That means that you write your source in a text editor, then call TeX to process the file. But TeX only converts the source file to a Device Independent file, also known as a *.dvi file. You then use another program, such as dvips, to convert the Device Independent file into a printer file. In the case of dvips, the result is PostScript output.
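The two-step process can be sketched as a pair of commands. This small Python example only builds the command lines; it does not run them. The function name is made up, and I am assuming the source file sits in the current directory so that the .dvi file appears next to it.

```python
from pathlib import Path


def tex_to_postscript_commands(source):
    """Return the two commands that take a TeX source file to PostScript."""
    tex = Path(source)
    dvi = tex.with_suffix(".dvi")   # written by the first step
    ps = tex.with_suffix(".ps")     # written by the second step
    return [
        ["tex", str(tex)],                    # step 1: source -> device independent file
        ["dvips", str(dvi), "-o", str(ps)],   # step 2: .dvi -> PostScript
    ]


commands = tex_to_postscript_commands("report.tex")
for command in commands:
    print(" ".join(command))
```

On a machine with TeX installed, each list could be handed to subprocess.run(command, check=True) to actually do the work. For a LaTeX document the first command would be latex instead of tex.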

Even though the idea is similar to roff, in that you embed the formatting commands into the source file, TeX has a MUCH richer set of formatting commands available. The TeX system has been distributed in source code form from the very beginning, so it has been ported to a number of different operating systems.

One of the interesting things about TeX is that it includes a sister program called Metafont. Metafont is a font creation program. Since you can create the fonts used by TeX, it has an almost unlimited set of fonts. Of course, since Knuth was a mathematician, the fonts are described in mathematical terms. This ability to define fonts has interesting consequences. For one, TeX was among the first document preparation programs which could publish Hebrew. Since you can create your own fonts, it was straightforward to create a set of Hebrew characters. The only other task was to make the output read from right to left instead of left to right. Because TeX is a post-processing program, that was possible.

TeX source files

Here is a sample TeX document from the TeX tutorial essential.tex.

      % This is a small sample LaTeX input file (Version of 10 April 1994)
      %
      % Use this file as a model for making your own LaTeX input file.
      % Everything to the right of a  %  is a remark to you and is ignored by LaTeX.

      % The Local Guide tells how to run LaTeX.

      % WARNING!  Do not type any of the following 10 characters except as directed:
      %                &   $   #   %   _   {   }   ^   ~   \   

      \documentclass{article}        % Your input file must contain these two lines 
      \begin{document}               % plus the \end{document} command at the end.


      \section{Simple Text}          % This command makes a section title.

      Words are separated by one or more spaces.  Paragraphs are separated by
      one or more blank lines.  The output is not affected by adding extra
      spaces or extra blank lines to the input file.

      Double quotes are typed like this: ``quoted text''.
      Single quotes are typed like this: `single-quoted text'.

      Long dashes are typed as three dash characters---like this.

      Emphasized text is typed like this: \emph{this is emphasized}.
      Bold       text is typed like this: \textbf{this is bold}.

      \subsection{A Warning or Two}  % This command makes a subsection title.

      If you get too much space after a mid-sentence period---abbreviations
      like etc.\ are the common culprits)---then type a backslash followed by
      a space after the period, as in this sentence.

      Remember, don't type the 10 special characters (such as dollar sign and
      backslash) except as directed!  The following seven are printed by
      typing a backslash in front of them:  \$  \&  \#  \%  \_  \{  and  \}.  
      The manual tells how to make other symbols.

      \end{document}                 % The input file ends with this command.

OK, now that we have seen what the source looks like, let's have a look at the output from this sample.

For anyone who is interested in the whole tutorial, here is Essential TeX.

Now let's include a few more samples of what TeX documents can look like.

LaTeX Source File    PDF Output File
Artex.tex            Artex.pdf
Bookex.tex           Bookex.pdf
Slidex.tex           Slidex.pdf
Examex.tex           Examex.pdf
Faxex.tex            Faxex.pdf
Lettex.tex           Lettex.pdf
Testex.tex           Testex.pdf

These examples came from the web page: PDF LaTeX by example

Why use TeX over troff? The first and most obvious reason is its support of mathematics. The second reason is its flexibility. This is a system with much finer control over layout elements, graphics, and structure than almost any other publishing package I know of. It has a group known as TUG (TeX Users Group), which sponsors a web site, as well as an archive at CTAN (Comprehensive TeX Archive Network), where you can find macros, tutorials, and software for a wide range of operating systems and assignments. One aspect that we have not mentioned is LaTeX. LaTeX is really TeX with a set of macros included to make it more user friendly.


Begin Quote from FAQ at Tug

What is LaTeX?

LaTeX is a TeX macro package, originally written by Leslie Lamport, that provides a document processing system. LaTeX allows markup to describe the structure of a document, so that the user need not think about presentation. By using document classes and add-on packages, the same document can be produced in a variety of different layouts.

Lamport says that LaTeX “represents a balance between functionality and ease of use”. This shows itself as a continual conflict that leads to the need for such things as FAQs: LaTeX can meet most user requirements, but finding out how is often tricky.

End Quote


Another aspect of TeX is that it encourages the use of templates. It is much easier to work with TeX if you start from an existing template. So the combination of a web archive at CTAN and the use of existing templates makes TeX a lot more friendly than you might first perceive. For a better view of LaTeX and its syntax, have a look at Intro to LaTeX or Hypertext Help with LaTeX.

Although it is not as obvious to those of us who use word processing programs like OpenOffice and Word, the ability to have a large selection of available fonts can immensely improve the look of a document layout. Not only does this allow other languages to be supported, but it allows character sets of chess pieces to be created to document a chess match or strategy.

Although by now you are probably afraid to try TeX because of its complexity, there are 2 editors which will make it easier. The first is LyX. It is a general purpose document preparation tool which uses TeX as the backend. The other editor is TeXmacs. This is a scientific text editor specifically designed to handle mathematics publishing.

Here are a couple of newsletters which were written and formatted with LaTeX to show you what can be done with this tool: PROSPER: Latex-based Computer Presentations and LaTeX news.

HTML, XML, SGML

Now, I know many of you are familiar with HTML. After all, it is the formatting which makes the web what it is today. Some of you have probably heard about XML. It is the new kid on the block as far as markup languages are concerned. But SGML is probably still Greek to most of you. So let's look at a definition of what SGML is:


Quote from http://xml.coverpages.org/sgmlfaq-199904.txt

What is SGML?

ANSWER: SGML stands for “Standard Generalized Markup Language” (or “Standard Goldfarb Mosher Lorie,” but that’s an inside joke). Essentially, SGML is a method for creating interchangeable, structured documents; with it, you can do the following:

  • assemble a single document from many sources (such as SGML fragments, word processor files, database queries, graphics, video clips, and real-time data from sensing instruments);
  • define a document structure using a special grammar called a Document Type Definition (DTD);
  • add markup to show the structural units in a document; and
  • validate that the document follows the structure that you defined in the DTD.

The official definition of SGML is in the international standard ISO 8879:1986. For a list of general information on SGML, including online tutorials, see the following link at Robin Cover’s SGML/XML Web Site (next question):


If you look at the history of markup from the perspective of SGML, you would see that SGML came first. It was used by organizations like IBM and the Department of Defense. Although it was complex and required a great deal of setup and structure, these groups felt its ability to make documentation more accessible was worth the cost.

While consulting for CERN from June to December of 1980, Tim Berners-Lee wrote a notebook program, “Enquire-Within-Upon-Everything”, which allowed links to be made between arbitrary nodes. Each node had a title, a type, and a list of bidirectional typed links. In 1989 he wrote a paper entitled Information Management: A Proposal, which laid out the basis for what became the World Wide Web. This document and later work created the basis of HTML for use on the web. You can find more of the information at A Little History of the World Wide Web.

Now we are ready to discuss what XML is. Here is a definition from The XML FAQ:


XML is the Extensible Markup Language. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.

It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is actually a `metalanguage’ -a language for describing other languages-which lets you design your own customized markup languages for limitless different types of documents. XML can do this because it’s written in SGML, the international standard metalanguage for text markup systems (ISO 8879).


So what does XML look like, and how can we compare it to HTML? Actually, that is shown in a very nice tutorial on XML called Extending Your Markup: An XML Tutorial.

    <UL>
    <LI>Aho, A. V., Sethi, R., Ullman, J. D.: <EM>Compilers: Principles,
    Techniques, and Tools </EM>, Addison-Wesley, 1985
    </UL>
    (a)

    <BOOK>
    <AUTHOR> Aho, A. V. </AUTHOR>
    <AUTHOR> Sethi, R. </AUTHOR>
    <AUTHOR> Ullman, J. D. </AUTHOR>
    <TITLE> Compilers: Principles, Techniques, and Tools </TITLE>
    <PUBLISHER> Addison-Wesley </PUBLISHER>
    <YEAR> 1985 </YEAR>
    </BOOK>
    (b)

    Figure 1. A bibliography entry (a) in HTML and (b) in XML. The
    HTML description is layout oriented, while the XML description
    is structure oriented.
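To see what the structure-oriented version buys you, here is a minimal Python sketch that reads entry (b) from the figure and pulls out the individual fields. The function name describe is made up for this illustration. Doing the same with entry (a) would mean guessing where the author list stops and the title starts.

```python
import xml.etree.ElementTree as ET

# Entry (b) from the figure above.
ENTRY = """<BOOK>
<AUTHOR> Aho, A. V. </AUTHOR>
<AUTHOR> Sethi, R. </AUTHOR>
<AUTHOR> Ullman, J. D. </AUTHOR>
<TITLE> Compilers: Principles, Techniques, and Tools </TITLE>
<PUBLISHER> Addison-Wesley </PUBLISHER>
<YEAR> 1985 </YEAR>
</BOOK>"""


def describe(entry_xml):
    """Turn one <BOOK> entry into a dictionary of its tagged fields."""
    book = ET.fromstring(entry_xml)
    return {
        "authors": [author.text.strip() for author in book.findall("AUTHOR")],
        "title": book.findtext("TITLE").strip(),
        "publisher": book.findtext("PUBLISHER").strip(),
        "year": int(book.findtext("YEAR")),   # int() ignores the surrounding spaces
    }


record = describe(ENTRY)
print(record["authors"])
print(record["year"])
```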

DocBook

One of the methods used to prepare and format XML documents is the DocBook system. DocBook was an attempt at putting together a complete SGML implementation so a user like you or me could set up and use SGML without the involved training needed by a full implementation. Since there are many parts to a full implementation of SGML, this was Norman Walsh’s attempt at making it easier to use.

The DocBook implementation now includes SGML, XML, Simplified, and RELAX NG versions. The use of DocBook grew out of work done at O’Reilly for book publishing.

Docbook history states: The DocBook DTD was originally designed and implemented by HaL Computer Systems and O’Reilly & Associates around 1991. It was developed primarily for the purpose of holding the results of troff conversion of UNIX documentation, so that the files could be interchanged. Its design appears to have been based partly on input from SGML interchange projects being conducted by the UNIX International and Open Software Foundation consortia.

Let's take a look at Chapter 1 of the DocBook book to understand better how and why to use DocBook.

Output formats from XML

Now you are probably wondering what type of output you get from an XML document. I will use the term XML here to cover SGML and DocBook as well as XML itself. The answer is quite a few. Since I have the DocBook tools installed on this computer, I looked for the script files used to do the transforms. I found the following: docbook2dvi, docbook2html, docbook2man, docbook2pdf, docbook2ps, docbook2rtf, docbook2tex, docbook2texi, and docbook2txt. This means that from a single source file, known as an instance, I can produce a dvi, html, man, pdf, ps, rtf, tex, texi, or txt document.

Not only can I produce all these different formats, but I can even use a program to check the instance document to see that it only contains valid tags by using different DTD files. The DTD (Document Type Definition) contains rules on which tags are allowed where. If you wanted to, you could make a DTD specifically for man pages. This would assure that any man page created in XML would only contain valid tags for a man page. Once you had created the instance, you could use the script docbook2man to produce a man page for inclusion with your application.
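To give a feel for what "rules on which tags are allowed where" means, here is a toy checker in Python. It is not a real DTD validator, and the tag names (manpage, name, synopsis, description, para) are a vocabulary I made up for this illustration, not DocBook's. It only checks which child tags may appear inside which parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a tiny man-page vocabulary: parent tag -> allowed child tags.
ALLOWED = {
    "manpage": {"name", "synopsis", "description"},
    "name": set(),
    "synopsis": set(),
    "description": {"para"},
    "para": set(),
}


def check(xml_text, allowed=ALLOWED):
    """Return a list of rule violations; an empty list means the instance passed."""
    problems = []

    def walk(element):
        if element.tag not in allowed:
            problems.append(f"unknown element <{element.tag}>")
            return
        for child in element:
            if child.tag not in allowed[element.tag]:
                problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
            else:
                walk(child)

    walk(ET.fromstring(xml_text))
    return problems


GOOD = ("<manpage><name>hello</name>"
        "<description><para>Prints a greeting.</para></description></manpage>")
BAD = "<manpage><name>hello</name><description><table/></description></manpage>"

print(check(GOOD))   # no problems
print(check(BAD))    # the stray <table> is reported
```

A real DTD also controls order, repetition, and attributes, and a real validating parser would do the checking for you.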

So how do you write the XML instance?

When it comes to editors, you can use a simple text editor if you want to deal with the tags directly. Or, if you want an editor specifically designed for XML, check out the DocBook Authoring Tools.

Word Processing Tools

Well, we have talked about using roff, TeX, and XML for creating documents, but what about tools like OpenOffice?

OpenOffice was designed to be a replacement for an office type word processing program like Word. As with any other tool, there is more than one way to write words. For the quick, simple document of a page or two that is going to be printed, a WYSIWYG (What You See Is What You Get) editor might be easier. But if you are doing something which needs lots of versions, complex outputs, many pages, or multiple output formats, then using TeX or XML is probably a better choice.

So what WYSIWYG editors are available on Linux? Here is a list of the ones I know about and what they are good for.

  • OpenOffice This is probably the best known of the WYSIWYG editors for Linux. It is a reasonably full featured editor when compared to Word. See OpenOffice.
  • AbiWord This editor is a commercial editor which is available for free. It produces some nice documents and is fairly easy to use. See AbiSource.com.
  • Kword This is the word processing program which is included with KDE. I have not used this one, so I cannot make a comparison of it. See Kword
  • EZ This word processing program is part of the Andrew system developed at Carnegie Mellon University. See EZ - As a Word Processor
  • FLWriter FLWriter is a small word processor for X-Window. See Fast Light WRITER
  • Scribus Scribus is a desktop page layout program in the tradition of Corel Ventura®, Quark Xpress®, PageMaker® and InDesign. See Scribus
  • For a longer list, check out Linux Word Processing

Can I get the advantages of XML and the Ease of WYSIWYG

Well, interestingly, this is finally starting to happen. The Writer program in OpenOffice uses XML as its storage medium. The default format for an OpenOffice document is File.sxw. This is actually a zip archive containing the following list of files:

        Length     Date   Time    Name
       --------    ----   ----    ----
             30  05-12-04 01:50   mimetype
             18  05-12-04 01:50   layout-cache
          41118  05-12-04 01:50   content.xml
          52455  05-12-04 01:50   styles.xml
           1212  05-12-04 01:50   meta.xml
           7440  05-12-04 01:50   settings.xml
            850  05-12-04 01:50   META-INF/manifest.xml
       --------                   -------
         103123                   7 files

The file content.xml contains your input with XML tags. Here is the beginning of a content.xml file for you to view.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE office:document-content PUBLIC "-//OpenOffice.org//DTD OfficeDocument
1.0//EN" "office.dtd"><office:document-content xmlns:office="http://openoffice.o
rg/2000/office" xmlns:style="http://openoffice.org/2000/style" xmlns:text="http:
//openoffice.org/2000/text" xmlns:table="http://openoffice.org/2000/table" xmlns
:draw="http://openoffice.org/2000/drawing" xmlns:fo="http://www.w3.org/1999/XSL/
Format" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:number="http://openoffi
ce.org/2000/datastyle" xmlns:svg="http://www.w3.org/2000/svg" xmlns:chart="http:
//openoffice.org/2000/chart" xmlns:dr3d="http://openoffice.org/2000/dr3d" xmlns:
math="http://www.w3.org/1998/Math/MathML" xmlns:form="http://openoffice.org/2000
/form" xmlns:script="http://openoffice.org/2000/script" office:class="text" offi
ce:version="1.0"><office:script/><office:font-decls><style:font-decl style:name=
"StarSymbol" fo:font-family="StarSymbol" style:font-charset="x-symbol"/><style:f
ont-decl style:name="Symbol" fo:font-family="Symbol" style:font-pitch="variable"
 style:font-charset="x-symbol"/><style:font-decl style:name="Wingdings" fo:font-
family="Wingdings" style:font-pitch="variable" style:font-charset="x-symbol"/><s
tyle:font-decl style:name="Lucidasans1" fo:font-family="Lucidasans"/><style:font
-decl style:name="Avant Garde" fo:font-family="'Avant Garde', 'Ce
ntury Gothic'" style:font-pitch="variable"/><style:font-decl style:name="Bi
tstream Vera Sans1" fo:font-family="'Bitstream Vera Sans'" style:font-
pitch="variable"/>
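Because the file is an ordinary zip archive holding ordinary XML, you can get at your words without OpenOffice at all. Here is a minimal Python sketch. The namespace addresses are copied from the content.xml listing above; the helper names and the tiny stand-in archive are made up so the example can run without a real .sxw file, and I am assuming paragraphs are stored as text:p elements.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace addresses as they appear in the content.xml listing above.
OFFICE_NS = "http://openoffice.org/2000/office"
TEXT_NS = "http://openoffice.org/2000/text"


def paragraphs(sxw_file):
    """Return the text of each paragraph found in the archive's content.xml."""
    with zipfile.ZipFile(sxw_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # Assumption: paragraphs are <text:p> elements somewhere under the root.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def tiny_sxw():
    """Build a stand-in archive in memory so the sketch runs without a real file."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        "<office:body><text:p>Hello from Writer.</text:p></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.sun.xml.writer")
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


print(paragraphs(tiny_sxw()))
```

With a real document you would pass the file name, as in paragraphs("File.sxw").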

Here is a tutorial document on how to use OpenOffice with DocBook.

Which type of Preparation is Right for Me

This is always a difficult question. Here is the opinion of one college student.

    When I wrote my graduate thesis, I wrote it in LaTeX using vi.  A few
    graduate student "generations" before me, some students had started
    using LaTeX and they were very persuasive as to the benefits of LaTeX:
    the cross-referencing, the automatic figure/equation numbering, the
    postscript figure inclusion...  I had the benefit of their experience,
    but I still had to learn the syntax of the language, and the initial
    learning curve was much steeper than that for the people writing in
    Word or other WYSIWYG editors.  But after a time, the WYSIWYG users
    were trying to learn to do those complex things that came
    "effortlessly" in LaTeX and they had a hard time.  In the end I
    figured that to produce the same level of output took about the same
    amount of effort, no matter what system you were using.

    So if your usage is never going to go beyond the "beginner" level
    (which probably covers most people) perhaps the WYSIWYG editors are
    better.  If you are doing complex things, I think the answer is less
    clear.

    There are still things I can't seem to tweak to perfection in LaTeX,
    but for some things, I think LaTeX can definitely be more efficient.
    I write my letters (not that I write terribly many of them) in LaTeX.
    I have a template file I simply cut the old text from and type in the
    new text.  It takes me only as long as it takes to type out the text
    and my letter is done.  If I want to do something funkier, I may be
    able to do it easily or I may not be able to figure it out at all.
    But I've never found Word users to be much better off either.  On the
    minus side, I just figured out how to generate a decent looking
    envelope from LaTeX the other week, but now that I can do it, my world
    is happy for the time being.

Another interesting point of view is expressed in the web page Street Level: Editors vs. Authoring Tools. Even though this discussion is aimed at HTML authors, it applies to many others.

Writing Tools of the Trade

We have seen how Linux has many tools to do word processing and layout, but what about the tools to help the writer select his/her words? Again here Linux comes to the rescue with a tool box to assist the intrepid author.

  • Spell Checking The Linux system has 3 generations of spell checkers available. The original one was just called spell. This was reworked into a better version called ispell. Since many people like to see if they can do something better, the current incarnation of this tool is called aspell. In addition to spell checking, there is the program look, which is quite handy when you are unsure how to spell a word. Look allows you to enter the beginning of a word, and it will find the words which start with those letters. For example:

            103 john[pts/4] % look incur 
            incur
            incurable
            incurred
            incurring
            incurs
            incursion
  • edict/ethes These are an interesting pair of Perl scripts. They take a word and look it up on a dictionary server on the internet. For example, if you entered edict thoroughbred, the script would go to the Merriam Webster Online Dictionary and return the meaning of thoroughbred. Or if you did ethes thoroughbred, it would go to the Merriam Webster Online Thesaurus and return Synonyms: PUREBRED, full-blooded, pedigree, pedigreed, pureblood; Contrasted Words: mixed, mongrel. See Edict - Your personal command line dictionary
  • diction This is a program from the old AT&T Writer’s Workbench. It is a program which implements William Strunk’s The Elements of Style. It can analyze a piece of writing and help you structure it better. See Style and Diction to download the software. And see The Elements of Style if you want to understand Strunk’s book; it is online. If you want to learn more about the Writer’s Workbench, you should read the article Writer’s Workbench from O’Reilly.
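The prefix search that look performs in the spell checking item above is easy to picture with a sorted word list. Here is a minimal Python sketch of the idea. It is not how the real look program is written, and the short word list is made up to stand in for the system dictionary.

```python
from bisect import bisect_left


def look(prefix, words):
    """Return every word in the sorted list `words` that starts with `prefix`."""
    start = bisect_left(words, prefix)   # binary search for the first candidate
    matches = []
    for word in words[start:]:
        if not word.startswith(prefix):
            break                        # sorted input: no later word can match
        matches.append(word)
    return matches


# A stand-in for the system word list; it must be sorted for the search to work.
WORDS = sorted([
    "incubate", "incur", "incurable", "incurred",
    "incurring", "incurs", "incursion", "indeed",
])

for word in look("incur", WORDS):
    print(word)
```

Because the list is sorted, the search can jump straight to the first possible match and stop at the first word that no longer fits.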

OK, it’s a Wrap

Well, we have scratched the surface with this introduction to tools and methods. There are two areas I have not covered in enough detail. One is XML, which is showing up in many applications from XHTML to RSS feeds. The other is how to output your masterpiece, either online or in print. This would involve print formats and distribution formats such as PostScript, PDF, and others. If you want me to present on either of these, please let me know.



Written by John F. Moore

Last Revised: Wed Oct 18 11:01:35 EDT 2017

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.