Quick-and-dirty HTML

A one-hour crash course in Web page creation


Introduction

The purpose of this document is to show you how to create simple Web pages using just twenty or so HTML tags (formatting commands), without any of the more advanced features (tables, frames, style sheets or JavaScript).

Well, this is not quite true: as you will see, this page uses one teeny-tiny Cascading Style Sheet (CSS) to define some basic page format attributes (width, font, more). This file, however, can be treated as a 'black box': don't touch it, at least for the time being.

Also, this is not about the so-called Web authoring tools. As nice as those may seem at first glance, they produce HTML code which is ugly, redundant, unreadable and difficult to maintain. What I'm showing here is how to do it by hand.

What you will be able to do, I hope, after having read this, is to convert plain text into an HTML document, to be formatted and displayed by a Web browser. The result looks like a printed book: no-frills, clean and readable. It follows the original HTML concept: the document defines the contents; the browser provides the formatting. (This paradigm seems no longer to be used in practice.)

Actually, this page is coded with only the HTML subset it describes, so it can be used not only as a tutorial, but also as a working example.

General rules and conventions

The HTML tags are enclosed in 'sharp braces' (angle brackets), made up of the 'less-than' and 'greater-than' characters: < and >. Therefore, if your text to display actually contains these two, they should be spelled out as &lt; and &gt;, respectively. Some other symbols which need a similar treatment will be shown further down.
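
As a small illustration (the sentence is made up for this example), a line of text that mentions both a comparison and a tag name could be coded like this:

   If a &lt; b, use the &lt;b&gt; tag.

A browser should display it as: If a < b, use the <b> tag.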

A tag starting with <!-- opens a comment, which must be closed with a --> sequence. Anything in between will be ignored, regardless of any sharp braces (or complete tags) inside. Comments are used to explain the code or to disable parts of it temporarily. They can't be nested.
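
A minimal sketch of both uses, with made-up sample text: the first comment explains the code, the second one temporarily disables a sentence (together with the tags inside it).

   <!-- This note explains the code; it is not displayed -->
   This sentence is shown.
   <!--
   This sentence is <b>temporarily</b> disabled.
   -->
   This one is shown again.

Only the first and the last sentence should appear in the browser window.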

Some HTML tags or other symbols are case-sensitive, some are not, and some are expected to be in the future. I will be using lower case wherever possible, keeping uppercase just for hyperlink anchors or style names (not discussed here).

Document template

These are a few standard lines I'd suggest you put at the beginning and at the end of every HTML file. As long as they are there, you don't have to know exactly what they do, but let's have a quick look:

 <!-- quick-temp.html - HTML template -->
 <!doctype html>
 <html lang=en>
 <head>
 <meta charset=utf-8>
 <title>HTML Template</title>
 <style type=text/css>
 @import url(0.css);
 </style>
 </head>
 <body>
 <center><h1>
 A simple HTML template
 </h1></center>
 <!-- START YOUR TEXT -->
 ...your text goes here... <p>
 See <a href=quick-html.html>this</a>.
 <!-- END YOUR TEXT -->
 <p> 2020/02/05
 </body></html>
 

The <html>, <head>, and <body> tags define the scope of the document and its two component parts: the header (with some general document information) and the body (the actual contents). All three have closing counterparts, like </body>. The lang attribute specifies the language (en for English, pl for Polish, etc.). The charset meta-tag defines the character set.

The text within the <title> tag will be used for the tab or window (depending on the browser) caption. The <h1> formatting is often used for a title line. Obviously, you will want to provide your own text for both.

Last but not least, leave the @import line alone for now; I'm using it as the quickest way to define the text width within the window. Actually, a purist would remove this line entirely, and our page would then take the whole window width, without justification.

The easiest way to use this template is to copy it with a new name to your local computer and use that copy. Make sure to use the .htm or .html extension so the file will be viewable in your local Web browser.

I strongly recommend that your file names consist only of lowercase characters, digits, underscore, hyphen and dot. Anything else is asking for trouble, sometimes when you least expect it.

What you have to do now is to paste your unformatted text between the two comment lines shown above and then add all formatting tags as needed. The latter point is actually what this article is about.

Paragraphs and line breaks

Any spaces, tabs or line breaks in the text to format are treated just as word separators; multiple ones following one another are merged, so the whole pasted text will be shown as a single block of characters. Ugly and unreadable.

This is why HTML provides a paragraph tag, <p>, which inserts a line break and adds some vertical space, so that the spacing between paragraphs is larger than that between lines within a paragraph.

I would suggest that you split your text into paragraphs and have a first look at it in the browser before any further formatting. In most cases, 90% of your job is done.

There is a closing </p> tag, but it is optional: any new <p> implies it. I never use it, as it is redundant, sometimes misleading, and usually makes your HTML more error-prone and harder to maintain. And the worst thing you can do is to mix both styles in the same document.
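
A short sketch of this style, with made-up text: each <p> on a line of its own separates two paragraphs, and no closing tags are used.

   This is the first paragraph. It may run
   over several lines in the source file.
   <p>
   This is the second paragraph.
   <p>
   And this is the third.

The line break inside the first paragraph is treated as a plain space; only the <p> tags start new paragraphs.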

The text within a paragraph will be split into individual lines (no hyphenated breaks added) and justified. This splitting can be suppressed with the <nobr> formatting tag, see below.

A plain line break (like those used in splitting a paragraph) can be explicitly inserted with a break tag, <br>. In most situations, <p> is preferable, though; try to avoid explicit breaks.
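
One case where explicit breaks seem justified is text with fixed line ends, like a postal address. The address below is invented for this example:

   J. Smith<br>
   12 Example Street<br>
   Springfield

Each <br> starts a new line without the extra vertical space of a paragraph break.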

A sequence of consecutive paragraph breaks with no text in between will be rendered as a single break. Consecutive <br> tags, on the other hand, add up: each one starts a new line, so two in a row leave a blank line.

Text attribute tags

These come in pairs: marking the beginning and the end of the affected text. Both look almost identical, except that the end tag has a slash between the opening brace and tag name, see below.

They are also referred to as inline tags, affecting the text between the points where they occur, and inserting no paragraph or line breaks: this passage has been tagged with Mark.

For example, the Italics tag, <i>, is applied like this:

   Hello, children!
   <i>This is in Italics</i>.
   Do you like it?

and it will be formatted like this:

Hello, children! This is in Italics. Do you like it?

Here is a full list of similar tags.

  1. Italics: <i>, as shown in the example above
  2. Bold: <b>, similar, shown in bold
  3. Emphasis: <em>, looks like Italics
  4. Strong: <strong>, looks like Bold
  5. Mark: <mark>, like a highlighter pen
  6. Big: <big>, increases font size
  7. Small: <small>, decreases font size
  8. Code: <code>, monospaced, spaces merged
  9. No-break: <nobr>, no implicit line breaks inserted or preserved
  10. Superscript: <sup>, like the '3' in πR³
  11. Subscript: <sub>, like the '2' and '5' in C₂H₅OH
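
The last two entries of the list could be coded as sketched below. The sentences are made up; the &pi; entity stands for the Greek letter pi.

   The volume of a sphere is (4/3)&pi;R<sup>3</sup>.
   <p>
   Ethanol is C<sub>2</sub>H<sub>5</sub>OH.

Note that every digit to be lowered needs its own <sub> pair, because the letters in between stay on the baseline.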

The distinction between Italics and Emphasis is, at the moment, not in how the text is formatted but why. In the first case, the writer is requesting the Italics just for the sake of Italics. In the second, the writer requests whatever formatting the browser uses for emphasis. While at this level both look identical, more advanced HTML allows for tweaking of both formats (style sheets) so that they will look different. The 'official' HTML specification recommends using Emphasis over Italics, but I would suggest using both, while sticking to some clearly set rules of usage.

The same relationship holds between Bold and Strong, except that the standard advice against the former is even stronger (although, contrary to a common belief, <b> has not been formally deprecated); I wouldn't worry about that, as the look of either tag can be easily restored or modified with a style sheet (when you are ready to use them).

These tags can be nested, usually with quite predictable results. For example, the text tagged as

   None <big>Big <b>Bold <i>Italic <mark>Mark
   </mark></i></b></big>
will be rendered as
None Big Bold Italic Mark

Note the order of closing tags. If it is not exactly reversed, the result may depend on the browser or whatever.
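
To make the rule concrete, here is a made-up pair of lines: the first one closes its tags in reverse order, the second one does not and is best avoided.

   <!-- properly nested: the inner tag is closed first -->
   <b>Bold <i>Bold Italic</i></b>
   <!-- improperly nested: the outer tag is closed first -->
   <b>Bold <i>Bold Italic</b></i>

Browsers usually try to recover from the second form, but what they do with it is not something to rely on.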

Most of these tags define the result in absolute terms. For example, <b> renders text in bold, regardless of whether, and within what other formatting, it is nested. By contrast, <big> and <small> modify the attribute of the surrounding scope. Thus, multiple nesting of the same size attribute amplifies the effect; for example

   None <big>One <big>Two <big>Three <big>Four!
   </big></big></big></big>
renders as
None One Two Three Four!

Remember that usually it is up to the browser to set not only the multipliers used by size tags, but also the default font size (or sizes). Usually this is not a problem: the page can be rescaled with the +/- keys or with Ctrl and the mouse wheel. If, however, the defaults are mismatched (for, say, sans vs. mono), some pages may render ugly, rescaled or not. This is not something you can easily fix in the code.

Formatting tags <h1>...<h6> define text styles used for headers in your documents, with <h1> being the largest.

A line break will be inserted before the opening tag and after its closing counterpart, so that paragraph tags are not necessary (although they do no harm). Such tags are referred to as block-level.
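
A minimal sketch of a document fragment with two header levels; the titles and the text are placeholders invented for this example.

   <h2>Chapter title</h2>
   Some introductory text.
   <h3>Section title</h3>
   The text of the section.
   <p>
   Another paragraph of the same section.

No <p> is needed right after a header, as the header tags bring their own breaks.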

These tags can be nested with others, first of all <center> (see the next section). This is how the two-line title section of this page was created:

 <center>
 <h1>
 Quick-and-dirty HTML
 </h1>
 <h4>
 A one-hour crash course in Web page creation
 </h4>
 </center>
All non-centered section titles in this page were entered with <h3> tags.

Centered text

The handy <center> tag (block-level) does what its name implies.

If the text does not fit into a single line, it will be broken, and then each individual line will be centered, like in this paragraph.

Any explicit paragraph or line breaks will be shown as well.
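
For illustration, a centered block with one line break and one paragraph break inside might be coded as follows (the text is made up):

   <center>
   A short centered line<br>
   and another one right below it
   <p>
   A new centered paragraph
   </center>

Every resulting line should be centered individually within the text width.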

Preformatted text

The <pre> tag (block-level) denotes preformatted text, preserving spaces and line breaks exactly as they were entered in the code and using a monospaced font. For example:

<pre>
  City           State    Mean     S.D.

  Salt Lake City    UT   1.263   -0.432
  Newark            NJ   0.446    0.153
  Dallas            TX   1.989    0.762
</pre>
will be rendered as
  City           State    Mean     S.D.

  Salt Lake City    UT   1.263   -0.432
  Newark            NJ   0.446    0.153
  Dallas            TX   1.989    0.762
while without the <pre> tag it would look like this:

City State Mean S.D. Salt Lake City UT 1.263 -0.432 Newark NJ 0.446 0.153 Dallas TX 1.989 0.762

Actually, the <pre> tag can be used as a poor man's tool for creating simple, border-less tables.

Lists

There are two kinds of lists in HTML: unordered and ordered, starting with <ul> or <ol>, respectively, and ending with an appropriate slashed tag. Between those two there may be any number of list items, each tagged with <li>, with a closing </li> optional (just don't).

Here is an example of a simple, unordered list:

   <ul>
      <li> List item One
      <li> Number Two
      <li> Three
   </ul>
It will be formatted as

   • List item One
   • Number Two
   • Three

List items may contain, in addition to plain text, other nested lists, paragraphs, and similar constructs. Not all combinations will look right; use common sense. Here is an example of such nesting:

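   <ul>
      <li> This is List item One<br>
      It includes a nested, unordered list:
         <ul>
            <li> Nested Item Uno
            <li> Nested Item Dos
         </ul>
      <li> Number Two: another nested list, ordered:
         <ol>
            <li> Nested Item Uno
            <li> Nested Item Dos
         </ol>
      <li> Item Three includes two breaks:
         <p> A paragraph break
         <br> A line break
   </ul>
This unordered list will be rendered as

   • This is List item One
     It includes a nested, unordered list:
        ◦ Nested Item Uno
        ◦ Nested Item Dos
   • Number Two: another nested list, ordered:
        1. Nested Item Uno
        2. Nested Item Dos
   • Item Three includes two breaks:

     A paragraph break
     A line break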

If paragraphs within list items are not spaced the way we'd like them to be, well, tough luck. It is up to the browser and may depend on the font in use (or phase of the Moon): take it or leave it.

Tweaking it on a certain combination of the browser and operating system may break it under other combinations, unless you assume full control over formatting, and that is no longer a one-hour learning process.

HTML is not just about text formatting and page layout. Equally important is its support for hypertext links: quick transitions to other HTML documents (or even specific places inside those documents). This is done with a hypertext link of the form

   <a href=file>this link</a>
where file should be replaced with the address (generally speaking) of the target file, and this link with the text to be clicked on. The address may be the Web address (URL) of the file (HTML or not) or its name on the same local computer (using a relative or absolute path). Links to HTML files may also, optionally, specify an anchor (the exact destination within the file):

Examples of what may be used as file
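
The links below sketch the main variants. The name quick-html.html and the anchor LIST come from this article; notes/page-2.html and www.example.com are hypothetical and serve only as placeholders.

   <!-- a file in the same directory -->
   <a href=quick-html.html>this article</a>
   <!-- a file in a subdirectory, relative path -->
   <a href=notes/page-2.html>page two</a>
   <!-- a full Web address -->
   <a href=https://www.example.com/index.html>an example site</a>
   <!-- a file with an anchor -->
   <a href=quick-html.html#LIST>the Lists section</a>

The href values are left unquoted here, like elsewhere in this article; this works as long as the value contains no spaces, quotes, or sharp braces.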

When used in links, anchor names have to be preceded with a pound (hash) character, '#'. These anchors are created in documents with an id attribute attached to any style tag for the targeted text (here: section header), like this:

   <h3 id=LIST>Lists</h3>

You can attach the id to almost any tag shown here (although I would suggest limiting that to headers, paragraph breaks, and larger text units when/if you start using them).
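
A minimal sketch of an anchor attached to a paragraph break, and of a link jumping to it from elsewhere in the same file. The anchor name NOTE and the text are made up for this example.

   <p id=NOTE>
   This paragraph is the target of the jump.
   <p>
   See <a href=#NOTE>the note above</a>.

With the file name left out, the part starting with '#' refers to an anchor in the current document.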

Linking to images

A link may be used for jumps to a 'naked' image, displayed without use of any HTML code. To do that, just use the <a> tag with an image file name as the href attribute:
   This is just an <a href=_i/sd-2.s.jpg>image</a>
Try it: This is just an image

Many servers reject external requests for local resources (leeching): one more reason to stick to local (relative) URLs in image jumps.

A small problem: the way in which 'naked' image files are displayed depends on the browser. Some of these ways look better than others (and no, this is not a matter of taste). Check the Web Browsers article in my photo/bytes section.

For how to use images as links, see Linking from Images further on.

Linking to other file types

A link may point to any type of file (denoted by its name extension). Depending on your browser and system settings, such files may be handled in one of three ways:

As this may sometimes lead to confusion, I would recommend using explicit links only to files always opened by browsers (or, at least, likely to be), and to compressed archives.

The former should include, in addition to any HTML or similar documents, the commonly used PDF format (although the jury is still out on this one), and image files, converted to the JPEG, GIF, or PNG format. Anything else should go into a .zip file to be downloaded (yes, stick to the most common, plain ZIP format, even if the most recent IZIP produces smaller files!).
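
Following this advice, links to a document and to an archive could look as sketched here; report.pdf and sources.zip are hypothetical file names.

   The full report is available as a
   <a href=report.pdf>PDF document</a>.
   <p>
   The source files can be downloaded as a
   <a href=sources.zip>ZIP archive</a>.

Naming the file type in the link text lets the reader know what to expect before clicking.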

HTML characters

Characters with no keys assigned have to be inserted into the text using an &...; sequence, as shown before. The most common ones are

Letters of the Greek alphabet are entered the same way, using their English names (capitalized for uppercase):

and so on.
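
As a hedged illustration with made-up sample text, here are a few entity names in use, both symbols and Greek letters:

   Tom &amp; Jerry, &copy; 2020<br>
   10&nbsp;km, 5 &times; 3, 20&deg;C<br>
   &alpha;, &beta;, &gamma;, &pi; and &Omega;

This should be displayed as three lines: "Tom & Jerry, © 2020", then "10 km, 5 × 3, 20°C", then "α, β, γ, π and Ω". The &nbsp; entity is a space at which the line will not be broken.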

Horizontal ruler

Just a matter of personal preference. The <hr> tag (by itself, no closing) draws a full-width ruler line across the text.
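
A small sketch of a ruler separating two parts of a page (the text is made up):

   End of the first part.
   <hr>
   Beginning of the second part.

Like the header tags, <hr> is block-level, so no paragraph tags are needed around it.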

Browser compatibility

2019/08/12 — tested OK on latest Opera, Chrome, Firefox, Vivaldi, Edge and Internet Explorer. To the best of my understanding, all I'm describing here should also work fine on the Apple Safari browser, but this I haven't checked.

I suspect some smartphone or tablet browsers may show rendering problems; they often override the coded HTML formatting with one working (or assumed to work) better on small screens. Tough.

What next?

This article is done; I'm not really expecting it to change, except for small corrections or additions. Still, I'm considering a few sequels, dealing with easy use of Cascading Style Sheets. Just thinking.


Copyright © 2019-2020 by J. Andrzej Wrotniak
2019/07/04, last update 2020/01/31