HTML

HTML is a page description language invented by Tim Berners-Lee. Thanks to technologies such as DNS and HTTP, it became possible to write a simple tagging language that could connect a document with others. We call these connections links. According to ABBREVIATIONFINDER, HTML stands for HyperText Markup Language. HyperText in the name thus represents the ability to link, since hypertext is basically another word for linked text. Markup means annotating the text, and it is done with special tags.

The basic parts of the document

An HTML document is basically a text document. What sets it apart from other text documents are the tags. Using these tags you can divide the document into three parts:

  1. First comes a so-called doctype. With it, both people and computer programs can see which version of HTML is being used. In the strictly technical sense, the doctype is not part of the HTML code, but an instruction that precedes it. Keep the word in mind, but ignore its meaning for the time being. Just remember that you should have a doctype first.
  2. The first part of the HTML code is the document head. What is in the head will not appear directly in the browser window, but can affect how the page behaves and looks. One tag is mandatory, namely <title>. What is written there will appear in the browser's title bar, usually the blue bar above the menus. The tag <head> starts the head, but it is preceded by the tag <html>, which together with its end tag encloses the entire document. The main language of the document should be specified as an attribute on the html start tag.
  3. Most of an HTML document is normally its body. This is indicated by the tag <body>.

Now with our four tags (<html>, <head>, <title> and <body>) we can make a minimal HTML document. Some things you will see in the code below are not yet explained. If you want to try it yourself, just copy them as they are for the time being. They are explained later.

A Minimal Document (HTML5)

In the technically strict sense, this is the smallest possible document, but it is hardly useful.

<!DOCTYPE html><title>Ett tomt dokument</title>

In practice, a minimal document looks like this:

<!DOCTYPE html>
<html lang="sv">
  <head>
    <meta charset="utf-8" />
    <title>Ett tomt dokument</title>
  </head>
  <body>
  </body>
</html>

If you look at this document in a web browser, you will see only a blank white page. The title bar will say "Ett tomt dokument" (Swedish for "An empty document"), but otherwise we have not done anything that is visible yet. So let's add some tags. Here are some tags we can get started with:

  • <h1>, which is the main heading of the document.
  • <h2>, which marks subheadings.
  • <h3>, which consequently is one more heading level down the hierarchy. So we are not talking about the third heading, but the third level of headings.
  • <p>, which marks paragraphs.

Now let’s give our document some content!

<!DOCTYPE html>
<html lang="sv">
  <head>
    <meta charset="utf-8" />
    <title>Ett ej längre tomt dokument</title>
  </head>
  <body>
    <h1>Hipp hurra, ett dokument ida!</h1>
    <p>
      Detta är måhända det första dokument jag testat att göra, men något
      skall det innehålla och här skriver jag det. Tjosan, hejsan, o.s.v.
    </p>
    <p>
      Här kommer det andra stycket. I denna HTML-kod finns det också indrag
      i koden som markerar vilka delar som hör ihop. De påverkar inte
      funktionen i sig, utan gör bara koden lättare att läsa. Tag för vana
      att alltid skapa indrag i din kod. Två blanksteg per nivå, som här,
      brukar vara lagom.
    </p>
    <h2>ännu mera info här</h2>
    <p>
      Där kom alltså underrubriken, och detta är ett nytt stycke. Så värst
      mycket mer orkar jag dock inte skriva nu. Här blev det därför slut.
    </p>
  </body>
</html>

If you want to try making a document yourself, open a plain text editor such as Notepad and copy the code above. Save what you wrote with the extension .html and make sure that Windows (if you use it) does not append .txt on its own. I have seen several beginners whose documents ended up with names like foobar.html.txt. So far, our document also looks very boring. Beginners therefore quickly ask questions such as:

  • How can I change the font and/or change the color and size of the text?
  • How can I get a different background color or background image?
  • How can I center the text?

Right here you should thank your lucky star that you are reading my web school and no one else's 😉 Too many beginner courses start introducing, at this point, tags and other parts of the HTML code that can be used to control these things. But these are not questions about content, but about appearance. Appearance should not be controlled with HTML at all. Once upon a time it was, but that was stupid, and all of us who work with web technology have suffered for this stupidity. Appearance should be controlled with CSS, and you will have to be patient until we get there.

Attributes and elements

What we have so far called tags, the more experienced call elements. A tag is what starts or ends an element; these are called the start tag and the end tag respectively. An element is: start tag + content (including any other elements) + end tag.
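The formula can be illustrated with a small sketch. The em element (emphasis) is only an assumption chosen for illustration, since it has not been introduced in this chapter; any element that is allowed inside a paragraph would serve equally well.

```html
<!-- One p element: start tag + content + end tag. -->
<p>                          <!-- start tag of the p element -->
  This text is content, and
  <em>this em element</em>   <!-- a complete element nested inside the content -->
  is part of the content too.
</p>                         <!-- end tag of the p element -->
```

The em element has its own start tag, content and end tag, and as a whole it counts as part of the content of the p element.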

Sometimes we want to say something more about an element, and for that we have attributes. Attributes are written in the start tag, and in the code above we had the following example attributes:

  • lang="sv"
  • charset="utf-8"

Always write attributes like this: attributename="attribute value". You should always have quotes around the attribute value. In some versions of HTML this is not required, but it is still a good habit. As a teacher, I've seen time after time how problems arise when beginners do not use the quotation marks.

There is one possible exception to this rule, and that is attributes that can only have a single value. The following examples all work exactly the same:

  • <input required="required" … />
  • <input required="" … />
  • <input required … />

In this case, the last form is preferable, i.e. writing the attribute without any value at all. The exception also helps explain why, in normal cases, you should always put quotation marks around attribute values. Imagine that someone wanted to write this:

<p class="foo bar">

If you forget the quotes, this will be interpreted by the browser as follows:

<p class="foo" bar="">

What was intended as part of the attribute value has now become another attribute. There is one variant of HTML, XHTML, where you must have quotes around the attribute value. In XHTML it is also not allowed to write the short form of an attribute, i.e. entirely without a value.

The lang attribute

The lang attribute indicates which (human) language the page is written in. Here it is fun (or rather sad) to see beginners copying other people's code from the web and stating lang="en" even though their documents are in Swedish (or German or Danish or ...).
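As a sketch of what a correct declaration might look like: the document below states Swedish as its main language on the html start tag. The second paragraph goes one step beyond what this chapter has covered and assumes you also want to mark a single passage in another language; lang can be set on individual elements for that purpose.

```html
<!DOCTYPE html>
<html lang="sv">
  <head>
    <meta charset="utf-8" />
    <title>Språkexempel</title>
  </head>
  <body>
    <!-- Inherits the main language (Swedish) from the html element. -->
    <p>Detta stycke är på svenska.</p>
    <!-- Overrides the language for this one paragraph only. -->
    <p lang="en">This paragraph is in English.</p>
  </body>
</html>
```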

Empty elements

The code above contained the element meta. You may have noticed that it did not have a separate end tag. This element cannot have any kind of content, only attributes. The element is used here to specify the character encoding, and browsers use that information when they load pages locally from the computer's file system. The meta element can be used in several ways, but we leave it there until further notice.

Some other examples of empty elements are:

br

Means break and creates a new line. Can be used for poetry, for example, but should normally be avoided.

hr

Means horizontal rule and was used in the early days of the web to thematically separate parts of a longer page. The element is hardly needed anymore. The visual effect should be created with CSS.

input

Elements for creating input fields in forms.

link

Specifies a related resource, such as a style sheet, a favicon, or an RSS feed.

One way of expressing this is that empty elements close themselves, at least in plain HTML. Later we will look at XHTML, where the rules are different.
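Here is a sketch that puts the empty elements listed above into one small document. The file name style.css, the field name greeting, and the form and label elements around the input are assumptions made only for illustration; they have not been covered in this chapter.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Empty elements</title>
    <!-- link: points to a related resource (style.css is a hypothetical file) -->
    <link rel="stylesheet" href="style.css" />
  </head>
  <body>
    <p>
      Roses are red,<br />
      violets are blue.
    </p>
    <!-- hr: a thematic break between two parts of the page -->
    <hr />
    <form>
      <p>
        <label>Greeting: <input type="text" name="greeting" /></label>
      </p>
    </form>
  </body>
</html>
```

None of meta, link, br, hr or input has a separate end tag. The slash before the closing angle bracket follows the style used in the earlier examples.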

Implicit closure of elements

An end tag closes an element explicitly. In HTML, there are also elements that can be closed by another element's start tag. Here is an example, where a developer mistakenly believed that you could have lists inside paragraphs, but did know that you can omit the end tags of the individual items (list items = <li>) inside the list.

<p>
  Hälsningsfraser:
  <ul>
    <li>Hej
    <li>God dag
    <li>Tjenare
  </ul>
</p>

What happens here is that the start tag of the list, <ul>, ends the paragraph. The browser interprets the code roughly like this:

<p>
  Hälsningsfraser:
</p>
<ul>
  <li>Hej</li>
  <li>God dag</li>
  <li>Tjenare</li>
</ul>

Notice that the p end tag has been moved up. Paragraphs cannot contain lists, so when the list starts, the paragraph ends. Each individual item in the list is closed by the start of the next item, or by the end of the list itself. This is called implicit closure of elements.

The rules for what is closed implicitly and what is not are not easy to learn. There are some obvious cases, but also much more difficult ones. Add to this that browsers do not always (so far) have identical rules for when implicit closure should occur, and it becomes even more difficult. For this reason, I always urge you to close each element explicitly! When you then validate your code, incorrectly placed end tags will result in errors. The above example, according to the HTML5 validator, results in the error "No p element in scope but a p end tag seen." In plain language, this means that there is a p end tag, but it does not belong to any p element. The element was already closed further up!
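For comparison, here is one way the greetings example could be written with every element closed explicitly and with the list placed after the paragraph rather than inside it. Treat it as a sketch of the habit recommended above, not as the only valid markup.

```html
<!-- The paragraph is closed before the list starts. -->
<p>Hälsningsfraser:</p>
<ul>
  <!-- Every list item has its own end tag. -->
  <li>Hej</li>
  <li>God dag</li>
  <li>Tjenare</li>
</ul>
```

Written this way, nothing depends on the browser's rules for implicit closure, and the stray p end tag that caused the validator error is gone.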

Implicitly created elements

Try the following:

<!DOCTYPE html>
<title>Test av implicit skapade element</title>
<style>
html {
  color: red;
}
body {
  font-size: xx-large;
}
</style>
<p>Hej</p>

Although we did not write out any html tags or body tags, the CSS code works. The text will be very large and red. The paragraph (the p element) inherits the color and size from the html and body elements. Thus, these exist, although they are not written out in the code. They are implicit in HTML (but not in XHTML, as we will see later). The code above is thus functionally equivalent to this:

<!DOCTYPE html>
<html>
  <head>
    <title>Test av implicit skapade element</title>
    <style>
    html {
      color: red;
    }
    body {
      font-size: xx-large;
    }
    </style>
  </head>
  <body>
    <p>Hej</p>
  </body>
</html>

Just as with implicitly closed elements, it is not entirely easy to know when elements have been created implicitly. Therefore, I do not recommend that you rely on this, especially as a beginner. There are a number of experts who tinker with their HTML code to make it extremely minimal, but the savings are rarely large enough to make any practical difference. Write everything out and avoid errors. It's a rule that minimizes your real worries!

Entities

Tags in HTML start with the character < and end with >, but what do you do if you want to use these characters for something else? What if I want to say that 3 is less than 4, or if I want to write HTML code so that it is visible on the page? The answer is entities. < is written as &lt; and > is written as &gt;.

An entity thus begins with the character & and ends with a semicolon. Since the character & has this special meaning, it must itself be written as an entity when you want to use it as part of the regular text. It is written &amp;, which comes from the English word ampersand.
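A small sketch of the three entities in use. The sentences are made up for illustration only.

```html
<!-- Displays as: 3 < 4 and 4 > 3 -->
<p>3 &lt; 4 and 4 &gt; 3</p>

<!-- Displays as: Tom & Jerry -->
<p>Tom &amp; Jerry</p>

<!-- Shows HTML code as visible text instead of letting the browser treat it as a tag. -->
<p>A paragraph starts with the tag &lt;p&gt; and ends with &lt;/p&gt;.</p>
```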

Unnecessary entities

In the very early days of the web, you could not write å, ä and ö without using entities.

The entity for å

&aring; = a with a ring.

The entity for ä

&auml; = a with umlaut, in Swedish called trema.

The entity for ö

&ouml; = o with umlaut.

Using them today is completely unnecessary. Since at least 1993, all browsers have supported character encodings containing our Swedish characters.

The only other entity that you normally need to use is &quot;, which corresponds to the quotation mark ("). It can be needed when writing computer code, as in this HTML school. For real quotations, however, there are regular elements: <q> for short quotations and <blockquote> for long quotations.
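The sketch below shows the two quotation elements, followed by one situation where &quot; is genuinely needed: a quotation mark inside an attribute value that is itself enclosed in quotation marks. The title attribute and all the text are assumptions chosen for illustration.

```html
<!-- q: a short quotation inside running text -->
<p>My teacher said <q>always quote your attribute values</q>.</p>

<!-- blockquote: a longer quotation that stands on its own -->
<blockquote>
  <p>A longer quotation gets its own block, with its own paragraphs inside.</p>
</blockquote>

<!-- &quot; keeps the inner quotation marks from ending the attribute value -->
<p title="She said &quot;hello&quot;">Hover over this paragraph to see the title.</p>
```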

For those who want to delve into the details of entities, there are plenty of resources online. However, what I have written here goes a long way. It is far more important that as a beginner you learn to handle character encoding. With the right character encoding, you can write whatever character you want, and in this way all other entities except the above mentioned become completely unnecessary.

What kind of HTML?

A moment ago, the concept of XHTML was mentioned. After all, the web is evolving at a furious pace, and HTML code that was good five years ago is outdated today. Simplifying somewhat, we can speak of five generations of (X)HTML:

  1. HTML 3.2, which was established on January 14, 1997.
  2. HTML 4.01, which was established on December 24, 1999.
  3. XHTML 1.0 and XHTML 1.1, which were adopted on January 26, 2000 and May 31, 2001, respectively.
  4. XHTML 2.0, which was supposed to replace all old code, but for various reasons it flopped and the standard is basically dead.
  5. (X)HTML5, which is being drafted and will greatly expand what you can do with HTML, with emphasis on applications and not just static documents.

Someone might wonder what existed before HTML 3.2, and the answer is that up until then no standards had had any real impact. Why 4.01 and not HTML 4.0? There was a 4.0, but it contained some errors and can safely be consigned to oblivion.

The type of HTML you use should be specified with the document's doctype. If you use some kind of XHTML, you must also specify the document's namespace. For now, namespaces can be regarded as advanced material, and should you forget to specify one, the document will still work flawlessly. However, you would be picking up a bad habit, and if you want to go deeper into web technology it will come back to punish you later on. So you do not need to know what a namespace is yet; just remember that it should be specified in XHTML!
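As a sketch of what this can look like in practice, here is a minimal XHTML 1.0 document. The choice of the Strict variant is my assumption for the example; the differences between the variants are a topic for a later chapter. The namespace is the xmlns attribute on the html start tag.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- xmlns specifies the XHTML namespace; the language is given as both lang and xml:lang. -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="sv" xml:lang="sv">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Ett XHTML-dokument</title>
  </head>
  <body>
    <p>Hej</p>
  </body>
</html>
```

Compare this doctype with the much shorter <!DOCTYPE html> used in the HTML5 examples earlier in the chapter.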

Summary

We have thus established that an HTML document consists of three parts: doctype, head and body. The document is marked up with what are commonly called tags. It can also be expressed by saying that the document is made up of elements. Elements can have attributes, which are written inside the start tag according to the formula attributename="attribute value". Initially, HTML was not intended to contain information about how the content should be presented visually. Gradually, both special elements and attributes appeared whose purpose was to control the design. In modern coding, such tags and attributes should be avoided. That job is handed over to the CSS code instead.

Before we learn more (X)HTML, we must decide what kind of (X)HTML to choose. The chapter after next explains the differences between HTML 4.01 and XHTML, as well as their respective subgroups. The next chapter will show you how to tell well-written HTML code from badly written code.