Author: Sankira Qin
Date: Jun 9, 2005 - 10:06
Let's see if I can explain this in plain English AND without boring you. Now, those who are technically adept at this will probably say, "well...that's sorta true, but..." So know up front: I'm not trying to make this technically precise. I don't think people are that interested in all the little nuances, and if they are, they can go research it. I'm trying to explain the thinking behind it more than the actual spec.
SGML (Standard Generalized Markup Language) is the international standard for creating "interchangeable, structured documents." This markup language has been used in business for over twenty years and was used to write the DTDs for HTML. A DTD (Document Type Definition) is a set of rules that spells out which tags and attributes a document (like your webpage) may use and how they may be nested; your code has to comply with those rules to be valid. So, as you can see, HTML isn't a language that stands on its own. It's a markup language defined using SGML, or you could say, it's an SGML application.
Basically, XML is "SGML Lite". The intention behind it was to make the power of SGML more readily available. With it you can write your own DTDs ... in other words, create your own markup languages. In fact, the W3C describes HTML 4 as an SGML application and XHTML as a "reformulation" of HTML 4 as an XML application.
HTML, at least the way browsers handle it, is very forgiving. XML is very unforgiving: a parser is supposed to stop at the first error. While XML is very readily adaptable to the web, it's much more powerful for other uses. Still, the two should be able to work together, since it's expected that many XML programmers will want to bring their results to the web...or at least use an XML browser to view the information. So, in order to "force" more compatibility between the two, HTML is being moved closer to the strictness of XML. The first step on this road is XHTML, which incorporates new rules into HTML that require you to pay a little more attention to your code.
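To make the "forgiving vs. unforgiving" contrast concrete, here is a minimal sketch using only Python's standard library. The sloppy markup and the helper names (`TagCollector`, `html_tags`, `is_well_formed_xml`) are made up for illustration; Python's `html.parser` stands in for a lenient browser and `xml.etree.ElementTree` for a strict XML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sloppy markup: uppercase tags, an unclosed <P>,
# a bare <BR>, and a <b> that is never closed.
SLOPPY = "<P>Hello<BR><b>world"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    """Feed markup to the lenient HTML parser and return the tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True only if the strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(html_tags(SLOPPY))           # the HTML parser shrugs and carries on
print(is_well_formed_xml(SLOPPY))  # the XML parser refuses the same input
print(is_well_formed_xml("<p>Hello<br/><b>world</b></p>"))
```

The HTML parser quietly lowercases the tag names and never complains about the missing closing tags, while the XML parser rejects the document until every tag is closed.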
Let's take a look at the requirement in XHTML that you use lowercase tags, for example. XML itself doesn't require all lowercase tags. XML requires the programmer to use the same case in opening and closing the tags. In other words, if your opening tag is, say, <TAG>, then your closing tag must be </TAG>, and if your opening tag is <tag>, your closing tag must be </tag>. (And, by the way, XML does NOT consider <TAG> and <tag> the same thing.) So, why does XHTML require you to use all lowercase tags? All I can figure is the W3C thought it would simply be better to pick a case and force the issue: make people get used to matching their opening and closing tags by removing all but that one option. "Comply or it won't work" is certainly motivation to remember your syntax. But it's necessary, since XML is case sensitive and the aim is to make newly created web documents compatible with XML documents.
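If you want to see the case rule in action, the following small sketch feeds a few documents to Python's standard XML parser. The element name `note` and the helper `parses` are invented for the example; only the parser's behavior matters.

```python
import xml.etree.ElementTree as ET


def parses(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Same case on both ends is fine, whichever case you pick.
matched_upper = parses("<NOTE>hi</NOTE>")
matched_lower = parses("<note>hi</note>")

# Mixed case between opening and closing tag is an error.
mismatched = parses("<NOTE>hi</note>")

# NOTE and note are two different elements as far as XML is concerned.
doc = ET.fromstring("<root><NOTE>a</NOTE><note>b</note></root>")
names = [child.tag for child in doc]
lowercase_only = [child.text for child in doc.findall("note")]

print(matched_upper, matched_lower, mismatched)
print(names, lowercase_only)
```

Searching for `note` finds only the lowercase element, which is exactly why XHTML settling on one case saves everybody some grief.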
Let's say Joe has created an XML document that describes the characteristics of the company's Super Widget (some weird machine, don't ask me....). This document carries no information at all about how it should be displayed to a user. An application can use the information in Joe's document to run calculations on the data, rearrange the information so it can be inserted into a database, and so on.
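Here is a rough idea of what that could look like. Everything about the document below is an assumption made up for illustration (the `widget` and `part` elements, the `quantity` and `weight_kg` attributes, the numbers); the point is only that the markup describes the data and says nothing about display, so a program can compute with it or reshape it.

```python
import xml.etree.ElementTree as ET

# Hypothetical version of Joe's document: pure description, no presentation.
WIDGET_XML = """\
<widget name="Super Widget">
  <part id="gear" quantity="4" weight_kg="0.5"/>
  <part id="shaft" quantity="1" weight_kg="2.0"/>
  <part id="casing" quantity="2" weight_kg="1.25"/>
</widget>
"""


def total_weight_kg(xml_text):
    """Run a calculation on the data: sum quantity * weight over all parts."""
    root = ET.fromstring(xml_text)
    return sum(
        int(part.get("quantity")) * float(part.get("weight_kg"))
        for part in root.iter("part")
    )


def as_rows(xml_text):
    """Rearrange the data into (id, quantity) tuples, e.g. for a database insert."""
    root = ET.fromstring(xml_text)
    return [(part.get("id"), int(part.get("quantity"))) for part in root.iter("part")]


print(total_weight_kg(WIDGET_XML))
print(as_rows(WIDGET_XML))
```

The same document feeds both the calculation and the database-style rows without either function knowing or caring how the widget would look on a screen.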
Joe's XML document is purely descriptive of the widget, but because of the XML tags he's included, the tagged information can be acted upon by applications, including an XML browser, if his document is rendered using stylesheets. You could associate it with CSS. But while CSS could probably render the document in a CSS-compliant browser (after a fashion), it isn't nearly powerful enough to handle processing the document's data. And not all browsers recognize the markup that associates a stylesheet with an XML document. Instead, XSLT (Extensible Stylesheet Language Transformations) is used on the server side to translate the XML file into something browser compatible.
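To be clear, what follows is NOT XSLT. Python's standard library doesn't include an XSLT processor, so this is a hand-written stand-in that does the same kind of job on the server side: take descriptive XML in, hand browser-friendly markup out. The widget document and the function name `widget_to_xhtml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptive document (no display information in it).
WIDGET_XML = (
    '<widget name="Super Widget">'
    '<part id="gear" quantity="4"/>'
    '<part id="shaft" quantity="1"/>'
    "</widget>"
)


def widget_to_xhtml(xml_text):
    """Translate the widget description into an XHTML table fragment."""
    source = ET.fromstring(xml_text)

    table = ET.Element("table")
    caption = ET.SubElement(table, "caption")
    caption.text = source.get("name")

    # One table row per part, one cell per attribute we want to show.
    for part in source.iter("part"):
        row = ET.SubElement(table, "tr")
        for value in (part.get("id"), part.get("quantity")):
            cell = ET.SubElement(row, "td")
            cell.text = value

    return ET.tostring(table, encoding="unicode")


print(widget_to_xhtml(WIDGET_XML))
```

A real XSLT stylesheet would express the same mapping declaratively as templates, but the division of labor is the same: the data stays in Joe's document and the presentation is generated from it.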
At the same time, the company's web designers are creating pages for the company intranet, and they suddenly realize that Joe's XML document for the widget specifications would be really useful on one of those pages. Since their webpages are written in XHTML, which is itself XML, the pages and Joe's document can be handled by the same XML tools and made to work together fairly easily. That would be much harder if they were using the old HTML 4 specs in their webpages.
So that's really how XHTML is a "mix" of HTML and XML. There's no mix of the languages (since XML doesn't exist as a language until the programmer creates it)...it's a mix of the theories behind them. [All lowercase tags, all attributes in quotation marks, tags opened and closed and in the right order, etc.]
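As a quick check on those bracketed rules, this sketch runs a good and a bad version of each through Python's standard XML parser. The snippets and the helper `well_formed` are made up for the example. Note that the parser only checks XML well-formedness, so it catches the quoting, closing, and nesting rules; the all-lowercase rule comes from XHTML's own DTD and isn't tested here.

```python
import xml.etree.ElementTree as ET


def well_formed(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets: a passing and a failing version of each rule.
EXAMPLES = {
    "quoted attribute": '<p class="intro">text</p>',
    "unquoted attribute": "<p class=intro>text</p>",
    "closed empty tag": "<p>line<br/>break</p>",
    "unclosed tag": "<p>line<br>break</p>",
    "proper nesting": "<p><b><i>text</i></b></p>",
    "wrong nesting": "<p><b><i>text</b></i></p>",
}

results = {label: well_formed(markup) for label, markup in EXAMPLES.items()}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```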