"Bear with me a moment..."

LogBlog

Pat Logan's Web Log

"But any reasonable concept of democratic citizenship requires an individual
who is able to discern knowledge from propaganda,
is competent to choose among conflicting claims and programs,
and is capable of actively participating in the affairs of the polity."—Aronowitz

(Return to blog summaries)

This personal Web page is not an official University of Rhode Island Web page. See URI disclaimer

Oct. 13, 2013

HTML and the Web

After my last post ("Is Web Development on the Right Track at URI?") I was asked whether the HTML that I teach is old-fashioned.

Good question with a simple answer: It is not. In fact, it remains at the heart of the modern web, which is why I teach it.

How do I know that? I know about the W3C.

W3C?

My delight in the web comes not from its technology. Coding web pages is fun and interesting, but hours spent on pages for courses and research grow tedious. I do not love the web as a coder. Rather, I love the web for its essence—the coolest experiment in global democracy ever.

The web is not run by a governing body: It never has been. It has, from its inception and to this day, been run by a consortium of world experts, empowered by reason and a process of consensus building. Together, they have built a critical enabling technology that has changed human life forever, all within the lifetimes of my youngest students.

The home page of the World Wide Web Consortium (W3C) is www.w3.org. The W3C's About page describes its mission:

The W3C mission is to lead the World Wide Web to its full potential by developing protocols and guidelines that ensure the long-term growth of the Web.

The history of the W3C is available from its own Facts About page, which includes links to other histories and timelines. The ancient history of the web, for example, includes this:

In 1989, Tim Berners-Lee invented the World Wide Web (see the original proposal). He coined the term "World Wide Web," wrote the first World Wide Web server, "httpd," and the first client program (a browser and editor), "WorldWideWeb," in October 1990. He wrote the first version of the "HyperText Markup Language" (HTML), the document formatting language with the capability for hypertext links that became the primary publishing format for the Web. His initial specifications for URIs, HTTP, and HTML were refined and discussed in larger circles as Web technology spread.

In October 1994, Tim Berners-Lee founded the World Wide Web Consortium (W3C) at the Massachusetts Institute of Technology, Laboratory for Computer Science [MIT/LCS] in collaboration with CERN, where the Web originated [...] with support from DARPA and the European Commission.

I want to get to the question about HTML, but must share, without further comment, one more bit from the W3C's Mission page: this statement of principles (links within original removed):

Design Principles

The following design principles guide W3C's work.

Web for All

The social value of the Web is that it enables human communication, commerce, and opportunities to share knowledge. One of W3C's primary goals is to make these benefits available to all people, whatever their hardware, software, network infrastructure, native language, culture, geographical location, or physical or mental ability. [...]

Web on Everything

The number of different kinds of devices that can access the Web has grown immensely. Mobile phones, smart phones, personal digital assistants, interactive television systems, voice response systems, kiosks and even certain domestic appliances can all access the Web.

Vision

W3C's vision for the Web involves participation, sharing knowledge, and thereby building trust on a global scale.

Web for Rich Interaction

The Web was invented as a communications tool intended to allow anyone, anywhere to share information. For many years, the Web was a "read-only" tool for many. Blogs and wikis brought more authors to the Web, and social networking emerged from the flourishing market for content and personalized Web experiences. W3C standards have supported this evolution thanks to strong architecture and design principles.

Web of Data and Services

Some people view the Web as a giant repository of linked data while others as a giant set of services that exchange messages. The two views are complementary, and which to use often depends on the application.

Web of Trust

The Web has transformed the way we communicate with each other. In doing so, it has also modified the nature of our social relationships. People now "meet on the Web" and carry out commercial and personal relationships, in some cases without ever meeting in person. W3C recognizes that trust is a social phenomenon, but technology design can foster trust and confidence. As more activity moves on-line, it will become even more important to support complex interactions among parties around the globe.

The odd thing about the W3C is that it functions as the governing body for the entire world wide web, but it is not actually in charge of anything. It does not create or enforce anything. It has no authority like the ISO or ANSI. Instead, the W3C simply makes "Recommendations."

What does "Web standard" mean? What is a "Recommendation"?

W3C publishes documents that define Web technologies. These documents follow a process designed to promote consensus, fairness, public accountability, and quality. At the end of this process, W3C publishes Recommendations, which are considered Web standards.

W3C and HTML

If HTML was invented by Tim Berners-Lee in 1990 and the W3C was created in 1994, is either still relevant in 2013?

The W3C continues to be the source of recommendations that determine the nature of the modern web. Although it is not a rule making organization, in fact its recommendations carry the weight of standards. Two groups of people feel this weight: browser makers and code writers.

Because the W3C only makes recommendations, and in fact does not enforce standards of any kind, browser makers are free to take their time ensuring that their browsers can do the things that the W3C recommends. Browser makers are also free not to follow the W3C at all. Fortunately, over the 19 years of its existence, the W3C has seen general compliance with its recommendations throughout the web browser industry, which gives it the basis for its claim, "which are considered Web standards."

HTML

One of those standard technologies is HTML, a technology that started before the W3C, but since 1994 has been shaped and changed by W3C recommendations.

The W3C hosts a number of detailed timelines and histories of the internet, the W3C, and HTML. One of my favorites is a history of HTML by Dave Raggett, published in 1998 in a now out-of-print book, Raggett on HTML 4. Fortunately, chapter 2, "A History of HTML," is preserved on Dave's W3C web site. It is a great read from someone who was a part of the actual early days of the web and a contributor to its development and improvement. The chapter outlines the history from the 1980s through 1998.

Raggett describes how several things that had been developed by 1991 converged to make HTML on the internet possible. These include the internet itself (physical infrastructure and the ability to transmit information via a standard transfer protocol, TCP/IP), the HyperCard concept on Apple's Macintosh computer (on-screen buttons allowed users to navigate between files), and the Domain Name System.

A program called Distributed Name Service (DNS) maps domain names onto IP addresses, keeping the IP addresses `hidden'. DNS was an absolute breakthrough in making the Internet accessible to those who were not computer nerds.

The original HTML invented by Tim Berners-Lee in 1990 (and all variations since) was globally acceptable because it was based on an already globally established model.

The HTML that Tim invented was strongly based on SGML (Standard Generalized Mark-up Language), an internationally agreed upon method for marking up text into structural units such as paragraphs, headings, list items and so on. SGML could be implemented on any machine. The idea was that the language was independent of the formatter (the browser or other viewing software) which actually displayed the text on the screen. The use of pairs of tags such as <TITLE> and </TITLE> is taken directly from SGML, which does exactly the same. The SGML elements used in Tim's HTML included P (paragraph); H1 through H6 (heading level 1 through heading level 6); OL (ordered lists); UL (unordered lists); LI (list items) and various others. What SGML does not include, of course, are hypertext links: the idea of using the anchor element with the HREF attribute was purely Tim's invention, as was the now-famous `www.name.name' format for addressing machines on the Web.

Basing HTML on SGML was a brilliant idea: other people would have invented their own language from scratch but this might have been much less reliable, as well as less acceptable to the rest of the Internet community. Certainly the simplicity of HTML, and the use of the anchor element A for creating hypertext links, was what made Tim's invention so useful.
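To make Raggett's description concrete, here is a minimal sketch of a page built only from the elements he names, written in the uppercase style of the period. The text content and the address in the HREF attribute are placeholders of my own, not taken from any early document.

```html
<!-- A sketch of an early-style HTML page; all text and the URL are placeholders -->
<TITLE>A Simple Page</TITLE>
<H1>Main Heading</H1>
<P>A paragraph of text with a
<A HREF="http://www.example.org/notes.html">hypertext link</A>
to another document.</P>
<UL>
<LI>First unordered item</LI>
<LI>Second unordered item</LI>
</UL>
<OL>
<LI>First numbered item</LI>
<LI>Second numbered item</LI>
</OL>
```

Everything here except the anchor element follows the SGML pattern of paired tags around a structural unit; the A element with its HREF attribute is the part that turns a marked-up document into hypertext.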

The original HTML of 1990 was intended for text documents only. It was not until April 1993 that Mosaic, the first widely popular graphical browser, was released with support for a new HTML image element (the <img> tag), which allowed web pages to include images as well as text.

In 1993 and 1994, groups all over the world were inventing new additions to HTML and writing code for browsers to support them. By July 1994, a specification including most of these new tags was developed as HTML 2. In the fall of 1994, Netscape Communications Corp. was formed by a former leader of the Mosaic project, launching a browser (eventually, Netscape Navigator) whose progressive elements made it a market leader for a long while. These included, for example, a new tag (<center>) to center elements and <layer> to permit placement of content within individual, placeable layers on the page. The layer tag was unique to Navigator and quickly disappeared in the next iteration of the browser, replaced by a new <div> tag, suggested by the W3C to create a division of the page. <center> was eventually deprecated (no longer included in the HTML recommendations: see CSS, next). Browser makers competed with new and distinct HTML elements, and the Browser Wars were on. Writing code that would work on all browsers became a near-impossible nightmare, just three years into the graphic web (i.e., post Mosaic) as we know it.

The W3C was formed in late 1994. Development of new tags and new browsers proceeded into 1995, the situation becoming increasingly chaotic. In March 1995, a draft of HTML 3 was released by the consortium, but disagreements over new elements and attributes led only to still further disagreements. Even something as basic as how to mark up tables caused controversy, which was not resolved until the release of HTML 3.2 in May 1996. Some browser makers elected to use some of the recommendations of HTML 3, claiming they were supporting the new standard, while others offered extensions, again claiming support for the same standard; but browsers remained distinct and incompatible. The nightmare continued for web developers.

CSS and HTML

It was not until December 1996 that the first recommendation for CSS 1 emerged, making possible a change in W3C philosophy: From CSS 1 forward, HTML would primarily provide structure around content, while CSS would be given the responsibility for rendering (for screen media, this means the visual appearance of the page). For example, the display of fonts was moved out of HTML per se when the <font> tag (which set the size, color, and family of fonts) was deprecated (in effect, removed) from HTML 4.01 in December 1999, with font properties thereafter determined in the style sheets. CSS 1 also provided recommendations for spaces around block elements (padding, borders, and margins between blocks).
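A small before-and-after sketch may help show what that division of labor looks like in practice. The class name "note", the colors, and the measurements are my own illustrative choices, not values from any recommendation.

```html
<!-- Before: presentation mixed into the markup with the deprecated <font> tag -->
<p><font face="Georgia" size="4" color="#336699">Styled the old way</font></p>

<!-- After: HTML carries only structure; a style sheet carries the appearance -->
<style type="text/css">
  p.note {
    font-family: Georgia, serif;  /* replaces face="..." */
    font-size: 1.2em;             /* replaces size="..." */
    color: #336699;               /* replaces color="..." */
    padding: 10px;                /* space between the text and the border */
    border: 1px solid #999999;    /* the border itself */
    margin: 20px 0;               /* space between this block and its neighbors */
  }
</style>
<p class="note">Styled with CSS</p>
```

The second paragraph's markup says only what the content is; changing how every such paragraph looks then means editing one rule rather than every page.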

CSS 2 followed as a recommendation in May 1998, adding positioned elements, a specification for downloadable fonts (see Google Fonts), and new style rules for printers and speaking (aural) browsers (which could read text from an HTML page).
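As a rough sketch of those CSS 2 additions, the style sheet below uses one rule of each kind. The id "sidebar", the family name "ExampleFace", and the font file name are hypothetical placeholders, and the aural rule is shown only for its syntax.

```html
<style type="text/css">
  /* Positioned element: removed from normal flow and pinned to the top right */
  #sidebar { position: absolute; top: 0; right: 0; width: 200px; }

  /* Downloadable font: the family name and file name are placeholders */
  @font-face {
    font-family: "ExampleFace";
    src: url("examplefont.ttf");
  }
  h1 { font-family: "ExampleFace", serif; }

  /* Media-specific rules: hide the sidebar when the page is printed */
  @media print {
    #sidebar { display: none; }
  }

  /* Aural rule in CSS 2 syntax, for browsers that speak the page */
  @media aural {
    h1 { volume: loud; }
  }
</style>
```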

Never resting, the W3C issued still another recommendation, for XHTML 1, in January 2000. The "X" stood for eXtensible, and the recommendation was based on XML, the eXtensible Markup Language, a more restricted subset of the Standard Generalized Markup Language (SGML) behind HTML. Simply put, it was the W3C's attempt to tighten up a specification, meant to apply to both browser makers and coders.
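To give a sense of what "tightening up" meant for coders, here is a minimal sketch of a page written to the XHTML 1.0 Strict rules. The title, text, class name, and image file name are placeholders.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>An XHTML page</title>
</head>
<body>
  <!-- Element and attribute names are lowercase; attribute values are quoted -->
  <p class="intro">Every element is closed explicitly.</p>
  <!-- Empty elements close themselves with a trailing slash -->
  <p>Line one<br />line two</p>
  <p><img src="photo.jpg" alt="A placeholder image" /></p>
</body>
</html>
```

Habits that older HTML tolerated, such as uppercase tags, unquoted attribute values, or an unclosed paragraph, are errors under these rules.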

Work on CSS 3 began with releases in 1999, beginning a modular approach to future recommendations. No longer would a single overall recommendation be released; instead, targeted advances would be released piecemeal. Changes in borders, for example, added the "border-radius" property to permit nicely rounded corners (used on this page for quoted materials).
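A minimal sketch of the border-radius property applied to quoted material follows. The specific values are assumptions for illustration and are not necessarily the ones used on this page.

```html
<style>
  blockquote {
    border: 1px solid #999999;
    border-radius: 10px;  /* browsers without this CSS 3 property skip it and draw square corners */
    padding: 10px;
  }
</style>
<blockquote>Quoted material in a box with rounded corners.</blockquote>
```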

What's next? The W3C says it will release a stable version of HTML5 (note that the preferred usage includes no space between HTML and the 5) by the end of 2014. HTML5 has been in the W3C recommendation draft stage for a while, however, and many of the preliminary recommendations are already appearing in browsers. HTML5 will (hopefully) mark the end of an internal W3C battle over whether to focus recommendations on developers or on browser makers. HTML5 reverses the trend toward the stricter, more formal rules defining both browser and coding practices under XHTML, and instead suggests that while coders should continue practices advocated under XHTML, browser makers should relax restrictions and cope with many of the HTML elements and practices left in the myriad older pages still populating the web.

Keeping Up

As a web developer, the challenge is to keep up with an ever-advancing array of HTML and CSS features, dropping outdated, deprecated elements and properties while experimenting with new ones. Developers must be mindful of the current state of the dominant new browsers (as well as slightly older browsers that users have not yet updated). Can the majority of browsers handle the latest and greatest W3C recommendation? If not, what will happen if the feature is not available on a particular browser: will the page break, or will it "degrade gracefully"? Students (and their professors) must learn to check browser compatibility, which is documented on the tutorial pages at W3Schools, www.w3schools.com (which, despite its name, is not affiliated with the W3C), or on sites such as caniuse.com.
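One common way to get graceful degradation in CSS is to declare a widely supported value first and the newer one after it. The sketch below assumes a hypothetical class name "banner" and arbitrary colors.

```html
<style>
  .banner {
    background: #336699;  /* solid color that every browser understands */
    background: linear-gradient(to bottom, #6699cc, #336699);  /* newer browsers use this instead */
    color: #ffffff;
    padding: 10px;
  }
</style>
<div class="banner">Readable with or without gradient support</div>
```

A browser that does not recognize the gradient value discards that one declaration and keeps the solid color, so the page still renders sensibly.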

Last year, for example, I suggested that students should begin using a partial set of HTML5 features, including the HTML5 document type declaration (<!DOCTYPE html>, a line at the top of the page that allows newer browsers to recognize HTML5), the new <nav> element as a container for navigation, and CSS styles for opacity and corner rounding. We held back on use of the placeholder HTML attribute (which places greyed-out placeholder text inside form boxes to suggest what to enter) and background linear and radial gradients because not enough browsers would recognize them. This year, students are having fun incorporating these features into their pages, confident that the new stuff will now work in the latest major browsers (while coding with an understanding that many people still use slightly older versions).
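Put together, a page using several of these features might look like the sketch below. The file names, link text, and form field are invented for illustration and are not taken from any student's page.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>HTML5 sketch</title>
  <style>
    nav { opacity: 0.9; }  /* slightly translucent navigation bar */
  </style>
</head>
<body>
  <!-- The nav element marks this group of links as site navigation -->
  <nav>
    <a href="index.html">Home</a>
    <a href="about.html">About</a>
  </nav>
  <form action="#" method="get">
    <label for="email">Email</label>
    <!-- placeholder shows greyed-out hint text until the user types -->
    <input type="text" id="email" name="email" placeholder="you@example.com">
  </form>
</body>
</html>
```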

And so, I continue to teach HTML (web developers use the generic "HTML" to refer to the entire spectrum of HTML technologies, all of which are still being used on servers and browsers somewhere on the world wide web today). Does that make me incompetent or hopelessly old-fashioned? Of course not. It is part of the work of being a web developer to keep up with a constant stream of changes in HTML, but so long as the backbone of the world wide web remains HTML and CSS, it's what students need to be taught.

My concerns about WordPress are not that it is an inappropriate competitor for HTML. WordPress uses PHP server-side coding and a MySQL database to enter and store, and later retrieve and display, content entered by a user. PHP produces HTML and CSS for client-side browsers, pages that use exactly the same HTML and CSS as the pages my students learn to create; otherwise, browsers could not render them.

WordPress takes users back to a pre-CSS world, where "writers of Web pages were complaining that they didn't have enough influence over how their pages looked," as Lie and Bos put it in their 1999 history of CSS. By locking users into pre-designed and unalterable "themes" (preset style sheets provided by institutional designers) and making it impossible to alter style except through the confining alternatives built into a WordPress application, page builders give up creativity and originality to institutional control. Lots of people like to do paint-by-numbers, as I'd said previously. Lots of people hire a contractor and skilled craftspersons to build their houses, too, and then proclaim "I built my house." But they didn't. WordPress is no different. HTML, CSS, and JavaScript remain the essential core technologies of web pages read by modern browsers. If you wish to become a web developer, they are the most important technologies to learn.
