Yeah, that about sums up the differences. I suppose there are a few finer points, but those seem to be the key ones: mostly lower-case HTML tags, and being less forgiving about closing tags. It's essentially a "cleaned up" form of HTML, and is supposed to have fewer incompatible browser quirks. It also compresses better, since most page content has more lower-case characters than upper-case characters, so there is more redundancy between the page content and the markup tags. (Unless it's an AOL page...?) That, and it's easier to type lower-case tags.
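If you want to check the compression point for yourself, here's a rough Python sketch. The sample text and tag list are made up purely for illustration, and it uses zlib (DEFLATE) as a stand-in for whatever compression a web server would actually apply. It just prints the sizes so you can compare them; on a real page I'd expect any difference to be small.

```python
import zlib

# Hypothetical sample content: lower-case prose wrapped in a few common tags.
BODY = "the quick brown fox jumps over the lazy dog and keeps on running"
TAGS = ["html", "body", "table", "tr", "td", "p", "b"]


def build_page(upper_case_tags: bool) -> bytes:
    """Wrap the same lower-case prose in each tag, upper- or lower-cased."""
    parts = []
    for tag in TAGS:
        name = tag.upper() if upper_case_tags else tag
        parts.append(f"<{name}>{BODY}</{name}>")
    return "\n".join(parts).encode("ascii")


def compressed_size(page: bytes) -> int:
    """Size in bytes after DEFLATE compression at the highest level."""
    return len(zlib.compress(page, 9))


if __name__ == "__main__":
    for upper in (True, False):
        page = build_page(upper)
        label = "upper-case tags" if upper else "lower-case tags"
        print(f"{label}: {len(page)} raw, {compressed_size(page)} compressed")
```

Both versions have the same raw size, so any difference in the compressed size comes only from the case of the tags.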
There is an XHTML validator you can use to check your sites:
http://validator.w3.org/
I generally prefer XHTML over HTML, mostly because it seems to be the strictest set of rules, which usually means what you write is pretty much guaranteed to work anywhere. It's not technically a strict subset of HTML, though. For example, one difference I've come across is that "<BR>" is valid in HTML, whereas "<br />" is valid in XHTML, and each is invalid in the other format. Basically, XHTML forces you to close every tag, including ones that are always empty (usually with the shorter <tag /> syntax, instead of the longer <tag></tag> syntax). HTML tends to omit certain end tags when the tag never has any content. Oddly enough though, script tags still need to be given as <script></script> pairs, even if they're just referencing an external JavaScript file. It won't work if you try to use <script /> syntax.
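The "close every tag" rule is really just XML well-formedness, so you can get a quick feel for it with any XML parser. Here's a small Python sketch using the standard library's xml.etree.ElementTree; the helper name is_well_formed is just something I made up for the example. It only checks that tags are closed and nested properly. It is not a substitute for the W3C validator, which also checks things like the doctype and which elements are allowed where.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (all tags closed and nested)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    samples = [
        "<p>one<br />two</p>",     # XHTML style: empty element is self-closed
        "<p>one<br></br>two</p>",  # longer but equivalent form
        "<p>one<BR>two</p>",       # HTML style: the tag is never closed
    ]
    for sample in samples:
        print(sample, "->", is_well_formed(sample))
```

Keep in mind that a self-closed script tag also passes this check, since it's perfectly good XML. The script tag problem is about how browsers handle it, not about well-formedness.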
Btw, as for various web languages, here's how their purpose breaks down:
Client Side (parsed/used/executed by the browser):
1) Content and Structure: HTML, XHTML
2) Layout and Appearance: CSS
3) Behavior: JavaScript
Server Side (parsed/used/executed by the web server):
1) Flat files simply served as is: .html (HTML/XHTML), .css, .js
2) Executed, and standard output returned (usually the output is an HTML/XHTML page, but it could be an image, or any other file type):
PHP, Ruby, ASP, Perl, Python, ... pretty much anything you want to use really, including compiled binaries.
2a) Run as a CGI script (like starting a new program, and loading it from disk for each GET request, which can be a little slow). Any language can basically be run this way, usually scripting languages, sometimes compiled binaries. Since the process is isolated from the web server, it won't bring the web server down if it crashes.
2b) Run as an Apache (or other web server) module. These get loaded by the web server when it's started and stay in memory, so they should provide lower latency responses. If the module contains a bug which could cause it to crash, it can bring down the whole web server. PHP is typically run as a module. Note that the previous comment means a bug in the PHP interpreter, not in the PHP script. The PHP interpreter should handle problems with the PHP script by basically terminating the script, without crashing the interpreter, or Apache. Ruby also has a module, although, the use of mod_ruby is somewhat questionable, mainly due to a single shared interpreter instance, and caching of required files, so they won't pick up changes unless you restart the web server. There are other Ruby modules I've heard of but they seem to basically use...
2c) Run as a separate private internal web server, to which Apache makes proxy requests. To the client, it appears these requests are served directly by the main web server (usually Apache), but they are actually served by a more custom-built framework, which is usually tied to a specific language. For example, Ruby on Rails seems to use this approach. This would appear to have many of the same performance benefits as an Apache module, but perhaps with a bit of extra latency due to the proxying. Also, flaws in the secondary web server are isolated from the main web server, so if it crashes, it won't bring down the whole web server. It may, however, bring down that one app. I've heard that in the case of mod_rails, though, it may actually restart any failed secondary servers upon the next request that Apache gets.
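To make the CGI case (2a) a bit more concrete, here's a minimal sketch of what such a script could look like in Python. The function name build_response is my own invention for the example. The general idea is that the web server starts the script as a new process, passes the request details in environment variables (QUERY_STRING is one of the standard ones), and sends whatever the script writes to standard output back to the browser. Where the file goes and how it gets marked as executable depends on your server configuration, so treat this as an illustration rather than a drop-in script.

```python
import html
import os
import sys


def build_response(query_string: str) -> str:
    """Build a complete CGI response: headers, a blank line, then the body."""
    safe_query = html.escape(query_string)  # avoid echoing raw markup back
    body = (
        "<html><body>"
        f"<p>Query string: {safe_query}</p>"
        "</body></html>"
    )
    # The blank line separates the headers from the page content.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # Each request starts this script fresh, which is where the per-request
    # startup cost of CGI comes from.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

The module and proxy approaches (2b and 2c) would run much the same kind of page-building logic, just inside a process that stays in memory between requests.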
I doubt I'd get after anyone for not doing work. It's not like I have time to get a whole lot done anymore. Heck, I probably don't even have the time to chase after other people to get things done.
My intention is that it will be on a purely voluntary basis. I would expect people to just work on what interests them, and if something gets done and is reasonably polished, they can post an announcement or send a PM, so it can be reviewed and put up on the site.