A Basic HTML and OCR Tutorial

[ How to Comment in HTML ]  [ Simple things ]  [ Things to Remember ]
[ Inline elements ]  [ Block-level elements ]  [ Character references ]  [ Mixing and nesting ]  [ Useful Tools ]  [ Tesseract OCR on Linux ]

So many have asked how to leave comments as HTML that we decided to show you.

It's not hard, and you can reference this any time you forget. You don't have to use everything we show you on this page. If all you want to do is leave a link to a story you think is interesting, you can just do that. But some may want to do a bit more than that, so here's a quick crash course which Erwan and I -- mostly Erwan -- put together for you.

You don't have to use any of it. Plain text is fine with me. What matters is that you contribute what you know or find. This tutorial is only if you wish to know how. We also explain what works on Groklaw and what doesn't, for those who already know HTML but bump into our quirks.

You can use the links above to jump to the part that interests you. Leaving a link is covered in the first one, "How to Comment in HTML", in the section "The Simple Things". Together with the second link, "Things to Remember", that gives you the basics, which is really all you need to know for a simple HTML comment.

Allowed HTML in Groklaw Comments

To begin, when you click on Reply so as to post a comment, you will see this notice on the page:
Allowed HTML Tags:<p>, <b>, <i>, <u>, <strike>, <a>, <em>, <strong>, <br>, <tt>, <hr>, <li>, <ol>, <ul>, <sup>, <sub>, <blockquote>, [code]
Clickable links: <a href="http://www.example.com/">Like this</a>

This shows you which tags are allowed. That implies there are some things you can't do on Groklaw, even though HTML lets you do much more, because only a subset of HTML is allowed in Groklaw comments. I can use more tags in articles, but that's because I don't leave spam, if you know what I mean, and if I goof, I can fix it, which you can't. If any and every HTML tag were allowed, it would be easy to create comments that wreak havoc on Groklaw's screen layout.
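If you draft comments offline, you could check them against the allowed list before pasting them in. The following is only a sketch in Python: the tag set is copied from the notice above, while the names `ALLOWED_TAGS`, `TagCollector` and `disallowed_tags` are made up for this illustration. It says nothing about how Geeklog itself filters comments.

```python
from html.parser import HTMLParser

# Tag names copied from the "Allowed HTML Tags" notice above.
# The [code] marker is not an HTML tag, so it is left out here.
ALLOWED_TAGS = {
    "p", "b", "i", "u", "strike", "a", "em", "strong", "br", "tt",
    "hr", "li", "ol", "ul", "sup", "sub", "blockquote",
}


class TagCollector(HTMLParser):
    """Collects the name of every opening tag seen in a fragment."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.seen.append(tag)


def disallowed_tags(comment_html):
    """Return a sorted list of tag names that are not in the allowed list."""
    collector = TagCollector()
    collector.feed(comment_html)
    collector.close()
    return sorted(set(collector.seen) - ALLOWED_TAGS)


if __name__ == "__main__":
    draft = '<p>See <a href="http://www.example.com/">this</a></p><table><tr><td>x</td></tr></table>'
    print(disallowed_tags(draft))
```

An empty list means every tag in the draft appears in the notice. It does not mean the comment is otherwise correct, so Preview is still worth using.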

Still, this list allows you to add quite a few effects to your Groklaw comment layout. But note that you don't have to go for effects. Let's start with the basics.

The Simple Things

  • How to do a link:

    If you wish to leave a clicky link, you can do so following this example (cut, paste & edit recommended):

    <a href="http://www.example.com">a clicky link</a>.

    If your url is quite long, please find a natural break point, any punctuation such as / or & or = and right there, hit the return key to move to the next line. Otherwise, if you don't break it up, Geeklog will do it for you, and sometimes not ideally. It happens most often if you use the url as the description also.

  • Emphasis:

    If you want to make something bold, you can use <b> before the word or phrase you want to stand out and </b> after it. Same with italics, only you'd use <i> </i> instead of <b> </b>. Same with underlining, only you use <u></u>.

  • Creating line breaks:

    You might want to do this to separate two paragraphs, for example, or to make a list look right. You can either insert line breaks using <br> or enclose your paragraphs between <p> and </p>.

  • More options: Tables, footnotes, indentation, lines, lists:

    For most of us, a link and maybe some bold and line breaks are really all you ever need in an HTML comment. Don't forget the choice beneath the comment text area, where you must pick either plain text or HTML: choose HTML, and then either Preview, to check your work, or Submit, if you like to live on the edge. But that's not all you can do with HTML. Here are a few more options. You can use <sup> </sup> if you want a number or something up in the air, like a footnote. You can use <hr> to make a line across the page, like this:


    You can have indented sections in a quotation from some article like this:

    Text you have that says something
    about the article:

    Text from the article.

    Your words continue.

    <p>Text you have that says something about the article:</p>
    <blockquote>
    Text from the article.</blockquote>
    <p>
    Your words continue.</p>

    This makes the quoted section indented and lightly shaded, so it makes it easy to see what is a quotation and what is your comment.

    If you want to list things so they have a dot in front of each line, it's easy. Just do this:

    Here's my list:
    • your first item
    • your second item
    Here's my list:
    <ul>
    <li>
    your first item</li>
    <li>
    your second item</li>
    </ul>

    You can put it in blockquotes if you want to shade and indent it:

    Here's my list:
    • your first item
    • your second item
    Here's my list:
    <blockquote><ul>
    <li>
    your first item</li>
    <li>
    your second item</li>
    </ul></blockquote>
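If you write your comment as plain text first, the line-break advice above can be applied mechanically. Here is a rough sketch in Python, assuming blank lines separate paragraphs and single newlines are meant as line breaks. The function name `text_to_html` is invented for this example, and the sketch does not escape "&" or "<" in your text.

```python
import re


def text_to_html(text):
    """Wrap blank-line-separated blocks in <p> and </p>; join inner lines with <br>."""
    # One or more blank lines (possibly containing spaces) end a paragraph.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        lines = [line.strip() for line in block.splitlines()]
        # A single newline inside a paragraph becomes an explicit <br>.
        paragraphs.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(text_to_html("First line\nsecond line\n\nA new paragraph"))
```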


Things to Remember

Things to remember about HTML comments:

  1. Post Mode must be set to HTML Formatted. You can check for errors by using the "Preview" button or you can just click "Submit" if you like to live dangerously. But if you check your link before using the "Submit Comment" button, you'll spare us a lot of wasted energy if you happened to make a mistake, which isn't hard to do in urls. The safest way to make sure you get it right is to copy and paste the url, but you can still make mistakes, like using a colon instead of a double quote. I do that a lot. I don't know why. If you preview, you will see your url has disappeared, if you make a mistake like that, so it's worth checking.
  2. Check opening & closing double quotes around your URL: "blah.com" not 'blah.com'
  3. It won't work on Groklaw unless you include the http:// part of the address. By that I mean, you can't just type www.groklaw.net or just groklaw.net. If you do, it gets stripped out and disappears.
  4. Remember the closer after your link description, </a>. Except for <br> and <hr>, all HTML tags must have a closer, although </p> and </li> are optional.
  5. No spaces in the url string.
  6. Though case does not matter in the domain name, it does for any file location or parameter details that may follow the / after the domain name. So it would be http://www.groklaw.net/About if the page About was capitalized.
  7. In preview mode, it may look like your signature is flowing into the end of your post. Final post will be OK. Ignore.
  8. If you want line breaks, use <br>. Alternatively, you can use <p> to start a new paragraph. Paragraphs are separated by a wider line space. If you forget to insert <br> or <p> all text will run together.
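Several of the points above (double quotes around the URL, the http:// prefix, no spaces, the closing </a>) can be taken care of by building the link with a small helper instead of typing it by hand. This is a sketch only: `make_link` is a made-up name, and accepting https:// as well as http:// is an assumption of this example, since the list above mentions only http://.

```python
import html


def make_link(url, description):
    """Build an <a> element, or raise ValueError for the mistakes listed above."""
    # Assumption: https:// is accepted alongside http://.
    if not url.startswith(("http://", "https://")):
        raise ValueError("URL must include the http:// part")
    if any(ch.isspace() for ch in url):
        raise ValueError("URL must not contain spaces")
    # quote=True also escapes double quotes, so the href attribute stays intact.
    safe_url = html.escape(url, quote=True)
    safe_text = html.escape(description, quote=False)
    # Double quotes around the URL, and the closing </a> is always present.
    return '<a href="' + safe_url + '">' + safe_text + "</a>"


if __name__ == "__main__":
    print(make_link("http://www.example.com/", "Like this"))
```

Note that the helper writes "&" in a URL as "&amp;", which is how an ampersand is coded inside HTML.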


Now let's dig into the details a bit. From here on, each example shows how it looks first, under "Rendering", and the HTML that produces it second, under "HTML code". It's all Erwan, as you can tell by how he teases me about my HTML:

Inline elements

Inline elements are HTML tags that affect a subset of the text within a block, for example a few words in a sentence or a sentence in a paragraph.

Here are Groklaw comments' allowed inline elements:

Rendering                     HTML code
Clicky                        <a href="http://www.example.com">Clicky</a>
bold                          <b>bold</b>
underlined                    <u>underlined</u>
italics                       <i>italics</i>
strike                        <strike>strike</strike>
teletype or monospaced text   <tt>teletype or monospaced text</tt>
Line                          Line<br>break
break
emphasized                    <em>emphasized</em> <!-- by default this is the same as <i> for italic but this could be changed in a CSS file -->
strong                        <strong>strong</strong> <!-- by default this is the same as <b> for bold but this could be changed in a CSS file -->
Textsuperscript               Text<sup>superscript</sup>
Textsubscript                 Text<sub>subscript</sub>

To learn more see the block-level and inline elements documentation on www.w3.org.


Block-level elements

Block-level elements are HTML tags that delimit blocks. Roughly speaking, a block is an on-screen rectangle. The block elements allowed in Groklaw comments will always force a newline.

Rendering HTML code

A paragraph.

And a second one.

<p>A paragraph.</p> <!-- Note that the closing </p> is optional -->
<p>And a second one.</p>
Here is a list:
  • One item.
  • Another item.
Here is a list:
<ul>
<li>One item.</li>
<li>Another item.</li>
</ul> <!-- Note that the closing </li> is optional -->
Here is an ordered list:
  1. Item one.
  2. Item two.
Here is an ordered list:
<ol>
<li>Item one.</li>
<li>Item two.</li>
</ol>

Text you have that says something
about the article:

Text quoted from the article.

Your words continue.

<p>Text you have that says something about the article:</p>
<blockquote>Text quoted from the article.</blockquote>
<p>Your words continue.</p>

A horizontal rule.


Separating two paragraphs.

<p>A horizontal rule.</p>
<hr>
<p>Separating two paragraphs.</p>


Character references

Character references are used to display special characters. The most obvious one is "<" that starts HTML <tags> and needs to be coded as "&lt;".

Character references start with "&". Therefore to display "&" you need to code it as "&amp;".

A little trick: If you want to turn a piece of plain text that contains some HTML examples into HTML code and have the HTML code show in the result as we have on this page, first replace all instances of "&" by &amp; then all instances of "<" by &lt;.
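The order of the two replacements matters, and a few lines of Python make that visible. This is a minimal sketch of the trick just described. The function name `show_as_code` is made up for this example.

```python
def show_as_code(text):
    """Escape text so that HTML examples in it display literally."""
    # Escape "&" first. If "<" were replaced first, the "&" in each freshly
    # written "&lt;" would then be turned into "&amp;lt;" by the second step.
    escaped = text.replace("&", "&amp;")
    escaped = escaped.replace("<", "&lt;")
    return escaped


if __name__ == "__main__":
    print(show_as_code("<b>AT&T</b>"))
```

Python's standard library also offers `html.escape`, which additionally replaces ">" and, by default, quote characters. Either approach should display correctly.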

Rendering   HTML code
&           &amp;
<           &lt;
>           &gt;
[   ]       [&nbsp; &nbsp;] <!-- non-breaking spaces -->
©           &copy;
®           &reg;

Thérê äre plentÿ õf øther esçåpe seqüences to give some exotic toûch to yoúr writings.

(Th&eacute;r&ecirc; &auml;re plent&yuml; &otilde;f &oslash;ther es&ccedil;&aring;pe seq&uuml;ences to give some exotic to&ucirc;ch to yo&uacute;r writings.)

To learn more see the character references documentation on www.w3.org. There are also lists of character entities such as this one and this other one.


Mixing and nesting

You can mix and nest all these <Tags> to create a Groklaw HTML message.

When nesting elements, always remember to close the innermost element first and outermost last. Therefore, closing </tags> should appear in reverse order from the respective opening <tags>.

Rendering HTML code
Here is an ordered list:
  1. Item one with a nested list.
    • A
    • B
  2. Item two.
Here is an ordered list:
<ol>
<li>Item one with a nested list.
   <ul>
   <li>A</li>
   <li>B</li>
   </ul>
</li>

<li>Item two.</li>
</ol>
When one wants to explain to a paralegal that
AT&T
should not be written
<u><i><b>AT&T</u></i></b> (1)(2)(3)
because the right HTML code is
<u><i><b>AT&amp;T</b></i></u>

one has to cope with recursive HTML tags escaping.


  1. & must be escaped as &amp;
  2. When nesting <Tags>, the closing </Tags> should appear in reverse order from the opening ones.
  3. However ugly this HTML is to a purist, it will work in most browsers that can cope with little human errors.

<p>When one wants to explain to a paralegal that</p>

<blockquote>
<u><i><b>AT&amp;T</b></i></u>
</blockquote>

<p>should not be written</p>

<blockquote>
&lt;u>&lt;i>&lt;b>AT&amp;T&lt;/u>&lt;/i>&lt;/b> <sup>(1)(2)(3)</sup>
</blockquote>

<p>because the right HTML code is</p>

<blockquote>
&lt;u>&lt;i>&lt;b>AT&amp;amp;T&lt;/b>&lt;/i>&lt;/u>
</blockquote>

<p>one has to cope with recursive HTML tags escaping.</p>

<hr>

<ol>
<li>&amp; must be escaped as &amp;amp;</li>
<li>When nesting &lt;Tags>, the closing &lt;/Tags> should appear in reverse order from the opening ones.</li>
<li>However ugly this HTML is to a purist, it will work in most browsers that can cope with little human errors.</li>
</ol>
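The reverse-order rule for closing tags can also be checked mechanically with a stack. The sketch below is an illustration in Python, not a replacement for a real validator. The names `NestingChecker` and `nesting_problems` are invented here, and <br>, <hr>, <p> and <li> are deliberately ignored because their closers are absent or optional.

```python
from html.parser import HTMLParser

# Tags with no closer or an optional closer are not tracked.
NO_CLOSER_NEEDED = {"br", "hr", "p", "li"}


class NestingChecker(HTMLParser):
    """Tracks open tags on a stack and records closers that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []  # (closer found, closer expected) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in NO_CLOSER_NEEDED:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in NO_CLOSER_NEEDED:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            # The innermost open element is being closed: correct order.
            self.open_tags.pop()
            return
        expected = self.open_tags[-1] if self.open_tags else None
        self.problems.append((tag, expected))
        # Drop the most recent matching opener so later closers are judged fairly.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index] == tag:
                del self.open_tags[index]
                break


def nesting_problems(fragment):
    """Return (found, expected) pairs; found is None for a tag never closed."""
    checker = NestingChecker()
    checker.feed(fragment)
    checker.close()
    unclosed = [(None, tag) for tag in checker.open_tags]
    return checker.problems + unclosed


if __name__ == "__main__":
    print(nesting_problems("<u><i><b>AT&amp;T</u></i></b>"))
    print(nesting_problems("<u><i><b>AT&amp;T</b></i></u>"))
```

For the badly nested AT&T example, the checker reports that </u> and </i> each arrived while </b> was still expected. The correctly nested version yields an empty list.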


Useful Tools

Most FLOSS text editors will offer HTML syntax coloring.

This static page has been created using the KDE Quanta Plus web development environment, which you should find in any Linux distribution.

If you want to go beyond a few clickies in your Groklaw comments, the Web Developer Firefox add-on is really great.

At first, you might feel lost in the HTML documentation at www.w3.org. It is however a really great source of information.

To check if your HTML code is right you can use w3.org's Markup Validation Service. The other option is to run the command line utility "tidy":

erwan@bg-corblin:~/Temp/Groklaw$ tidy -e Guidelines04.html
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: inserting missing 'title' element

Info: Document content looks like HTML 4.01 Transitional
2 warnings, 0 errors were found!


You are recommended to use CSS to specify the font and
properties such as its size and color. This will reduce
the size of HTML files and make them easier to maintain
compared with using <FONT> elements.

To learn more about HTML Tidy see http://tidy.sourceforge.net
Please send bug reports to html-tidy@w3.org
HTML and CSS specifications are available from http://www.w3.org/
Lobby your company to join W3C, see http://www.w3.org/Consortium
erwan@bg-corblin:~/Temp/Groklaw$

The two warnings at the top of the output (the missing <!DOCTYPE> declaration and the missing 'title' element) always show up in HTML code that is meant to be inserted in a Groklaw page, whether as a story, a static page or a comment. They mean that tidy did not see the HTML document headers. These headers are taken care of by Geeklog.


Tesseract OCR on Linux

Google's Tesseract page, where you can download it and learn about it.

Groklaw's Instructions from 2006.

To use Tesseract OCR on Ubuntu Linux, you can follow these instructions, Tesseract-ocr: convert scanned images into editable documents on Linux.



This page was created by PJ and Erwan.


Last Updated Thursday, July 14 2011 @ 07:24 PM EDT


Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )