Re: OT Palm readers Re: [dvd-discuss] CTEA ProtectsWhatCopyrights?
"Arnold G. Reinhold" wrote:
> I'd argue for a simple subset of HTML. That would give you the best
> of both worlds. The source would still be in plain ASCII. Some
> simple "pretty printing" (lines wrapped to a standard length,
> paragraph tags on separate lines, headers on their own line, etc.)
> can make the HTML source <I>reasonably</I> readable as is. And, of
> course, it would be understood by existing Web browsers.
Twiki/email-isms could do the simple <I>, <em>, and <b> stuff. All I
really need though is two rules.
(1) What regexp constitutes a heading for the TOC (e.g. Chapter)
(2) What regexp constitutes a <p>
and define these in a "META" information section in the PG "small
print":
META HEADING1="^\s+Chapter\s+\d+"
META PARAGRAPH="^$"
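For what it's worth, here is a minimal sketch of how a conversion tool might consume those two rules. It uses Python's `re` module rather than Perl, and the names `parse_meta` and `to_html` are made up for illustration; nothing like this exists in PG as far as this message says. It assumes each declaration sits on its own line in the form META NAME="regexp".

```python
import re
from html import escape

# Assumed declaration syntax: META NAME="value", one per line.
META_LINE = re.compile(r'^META\s+(\w+)\s*=\s*"(.*)"\s*$')

def parse_meta(lines):
    """Collect META NAME="value" declarations into a dict."""
    meta = {}
    for line in lines:
        m = META_LINE.match(line)
        if m:
            meta[m.group(1)] = m.group(2)
    return meta

def to_html(text, meta):
    """Return (toc, html) for plain text, driven by the two META rules."""
    heading = re.compile(meta["HEADING1"])
    para_break = re.compile(meta["PARAGRAPH"])
    toc, out, para = [], [], []

    def flush():
        # Emit the lines gathered so far as one paragraph.
        if para:
            out.append("<p>" + escape(" ".join(para)) + "</p>")
            para.clear()

    for line in text.splitlines():
        if heading.search(line):
            flush()
            title = line.strip()
            toc.append(title)          # rule (1): heading goes in the TOC
            out.append("<h1>" + escape(title) + "</h1>")
        elif para_break.search(line):
            flush()                    # rule (2): separator ends a paragraph
        else:
            para.append(line.strip())
    flush()
    return toc, "\n".join(out)
```

Feeding it the two META lines above plus a text whose chapter headings are indented would yield a list of headings for the TOC and one <p> per blank-line-separated block.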
Actually, one could implement other Twiki/email-ism transformations as
s/pat/repl/. Here is the regexp
s{(?<!\w)\*(\S+?)\*(?!\w)}{<I>$1</I>}g
that would map *italic* to <I>italic</I> (if I haven't fumble fingered
it). This uses a regexp subset to allow flexible plain text encoding of
simple tagging without imposing a coding standard on the PG volunteers
-- and without embedded tags in the content. A standard encoding set
COULD be defined, used, and declared
META TAG_ENCODING="PG_meta_v1.0"
or
META TAG_ENCODING="Twiki_v2.1"
allowing for any number of implementations for conversion tools to
non-plain text ebook, HTML, RTF, whatever, without encumbering the
"small print" section with inscrutable (and fragile) regexp lists.
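To make the named-set idea concrete, here is a rough sketch in Python (standing in for the Perl substitution syntax). The registry `TAG_ENCODINGS` and the helpers `declared_encoding` and `apply_encoding` are hypothetical, and the contents of "PG_meta_v1.0" are an assumption: only the *italic* rule comes from this message, and no such set has actually been defined.

```python
import re

# Hypothetical registry: encoding-set name -> ordered (pattern, replacement)
# rules. The single rule shown is the *italic* mapping from this message; the
# lookarounds keep it from firing inside things like 2*3*4.
TAG_ENCODINGS = {
    "PG_meta_v1.0": [
        (r"(?<!\w)\*(\S+?)\*(?!\w)", r"<I>\1</I>"),
    ],
}

def declared_encoding(small_print_lines):
    """Return the TAG_ENCODING name declared in the small print, or None."""
    pat = re.compile(r'^META\s+TAG_ENCODING\s*=\s*"([^"]*)"\s*$')
    for line in small_print_lines:
        m = pat.match(line)
        if m:
            return m.group(1)
    return None

def apply_encoding(text, name):
    """Apply every rule of the named encoding set, in order."""
    for pattern, repl in TAG_ENCODINGS[name]:
        text = re.sub(pattern, repl, text)
    return text
```

The point of the indirection is that the etext carries only the set's name, while each converter (HTML, RTF, whatever) keeps its own rule list for that name.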
Finally, one could do a CDDB-like metatag database for PG files instead
of embedding the encoding information (depending on PG's willingness to
support such a thing).
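One possible reading of "CDDB-like", sketched in Python: key the metatags on a fingerprint computed from the text itself, so the file needs no embedded declarations at all. Everything here is an assumption for illustration, including the choice of a SHA-1 hash over whitespace-normalized text and the names `fingerprint`, `register`, `lookup`, and `METATAG_DB`; a real service would need a shared store rather than a dict.

```python
import hashlib

def fingerprint(text):
    """Hash the text with whitespace runs collapsed, so that re-wrapping
    lines does not change the key."""
    normalized = " ".join(text.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

# Hypothetical in-memory stand-in for the shared database.
METATAG_DB = {}

def register(text, meta):
    """Store the metatags for a text under its fingerprint."""
    METATAG_DB[fingerprint(text)] = meta

def lookup(text):
    """Return the metatags registered for this text, or None."""
    return METATAG_DB.get(fingerprint(text))
```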