From: vasan999 on 22 Oct 2007 20:05

The site says that this will convert HTML to LaTeX. Can anyone explain this code to me? I am not familiar with such difficult commands, especially since there are no comments giving a line-by-line explanation or describing the overall operation.

1i\
\\documentstyle{article}
1i\
\\begin{document}
$a\
\\end{document}
# Too bad there's no way to make sed ignore case!
/<[Xx][Mm][Pp]>/,/<.[Xx][Mm][Pp]>/b lit
/<.[Xx][Mm][Pp]>/b lit
/<[Ll][Ii][Ss][Tt][Ii][Nn][Gg]>/,/<.[Ll][Ii][Ss][Tt][Ii][Nn][Gg]>/b lit
/<.[Ll][Ii][Ss][Tt][Ii][Nn][Gg]>/b lit
/<[Pp][Rr][Ee]>/,/<.[Pp][Rr][Ee]>/b pre
/<.[Pp][Rr][Ee]>/b pre
# Stuff to ignore
s?<[Ii][Ss][Ii][Nn][Dd][Ee][Xx]>??
s?</[Aa][Dd][Dd][Rr][Ee][Ss][Ss]>??g
s?<[Nn][Ee][Xx][Tt][Ii][Dd][^>]*>??g
# character set translations for LaTex special chars
s?>.?>?g
s?<.?<?g
s?\\?\\backslash ?g
s?{?\\{?g
s?}?\\}?g
s?%?\\%?g
s?\$?\\$?g
s?&?\\&?g
s?#?\\#?g
s?_?\\_?g
s?~?\\~?g
s?\^?\\^?g
# Paragraph borders
s?<[Pp]>?\\par ?g
s?</[Pp]>??g
# Headings
s?<[Tt][Ii][Tt][Ll][Ee]>\([^<]*\)</[Tt][Ii][Tt][Ll][Ee]>?\
\section*{\1}?g
s?<[Hh]n>?\\part{?g
s?</[Hh]n>?}?g
s?<[Hh]1>?\\section*{?g
s?</[Hh][0-9]>?}?g
s?<[Hh]2>?\\subsection*{?g
s?<[Hh]3>?\\subsubsection*{?g
s?<[Hh]4>?\\subsubsection*{?g
s?<[Hh]5>?\\paragraph{?g
s?<[Hh]6>?\\subparagraph{?g
# UL is itemize
s?<[Uu][Ll]>?\\begin{itemize}?g
s?</[Uu][Ll]>?\\end{itemize}?g
s?<[Ll][Ii]>?\\item ?g
# DL is description
s?<[Dd][Ll]>?\\begin{description}?g
s?</[Dd][Ll]>?\\end{description}?g
# closing delimiter for DT is first < or end of line which ever comes first NO
#s?<[Dd][Tt]>\([^<]*\)<?\\item[\1]<?g
#s?<[Dd][Tt]>\([^<]*\)$?\\item[\1]?g
#s?<[Dd][Dd]>??g
s?<[Dd][Tt]>?\\item[<?g
s?<[Dd][Dd]>?]?g
# Other common SGML markup. this is ad-hoc
s?<sec[ab]>??
s?</sec[ab]>??g
# Italics
s?<it>\([^<]*\)</it>?{\\it \1 }?g
# Get rid of Anchors
:pre
s?<[Aa][^>]*>??g
s?</[Aa]>??g
# This is a subroutine in sed, in case you are not a sed guru
: lit
s?<[Xx][Mm][Pp]>?\\begin{verbatim}?g
s?</[Xx][Mm][Pp]>?\\end{verbatim}?
s?<[Ll][Ii][Ss][Tt][Ii][Nn][Gg]>?\\begin{verbatim}?g
s?</[Ll][Ii][Ss][Tt][Ii][Nn][Gg]>?\\end{verbatim}?

On Oct 22, 2:57 pm, vasan...(a)hotmail.com wrote:
> Basically, it should do everything that any of the tools below do and,
> in addition,
>
> 1/
> human readable output that maintains the text lines of the source, ie
> does not scramble the text lines or insert newlines unnecessarily or
> remove them. Inserts minimal LaTeX elements.
>
> 2/
> maintains cross-links, ie converts <href to \ref and <name= to \label
>
> but if the set of HTML files is incomplete, proceed with the assumption
> that the reference is there, ie don't delete the links or try to modify
> them or their addresses. One of the tools I tested is too smart in this
> respect and actually ruins the result.
>
> 3/
> proper conversion of images, tables, etc. No math mode involved in
> HTML.
>
> 4/
> Even an emacs lisp function could be written by a guru that can do the
> job.
>
> 5/
> Is there any commercial WYSIWYG tool?
>
> LaTeX etc
>
> * html2latex is a program based on the NCSA html parser. Contact:
> Nathan.Torking...(a)vuw.ac.nz.
> * Another html2latex can combine several HTML files into a single
> LaTeX file, converting links between the files to references. External
> URLs can be converted into footnotes or into a bibliography sorted on
> URL. Contact: F.J.Fa...(a)cs.utwente.nl (Frans J. Faase)
> * Another html2latex implemented on Linux by yacc+lex+C. Also
> available from the TSX-11 Linux FTP site as nc-html2latex-0.97.tar.gz.
> Contact: naoc...(a)naochan.com (Naoya Tozuka)
> * htmlatex.pl is a perl script to do the conversion (may be moving
> soon). Contact: n9146...(a)cc.wwu.edu (Jake Kesinger)
> * There is also a sed script to convert HTML into LaTeX.
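
As a rough aid to reading the sed script at the top of this post, here is a minimal sketch of the three ideas it relies on: `s?old?new?g` substitutions that use `?` as the delimiter instead of `/`, bracket pairs such as `[Hh]` to match either case, and an address range followed by `b label` to jump over the rules for preformatted text. The function name `html2latex_demo` and the label `skip` are made up for this illustration, and only a handful of rules are copied from the script. Unlike the original, which branches to `pre` or `lit` and still does some processing there, this reduced version simply leaves the skipped lines untouched.

```shell
#!/bin/sh
# Hypothetical reduced version of the quoted script, for illustration only.
html2latex_demo() {
  # 1. Lines from <pre> through the closing </pre> branch to the label
  #    "skip" at the end, so none of the substitutions touch them.
  # 2. LaTeX special characters are escaped before any tags are converted,
  #    so the braces inserted by later rules are not escaped themselves.
  # 3. Tags are then rewritten; [Hh] and [Ll][Ii] match either case.
  sed -e '/<[Pp][Rr][Ee]>/,/<.[Pp][Rr][Ee]>/b skip' \
      -e 's?%?\\%?g' \
      -e 's?_?\\_?g' \
      -e 's?<[Hh]1>?\\section*{?g' \
      -e 's?</[Hh][0-9]>?}?g' \
      -e 's?<[Ll][Ii]>?\\item ?g' \
      -e ':skip'
}

# Sample input: a heading, a list item, and a preformatted block.
printf '%s\n' \
  '<H1>Results</H1>' \
  '<li>50% faster' \
  '<pre>' \
  'a_b' \
  '</pre>' | html2latex_demo
# Prints:
# \section*{Results}
# \item 50\% faster
# <pre>
# a_b
# </pre>
```

The `1i\` and `$a\` commands at the start of the full script work on the same line-by-line basis: they insert text before line 1 and append text after the last line, which is how the document preamble and `\end{document}` get wrapped around the converted body.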