Tuesday, July 19, 2011

Creating syntax highlighted code using Pygments

Highlighted source code is much more readable. You can get it quickly and on practically any platform (MS Windows, Linux or very often even on Chrome OS).

Compare this (where I at least tried to improve readability by using the monospaced Courier font):
for $doc in //Document
let $datestr := $doc/@time
let $datearr := tokenize($datestr, '/')
let $year := $datearr[3]
let $month := format-number(xs:int($datearr[2]), '#00')
let $day := format-number(xs:int($datearr[1]), '#00')
let $dateisostr := concat($year, '-', $month, '-', $day)
where $dateisostr > '2010-06-01' and $dateisostr < '2011-01-01'
(:
let $date := xs:date($dateisostr)
where $date > xs:date('2010-06-01') and $date < xs:date('2011-01-01')
:)
return $doc

with the highlighted version of the same code a few lines below.
How to achieve such a result?
Quick solution using the Pygments demo web site to highlight an XML document
  1. Go to the Pygments Demo
  2. Go to the bottom of the page
  3. Enter some description and select XML in Language.
  4. Paste the XML to highlight into the window (optionally, upload the file)
  5. Click "Highlight"
  6. You will see the highlighted result and have the option to use another highlighting style
  7. Copy and paste the result into your text. The result could look like this (using the "default" style):
<Documents>
       <Document id="1" time="5/2/2010">
       </Document>
       <Document id="2" time="4/8/2011">
       </Document>
       <Document id="3" time="6/9/2010">
       </Document>
       <Document id="4" time="8/10/2010">
       </Document>
</Documents>
In most situations this will be the fastest solution. Check the list of available languages and you will see that the choice is really wide and very likely to fulfil your needs.
However, in some cases the language you need is not listed there. At the moment, this is the case for XQuery.
The list of languages supported by Pygments shows that XQuery is supported, but for some reason it is not offered on the demo web site.
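Once Pygments is installed locally (installation is described further below), you can also ask it directly whether it knows a lexer for a given file name. The following is only a minimal sketch: the helper name lexer_name_for is made up for illustration, and it returns None both when Pygments is not installed and when no lexer matches.

```python
def lexer_name_for(filename):
    """Return the name of the Pygments lexer matching filename, or None."""
    try:
        from pygments.lexers import get_lexer_for_filename
        from pygments.util import ClassNotFound
    except ImportError:
        return None  # Pygments is not installed
    try:
        return get_lexer_for_filename(filename).name
    except ClassNotFound:
        return None  # no lexer is registered for this file name


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the lookup
    for name in ("date.xqy", "documents.xml"):
        print(name, "->", lexer_name_for(name))
```

From the console, the -N option of pygmentize (shown in the help output later in this post) does a similar lookup based on the file name.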

Quick solution using the Pygments command line script
Assuming you have Python and the pygmentize script installed (I will come to this very soon), you can do this:
  1. Go to the list of lexers and find which file extension your language uses. In the case of XQuery it is *.xqy and *.xquery
  2. Write the code to highlight into a text file and save it with the file extension for your language. I used the file name date.xqy
  3. Open a console and change into the folder where you have your file saved
  4. Run the following command (it creates the resulting HTML file with the default style embedded in it):
    pygmentize -f html -O full ./date.xqy > date.xqy.html
  5. Open the resulting date.xqy.html file in your web browser
  6. Copy and paste the code into your text.
The result could look like this:
for $doc in //Document
 let $datestr := $doc/@time
 let $datearr := tokenize($datestr, '/')
 let $year := $datearr[3]
 let $month := format-number(xs:int($datearr[2]), '#00')
 let $day := format-number(xs:int($datearr[1]), '#00')
 let $dateisostr := concat($year, '-', $month, '-', $day)
 where $dateisostr > '2010-06-01' and $dateisostr < '2011-01-01'
(:
 let $date := xs:date($dateisostr)
 where $date > xs:date('2010-06-01') and $date < xs:date('2011-01-01')
:)
return $doc
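
The command from step 4 can also be scripted, which helps when several files need highlighting. The following is a minimal sketch and not part of the workflow described above: the helper names are made up for illustration, it assumes Python 3.5 or newer (for subprocess.run and shutil.which), and it uses the -o option of pygmentize instead of shell redirection. The command is only run if pygmentize is found on PATH.

```python
import shutil
import subprocess


def build_pygmentize_command(source_file):
    """Build the argument list for highlighting source_file into a full HTML page."""
    output_file = source_file + ".html"
    return ["pygmentize", "-f", "html", "-O", "full", "-o", output_file, source_file]


def highlight_file(source_file):
    """Run pygmentize on source_file; return the output file name, or None if unavailable."""
    if shutil.which("pygmentize") is None:
        return None  # pygmentize is not on PATH
    command = build_pygmentize_command(source_file)
    subprocess.run(command, check=True)  # raises CalledProcessError if pygmentize fails
    return command[command.index("-o") + 1]


if __name__ == "__main__":
    # Print the command that would be run for the date.xqy example
    print(" ".join(build_pygmentize_command("date.xqy")))
```

Since no lexer is given with -l, pygmentize guesses it from the file extension, just as in the command above.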

Installing Pygments on MS Windows
Here I assume you are not familiar with Python and just want to have it running. The steps are as follows:
  1. Install Python.
    1. Go to http://python.org/download/ and download the Windows installer for a recent stable version. The 32-bit version is recommended. I used version 2.6; version 2.7.* should work too.
      If you already have Python installed, skip this step (it should work with any version since 2.4).
    2. Run the Python installer.
    3. My installation ended up in the C:\Python26 folder.
    4. Add C:\Python26\Scripts to your PATH system variable (of course, use your own path to the Scripts folder). Here are instructions on how to modify PATH.
  2. Install easy_install (a utility for installing Python programs)
    1. Go to http://pypi.python.org/pypi/setuptools
    2. Go to the Downloads section, pick the exe installer matching the Python version on your system, and run it.
    3. Now you should be able to call the easy_install command.
      Open a console and try calling easy_install:
      $ easy_install
      install_dir C:/Python26/Lib/site-packages/
      error: No urls, filenames, or requirements specified (see --help)
    4. The error is OK; we should be happy that the command told us something. If the console complains that the command is not available, make sure it is really installed and that your PATH really points to the Python Scripts folder.
  3. Install Pygments itself.
    1. Run easy_install and ask it to install pygments. Internet connectivity is required; it does not matter which folder you are currently in.
      $ easy_install pygments
      install_dir C:/Python26/Lib/site-packages/
      Searching for pygments
      Best match: Pygments 1.4
      Adding Pygments 1.4 to easy-install.pth file
      Installing pygmentize-script.py script to C:/Python26/Scripts
      Installing pygmentize.exe script to C:/Python26/Scripts
      
      Using c:/python26/lib/site-packages
      Processing dependencies for pygments
      Finished processing dependencies for pygments
      
      $
  4. Now you should be ready to call the pygmentize command. Run it without any parameters and you will see the help.
    $ pygmentize
    Usage: pygmentize [-l <lexer> | -g] [-F <filter>[:<options>]] [-f <formatter>]
              [-O <options>] [-P <option=value>] [-o <outfile>] [<infile>]
    
           pygmentize -S <style> -f <formatter> [-a <arg>] [-O <options>] [-P <option=value>]
           pygmentize -L [<which> ...]
           pygmentize -N <filename>
           pygmentize -H <type> <name>
           pygmentize -h | -V
    
    Highlight the input file and write the result to <outfile>.
    
    If no input file is given, use stdin, if -o is not given, use stdout.
    
    <lexer> is a lexer name (query all lexer names with -L). If -l is not
    given, the lexer is guessed from the extension of the input file name
    (this obviously doesn't work if the input is stdin).  If -g is passed,
    attempt to guess the lexer from the file contents, or pass through as
    plain text if this fails (this can work for stdin).
    
    Likewise, <formatter> is a formatter name, and will be guessed from
    the extension of the output file name. If no output file is given,
    the terminal formatter will be used by default.
    
    With the -O option, you can give the lexer and formatter a comma-
    separated list of options, e.g. ``-O bg=light,python=cool``.
    
    The -P option adds lexer and formatter options like the -O option, but
    you can only give one option per -P. That way, the option value may
    contain commas and equals signs, which it can't with -O, e.g.
    ``-P "heading=Pygments, the Python highlighter".
    
    With the -F option, you can add filters to the token stream, you can
    give options in the same way as for -O after a colon (note: there must
    not be spaces around the colon).
    
    The -O, -P and -F options can be given multiple times.
    
    With the -S option, print out style definitions for style <style>
    for formatter <formatter>. The argument given by -a is formatter
    dependent.
    
    The -L option lists lexers, formatters, styles or filters -- set
    `which` to the thing you want to list (e.g. "styles"), or omit it to
    list everything.
    
    The -N option guesses and prints out a lexer name based solely on
    the given filename. It does not take input or highlight anything.
    If no specific lexer can be determined "text" is returned.
    
    The -H option prints detailed help for the object <name> of type <type>,
    where <type> is one of "lexer", "formatter" or "filter".
    
    The -h option prints this help.
    The -V option prints the package version.
    
    
    $
    You are not required to read it all, but if you like...
  5. By now you have Pygments installed and can use it as described above in Quick...
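If one of the commands is reported as not available, it helps to see which of them the system can actually find on PATH. A minimal sketch of such a check follows; the helper name find_tools is made up for illustration, and the code assumes Python 3.3 or newer, because shutil.which does not exist in the Python 2.6 used in the steps above.

```python
import shutil


def find_tools(names=("python", "easy_install", "pygmentize")):
    """Map each command name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(name, "->", path or "NOT FOUND, check your PATH")
```

A command reported as not found usually means that the Scripts folder is missing from PATH or that the console was opened before PATH was changed.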
Installing Pygments on Linux
I do not work on Linux all the time, so I will give only brief instructions. They are my best guess at what works; you may need to adapt them to your real situation.
  1. There is usually no need to install Python; it is already part of most Linux distributions
  2. For the remaining steps, see Installation Instructions for Linux
Small problems to resolve
Unable to find DOS console highlighter
If you read the text above carefully, you will notice that what is written there differs a bit from what really happens on the DOS console. The only reason is that I did not find any directly working highlighter for the DOS command line and had to use the one for the Bash shell. For this reason I had to replace backslashes with forward slashes.
If you have a better working solution, I would be happy to hear about it.
Broken formatting after the text was published to this blog
This blog entry was originally written as an e-mail in the GMail web client and then automatically published by sending it to a dedicated e-mail address.
As a result, the entry was published, but a lot of the formatting was broken (lost linefeeds etc.). Copying and pasting into the blog editor helped, but this was not what I intended to do.
A possible solution, which I would like, is to write the text locally, convert it into HTML and then paste the HTML into the blog. This is still to be investigated and could be resolved by the next action described below, which points to Sphinx. I have seen many nice blogs with really nice code colouring and very precise formatting. Let me know if you have a simple and usable how-to for this type of task.
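
One way the "convert locally, then paste" idea could look for the code parts is to let Pygments produce an HTML fragment with inline styles, so that the colours do not depend on a stylesheet the blog may not have. This is only a sketch under assumptions: the helper name to_blog_html is made up for illustration, it assumes a reasonably recent Python 3, and it falls back to plain escaped text when Pygments is not installed.

```python
import html

try:
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name
    HAVE_PYGMENTS = True
except ImportError:
    HAVE_PYGMENTS = False


def to_blog_html(code, language):
    """Return an HTML fragment for code, with inline styles when Pygments is available."""
    if not HAVE_PYGMENTS:
        # Fallback: plain escaped text without any colouring
        return "<pre>" + html.escape(code) + "</pre>"
    # noclasses=True writes the colours into style attributes instead of CSS classes,
    # so the fragment does not need a separate stylesheet
    formatter = HtmlFormatter(noclasses=True)
    return highlight(code, get_lexer_by_name(language), formatter)


if __name__ == "__main__":
    print(to_blog_html("let $year := $datearr[3]", "xquery"))
```

Whether the blog editor keeps the style attributes after pasting is something I have not verified; it would need to be tried out.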

Proposal for (my) next actions
My intention was to write simple instructions for those who are not Pythonists and would appreciate an easy highlighting solution.
I am sure that Sphinx (which uses Pygments for code colouring) would offer a solution where a quite complex article can be written in plain text (most likely in reStructuredText syntax), with multiple kinds of source code shown either inline or imported from external text files. For larger work, this can be an effective solution, and it is my intention to try it.
So far, good luck with the existing "simple" solution.

