Using Org-mode to Publish a Web Site

This website is published using Org-mode from Emacs, mostly by using Org’s publishing facilities. I also have a git hook that automatically builds my website. Here’s my setup.

Publishing Documents

Org’s publish subsystem gathers up Org files for publication. Org-publish needs a list of “projects” to publish; I break my site up into three: one for the Org files, one for the static content, and a “www” project that combines the other two as components.

(setq org-publish-project-alist
      `(("www"
         :components ("www-pages" "www-static"))
        ("www-pages" ...)
        ("www-static" ...)))

The ellipses are long lists of options for each component. Before talking about those, I should explain how publication works. The entire website lives in a git repository on my server, which I have checked out on my laptop. When I want to change anything, I make a commit, and push it to my server. This kicks off a post-receive hook, which builds the website and updates it.

To build the site, the hook clones the repository to /tmp/www-in, and updates it to the latest version. The hook also creates /tmp/www-out, where HTML files are placed by Org-Publish.

# Allow reading created files by other users
umask 022

# Don't screw up directories I care about
cd /tmp

# Copy repository and make directories
[ -d /tmp/www-in  ] || git clone /home/git/repositories/www.git /tmp/www-in 
[ -d /tmp/www-out ] || mkdir /tmp/www-out

# Get new version
cd /tmp/www-in
unset GIT_DIR
GIT_WORK_TREE=/tmp/www-in/ git fetch
GIT_WORK_TREE=/tmp/www-in/ git checkout -f origin/master

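Git exports variables such as GIT_DIR and GIT_WORK_TREE to the hooks it runs, and in this hook they describe the bare repository rather than the clone. That’s why I unset GIT_DIR and set GIT_WORK_TREE explicitly: it makes git fetch and git checkout operate on /tmp/www-in.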
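The same environment clean-up can be wrapped in a small helper so that every git command in the hook gets it. This is only a sketch: run_in_clone is a hypothetical name, not something git provides, and it assumes the clone directory already exists.

```shell
# Run a command inside a checkout, ignoring the git variables that the
# hook inherited. run_in_clone is a hypothetical helper, not part of git.
# The body is a subshell, so the caller's directory and environment
# are left untouched.
run_in_clone() (
    clone=$1
    shift
    cd "$clone" || exit 1
    unset GIT_DIR                 # stop pointing at the bare repository
    GIT_WORK_TREE=$clone          # point git at the checkout instead
    export GIT_WORK_TREE
    "$@"
)

# Possible usage inside the hook:
#   run_in_clone /tmp/www-in git fetch
#   run_in_clone /tmp/www-in git checkout -f origin/master
```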

So, I will want Org-Publish to read files from /tmp/www-in.

("www-pages"
 :base-directory "/tmp/www-in"

 :base-extension "org"
 :recursive t

 :publishing-directory "/tmp/www-out"
 :publishing-function org-html-publish-to-html
 ...
 )

("www-static"
 :base-directory "/tmp/www-in"
 :base-extension "css\\|js\\|png\\|jpg\\|gif\\|pdf\\|mp3\\|ogg\\|swf\\|gz\\|tar\\|zip\\|bz2\\|xz\\|tex\\|txt\\|html\\|scm\\|key\\|svg"
 :publishing-directory "/tmp/www-out"
 :publishing-function org-publish-attachment
 :recursive t)))

The keys here should be pretty self-explanatory. The long list of extensions is really ugly, and every once in a while I hit a problem where a new kind of file isn’t published until I add its extension to the list. I should fix that at some point. The ellipsis in “www-pages” stands for a set of options that tell Org-Publish how to format the output. I’ll get back to that when I talk about styling things.
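Until the extension list is fixed properly, one way to catch a missing extension early is to list every extension that actually occurs in the checkout and compare that by eye against :base-extension. A minimal sketch, where list_extensions is a hypothetical helper name and hidden files and the .git directory are assumed to be irrelevant:

```shell
# Print the distinct file extensions found under a directory, one per line.
# list_extensions is a hypothetical helper name.
list_extensions() {
    find "$1" -type f -name '*.*' ! -name '.*' ! -path '*/.git/*' |
        sed 's/.*\.//' |          # keep only the text after the last dot
        sort -u
}

# Possible usage:
#   list_extensions /tmp/www-in
```

Anything it prints that is neither "org" nor in the long regular expression above is a file type that would be silently skipped.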

Getting back to the hook, we now need to call out to Emacs to actually build the Org files to HTML:

emacs -Q --batch \
    -l /tmp/www-in/etc/publish.el \
    /tmp/www-in/index.org \
    --funcall org-publish-all \
    2>&1 | grep -vi "skipping\\|overview"
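One caveat with piping into grep is that the pipeline’s exit status is grep’s, not Emacs’s, so a failed build goes unnoticed. A sketch of one way around that, assuming a temporary log file is acceptable; run_filtered is a hypothetical helper, and the two -e patterns are meant to match the same lines as the expression above:

```shell
# Run a command, hide the noisy lines of its output, and still return
# the command's own exit status. run_filtered is a hypothetical helper.
run_filtered() {
    log=$(mktemp) || return 1
    "$@" >"$log" 2>&1
    status=$?
    grep -vi -e skipping -e overview "$log"
    rm -f "$log"
    return "$status"
}

# Possible usage inside the hook:
#   run_filtered emacs -Q --batch \
#       -l /tmp/www-in/etc/publish.el \
#       /tmp/www-in/index.org \
#       --funcall org-publish-all || exit 1
```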

And finally, the files need to be copied to the server root:

cp -r /tmp/www-out/* ~www/www

There’s one final trick involved. Sometimes I need to update the git hook itself. To make this simple, I keep the hook in the same repository, and copy it every time I build the site:

cp /tmp/www-in/etc/post-receive.hook /home/git/repositories/www.git/hooks/post-receive

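This means that a change to the hook only takes effect on the push after the one that delivers it, because the old hook is still the one running when the new copy is installed. So when I change the hook, I have to commit and push the hook change separately, then commit everything else. But separating commits is something of a best practice anyway, so I’m not too concerned about the restriction.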
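The copy step could also announce when the hook really changed, which makes the “takes effect next time” behavior visible in the push output. This is a sketch rather than what my hook does; update_hook is a hypothetical helper name.

```shell
# Install a hook file only if it differs from the installed copy, and
# say so when it does. update_hook is a hypothetical helper name.
update_hook() {
    src=$1
    dst=$2
    if cmp -s "$src" "$dst" 2>/dev/null; then
        return 0                  # identical: nothing to do
    fi
    cp "$src" "$dst" &&
        chmod +x "$dst" &&
        echo "hook updated; it takes effect on the next push"
}

# Possible usage inside the hook:
#   update_hook /tmp/www-in/etc/post-receive.hook \
#       /home/git/repositories/www.git/hooks/post-receive
```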

Styling pages in Org-Publish

Org-Publish has a lot of options that govern how it generates HTML. Fundamentally, the HTML it generates is ugly, but it gets the job done, and it’s not that bad to style.

The main options are those set in the definition of the project. These are:

:with-author t
:with-creator nil

:headline-levels 4
:section-numbers nil
:with-toc nil
:with-drawers t

:html-link-home "/"
:html-preamble nil
:html-postamble t
:html-head-extra ,my-head-extra
:html-head-include-default-style nil
:html-head-include-scripts nil

Mostly, I turn off a bunch of fluff that I don’t want: the table of contents, the section numbers, the preamble, all of the styles Org throws in by default, and so on.

I also turn on drawers and give a custom <head> element.

As it turns out, Org’s default for publishing a drawer is to treat it as a code example (<pre> tags). The resulting block is identical to a real code example, so CSS can’t tell the two apart. I fix that by overriding org-html-format-drawer-function:

(defun my-org-export-format-drawer (name content)
  (concat "<div class=\"drawer " (downcase name) "\">\n"
          "<h6>" (capitalize name) "</h6>\n"
          content
          "\n</div>"))
(setq org-html-format-drawer-function 'my-org-export-format-drawer)

I then use CSS to make these blocks show up as large colored boxes, like this one:

Example

Hi! I’m a drawer!

I also set a custom header, which includes my own default style sheet and a <meta> tag to help out mobile browsers:

(setf my-head-extra
      (concat
       "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n"
       "<link rel='stylesheet' href='/etc/main.css' />"))

I also clean up a bunch of headers to make them shorter—by default Org will dump dozens of lines of code into your page header.

; No home / up navigation links
(setf org-html-home/up-format "")

; Export as UTF-8
(setf org-html-coding-system 'utf-8-unix)

; The defaults are just fine for mathjax and style
; However, they do not work over TLS due to mixed content errors
(setf org-html-mathjax-options
      '((path "/etc/MathJax/MathJax.js?config=TeX-AMS-MML_HTMLorMML")
        (scale "100") (align "center") (indent "2em")
        (mathml nil)))
(setf org-html-mathjax-template
      "<script type=\"text/javascript\" src=\"%PATH\"></script>")

(setf org-html-footnotes-section "<div id='footnotes'><!--%s-->%s</div>")
(setf org-html-link-up "")
(setf org-html-link-home "")
(setf org-html-preamble nil)
(setf org-html-postamble nil)
(setf org-html-scripts "")

(setf org-html-postamble-format
      (list
       (list
        "en"
        (concat
         "<p>By <a href='https://pavpanchekha.com' rel='author'>%a</a>.\n"
         "Share it—it's <a href='http://creativecommons.org/licenses/by-sa/4.0' rel='license'>CC-BY-SA licensed</a>.</p>"))))

A subtlety is the footnotes section, where I have a %s inside a comment, to prevent Org from printing the word “Footnotes” anywhere.
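To see why this works, here is the same two-slot template filled in with printf. It assumes, as the setting above implies, that Org puts the section title into the first %s and the footnote text into the second; render_footnotes is a hypothetical name used only for this illustration.

```shell
# Fill the footnotes template by hand: the first slot (the title) lands
# inside an HTML comment, the second slot (the body) is shown normally.
# render_footnotes is a hypothetical name for illustration only.
render_footnotes() {
    printf "<div id='footnotes'><!--%s-->%s</div>\n" "$1" "$2"
}

# Possible usage:
#   render_footnotes "Footnotes" "<p>1. A note.</p>"
```

The title is still present in the page source, but browsers never render it.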

Finally, I make sure to set the author and email, in case Org outputs them anywhere:

(setf user-full-name "Pavel Panchekha")
(setf user-mail-address "me@pavpanchekha.com")

This is just about all of my blog-publishing pipeline. The last trick is the neat tag system I use on my home page. That’s implemented with a few Org hacks.

First, the main page is an Org file that uses an outline for each blog entry. Each post has a set of tags, which are actual Org tags. An example:

* Subpages

** [[file:projects.org][Projects]]
** [[file:esp/][Lectures]]
** [[file:curiae.org][Curiae]]

* Blog

** [[file:blog/major-key.org][Major Key (a puzzle)]]                                                  :misc:
** Age-aware Data Structures                                           :algs:
*** [[file:blog/age-aware/array.org][Age-aware Array Search]] (Part 1 of 4)
*** [[file:blog/age-aware/tree.org][Tree Lookup]] (Part 2 of 4)
** [[file:blog/stream-fusion.org][Stream Fuse Carefully]]                                                :plt:
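Because these are ordinary Org tags at the end of headline lines, they are easy to pull out with standard tools, for instance to check for a misspelled tag before pushing. This is a sketch, not part of my setup: list_tags is a hypothetical helper, and it assumes tags contain only letters, digits, "_" and "@".

```shell
# Print each distinct tag used on the headlines of an Org file, one per line.
# list_tags is a hypothetical helper name.
list_tags() {
    grep '^\*' "$1" |
        sed -n 's/.*[[:space:]]\(:[[:alnum:]_@:]*:\)[[:space:]]*$/\1/p' |
        tr ':' '\n' |             # split ":a:b:" into separate lines
        sed '/^$/d' |
        sort -u
}

# Possible usage:
#   list_tags /tmp/www-in/index.org
```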

I make sure that only the top-level headings (“Subpages” and “Blog”) are treated as headings with

#+OPTIONS: H:1 toc:nil num:nil

Finally, I use some JavaScript, which you can find in /etc/blog.js, to make the tag-chooser, and some CSS (found in main.css:216–270) to style it.

By Pavel Panchekha. Share it; it's CC-BY-SA licensed.
