In this post I’ll show you a simple method to collect and display backlinks from Obsidian markdown files using Hakyll. If you want to learn more about how to use Hakyll in general to publish a website from Obsidian, see my earlier post Moving from WordPress to Obsidian plus Hakyll.
In Obsidian, when you open a Markdown file, you can see a list of other files linking to it. This happens automatically when you create a [[Wikilink]] from one file to another. I wanted to replicate this functionality for this site, so that each post can end with a list of the pages that link to it.
I found one example of a Hakyll backlink implementation in the wild, Gwern’s solution, available on GitHub. However, Gwern’s implementation is more complex than I needed, and it requires a separate backlink processing step when building the site. So, I ended up writing my own significantly simpler solution. Note that Gwern’s solution has more features beyond simple backlinks, so it’s still worth checking out.
Collecting backlinks during Hakyll site compilation
The main issue with collecting backlinks during the regular Hakyll site compilation pass is that it requires access to the contents of every page. If we collected the backlinks as part of the regular compile, each page would depend on the backlink data, and the backlink data would in turn depend on every page, so we would end up with cyclic dependencies between the two.
The solution I found to this is to separately load and parse the pages, outside of Hakyll’s own Pandoc compilation. It means we end up parsing the files twice, once for backlinks and once for building the pages, but that doesn’t seem to have a major impact on performance, at least with the number of pages I have currently. Even if it did, I’m not sure if there’s any way around it.
Backlink functions
Let’s first look at the functions we need to do the backlink collection itself. First, the main function to do this:
{-# LANGUAGE FlexibleInstances, OverloadedStrings #-}
-- FlexibleInstances is needed for the Writable instance on a type synonym

import Control.Monad (forM)
import qualified Data.Map as M
import qualified Data.Text as T
import Hakyll

-- A mapping from URL to the pages linking to it
type LinkMap = M.Map T.Text [Item String]

instance Writable LinkMap where
    write filePath (Item ident content) = write filePath (Item ident $ show content)

-- | Create backlink mapping data from files matching a pattern
makeBacklinkMap :: Pattern -> Rules ()
makeBacklinkMap pagesPattern = do
    pages <- getMatches pagesPattern
    create ["backlinkmap"] . compile $ buildMap pages >>= makeItem
  where
    buildMap :: [Identifier] -> Compiler LinkMap
    buildMap idents = unsafeCompiler $ do
        -- Build one small map per page, then merge them all
        linkMaps <- forM idents $ \ident -> do
            markdown <- parseMarkdown =<< loadFileContents ident
            pure $ case markdown of
                Nothing -> M.empty
                Just p  -> M.fromList $ zip (collectLinks p) (repeat [Item ident ""])
        return $ foldl (M.unionWith (++)) M.empty linkMaps

First, this defines a LinkMap type synonym which holds the backlink data. It maps a URL to a list of the pages linking to it. Because we store the link map data as an Item from a Hakyll Compiler, we need to implement Writable for the type.
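To make the shape of the data more concrete, here is a minimal sketch of the merging step on its own, outside of Hakyll. It’s a simplification, not the actual site code: SimpleLinkMap, pageLinks and mergeLinks are hypothetical names, and pages are plain Strings instead of Hakyll Items.

```haskell
import qualified Data.Map as M
import qualified Data.Text as T

-- Simplified stand-in for LinkMap: pages are plain Strings here
-- instead of Hakyll Items (an assumption made for illustration only).
type SimpleLinkMap = M.Map T.Text [String]

-- | Links found on a single page, keyed by link target.
-- M.fromList keeps one entry per key, so linking to the same target
-- twice from one page still produces a single backlink.
pageLinks :: String -> [T.Text] -> SimpleLinkMap
pageLinks page targets = M.fromList $ zip targets (repeat [page])

-- | Merge per-page maps; pages linking to the same target are concatenated.
mergeLinks :: [SimpleLinkMap] -> SimpleLinkMap
mergeLinks = foldl (M.unionWith (++)) M.empty

example :: SimpleLinkMap
example = mergeLinks
    [ pageLinks "a.md" [T.pack "Hakyll", T.pack "Obsidian"]
    , pageLinks "b.md" [T.pack "Hakyll", T.pack "Hakyll"]
    ]
```

Here both pages end up listed under "Hakyll", and only a.md is listed under "Obsidian".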
The function makeBacklinkMap takes a Pattern as its parameter, which lets us easily control which files we collect the backlinks from. The compilation result is stored in an item called backlinkmap. Since it has no route defined, it won’t get an actual file in the output, but we can load it from other compilers to access the backlink information.
The buildMap function orchestrates the building of the actual link map itself. We use unsafeCompiler in it to allow us to run an IO action to load the file contents, which we parse and process for links using parseMarkdown and collectLinks which are shown below, along with some other helpers.
import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Walk (query)

-- | Try to parse some text using Pandoc
parseMarkdown :: T.Text -> IO (Maybe Pandoc)
parseMarkdown text = either (const Nothing) Just <$> runIO (readMarkdown obsidianReaderOptions text)

-- | Load file referenced by an Identifier
loadFileContents :: Identifier -> IO T.Text
loadFileContents = TIO.readFile . toFilePath

-- | Collect all wikilinks from a Pandoc AST
collectLinks :: Pandoc -> [T.Text]
collectLinks = query getLinkUrl
  where
    getLinkUrl inline = case inline of
        (Link attr _ (url, _))
            | hasClass "wikilink" attr -> [url]
        _ -> []

hasClass :: T.Text -> Attr -> Bool
hasClass className (_, classes, _) = className `elem` classes

{-
Below are Pandoc reader options to improve compatibility with Obsidian
flavored markdown. Ext_lists_without_preceding_blankline,
Ext_blank_before_header and Ext_implicit_figures aren't necessary,
but I prefer the way they make the parsing work.
-}
obsidianReaderOptions :: ReaderOptions
obsidianReaderOptions =
    let exts = disableExtensions (readerExtensions defaultHakyllReaderOptions) obsidianMarkdownDisabledExtensions
    in defaultHakyllReaderOptions { readerExtensions = exts <> obsidianMarkdownExtensions }

obsidianMarkdownExtensions :: Extensions
obsidianMarkdownExtensions = extensionsFromList
    [ Ext_wikilinks_title_after_pipe
    , Ext_lists_without_preceding_blankline
    ]

obsidianMarkdownDisabledExtensions :: Extensions
obsidianMarkdownDisabledExtensions = extensionsFromList
    [ Ext_blank_before_header
    , Ext_implicit_figures
    ]

parseMarkdown is used here to parse markdown using Pandoc, and collectLinks uses Pandoc’s query function to collect a list of all wikilink URLs.
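To see what the query step does in isolation, here is a small sketch that runs the same kind of filter over a hand-built Pandoc AST. It only needs the pandoc-types package. The names wikilinkUrls and sampleDoc are made up for this example, and the sample document is an assumption about what the reader produces for a wikilink (a Link carrying the wikilink class), not output captured from Pandoc.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import Text.Pandoc.Definition
import Text.Pandoc.Walk (query)

-- | Same idea as collectLinks: keep the URLs of links with class "wikilink".
wikilinkUrls :: Pandoc -> [T.Text]
wikilinkUrls = query getLinkUrl
  where
    getLinkUrl :: Inline -> [T.Text]
    getLinkUrl (Link (_, classes, _) _ (url, _))
        | "wikilink" `elem` classes = [url]
    getLinkUrl _ = []

-- Hand-built document (hypothetical): one wikilink and one ordinary link.
sampleDoc :: Pandoc
sampleDoc = Pandoc nullMeta
    [ Para
        [ Str "See"
        , Space
        , Link ("", ["wikilink"], []) [Str "Other Note"] ("Other Note", "")
        , Space
        , Link nullAttr [Str "example"] ("https://example.com", "")
        ]
    ]
```

Applying wikilinkUrls to sampleDoc returns only "Other Note", since the ordinary link has no wikilink class and is skipped.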
Using it within Hakyll
Plugging this code into the site build logic is simple:
import Control.Applicative (empty)
import System.FilePath (takeBaseName)

main :: IO ()
main = do
    hakyll $ do
        makeBacklinkMap "content/*.md"

        match "content/*.md" $ do
            route $ idRoute
            compile $ do
                -- Load backlinks and create a context
                backlinkMap <- loadBody "backlinkmap"
                let ctx = backlinksCtx backlinkMap defaultContext
                          <> defaultContext
                pandocCompiler
                    >>= loadAndApplyTemplate "templates/post.html" ctx
                    >>= loadAndApplyTemplate "templates/default.html" ctx

backlinksCtx :: LinkMap -> Context String -> Context String
backlinksCtx linkmap linkCtx =
    listField "backlinks" linkCtx (getUnderlying >>= getLinks)
  where
    -- Wikilink URLs are matched against the page's base filename
    getLinks ident =
        let filename = T.pack . takeBaseName . toFilePath $ ident
        in maybe empty pure (M.lookup filename linkmap)

Above is a basic Hakyll match/route/compile setup. The points of note here are that we load the backlinkMap value, and use the backlinksCtx function to create a listField called backlinks. The lookup uses the page’s base filename as the key, so it relies on wikilink URLs being plain note names. Note that we use empty as the field’s value if no backlinks are found - this allows us to use an $if()$ in a template to test whether backlinks exist or not.
$if(backlinks)$
<section class="backlinks">
    <h2>Links to this page:</h2>
    <ul class="post-list">
        $for(backlinks)$
        <li><a href="$url$">$title$</a></li>
        $endfor$
    </ul>
</section>
$endif$

With the above markup, we get a nice list of backlinks, with titles and URLs.
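The lookup inside backlinksCtx decides whether the list shows up at all, so it may help to try it without Hakyll. The sketch below is an illustration under a few assumptions: backlinksFor and sampleMap are hypothetical names, pages are plain Strings, and the function is generic over Alternative so it can be run with Maybe, where Nothing plays the role of the empty compiler result.

```haskell
import Control.Applicative (Alternative, empty)
import qualified Data.Map as M
import qualified Data.Text as T
import System.FilePath (takeBaseName)

-- | Look up backlinks for a source file by its base filename.
-- Generic over Alternative so it can be tried with Maybe here;
-- the site code uses the same shape inside Hakyll's Compiler.
backlinksFor :: Alternative f => M.Map T.Text [String] -> FilePath -> f [String]
backlinksFor linkMap path =
    maybe empty pure (M.lookup (T.pack (takeBaseName path)) linkMap)

-- Hypothetical data: one page links to the note called "My Note".
sampleMap :: M.Map T.Text [String]
sampleMap = M.fromList [(T.pack "My Note", ["content/Other.md"])]
```

With Maybe as the Alternative, looking up "content/My Note.md" gives Just the list of linking pages, while a page nobody links to gives Nothing.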
In closing
One thing I’m not entirely happy with in this solution is that we need to store the backlink map as an item in the Hakyll store, and then loadBody it later. Compared to Hakyll’s own tag building functions, it feels like it would be more idiomatic if the backlinks were instead returned as a value, like this:

backlinks <- makeBacklinkMap "some/pattern/*.md"

However, the trouble with this is, at least as far as I could figure out, that it can mess with Hakyll’s dependency tracking. The compile + loadBody method seems to work better in that regard.
Improvement suggestions on this are most welcome.
Comments or questions?
If you have any comments or questions about this post, feel free to email me at jani@codeutopia.net, or use any of the other methods on the contact page.