Tags: ghc

Better executable path queries in GHC 9.4

I previously wrote about System.Environment.getExecutablePath and how I fixed it on FreeBSD. Unfortunately, this function still has some problems. In this post I explain the problems and introduce executablePath, the solution arriving in base-4.17.0.0 (GHC 9.4.1).

Problems with getExecutablePath §

getExecutablePath :: IO FilePath is a way for a Haskell program to query the path to its own executable. It has several significant problems:

Type of the solution §

Types are an essential tool for modelling a problem and guiding the development of a solution. The problems with getExecutablePath reveal that:

The Maybe a type models the existence or absence of a value:

data Maybe a = Nothing | Just a

Accordingly, a suitable type to model this problem is:

executablePath :: Maybe (IO (Maybe FilePath))

The outer Maybe models the presence or absence of a query mechanism. The query itself has the type IO (Maybe FilePath). The inner Maybe models that the query might be unable to return a valid FilePath.

The type is also a kind of (machine-checked) documentation. It reveals things that the written documentation for getExecutablePath should have said, but didn’t.

FilePath is defined as a type synonym for String, which is itself a type synonym for [Char]:

type FilePath = String
type String   = [Char]

It can be argued on multiple grounds that this is not an appropriate type for representing file paths:

  • Performance: [Char] is a linked list of individual characters. Packed strings have better performance.

  • Correctness: FilePath admits any string value, not just valid paths. See above for a real world example: paths suffixed with "(deleted)" on Linux.

I did not go further down this rabbit hole for the change discussed in this post. FilePath pervades base and other “standard” libraries. Furthermore, GHC targets a variety of operating systems; accurately modeling valid file paths on diverse platforms drives up complexity. If you have specific needs not met by FilePath, check out the many path libraries which offer different approaches to representing and working with paths.

Implementation of executablePath §

In this section I’ll briefly review the implementation. GHC merge request !4779 has the gory details, for those interested.

I was able to implement executablePath without modifying any code that uses the foreign function interface (FFI). getExecutablePath was unchanged. executablePath implementations wrap the former. See my earlier post for an example of how getExecutablePath uses the FFI.

Mac OS X, FreeBSD and NetBSD §

The FreeBSD and NetBSD implementations of getExecutablePath are nearly identical, but the implementation for Mac OS X is very different. Nevertheless, the observable behaviour is identical: the system calls error with ENOENT when the executable has been deleted, and succeed otherwise. No other expected failure scenarios are known (yet).

Therefore, the executablePath implementation for these platforms boils down to catching the Haskell exception value corresponding to ENOENT and turning it into Nothing. Unexpected exceptions are re-thrown.

executablePath =
  Just (fmap Just getExecutablePath `catch` f)
    where
    f e | isDoesNotExistError e = pure Nothing
        | otherwise             = throw e

Linux §

The Linux implementation of getExecutablePath reads the value of /proc/self/exe (part of the procfs(5)). The man page states:

If the pathname has been unlinked, the symbolic link will contain the string ‘(deleted)’ appended to the original pathname.

executablePath checks for this condition and, if detected, returns Nothing. Note that we could have stripped the suffix and returned Just the “original” path. Returning Nothing makes it consistent with the other platforms.

executablePath = Just (fmap check getExecutablePath)
  where
  check s | "(deleted)" `isSuffixOf` s = Nothing
          | otherwise                  = Just s

What if the file is named foo (deleted)? The behaviour is ambiguous. Checking the existence of the file is not safe either. If the file was foo, a different file foo (deleted) could exist beside it. Better a false negative in an unlikely scenario, than an unsafe false positive.

Windows §

Windows prevents the deletion of an executable file during the lifetime of any process created from it. So executablePath simply wraps the result of getExecutablePath with a Just.

executablePath = Just (fmap Just getExecutablePath)

Fallback implementation §

The “fallback implementation” is for platforms that don’t have a reliable mechanism for querying the executable path (or no one implemented it in GHC yet). In this case, executablePath does not even supply the query IO action.

executablePath = Nothing

Programs that want to query the executable path have to deal with the Nothing case. That is: the possibility that there is no reliable way to get it. That’s a good thing.

Conclusion §

This article explained the problems of getExecutablePath and reviewed the solution coming in GHC 9.4, called executablePath. I encourage programs that use getExecutablePath to migrate when feasible, especially if multi-platform support is important.

One topic I did not discuss is how I implemented tests for this feature in the GHC test suite. I will cover this in an upcoming post.

Creative Commons License
Except where otherwise noted, this work is licensed under a Creative Commons Attribution 4.0 International License .