hackage-security
RFC: Protocol for bootstrapping when primary Hackage instance is unreachable
I wasn't sure whether to file this here or in Cabal's issue tracker, but I think this method can be generalised, so I'm documenting it here for now:
cabal-install currently can't bootstrap automatically when hackage.haskell.org is unreachable (either because it's down or because of routing/firewalling issues), even though one of its mirrors may be reachable without issue.
To this end, I propose the following simple best-effort fallback scheme:
When bootstrapping hackage-security, if the configured repository URL ${URL} (e.g. hackage.haskell.org) is not reachable, a DNS TXT lookup on _mirrors.${URL} shall be attempted, looking for RFC 1464-compliant entries of mirror URLs with the keys ${IDX}.urlbase (where ${IDX} is a non-negative integer). The client shall then attempt to bootstrap from each of those mirror URLs in ascending order of their ${IDX} value until one succeeds, giving up once all URLs have been tried.
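The control flow of this scheme can be sketched roughly as follows. This is only an illustration under assumptions: `bootstrapWithFallback`, `tryUrl` and `lookupMirrors` are hypothetical names, not part of hackage-security or cabal-install, and the actual DNS lookup and bootstrap steps are passed in as parameters rather than implemented here.

```haskell
import Data.List (sortOn)

-- | Hypothetical sketch of the proposed best-effort fallback.
-- Tries the configured URL first; only if that fails, asks for the
-- (IDX, urlbase) pairs and tries them in ascending IDX order.
bootstrapWithFallback
  :: Monad m
  => (String -> m Bool)   -- ^ assumed: attempt a bootstrap from one URL base
  -> m [(Int, String)]    -- ^ assumed: TXT lookup yielding (IDX, urlbase) pairs
  -> String               -- ^ configured repository URL
  -> m (Maybe String)     -- ^ the URL that worked, or Nothing if all failed
bootstrapWithFallback tryUrl lookupMirrors primary = do
    ok <- tryUrl primary
    if ok
      then return (Just primary)
      else do
        mirrors <- lookupMirrors
        firstWorking (map snd (sortOn fst mirrors))
  where
    -- give up once every candidate has been tried
    firstWorking []     = return Nothing
    firstWorking (u:us) = do
      ok <- tryUrl u
      if ok then return (Just u) else firstWorking us
```

Keeping the lookup and the bootstrap attempt as parameters means the same loop could be driven by an nslookup-based lookup or by a native resolver, whichever turns out to be available.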
To prototype this, I've created such a DNS RR:
$ dig _mirrors.hackage.haskell.org TXT
; <<>> DiG 9.10.3-P4-Ubuntu <<>> _mirrors.hackage.haskell.org TXT
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62373
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1500
;; QUESTION SECTION:
;_mirrors.hackage.haskell.org. IN TXT
;; ANSWER SECTION:
_mirrors.hackage.haskell.org. 300 IN TXT "0.urlbase=http://hackage.fpcomplete.com/" "1.urlbase=http://objects-us-west-1.dream.io/hackage-mirror/"
;; Query time: 2 msec
;; SERVER: 69.20.0.164#53(69.20.0.164)
;; WHEN: Thu Oct 06 16:32:18 UTC 2016
;; MSG SIZE rcvd: 170
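The strings in the answer section follow a simple pattern: the position of each mirror becomes its ${IDX}. As a rough sketch of the publishing side, the helper below (a hypothetical `renderMirrorTxt`, not an existing tool) produces the TXT strings for an ordered list of mirror URLs. It assumes ASCII URLs, rejects URLs containing a backtick rather than escaping them, and enforces the 255-byte limit of a single DNS character-string.

```haskell
-- | Hypothetical helper: render an ordered list of mirror URLs as the
-- "IDX.urlbase=URL" strings to be placed in the _mirrors TXT record.
renderMirrorTxt :: [String] -> Either String [String]
renderMirrorTxt urls = mapM render (zip [0 :: Int ..] urls)
  where
    render (i, url)
      -- assumption: refuse backticks instead of escaping them
      | '`' `elem` url     = Left ("backtick in URL not supported: " ++ url)
      -- a single DNS character-string holds at most 255 bytes (ASCII assumed)
      | length entry > 255 = Left ("entry exceeds 255 bytes: " ++ entry)
      | otherwise          = Right entry
      where
        entry = show i ++ ".urlbase=" ++ url
```

Applied to the two mirror URLs above, in that order, this yields the two strings shown in the answer section.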
Moreover, I've written a simple parser for nslookup's output (nslookup appears to be the common-denominator tool provided by default on Windows, OSX, IBM AIX, and Linux systems), which I've tested on the platforms I had access to:
#! /usr/bin/env runghc

import Data.List
import Data.Char
import Control.Monad
import System.Environment
import System.Process (readProcess)
import Text.Read

-- | Parse output of @nslookup -query=TXT $HOSTNAME@ tolerantly
parseNsLookupTxt :: String -> Maybe [(String,[String])]
parseNsLookupTxt = go0 [] []
  where
    -- approximate grammar:
    --   <entries> := { <entry> }
    --   (<entry> starts at begin of line, but may span multiple lines)
    --   <entry>   := ^ <hostname> TAB "text =" { <qstring> }
    --   <qstring> := string enclosed by '"'s ('\' and '"' are \-escaped)

    -- scan for ^ <word> <TAB> "text ="
    go0 []  _  []                                = Nothing
    go0 res _  []                                = Just (reverse res)
    go0 res _  ('\n':xs)                         = go0 res [] xs
    go0 res lw ('\t':'t':'e':'x':'t':' ':'=':xs) = go1 res (reverse lw) [] (dropWhile isSpace xs)
    go0 res lw (x:xs)                            = go0 res (x:lw) xs

    -- collect at least one <qstring>
    go1 res lw qs ('"':xs) = case qstr "" xs of
      Just (s, xs') -> go1 res lw (s:qs) (dropWhile isSpace xs')
      Nothing       -> Nothing -- bad quoting
    go1 res lw [] _  = Nothing -- missing qstring
    go1 res lw qs xs = go0 ((lw,reverse qs):res) [] xs

    qstr acc ('\n':_)       = Nothing -- We don't support unquoted LFs
    qstr acc ('\\':'\\':cs) = qstr ('\\':acc) cs
    qstr acc ('\\':'"':cs)  = qstr ('"':acc) cs
    qstr acc ('"':cs)       = Just (reverse acc, cs)
    qstr acc (c:cs)         = qstr (c:acc) cs
    qstr _   []             = Nothing

mirrorsDnsName :: String
mirrorsDnsName = "_mirrors.hackage.haskell.org"

extractMirrors :: String -> [String]
extractMirrors s0 = map snd $ sort vals
  where
    vals = [ (kn,v) | (h,ents) <- maybe [] id $ parseNsLookupTxt s0
                    , h == mirrorsDnsName
                    , e <- ents
                    , Just (k,v) <- [splitRfc1464 e]
                    , Just kn <- [isUrlBase k]
                    ]

isUrlBase :: String -> Maybe Int
isUrlBase s
  | isSuffixOf ".urlbase" s, not (null ns), all isDigit ns = readMaybe ns
  | otherwise = Nothing
  where
    ns = take (length s - 8) s

splitRfc1464 :: String -> Maybe (String,String)
splitRfc1464 = go ""
  where
    go _ [] = Nothing
    go acc ('`':c:cs) = go (c:acc) cs
    go acc ('=':cs)   = go2 (reverse acc) "" cs
    go acc (c:cs)
      | isSpace c = go acc cs
      | otherwise = go (c:acc) cs

    go2 k acc []         = Just (k,reverse acc)
    go2 k acc ['`']      = Nothing
    go2 k acc ('`':c:cs) = go2 k (c:acc) cs
    go2 k acc (c:cs)     = go2 k (c:acc) cs

main :: IO ()
main = do
    fns <- getArgs
    if null fns
      then do
        output <- readProcess "nslookup" ["-query=TXT", mirrorsDnsName] ""
        print (extractMirrors output)
      else do
        forM_ fns $ \fn -> do
          output <- readFile fn
          print (fn,extractMirrors output)
    return ()
Its output is simply
["http://hackage.fpcomplete.com/","http://objects-us-west-1.dream.io/hackage-mirror/"]
After a short conversation with @dcoutts, the conclusion is that I'm going to integrate this into cabal-install real-soon-now(tm); no changes in hackage-security are needed for now.