HAProxy is essentially the entry point for most of the services I run. Its logs are sent to a central rsyslog container, which writes them to a volume that the GoAccess container can read for parsing. I found most of the configs here.
    log         /dev/log local2 # Log to local syslog, which sends a copy to a
                                # remote rsyslog container.

frontend main
    capture request header Referer len 128
    capture request header User-Agent len 128

    log-format %si:%sp\ %ci\ [%t]\ \"%r\"\ %ST\ %B\ "%hr"
    # %si - your server ip - very useful if you have multiple applications
    # %sp - your server port
    # %ci - user ip
    # %t  - datetime in haproxy format
    # %r  - request
    # %ST - status code
    # %B  - data response length
    # %hr - captured headers separated by "|" (Referer|User-Agent)

# rsyslog: send haproxy logs to its own file
local2.*    -/var/opt/log/haproxy/haproxy.log

# Receive messages from remote host via UDP
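
One way to check that local2 messages actually end up in the haproxy log file is to push a test message through syslog by hand. This is a minimal sketch, assuming Python is available on the HAProxy host and that /dev/log is the local syslog socket, as in the HAProxy config above. The function name `send_test_message` and the logger name are made up for illustration.

```python
import logging
import logging.handlers


def send_test_message(text, address="/dev/log"):
    """Send one INFO message to syslog using the local2 facility."""
    handler = logging.handlers.SysLogHandler(
        address=address,
        facility=logging.handlers.SysLogHandler.LOG_LOCAL2,
    )
    logger = logging.getLogger("local2-test")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the message out of any root handlers
    logger.addHandler(handler)
    try:
        logger.info(text)
    finally:
        # Detach and close so repeated calls do not stack handlers.
        logger.removeHandler(handler)
        handler.close()
```

Calling `send_test_message("haproxy log test")` on the HAProxy host should, if forwarding is set up the way I described, produce a line in /var/opt/log/haproxy/haproxy.log on the rsyslog side.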

# GoAccess config
time-format %H:%M:%S

date-format %d/%b/%Y

log-format %^ %^ %^ %^ %h [%d:%t.%^] "%r" %s %b "{%R|%u}"

# %^ - skipped token
# %h - user ip
# %d - date-format
# %t - time-format
# %r - request e.g. GET /something
# %s - server status code
# %b - data response length
# %R - referer - very important if you want to know where your users come from
# %u - user agent

# There are so many skipped tokens because my haproxy (or rsyslog?) puts some extra information at the start of every line.
# Sample line:
# Mar 22 09:09:06 server haproxy[PID]: [22/Mar/2016:09:08:56.989] "POST /UIDL/?v-uiId=0 HTTP/1.1" 200 334 "{https://www.referer.com/|Mozilla/5.0 (Linux; Android 4.4.4; GT-I9060I Build/KTU84P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Saf}"
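
To sanity-check the GoAccess log-format against a real line, the same fields can be pulled out with a regular expression. This is only a sketch, not how GoAccess parses lines internally. `LINE_RE` and `parse_line` are names invented here, and because the sample line above does not show the server and client address fields, the pattern starts at the bracketed timestamp.

```python
import re
from datetime import datetime

# Sample line copied from the GoAccess config comment above.
SAMPLE = (
    'Mar 22 09:09:06 server haproxy[PID]: [22/Mar/2016:09:08:56.989] '
    '"POST /UIDL/?v-uiId=0 HTTP/1.1" 200 334 '
    '"{https://www.referer.com/|Mozilla/5.0 (Linux; Android 4.4.4; '
    'GT-I9060I Build/KTU84P) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/47.0.2526.83 Mobile Saf}"'
)

LINE_RE = re.compile(
    r'\[(?P<date>[^:]+):(?P<time>[\d:]+)\.\d+\]\s+'  # [%d:%t.%^]
    r'"(?P<request>[^"]*)"\s+'                       # "%r"
    r'(?P<status>\d{3})\s+'                          # %s
    r'(?P<bytes>\d+)\s+'                             # %b
    r'"\{(?P<referer>[^|]*)\|(?P<agent>[^}]*)\}"'    # "{%R|%u}"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    date_part = fields.pop("date")
    time_part = fields.pop("time")
    # Same date-format and time-format as in the GoAccess config.
    fields["timestamp"] = datetime.strptime(
        date_part + " " + time_part, "%d/%b/%Y %H:%M:%S"
    )
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    return fields


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

If a line from your own log comes back as `None`, the format string and the actual log lines probably disagree somewhere, which is usually also why GoAccess refuses to parse them.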

Now I can look at all crawler traffic from the terminal. Yay.

GoAccess Screenshot

I might enable the web version at a later date. Still undecided at the moment.