HAProxy is essentially the entry point for most of the services I run. Its logs are sent to a central rsyslog container, which writes them to a volume that the GoAccess container can read for parsing. I found most of the configs here.
Architecture (.dot source)
/etc/haproxy.conf:
global
log /dev/log local2 # Log to local syslog, which sends a copy to a
# remote rsyslog container.
[...]
frontend main
[...]
capture request header Referer len 128
capture request header User-Agent len 128
log-format %si:%sp\ %ci\ [%t]\ \"%r\"\ %ST\ %B\ "%hr"
# %si - your server ip - very useful if you have multiple applications
# %sp - your server port
# %ci - user ip
# %t - datetime in haproxy format
# %r - request
# %ST - status code
# %B - data response length
# %hr - captured headers separated by "|" (Referer|User-Agent)
[...]
rsyslog.conf:
# Send haproxy logs to its own file
local2.* -/var/opt/log/haproxy/haproxy.log
# Receive messages from remote host via UDP
module(load="imudp")
input(
    type="imudp"
    port="514"
)
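To check that the UDP input is actually reachable before pointing HAProxy's host at it, something like the following could be used. This is only a minimal sketch and not part of my setup: the function names and the host `rsyslog.example.internal` are made up for illustration, and the packet is a bare-bones `<PRI>tag: message` without timestamp or hostname, which I am assuming the receiver tolerates. The priority value is facility * 8 + severity, so local2 (18) at info (6) gives 150.

```python
import socket

LOCAL2 = 18  # syslog facility code for local2, matching "local2.*" above
INFO = 6     # syslog severity code for "informational"


def build_packet(message, tag="haproxy", facility=LOCAL2, severity=INFO):
    """Build a minimal syslog datagram: <PRI>tag: message."""
    pri = facility * 8 + severity  # 18 * 8 + 6 = 150
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_test_message(host, port=514, message="test message"):
    """Send one datagram over UDP; there is no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_packet(message), (host, port))


# Hypothetical usage (replace the host with your rsyslog container):
# send_test_message("rsyslog.example.internal")
```

If the message shows up in /var/opt/log/haproxy/haproxy.log, the imudp input and the local2 rule are both working.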
/etc/goaccess.conf:
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %^ %^ %^ %^ %h [%d:%t.%^] "%r" %s %b "{%R|%u}"
# %^ - skipped token
# %h - user ip
# %d - date-format
# %t - time-format
# %r - request e.g. GET /something
# %s - server status code
# %b - data response length
# %R - referer - very important if you want to know where your users come from
# %u - user agent
# There are so many skipped tokens because haproxy (or rsyslog?) puts some extra information at the start of every line
# Sample line:
#
# Mar 22 09:09:06 server haproxy[PID]: 10.60.10.50:80 1.2.3.4 [22/Mar/2016:09:08:56.989] "POST /UIDL/?v-uiId=0 HTTP/1.1" 200 334 "{https://www.referer.com/|Mozilla/5.0 (Linux; Android 4.4.4; GT-I9060I Build/KTU84P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Saf}"
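To make the field mapping a bit more concrete, here is a rough Python sketch that pulls the same fields out of the sample line above. It is not how GoAccess parses anything internally; the regex and the `parse_line` name are my own illustration, and it assumes the request and headers contain no stray double quotes or closing braces. Instead of counting skipped tokens, it searches for the HAProxy part of the line, so the syslog prefix can vary.

```python
import re
from datetime import datetime

# One named group per field of the HAProxy log-format shown earlier.
LINE_RE = re.compile(
    r'(?P<server>\S+:\d+) '                          # %si:%sp
    r'(?P<client>\S+) '                              # %ci
    r'\[(?P<when>[^\]]+)\] '                         # [%t]
    r'"(?P<request>[^"]*)" '                         # "%r"
    r'(?P<status>\d{3}) '                            # %ST
    r'(?P<bytes>\d+) '                               # %B
    r'"\{(?P<referer>[^|]*)\|(?P<agent>[^}]*)\}"$'   # "%hr"
)

SAMPLE = (
    'Mar 22 09:09:06 server haproxy[PID]: 10.60.10.50:80 1.2.3.4 '
    '[22/Mar/2016:09:08:56.989] "POST /UIDL/?v-uiId=0 HTTP/1.1" 200 334 '
    '"{https://www.referer.com/|Mozilla/5.0 (Linux; Android 4.4.4; '
    'GT-I9060I Build/KTU84P) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/47.0.2526.83 Mobile Saf}"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.search(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    # HAProxy's %t looks like 22/Mar/2016:09:08:56.989
    fields["when"] = datetime.strptime(fields["when"], "%d/%b/%Y:%H:%M:%S.%f")
    return fields


if __name__ == "__main__":
    for key, value in parse_line(SAMPLE).items():
        print(f"{key}: {value}")
```

Note that the user agent ends in "Mobile Saf" because the capture is limited to 128 characters (len 128 in the HAProxy config).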
Now I can look at all crawler traffic from the terminal. Yay.
GoAccess Screenshot
I might enable the web version at a later date. Still undecided at the moment.