On my server, I watch some log files so that I can raise the shields under specific conditions. I am using syslog-ng and iptables.

Syslog-ng communicates with iptables through /proc/net/xt_recent/.... This used to work with the syslog-ng file(...) directive:

destination d_fw_auth_fail {
    file("/proc/net/xt_recent/authfail" template("+$ip\n") fsync(yes));
};

Apparently, this doesn't work for procfs any longer:

Mar 10 22:28:59 eucli syslog-ng[19266]: I/O error occurred while writing; fd='21', error='Invalid argument (22)'
Mar 10 22:28:59 eucli syslog-ng[19266]: Suspending write operation because of an I/O error; fd='21', time_reopen='60'
Mar 10 22:28:59 eucli kernel: xt_recent: illegal address written to procfs

As said here, the reason seems to be that file(...) uses lseek(...) to find the end of the file. This concept doesn't exist in special files in procfs.

So let's use a script instead:

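#!/bin/bash
# /usr/local/bin/ban
# Reads "IP TARGET" lines from stdin and adds each IP to the xt_recent list TARGET.
while true ; do
  {
    # Stop at end of input; otherwise the loop spins forever once stdin is closed.
    read -r ip target || exit 0
    [ -z "$ip" ] && continue
    echo "+$ip" > "/proc/net/xt_recent/$target"
    date +"%F %T $ip $target"
  } >> /var/log/ban.log 2>&1   # stdout and stderr both go to the log
done

Try it out like this (the script reads from stdin, not from its arguments):

echo 10.0.0.42 authfail | /usr/local/bin/ban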

Use tail /var/log/ban.log /proc/net/xt_recent/authfail to see if they contain the expected data.
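
The kernel rejects anything that is not a valid address (see the "illegal address written to procfs" message above), so it may be worth checking the input before it is written to procfs. The following is only a sketch under the assumption that IPv4 addresses are all you need; is_ipv4 is a hypothetical helper name, not something syslog-ng or iptables provides.

```shell
# is_ipv4 ADDRESS: succeed if ADDRESS looks like a dotted-quad IPv4 address.
# Hypothetical helper for illustration; IPv6 is deliberately not handled.
is_ipv4() {
  local octet old_ifs
  # Reject empty input, characters other than digits and dots,
  # and empty fields (leading, trailing or doubled dots).
  case $1 in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  # Split the address on dots into the positional parameters.
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  [ $# -eq 4 ] || return 1
  for octet in "$@"; do
    # Each field must be at most three digits and no larger than 255.
    [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}
```

In the ban script, a check such as is_ipv4 "$ip" || continue could go right before the line that writes to /proc/net/xt_recent.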

Now wire it up like this:

destination d_fw_auth_fail {
    program("/usr/local/bin/ban" template("$ip authfail\n"));
};

Notes:

Syslog-ng expects the program to keep running and read from stdin, so that it does not have to spawn a thousand separate processes when a thousand log messages arrive per second. That is why everything is wrapped in the while true loop: otherwise syslog-ng would restart the script many times per second, even when there are no messages to deliver.
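
To see the "one long-running process, many lines" model without touching the firewall, here is a small sketch. ban_lines is a hypothetical function, and the directory argument merely stands in for /proc/net/xt_recent so that it can be tried without root. It uses while read, which ends the loop only when stdin is closed.

```shell
# ban_lines DIR: read "IP TARGET" lines from stdin until EOF and record
# "+IP" in DIR/TARGET. DIR is a stand-in for /proc/net/xt_recent
# (assumption for illustration only).
ban_lines() {
  local dir=$1 ip target
  while read -r ip target; do
    # Skip blank or incomplete lines.
    [ -n "$ip" ] && [ -n "$target" ] || continue
    # Append, because a regular file (unlike the procfs entry)
    # would otherwise keep only the last address.
    printf '+%s\n' "$ip" >> "$dir/$target"
  done
}
```

For example, after tmp=$(mktemp -d), piping two lines into it with printf '10.0.0.42 authfail\n10.0.0.43 authfail\n' | ban_lines "$tmp" should leave both addresses in $tmp/authfail, handled by a single invocation.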

I considered adding suppress(...) so that some failures are dropped when many of them happen in a very short time, as a general measure against denial-of-service attacks. In this case it doesn't make sense, though: once the limit of messages per time unit is reached, that IP is blocked completely for a longer time, which is more effective. Adding suppress(...) would just delay the IP block.
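
For context, the iptables side of such a limit could look roughly like the rule printed below. This is not my actual rule set: the INPUT chain, the 600 second window and the threshold of 5 hits are assumptions, and authfail_rule is a hypothetical helper that only prints the command so it can be reviewed before being run as root. The recent match options used (--name, --rcheck, --seconds, --hitcount) are the ones described in the iptables-extensions man page.

```shell
# authfail_rule [LIST] [WINDOW_SECONDS] [HITS]
# Prints (does not apply) an iptables rule that drops packets from any
# source address recorded at least HITS times in LIST within the last
# WINDOW_SECONDS. Chain and default thresholds are assumptions.
authfail_rule() {
  local list=${1:-authfail} window=${2:-600} hits=${3:-5}
  echo "iptables -I INPUT -m recent --name $list --rcheck" \
       "--seconds $window --hitcount $hits -j DROP"
}
```

Each "+IP" line written to /proc/net/xt_recent/authfail counts as one hit for that address, which is why dropping messages with suppress(...) would only postpone the moment the threshold is reached.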

jan 2018-03-11