Lucid Projects

UFW Analysis - Someone(s) trying to get in

15 Sep 2020 - Jake Sherwood

Our first assignment for Understanding Networks was to get a server up and running and set up UFW (Uncomplicated Firewall). We then needed to analyze the logs and search for patterns.

In class we went through the process of setting up a server on DigitalOcean. I already had a DO account set up and some other Droplets, so I was able to go through the process pretty quickly.

Installing UFW and looking through the logs was something I hadn’t done before, so that was nice to work through. It was also somewhat scarily enlightening to see how many people are actually port scanning in an attempt to gain access.

A few basic stats:

22279 hits from 9/8 - 9/14
All BLOCKS

TOP 10 IPS

IP               Count of hits   Geo SRC
218.92.0.211     757             CHINA
49.88.112.69     546             CHINA
87.251.74.73     349             RUSSIA
45.145.67.74     276             RUSSIA
193.27.229.86    275             RUSSIA
185.176.27.26    219             RUSSIA
141.98.80.242    216             PANAMA
185.176.27.14    204             RUSSIA
185.176.27.102   191             RUSSIA
185.176.27.30    186             RUSSIA
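
For reference, a hit count per source IP like the one above could also be computed directly in Python. This is a minimal sketch, not the script used for this post: the function name `top_sources` and the sample log lines are made up for illustration, and it assumes each entry carries a `[UFW BLOCK]` tag and a `SRC=` field.

```python
import re
from collections import Counter

# Matches the SRC=<address> field of a UFW log entry (IPv4 or IPv6).
SRC_RE = re.compile(r"\bSRC=([0-9a-fA-F.:]+)")


def top_sources(lines, n=10):
    """Return the n most frequent blocked source addresses as (ip, hits) pairs."""
    counts = Counter()
    for line in lines:
        if "[UFW BLOCK]" not in line:
            continue  # skip anything that is not a blocked packet
        match = SRC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)


# Hypothetical sample entries (documentation IP ranges, not real log data).
sample = [
    "Sep  8 00:00:01 host kernel: [1.0] [UFW BLOCK] IN=eth0 OUT= "
    "SRC=203.0.113.5 DST=198.51.100.7 PROTO=TCP SPT=4000 DPT=22",
    "Sep  8 00:00:02 host kernel: [2.0] [UFW BLOCK] IN=eth0 OUT= "
    "SRC=203.0.113.9 DST=198.51.100.7 PROTO=TCP SPT=4001 DPT=23",
    "Sep  8 00:00:03 host kernel: [3.0] [UFW BLOCK] IN=eth0 OUT= "
    "SRC=203.0.113.5 DST=198.51.100.7 PROTO=TCP SPT=4002 DPT=22",
    "Sep  8 00:00:04 host kernel: [4.0] [UFW ALLOW] IN=eth0 OUT= "
    "SRC=203.0.113.9 DST=198.51.100.7 PROTO=TCP SPT=4003 DPT=80",
]

result = top_sources(sample)
print(result)
```

To run it against a real log, the opened file can be passed in place of `sample`, since the function only iterates over lines.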

A couple of graphs of my traffic
[Figure: bar graph showing hits by IP]

[Figure: pie graph showing hits by IP]

Analysis Synopsis
A LOT of people are constantly trying to gain access to servers and infrastructure. Reddit confirmed this amount of traffic is not out of the norm.

The majority of attempts came from China and Russia.

It was interesting to see the different styles of attempts. Some tried all different types of destination ports, source ports, or a combination of both.

Some tried for a few hours, others for days.

The two IPs with the most hits were both from China, and both focused solely on port 22 (SSH). They were really trying to get in.
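
One way to tell these styles apart is to group the destination ports each source tried. The sketch below is only an illustration under assumptions: `ports_by_source` and `describe` are hypothetical helpers, the sample lines are invented, and entries without a `DPT=` field (ICMP, for example) are simply skipped.

```python
import re
from collections import defaultdict

# Captures the SRC and DPT fields of a UFW log entry as (name, value) pairs.
FIELD_RE = re.compile(r"\b(SRC|DPT)=(\S+)")


def ports_by_source(lines):
    """Map each source address to the set of destination ports it tried."""
    ports = defaultdict(set)
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        if "SRC" in fields and "DPT" in fields:
            ports[fields["SRC"]].add(int(fields["DPT"]))
    return ports


def describe(port_set):
    """Label a source by whether it hammered one port or tried several."""
    return "single port" if len(port_set) == 1 else "multiple ports"


# Hypothetical sample entries (documentation IP ranges, not real log data).
sample = [
    "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.5 DST=198.51.100.7 PROTO=TCP SPT=4000 DPT=22",
    "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.5 DST=198.51.100.7 PROTO=TCP SPT=4001 DPT=22",
    "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.9 DST=198.51.100.7 PROTO=TCP SPT=5000 DPT=23",
    "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.9 DST=198.51.100.7 PROTO=TCP SPT=5001 DPT=8080",
    "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.20 DST=198.51.100.7 PROTO=ICMP",
]

grouped = ports_by_source(sample)
for src, tried in grouped.items():
    print(src, sorted(tried), describe(tried))
```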

Analysis Rabbit Hole
I tried a few things to parse the data.

Initially I found this old UFW visualization repo. But it looks to be abandoned, and an open issue makes it sound like quite a bit of work to get it even partially working.

I also had the idea to parse the data into a MySQL database and get it set up in Grafana to analyze and visualize it. I found a mapping plugin for Grafana, but it needed longitude and latitude data for plotting. So I found a free geolocation API, IPStack. I played around making a few API calls and set up a simple JS script to pull the geo data based on IP.

I was just running it locally but this is the code slightly modified from one of their examples:

// set the IP to look up and your access key
var ip = '66.XXX.XXX.XX';
var access_key = '[ACCESS_KEY]';

// get the API result via jQuery.ajax
$.ajax({
    url: 'http://api.ipstack.com/' + ip + '?access_key=' + access_key,
    dataType: 'jsonp',
    success: function(json) {

        console.log(json);
        console.log(json.city);
    }
});
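
The same lookup could be sketched in Python with only the standard library, which may be handy when processing a whole list of IPs from a log. The URL format and the `city` field come from the snippet above; the `latitude` and `longitude` field names are an assumption about the response, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://api.ipstack.com/"  # same endpoint as the jQuery example


def build_url(ip, access_key):
    """Build the lookup URL for one IP address."""
    query = urllib.parse.urlencode({"access_key": access_key})
    return f"{API_BASE}{ip}?{query}"


def extract_location(payload):
    """Keep only the fields a map plugin would need.

    'latitude' and 'longitude' are assumed field names; missing keys become None.
    """
    return {
        "city": payload.get("city"),
        "latitude": payload.get("latitude"),
        "longitude": payload.get("longitude"),
    }


def lookup(ip, access_key):
    """Fetch and decode the geo data for one IP (makes a network request)."""
    with urllib.request.urlopen(build_url(ip, access_key), timeout=10) as resp:
        return extract_location(json.load(resp))


# No request is made here; this only shows the URL that would be fetched.
example_url = build_url("203.0.113.5", "ACCESS_KEY")
print(example_url)
```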

That led me down a whole rabbit hole.

The MySQL installation kept failing, and I was beginning to feel like I was spending way more time than I should on something that wasn’t really part of the assignment.

I kept getting errors like this:
Error: Can't connect to local MySQL server through socket

Which led me to reading about MySQL sockets and potential problems. Unfortunately, working through some DO troubleshooting documents was no help, and I decided to move on. As one last-ditch effort, I figured I would open a community ticket and see if anyone could help me sort it out.

Lo and behold the community came through. A user by the handle of bobbyiliev helped me look at some other logs and see that the MySQL installation was crashing and retrying over and over again.

After a bit more digging we decided it was a memory issue and that upgrading the Droplet was what was needed.

I just bumped it up to the next tier and things started working.

Sadly this all happened over a few days and I wasn’t able to get it all tied in to Grafana. I still might try to sort that out time allowing.

In the meantime I looked into a few other options.

I got AWStats installed and configured, but it turns out it mostly only works for access logs, as I couldn’t get my UFW log data to show. Granted, this may have been due to a misconfiguration. There are a lot of bits to that setup.

I then found this handy UFW Python script, which is what I ended up using for most of my analysis.

This script lets you pass in a log file and set up various filters with optional arguments, displaying data from the logs in a form that is a little easier to read.

It uses a series of regular expressions to parse out the “HEADER_PATTERN” and then the “PARAM_PATTERN”:

class BaseParser():
    # Aug  6 06:25:20 myhost kernel: [105600.181847] [UFW ALLOW] ...
    HEADER_PATTERN = r'([A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ([a-zA-Z0-9-]+) kernel: \[.*\] \[UFW ([A-Z]+)\]'  # nopep8
    # IN= OUT=eno1 SRC=123.45.67.89 DST=123.45.67.88 LEN=60 TOS=0x00
    # PREC=0x00 TTL=64 ID=24678 DF PROTO=TCP SPT=37314 DPT=11211
    # WINDOW=29200 RES=0x00 SYN URGP=0
    PARAM_PATTERN = r'(\w+)[=]?([\w.:]*)'
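
To see what the two patterns capture, they can be applied to the sample entry from the comments above. This is a sketch of one plausible way to combine them, not necessarily how the script does it internally; `parse_line` is a hypothetical helper and the sample line is assembled from the two comments.

```python
import re

# Patterns copied from the script's BaseParser.
HEADER_PATTERN = r'([A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ([a-zA-Z0-9-]+) kernel: \[.*\] \[UFW ([A-Z]+)\]'  # nopep8
PARAM_PATTERN = r'(\w+)[=]?([\w.:]*)'


def parse_line(line):
    """Split one log entry into header fields and a dict of KEY=value params."""
    header = re.match(HEADER_PATTERN, line)
    if header is None:
        return None  # not a UFW entry in the expected layout
    timestamp, host, action = header.groups()
    # Everything after the "[UFW ...]" tag is a run of KEY=value pairs and bare flags.
    params = dict(re.findall(PARAM_PATTERN, line[header.end():]))
    return {"time": timestamp, "host": host, "action": action, "params": params}


line = ("Aug  6 06:25:20 myhost kernel: [105600.181847] [UFW ALLOW] "
        "IN= OUT=eno1 SRC=123.45.67.89 DST=123.45.67.88 LEN=60 TOS=0x00 "
        "PREC=0x00 TTL=64 ID=24678 DF PROTO=TCP SPT=37314 DPT=11211 "
        "WINDOW=29200 RES=0x00 SYN URGP=0")

parsed = parse_line(line)
print(parsed["time"], parsed["host"], parsed["action"])
print(parsed["params"]["SRC"], parsed["params"]["DPT"])
```

Bare flags such as `DF` and `SYN`, and empty fields such as `IN=`, end up in the dict with an empty string as their value, because the `=` and the value part of PARAM_PATTERN are both optional.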

It allows you to run commands like:

# show total entries in log file
$ ufp/src/ufp.py --count un_logs/ufw.log

# show each IP and all the various ports attempted
$ ufp/src/ufp.py -src2dpt -ct un_logs/ufw.log.1

# show specific times and ports for an IP (src)
$ ufp/src/ufp.py -src 185.153.199.146 -p -ct un_logs/ufw.log

I ended up redirecting some of the output to a txt doc with:

ufp/src/ufp.py -p -ct un_logs/ufw.log > src_ip.txt

I then parsed out the IP data, loaded it into Excel, and sorted it by count of SRC IP to generate the graphs shown above.
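
The hand-off to Excel could also be done by writing the counts out as a CSV file, which Excel opens directly. A minimal sketch, assuming the counts are already in a dict; `write_counts_csv` is a hypothetical helper, and the three counts are taken from the top of the table earlier in this post.

```python
import csv
import io


def write_counts_csv(counts, out):
    """Write (src_ip, hits) rows to a file-like object, highest count first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["src_ip", "hits"])
    for ip, hits in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        writer.writerow([ip, hits])


# First three rows of the top 10 table, deliberately out of order.
counts = {"87.251.74.73": 349, "218.92.0.211": 757, "49.88.112.69": 546}

buf = io.StringIO()  # stands in for an opened .csv file
write_counts_csv(counts, buf)
csv_text = buf.getvalue()
print(csv_text)
```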

To try to understand the regex a bit more, I used regex101.com to analyze the patterns. It was really useful to see a detailed explanation of what the patterns mean and what they matched. I still need to spend some time to understand what it’s matching and how, but it was good to see what was happening.

[Figure: header pattern match]

[Figure: param pattern match]

I tried a few additional things in the process. I looked into Pandas: I installed it but didn’t get it going, though it did break my blog publishing pipeline. That was a fun gotcha this morning.

I also messed a bit with Google Apps Script and figured out how to tie in the IPStack API to add the location to a Google Sheet based on the IP value passed in.

All in all this was a fun rabbit hole to go down.

categories: understandingnetworks
