UFW Analysis - Someone(s) trying to get in
15 Sep 2020 - Jake Sherwood
Our first assignment for Understanding Networks was to get a server up and running and set up UFW (Uncomplicated Firewall). We then needed to analyze the logs and search for patterns.
In class we went through the process of setting up a server on Digital Ocean. I already had a DO account set up and some other Droplets so I was able to go pretty quickly through the process.
Installing UFW and looking through the logs was something I hadn’t done before, so that was nice to work through. It was also somewhat scarily enlightening to see how many people are actually port scanning and attempting to gain access.
A few basic stats:
22279 hits from 9/8 - 9/14
All BLOCKS
TOP 10 IPS
IP | Count of hits | Geo SRC |
---|---|---|
218.92.0.211 | 757 | CHINA |
49.88.112.69 | 546 | CHINA |
87.251.74.73 | 349 | RUSSIA |
45.145.67.74 | 276 | RUSSIA |
193.27.229.86 | 275 | RUSSIA |
185.176.27.26 | 219 | RUSSIA |
141.98.80.242 | 216 | PANAMA |
185.176.27.14 | 204 | RUSSIA |
185.176.27.102 | 191 | RUSSIA |
185.176.27.30 | 186 | RUSSIA |
A couple of graphs of my traffic
bar graph showing hits by ip
pie graph showing hits by ip
Analysis Synopsis
A LOT of people are constantly trying to gain access to servers / infrastructure. Reddit confirmed this amount of traffic is not out of the norm.
The majority of attempts came from China and Russia.
It was interesting to see the different styles of attempts. Some tried all different types of destination ports, source ports or a combination of both.
Some kept at it for a few hours, others for days.
The two biggest sources by IP were both in China, and both focused solely on port 22. They were really trying to get in.
Analysis Rabbit Hole
I tried a few things to parse the data.
Initially I found this old UFW visualization repo. But it looks to be abandoned, and an issue posted there makes it sound like quite a bit of work to get it even partially working.
I also had the idea to parse the data into a MySQL database and get it set up on Grafana to analyze / visualize. I found a mapping plugin for Grafana, but it needed longitude and latitude data for plotting. So I found this free geolocation API, IPStack. I played around making a few API calls and set up a simple JS script to pull the geo data based on IP.
I was just running it locally, using code slightly modified from one of their examples.
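The original JS snippet is not reproduced here. As a rough stand-in, here is a minimal Python sketch of the same kind of lookup. It assumes IPStack's `http://api.ipstack.com/<ip>?access_key=<key>` endpoint and the `country_name`, `latitude` and `longitude` fields in its JSON response; the helper names are made up for illustration, and the access key is whatever your own account gives you.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint layout: IP in the path, key as a query parameter.
IPSTACK_URL = "http://api.ipstack.com/{ip}?{query}"


def build_lookup_url(ip, access_key):
    """Build the request URL for one IP address."""
    query = urllib.parse.urlencode({"access_key": access_key})
    return IPSTACK_URL.format(ip=ip, query=query)


def extract_location(payload):
    """Keep only the fields a map panel would need from a decoded response."""
    return {
        "ip": payload.get("ip"),
        "country": payload.get("country_name"),
        "latitude": payload.get("latitude"),
        "longitude": payload.get("longitude"),
    }


def lookup(ip, access_key):
    """Fetch and decode the geo data for one IP (makes a network call)."""
    url = build_lookup_url(ip, access_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_location(json.load(resp))
```

Splitting the URL building and field extraction away from the network call keeps the parts that can go wrong easy to check without burning API requests.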
That led me down a whole rabbit hole.
The MySQL installation kept failing and I was beginning to feel like I was spending way more time than I should on something that wasn’t really part of the assignment.
I kept getting errors like this:
Error: Can’t connect to local MySQL server through socket
Which led me to reading about MySQL sockets and potential problems. Working through some DO troubleshooting documents was unfortunately no help, and I decided to move on. As one last-ditch effort, I figured I would open a community ticket and see if anyone could help me sort it out.
Lo and behold the community came through. A user by the handle of bobbyiliev helped me look at some other logs and see that the MySQL installation was crashing and retrying over and over again.
After a bit more digging we decided it was a memory issue and upgrading the Droplet was what was needed.
I just bumped it up to the next tier and things started working.
Sadly this all happened over a few days and I wasn’t able to get it all tied in to Grafana. I still might try to sort that out time allowing.
In the meantime I looked into a few other options.
I got AWStats installed and configured, but it turns out it mostly only works for access logs, as I couldn’t get my UFW log data to show. Granted, this may have been due to a misconfiguration. There are a lot of bits to that setup.
I then found this handy UFW python script which is what I ended up using for most of my analysis.
This script lets you pass in a log file and set up various filters with optional arguments to display data from the logs in a form that is a little easier to look at.
It uses a series of regular expressions to parse out the “HEADER_PATTERN” and then the “PARAM_PATTERN”.
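To give a feel for that two-stage approach, here is a minimal sketch. The two patterns below are written for illustration and are not the ones from the script; the sample line is made up in the usual UFW kernel-log layout.

```python
import re

# Illustrative patterns only; the script's real HEADER_PATTERN and
# PARAM_PATTERN may differ.
HEADER_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s(?P<host>\S+)\skernel:.*?"
    r"\[UFW (?P<action>[A-Z ]+)\]\s(?P<params>.*)$"
)
PARAM_PATTERN = re.compile(r"(?P<key>[A-Z]+)=(?P<value>\S*)")


def parse_ufw_line(line):
    """Return a dict of fields for one UFW log line, or None if it is not one."""
    header = HEADER_PATTERN.match(line)
    if header is None:
        return None
    entry = {
        "timestamp": header.group("timestamp"),
        "host": header.group("host"),
        "action": header.group("action"),
    }
    # Second pass: pull every KEY=value pair out of the rest of the line.
    for param in PARAM_PATTERN.finditer(header.group("params")):
        entry[param.group("key")] = param.group("value")
    return entry


# Made-up sample line using documentation-range addresses.
SAMPLE = (
    "Sep  8 06:25:12 myhost kernel: [12345.678901] [UFW BLOCK] IN=eth0 OUT= "
    "MAC=00:11:22:33:44:55 SRC=203.0.113.7 DST=198.51.100.2 LEN=40 TTL=241 "
    "PROTO=TCP SPT=40000 DPT=22 SYN URGP=0"
)

if __name__ == "__main__":
    print(parse_ufw_line(SAMPLE))
```

The header pattern handles the fixed front of the line (timestamp, host, action), and the param pattern is then run repeatedly over what is left, so fields like `SRC`, `DPT` and `PROTO` come out as dictionary keys without needing one giant regex.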
It allows you to run cmds like:
I ended up piping some of the output to a txt doc.
I then parsed out the IP data, loaded it into Excel, and sorted by count of SRC IP to generate the graphs shown above.
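The same count can also be done without a spreadsheet. A minimal sketch, assuming the standard `SRC=` field and `[UFW BLOCK]` tag in each log line; the `top_sources` helper and the sample lines are made up for illustration.

```python
import re
from collections import Counter

SRC_PATTERN = re.compile(r"\bSRC=(\S+)")


def top_sources(lines, n=10):
    """Count blocked hits per source IP; return the n most frequent as (ip, count)."""
    counts = Counter()
    for line in lines:
        # Only count blocked packets, skipping any other UFW entries.
        if "[UFW BLOCK]" not in line:
            continue
        match = SRC_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)


# Made-up sample lines using documentation-range addresses.
SAMPLE_LINES = [
    "Sep  8 06:25:12 myhost kernel: [UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.7 DST=198.51.100.2 PROTO=TCP SPT=40000 DPT=22",
    "Sep  8 06:25:14 myhost kernel: [UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.9 DST=198.51.100.2 PROTO=TCP SPT=40001 DPT=3389",
    "Sep  8 06:25:15 myhost kernel: [UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.7 DST=198.51.100.2 PROTO=TCP SPT=40002 DPT=22",
    "Sep  8 06:25:16 myhost kernel: [UFW ALLOW] IN=eth0 OUT= SRC=203.0.113.9 DST=198.51.100.2 PROTO=TCP SPT=40003 DPT=80",
]

if __name__ == "__main__":
    for ip, count in top_sources(SAMPLE_LINES):
        print(f"{ip} | {count}")
```

For a real log, an open file handle can be passed in place of `SAMPLE_LINES`, since the function only needs something it can loop over line by line.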
To understand the regex a bit more, I used regex101.com to analyze the patterns. It was really useful to see a detailed explanation of what the patterns mean and what they matched. I still need to spend some time to understand what it's matching and how, but it was good to see what was happening.
header pattern match
param pattern match
I tried a few additional things in the process. I looked into Pandas: I installed it but didn’t get it going, though it did break my blog publishing pipeline. That was a fun gotcha this morning.
I also messed a bit with Google Apps Script and figured out how to tie in the IPStack API to add the location to a Google Sheet based on the IP value passed in.
All in all this was a fun rabbit hole to go down.