Mauserrifle.nl Tech and Life


Search in All Access Log Files on Web Server

At times it is useful to search for a pattern in the access log files of every domain on a web server. This is easily done by combining the commands find and grep, without complicated script loops.

Websites path

First, find the folder where all websites and their logs are located. On a multi-client server setup the common paths are:

  • /var/www/vhosts (Plesk)
  • /home (DirectAdmin)

On a custom Linux install you may find them in any folder you configured yourself, such as:

  • /var/www
  • /home/domains

When there are no specific log folders configured in the vhosts, the main log location can be used:

  • /var/log/apache2 (Debian)
  • /var/log/httpd (CentOS)
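
If you are unsure which of these folders exist on a given server, a small helper can check them. This is a minimal sketch: existing_log_roots is a made-up name, and the default list is simply the paths mentioned above.

```shell
# Print each candidate directory that exists on this machine.
# With no arguments, the paths listed in this post are checked.
existing_log_roots() {
    if [ "$#" -eq 0 ]; then
        set -- /var/www/vhosts /home /var/www /home/domains \
               /var/log/apache2 /var/log/httpd
    fi
    for dir in "$@"; do
        [ -d "$dir" ] && printf '%s\n' "$dir"
    done
    return 0
}
```

Calling existing_log_roots without arguments prints only the folders that are present, which gives you the starting point for find.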

Scenario

We want to search the logs of all websites on a Plesk server for the IP address 172.217.17.78. We want to exclude all static files (.jpg, .png, .css, etc.) so we are left with actual page visit behavior, such as visiting /home and /about.

The command below gets the job done:

find /var/www/vhosts \
 \( \
-name "*access*log*" -o \
-name "*combined*" -o \
-name "*request*"  \
\)  \
-exec zgrep -iH '172\.217\.17\.78' {} \; \
| grep -vE "(\.gif|\.jpg|\.png|\.swf|\.ico|\.txt|\.xml|\.css|\.js|\.rss|\.svg|admin-ajax|download\.php|piwik|analytics)" \
| tee /tmp/output.txt

Explained in detail

  • find matches log files against three file name patterns, combined with the OR operator -o
  • zgrep is executed for each file found to filter all traffic from the specific IP. zgrep can read compressed log files (.gz).
    • -i makes the match case-insensitive
    • -H prints the file name in front of each matching line
  • grep is used to filter out the static files
    • -v inverts the match
    • -E enables extended regular expressions
  • tee outputs the result to the terminal and a temporary file
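
The pieces explained above can also be wrapped in a function, so the folder and IP address become parameters. This is only a sketch under a few assumptions: search_access_logs is a made-up name, -type f is added so that only regular files are passed to zgrep, and the dots in the IP are escaped with sed instead of by hand.

```shell
# search_access_logs ROOT IP
# ROOT: folder to search, e.g. /var/www/vhosts
# IP:   dotted IPv4 address, e.g. 172.217.17.78
search_access_logs() {
    root=$1
    # Escape the dots so they match literally instead of "any character".
    ip_pattern=$(printf '%s' "$2" | sed 's/\./\\./g')
    find "$root" -type f \
        \( -name '*access*log*' -o -name '*combined*' -o -name '*request*' \) \
        -exec zgrep -H "$ip_pattern" {} + \
    | grep -vE '\.(gif|jpg|png|swf|ico|txt|xml|css|js|rss|svg)|admin-ajax|download\.php|piwik|analytics'
}
```

For the scenario in this post the call would be search_access_logs /var/www/vhosts 172.217.17.78 | tee /tmp/output.txt. Keep in mind that the grep filter looks at the whole output line, so a log file path containing one of the excluded strings would be filtered out as well.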

The output of the command is an overview of actual page visits from the IP, which we can analyze by eye or narrow down with additional grep filters.

/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:15 +0100] "GET / HTTP/2.0" 200 4095 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:17 +0100] "GET /results HTTP/2.0" 200 11251 "https://website.nl/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:18 +0100] "GET /about HTTP/2.0" 200 5386 "https://website.nl/results" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:19 +0100] "GET /api HTTP/2.0" 200 3513 "https://website.nl/about" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
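
When the output gets too long to read by eye, the file written by tee can be summarized per requested path. The sketch below assumes the log format shown above, where the request ("GET /about HTTP/2.0") is the first double-quoted field on each line; count_paths is a made-up helper name.

```shell
# count_paths FILE
# Print "<hits> <path>" for every requested path in FILE, most requested first.
count_paths() {
    awk -F'"' '
        { split($2, req, " "); hits[req[2]]++ }    # req[2] is the path
        END { for (p in hits) print hits[p], p }
    ' "$1" | sort -rn
}
```

Running count_paths /tmp/output.txt after the search shows at a glance which pages the IP visited most often.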


Hope this helps!
