Search in All Access Log Files on Web Server
At times it is useful to search for a pattern in the access log files of every domain on a web server. This can easily be done by combining the commands find and grep, without complicated script loops.
Websites path
First it is important to find the folder where all websites and logs are located. On a multi-client server setup, the common paths are:
/var/www/vhosts (Plesk)
/home (DirectAdmin)
On a custom Linux install you may find them in any folder you configured yourself, such as:
/var/www
/home/domains
When there are no specific log folders configured in the vhosts, the main log location can be used:
/var/log/apache2 (Debian)
/var/log/httpd (CentOS)
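To see which of these locations actually exist on a given server, a quick check along the following lines can help. This is only a sketch: the function name existing_log_roots is made up for illustration, and the directory list simply repeats the paths mentioned above, so add your own custom paths where needed.

```shell
#!/bin/sh
# existing_log_roots DIR...: print each argument that is an existing directory.
# (Hypothetical helper name, used for illustration only.)
existing_log_roots() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Candidate locations taken from the lists above.
existing_log_roots /var/www/vhosts /home /var/www /home/domains \
    /var/log/apache2 /var/log/httpd
```

Every directory that is printed is a candidate starting point for the find command.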
Scenario
We want to search the logs of multiple websites on a Plesk server for the IP address 172.217.17.78. We want to exclude all static files (.jpg, .png, .css, etc.) so that we are left with actual page visits, such as /home and /about.
The command below gets the job done:
find /var/www/vhosts \
\( \
-name "*access*log*" -o \
-name "*combined*" -o \
-name "*request*" \
\) \
-exec zgrep -iH '172\.217\.17\.78' {} \; \
| grep -vE "(\.gif|\.jpg|\.png|\.swf|\.ico|\.txt|\.xml|\.css|\.js|\.rss|\.svg|admin-ajax|download\.php|piwik|analytics)" \
| tee /tmp/output.txt
Explained in detail
find searches for log files using three file name patterns, combined with OR statements using -o.
zgrep is executed for each file found and filters all traffic from the specific IP. zgrep can also read compressed log files (.gz).
  -i ignores case
  -H prints the file name with each match
grep is used to filter out the static files.
  -v inverts the match
  -E enables extended regular expressions
tee writes the result to both the terminal and a temporary file.
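If the same search is needed for other addresses or other servers, the pipeline can be wrapped in a small function. The sketch below is one possible way to do that, not a fixed recipe: the name search_logs is hypothetical, the exclusion list is shortened for readability, and -type f and the sed step that escapes the dots in the IP are additions that are not part of the command above.

```shell
#!/bin/sh
# search_logs ROOT IP: search all access logs below ROOT for IP and
# drop requests for static files. (Hypothetical wrapper for illustration.)
search_logs() {
    root=$1
    # Escape the dots so they match literally instead of "any character".
    pattern=$(printf '%s' "$2" | sed 's/\./\\./g')
    find "$root" -type f \
        \( -name "*access*log*" -o -name "*combined*" -o -name "*request*" \) \
        -exec zgrep -H "$pattern" {} \; \
        | grep -vE '\.(gif|jpg|png|ico|css|js|svg)'
}

# Example call (commented out; adjust the path to your own server):
# search_logs /var/www/vhosts 172.217.17.78 | tee /tmp/output.txt
```

Keeping the root folder and the IP as arguments means the same function works for the Plesk, DirectAdmin and custom paths listed earlier.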
The result is an overview of actual page visits from the IP, which we can analyze by eye or narrow down further with additional grep filters:
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:15 +0100] "GET / HTTP/2.0" 200 4095 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:17 +0100] "GET /results HTTP/2.0" 200 11251 "https://website.nl/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:18 +0100] "GET /about HTTP/2.0" 200 5386 "https://website.nl/results" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
/var/www/vhosts/website.nl/logs/access_ssl_log.processed.3.gz:172.217.17.78 - - [19/Dec/2020:14:47:19 +0100] "GET /api HTTP/2.0" 200 3513 "https://website.nl/about" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
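As an example of such an extra filter, the saved output can be reduced to successful requests and summarized per requested path. This is a rough sketch that assumes the log lines look like the ones above (request between the first pair of double quotes, followed by the status code); the function name count_paths is made up for illustration.

```shell
#!/bin/sh
# count_paths FILE: keep only requests answered with status 200 and
# count how often each path was requested. (Hypothetical helper.)
count_paths() {
    grep '" 200 ' "$1" \
        | awk -F'"' '{ split($2, req, " "); print req[2] }' \
        | sort | uniq -c | sort -rn
}

# Example call on the file written by tee (commented out):
# count_paths /tmp/output.txt
```

The awk step splits each line on double quotes, so the second field holds the request ("GET /about HTTP/2.0") and its second word is the path.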
Hope this helps!