Capturing Baidu spider entries from Apache Web log file


How can I capture the Baidu spider entries from an Apache Web log file?

Here are some Web log file entries:

✍: Guest


Baidu Spider is a Web crawler that obtains content for the Baidu search engine. Baidu Spider uses the following user agent string:

Mozilla/5.0 (compatible; Baiduspider/2.0; +

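A regular expression to capture Baidu Spider entries from an Apache Web log file should be written with the multiline modifier "m" specified, so that "^" and "$" match at the start and end of each log entry rather than only at the start and end of the whole file.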
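As a rough illustration of the multiline approach, here is a minimal Python sketch. The pattern, the helper name baidu_entries, and the sample log entries are all assumptions made for this example and are not taken from the original answer; the entries imitate the Apache "combined" log format and leave the user agent URL elided.

```python
import re

# Hypothetical sample entries in Apache "combined" format (illustrative only).
LOG = """\
192.0.2.10 - - [04/Feb/2013:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +...)"
192.0.2.20 - - [04/Feb/2013:10:15:40 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/18.0"
192.0.2.11 - - [04/Feb/2013:10:16:01 +0000] "GET /robots.txt HTTP/1.1" 404 210 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +...)"
"""

# Assumed pattern: any whole line that contains the "Baiduspider" token.
# re.MULTILINE is Python's equivalent of the "m" modifier, so ^ and $
# anchor at every line boundary instead of only at the ends of the text.
BAIDU_LINE = re.compile(r"^.*Baiduspider.*$", re.MULTILINE)


def baidu_entries(text):
    """Return every log line whose text mentions Baiduspider."""
    return BAIDU_LINE.findall(text)


for entry in baidu_entries(LOG):
    print(entry)
```

Matching only on the "Baiduspider" token keeps the sketch independent of the version number and of the rest of the user agent string. A stricter pattern could anchor on the quoted user agent field instead.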


2013-02-04