Forum: Crawler, Spider, and User Agent ID
Displaying Topics 1 - 40 (71 total) Sorted by: Date-Last-Post, Direction: reverse
# Subject Date
1: new post   Where is grok bot? What is Grok Using for Web Sources?
...asking here if anyone has seen a grok-bot or xai-bot hit, and from where. I've done a few web-searches on this, absolutely no examples of what a grok UA looks like.
Aug 12, 2025
2: new post   Single Sign On and Identifying Search Bots
Websites that are switching to a metered paywall system and are trying to identify search bots.
Oct 14, 2022
3: new post   Server Farms July 2021
WebmasterWorld's member reports of data center IP ranges as they are discovered.
July 6, 2021
4: new post   Google Now Sending Notifications It's Crawling Over HTTP/2
Google is now notifying site owners that it is crawling their sites over HTTP/2.
Jan 14, 2021
5: new post   Server Farms: Continuing discussion of hosting company IP ranges
Server Farms 2020: Continuing discussion of hosting company IP ranges.
Feb 28, 2020
6: new post   Evergreen Googlebot Now Runs Latest Chromium Rendering Engine
Googlebot will now stay up to date with the Chrome browser.
May 13, 2019
7: new post   Server Farms - July 2018
WebmasterWorld's latest Server Farms records.
July 20, 2018
8: new post   Server Farms - 2018
WebmasterWorld's server farm listings, data centers and IP ranges.
Feb 12, 2018
9: new post   Most of Your Website Traffic is Not Human
"It's a disappointing but eye-opening statistic that most of the traffic to our websites is not from actual people." WebmasterWorld Members compare statistics and reveal surprising traffic levels.
July 7, 2017
10: new post   At Home with the Robots: 2017 edition
Comprehensive assessment and record of search engine robots.
May 16, 2017
11: new post   Server Farms - January 2017
WebmasterWorld's collation of data center IP ranges as they are discovered or change in the rapidly evolving assigned IP landscape.
Jan 28, 2017
12: new post   Server Farms - Update
This is where we report data center IP ranges as they are discovered or change in the rapidly evolving assigned IP landscape.
Oct 13, 2016
13: new post   Server Farms - January 2016
WebmasterWorld Members record and discuss the latest server farms.
Feb 29, 2016
14: new post   Detecting and Managing Sophisticated IP Spoofing
WebmasterWorld Members discuss the topic of IP spoofing and detection techniques.
Jan 14, 2016
15: new post   Server Farms - July 2015
WebmasterWorld's in-depth discussion and resource on server farms.
July 28, 2015
16: new post   Bot ID and Blocking GET xmlrpc.php To Avoid Exploit
WebmasterWorld Members discuss bot blocking to avoid xmlrpc exploit.
July 10, 2015
17: new post   Server Farms IP Tracking Resource - February 2015
WebmasterWorld members provide an extensive range of IP addresses and user agent names that can be used to keep vast numbers of scrapers off websites.
Feb 23, 2015
18: new post   At Home with the Robots: 2015 Edition
An extensive review of robots (web crawlers) and their behaviour: good, friendly, or unhealthy.
Feb 10, 2015
19: new post   Blocking non-North American Traffic Made Simple
Webmasters discuss how to make an amazingly small optimized IP block list that allows only North American traffic to access a website. The technique can easily be applied to other geographical areas.
Apr 23, 2014
20: new post   Googlebot Fails to Pass DNS Verification
WebmasterWorld members have reported that an apparently valid Googlebot is failing DNS verification. Major impact for sites relying on Googlebot validation.
Apr 2, 2014
21: new post   The User Agent Whitelist
WebmasterWorld members discuss methods for whitelisting good requests vs blacklisting bad requests.
Feb 7, 2014
22: new post   Dealing With WordPress Comment Spam Escalation
"Just to see what would happen I enabled full comments on my WordPress blog and at first I just let the comments pile up in the WordPress moderation queue as I was curious how bad it would get since nothing ever got published. It quickly ramped up from a few a day to 100s a day, peaking currently at over 500 spam posts a day."
Jan 22, 2014
23: new post   TECH UPDATE: Bots as Browsers Using JavaScript
"Tech briefing to bring those up to speed that aren't aware of the rapidly changing server side landscape including support for JavaScript thanks to Node.js."
Oct 31, 2013
24: new post   How to Identify and Block Fake BingBot Visits
How do you identify and block fake BingBot visits to your sites?
Apr 2, 2013
25: new post   Filtering Out Really Hard To Find Bad Bots
WebmasterWorld Members discuss how best to filter out unwanted, bad bots that are tough to find.
Jan 18, 2013
26: new post   Identifying Fake User Agent Strings
User agents come in all shapes and sizes. Some, like the fake Googlebots, are easy to recognize, but what about those really long ones? WebmasterWorld Members help clarify the identification process.
June 11, 2012
27: new post   How To Block Thousands of Spambot IPs Hitting a Site
WebmasterWorld members discuss the best methods of handling and blocking spambots with thousands of unique IP addresses hitting a site, causing bandwidth to rise from 1GB a month to 12GB a day.
Dec 12, 2011
28: new post   Microsoft Bot 157 Ranges Updated
The list of Microsoft bots in the 157. range has been updated.
Nov 16, 2011
29: new post   The Best Way to Keep All Spiders/Bots Out Of A Site
WebmasterWorld Members discuss the issue of stopping bots from crawling a site, and keeping them out. It seems it's tougher than you might think.
Oct 3, 2011
30: new post   Yahoo! Slurp Ignoring robots.txt
WebmasterWorld Members report that Yahoo's Slurp is ignoring robots.txt.
Sept 17, 2011
31: new post   Google Messing Up Javascript Stats
This is the first time in nearly 9 years I've seen G blatantly disregard robots.txt and they're doing it with a GoogleBot UA.
May 16, 2011
32: new post   Stopping Scrapers From The Start
"I'm putting a *huge* number of pages of content online. I'm looking to stop the scraping/copying/bots from the outset and I need bandwidth kept to a minimum."
Feb 25, 2011
33: new post   Google's Web Preview Spider
"WebmasterWorld Members discuss the Web Preview Spider, whether it obeys robots.txt, and how to block it."
Nov 19, 2010
34: new post   Now Seeing Bingbot
"Bingbot is now in the wild."
Sept 29, 2010
35: new post   Fresh IP's in MSN's Many Cloaked Bot Arsenal
"No UA, no robots.txt, no REF, no nothing. Not once. Not twice. Not even three times. Try eleven."
Sept 3, 2010
36: new post   Casper Bot Search Attempting To Infect Sites
"Seen quite a few of these over the past few days, generally in groups of half a dozen-ish."
July 7, 2010
37: new post   MSNbot Changing to Bingbot on Oct.1, 2010
"we will drop the beta designation from the Bing crawler and change the name of the crawler to reflect Microsoft's new brand for search."
June 29, 2010
38: new post   The Staggering Number of Tweet Chasing Bots
Up to 20 bots are now following the Twitter firehose feed.
May 8, 2010
39: new post   Facebook Sues Data Scraper
"Warden gathered that data from public profiles using 'crawling' software similar to what's commonly available on the Web..."
Apr 4, 2010
40: new post   Updating The htaccess Bot Ban list
"I'm sure most of us are familiar with the classic bad bot ban list for .htaccess that gets copied and pasted wholesale from web developer forum to forum (e.g.: http://www.webmasterworld.com/forum13/687.htm )"
Dec 29, 2009