Forum Moderators: open
Well, this is a common misconception: many people who do SEO professionally still think that Google doesn't understand JavaScript. If you are one of them, please go to Google, search for "How do search engines' bots handle javascript?", and read the first result thoroughly.
That document shows this is not the case: Google understands most JavaScript, apart from code connected with the DOM (at least as of a year ago). Please say so if you find something wrong with the article or don't believe it.
So NOW, what are the recommendations for using Ajax and/or JavaScript in order to keep some duplicate content out of a page?
I can think of 3 main approaches:
1. Including the duplicate content in an external JS file, assigning it to variables, and writing it into some divs via innerHTML. (old style)
2. Using XMLHttpRequest (GET) to retrieve the data in XML format and then insert it into the page.
3. Doing an Ajax POST and retrieving the XML content that way. (Slower, as the page will be processed twice by the server, but this seems the safest, as in my research Google cannot make POST requests.)
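For what it's worth, approaches 1 and 2 might look roughly like this. Everything below is a sketch: the variable names, element ids, and the `/boilerplate.xml` URL are made up for illustration, and the browser-only wiring is guarded at the end.

```javascript
// Approach 1 (old style): the duplicate text lives in an external .js
// file and is written into the page with innerHTML.
var BOILERPLATE = "Standard shipping takes 3-5 business days.";

function injectBoilerplate(el) {
  el.innerHTML = "<p>" + BOILERPLATE + "</p>";
}

// Approach 2: fetch the text with an XMLHttpRequest GET instead.
// (For approach 3 you would change "GET" to "POST" and send whatever
// request body your server-side script expects.)
function loadBoilerplate(url, el) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // assume the server returns something like <text>...</text>
      var node = xhr.responseXML.getElementsByTagName("text")[0];
      el.innerHTML = node.firstChild.nodeValue;
    }
  };
  xhr.send(null);
}

// Browser-only wiring, guarded so the sketch is harmless elsewhere:
if (typeof document !== "undefined") {
  injectBoilerplate(document.getElementById("shipping-note"));
  loadBoilerplate("/boilerplate.xml", document.getElementById("footer-note"));
}
```

Either way, the duplicate text never appears in the static HTML source of the page itself.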
A fourth method would be using an encoding that Google can't decode (I would know how to do that), but I am reluctant to cheat, or even to appear to be cheating, as I care about my site... so I don't want to use it.
I think this is a very important and complex topic. What do you think is the best approach to get rid of duplicate content? Do you think Google could get suspicious of any of these approaches? I don't mind Google reading the content, but I would like it not to weigh that text as heavily as the unique text on the page.
Please give me your opinion even if you are not sure or haven't tested it; I would like to hear all opinions, and maybe we can find a good answer.
Thanks!
However, I suspect (without proof) that bots go trawling for any URLs found within scripts. For instance, if I happen to inject this script into a page:
<script>
var dummy = "http://www.example.com/dummypage.html";
</script>
My theory is that Googlebot will visit that URL and try to index it, even though it's just a string in a JS variable, never actually used for anything.
'Twould be a simple hypothesis to test...
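Indeed. One way to run the test (sketched below; the domain and path are placeholders): generate a one-off URL that exists nowhere else on the web, drop it into a page as an unused JS string exactly like the snippet above, and then watch the server access logs to see whether Googlebot ever requests it.

```javascript
// Build a beacon URL that appears nowhere except inside a <script>
// block, as an unused string variable. If a bot later requests it,
// that's evidence bots harvest URLs out of scripts.
function makeBeaconUrl(base) {
  var token = Date.now().toString(36) + "-" +
              Math.floor(Math.random() * 1e9).toString(36);
  return base + "/beacon-" + token + ".html";
}

var dummy = makeBeaconUrl("http://www.example.com");
// embed `dummy` in the page, then grep the access log for
// "beacon-" over the following weeks
```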
So NOW, what are the recommendations for using Ajax and/or JavaScript in order to keep some duplicate content out of a page?
You could make a Perl or PHP script that prints the JavaScript... The script should detect whether the visitor is a bot or a browser, and if it is a bot you could print spaces... or a custom message.
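As a sketch of that idea (the bot list and user-agent strings below are illustrative only, and be aware that serving bots different content is cloaking, which carries its own risk):

```javascript
// Very rough bot check by user-agent substring. Real bot detection
// is harder (user-agent strings can be spoofed); this is only a sketch.
var BOT_PATTERNS = ["googlebot", "slurp", "msnbot"];

function looksLikeBot(userAgent) {
  var ua = String(userAgent || "").toLowerCase();
  for (var i = 0; i < BOT_PATTERNS.length; i++) {
    if (ua.indexOf(BOT_PATTERNS[i]) !== -1) return true;
  }
  return false;
}

// The server-side script would then decide what JavaScript to print:
function scriptBody(userAgent) {
  return looksLikeBot(userAgent)
    ? "/* nothing to see here */"
    : 'document.getElementById("note").innerHTML = "...";';
}
```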
Use iframes; the contents of an iframe theoretically wouldn't "count" as being on the same URL as the page which contains it. I'm using this method on a few sites.
Iframes are treated similarly to links from the host page to the one displayed. Content is indexed only for the URL in the frame. You can use the iframe on as many pages as you want, and it will promote the importance of its target; on my sites the target even has PageRank with no actual links to it.
And once the content is separated this way, you can decide whether or not to add noindex or nocache. I haven't done this, but it only makes sense that it would work as it does on any other URL.
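In HTML terms (the file name `/boilerplate.html` is hypothetical), that might look like:

```html
<!-- host page: the boilerplate lives at its own URL -->
<iframe src="/boilerplate.html" width="600" height="120"></iframe>

<!-- boilerplate.html itself, if you decide to keep it out of the index -->
<html>
<head>
  <meta name="robots" content="noindex, noarchive">
</head>
<body>
  <p>Standard terms and conditions text...</p>
</body>
</html>
```

The same boilerplate file can then be framed from every page that needs it, while only its own URL carries the text.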
I guess the fact that I've never seen (or, more accurately, never noticed) these in the SERPs might be evidence to the contrary, but just because something doesn't rank well doesn't mean it wouldn't throw a wrench into my SEO.
Indexing content that is only accessible via JavaScript is a double-edged sword. On one hand, it makes all that AJAXian content findable. On the other, it could potentially fill the SERPs with kabillions of little scraps of noise.
I would invite comments from the experts.