Forum Moderators: open
Well, this is a common misconception: many people who do SEO professionally still think that Google doesn't understand JavaScript. If you are one of them, please go to Google, search for "How do search engines' bots handle javascript?", and read the first result thoroughly.
That document shows this is not the case: Google understands most JavaScript, apart from code connected with the DOM (at least as of a year ago). Please say so if you find something wrong with the article or don't believe it.
So NOW, what are the recommendations for using Ajax and/or JavaScript in order to keep some duplicate content out of a page?
I can think of 3 main approaches:
1. Including the duplicate content in an external JS file, assigning it to variables, and writing it into some divs via innerHTML. (old style)
2. Using XMLHttpRequest (GET) to retrieve the data in XML format and then insert it into the page.
3. Doing an Ajax POST and retrieving the XML content that way. (Slower, as the page will be processed twice by the server, but this seems the safest, as in my research Google cannot make POST requests.)
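For what it's worth, approaches 1 and 2 might look roughly like this. Everything below is a sketch: the variable names, element ids, and the `/boilerplate.xml` URL are made up for illustration, and the browser-only wiring is guarded at the end.

```javascript
// Approach 1 (old style): the duplicate text lives in an external .js
// file and is written into the page with innerHTML.
var BOILERPLATE = "Standard shipping takes 3-5 business days.";

function injectBoilerplate(el) {
  el.innerHTML = "<p>" + BOILERPLATE + "</p>";
}

// Approach 2: fetch the text with an XMLHttpRequest GET instead.
// (For approach 3 you would change "GET" to "POST" and send whatever
// request body your server-side script expects.)
function loadBoilerplate(url, el) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // assume the server returns something like <text>...</text>
      var node = xhr.responseXML.getElementsByTagName("text")[0];
      el.innerHTML = node.firstChild.nodeValue;
    }
  };
  xhr.send(null);
}

// Browser-only wiring, guarded so the sketch is harmless elsewhere:
if (typeof document !== "undefined") {
  injectBoilerplate(document.getElementById("shipping-note"));
  loadBoilerplate("/boilerplate.xml", document.getElementById("footer-note"));
}
```

Either way, the duplicate text never appears in the static HTML source of the page itself.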
A fourth method would be using an encoding that Google can't decode (I would know how to do that), but I am reluctant to cheat, or even to appear to be cheating, as I care about my site... so I don't want to use it.
I think this is a very important and complex topic. What do you think is the best approach to get rid of duplicate content? Do you think Google could get suspicious of any of these approaches? I don't mind Google reading the content, but I would like it not to weigh that text as heavily as the unique text on the page.
Please give me your opinion even if you are not sure or haven't tested it; I would like to hear all opinions, and maybe we can find a good answer.
Thanks!
However, I suspect (without proof) that bots go trawling for any URLs found within scripts. For instance, if I happen to inject this script into a page:
<script>
var dummy = "http://www.example.com/dummypage.html";
</script>
My theory is that Googlebot will visit that URL and try to index it, even though it's just a string in a JS variable, never actually used for anything.
'Twould be a simple hypothesis to test...
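Indeed. One way to run the test (sketched below; the domain and path are placeholders): generate a one-off URL that exists nowhere else on the web, drop it into a page as an unused JS string exactly like the snippet above, and then watch the server access logs to see whether Googlebot ever requests it.

```javascript
// Build a beacon URL that appears nowhere except inside a <script>
// block, as an unused string variable. If a bot later requests it,
// that's evidence bots harvest URLs out of scripts.
function makeBeaconUrl(base) {
  var token = Date.now().toString(36) + "-" +
              Math.floor(Math.random() * 1e9).toString(36);
  return base + "/beacon-" + token + ".html";
}

var dummy = makeBeaconUrl("http://www.example.com");
// embed `dummy` in the page, then grep the access log for
// "beacon-" over the following weeks
```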
So NOW, what are the recommendations for using Ajax and/or JavaScript in order to keep some duplicate content out of a page?
You could make a Perl or PHP script that prints the JavaScript... The script should detect whether the visitor is a bot or a browser, and if it is a bot you could print spaces... or a custom message.
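As a sketch of that idea (the bot list and user-agent strings below are illustrative only, and be aware that serving bots different content is cloaking, which carries its own risk):

```javascript
// Very rough bot check by user-agent substring. Real bot detection
// is harder (user-agent strings can be spoofed); this is only a sketch.
var BOT_PATTERNS = ["googlebot", "slurp", "msnbot"];

function looksLikeBot(userAgent) {
  var ua = String(userAgent || "").toLowerCase();
  for (var i = 0; i < BOT_PATTERNS.length; i++) {
    if (ua.indexOf(BOT_PATTERNS[i]) !== -1) return true;
  }
  return false;
}

// The server-side script would then decide what JavaScript to print:
function scriptBody(userAgent) {
  return looksLikeBot(userAgent)
    ? "/* nothing to see here */"
    : 'document.getElementById("note").innerHTML = "...";';
}
```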
Use iframes; the contents of an iframe theoretically wouldn't "count" as being on the same URL as the page which contains it. I'm using this method on a few sites.
Iframes are treated similarly to links from the host page to the one displayed. Content is indexed only for the URL in the frame. You can use the iframe on as many pages as you want, and it will promote the importance of its target; on my sites the target even has PageRank with no actual links to it.
And once the content is separated this way, you can decide whether or not to add noindex or nocache. I haven't done this, but it only makes sense that it would work as it does on any other URL.
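In HTML terms (the file name `/boilerplate.html` is hypothetical), that might look like:

```html
<!-- host page: the boilerplate lives at its own URL -->
<iframe src="/boilerplate.html" width="600" height="120"></iframe>

<!-- boilerplate.html itself, if you decide to keep it out of the index -->
<html>
<head>
  <meta name="robots" content="noindex, noarchive">
</head>
<body>
  <p>Standard terms and conditions text...</p>
</body>
</html>
```

The same boilerplate file can then be framed from every page that needs it, while only its own URL carries the text.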
I guess the fact that I've never seen (or, more accurately, never noticed) these in the SERPs might be evidence to the contrary, but just because something doesn't rank well doesn't mean it wouldn't throw a wrench into my SEO.
Indexing content that is only accessible via JavaScript is a double-edged sword. On one hand, it makes all that AJAXian content findable. On the other, it could potentially fill the SERPs with kabillions of little scraps of noise.
I would invite comments from the experts.