Google Suggests: Avoid…
Date Added: November 11, 2007 12:34:55 PM
Google suggests: avoid hidden text or hidden links, cloaking, sneaky redirects, doorway pages, automated queries to Google, loading pages with irrelevant keywords, creating multiple pages, subdomains, or domains with substantially duplicate content, and creating pages that install viruses, trojans, or other badware.

Hidden text and links

Hiding text or links in your content can cause your site to be perceived as untrustworthy since it presents information to search engines differently than to visitors. Text (such as excessive keywords) can be hidden in several ways, including:

• Using white text on a white background

Hidden links are links that are intended to be crawled by Googlebot, but are unreadable to humans because:

• The link consists of hidden text (for example, the text color and background color are identical).

If your site is perceived to contain hidden text and links that are deceptive in intent, your site may be removed from the Google index, and will not appear in search results pages. When evaluating your site to see if it includes hidden text or links, look for anything that’s not easily viewable by visitors of your site. Are any text or links there solely for search engines rather than visitors?

If you’re using text to try to describe something search engines can’t access - for example, JavaScript, images, or Flash files - remember that many human visitors using screen readers, mobile browsers, browsers without plug-ins, and slow connections will not be able to view that content either. Using descriptive text for these items will improve the accessibility of your site. You can test accessibility by turning off JavaScript, Flash, and images in your browser, or by using a text-only browser such as Lynx. Some tips on making your site accessible include:

• Images: Use the alt attribute to provide descriptive text. In addition, we recommend using a human-readable caption and descriptive text around the image.
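The white-text-on-white-background case lends itself to a rough automated check when you review your own pages. The sketch below is one possible approach and not a Google tool. It assumes colors are set through inline style attributes and written identically in both places (for example #fff and #fff), so it will miss text hidden through external stylesheets, equivalent color spellings, or positioning tricks. The names parse_inline_style, HiddenTextFinder, and find_hidden_text are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they cannot wrap text.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def parse_inline_style(style):
    """Turn 'color: #fff; background: #fff' into a dict of lowercase properties."""
    props = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


class HiddenTextFinder(HTMLParser):
    """Collects text inside elements whose inline text color equals their inline background."""

    def __init__(self):
        super().__init__()
        self.hidden_tag = None  # tag name that opened the current hidden region
        self.depth = 0          # nesting depth of that tag name inside the region
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if self.hidden_tag is not None:
            # Already inside a hidden region: only track nesting of the same tag.
            if tag == self.hidden_tag:
                self.depth += 1
            return
        if tag in VOID_TAGS:
            return
        style = parse_inline_style(dict(attrs).get("style") or "")
        background = style.get("background-color") or style.get("background")
        if "color" in style and style["color"] == background:
            self.hidden_tag = tag
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == self.hidden_tag:
            self.depth -= 1
            if self.depth == 0:
                self.hidden_tag = None

    def handle_data(self, data):
        if self.hidden_tag is not None and data.strip():
            self.findings.append(data.strip())


def find_hidden_text(html):
    """Return the text fragments that the inline-style check flags as hidden."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = (
        '<p>Welcome to our shop.</p>'
        '<span style="color: #FFF; background-color: #fff">cheap widgets</span>'
    )
    print(find_hidden_text(sample))
```

Anything this check reports still needs a human look: matching colors can be legitimate, for instance when a background image sits behind the text.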
If you do find hidden text or links on your site, either remove them or, if they are relevant for your site’s visitors, make them easily viewable.

Cloaking

Cloaking refers to the practice of presenting different content or URLs to users and search engines. Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index. Some examples of cloaking include:

• Serving a page of HTML text to search engines, while showing a page of images or Flash to users.

If your site contains elements that aren’t crawlable by search engines (such as Flash, JavaScript, or images), you shouldn’t provide cloaked content to search engines. Rather, you should consider visitors to your site who are unable to view these elements as well. For instance:

• Provide alt text that describes images for visitors with screen readers or images turned off in their browsers.

Ensure that you provide the same content in both elements (for instance, provide the same text in the JavaScript as in the noscript tag). Including substantially different content in the alternate element may cause Google to take action on the site.

Sneaky JavaScript redirects

When Googlebot indexes a page containing JavaScript, it will index that page but it cannot follow or index any links hidden in the JavaScript itself. Use of JavaScript is an entirely legitimate web practice. However, use of JavaScript with the intent to deceive search engines is not. For instance, placing different text in JavaScript than in a noscript tag violates our webmaster guidelines because it displays different content for users (who see the JavaScript-based text) than for search engines (which see the noscript-based text). Along those lines, it violates the webmaster guidelines to embed a link in JavaScript that redirects the user to a different page with the intent to show the user a different page than the search engine sees.
When a redirect link is embedded in JavaScript, the search engine indexes the original page rather than following the link, whereas users are taken to the redirect target. Like cloaking, this practice is deceptive because it displays different content to users and to Googlebot, and can take a visitor somewhere other than where they intended to go.

Note that placing links within JavaScript is not, by itself, deceptive. When examining JavaScript on your site to ensure your site adheres to our guidelines, consider the intent. Keep in mind that since search engines generally can’t access the contents of JavaScript, legitimate links within JavaScript will likely be inaccessible to them (as well as to visitors without JavaScript-enabled browsers). You might instead keep links outside of JavaScript or replicate them in a noscript tag.

Doorway pages

Doorway pages are pages specifically made for search engines. Doorway pages contain many links - often several hundred - that are of little to no use to the visitor, and do not contain valuable content. HTML sitemaps are a valuable resource for your visitors, but ensure that these pages of links are easy for your visitors to navigate. If you have a number of links to include, consider organizing them into categories or into multiple pages. But in doing so, ensure that they are intended for visitors to navigate the sections of your site, and not simply for search engines.

Google’s aim is to give our users the most valuable and relevant search results. Therefore, we frown on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the ones they selected and that provide content solely for the benefit of search engines. Sites making use of these practices may be removed from the Google index, and will not appear in Google search results.
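To find pages of links that may have grown too large for visitors to navigate, a simple link count per page can be a starting point. The sketch below is only an illustration: the max_links threshold of 100 is an arbitrary assumption, not a figure from Google, and LinkCounter and link_report are hypothetical names. A high count does not make a page a doorway page; it only marks a page worth reviewing and perhaps splitting into categories.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts anchor tags that carry a non-empty href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.links += 1


def link_report(html, max_links=100):
    """Return the link count and whether it exceeds the assumed review threshold."""
    counter = LinkCounter()
    counter.feed(html)
    counter.close()
    return {"links": counter.links, "needs_review": counter.links > max_links}


if __name__ == "__main__":
    page = '<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'
    print(link_report(page, max_links=1))
```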
Automated queries

Google’s Terms of Service do not allow the sending of automated queries of any sort to our system without express permission in advance from Google. Sending automated queries absorbs resources. This includes using any software (such as WebPosition Gold™) to send automated queries to Google to determine how a website or webpage ranks in Google search results for various queries.

Keyword stuffing

“Keyword stuffing” refers to the practice of loading a webpage with keywords in an attempt to manipulate a site’s ranking in Google’s search results. Filling pages with keywords results in a negative user experience, and can harm your site’s ranking. Focus on creating useful, information-rich content that uses keywords appropriately and in context.

To fix this problem, review your site for misused keywords. Typically, these will be lists or paragraphs of keywords, often randomly repeated. Check carefully, because keywords can often be in the form of hidden text, or they can be hidden in title tags or alt attributes.

Duplicate content

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:

• Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices

However, in some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results. Google tries hard to index and show pages with distinct information.
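If you want a rough idea of which pages on your own site "completely match" or are "appreciably similar", you can compare their extracted text pairwise. The sketch below uses Python's standard difflib module and says nothing about how Google detects duplicates. The 0.9 threshold, the function names, and the sample URLs are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Word-level similarity between two texts, from 0.0 (nothing shared) to 1.0 (identical)."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequent words from being ignored on long pages.
    return SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()


def find_near_duplicates(pages, threshold=0.9):
    """pages maps a URL to its extracted text; returns (url, url, score) for similar pairs."""
    urls = sorted(pages)
    pairs = []
    for index, first in enumerate(urls):
        for second in urls[index + 1:]:
            score = similarity(pages[first], pages[second])
            if score >= threshold:
                pairs.append((first, second, score))
    return pairs


if __name__ == "__main__":
    # Hypothetical pages: a regular article, its printer version, and an unrelated page.
    pages = {
        "/article": "the quick brown fox jumps over the lazy dog",
        "/article/print": "the quick brown fox jumps over the lazy dog",
        "/about": "completely different words here",
    }
    for first, second, score in find_near_duplicates(pages):
        print(first, second, round(score, 2))
```

Comparing every pair grows quadratically with the number of pages, so this is only practical for small sites or samples of pages.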
Filtering for distinct pages means, for instance, that if your site has a “regular” and “printer” version of each article, and neither of these is blocked in robots.txt or with a noindex meta tag, we’ll choose one of them to list. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.

There are some steps you can take to proactively address duplicate content issues, and ensure that visitors see the content you want them to.

• Consider blocking pages from indexing: Rather than letting Google’s algorithms determine the “best” version of a document, you may wish to help guide us to your preferred version. For instance, if you don’t want us to index the printer versions of your site’s articles, disallow those directories or make use of regular expressions in your robots.txt file.

• Minimize boilerplate repetition: For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details.

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don’t follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

Installing viruses and other badware

Sites that exploit browser security holes to install software (such as malware, spyware, viruses, adware, and trojan horses) are in violation of the Google quality guidelines, and may be removed from Google’s index.
In many cases, websites that install badware have themselves been targeted by hackers, and you may not be aware that your site has been compromised until you receive a notification from Google. If your site has been flagged for badware, we’ll display a warning to visitors, and we’ll tell you about it in the Site summary page of Webmaster Tools. If your site has been flagged, review StopBadware.org’s guidelines for websites, and then request a review of your site. To request a review:

1. On the Dashboard in Webmaster Tools, click the site you want reviewed.

Unfortunately, cleaning up a compromised site can be very difficult. Simply cleaning up the HTML files is seldom sufficient. If a rootkit has been installed, for instance, nothing short of wiping the machine and starting over may work. Even then, if the underlying security hole isn’t also fixed, your site could be compromised again within minutes.

Source: Google Webmaster Help Center