Results 1 to 15 of 16
  1. #1

    SimpleSi

    making a webpage invisible to Google

    I've got a webpage on a server that I'd rather Google didn't index!

    (Don't worry - it's just my simple PHP helpdesk.)

    Is there a way of telling search engines to go away and not list it?

    AFAIK it's not linked from any other webpage.

    regards

    Simon

  3. #2

    plexer

  4. #3
    Chunky

    Solution found by sticking "code to avoid google indexer" into Google...

    (first link)

    » 6 ways to stop Google and other search engines from indexing your site | Antezeta Web Marketing

    Some good info there.

    Chunks

  5. #4

    sonofsanta

    You'll need a file called robots.txt in the web root of your site. You can use it to tell spiders explicitly what they can and can't look at. It's dead easy to do and works with all the major search engines.

    Read up about it, and how to do it, at http://www.robotstxt.org/robotstxt.html
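
    For illustration, a minimal robots.txt might look like the sketch below. The /helpdesk/ path is an assumption (the thread never gives the real one); the file itself has to sit at the top level of the site, i.e. at /robots.txt.

    ```text
    # Applies to every crawler that honours robots.txt
    User-agent: *
    # Hypothetical path: replace with wherever the helpdesk actually lives
    Disallow: /helpdesk/
    ```

    "Disallow: /" on its own would instead ask crawlers to stay away from the whole site.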

  6. #5

    ZeroHour

    Do the robots.txt, but I also add an extra layer myself with this piece of PHP code:
    Code:
    // Full User-Agent strings of the crawlers to turn away.
    // in_array() matches the whole string, so a bot that changes its version is not caught.
    $badAgents = array(
        'Mediapartners-Google',
        'msnbot-NewsBlogs/1.1 (+http://search.msn.com/msnbot.htm)',
        'Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)',
        'Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)',
        'msnbot/2.0b (+http://search.msn.com/msnbot.htm)',
        'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
        'msnbot/1.1 (+http://search.msn.com/msnbot.htm)',
        'Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)',
    );
    // The header can be missing altogether, so fall back to an empty string.
    $agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    if (in_array($agent, $badAgents, true)) {
        // BLAH is a placeholder for wherever you want to send the bots.
        header("Location: http://BLAH");
        exit;
    }
    That just redirects the bots away from your site.

  7. #6
    p858snake

    Just remember the robots file is a de facto standard: most bots follow it, but not all of them do.

    The best way to stop something being spidered when you don't want it to be is to make sure the bots can't access it, for example by applying security to the folder(s).

  8. #7

    SimpleSi

    Ta for info

    Is there a way of finding out where Google might have found the link in the first place?

    (I probably put it in a PDF/Word doc that is in one of my Docman folders, but I don't want to have to read all of them.)

    regards

    Simon

  9. #8
    pwds

    Quote: Originally Posted by SimpleSi
    Is there a way of finding out where Google might have found the link in the first place?
    IIRC Google Webmaster Tools will show you the incoming links to your site. You have to sign up and verify site ownership (usually by placing a file in the site root or adding a meta tag).

  10. #9

    SimpleSi

    Making a robots.txt file seems to have done the trick - my helpdesk has gone dark again

    regards

    Simon

  11. #10
    SteveBentley

    To reiterate a point made earlier: "good" spiders obey robots.txt, but less reputable ones won't. If you don't want the outside world to see parts of the site, make sure they can't see the site. Taking the page off Google isn't going to cut it.

  12. #11

    Michael

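    The only problem with robots.txt is that it can be used maliciously: the file is publicly readable, so it advertises every path you list in it. For example, I'd never list the admin directory for my websites in robots.txt.

    Anyone could type:
    Code:
    www.domainexample.com/robots.txt
    and read off the paths you were trying to hide.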

  13. #12
    p858snake

    Quote: Originally Posted by SteveBentley
    To reiterate a point made earlier: "good" spiders obey robots.txt, but less reputable ones won't. If you don't want the outside world to see parts of the site, make sure they can't see the site. Taking the page off Google isn't going to cut it.
    *Cough* Microsoft's bot, although you will know because of other things when it starts to index *cough*

    Just look it up and you will see what it means.

  14. #13

    SimpleSi

    Quote:
    If you don't want the outside world to see parts of the site, make sure they can't see the site.
    ...and your suggestion to do this is?

    regards

    Simon

  15. #14
    SteveBentley

    Quote: Originally Posted by SimpleSi
    ...and your suggestion to do this is?
    Password-protect it, or just allow a specific IP range if it's purely for internal use.
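
    As a rough sketch of both options, assuming the helpdesk runs on Apache 2.4 with .htaccess overrides enabled (the thread doesn't say which web server is in use), a .htaccess file in the helpdesk folder could look something like this. The realm name, password file path and IP range are all placeholders.

    ```apache
    # Prompt for a username and password (HTTP Basic authentication)
    AuthType Basic
    AuthName "Helpdesk"
    # Hypothetical path: point this at a file created with the htpasswd tool
    AuthUserFile /path/to/.htpasswd

    # A visitor is let in if EITHER of the following rules matches
    Require valid-user
    # Hypothetical internal range
    Require ip 192.168.0.0/16
    ```

    Drop one of the two Require lines to insist on only the password or only the IP range.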

  16. #15

    plexer
