Over-indexing by search bots: help needed with the robots.txt file

Dear SPIP specialists, this is a question from a non-techie.

I have run into a major problem with a website that is now repeatedly reaching its bandwidth limit because search bots are over-indexing it. Since 2007 I have run the site on SPIP 1.9.2. It had a plain-text robots.txt file in the root directory, and everything worked fine.

In May 2012, when it was announced that support for 1.9.2 was being abandoned, I migrated to version 2.15.

I updated the robots.txt file for the new SPIP architecture, including its new directories, but the search bots have played havoc.

It seems that version 2 of SPIP can automatically generate a robots.txt file (there is a template for it in the squelettes folder), and the .htaccess file also says something to that effect. I am unclear about what to do.

Please advise urgently on what I should do to get around this problem.

regards

Harsh Kapoor

Hi,

You can still create your own custom robots.txt in the root directory; the web server will then serve it instead of the one generated by SPIP. If in doubt, you can also remove the line in the .htaccess file that directs SPIP to answer robots.txt requests.
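
For illustration, a hand-written robots.txt at the root might look something like the sketch below. The directory names are the usual ones in a SPIP 2.x installation, so adjust them to whatever actually exists on your site; the Crawl-delay directive is non-standard (Google ignores it) but some of the heavier crawlers honour it, which can help with bandwidth:

    User-agent: *
    Disallow: /ecrire/
    Disallow: /prive/
    Disallow: /config/
    Disallow: /tmp/
    Disallow: /local/
    Disallow: /lib/
    Disallow: /plugins/
    Disallow: /squelettes-dist/
    Crawl-delay: 10

And if you want your static file to win without any ambiguity, the line to remove (or comment out) in .htaccess is the rewrite that hands robots.txt requests to SPIP; it should look something like this (the exact wording may differ between versions):

    # RewriteRule ^robots\.txt$ spip.php?page=robots.txt [QSA,L]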

.Gilles

spip-en@rezo.net - http://listes.rezo.net/mailman/listinfo/spip-en

Install the security screen, since it provides some common protection against robot abuse.
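
The security screen is a single PHP file, ecran_securite.php, distributed via spip.net (check there for the current download address). Assuming a standard SPIP layout, installing it is simply a matter of placing that file in the config/ directory, where SPIP picks it up automatically on every request, roughly:

    # from the root of the SPIP site, after downloading ecran_securite.php
    cp /path/to/downloaded/ecran_securite.php config/ecran_securite.php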

JLuc
