backup software for indexing service
chris
2007-06-27 23:10:09 UTC
Is there any backup software out there that does not trigger Indexing
Service to reindex all the files that have just been backed up?

Indexing Service uses the USN journal to find out which files it needs to
reindex. My backup software is set to do a full copy and not reset the
archive bit. I thought that this would take care of the problem. However, my
backup software also resets the 'last accessed' date on all the files so that
it appears that it was not the last thing to access them. The vendor is
trying to cover the software's tracks for other third-party programs that
look at the 'last accessed' date to decide whether they should do something
to the file.

By resetting the accessed date, the software triggers an entry in the USN
journal, because it is changing each file's 'accessed date'.

Normally, if a program opens a file and does not make any changes, there is
no new entry in the USN journal. But because my backup software resets the
'last accessed' date, an entry is written to the USN journal. Indexing
Service looks in that journal to determine whether something should be
reindexed.
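
The read-then-restore pattern described above can be illustrated with a
minimal Python sketch. It is not what any particular backup product does:
`read_and_restore_atime` is a hypothetical helper, and `os.utime` merely
stands in for whatever timestamp call the backup software really uses. The
point it shows is that putting the old access time back is an explicit
metadata write, separate from the read itself, and that write is the step
that appears to get recorded in the USN journal on NTFS.

```python
import os


def read_and_restore_atime(path):
    """Read a file, then put back the access time it had before the read.

    Hypothetical helper, for illustration only.
    """
    before = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()  # the "backup" read; may update the access time
    # Explicit timestamp write: the content is untouched, but the file's
    # metadata is modified by this call.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```

A backup tool that skipped the final `os.utime` step would leave the access
time wherever the read put it, but would make no timestamp write of its own.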

My question is: is there any backup software out there that will back up
content that is scanned by Indexing Service and not change the files in any
way (for example, the archive bit or accessed date)? I hope someone can
answer this question. It really is terrible to have to watch your server
spend 5+ hours every night reindexing every single file.
Please help.
.._..
2007-06-28 14:01:20 UTC
The Windows command-line tool "xcopy" does not appear to change the file
name.

Indexing Service may, though, be storing the path and the file name in its
own FAT (I suspect this is the case). If that is true, then you can't keep it
from detecting new files, because the path to them will be new.

Why are you concerned about Indexing Service indexing these? If it consumes
too much CPU or disk access, just turn down the Indexing Service settings to
be less aggressive. As a "service", you should be able to set it up and
forget about it; that's what services are for.

Or, make custom catalogs for the areas you want to index and turn the main
disk-wide service off.

Or, use an archive file like .zip, or Backup Exec or some other program that
stores the files in a database (which Indexing Service will either index
quickly as large files or not touch).
chris
2007-06-29 19:08:06 UTC
I want Indexing Service to index files that are new or whose content has
been modified. I do not want Indexing Service to index files that have
already been indexed and have had no changes whatsoever to their content or
file name.

The problem arises because Indexing Service uses the USN journal to
determine whether a file needs to be reindexed. If the archive bit or
accessed date is modified, Indexing Service will reindex the file. My backup
software does not touch the archive bit. Unfortunately, it does replace the
last access date with the previous accessed date, so that other third-party
applications are fooled into thinking that the backup software never backed
up the file. It covers its tracks.

By covering its tracks for third-party applications, my backup software
resets the 'accessed date' value, which automatically adds an entry to the
USN journal, which in turn triggers the file to be reindexed.

If my backup software would simply stop resetting the accessed date, then the
USN journal would not have an entry and Indexing Service would not reindex
the file.

So, again, my question is: is there any backup software out there that does
not cause Indexing Service to reindex every single file that it backs up
during the nightly backup?

I use tape backup, not disk. I have over 500,000 htm files that are indexed.
Reindexing them every night takes 5+ hours. The problem is that nothing new
gets indexed until every single file has been reindexed first, simply because
any new file is at the end of the queue. So if someone adds something at
8 a.m., it will not be indexed for about 3 hours or so. I need new files
indexed immediately, not 3 hours later. This constant reindexing of 500,000
htm files every single night is very disruptive to our company.

All I need is a tape backup solution that does not trigger the reindexing of
all the files. If it does not exist, then don't you think Microsoft should
come up with a solution, or work with at least one tape backup software
company? Is that too much to ask?

Maybe Microsoft could check the 'accessed date' of the file and compare it
to the date/time of the journal entry. If the accessed date is more than a
minute earlier than the USN journal entry date, then Indexing Service should
know not to reindex the file, because it was most likely accessed by backup
software. Another thing they could do is base it on the 'modified date' only,
and not the USN journal.
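
A rough sketch of that timestamp comparison, in Python for brevity. The
function name and the `slack` parameter are made up for illustration, and
as far as I know Indexing Service exposes no hook where a check like this
could be plugged in; this only shows the rule I have in mind.

```python
from datetime import datetime, timedelta


def looks_like_backup_touch(accessed, journal_entry,
                            slack=timedelta(minutes=1)):
    """Return True when the access time is more than `slack` older than
    the journal entry, i.e. the entry probably came from a backup program
    putting an old access time back rather than from a real change."""
    return journal_entry - accessed > slack
```

For example, a journal entry written at 02:00 for a file whose accessed
date is still last week's would be skipped, while an entry written a few
seconds after the file's accessed date would be indexed as usual.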

The bottom line is this: a file should only be reindexed if the file name or
the content of the file changes. Why? Because if it is reindexed otherwise,
it will be indexed exactly the same way as far as search is concerned. It
will be a meaningless reindex and a complete waste of server resources. I
like Microsoft products, but I feel that Indexing Service should change the
way it indexes files and base it only on file name changes and content
changes, not the USN journal.
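
To make the rule concrete, here is a small Python sketch of the kind of
filter I mean. The flag values are the USN reason codes from the Windows
SDK header (winioctl.h); the `needs_reindex` function and the choice of
which flags count as "content or name" changes are my own assumptions, not
anything Indexing Service actually offers.

```python
# USN record reason flags (values as defined in winioctl.h).
USN_REASON_DATA_OVERWRITE = 0x00000001
USN_REASON_DATA_EXTEND = 0x00000002
USN_REASON_DATA_TRUNCATION = 0x00000004
USN_REASON_FILE_CREATE = 0x00000100
USN_REASON_RENAME_NEW_NAME = 0x00002000
USN_REASON_BASIC_INFO_CHANGE = 0x00008000  # attributes or timestamps
USN_REASON_CLOSE = 0x80000000

# Assumption: only these reasons mean the search results could differ.
CONTENT_OR_NAME = (
    USN_REASON_DATA_OVERWRITE
    | USN_REASON_DATA_EXTEND
    | USN_REASON_DATA_TRUNCATION
    | USN_REASON_FILE_CREATE
    | USN_REASON_RENAME_NEW_NAME
)


def needs_reindex(reason):
    """Hypothetical filter: reindex only on content or name changes."""
    return bool(reason & CONTENT_OR_NAME)
```

With a filter like this, a journal entry whose only reasons are
`USN_REASON_BASIC_INFO_CHANGE` and `USN_REASON_CLOSE` (a timestamp or
archive-bit change) would be ignored, while a write or a rename would still
queue the file.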

However, if anyone can point me in the right direction to a tape backup
software solution that is Indexing Service friendly, then that's all I need
and you can disregard this rant.

Anyone... anyone... anyone... Microsoft... anyone?
Hilary Cotter
2007-07-02 12:18:53 UTC
You CAN'T back up an Indexing Service catalog and restore it somewhere else
without a complete rescan. What you might want to do is try to have your
backup software do incremental scans where it doesn't reset the archive bit.

--
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
chris
2007-07-09 16:08:05 UTC
I'm still looking for an answer to the question, "Can you back up content on
a server that is indexed by Indexing Service without causing Indexing Service
to rescan the content?" I understand that Indexing Service will rescan during
a restore, but I'm not concerned about the restore. I am only concerned about
backing up content files.

I am trying to find tape backup software that will not trigger files to be
reindexed. Does it exist? Yes or no? If so, I would back up the content files
every day and skip the catalog.wci.

I am just looking for a tape backup product name. Does it exist?
Hilary Cotter
2007-07-09 16:16:20 UTC
No, you can't do this.