Discussion:
backup software for indexing service
chris
2007-06-27 23:10:09 UTC
Is there any backup software out there that does not trigger Indexing
Service to reindex all the files that have just been backed up?

Indexing Service uses the USN journal to learn which files it needs to
reindex. My backup software is set to do a full copy and not reset the
archive bit. I thought that this would take care of the problem. However, my
backup software also resets the last 'accessed date' on all the files so that
it appears that it was not the last thing to access them. The vendor is
trying to cover its tracks for other 3rd party programs that look at the
last 'accessed date' to determine whether they should do something to the
file.

By doing this (resetting the accessed date), they are triggering an entry
in the USN journal, because they are making a change to the file's 'accessed
date'.

Normally, if a program opens a file and does not make any changes, then
there is no new entry in the USN journal. But because my backup software
resets the last 'accessed date', an entry is sent to the USN Journal.
Indexing Service looks in that journal to determine if something should be
reindexed.
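
A minimal Python sketch of the behaviour described above may make it concrete. It is not taken from any backup product: the function name and the sample file are made up for illustration. It reads a file, then explicitly writes the old timestamps back. On NTFS, an explicit timestamp write of this kind is a metadata change, which the USN journal records as a basic-info change, whereas a plain read adds no record.

```python
import os
import tempfile


def read_and_restore_atime(path):
    """Read a file, then put its previous access time back (hypothetical helper)."""
    before = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()  # reading alone does not change content or name
    # Explicit timestamp write: this is the "covering its tracks" step.
    # It modifies file metadata even though the content is untouched.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "page.htm")
    with open(sample, "w") as f:
        f.write("<html>hello</html>")
    original = os.stat(sample)
    read_and_restore_atime(sample)
    restored = os.stat(sample)
    print("access time restored:", restored.st_atime_ns == original.st_atime_ns)
```

A backup tool that skipped the `os.utime` step would leave the access time for the file system to manage, and would not make this explicit metadata write.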

1) My question is: is there any backup software out there that will back up
content that is scanned by Indexing Service and not change the files in
any way (for example, archive bit or accessed date)? I hope someone can
answer this question. It really is terrible to have to watch your server
spend 5+ hours every night reindexing every single file.
Please help.
.._..
2007-06-28 14:01:20 UTC
The Windows command-line tool "xcopy" does not appear to change the file name.

Indexing Service may be storing the path and the file name in its own FAT
(I suspect this is the case). If that is true, then you can't keep it from
detecting new files, because the path to them will be new.

Why are you concerned about Indexing Service indexing these? If it consumes
too much CPU or disk access, just turn down the Indexing Service settings to
be less aggressive. As a "service" you should be able to set it up and forget
about it; that's what services are for.

Or, make custom catalogs for the areas you want to index and turn the main
disk-wide service off.

Or, use an archive file like .zip, or Backup Exec or some other program that
stores the files in a database (that Indexing Service will either index
quickly as large files or not touch).
chris
2007-06-29 19:08:06 UTC
I want Indexing Service to index files that are new or 'modified within the
content of the file'. I do not want Indexing Service to index files that have
already been indexed before and have had no changes whatsoever to the
content within the file or the filename.

The problem I am having is that Indexing Service uses the USN Journal to
determine if a file needs to be reindexed. If the archive bit or accessed
date is modified, Indexing Service will reindex the file. My backup software
does not touch the archive bit. However, it unfortunately replaces the last
access date with the previous accessed date so that it fools other 3rd party
applications into thinking that the backup software never backed up the file.
It covers its tracks.

Unfortunately, by covering its tracks for 3rd party applications, my
backup software resets the 'accessed date' value, which automatically adds an
entry to the USN Journal, which triggers the file to be reindexed.

If my backup software would simply stop resetting the accessed date, then the
USN Journal would not have an entry and Indexing Service would not reindex
the file.

So, again, my question is: is there any backup software out there that does
not cause Indexing Service to reindex every single file that it backs up
during the nightly backup?

I use tape backup, not disk. I have over 500,000 htm files that are indexed.
Reindexing them every night takes 5+ hours. The problem I am having is that
nothing new will be indexed until every single file has been reindexed
first, simply because any new file will be at the end of the queue. So if
someone adds something at 8am, it will not be indexed for about 3 hours or
so. I need new files indexed immediately, not 3 hours later. This constant
RE-indexing of 500,000 htm files every single night is very disruptive to our
company.

All I need is a tape backup solution that does not trigger the reindexing of
all the files. If it does not exist, then don't you think Microsoft should
come up with a solution or work with at least one tape backup software
company? Is that too much to ask?

Maybe Microsoft could check the 'accessed date' of the file and compare it
to the date/time of the journal entry. If the accessed date is more than a
minute earlier than the USN Journal entry date, then indexing service should
know not to RE-index the file because it was most likely accessed by backup
software. Another thing they could do is base it on the 'modified date' only,
and not the USN Journal.
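
The first suggestion above can be sketched in a few lines of Python. This is only an illustration of the proposed rule, not something Indexing Service does: the function name, the one-minute default, and the sample times are assumptions made for the example.

```python
from datetime import datetime, timedelta


def looks_like_backup_touch(accessed, journal_entry_time,
                            threshold=timedelta(minutes=1)):
    """Return True when the file's accessed date is more than `threshold`
    earlier than the USN journal entry, i.e. the timestamp was probably
    put back to an old value by backup software (hypothetical rule)."""
    return journal_entry_time - accessed > threshold


# Accessed date restored to the previous morning, journal entry written
# during the nightly backup: treated as a backup touch, so skip the reindex.
print(looks_like_backup_touch(datetime(2007, 6, 27, 9, 0, 0),
                              datetime(2007, 6, 28, 1, 0, 0)))   # True

# Accessed date only 30 seconds older than the journal entry: not skipped.
print(looks_like_backup_touch(datetime(2007, 6, 28, 8, 0, 0),
                              datetime(2007, 6, 28, 8, 0, 30)))  # False
```

The rule is a heuristic. Any other program that legitimately sets an old access time would be skipped too, so it would still need a content or name check to be safe.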

The bottom line is this. A file should only be reindexed if the filename or
the content of the file changes. Why? Because if it is RE-indexed, it will
be RE-indexed the exact same way as far as the search is concerned. It will
be a meaningless RE-index and a complete waste of server resources. I like
Microsoft products, but I feel that Indexing Service should change the way
it indexes files and base it only on filename changes and content changes,
not the USN Journal.
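
Each USN journal record carries a reason bitmask, so in principle a consumer could apply the "filename or content only" rule by ignoring records that report nothing but an attribute or timestamp change. The Python sketch below illustrates that idea. The flag values are the USN_REASON_* constants from the Windows SDK; the choice of which flags count as a content or name change, and the function name, are assumptions for this example. It does not read a real journal and does not describe how Indexing Service actually filters records.

```python
# USN reason flags (values as defined in the Windows SDK).
USN_REASON_DATA_OVERWRITE = 0x00000001
USN_REASON_DATA_EXTEND = 0x00000002
USN_REASON_DATA_TRUNCATION = 0x00000004
USN_REASON_FILE_CREATE = 0x00000100
USN_REASON_FILE_DELETE = 0x00000200
USN_REASON_RENAME_OLD_NAME = 0x00001000
USN_REASON_RENAME_NEW_NAME = 0x00002000
USN_REASON_BASIC_INFO_CHANGE = 0x00008000  # attributes or timestamps changed
USN_REASON_CLOSE = 0x80000000

# Assumption for this sketch: these are the changes that alter what a
# search over file names and file contents would return.
CONTENT_OR_NAME_CHANGE = (
    USN_REASON_DATA_OVERWRITE
    | USN_REASON_DATA_EXTEND
    | USN_REASON_DATA_TRUNCATION
    | USN_REASON_FILE_CREATE
    | USN_REASON_FILE_DELETE
    | USN_REASON_RENAME_OLD_NAME
    | USN_REASON_RENAME_NEW_NAME
)


def worth_reindexing(reason_mask):
    """True only if the record reports a content or name change."""
    return bool(reason_mask & CONTENT_OR_NAME_CHANGE)


# A timestamp reset by backup software: basic-info change, then close.
print(worth_reindexing(USN_REASON_BASIC_INFO_CHANGE | USN_REASON_CLOSE))  # False

# An edited htm file: data overwritten, then close.
print(worth_reindexing(USN_REASON_DATA_OVERWRITE | USN_REASON_CLOSE))     # True
```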

However, if anyone can point me in the right direction to a tape backup
software solution that is Indexing Service friendly, then that's all I need
and you can disregard this rant.

anyone, . . . anyone, . . . anyone . . . . Microsoft, . . . anyone. . . ?
Hilary Cotter
2007-07-02 12:18:53 UTC
You CAN'T back up an Indexing Service catalog and restore it somewhere else
without a complete rescan. What you might want to do is try to have your
backup software do incremental scans where it doesn't reset the archive bit.
--
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
chris
2007-07-09 16:08:05 UTC
I'm still looking for an answer to the question, "Can you back up content on
a server that is indexed by Indexing Service and not cause Indexing Service
to rescan the content?" I understand that Indexing Service will rescan during
a restore, but I'm not concerned about the restore. I am only concerned about
backing up content files.

I am trying to find tape backup software out there that will not trigger
files to be reindexed. Does it exist? Yes/No. If so, I would back up the
content files every day and skip the catalog.wci.

I am just looking for a tape backup product name. Does it exist?
Hilary Cotter
2007-07-09 16:16:20 UTC
No, you can't do this.