Discussion:
accent-insensitive search with IS ?
Nuice
2006-09-07 02:53:47 UTC
I have a web service interfacing with IS (Windows Server 2003) through
OLEDB. The index contains Office documents and web pages in English and in
French. Is there a way to tell IS to do an accent-insensitive search
("telephone" should match "téléphone" and vice versa)?

Thanks
WenJun Zhang[msft]
2006-09-07 12:56:35 UTC
Hi Nuice,

Unfortunately, the current version of Index Server (Windows Server 2003 SP1)
does not support accent-insensitive searches. This behavior is by design.

The Word Breaker plays a critical role in whether accents are supported or
not as it can create multiple words in the index. The Word Breaker
supplied with Windows does not provide this functionality.

Diacritics
http://msdn.microsoft.com/library/en-us/indexsrv/html/wbrscenario_656b.asp?frame=true

If diacritics are used only minimally in a language, the word breaker for
that language should remove them during both index creation and querying.
When you create a word breaker, it is recommended that at query time, you
ensure that the word breaker generates the alternate spellings "über" and
"ueber."
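
To illustrate the idea in the quoted passage (stripping diacritics both when indexing and when querying), here is a minimal Python sketch. It is not Indexing Service code and does not use the word breaker interfaces; the names fold_accents, build_index and search are hypothetical, and the "index" is just an in-memory dictionary.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks so that 'téléphone' becomes 'telephone'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def build_index(docs: dict) -> dict:
    """Toy inverted index: folded, lower-cased word -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(fold_accents(word), set()).add(doc_id)
    return index


def search(index: dict, term: str) -> set:
    """Apply the same folding to the query term as was applied at index time."""
    return index.get(fold_accents(term.lower()), set())


if __name__ == "__main__":
    docs = {1: "Numéro de téléphone", 2: "Office telephone list"}
    index = build_index(docs)
    print(search(index, "telephone"))
    print(search(index, "téléphone"))
```

Because the same folding is applied on both sides, either spelling of the query finds both documents. The trade-off is that a search can no longer be restricted to the accented form only.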

For now, an alternative is to use the search function of SharePoint Portal
Server 2003. When you search your SharePoint Portal Server 2003
portal site for content (file or SQL), you cannot limit your search to
return only content that contains accented characters or diacritical marks.
When you search the portal site for a keyword that contains one or more
accented characters or diacritical marks, SharePoint Portal Server returns
the accented form of the word and the unaccented form of the word in the
search results.

Accent-insensitive Searching
http://msdn.microsoft.com/library/en-us/spptsdk/html/AccentSensitivitySearches_SV01150746.asp?frame=true

Not only are Microsoft SharePoint Portal Server Search queries not
sensitive to case; they are also not sensitive to accents when using either
the FREETEXT or CONTAINS predicates.

I will forward your question to our product group and ask whether there is
any plan to include accent-insensitive search in a future version of
Indexing Service. If you have any concerns or further questions, please
don't hesitate to let me know.

Thanks & Have a nice day.

Sincerely,

WenJun Zhang

Microsoft Online Community Support

==================================================

Get notification to my posts through email? Please refer to:
http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx#notifications.

Note: The MSDN Managed Newsgroup support offering is for non-urgent issues
where an initial response from the community or a Microsoft Support
Engineer within 1 business day is acceptable. Please note that each follow
up response may take approximately 2 business days as the support
professional working with you may need further investigation to reach the
most efficient resolution. The offering is not appropriate for situations
that require urgent, real-time or phone-based interactions or complex
project analysis and dump analysis issues. Issues of this nature are best
handled working with a dedicated Microsoft Support Engineer by contacting
Microsoft Customer Support Services (CSS) at:

http://msdn.microsoft.com/subscriptions/support/default.aspx.

==================================================

This posting is provided "AS IS" with no warranties, and confers no rights.
Nuice
2006-09-07 13:40:49 UTC
Hi WenJun,

If possible, I'd like to know whether accent insensitivity will be supported
in a future version of Index Server.

Thanks
Hilary Cotter
2006-09-07 13:51:34 UTC
Currently, Windows Desktop Search does not support it. SQL Server 2005
full-text search does.
--
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.

This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
WenJun Zhang[msft]
2006-09-11 13:58:54 UTC
Hi Nuice,

I just got confirmation from our product group. Unfortunately, Indexing
Service is going to be archived. It will ship with Longhorn Server only for
backwards compatibility, so no new features or bug fixes will be accepted
other than fixes for security issues.

Windows Desktop Search will be the search replacement on the desktop. Its
intent is to index files on your local machine and other local content that
a user would need to search. It is not intended to be a server search
solution.

So you would probably want to look at SharePoint or SQL Full Text Search
for a long-term solution.

Please don't hesitate to let me know any of your concerns or feedback and I
will help you deliver them to our product team if necessary. Thanks.

Sincerely,

WenJun Zhang

Microsoft Online Community Support

WenJun Zhang[msft]
2006-09-13 10:23:05 UTC
Hi Nuice,

I just want to check whether you have any further questions or concerns on
this issue. If so, please don't hesitate to let me know.

Thanks.

Sincerely,

WenJun Zhang

Microsoft Online Community Support

WenJun Zhang[msft]
2006-09-08 08:53:24 UTC
Hi Nuice,

I am contacting the Indexing Service dev group to confirm whether the
feature will be available in a future version. The turnaround may take one
or two working days. I will give you an update early next week.
Please wait for my follow-up.

Thanks & Have a nice weekend.

Sincerely,

WenJun Zhang

Microsoft Online Community Support
