Cameron
12-11-2007, 9:51 AM
I am probably missing an easy way to do this, but can you do compound searches that include or exclude words, such as "+cameron +sucks" or "cameron and sucks"? Even better if you could do this: "+cameron +sucks -brandon"
If not, any way to make it more like Google?
LorenK
12-11-2007, 10:51 AM
I did a bit of research on it. It's called Boolean Search. I'll need to read some more about what the effect is on the site then discuss with Chris.
In a nutshell, the Boolean Search option is very server intensive and has a higher rate of 'no results found'.
Natural Language searches result in larger database sizes (due to more indexing) but are less server intensive. Another downside is no wildcard searches.
I'll need to read more though.
Cameron
12-11-2007, 11:24 AM
What database are you guys using? The impact will likely depend on the database and how the search is implemented. Typically speaking, ANDs don't consume as many SQL resources as ORs, which is the current default setup. This is mitigated to some extent by ORs giving a broader search that returns more rows, so less searching is needed. For most SQL systems, it really depends on how the query is built and the database structure. If the system is doing LIKE queries on a text field, that is just all kinds of ugly performance-wise. I would be surprised if that were the case though.
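To make the AND versus OR point concrete, here is a rough sketch of the "ugly" LIKE-style search. It assumes a made-up posts table with a body column, and it uses SQLite only because it ships with Python; the forum's real database and schema are not known from this thread, and build_like_query and search are hypothetical helper names.

```python
import sqlite3

def build_like_query(words, joiner):
    """Build a parameterised LIKE query with one clause per word, joined by AND or OR."""
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be 'AND' or 'OR'")
    if not words:
        raise ValueError("at least one search word is required")
    clauses = f" {joiner} ".join("body LIKE ?" for _ in words)
    params = [f"%{w}%" for w in words]
    return f"SELECT id FROM posts WHERE {clauses} ORDER BY id", params

# Hypothetical in-memory table standing in for the forum's posts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO posts (body) VALUES (?)", [
    ("calcium reactor setup",),
    ("calcium dosing schedule",),
    ("protein skimmer review",),
])

def search(words, joiner):
    """Run the LIKE search and return the matching post ids."""
    sql, params = build_like_query(words, joiner)
    return [row[0] for row in conn.execute(sql, params)]

and_ids = search(["calcium", "reactor"], "AND")  # posts containing both words
or_ids = search(["calcium", "reactor"], "OR")    # posts containing either word
```

With this sample data the OR search returns the broader set of rows. Every LIKE '%word%' clause has to scan the text of each row, which is why this style tends to perform badly on a large posts table.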
LorenK
12-11-2007, 12:20 PM
Moving thread to new Site Suggestions/Issues area.
haninja
12-11-2007, 12:34 PM
Good suggestion. I often find myself searching multiple words and getting way too many responses.
LorenK
12-12-2007, 12:49 AM
Okay, boolean searches are enabled on ARC. I think they already were.
The available options are AND, OR, NOT and -. You can also use double quotes for fully qualified strings and * for wildcard searches.
I have found one issue though: words that are right next to each other in a post do not get filtered out by the NOT or - operator. For example, if you search for 'yellow - tang', posts with "yellow tang" will still be returned. I haven't figured that one out yet.
Here is some primer info on using boolean searches that I found on another site and edited for our needs here.
The boolean full-text search capability supports the following operators:
+
A leading plus sign indicates that this word must be present in each row that is returned.
-
A leading minus sign indicates that this word must not be present in any of the rows that are returned.
(no operator)
By default (when neither + nor - is specified) the word is optional, but the rows that contain it are rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
*
The asterisk serves as the truncation operator. Unlike the other operators, it should be appended to the word to be affected.
"
A phrase that is enclosed within double quote (‘"’) characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and searches the FULLTEXT index for those words. The engine then performs a substring search for the phrase in the records that are found, so the match must include non-word characters in the phrase. For example, "test phrase" does not match "test, phrase".
If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.
The following examples demonstrate some search strings that use boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word “apple”, but rank rows higher if they also contain “macintosh”.
'+apple -macintosh'
Find rows that contain the word “apple” but not “macintosh”.
'apple*'
Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
'"some words"'
Find rows that contain the exact phrase “some words” (for example, rows that contain “some words of wisdom” but not “some noise words”). Note that the ‘"’ characters that surround the phrase are operator characters that delimit the phrase. They are not the quotes that surround the search string itself.
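The operator rules above can be sketched roughly in Python. This is only an illustration of the described behavior, not how the database engine works internally: parse_query, term_matches and matches are made-up names, ranking of optional words is left out, and the phrase check is a plain substring test.

```python
import re

# An optional leading + or -, followed by either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a search string into (operator, term, is_phrase) tuples."""
    terms = []
    for op, phrase, word in TOKEN.findall(query):
        if phrase:
            terms.append((op, phrase.lower(), True))
        elif word:
            terms.append((op, word.lower(), False))
    return terms

def term_matches(term, is_phrase, text):
    """Check a single term against the text of one post."""
    text = text.lower()
    if is_phrase:
        return term in text  # literal match, punctuation included
    words = re.findall(r"\w+", text)
    if term.endswith("*"):
        prefix = term[:-1]  # trailing * is the truncation operator
        return any(w.startswith(prefix) for w in words)
    return term in words

def matches(query, text):
    """Return True if the text satisfies the +, -, and optional terms."""
    terms = parse_query(query)
    required = [t for t in terms if t[0] == "+"]
    excluded = [t for t in terms if t[0] == "-"]
    optional = [t for t in terms if t[0] == ""]
    if any(term_matches(t, p, text) for _, t, p in excluded):
        return False
    if required:
        return all(term_matches(t, p, text) for _, t, p in required)
    # With no required terms, at least one optional term must be present.
    return any(term_matches(t, p, text) for _, t, p in optional)
```

For example, matches('+apple -macintosh', 'apple macintosh computer') is False, while matches('"some words"', 'some words of wisdom') is True.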
Some words are ignored in full-text searches:
Any word that is too short is ignored. The default minimum length of words that are found by full-text searches is four characters.
Words in the stopword list are ignored. A stopword is a word such as “the” or “some” that is so common that it is considered to have zero semantic value. There is a built-in stopword list, but it can be overwritten by a user-defined list.
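As a quick illustration of the two "ignored words" rules, here is a small Python sketch. The four-character minimum comes from the text above; the stopword set holds only the two example words mentioned there, not the real built-in list, and searchable_words is a made-up name.

```python
MIN_WORD_LENGTH = 4          # default minimum length quoted above
STOPWORDS = {"the", "some"}  # illustrative only; the real built-in list is much longer

def searchable_words(query):
    """Return the words from a query that a full-text search would not ignore."""
    words = query.lower().split()
    return [w for w in words if len(w) >= MIN_WORD_LENGTH and w not in STOPWORDS]
```

So a search for "the yellow tang" effectively becomes a search for "yellow" and "tang", and a search made up entirely of short words or stopwords has nothing left to match on.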
Cameron
12-12-2007, 12:57 AM
I tried the following:
+calcium +reactor
calcium + reactor
calcium and reactor
They all returned this thread: http://www.atlantareefclub.org/forums/showthread.php?t=10310&highlight=calcium+reactor
The word reactor is not found in this thread.
LorenK
12-12-2007, 1:05 AM
Hmmm...I'll need to research further.
Thanks for breaking it though :-)
Edit...I think you did break it because those don't show for me. Which leads me to believe there is some role-based control involved.
Edit2...Cameron try again. I enabled it for your user group.
Cameron
12-12-2007, 1:38 AM
Looking good. Many thanks.