Indexing and Search Service Query and Sort Criteria Summary

This information provides guidance for generating Indexing and Search Service (ISS) search queries. It specifies the field names that you can use in query terms, and assists in translating well-known IMAP SEARCH queries into ISS search queries. It also describes how search results can be sorted by certain field names.

Search Query Field Names

You can use the following search query field names for the email data type:

"answered"
"attachgroup-author"
"attachgroup-contents"
"attachgroup-imagecompress"  (only defined for picture type like atjpeg)
"attachgroup-imageheight"    (only defined for picture type like atjpeg)
"attachgroup-imageprecision" (only defined for picture type like atjpeg)
"attachgroup-imagesource"    (only defined for picture type like atjpeg)
"attachgroup-imagewidth"     (only defined for picture type like atjpeg)
"attachgroup-name"
"attachgroup-pubdate"
"attachgroup-size"
"attachgroup-title"
"attachment-type"
        Contains an entry for every attachment detected in the document.
        Valid values are lowercase and prefixed by "at" (for example, "atpdf"),
        as defined in the AttachmentType enumeration for attachments.
        Currently <type> in the at<type> values can be any of:
            applefile audio compress doc (MS Word) html image
            iwork (Apple iWork) jpeg odf (Any OpenOffice file)
            other (Uncategorized) pdf pgpsign plain (Plain text)
            ppt (MS PowerPoint) rtf ssign sencr vcf video
            vsd (MS Visio) xls (MS Excel) xml
"bcc"
"body"
        means contents OR attachgroup-contents
"cc"
"content-description"
"content-id"                (search not yet fully supported)
"content-md5"               (search not yet fully supported)
"content-transfer-encoding" (search not yet fully supported)
"content-type"
"contents"
"deleted"
"draft"
"email-headers"             (not currently implemented)
"flagged"
"folder"
"from"
"hostname"
"message-id"                (search not yet fully supported)
"messagekey"                (not currently implemented)
"received"
"recent"                    (search not yet fully supported)
"reply-to"
"seen"
"sent"
"size"
"subject"
"text"
         means email-headers OR contents OR attachgroup-contents
"to"
"uid"
"uidvalidity"
"userflags"                 (search not yet fully supported)
"username"

The iWork, audio, compress, image, and video attachment types are supported starting in Indexing and Search Service 1 Update 1.

The applefile and pgpsign attachment types are supported starting in Indexing and Search Service 1 Update 2.

Sort Criteria

You can use the following search field names to specify sort criteria:

cc
folder
from
received      IMAP SORT arrival (not yet completely implemented)
sent          IMAP SORT date (not yet completely implemented)
size
subject
to
uid           implicit in IMAP SORT

To reverse the sort order for a field, prefix the individual field name with a hyphen (-). Specify multiple sort criteria as a simple ordered list of field names, such as:

    +size -subject +to -from

The default sort uses the uid field name. You can also explicitly specify this field name in a multiple sort criteria list.

Note
Because sort by folder is a post meta search process, folder sort is always the last sort performed. That is, sort = +subject +folder is the same as +folder +subject. The results are always ordered first by folder, then by other sorting criteria.
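
The ordering rules above can be modelled in a few lines of Python. This is an illustrative sketch only: parse_sort and SORTABLE are hypothetical names that are not part of ISS, and the sketch assumes that a field name with a plus sign or with no prefix means a normal (non-reversed) sort.

```python
# Field names that the summary lists as valid sort criteria.
SORTABLE = {"cc", "folder", "from", "received", "sent",
            "size", "subject", "to", "uid"}

def parse_sort(spec):
    """Turn a sort list such as '+size -subject' into (field, reverse) pairs,
    in the order in which the results end up being sorted."""
    keys = []
    for token in spec.split():
        reverse = token.startswith("-")      # hyphen prefix reverses the sort
        field = token.lstrip("+-")
        if field not in SORTABLE:
            raise ValueError(f"not a sortable field: {field!r}")
        keys.append((field, reverse))
    if not keys:
        keys.append(("uid", False))          # default sort uses uid
    # Results are always ordered by folder first. list.sort is stable, so
    # the remaining criteria keep their relative order.
    keys.sort(key=lambda k: k[0] != "folder")
    return keys
```

With this model, parse_sort("+subject +folder") and parse_sort("+folder +subject") return the same list, which mirrors the note about folder sorting.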

How IMAP SEARCH Uses Search Query Field Names

Most IMAP SEARCH criteria use the same (or similar) name as the field name in a search query to ISS. However, some criteria must be mapped in non-obvious ways. These features in IMAP SEARCH map into corresponding ISS search query terms:

IMAP flags map into a single field name, such as:
        ANSWERED   => +answered:true
        UNANSWERED => +answered:false or -answered:true
        Similarly for DELETED, DRAFT, FLAGGED, RECENT, SEEN
                and the corresponding UN*
OLD => -recent:true
NEW means (RECENT UNSEEN) => +recent:true +seen:false
LARGER n => -size:{0 TO n}
            -size:[0 TO n] including n, which means
                           LARGER than or equal to n
SMALLER n => +size:{0 TO n} technically excludes zero
UID <sequence set> - single or range of UIDs only (arbitrary list
           not currently supported):
        UID 44:444 => +uid:[44 TO 444] inclusive of 44 and 444
        UID 44     => +uid:44

ON <date>          => +received:YYYYMMDD
SENTON <date>      => +sent:YYYYMMDD
BEFORE <date>      => +received:{19700101 TO YYYYMMDD}
SENTBEFORE <date>  => +sent:{19700101 TO YYYYMMDD}
SENTSINCE <date>   => -sent:[19700101 TO YYYYMMDD]
SINCE <date>       => -received:[19700101 TO YYYYMMDD]
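
A client that translates IMAP SEARCH keys could apply the mappings above mechanically. The following Python sketch covers a subset of them. The function name imap_to_iss and its argument conventions (a datetime.date for date keys, a string such as "44:444" for UID) are assumptions made for illustration, not an ISS API, and LARGER uses the first of the two forms shown above.

```python
from datetime import date

FLAGS = {"ANSWERED", "DELETED", "DRAFT", "FLAGGED", "RECENT", "SEEN"}
# IMAP date keys mapped to (ISS field name, kind of term to generate).
DATE_KEYS = {
    "ON": ("received", "on"), "SENTON": ("sent", "on"),
    "BEFORE": ("received", "before"), "SENTBEFORE": ("sent", "before"),
    "SINCE": ("received", "since"), "SENTSINCE": ("sent", "since"),
}

def imap_to_iss(key, arg=None):
    """Translate one IMAP SEARCH key (plus optional argument) to ISS terms."""
    if key in FLAGS:
        return f"+{key.lower()}:true"
    if key.startswith("UN") and key[2:] in FLAGS:
        return f"+{key[2:].lower()}:false"
    if key == "NEW":                         # NEW means RECENT UNSEEN
        return "+recent:true +seen:false"
    if key == "SMALLER":
        return f"+size:{{0 TO {arg}}}"
    if key == "LARGER":
        return f"-size:{{0 TO {arg}}}"
    if key == "UID":                         # single UID or lo:hi range only
        lo, sep, hi = str(arg).partition(":")
        return f"+uid:[{lo} TO {hi}]" if sep else f"+uid:{lo}"
    if key in DATE_KEYS:
        field, kind = DATE_KEYS[key]
        ymd = arg.strftime("%Y%m%d")         # dates must be fully specified
        if kind == "on":
            return f"+{field}:{ymd}"
        if kind == "before":
            return f"+{field}:{{19700101 TO {ymd}}}"
        return f"-{field}:[19700101 TO {ymd}]"
    raise ValueError(f"unsupported IMAP SEARCH key: {key}")
```

For example, imap_to_iss("UID", "44:444") returns +uid:[44 TO 444], and imap_to_iss("BEFORE", date(2011, 3, 22)) returns +received:{19700101 TO 20110322}.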

ISS field name support for the following IMAP SEARCH features has not yet been implemented:

KEYWORD <flag> (also UNKEYWORD)
HEADER <field-name> <string>

Search Query Syntax

A search query consists of a list of terms, each separated by blanks. A term that you prefix with a plus sign (+) means "AND MATCHES", a term that you prefix with a minus sign (-) means "AND NOT MATCHES", and a term that you prefix with nothing (that is, a blank space) means "OR MATCHES".

Every query must contain one term that defines the host name and one term that defines the user name of the account that you want to search. Thus, the simplest valid search query looks like the following:

    +hostname:hhh +username:uuu

This search returns all emails in the account uuu@hhh. These two field names must appear as the first two terms in any search query. (Otherwise, terms may appear in any order in the query.) You can use additional terms in the query to select fewer results from this set. Only terms of the form -term or +term are allowed (OR cannot restrict the result).
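
The ordering and prefix rules for top-level terms can be enforced when a client assembles the query string. The Python sketch below is one possible way to do that. build_query is a hypothetical helper name, and the sketch assumes the caller passes additional terms that are already formatted.

```python
def build_query(hostname, username, *terms):
    """Assemble an ISS query with hostname and username as the first two terms."""
    for term in terms:
        # Only +term and -term are allowed at the top level; an unprefixed
        # (OR) term cannot restrict the result.
        if not term.startswith(("+", "-")):
            raise ValueError(f"term needs a + or - prefix: {term!r}")
        # hostname and username may appear only once, at the beginning.
        if term.lstrip("+-").startswith(("hostname:", "username:")):
            raise ValueError(f"{term!r} may only appear at the start")
    return " ".join([f"+hostname:{hostname}", f"+username:{username}", *terms])
```

Calling build_query("hhh", "uuu") gives the simplest valid query shown above, and build_query("hhh", "uuu", "+contents:java", "-seen:true") narrows it to unseen-excluded messages whose contents match "java".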

A term may contain a keyword from the list of field names (listed above) separated by a colon (:) from the value to be searched. A term value without a field name uses the default field name, which is "contents". A term may also consist of a parenthesized list of terms (with an optional prefix character) to form more complex queries, including the OR MATCHES feature. Parenthesized terms may be nested.

The value of a term can be a single string, a phrase, or a range specifier. A single string term contains only the characters to be matched with no spaces, such as:

    +contents:java

A single string term will match the specific string as a word in the field.

A phrase is a quoted string containing a list of single string terms separated by spaces, such as:

    +contents:"java code"

A phrase term will match the sequence of string values as words in the field.

A range specifier is a pair of single string terms separated by the string TO and enclosed in either square or curly brackets, such as:

    +size:{0 TO 1000}
    +size:[100 TO 2000]
    +sent:{19700101 TO 20010101}

A range term matches all values in the field from the first value to the second. Square brackets include the end points in the match; curly brackets exclude them. Date ranges, such as the last example above, require both values to be fully specified in YYYYMMDD form (year, month, day).
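
The difference between the two bracket styles is easy to check with a small model. The following Python sketch is not how ISS evaluates ranges; in_range is a hypothetical function that only mirrors the inclusive and exclusive rules for numeric values, including YYYYMMDD dates treated as integers.

```python
import re

# "[lo TO hi]" or "{lo TO hi}"
RANGE = re.compile(r"^([\[{])(\S+) TO (\S+)([\]}])$")

def in_range(spec, value):
    """Return True if the integer value falls inside the range specifier."""
    m = RANGE.match(spec)
    if not m:
        raise ValueError(f"not a range specifier: {spec!r}")
    open_b, lo, hi, close_b = m.groups()
    if (open_b == "[") != (close_b == "]"):
        raise ValueError("mismatched brackets")
    lo, hi = int(lo), int(hi)
    if open_b == "[":
        return lo <= value <= hi     # square brackets include the end points
    return lo < value < hi           # curly brackets exclude the end points
```

With this model, in_range("{0 TO 1000}", 0) is False while in_range("[100 TO 2000]", 100) is True, which is why a term such as +size:{0 TO n} does not match a size of zero.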

Starting in Indexing and Search Service 1.0.15.19.0, range terms containing the uid field name have changed so that the semantics of the search more closely match those of the IMAP SEARCH UID term. You can use the bound of "*" in a uid range term to indicate the upper bound matches through the largest uid available in the folder. See IETF RFC 3501 INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1 for details.

Within a parenthesized list term, you may use the reserved words AND, NOT, and OR in addition to the corresponding + or - prefix characters to combine terms. These reserved words must be uppercase; they are helpful for readability.

The hostname and username field names may only appear once at the beginning of the search query, and may not appear in a parenthesized list. Range term values (that is, using "{ }" and "[ ]" with TO to specify a range of values) are not permitted within parenthesized expressions.

Starting in Indexing and Search Service 1.0.15.19.0, range terms containing only uid or only received field names may appear in parenthesized expressions. No term with any other field name may appear in such a parenthesized expression.

Within any single parenthesized list term, only field names of the same "kind" may appear. These kinds are defined by specific function as follows:

    folder terms            - the "folder" field name only
    flag terms              - the "answered", "deleted", "draft", "flagged",
                              and "seen" field names
    meta terms              - the "received", "sent", "uid", "uidvalidity",
                              and "userflags" field names
    generic content terms   - "body", "text", and "attachgroup-contents"
    content terms           - all other field names

The value of a term with a field name may also be a parenthesized list of values, for example:

    +folder:(INBOX Trash Sent)

This is a shortened form of the following equivalent form:

    +(folder:INBOX folder:Trash folder:Sent)

This will match results in any one of the specified folders. In this form, the list of values must not have field names specified, because all values share the same field name.
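
The equivalence between the two forms can be expressed as a simple text rewrite. The Python sketch below expands the shortened form into the longer one. expand_value_list is a hypothetical name, and the sketch assumes simple values that contain no spaces, quotes, or nested parentheses.

```python
import re

# optional prefix, field name, then a parenthesized list of plain values
SHORT_FORM = re.compile(r"([+-]?)([\w-]+):\(([^()]+)\)")

def expand_value_list(term):
    """Rewrite '+field:(a b c)' as '+(field:a field:b field:c)'."""
    m = SHORT_FORM.fullmatch(term)
    if not m:
        return term                  # not the shortened form; leave unchanged
    prefix, field, values = m.groups()
    expanded = " ".join(f"{field}:{value}" for value in values.split())
    return f"{prefix}({expanded})"
```

For the example above, expand_value_list("+folder:(INBOX Trash Sent)") returns +(folder:INBOX folder:Trash folder:Sent).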

Term Modifiers

The single string and phrase terms may contain modifiers which can be used for more complicated matching.

A single string term value may contain the special characters "?" and "*" to perform single character and multiple character wildcard matching respectively. For example, the following term

    +contents:te?t

will match the words "text" and "test" and any other single character in the third position of a word of this form. The following term

    +contents:test*

will match the words test, tests, tester, and so forth. Multiple wildcard characters can be used in combination to match more complicated strings. Note: a wildcard as the first character of the string can be very expensive to match, so this feature can be enabled or disabled with the configuration parameter iss.searchsvc.leadingwildcard.enabled (enabled by default).

The single string term may also be appended with a trailing "fuzzy search" modifier. This allows for matching words that are similar in spelling. For example, the following term

    +contents:test~

will also match the words rest, nest, and tests.

The phrase term can be modified with a trailing "proximity search" modifier, which finds words that are within a specific distance of each other. For example, the following term

    +contents:"java code"~10

will match the words "java" and "code" within 10 words of each other in the field.
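
To get a feel for which words the "?" and "*" wildcards select, a pattern can be tried against a word list with Python's standard fnmatch module, whose "?" and "*" carry the same single-character and multiple-character meaning. This is only an approximation for experimenting: ISS performs the real matching itself, fnmatch additionally gives square brackets a special meaning, and wildcard_matches is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

def wildcard_matches(pattern, words):
    """Return the words that a '?' / '*' wildcard pattern would select."""
    # fnmatchcase compares the whole word, so "te?t" does not match "tests".
    return [word for word in words if fnmatchcase(word, pattern)]

words = ["text", "test", "tests", "tester", "treat"]
print(wildcard_matches("te?t", words))    # ['text', 'test']
print(wildcard_matches("test*", words))   # ['test', 'tests', 'tester']
```

The fuzzy and proximity modifiers are not modelled here, because their exact matching rules are not specified in this summary.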

Other Search Features

Various email clients have search criteria for relative dates such as "yesterday", "past month", "age in days", and so on. Each of these must be mapped into a sent: or received: term based on the specific date (such as "yesterday") or on a range of dates (such as "past month"). It is the client's responsibility to generate the correct YYYYMMDD form for each term.

For example, to search for all emails received this month, compute the current date and use its year and month with wildcard characters in place of the day:

    +received:201103??

which matches all emails received in March 2011.

To search for all emails received in the last 90 days, compute the date as of 90 days ago, and use a range term to select all emails since then:

    +received:[20101223 TO 20110322]

which matches all emails received from December 23, 2010 through March 22, 2011 inclusively.
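
A client could generate both of these terms from the current date with Python's standard datetime module. The sketch below is one way to do it. The function names are hypothetical, and the 90-day window is assumed to count both end points, which is what reproduces the December 23 through March 22 example.

```python
from datetime import date, timedelta

def received_this_month(today):
    """Wildcard term for every day in today's year and month."""
    return f"+received:{today:%Y%m}??"

def received_last_days(today, days):
    """Inclusive range term covering the last `days` days, ending today."""
    # Both end points are included, so the window starts days - 1 days back.
    start = today - timedelta(days=days - 1)
    return f"+received:[{start:%Y%m%d} TO {today:%Y%m%d}]"

today = date(2011, 3, 22)
print(received_this_month(today))       # +received:201103??
print(received_last_days(today, 90))    # +received:[20101223 TO 20110322]
```

In real use, date.today() would replace the fixed date used here to reproduce the examples.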
