h1. Overview of Oracle Communications Indexing and Search Service

This page provides an overview of the Oracle Communications Indexing and Search Service (ISS), including a description of the product architecture and a discussion of the major components.



h2. Indexing and Search Service Software Architecture

The following figure shows the high-level overview of the ISS software architecture.

h5. Indexing and Search Service High-Level Architecture

!Communications Suite Attachments^iss-high_level_architecture.gif|alt="This figure shows the high-level architecture of the ISS product."!

This figure shows that ISS is composed of an indexing service and a search service. The indexing service indexes data repositories in real time. Indexing is provided through web services, enabling you to index arbitrary data. Clients of ISS consume RESTful web services that provide the search capabilities.

{panel:|borderColor=#ccc|bgColor=#FFFFCE}Starting with *Indexing and Search Service 1 Update 1*, attachment files are no longer saved to the attachment store. Instead, search results include body part information for pointing back to the attachment file residing on the Messaging Server store. Additionally, the *Indexing and Search Service 1 Update 1* attachment store now only contains thumbnail images.{panel}

The following figure shows a more detailed look at components of the ISS software architecture.

h5. Indexing and Search Service Detailed Architecture

!iss-architecturev.gif|alt="This figure shows the Indexing and Search Service architecture."!

{info:title=Note}Two JMS servers are present, one for Messaging Server notifications and one for internal ISS communication. In a simple configuration, the JMQ broker for Messaging Server notifications runs on the Messaging Server system and the JMQ broker for ISS runs on the ISS system. Indexing services communicate only with the ISS JMQ server.{info}

{panel:|borderColor=#ccc|bgColor=#FFFFCE}Starting with *Indexing and Search Service 1 Update 2*, you can configure ISS search services for high availability. See [Configuring Indexing and Search Service for High Availability] for more information.{panel}

h2. How ISS Searches Messages

Oracle Communications Indexing and Search Service (ISS) is a general-purpose indexing and searching server for Oracle Communications Unified Communications Suite. Oracle Communications Messaging Server uses ISS to bring search services to any IMAP-based mail client. The Communications Suite Convergence web client and other IMAP clients can use the ISS engine to perform fast, comprehensive, cross-folder searches of message bodies and attachments.

From an architectural standpoint, IMAP clients perform searches on the ISS store by first connecting to the Messaging Server IMAP daemon. The IMAP SEARCH ISS gateway component of Messaging Server (available as of the Messaging Server 7 Update 2 release) determines if the IMAP search should be handled by ISS.

ISS, rather than Messaging Server, performs the IMAP search unless the IMAP SEARCH ISS gateway finds one of the following conditions:

* An ESEARCH extension feature is used in the search.
* None of the search criteria of SUBJECT, FROM, TO, CC, BCC, TEXT, or BODY is specified in the search.
* Any one of the search criteria KEYWORD, HEADER, OLDER, YOUNGER, MODSEQ, ANNOTATION, or RECENT is used in the search.

ISS handles the IMAP search request as long as the preceding conditions are false, regardless of how complex the search might be. Additionally, ISS can handle AND, OR, and NOT operators in the search request.

As an exception to the first condition, ISS performs the search if the only ESEARCH feature used is the RETURN (ALL) result option in the search command. This capability was introduced in *Messaging Server 7.0.5 Patch 30*.
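The decision described above can be sketched as a small predicate. This is only an illustration of the listed conditions, not the gateway's actual code: the function name {{handled_by_iss}} and the way ESEARCH result options are passed in are assumptions made for the example.

```python
# Criteria names come from the conditions listed above.
# The function and its parameters are hypothetical.
TEXT_CRITERIA = {"SUBJECT", "FROM", "TO", "CC", "BCC", "TEXT", "BODY"}
UNSUPPORTED_CRITERIA = {
    "KEYWORD", "HEADER", "OLDER", "YOUNGER", "MODSEQ", "ANNOTATION", "RECENT",
}


def handled_by_iss(criteria, esearch_options=()):
    """Return True if ISS would perform the search, False if Messaging Server keeps it.

    criteria        -- search keys used in the IMAP SEARCH command
    esearch_options -- ESEARCH RETURN options; empty means ESEARCH is not used
    """
    used = {c.upper() for c in criteria}
    options = tuple(o.upper() for o in esearch_options)

    # An ESEARCH feature keeps the search on Messaging Server,
    # except for RETURN (ALL).
    if options and options != ("ALL",):
        return False
    # At least one of the text-oriented criteria must be present.
    if not used & TEXT_CRITERIA:
        return False
    # Any one of these criteria keeps the search on Messaging Server.
    if used & UNSUPPORTED_CRITERIA:
        return False
    return True
```

For example, {{handled_by_iss(["SUBJECT", "FROM"])}} returns {{True}}, while {{handled_by_iss(["BODY", "KEYWORD"])}} returns {{False}} because KEYWORD is one of the excluded criteria.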

{info:title=Note}Certain conditions also cause Messaging Server to perform the search. For example, most IMAP SEARCH criteria use the same (or similar) name as the field name in a search query to ISS. However, some criteria must be mapped in non-obvious ways. If this mapping is incorrect, then the search is handled by Messaging Server. Also, if a problem occurs while obtaining a response from ISS, the search is handled by Messaging Server as a fallback. For more information on generating correctly formatted Indexing and Search Service (ISS) search queries, see [Indexing and Search Service Query and Sort Criteria Summary].{info}

ISS supports two kinds of searches: regular mail search and attachment search. For attachment search, the ISS {{storeui}} component (at the GlassFish Server level) returns thumbnail images plus links to the actual images in the ISS store.

The ISS architecture enables two ways of conducting a search. Either the mail client communicates directly with ISS by using the RESTful web service (deployed in the GlassFish Server's web container), or Messaging Server communicates with the ISS interface. In a large deployment, you can load-balance the ISS URL by using either a hardware load balancer or DNS-based load balancing. The load balancer distributes requests to the GlassFish Server instances running ISS. Search queries are posted to a search service JMS topic. At the back end, the search is picked up by the search service consumer that is handling that user. Only the search store instance responsible for that particular user responds. The search service consumer performs the search request and returns the results to the client.

h2. Using Logs to Identify Indexing and Search Service Searches

Because ISS does not see the original IMAP commands, you must use Messaging Server logs to view this information. You can use the Messaging Server telemetry logs to check on some IMAP searches, for example ESEARCH searches. For more information on telemetry logs, see [Checking User IMAP/POP/Webmail Session by Using Telemetry|Monitoring the Message Store#Checking User IMAP/POP/Webmail Session by Using Telemetry]. You can use the host name, user name, and folder information in the ISS logs to match up with the corresponding IMAP command in the Messaging Server logs. For more information on ISS logs, see [Log Files|Indexing and Search Service Troubleshooting#Log Files].

h2. How ISS Indexes Messages

You must bootstrap user accounts to enable ISS to index the user's email.

{panel:|borderColor=#ccc|bgColor=#FFFFCE}Starting with *Indexing and Search Service 1 Update 2*, you run the {{ \--bootstrap}} command on the user accounts to be indexed.{panel}

Bootstrapping triggers the ISS Crawler to connect to the Messaging Server message store over IMAP. The Crawler obtains the list of folders for that user, walks through each folder, downloads the email, and adds it to the ISS store.

After initial bootstrapping of accounts, indexing of new messages in the ISS store actually begins when an email message change occurs in the Messaging Server message store. Email events that are significant for ISS include:

* Arrival of a new email message
* Deleting an email message
* Viewing (reading) an email message
* Setting an email message flag
* Creating a new folder
* Moving an email message to a new folder

These events generate JMQ notifications containing the type of change. The JMS Producer (actually the {{jmqnotify}} plugin) posts the notification message to the Event Notification Queue (the {{imqbroker}} that you configure on Messaging Server). On the ISS side, the JMQ Consumers (MS Event Consumers) listen to the Event Notification Queue. Each event is tagged with the user who generated it. The ISS store instance that serves that user takes the message and processes it.
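The user-based routing described above can be pictured with a minimal sketch. It models only the idea that an event is processed by the store instance that owns the tagged user. The class names, the {{change}} label, and the in-memory delivery are assumptions for illustration and do not reflect how the JMQ broker actually delivers messages.

```python
from dataclasses import dataclass


@dataclass
class Event:
    user: str    # the user who generated the event
    change: str  # hypothetical label for the type of change


class StoreInstance:
    """Hypothetical stand-in for an ISS store instance serving a set of users."""

    def __init__(self, name, users):
        self.name = name
        self.users = set(users)
        self.processed = []

    def on_event(self, event):
        # Ignore events tagged with users that this instance does not serve.
        if event.user not in self.users:
            return False
        self.processed.append(event)
        return True


def publish(event, instances):
    """Offer the event to every instance; return the names of those that took it."""
    return [inst.name for inst in instances if inst.on_event(event)]
```

With two instances, one serving {{alice}} and one serving {{bob}}, publishing an event tagged {{bob}} is processed only by the second instance.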

When a user receives a new email message in the message store, an event notification is generated. This event notification attempts to fit the entire text of the email message into its payload (event message) so that ISS can process the whole message for indexing. When the entire text fits in the event message, ISS does not need to make a separate IMAP request to retrieve the message. Additionally, Messaging Server does not have to perform extra work on its end, because the email message is already in memory when the event happens.

When you configure JMQ, you can set the size of the event message body. Currently, the ISS configuration instructions describe setting the message body at 256 Kbytes. When the message size is larger than the configured size, the original message must be retrieved over IMAP.
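The size rule above amounts to a single comparison. The following sketch assumes that 256 Kbytes means 256 x 1024 bytes. The constant and function names are hypothetical.

```python
# 256 Kbytes, the value described in the ISS configuration instructions.
# Treating 1 Kbyte as 1024 bytes is an assumption.
MAX_EVENT_BODY_BYTES = 256 * 1024


def needs_imap_fetch(message_size_bytes, limit=MAX_EVENT_BODY_BYTES):
    """Return True when the message is larger than the event body limit,
    so the original message has to be retrieved over IMAP."""
    return message_size_bytes > limit
```

A 10-Kbyte message fits in the event payload, so {{needs_imap_fetch(10 * 1024)}} returns {{False}}. A 1-Mbyte message does not, so {{needs_imap_fetch(1024 * 1024)}} returns {{True}}.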

If a user copies a message or sets a message flag, the event notification message contains all the information that ISS needs to update the ISS store. ISS does not need to download any more information to keep its store in sync with the Messaging Server message store.

If the event notification is for a new email message that has arrived in a user's mailbox, ISS passes it to the Parser/Converter for processing. The message is separated into the fields that ISS indexes. Attachments are separated from the body text for processing by the Converter. As long as ISS has a converter for the attachment type, it extracts the "meaningful" text. This is implemented using a plugin architecture, but all plugins are internal to the ISS product. The ISS HTML converter indexes only text outside of HTML tags. That is, the HTML converter ignores HTML markup, and indexes only the content.
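To illustrate what indexing only the text outside of HTML tags means, here is a minimal sketch built on the Python standard library. It is not the ISS HTML converter. The class and function names are hypothetical, and a real converter would need to handle more cases than this.

```python
from html.parser import HTMLParser


class TextOnlyExtractor(HTMLParser):
    """Collects character data and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; markup never reaches this method.
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_indexable_text(html):
    """Return the text content of an HTML fragment, without markup."""
    parser = TextOnlyExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

For the input {{<p>Quarterly <b>report</b> attached</p>}}, the function returns {{Quarterly report attached}}.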

In the case of text, PDF, or OpenOffice attachments, the ISS converters translate the format to text content. Additionally, ISS discards stop words such as "the." Only some of the attachments that ISS indexes are actually saved to the attachment store. The reason for this restriction is that some attachments do not have thumbnail images and so it does not make sense to store them. For example, ISS does not store thumbnail images for {{.txt}} and {{.xml}} attachments. ISS does support indexing Microsoft Office documents, including Word, Excel, PowerPoint, and Visio, in the attachment store.
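The stop-word filtering mentioned above can be sketched as follows. The stop-word list is hypothetical apart from "the", which is the only example that this page names, and splitting on whitespace is a simplification.

```python
# Hypothetical stop-word list; only "the" is named on this page.
STOP_WORDS = {"the", "a", "an", "and", "of"}


def index_terms(text):
    """Return the lowercased words of the text, minus stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]
```

For example, {{index_terms("The budget of the project")}} returns {{["budget", "project"]}}.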

h3. Attachments and Levels of Support

ISS supports many types of attachments, including documents, audio, and video. In addition, ISS provides three levels of attachment support, in which ISS:

* Categorizes the attachment by file type and makes it searchable by type or by its advertised file name
* Extracts the text from the file and indexes it
* Extracts the text from the file and extracts or generates a file-specific thumbnail image

For more information, see [Indexing and Search Service Supported Attachments].

h2. ISS Security and Authentication

The files in the ISS attachment store and index are owned by the ISS user whom you specified during configuration. This is analogous to the Messaging Server user owning the files in the message store. Regular users can read only their own files in the store. They cannot access other users' files.

To search mail in the ISS store, users must authenticate to LDAP before they can use the RESTful web service. A second means of authentication is for the Messaging Server host itself to authenticate to ISS through the {{mail.server.ip}} property that you specified during configuration and that is defined in the {{jiss.conf}} file. This check grants the Messaging Server host or hosts, identified by IP address, access to the RESTful web service.
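Conceptually, the host-based check compares the address of the connecting host against the configured Messaging Server addresses. The sketch below assumes that the property value is a comma-separated list of IP addresses. That format and the function name are assumptions for illustration, so check the configuration documentation for the real syntax of {{mail.server.ip}}.

```python
import ipaddress


def is_trusted_mail_host(client_ip, mail_server_ips):
    """Return True if client_ip matches one of the configured addresses.

    mail_server_ips -- assumed here to be a comma-separated string,
                       for example "192.0.2.10, 192.0.2.11"
    """
    client = ipaddress.ip_address(client_ip)
    configured = [ip.strip() for ip in mail_server_ips.split(",") if ip.strip()]
    # Compare parsed addresses rather than raw strings so that
    # equivalent spellings of the same address still match.
    return any(client == ipaddress.ip_address(ip) for ip in configured)
```

With the value {{192.0.2.10, 192.0.2.11}}, a request from {{192.0.2.10}} is accepted and a request from {{192.0.2.99}} is not.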

When securing your ISS deployment, be sure to change the passwords for the JMQ guest and admin users. For more information, see the steps for configuring the GlassFish Message Queue broker in [Preparing Messaging Server for ISS Integration|CommSuite7U1:Preparing Messaging Server for Indexing and Search Service Integration].

If necessary, you can also configure JMQ to use SSL, though this configuration has not officially been tested yet.

h2. Search Query and Sort Criteria

For guidance on generating search queries to ISS, see [Indexing and Search Service Query and Sort Criteria Summary].

h3. About Search Results and Pattern Matching

The search web service allows for pagination of results through the {{start}} and {{count}} parameters. However, when the queries come in from Messaging Server, the count is always set to the maximum value ({{count=2147483647}}).
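The effect of {{start}} and {{count}} can be pictured as taking a window over the full result list. The sketch below works on an in-memory list and assumes that {{start}} is zero-based, which this page does not state. It does not call the web service.

```python
# The count that Messaging Server always sends, as noted above.
MAX_COUNT = 2147483647


def page(results, start=0, count=MAX_COUNT):
    """Return up to `count` results beginning at offset `start` (assumed zero-based)."""
    return results[start:start + count]
```

With ten results, {{page(results, start=2, count=3)}} returns three of them, while the Messaging Server style call with the maximum count returns the whole list.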

You might not see all the search results that you expect, because ISS does not perform the search the same way that IMAP does. A partial match does not result in a match unless you provide a wildcard character to the search web service. Currently, this capability is not available through the Messaging Server search integration.

That is, searching for "apple" will not match "apples," but searching for "apple*" does match both "apple" and "apples." Currently, you can use the wildcard if you use the RESTful web service directly and omit the double quotes, but not if you search by using IMAP. Messaging Server puts the terms in double quotes, so if you put "apple*" in your IMAP client, this string is interpreted by ISS as "apple*" and the asterisk \(*) is not interpreted as a wildcard.
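The matching behavior described above can be summarized in a few lines. This is a simplified model of the semantics, not the ISS matching code. It handles only a trailing asterisk, and the {{quoted}} flag stands in for the double quotes that Messaging Server adds.

```python
def term_matches(query, word, quoted=False):
    """Return True if `word` matches `query` under the rules described above.

    An unquoted query ending in "*" matches any word with that prefix.
    A quoted query is compared literally, so the asterisk is not a wildcard.
    """
    if not quoted and query.endswith("*"):
        return word.startswith(query[:-1])
    return word == query
```

Here {{term_matches("apple", "apples")}} returns {{False}} and {{term_matches("apple*", "apples")}} returns {{True}}. With {{quoted=True}}, as happens for searches that arrive through IMAP, the same wildcard query no longer matches "apples".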

To experiment with the richer search experience of "talking to" the web services directly, go to the following URLs:

* {nolink}{{{}http://}}{_}iss-host{_}{{:8080/rest{}}}{nolink}
* {nolink}{{{}http://}}{_}iss-host{_}{{:8080/searchui/}}{nolink}

ISS provides sample searches at these URLs. These searches talk directly to the web service and utilize the thumbnail images.