This section describes how to troubleshoot your Indexing and Search Service deployment.
Topics in this section:
- Oracle Communications Messaging Server IMAP Log
- Indexing and Search Service infraprov Log
- Indexing and Search Service Service Management Facility (SMF) Logs
- Messaging Server and Indexing and Search Service IMQ Broker Logs
- Indexing and Search Service GlassFish Server Log
- Indexing and Search Service Messaging Server JMQ Event Consumer Log
- Indexing and Search Service Messaging Server JMQ Consumer Statistics Log
- Indexing and Search Service Index Service Log
- Indexing and Search Service Index Service Statistics Log
- Indexing and Search Service Search Service Log
- Indexing and Search Service Search Service Statistics Log
- Indexing and Search Service Utility Service Log
When to use: If bootstrapping of accounts is failing, authentication might be a problem. Check the Messaging Server IMAP log. If your read-only message store administrative user (mail.imap.admin.username in the /opt/sun/comms/jiss/etc/jiss.conf file) is the problem, you might see errors similar to the following:
If the password of that user is incorrect (mail.imap.admin.password in the /opt/sun/comms/jiss/etc/jiss.conf file), you might see errors similar to the following:
Also, ensure that this user is specified in store.indexeradmins (to give read-only message store admin rights) and that you have restarted the imapd process to pick up this change. If this is an issue, you might see errors similar to the following:
This read-only message store administrative user is not created automatically. You must add it manually during the ISS installation process. See Installation Scenario - Indexing and Search Service for more information.
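Before digging through the IMAP log, it can help to confirm which administrative user ISS is actually configured to use. The following is a minimal sketch, assuming that jiss.conf stores settings as `name = value` lines (an assumption about the file format, not something stated in this section). The helper name `conf_get` is hypothetical.

```shell
# conf_get FILE KEY: print the value assigned to KEY in a properties-style file.
# Assumes "name = value" lines; the jiss.conf layout is an assumption here.
conf_get() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            pos = index($0, "=")
            if (pos == 0) next                   # not an assignment
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim whitespace around the name
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim whitespace around the value
            if (name == key) { print value; exit }
        }
    ' "$1"
}

# Example, using the path and parameter named in this section:
# conf_get /opt/sun/comms/jiss/etc/jiss.conf mail.imap.admin.username
```

Compare the printed user name with the value of store.indexeradmins on the Messaging Server host. The sketch prints nothing if the parameter is absent.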
When to use: To check for problems during installation and initial configuration. infraprov commands log to this file.
When to use: If you receive the following error when trying to start the ISS service:
To determine why indexSvc entered the maintenance state, check the /var/svc/log/application-jiss-indexSvc:default.log log file. There are several possible causes, one of which is that the GlassFish Message Queue broker is not running.
After determining the cause, fix the underlying issue (for example, start the JMQ broker), and run the svc_control.sh stop command (to clear the maintenance state), followed by the svc_control.sh start command again.
When to use: To investigate problems, such as authentication failures, from the IMQ brokers on either the Messaging Server or the Indexing and Search Service system.
For more information, see Indexing and Search Service IMQ Log Messages.
When to use: Verify that GlassFish Server is receiving, authenticating, and servicing requests.
If you see a query from the Messaging Server that is refused, ensure that the originating IP address is correctly specified in the mail.server.ip parameter in the jiss.conf file. If you do not see a query as expected from Messaging Server in the ISS GlassFish Server log, check the settings of the service.imap.indexer.* configutil parameters. If those settings are correct, try issuing a query directly from the Messaging Server and check the ISS GlassFish Server log again, for example:
# telnet isshost.example.com 8080
Trying 10.10.10.10...
Connected to isshost.example.com.
Escape character is '^]'.
GET /rest/search?q=%20%2busername:user1%20%2Bhostname:mailhost.example.com&contentformat=simpleuid&format=atom HTTP/1.0

HTTP/1.1 200 OK
X-Powered-By: Servlet/2.5
Server: Sun GlassFish Enterprise Server v2.1.1 Patch16
Content-Language: *
Accept-Ranges: bytes
Content-Type: text/xml;charset=UTF-8
Content-Length: 1781
Date: Thu, 14 Nov 2013 22:24:27 GMT
Connection: close

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
<title>isshost.example.com search: +username:user1 +hostname:mailhost.example.com</title>
<link>http://isshost.example.com/rest/search?q=+username:user1%20+hostname:isshost.example.com&format=atom</link>
<updated>Thu Nov 14 22:24:27 UTC 2013</updated>
<author><name>Oracle, Inc.</name></author><id>urn:uuid:9999999999999</id>
<opensearch:totalResults>39</opensearch:totalResults>
<opensearch:startIndex>0</opensearch:startIndex>
<opensearch:itemsPerPage>10</opensearch:itemsPerPage>
<opensearch:Query role="request" searchTerms="+username:user1 +hostname:isshost.example.com" searchPage="-1" />
<link rel="search" type="application/opensearchdescription+xml" href="http://isshost.example.com" />
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>1</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>2</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>3</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>4</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>5</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>6</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>7</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>8</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>9</id> </entry>
<entry> <folder>Trash</folder> <uidvalidity>1195182036</uidvalidity> <id>10</id> </entry>
</feed>
Connection to isshost.example.com closed by foreign host.
For more information, see Indexing and Search Service Application Server Log Messages.
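When repeating the direct query for a different user, the URL-encoded request line is easy to mistype. The following is a minimal sketch that prints the request line in the same form as the telnet example in this section. The function name `search_request` is hypothetical, and the sketch assumes that the user name and host name contain no characters that need further URL encoding.

```shell
# search_request USER MAILHOST: print the HTTP/1.0 request line for a
# simple search on one account. "%20%2B" is the URL encoding of " +".
search_request() {
    printf 'GET /rest/search?q=%%20%%2Busername:%s%%20%%2Bhostname:%s&contentformat=simpleuid&format=atom HTTP/1.0\n' "$1" "$2"
}

# Example: print the request for user1 on mailhost.example.com, then paste
# it into the telnet session followed by an empty line.
search_request user1 mailhost.example.com
```

The output differs from the telnet example only in the case of one hexadecimal escape (`%2B` rather than `%2b`), which is equivalent in URL encoding.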
When to use: Verify the status of the Messaging Server event consumer (jmqconsumer) to ensure that real-time updates are working (that is, arrival of new mail, deletion of mail, new folder creation, and so on).
The mail store JMQ broker is only accessed by jmqconsumer. JMQ broker connection problems with the Messaging Server message store appear in this log file. The following messages show that the JMQ broker is functioning:
For more information, see Indexing and Search Service JMQ Consumer Log Messages.
When to use: To check JMQ event statistics related to performance and patterns of service over long periods of time.
For more information, see Output of listbacklog Option.
When to use: To check issues with indexing during bootstrapping or during the indexing of new real-time events from Messaging Server.
For more information, see Indexing and Search Service Index Service Log Messages.
When to use: To check indexing statistics related to performance and patterns of service over long periods of time.
For more information, see Output of liststats Option.
When to use: To check on the arrival and servicing of search requests.
For more information, see Indexing and Search Service Search Service Log Messages.
When to use: To check search statistics related to performance and patterns of search requests over long periods of time.
For more information, see Output of liststats Option.
When to use: To check on the arrival and servicing of utility requests.
For more information, see Indexing and Search Service Utility Service Log Messages.
ISS provides various command-line tools to manage the ISS store. The primary tool is the issadmin.sh script. Other tools listed in this section are intended for specific problems. See Indexing and Search Service Command-Line Utilities for more details.
When to use: If an individual index directory is corrupted. Checks for consistency and corrects some types of errors at the Lucene data record level.
When to use: To inspect internal database structure at the Lucene data record level. This script runs from the command line and supports commands to search and display records in an individual index directory. Its features are similar to those of the luke.sh command.
When to use: To inspect internal database structure at the Lucene data record level. This script runs a graphical user interface (GUI) that corresponds to lucli.sh. To enable this script, you must manually download the jar files.
When to use: To manipulate index directories directly at the Lucene file level. This script is rarely used, because it requires detailed knowledge of internal store structure.
When to use: To search accounts by using the command-line interface. Can be used to verify if an account contains the expected data from search queries that have failed.
When to use: To diagnose general problems, to recreate lost data from single or multiple accounts, and to list, create, delete, check, and sync accounts and folders.
When you suspect a problem with an account, for example, after finding an error message in the log files, you can print a summary of the account structure by using the --accountinfo command option.
Output from this option includes the following:
- The name of the account
- The group to which it is assigned
- The disk space used by its group
- The state of the account
- The time that the account was created in the ISS store
- The time that the state of the account was last changed
- Counts of the number of emails in each folder and the total for the account
- Specific information about each folder by name, including the number of emails found containing attachments (as indicated by the at: field), the number of emails that produced attachment store files (typically thumbnails used for client display, or content files under some configurations), and other helpful information
The names of the folders are case sensitive and should display as seen in the email client. If any name appears as a string of question marks, this usually indicates a problem with displaying other character sets (such as Korean or Chinese) on your terminal. Make sure that your terminal is configured to display UTF-8, for example:
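One way to check the current terminal setting is to look at the character map that the locale reports. The following is a minimal sketch, assuming a POSIX shell with the standard `locale` utility. The function name `is_utf8_charmap` and the messages are illustrative only.

```shell
# is_utf8_charmap NAME: succeed when NAME is a UTF-8 character map name.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the current session; "locale charmap" prints the active character map.
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "terminal locale uses UTF-8"
else
    echo "terminal locale is not UTF-8; non-ASCII folder names may display as ?"
fi
```

The terminal emulator itself must also be set to UTF-8. This sketch checks only the locale of the shell session.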
In addition to the information for the specific account, the header of the output displays some global characteristics of the store. The total size, total search, and total index values are always zero at this time. Most of the other values are self-explanatory. If any appear inconsistent (for example, the number of accounts does not match the number you created less those you deleted), investigate by using the --listbrief or --listaccounts options, or both. (These commands are more expensive and can take several minutes to complete when the store contains many accounts.) The dIndex memory locked field should usually be false, unless at some point you used the --lockmemoryindex option. Take special care when using this option, because it can cause unexpected behavior in the store and search.
The email counts can have two forms: a single integer, or a pair of integers separated by "/". The single integer form is the number of emails currently in the folder. The first integer of a pair also indicates the number of emails currently in the folder. The second integer might be smaller or larger than the first. Its presence indicates that some emails were copied to or from the folder, and it is the number of emails whose copied content is still associated with the folder. Immediately after an account is bootstrapped, --accountinfo should not show any folder with a count in the pair form. A pair at that point indicates some kind of problem with the bootstrap procedure. (After a user has been manipulating the contents of the folders of an account, the pair form could be due to normal copying or moving of messages between folders.) If you suspect a problem, use the --checkfolder or --checkaccount option to verify that there is no problem with the content of the account.
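The two count forms can be told apart mechanically when you review many folders. The following is a minimal sketch that interprets a single count field. The function name `describe_count` and its output labels are hypothetical, and the sketch does not parse the full --accountinfo output, whose layout is not shown in this section.

```shell
# describe_count FIELD: interpret one email count field.
# FIELD is either "N" (single form) or "N/M" (pair form).
describe_count() {
    case "$1" in
        */*)
            current=${1%%/*}        # emails currently in the folder
            associated=${1#*/}      # emails whose copied content is still associated
            echo "current=$current associated=$associated form=pair"
            ;;
        *)
            echo "current=$1 form=single"
            ;;
    esac
}

describe_count 42       # single form
describe_count 40/45    # pair form
```

A pair form reported right after a bootstrap is the case worth following up with --checkfolder or --checkaccount.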
Output from this option shows whether the account information in the index matches (is "in sync" with) the information in the Messaging Server message store that the index depends upon. (The --checkfolder option performs the same kind of checking on a single folder.) If you suspect that an account has a problem, the --checkaccount option output indicates which folders are not in sync. If the number of emails in any folder does not match the Messaging Server message store count, use the --sync option with the --checkaccount option to perform an on-demand account update. This usually syncs the account within minutes.
The --checkaccount option updates the number of emails in the index to match the Messaging Server message store, but does not perform a detailed verification of all information in the index against the Messaging Server message store, as this is more time and resource intensive. Add the --detail modifier to the --checkaccount option to perform a detailed check. The "FLAGS" level also checks for any email flag information that is out of sync, and when used with the --sync option, it corrects such problems. When you specify a level of "STATUS" or "FULL", the --detail modifier likely produces much more output. For example, you would see any indexing problems found with individual emails in the account or folder. Many of these "STATUS" messages result from attachments containing data that cannot be interpreted properly. Sometimes the data format is inconsistent or other limitations on the indexing conversion process occur. These messages cannot be avoided when using the --sync option. These messages represent limitations on the kind of data that can be indexed and searched and might explain why some search queries are not able to return results as expected.
However, the output from the --detail STATUS modifier can indicate other problems with the indexed data that can be corrected. If problems persist with the account after using --sync, examine them for clues about what might be wrong.
Many problems with accounts can be corrected by using the --sync option, but not all. If --sync fails to correct the account, there might be an internal consistency problem that requires other approaches. If the problem appears to be local to a specific folder, then using the --deletefolder option on the problem folder, followed by a bootstrap (using the --bootstrap --folder option to issadmin.sh), might correct it. Each folder can be corrected in turn, or, in the worst case, you would need to use the --deleteaccount option and rebootstrap the entire account. If you need to perform such repairs, it is best to start by using the --setstate X option to ensure that the account index is offline from the real-time event updates while you delete and rebootstrap folders or the account.
Output from this option describes the overall state of the store, in contrast to the --checkaccount and --checkfolder commands, which describe information about the structure of individual accounts. There are two parts to this information: one part is generated by the --detail DINDEX modifier and one by the --detail STORE modifier. Both parts appear when the --detail FULL modifier is used.
The output from using the --detail DINDEX modifier contains information about the master dIndex directory alone. The command checks for the records describing accounts and groups, looking for duplicate or missing records, and comparing their information for internal consistency between records.
The output from using the --detail STORE modifier contains information about the index store that holds the individual account group index directories. The command checks the information in the dIndex against what it finds in the index store directories for extra or missing group index directories, comparing the dIndex and store directories for consistency.
The output from the --checkstore command can be quite verbose. For a store containing a very large number of accounts, the --altoutput FILE option is sometimes required to avoid problems.
You can run the --checkstore command whether or not the services are running. However, because the --checkstore command might conflict with indexing writes, running it while services are running is not recommended: the results could be misleading. If the output shows a problem, you will likely need to stop services before it can be corrected. Local account problems can usually be corrected with the --checkaccount and --sync options without disrupting services for the rest of the system. However, if serious consistency problems appear in many accounts and require extensive rebuilding of the store, schedule such work for a time when the services can be shut down.
The --sync option can be used with the --checkstore --detail DINDEX options to correct inconsistencies found in the dIndex. The --sync option might update the dIndex, and so is not permitted while the services are running. After running with --sync, check the output of the --checkstore command again. Some problems might not have been corrected by --sync, and require further intervention.
Commands such as issadmin.sh invoke services in the IndexService, which is a separate running process. (The issadmin.sh command can also be run when the IndexServices are not running.) When these commands are executed, they submit tasks to the IndexService, which performs the work, and then they wait for the response. If you interrupt one of the commands (by using either Control-C at the command line or the kill command), the submitting process is stopped, but not the services that are active in the IndexService. These services continue to run to completion even though you have killed the submitting process. Some commands, such as bootstrapping a large account or various uses of the --accountlist option, can also take a long time to complete, and you might need to interrupt these requests.
To stop a service in the IndexService after you interrupt a command, use the --listactiveservices option to determine which services are still active in the IndexService. Typical output for a host running Indexing and Search Service 1 Update 4 after interrupting a --checkaccount --sync command might look like this:
# ./issadmin.sh --listactiveservices
Fri Aug 24 00:29:16 GMT 2012
active index services:
a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:edoc:0:1:Fri_Aug_24_00_28_45_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:2000:2001:Fri_Aug_24_00_29_13_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:3000:3001:Fri_Aug_24_00_29_13_GMT_2012:w
a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:Fri_Aug_24_00_00_12_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:5000:5001:Fri_Aug_24_00_29_13_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:1000:1001:Fri_Aug_24_00_29_13_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:Fri_Aug_24_00_28_45_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:0:1:Fri_Aug_24_00_29_12_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:Fri_Aug_24_00_28_45_GMT_2012:w
a:28323:checkaccount:1:jennifer:mailhost.example.com:Fri_Aug_24_00_28_44_GMT_2012:w
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:4000:4001:Fri_Aug_24_00_29_13_GMT_2012:w
a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:edoc:0:189816754:Fri_Aug_24_00_11_44_GMT_2012:w
Fri Aug 24 00:29:23 GMT 2012
This example shows that many threads are still active: Each line of output is a separate thread. Names that end with ":w" indicate that these threads are potentially writing to the index. (The a: indicates an issadmin.sh command. The number is a process ID used to identify all threads created under the same issadmin.sh command. Other fields are command specific. Threads created from the autosync and other features are also listed.)
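When the list is long, it helps to see how many threads belong to each command before deciding what to stop. The following is a minimal sketch that reads --listactiveservices output on standard input and counts threads per group, where a group is the first two colon-separated fields (for example, a:28323:). The function name `count_service_groups` is hypothetical, and the sketch assumes one service name per line.

```shell
# count_service_groups: read service names on stdin and print the number of
# threads in each "a:ID:" group. Header and timestamp lines are ignored
# because their first colon-separated field is not "a".
count_service_groups() {
    awk -F: '
        { sub(/^[ \t]+/, "") }                        # drop leading whitespace
        $1 == "a" && NF > 2 { count[$1 ":" $2 ":"]++ }
        END { for (g in count) print g, count[g] }
    ' | sort
}

# Example, assuming issadmin.sh is in the current directory:
# ./issadmin.sh --listactiveservices | count_service_groups
```

Each output line is a group prefix followed by its thread count. The prefix is in the form that can be passed to the stop command described in this section.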
By using the --stopservice command, you can stop any one of the named threads. The following command stops a single thread:
However, usually you would want to stop all threads in a group. The following example stops all threads whose names start with a:28323:
Any string that ends with a colon (:) can be used as a prefix/wildcard.
The --stopservice command generates the following output:
# ./issadmin.sh --stopservice a:28323:
Fri Aug 24 00:29:43 GMT 2012
stop of a:28323: :
a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:edoc:0:1:Fri_Aug_24_00_28_45_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:2000:2001:Fri_Aug_24_00_29_13_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:3000:3001:Fri_Aug_24_00_29_13_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:5000:5001:Fri_Aug_24_00_29_13_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:1000:1001:Fri_Aug_24_00_29_13_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:Fri_Aug_24_00_28_45_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:0:1:Fri_Aug_24_00_29_12_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:Fri_Aug_24_00_28_45_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:1:jennifer:mailhost.example.com:Fri_Aug_24_00_28_44_GMT_2012:w writing service not cancelled; marked to stop
a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:4000:4001:Fri_Aug_24_00_29_13_GMT_2012:w writing service not cancelled; marked to stop
Fri Aug 24 00:29:53 GMT 2012
Note that threads that are "done" stop immediately. To avoid corrupting the index data, threads marked with ":w" must stop when they next reach a point where the index is no longer being written. This can take a bit longer, but the threads usually stop fairly quickly.
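Before stopping a group, you can preview which service names a prefix would select. The following is a minimal sketch that models the prefix rule as a plain "starts with" match on names read from standard input. This is an assumption about how the prefix is applied, and the function name `select_by_prefix` is hypothetical.

```shell
# select_by_prefix PREFIX: print the service names on stdin that start with
# PREFIX. The prefix must end with a colon, as described in this section.
select_by_prefix() {
    case "$1" in
        *:) ;;
        *) echo "prefix must end with a colon (:)" >&2; return 2 ;;
    esac
    awk -v p="$1" '{ sub(/^[ \t]+/, "") } index($0, p) == 1'
}

# Example, assuming issadmin.sh is in the current directory:
# ./issadmin.sh --listactiveservices | select_by_prefix a:28323:
```

In the example output in this section, the prefix a:28323: selects the threads of the interrupted command and leaves the a:Autosync: threads alone.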
However, because the --sync command continues to run while you are using these commands, more threads might be created by threads that have not stopped yet. Therefore, you need to repeat the --listactiveservices and --stopservice commands, perhaps several times:
# ./issadmin.sh --listactiveservices Fri Aug 24 00:29:58 GMT 2012 active index services: a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:Fri_Aug_24_00_00_12_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:3000:3001:Fri_Aug_24_00_30_07_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:Fri_Aug_24_00_28_45_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:Fri_Aug_24_00_28_45_GMT_2012:w a:28323:checkaccount:1:jennifer:mailhost.example.com:Fri_Aug_24_00_28_44_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:2000:2001:Fri_Aug_24_00_30_07_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:0:1:Fri_Aug_24_00_30_07_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:4000:4001:Fri_Aug_24_00_30_07_GMT_2012:w a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:edoc:0:189816754:Fri_Aug_24_00_11_44_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:5000:5001:Fri_Aug_24_00_30_07_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:edoc:0:1:Fri_Aug_24_00_30_00_GMT_2012:w a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:1000:1001:Fri_Aug_24_00_30_07_GMT_2012:w Fri Aug 24 00:30:08 GMT 2012 ./issadmin.sh --stopservice a:28323: Fri Aug 24 00:30:31 GMT 2012 stop of a:28323: : a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:3000:3001:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:Fri_Aug_24_00_28_45_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:Fri_Aug_24_00_28_45_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:1:jennifer:mailhost.example.com:Fri_Aug_24_00_28_44_GMT_2012:w 
writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:2000:2001:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:0:1:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:4000:4001:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:5000:5001:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:javaone2009:edoc:0:1:Fri_Aug_24_00_30_00_GMT_2012:w writing service not cancelled; marked to stop a:28323:checkaccount:sync:mailhost.example.com:jennifer:ubuntu-art:edoc:1000:1001:Fri_Aug_24_00_30_07_GMT_2012:w writing service not cancelled; marked to stop Fri Aug 24 00:30:45 GMT 2012 ./issadmin.sh --listactiveservices Fri Aug 24 00:30:48 GMT 2012 active index services: a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:Fri_Aug_24_00_00_12_GMT_2012:w a:Autosync:checkaccount:sync:mailhost.example.com:amam:INBOX:edoc:0:189816754:Fri_Aug_24_00_11_44_GMT_2012:w Fri Aug 24 00:30:54 GMT 2012 ./issadmin.sh --listactiveservices Fri Aug 24 01:11:38 GMT 2012 active index services: none Fri Aug 24 01:11:40 GMT 2012
Eventually all threads stop, as seen when --listactiveservices shows no more threads running. In this case, the --sync command did not complete, so the state of the account and its contents might not be as expected. You can proceed by using commands such as --accountinfo to see what the account looks like and clean it up. Console output from the original --sync command is not available, because the output is lost when you interrupt the command.
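The repeated list-and-stop cycle can be wrapped in a small loop. The following is a minimal sketch that uses only the --listactiveservices and --stopservice options shown in this section. The `ISSADMIN` path, the `STOP_WAIT` delay, the retry limit, and both function names are assumptions made for illustration.

```shell
# ISSADMIN: path to issadmin.sh (an assumption; adjust for your host).
ISSADMIN=${ISSADMIN:-./issadmin.sh}
# STOP_WAIT: seconds to wait between attempts (hypothetical tuning value).
STOP_WAIT=${STOP_WAIT:-10}

# group_is_active PREFIX: succeed if --listactiveservices still reports a
# service whose name starts with PREFIX.
group_is_active() {
    "$ISSADMIN" --listactiveservices |
        awk -v p="$1" '
            { sub(/^[ \t]+/, "") }
            index($0, p) == 1 { found = 1 }
            END { exit found ? 0 : 1 }
        '
}

# stop_group PREFIX [MAX_TRIES]: repeat --stopservice until no service in
# the group remains. Returns 0 on success, 1 if MAX_TRIES is exhausted.
stop_group() {
    prefix=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        group_is_active "$prefix" || return 0
        "$ISSADMIN" --stopservice "$prefix"
        sleep "$STOP_WAIT"
        tries=$((tries - 1))
    done
    group_is_active "$prefix" && return 1
    return 0
}

# Example: stop_group a:28323: 5
```

The retry limit keeps the loop from running indefinitely if a writing thread takes a long time to reach a safe stopping point. In that case, rerun the loop or check the list by hand.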
You might also check the index directories of the account for write.lock files. Depending on when the interrupt occurred, these files might not be cleaned up properly.
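Leftover lock files can be located with the standard `find` utility. The following is a minimal sketch. The function name `find_write_locks` is hypothetical, and the directory argument stands for whichever index directory you are checking, because this section does not give its location.

```shell
# find_write_locks DIR: list write.lock files anywhere below DIR.
# DIR is the index directory to inspect; pass it explicitly.
find_write_locks() {
    find "$1" -type f -name write.lock -print
}

# Example (the path is a placeholder, not a documented location):
# find_write_locks /path/to/index/directory
```

The sketch only lists the files. Whether a lock file is stale depends on whether any service is still writing to that index directory.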
If you find that simple issadmin.sh commands (such as --liststats) hang and do not return within a few minutes, the IndexService might have failed in a way that requires you to shut down the services and restart them by using the svc_control.sh command. Both the indexing and search services are temporarily suspended during the restart.
If you detect a hanging condition, you can use the following commands to help diagnose the problem before shutting down services:
- To show the process ID (pid) of the IndexService component, run the jps command.
- Save all output to a file for later examination:
- To identify any index group directories that were open at the time of failure, run a command like the following:
- Stop the services by using the svc_control.sh stop command.
The output from these commands identifies groups of accounts that might be causing problems. After shutting down the services, remove the write.lock files for each group and use issadmin.sh commands to find accounts that might be affected.
- If possible, leave the services offline while you make repairs, then bring up the servers after repairs are complete.
- If you need to restore services quickly, use the --setstate I option to disable accounts that might be causing problems, then check these accounts after you restart the services to see if they need repair. Use the --accountinfo option on each account in such groups and look for any reason for the hanging.
After you restart services, if the --accountinfo header information does not look reasonable, or the hanging condition or other problems recur, the critical internal data structures might be damaged. For more information, refer to Disaster Recovery.
The information in an ISS Store instance is organized into accounts. In the email environment, each account in the ISS store uses the same user name as the corresponding account in the Messaging Server message store from which its content is derived.
The following characters are invalid in ISS account names:
If any of these characters appear in an account name, ISS does not create or bootstrap the index for that account, and search queries for such an account fail. For more information on account name restrictions in Messaging Server, see Message Store Valid UIDs and Folder Names.
This section describes how to migrate your ISS accounts from Java 6 to Java 7.
Topics in this section:
- About Migrating from Java 6 to Java 7
- Tools for Java 7 Migration
- Java 7 Migration Example
- Configuration Considerations for Java 7 Migration
- Selecting Appropriate Autosync Configuration Values
- Selecting Autobootstrap Configuration Values
If you used Java 7 to index the Indexing and Search Service store, then you do not need to migrate your ISS data. You can continue to update to future versions of Java 7, but you should not use any version of Java 6 with ISS.
The Java 7 release changed the Unicode representation of character strings compared to Java 6. If you created your Indexing and Search Service store using any Java 6 version, then updating to Java 7 might cause some search queries to produce inaccurate results. For example, the search queries might return too few or too many matches because of the changed data representation. To avoid this situation, you must reindex all or part of the ISS store data using Java 7. If you do not know what parts of the data are impacted, then you must reindex all accounts in the store.
Reindexing the ISS store can be very time consuming and disruptive to a production system. Beginning with version 188.8.131.52.0, ISS provides tools to identify which data you must reindex. Depending on your data, the number of accounts impacted might be only a fraction of the total. These tools also help you to manage indexing these accounts to minimize the Java 7 changes.
These ISS tools enable you to migrate to Java 7 by:
- Using the autosync process to identify which accounts must be reindexed under Java 7
- Halting all services, reconfiguring to use Java 7, and restarting services
- Submitting the accounts to be indexed using Java 7 to the autobootstrap process
The first and last steps can take a long time depending on how many accounts and how much data must be analyzed or indexed. The problem of inaccurate search results is limited to the time between restarting the service under Java 7 and the completion of the autobootstrap for each individual account. In this way all accounts are available for searching while ISS performs the corrections for Java 7.
ISS 184.108.40.206.0 includes the following new configuration parameter to identify which accounts to reindex for Java 7:
When you set the value of the iss.indexsvc.periodic.autosync.deepcheck.enable parameter to true, the autosync processing checks for Java 7 character set problems in each folder in each account and records the folder state in the index. When this parameter is set to false (the default), no checking for Java 7 problems occurs, and no folder state updates occur. The folder state indicates whether any email in the folder must be reindexed under Java 7.
You must enable ISS autosync for the iss.indexsvc.periodic.autosync.deepcheck.enable parameter to take effect. When iss.indexsvc.periodic.autosync.deepcheck.enable is set to true, autosync runs normally but performs the extra checking for Java 7 character set problems. This checking causes autosync to take at least twice as long as usual, and possibly much longer, to complete.
ISS 220.127.116.11.0 includes the following new option for the --checkstore command to monitor the progress of the deep checking:
This command collects and summarizes the information recorded in the folder state fields across all accounts. The output includes lines that you can extract to create a file suitable for use with the --setautobootlist FILE command. You can run this command whether deepcheck is enabled or not. Its output shows how much the autosync has completed and how much more indexing is needed before the Java 7 migration is complete. See Java 7 Migration Example for more information.
After you create the list of accounts that need to be indexed under Java 7 (by using the --checkstore --detail DEEPCHECK command), you submit the accounts to the autobootstrap queue for reindexing by using the --setautobootlist FILE command with the additional --reboot option.
FILE is the file of accounts to be reindexed in the usual --accountlist format as extracted from the --checkstore --detail DEEPCHECK output. The --reboot option informs the autobootstrap processing to delete the account if it already exists before reindexing it. In this manner ISS can still search the prior version of the account while it is waiting to be reindexed.
The following example shows how to use these ISS 18.104.22.168.0 tools to migrate an ISS store from Java 6 to Java 7. This example assumes that you have created the ISS store and it is still running under Java 6.
- Run the issadmin.sh --checkstore --detail DEEPCHECK command to determine how much data needs to be analyzed.
The output of this command resembles the following:
- Look at the section titled "Checking folder status" for the total folders in all accounts in the store, and those with status values marked.
Because the deep checking has not yet begun, no folders have status yet. As the autosync checks each folder, these counts change over time.
- Look at the last section titled "deepcheck folder analysis" for the status by accounts, which also shows that no status has been recorded.
The information in this section grows as accounts are found that require reindexing.
- Start the deep checking by setting the iss.indexsvc.periodic.autosync.deepcheck.enable configuration parameter to true.
The autosync configuration parameters must also be enabled when you refresh the configuration. See Selecting Appropriate Autosync Configuration Values for information on sizing the autosync configuration to reflect the increased overhead that the deep checking incurs.
- As the autosync checks the accounts, run the --checkstore command again to watch the progress.
Expect at least twice the usual time for autosync to process all the accounts, and perhaps much longer if the accounts contain a great deal of data. For an ISS Store containing 100,000 accounts, it could take several days to deep check every account. After the autosync has checked every account, the process continues as new email and other changes to the accounts occur. Running the --checkstore command at this point produces output similar to the following:
- Note how the "folder status" has changed.
All folders have a status now, and some "have Java 7 update issues." At this point there are no folders marked as "Java 7 safe" or "Java 7 indexed" because the reindexing has not yet begun.
- Note how the "deepcheck folder analysis" section has changed.
The accounts "clear of problems" do not need to be reindexed. They do not contain any data that requires them to be rebootstrapped. The accounts "with Java 7 update issues" each need to be reindexed under Java 7 to avoid search problems. The other totals should be zero after all accounts have been checked.
However, you might see a small number of accounts or folders that still show "no status" after the autosync has completed processing all accounts. This could be due to accounts being added since the autosync cycle began, or accounts that are not in the Active state. (Autosync only processes accounts in the Active state, so if any accounts are in the Inactive, Bootstrap, or Unknown states, you should use the --listbrief command to identify them. You can then either change them to Active for autosync to find later, or delete them if unneeded, or keep them for manual processing after the Java 7 update.)
The remaining part of the output shows the individual accounts that have Java 7 issues requiring reindexing. For each account, the number and names of the folders in which problems are detected are listed. This can give you a feel for how frequently problems were detected.
- To generate the file of accounts to be submitted to autobootstrap, find the string ";;;" by using the grep command. Extract only the account records needed for the --setautobootlist FILE command into a file.
- Once you have the file of accounts to reindex, you are ready to update to Java 7. Install Java 7 in addition to Java 6 on your host so that it is ready when you want to switch and does not delay the restart of the services.
- Run the issadmin.sh --setautobootlist FILE --reboot command to set up the accounts to autobootstrap before you shut down ISS services.
Make sure that the following parameter is also set:
Setting this parameter causes the list to be saved across the server restart. Then, after you restart ISS services, you only need to set the autobootstrap.enabled parameter to true and the bootstrapping begins.
- After you shut down ISS services, you only need to change the following jiss.conf parameter to indicate the change to the Java 7 install:
- After making this change, restart ISS services.
If you preset the accounts in the autobootstrap queue, you should also enable the autobootstrap before restart.
At this point all the accounts are using Java 7, but only the accounts that need reindexing should show any difference, and only if a search query happens to occur that triggers a character representational problem. From this point on, do not configure the store to use Java 6.
- The autosync and deepcheck are still enabled after the update to Java 7. The folder status continues to be updated. Use the --checkstore command to monitor the progress of the reindexing.
The following output shows Java 7 reindexing:
- Note that the totals in the "folder status" have shifted to "Java 7 safe" and "Java 7 indexed", indicating folders that had no previous problems or that have been reindexed using Java 7, respectively.
- Note how the "deepcheck folder analysis" counts have changed, and the list of accounts needing reindexing has shrunk as the bootstrapping proceeds. Eventually all the accounts submitted for autobootstrap are indexed, and the --checkstore output resembles the following:
All folders and accounts show either "safe" or "indexed", indicating the update to Java 7 is complete.
- Disable the deepcheck by setting the iss.indexsvc.periodic.autosync.deepcheck.enable parameter to false and running the --refresh command, to reduce the overhead of the autosync.
If any counts other than "safe" and "indexed" are not zero, then there might be accounts that have not been corrected because they were not Active (as noted previously). Use the --listbrief command to identify any accounts that you might need to delete and reindex manually.
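For the step in this example that builds the account file, a minimal sketch follows. It assumes that the account records are exactly the lines of the saved --checkstore --detail DEEPCHECK output that contain ";;;". The function name and the file names are hypothetical.

```shell
#!/bin/sh
# extract_autoboot_list CHECKSTORE_OUTPUT ACCOUNT_FILE
# Hypothetical helper: copies every line containing ";;;" into ACCOUNT_FILE.
# Assumption: those lines are the account records in --accountlist format.
extract_autoboot_list() {
  # -F treats ";;;" as a fixed string rather than a regular expression.
  grep -F ';;;' "$1" > "$2"
}

# Example, using the commands named in this guide:
#   issadmin.sh --checkstore --detail DEEPCHECK > deepcheck.out
#   extract_autoboot_list deepcheck.out accounts.txt
#   issadmin.sh --setautobootlist accounts.txt --reboot
```

Review the extracted file before submitting it, and reorder its lines if some accounts should be reindexed first.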
The previous example involved a small number of accounts (55). A production ISS store likely contains several thousand times more accounts and data, and so the --checkstore command might take a few minutes to complete instead of seconds as in this example. The time needed to deep check all accounts, and to reindex all accounts in which a Java 7 problem is detected, will also be long depending on the size of your data and how many problems are detected. Therefore, take care to configure the autosync and autobootstrap parameters to keep the overhead down and enable normal operation of ISS services. The following section provides guidelines for deciding how to balance the costs of the Java 7 upgrade with normal server operation.
Use the following guidelines to determine appropriate autosync configuration values for your deployment.
- Check log files. Before enabling the deep check feature, examine the IndexSvc log files for an indication of the current autosync overhead. If autosync is already enabled, and the log level allows for INFO messages, look for messages containing the phrase "findAllCandidates: time to create list." These messages occur when autosync has completed a cycle of all accounts and generated a new list to start the process again. The time stamp of such messages can help you determine approximately how long it takes the server to cycle through all the accounts for autosync using current configuration parameters. This time duration is less than the lower bound for deep check processing, because deep checking is much more resource intensive than the typical autosync processing. This gives you an approximation for how long the deep check might take to complete.
- Process accounts in parallel. The deep checking overhead is significant on both ISS and Messaging Server. Each email in each folder of each account must be scanned for Java 7 migration issues. Thus the load on the Messaging Server is comparable to a full bootstrap of the account, and somewhat less on the ISS server. Spread this load out over time by adjusting the configuration parameters to process only a small number of accounts in parallel at one time, to avoid incurring delays in the normal ISS search services. As shown in the previous example, use the --checkstore --detail DEEPCHECK command to monitor progress to achieve the right balance between normal ISS processing and the deep check in autosync.
- Set parameters for the first time running autosync. If you have not been running autosync, and do not have a way to estimate the full cycle duration, then set the autosync iss.indexsvc.periodic.autosync.count and iss.indexsvc.periodic.autosync.thread.count parameters to one third the default parameter values in the jiss.conf.template file to begin deep check processing. Observe the overhead and adjust these parameter values as appropriate to the load on the servers.
- Start with small count and interval values. You can modify the autosync parameters in the jiss.conf file and apply them by running the issadmin.sh --refresh command, without having to restart ISS services. However, the new values do not take effect immediately. The current work period must finish before the new values are applied. Thus the "count" and "interval" values should be kept relatively small until you have determined the load on the system, so they can be refreshed quickly without incurring a long wait for the current autosync work or interval to complete.
- Run the deep check off hours. Because the deep check processing might take a long time, run it during known times of low system load. The iss.indexsvc.periodic.autosync.deepcheck.enable parameter is refreshable, so you can turn it off or on as the load on the system varies. The work so far completed is recorded in the index, so no information is lost. However, completing the deep check takes much longer if not run all the time.
- When to stop deep check. If, after running the deep check for a while, the --checkstore --detail DEEPCHECK command output indicates a very high rate of accounts that require reindexing (say 80 to 90 percent of all accounts checked), you might consider stopping the deep check. Then you can simply shut down the server, update to Java 7, and rebootstrap all accounts in the index. This avoids the extra overhead of the deep check, and completes the Java 7 update in less time, but at the cost of not knowing which accounts may return inaccurate search results during the period before being indexed using Java 7.
- Create your own bootstrap commands. The --checkstore --detail DEEPCHECK command output includes comments showing those folders in an account which contain data that must be reindexed. Only the folders listed must actually be corrected. To reduce the amount of data to be reindexed, you can create your own list of --bootstrap commands which use --folder NAME option on just the folders indicated. This also reduces the time needed to complete the bootstrap. However, you must generate such a command list manually, which is harder to manage, and perhaps more error prone. This is only recommended in special cases, such as if the --checkstore --detail DEEPCHECK command output shows almost all the accounts have the same single folder (such as INBOX) to be reindexed. You could use a sequence of commands such as the following to reindex each account XXXX, and complete the Java 7 update more quickly than the general autobootstrap procedure outlined previously.
Any accounts which do not follow this pattern would have to be indexed individually based on which folders are needed to complete the update.
Once the deep check processing has produced a list of accounts to be indexed under Java 7, estimate the reindexing time for those accounts to determine what configuration parameters to use for the autobootstrap processing. The account list that you use for the --setautobootlist FILE command can also be used in the --accountinfo command to find how many emails each account contains. However, if the number of accounts is large, this might not be a practical approach.
As with the autosync parameters, start by using relatively small values for the autobootstrap "count," "interval," and "thread.count" parameters, so that you can quickly --refresh them as you monitor the autobootstrap process. The order of the accounts being bootstrapped is roughly the order of the names in your --setautobootlist FILE. Accounts later in the list have a larger likelihood of being searched before being rebootstrapped under Java 7. You can reorder the lines in the --setautobootlist FILE if you have any preference of which accounts you would rather have reindexed first. The order generated by the deep check is random.
After you have submitted the accounts for autobootstrap with the --setautobootlist FILE --reboot command, you can adjust the autobootstrap parameters based on how quickly the various accounts finish and the load on the system. If the autobootstrap load causes service to degrade too much, you can reduce the autobootstrap "count," "interval," and "thread.count" parameter values by using the --refresh command. You can even disable autobootstrapping as you think best. The list of accounts is retained unless you use the --unsetautobootlist FILE command to remove accounts not yet finished bootstrapping. If you "unset" and then later "set" any accounts, remember to use the --reboot option with the --setautobootlist FILE command.
Hardware failures, power loss, or software bugs can corrupt index store data. As a result, Indexing and Search Services might fail or stop responding.
When problems like these occur, normal operations might resume automatically, or you might need to perform significant intervention. The severity of and recovery from such failures depend on where the data corruption occurs.
The index store consists of two major parts:
- The master directory (dIndex) – Contains information about the organization of accounts
- The account group index directories – Contain data about individual emails and folders in each account
Recovery from failures in each of these parts requires different approaches. The following sections contain general information about disaster recovery approaches, as well as specific scenarios.
The first high-level step for disaster recovery is to ensure that the dIndex is usable.
- Before attempting recovery operations on the dIndex data, make sure that the servers are stopped.
- Look for any write.lock file in the dIndex directory and remove it. The presence of a write.lock file means that dIndex was likely being written when the failure occurred, so it may not be reliable.
- Before starting the services or any other issadmin.sh commands, run the issadmin.sh --checkstore --detail DINDEX command to see if the dIndex structure is consistent.
If no problems are reported, then dIndex might not be seriously affected. If you see any warnings of the form:
then the dIndex needs to be fixed or replaced by a backup copy to restore server functions.
This message is similar to the result of running the checkIndex.sh script on the dIndex.
- If problems are detected, make a backup copy of the entire dIndex directory to preserve the original data. If there are backup files in the store (named dIndex.backupA, dIndex.backupB, and dIndex.backupC), determine which is the most recently changed, and run checkIndex.sh on each to find if any are valid. If the checkIndex.sh output for one or more shows no sign of corruption, then decide whether the most recent backup is sufficiently current that you can use it to replace the corrupted dIndex. Your alternative is to run the checkIndex.sh command with the -fix option again on the dIndex. This action likely causes information loss in dIndex. In either case, using one of the backupA/B/C copies or using -fix probably means some information has been lost. Depending on the nature of the corruption, using a recent backup might be a better approach, since the dIndex was in a known consistent state at the time the backup was created. However, there is no way to tell what data would be missing since the backup was created. (Any number of accounts might need to be made in sync again.)
After these corrections to dIndex are complete, run the --checkstore --detail DINDEX command again. It should show no corruption, but may show other problems if any older backup copies have been incorporated.
- (Optional) If the --checkstore --detail DINDEX command still shows problems (other than corruption warnings), run the following command on the fixed dIndex.
Decide at the prompt whether you want the command to correct the problems it shows. Currently, you can correct only duplicate records for accounts by using the --sync command.
- (Optional) Run the issadmin.sh --checkstore --detail FULL command on the fixed dIndex to detect whether data has been lost.
If your store contains more than 100000 accounts, the --detail FULL command might take a very long time (hours) to complete, because each individual account index must be checked.
Correct any problems shown in the output before attempting to start services. You might also be able to use the backup copy of dIndex to diagnose what was lost by using the lucli.sh tool, but this process is complicated.
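To help with the step above that asks which dIndex backup was changed most recently, the following sketch compares the files inside dIndex.backupA, dIndex.backupB, and dIndex.backupC. The function name is hypothetical, and the sketch assumes the backup directories sit directly under the store directory you pass in.

```shell
#!/bin/sh
# newest_dindex_backup STORE_DIR
# Hypothetical helper: prints the backup directory that holds the most
# recently modified file. Prints nothing and returns non-zero if no
# backup files are found.
newest_dindex_backup() {
  # ls -t sorts newest first; -d lists entries themselves, not their contents.
  newest_file=$(ls -dt "$1"/dIndex.backup[ABC]/* 2>/dev/null | head -n 1)
  [ -n "$newest_file" ] && dirname "$newest_file"
}

# Example: print the candidate, then run checkIndex.sh on it as described above.
#   newest_dindex_backup /path/to/iss/store
```

File times are compared instead of directory times because a directory's own modification time changes only when entries are added or removed, not when an existing file is rewritten.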
When the data in dIndex is corrupted, the effect might be global or local. In the worst situations, services cannot be started, or basic issadmin.sh commands like --accountinfo, --listbrief, --listaccounts, or --liststats fail to produce expected output or hang.
Corrupted global data can be detected in the header information of the output from many common issadmin.sh commands such as --listbrief. For example, if the number of accounts or groups is reported incorrectly, some critical global data might be corrupted. In this case, the main action you can take for recovery is to recreate (bootstrap) the entire index from Messaging Server store data. This can be a lengthy process, so you might want to take the time to determine if parts of the store can be salvaged.
You can run the issadmin.sh commands whether the Index and Search Service servers are running or stopped. Failures in issadmin.sh commands when the services are not running might indicate if the problem is global or local to only some accounts or groups.
If only some accounts or groups appear to be failing, use the following:
- Disable those accounts using the --setstate I command. If the services start without failure, then the problem might be local and these accounts can be corrected while the rest of the services are active.
- Try using the --deleteaccount or --deletefolder commands on suspicious accounts to see if the services can continue to operate. Sometimes services can be restored after you have removed a few accounts. You can then bootstrap these accounts again.
- Ensure that the rest of the data is intact by using the --checkaccount or --checkfolder commands on every individual account (refer to the --accountlist option).
- If the problems with dIndex appear to be corrected, you can use the --sync command to resolve any other problems in individual accounts.
If some issadmin.sh commands appear to work but dIndex is still corrupted, you can try the following approach to reduce the time that is required to recreate the store:
- Use the --export command to export any accounts that you can.
- Remove the corrupted store and start rebootstrapping.
- Import into the new store any accounts that you can successfully export by using the --import command.
- Use --checkaccount and --sync commands on every account to ensure that the data is current.
If the dIndex data is not corrupted but some accounts appear to be failing, then the data corruption might be localized in a set of account groups. Since each account group contains independent "meta" and "content" index directories, the damage might be limited to specific groups of accounts. You might be able to correct the damage without impacting the rest of the store, as described in the following steps:
- With services stopped, start by searching for write.lock files in the rest of the store with a command like:
- Remove all the write.lock files that you find. Any group directories identified by such files are candidates for further investigation, because they are likely to be corrupted.
- Use the checkIndex.sh command on each of the pair of index directories in each suspect group, using the same procedure as described in the preceding dIndex corruption section. (The last four digits of the group number are used to find the directories in the store; for example, for group 12345, look for directories with the path 23/45/index12345_meta and 23/45/index12345_content.) The individual group directories do not have backup files like the dIndex, so if checkIndex.sh indicates corruption, that index directory must be fixed using the -fix option, which will likely cause data loss. (Sometimes, after running checkIndex.sh -fix, the group index might still not be clean; if this happens, try running checkIndex.sh -fix again. More data will be lost with each such command, but eventually the index should be cleaned. If an index cannot be cleaned by repeated -fix commands, then you must delete all files in that index directory and recreate the group. Refer to the section below about creating a new group for details.) If any of the checkIndex.sh commands indicate that data was lost, accounts in those groups are likely corrupted and need attention.
- Run the issadmin.sh --checkstore --detail STORE command, and check its output for other accounts or groups which may show problems. (If your store contains more than 100000 accounts, the --detail STORE command might take a very long time (hours) to complete.)
- Disable services to each suspect account by using the issadmin.sh --setstate I option. Then use the --accountinfo option, followed by --checkaccount or --checkfolder to diagnose account specific problems.
- If necessary, try to delete any data that appears corrupted by using the --deleteaccount or --deletefolder options, and bootstrap or use the --sync option to try to correct the accounts.
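The first steps of this procedure can be sketched in shell as follows. Both function names are hypothetical. The path rule is taken from the group 12345 example above; this section does not say how group numbers with fewer than four digits are laid out, so the sketch refuses them instead of guessing.

```shell
#!/bin/sh
# find_write_locks STORE_DIR
# Lists every write.lock file under the store (run with services stopped).
find_write_locks() {
  find "$1" -type f -name write.lock
}

# group_index_dirs STORE_DIR GROUP_NUMBER
# Prints the meta and content index directories for a group, using the
# last four digits of the group number (12345 -> 23/45).
group_index_dirs() {
  store=$1
  group=$2
  case $group in
    ''|*[!0-9]*) return 1 ;;          # not a number
  esac
  [ "${#group}" -ge 4 ] || return 1   # shorter numbers: layout not stated
  last4=$(printf '%s\n' "$group" | sed 's/.*\(....\)$/\1/')
  first=$(printf '%s\n' "$last4" | cut -c1-2)
  second=$(printf '%s\n' "$last4" | cut -c3-4)
  printf '%s/%s/%s/index%s_meta\n' "$store" "$first" "$second" "$group"
  printf '%s/%s/%s/index%s_content\n' "$store" "$first" "$second" "$group"
}

# Example:
#   find_write_locks /path/to/iss/store
#   group_index_dirs /path/to/iss/store 12345
```

The directories printed by group_index_dirs are the ones to pass to checkIndex.sh for a suspect group.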
If these attempts fail to correct the accounts, then this might indicate a more global problem with dIndex. The services might be able to run with these specific accounts disabled, but to restore full service to all accounts, you must correct the larger global problem. Running the issadmin.sh --checkstore --detail DINDEX command might help at this point. Some other approaches to consider include the following:
- If you suspect that the problem might be related to a specific account group directory, you might be able to move the accounts in that group into another group by using either the --moveaccount or --export and --import command options. If either the group directories or dIndex is very badly damaged, these commands might also fail, or only transfer part of the account. Always use --checkaccount after moving or importing an account to see if it succeeded. Afterward, the --sync option might be able to correct the account.
- If you are unable to create a new group when trying to move an account, then dIndex is likely corrupted. Depending on how many accounts are configured as allowed per group, you might be able to move the accounts into existing groups, but not into a new group. This can allow the services to be resumed temporarily, but this failure of dIndex can only be corrected by recreating the store again from scratch.
- As a last resort, bootstrapping a folder or account into a different group should restore the data for specific corrupted accounts. Always remember to delete the folder or account from the damaged group directory before attempting to bootstrap again.
Finally, not all data corruption is evident using these techniques, because the structure of the data might be correct while the data values have been modified. Search results might prove inconsistent or wrong. This kind of corruption might be less serious but harder to detect because it will only affect the correctness of individual searches, not the operation of the entire store. You can restore services to the other accounts while fixing the affected folder or account. In this case, you can delete the suspect folders or accounts and bootstrap them while other services are running.
If one or both of the index directories of an account group (the "meta" and "content" directories) cannot be corrected via the checkIndex.sh -fix command, then the only way to correct this group is to replace the contents of the corrupted account group index completely, and all data for all accounts in that group will be lost. To do this, create an empty account group, delete all files in the corrupted index directory, and copy all files from the empty account group into the corrupted directory. The accounts that were in that group can then be managed using the normal issadmin.sh commands, since the knowledge about those accounts still exists in the dIndex, although the account group index will be empty.
A simple way to create an empty account group is to create a nonexistent account in a specific group, and then delete that account. This will create the structure of an empty index directory which can then be copied to correct any number of group index directories that had to be deleted. When copying the empty account group directory, be sure to preserve the owner, group, and access rights of the empty account group directory (as with the "cp -p" command). In order to ensure that the normal default allocation of accounts to groups will not affect the empty group directories, use the --group option with the --createaccount command with a small group number, less than 100. (Group numbers less than 100 are not normally allocated by default, so that they can be used for administration purposes like this without interference from the rest of the services.)
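The delete-and-copy step described above might look like the following sketch. The function name is hypothetical, and it assumes that you pass the matching kind of directory on each side (empty "meta" to corrupted "meta", empty "content" to corrupted "content"). Run it only with services stopped and after preserving a copy of the corrupted directory.

```shell
#!/bin/sh
# reset_group_index EMPTY_INDEX_DIR CORRUPTED_INDEX_DIR
# Hypothetical helper: deletes all files in the corrupted index directory,
# then copies in the files of an empty index directory with "cp -p" so that
# mode and timestamps (and owner and group, when run with enough privilege)
# are preserved.
reset_group_index() {
  empty=$1
  target=$2
  [ -d "$empty" ] && [ -d "$target" ] || return 1
  # Refuse to continue if the empty template has no files to copy.
  [ -n "$(ls -A "$empty")" ] || return 1
  # Remove the files but keep the directory itself and its ownership.
  find "$target" -type f -exec rm -f {} +
  cp -p "$empty"/* "$target"/
}

# Example, with hypothetical paths for an empty group 50 and damaged group 12345:
#   reset_group_index /store/00/50/index50_meta /store/23/45/index12345_meta
```

The example paths are placeholders only; locate the real directories for both groups in your store before running anything like this.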
Because recovery from a disaster can be so costly, you might want to perform some routine maintenance functions regularly to make recovery faster. For dIndex, backup copies of the entire dIndex are automatically made; these are named dIndex.backupA, dIndex.backupB, and dIndex.backupC in the same directory as dIndex. These backup directories are rewritten periodically based on the value of the iss.store.account.optimizeinterval configuration parameter. (The path to the most recent backup appears in the header output for --accountinfo and other options of the issadmin.sh command.) These backups might allow you to recover some functionality if critical data is corrupted.
After the original bootstrap of accounts is complete, much of the global critical information in dIndex stays stable. Therefore, you might be able to temporarily replace the corrupted dIndex by a backup version while you check the rest of the system for failures. Depending how long ago the backup was made, you might be able to restore services in this manner for many users temporarily until you can complete a full rebuild of the store. However, this is not a fully reliable solution for the problem. You might need to perform a full rebootstrap of all accounts, although the system might be able to provide services to some accounts while the rest are being repaired.
You can back up individual accounts by using the --export option, and you can use the snapshot of the account to recover lost data by using the --import command. Then, after you have fixed any corruption, using the --checkaccount and --sync commands to correct the data might be quicker than a full rebootstrap of the account. However, the cost of exporting and importing each account in both time and disk space might be greater than is reasonable for large numbers of accounts.
The following sections contain guidelines for how to respond to specific severe problems that you might encounter. The techniques described here might be helpful in other situations where you need to recover from similar problems.
Running out of disk space can create serious problems:
- If the disk storage system containing the ISS store becomes filled, indexing services will start to fail.
- If the disk containing the log files becomes filled, then further WARNING and SEVERE messages will also be lost, so indications of what is wrong might also be missing. Some index data might be lost, and the index directories might be corrupted.
If you discover the disk has become full or is extremely close to full (98% capacity or more), immediately stop the services to prevent failing service requests from corrupting index data. Then recover space on the full disk until it is at least below 95% capacity. Follow the steps outlined in the Disaster Recovery section to determine if data corruption needs to be corrected before restarting the services.
If you detected the disk full problem and shut down services before any damage to the index occurred, you can restore services while you take steps to reduce disk usage further.
Some actions you can take to reduce the disk space for the store include:
- Placing the log directory on a separate disk
- Locating the export/import snapshot directory on a separate disk
- Removing any defunct or inactive accounts from the ISS store
Using the disk space sizes from the output of the --accountinfo and --listaccounts commands, you can determine whether you should --export any accounts to other ISS store instances based on how much space would thus be recovered.
You might consider investing time to prevent the disk full condition, rather than spending more time and effort in recovering from a badly corrupted store. Monitoring the disk space left, for example with a cron job, would enable you to detect when a problem is imminent and to stop the services before serious damage can occur. If you allow enough margin, such as detecting when the disk is 85% to 90% full, then you can sometimes increase disk space as described previously without shutting down the services. How actively you check for problems like this will depend on how much unused disk space your store usually contains, and how expensive a recovery from the disk full condition would be.
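A disk space check of the kind a cron job could run is sketched below. The function names are hypothetical. The thresholds come from this section (85% as an early warning, 98% as the point at which to stop services), and the parsing assumes the standard "df -P" layout with a file system name that contains no spaces.

```shell
#!/bin/sh
# parse_df_percent
# Reads "df -P" output on standard input and prints the Capacity value of
# the first file system line as a bare integer.
parse_df_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_usage_percent PATH
# Prints the percentage used on the file system that contains PATH.
disk_usage_percent() {
  df -P "$1" | parse_df_percent
}

# classify_usage PERCENT
# Prints ok, warn (85% or more), or critical (98% or more).
classify_usage() {
  if [ "$1" -ge 98 ]; then
    echo critical
  elif [ "$1" -ge 85 ]; then
    echo warn
  else
    echo ok
  fi
}

# Example:
#   classify_usage "$(disk_usage_percent /path/to/iss/store)"
```

A cron job could run this check against both the store directory and the log directory, and notify an administrator whenever the result is not ok.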
A warning message indicating "OutOfMemoryException" (OOME) might appear infrequently in the index log. This message is usually caused by an email that is unusually large or has a very complicated body structure, which causes an indexing service request to fail (typically the bootstrap or --sync commands). When this failure occurs, it might cause the index service process to hang, requiring you to stop and restart the service.
Whether or not the index service process hangs, the OOME message means that the account indicated in the message has suffered a failure, and the corresponding index data might be corrupted. The general steps to begin recovery are as follows:
- Stop any running commands that affect the account. (Refer to the previous section on Interrupting Commands for details.)
- Check the account group meta and content index directories of the damaged account for write.lock files.
The presence of any write.lock files means that an IndexWriter was opened to the account, and data might have been written into the file when the OOME occurred. This IndexWriter must be cleared and any corresponding write.lock files must be deleted before the account can be corrected.
Shutting down the index service will clear the IndexWriter, but services to all accounts will be interrupted for some length of time. To avoid this, you can try the following general approach:
- To clear the specific IndexWriter that was used, delete any write.lock files from the index meta and content directories.
- To avoid losing all the data in the account index, run the --export command.
- Delete the account.
- Recreate the account and use the --import command to restore the account data.
The account will likely be out of sync with the Messaging Store. It might be wise to create the account in a separate new group to avoid future problems.
- To avoid repeating the failure, you must prevent the email that caused the original OOME from being processed. Use the --ignorefolder command and specify the folder containing the offending email. This prevents all emails in that folder from being indexed again.
- Use the --sync or bootstrap commands to recover the remainder of the account data.
Unfortunately, this approach leaves every email in the ignored folder unindexed, including emails unrelated to the failure. If you can move the emails that cause the OOME into a separate, empty folder on the Messaging Store and ignore only that folder, the other emails in the original folder can still be indexed. However, moving emails might not be acceptable to your users.
Note that OOME problems might not be repeatable, since memory use varies significantly depending on what else is happening on the system at the time. After the account is indexed with a folder marked as ignored, you might want to run --unignorefolder on that folder and bootstrap it again when the system is less loaded, in the hope that the OOME does not recur. This approach might not succeed, and if another OOME occurs, you must repeat the same recovery process. Whether it is worth attempting depends on what kind of data, and how much, the ignored folder contains.
Also, note that any write.lock file found in an account index group directory indicates that an IndexWriter is actively writing to an index, and that a single IndexWriter is shared by all accounts in a group index. If only one account is assigned to the group, then you know that the account causing the OOME is the one affected.
However, if you have assigned multiple accounts to a group, this procedure of recovering from OOME failure might cause interaction with those accounts as well. To ensure the integrity of those accounts, use one of the following approaches:
- Disable all accounts in the group (using --setstate I) and back them up with --export before starting to recover. After recovery is complete, you can restore these accounts and use the --sync command to bring them up to date.
- Alternatively, you can use the --moveaccount command to move these accounts into other groups before attempting the recovery.
Always run --checkaccount after moving or restoring accounts to ensure the integrity of the resulting accounts.
Topics in this section:
- Why is my installation not picking up changes that I've made to the jiss.conf file?
- How can I tell if a mailbox and index are in sync?
- Why am I seeing large time differences or negative time for "time between generate and submit to index svc" in the jmqconsumer log?
- If I run reconstruct on the mail store, will event notifications be generated so that ISS remains in sync?
- How can I check the mail store IMQ broker?
- How can I check for problems with indexing emails or attachments, or generating thumbnail images?
- Is there a way to rotate the ISS logs?
- What does a 403 error mean in the Application Server server.log file?
Run the command:
to see if there are changes not incorporated. Then try the command:
and the --checkconfig command again to see if this corrects the problem. This should apply most, but not all, configuration changes. If the changes are still not picked up, make sure that you stop and restart ISS after you make changes to the jiss.conf file, for example:
Use the issadmin.sh command to check for synchronization, for example:
If the mailbox and index are not in sync, run the issadmin.sh command with the --sync parameter to fix:
If the Messaging Server system's time and the ISS system's time are not in sync, the jmqconsumer log will show large timing discrepancies. For example, the time when the jmqconsumer logged a message and the time when the message was actually sent could be significantly off or even negative, for example:
To fix this discrepancy, make sure both the Messaging Server system and the ISS system are running an NTP daemon so that their times are in sync.
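To see why unsynchronized clocks can produce a negative value, consider the following sketch. All values are hypothetical: it assumes that the event really takes 2 seconds to reach ISS and that the ISS clock runs 30 seconds behind the Messaging Server clock.

```python
from datetime import datetime, timedelta


def observed_delay(generated_at, submitted_at):
    """Return the delay in seconds computed from two wall-clock timestamps."""
    return (submitted_at - generated_at).total_seconds()


# Hypothetical values: a true delay of 2 seconds, with the ISS clock
# running 30 seconds behind the Messaging Server clock.
true_delay = timedelta(seconds=2)
iss_clock_offset = timedelta(seconds=-30)

generated_at = datetime(2024, 1, 1, 12, 0, 0)                 # Messaging Server clock
submitted_at = generated_at + true_delay + iss_clock_offset   # ISS clock

print(observed_delay(generated_at, submitted_at))  # prints -28.0
```

Because each timestamp comes from a different host, the computed delay includes the clock offset. Once both hosts are synchronized, the offset term is close to zero and the logged delay reflects the real transit time.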
Event notifications are not generated for changes made by the Messaging Server reconstruct command. You must manually bring the accounts in sync by using one of the following commands:
Run the imqcmd command against the mail store IMQ broker, for example:
Verify that INDEXMS exists and that producers (from Messaging Server) and one consumer (from ISS) are present.
Use the issadmin.sh command, for example:
Problems might occur due to unsupported attachment types, errors in the attachment file (such as unreadable HTML or a faulty JPEG file that cannot be opened), or errors in the structure of the email. Look at the original email message to determine the specific problem.
Some errors could be due to text or plain attachments that are too large. You can increase the default setting of 2 MBytes by changing iss.indexsvc.attachment.sizelimit in the jiss.conf file. For example, the following entry increases the size limit to 25 MBytes: iss.indexsvc.attachment.sizelimit=25000000
If you increase the value of iss.indexsvc.attachment.sizelimit, then also increase the maximum heap size for indexsvc. You do this with the java.args setting in the jiss.conf file, for example:
At present, you cannot rotate logs on demand. Instead, use the following procedure:
Run the tail command to ensure that ISS is running and writing to the new log, for example:
The following is an example excerpt from the Application Server server.log file.
The last line in this example output shows that a 403 HTTP error (forbidden) was returned. This error occurs if the originating IP address (in this example, 192.0.2.0) is not stored in the mail.server.ip setting in the /opt/sun/comms/jiss/etc/jiss.conf file. ISS currently grants access to the RESTful web service by IP address as a substitute for root user access.
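One way to check this is to compare the address from the log against the configured value. The sketch below assumes that jiss.conf uses key=value lines, that lines starting with # are comments, and that mail.server.ip holds one address or a comma-delimited list; the function names and the exact string comparison are illustrative and may differ from how ISS itself evaluates the setting.

```python
def read_setting(conf_text, key):
    """Return the value of `key` from key=value text, or None if it is absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def ip_is_allowed(conf_text, ip):
    """Return True if `ip` appears in the comma-delimited mail.server.ip value."""
    value = read_setting(conf_text, "mail.server.ip")
    if value is None:
        return False
    allowed = {item.strip() for item in value.split(",") if item.strip()}
    return ip in allowed


# Hypothetical configuration text; 192.0.2.10 is an invented second address.
sample = "mail.server.ip=192.0.2.0, 192.0.2.10\n"
print(ip_is_allowed(sample, "192.0.2.0"))
```

To check a real deployment, read the contents of /opt/sun/comms/jiss/etc/jiss.conf into a string and pass it as `conf_text` together with the address shown in server.log.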
If this error is occurring, make sure that you have correctly entered the IP address for the mail.server.ip setting. You can also enter a comma-delimited list of IP addresses when using multiple servers. If you change one or more IP addresses, make sure to stop and start the ISS services, for example: