Killing Defunct/Zombie Processes in Solaris

When a program is run from a shell in Solaris (Unix), the shell starts a new process to carry out the work. Sometimes defunct/zombie processes show up in the process list on the server, as below.

$ ps -ef
     UID   PID  PPID  C    STIME TTY      TIME CMD
    root     0     0  0 14:44:49 ?        0:00 sched
    root     1     0  0 14:44:49 ?        0:00 /etc/init -
    root     2     0  0 14:44:49 ?        0:00 pageout
    root     3     0  0 14:44:49 ?        0:19 fsflush
    root   222     1  0 14:45:33 ?        0:00 /usr/dt/bin/dtlogin
    root   260     1  0 14:45:37 ?        0:00 /usr/lib/saf/sac -t 300
    root   135     1  0 14:45:21 ?        0:00 /usr/sbin/inetd -s
    root   264   260  0 14:45:37 ?        0:00 /usr/lib/saf/ttymon
    root   263   260  0 14:45:37 ?        0:00 /usr/lib/saf/listen tcp
   marco   390   388  0 14:47:43 pts/3    0:00 ksh
    root   388   135  0 14:47:43 ?        0:00 in.telnetd
    root   374   222  0 14:47:25 ?        0:00 /usr/dt/bin/dtlogin
   marco   456   390  0 15:11:37 pts/3    0:00 ./defunct
   marco   457   508  0                   0:00 <defunct>
    root   472   390  0 14:59:34 pts/3    0:00 ps -ef

For example, a C program creates a child process with the fork() function. When fork() is called, the operating system makes an almost identical copy of the original process. fork() then returns twice: in the original (parent) process it returns the process ID of the new child, and in the child process it returns 0.
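
As a rough illustration of those two return values, here is a minimal sketch in Python, whose os.fork() is a thin wrapper around the same system call. The function name fork_and_wait is made up for this example, and it assumes a Unix-like system.

```python
import os


def fork_and_wait(exit_code=7):
    """Fork a child that exits at once; return (child_pid, child_exit_code)."""
    pid = os.fork()
    if pid == 0:
        # Child process: fork() returned 0 here.
        os._exit(exit_code)
    # Parent process: fork() returned the child's PID.
    # waitpid() collects the exit status, so no zombie is left behind.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

Calling fork_and_wait(7) in the parent gives back the child's PID together with the exit code 7. If the waitpid() call were left out and the parent kept running, the child would stay in the process list as a defunct entry.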

A defunct, or zombie, process is just that: a process information block that is waiting for the parent process to clean it up. While a defunct process does not take up any CPU time, RAM or I/O, it does consume a process information block, which is a limited resource. Ideally a program cleans up after all of its child processes.

To clean up a zombie process, the parent simply inquires about the state of the child, for example with wait() or waitpid(). Once the operating system sees the parent do so, it removes the zombie from the process list. Defunct processes are caused by a parent process not reaping its children. Find out which process is the parent of all those zombies (the PPID column of ps -ef). That is usually the process in which the problem lies.

When a child process terminates, the signal SIGCHLD is sent to the parent process. So a program can clean up after its child processes by installing a signal handler for SIGCHLD that reaps them.
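
A handler of that kind could look roughly like the sketch below. It is written in Python for brevity; the names reaped, reap_children and spawn_and_reap are invented for this example, and it assumes a Unix-like system and that the code runs in the main thread (Python only delivers signals there).

```python
import os
import signal
import time

reaped = []  # (pid, exit_code) pairs recorded by the handler


def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    # Loop, because several children may exit while only one signal arrives.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # this process has no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        reaped.append((pid, code))


def spawn_and_reap(exit_code=3, timeout=5.0):
    """Fork a short-lived child and wait until the handler has reaped it."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)  # child: exit straight away
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if any(p == pid for p, _ in reaped):
                break
            time.sleep(0.01)
        return pid
    finally:
        signal.signal(signal.SIGCHLD, previous)
```

After pid = spawn_and_reap(3), the list reaped contains (pid, 3) and the child never lingers as a zombie. The non-blocking waitpid(-1, WNOHANG) call is what keeps the handler from stalling the parent when other children are still running.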

If a large number of defunct/zombie processes accumulate, system performance degrades, it may become virtually impossible to start a new login session (including via telnet or SSH), and eventually the system hangs. A zombie is already dead, so it cannot be killed with kill; on Solaris, the preap command forces it to be reaped instead. To get rid of these defunct/zombie processes, you can simply issue the command below as root.

preap `ps -ef | grep defunct | grep -v grep | awk '{print $2}'`
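
Before reaping anything, it can help to see which parent each zombie belongs to. The following is a small sketch, in Python, that pulls the PID and PPID of every <defunct> line out of saved ps -ef output. The function name find_defunct is made up for this example, and it assumes the column order shown in the listing above (UID, PID, PPID, ...).

```python
def find_defunct(ps_output):
    """Return a (pid, ppid) pair for every <defunct> line of `ps -ef` output."""
    zombies = []
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        fields = line.split()
        # UID PID PPID ...: PID and PPID are the 2nd and 3rd columns.
        zombies.append((int(fields[1]), int(fields[2])))
    return zombies
```

Each PPID in the result points at a parent that is not reaping its children, which is the process worth investigating.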



Ellucian Identity Service (Part 3) – Troubleshooting

While Ellucian Identity Service is starting, you might see various errors in $EIS_HOME/repository/logs/wso2carbon.log.

Error 1: “Caused by: java.sql.SQLException: ORA-00942: table or view does not exist”


The workaround is to create the missing tables manually.

This can be done by logging in to the database as eis_admin and running the three oracle.sql scripts located in:

$EIS_HOME/dbscripts/oracle.sql
$EIS_HOME/dbscripts/identity/oracle.sql
$EIS_HOME/dbscripts/identity/application-mgt/oracle.sql

Error 2: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)


Workaround:

At the time of writing, Ellucian Identity Service 1.1 does not support TLS yet, so SSL needs to be enabled.

1. Shut down Ellucian Identity Service, if you have already started it.
2. From the $EIS_HOME/config directory, run the appropriate command(s) for your operating system:

   Disable TLS, Enable SSLv3
   * Linux/Unix :  ANT_HOME=../apache-ant
                   export ANT_HOME
                   ../apache-ant/bin/ant -f tls_config.xml disable-tls
   * Windows    :  ..\apache-ant\bin\ant -f tls_config.xml disable-tls

-bash-4.2$ ANT_HOME=../apache-ant; export ANT_HOME
-bash-4.2$ echo $ANT_HOME
../apache-ant
-bash-4.2$ . $ANT_HOME/bin/ant -f tls_config.xml disable-tls
   Buildfile: /ellucian-EIS/EllucianIdentityService/config/tls_config.xml
     [echo]
     [echo] Configuring EIS at /ellucian-EIS/EllucianIdentityService/config/../

   disable-tls:
     [echo] Added TLS settings in repository/conf/tomcat/catalina-server.xml

   BUILD SUCCESSFUL
   Total time: 0 seconds

Error 3: LDAP related

  • ERROR {org.apache.directory.server.ldap.LdapServer} – ERR_171 Failed to bind an LDAP service (10,389) to the service registry
  • ERROR {org.wso2.carbon.apacheds.impl.ApacheLDAPServer} – Error starting LDAP server. {org.wso2.carbon.apacheds.impl.ApacheLDAPServer}
  • ERROR {org.wso2.carbon.ldap.server.DirectoryActivator} – Could not start the embedded-ldap.
  • ERROR {org.wso2.carbon.user.core.internal.Activator} – Cannot start User Manager Core bundle
  • ERROR {org.wso2.carbon.core.services.authentication.AuthenticationAdmin} – System error while Authenticating/Authorizing User : org.wso2.carbon.user.core.UserStoreException: User does not exist.
  • ERROR {org.wso2.carbon.user.core.ldap.LDAPConnectionContext} – Error obtaining connection.Trying again to get connection… {org.wso2.carbon.user.core.ldap.LDAPConnectionContext}

The workaround is to make sure the LDAP settings in eis_config.properties are correct, so that EIS can connect to the LDAP server.
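
One quick sanity check is to confirm that the host and port in the LDAP connection URL are reachable at all from the EIS server. Below is a minimal sketch in Python. The function names are made up for this example, and it assumes the eis.userstore.ConnectionURL value has the usual ldap://host:port or ldaps://host:port form.

```python
import socket
from urllib.parse import urlparse

DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def ldap_endpoint(connection_url):
    """Split an ldap:// or ldaps:// URL into (host, port)."""
    parsed = urlparse(connection_url)
    if parsed.scheme not in DEFAULT_PORTS or not parsed.hostname:
        raise ValueError("not an LDAP URL: %r" % connection_url)
    # Fall back to the standard port when the URL does not name one.
    return parsed.hostname, parsed.port or DEFAULT_PORTS[parsed.scheme]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_connect(*ldap_endpoint("ldap://ldap.example.edu:389")) only tells you whether the port is open (the hostname here is a placeholder). It does not verify the bind account or the search base, which still have to be checked in eis_config.properties.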

Error 4: ORA-28040: No matching authentication protocol


Workaround:

Add the following lines to the sqlnet.ora file of the database.

SQLNET.ALLOWED_LOGON_VERSION_CLIENT=8
SQLNET.ALLOWED_LOGON_VERSION_SERVER=8

For more details, refer to “12c: ORA-28040 After Upgrade: No Matching Authentication Protocol (Doc ID 1957995.1)”



Ellucian Identity Service (Part 2) – Configuration

Once Ellucian Identity Service is installed, the following steps need to be completed: database setup, LDAP configuration and keystore setup. They are described in the official Ellucian guide, excerpted below, and all settings are defined in the file $EIS_HOME/config/eis_config.properties.

       

Database Setup (Oracle)

1. Create a user schema, for example eis_admin, with the following privileges:

create user eis_admin identified by PASSWORD
account unlock default tablespace USERS quota 500M on USERS;

grant create session, create sequence,
create table, create trigger to eis_admin;

2. Copy one of the Oracle JDBC libraries (for example, ojdbc5.jar) to the $EIS_HOME/repository/components/lib directory.

3. Remove any JDBC jar files from the following directory:

EIS_HOME/repository/components/dropins

4. Edit the EIS_HOME/config/eis_config.properties file.

5. Set the following properties to the corresponding values for your Oracle database server:

#
# Data Source database types: oracle, sqlserver, postgresql, mysql, h2
#
eis.database.type=oracle
eis.database.host=oracle database server hostname or IP address
eis.database.port=oracle database server port number
eis.database.name=EIS database name
#
# Data Source user/schema
#
eis.database.username=eis_admin
eis.database.password=password
#
# Data source DB connection tuning
#
eis.database.timeout=60000
eis.database.test.connection=true
eis.database.test.interval=30000


6. Save the changes to the eis_config.properties file.
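
It is easy to leave one of the template placeholders in place. As a rough aid, the sketch below (in Python) checks the database section of the file for empty values and for a non-numeric port. The function names are invented for this example, and the parser only handles plain key=value lines and # comments, not the full Java properties syntax.

```python
REQUIRED_DB_KEYS = [
    "eis.database.type",
    "eis.database.host",
    "eis.database.port",
    "eis.database.name",
    "eis.database.username",
    "eis.database.password",
]


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def database_problems(props):
    """Return a list of human-readable problems with the database settings."""
    problems = ["missing or empty: " + key
                for key in REQUIRED_DB_KEYS if not props.get(key)]
    port = props.get("eis.database.port", "")
    if port and not port.isdigit():
        # Catches an unedited placeholder left in the port field.
        problems.append("eis.database.port is not a number: " + port)
    return problems
```

Reading the file with open(...).read(), passing the text to load_properties() and then calling database_problems() on the result gives an empty list when all six settings are filled in and the port is numeric.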

Note: It is recommended that the Ellucian Identity Service database be a standalone database.

LDAP Configuration

# User Store type
# Select a type from one of the following values:
#   ActiveDirectory               : Active Directory
#   ExternalReadOnlyLDAP          : External Read-Only LDAP
#   ExternalReadWriteLDAP         : External Read-Write LDAP
#   InternalReadWriteLDAP         : Internal Read-Write LDAP
#   EmbeddedReadWriteApacheDsLDAP : Embedded ReadWrite ApacheDS LDAP
#   InternalJDBC                  : Internal JDBC
#   Cassandra                     : Use Cassandra database
#
eis.userstore.type=ExternalReadOnlyLDAP
# User Store settings for Active Directory and LDAP
#
eis.userstore.ConnectionURL=user store connection URL
eis.userstore.ConnectionName=user store connection access
eis.userstore.ConnectionPassword=user store connection password
eis.userstore.UserSearchBase=user store user search pattern
eis.userstore.UserNameAttribute=user store username attribute
eis.userstore.GroupSearchBase=user store group search pattern
eis.userstore.SharedGroupSearchBase=user store group search pattern
#
# Active Directory specific settings
#
eis.userstore.defaultRealmName=active directory realm name
eis.userstore.userAccountControl=active directory account control
eis.userstore.JavaNamingLdapAttributesBinary=name of user store attribute that may contain binary characters

Keystore

  • Create the Server Keystore

After creating the new keystore, the EIS configuration must point to the new file.

1. Edit the EIS_HOME/config/eis_config.properties file.

2. Set the following properties to the corresponding values for the EIS keystore:

# Keystore settings
#
eis.keystore.location=${carbon.home}/repository/resources/security/wso2carbon.jks
eis.keystore.type=JKS
eis.keystore.password=keystore password
eis.keystore.private.key.alias=alias of the EIS private key
eis.keystore.private.key.password=password of the EIS private key

3. Save the changes to the EIS_HOME/config/eis_config.properties file.

  • Export the Public Server Certificate

The public certificate for the EIS server must be imported to the client keystore using the
same alias from the server keystore. The following command exports the public certificate
from the server keystore:

keytool -export -alias <use the same certificate alias> -keystore <server keystore name>.jks -file <public certificate name>.pem

  • Import a Public Certificate into the Client Keystore

EIS uses a separate client keystore for backend and inter-system communication. A public certificate can be imported into EIS_HOME/repository/resources/security/clienttruststore.jks using the following command:

keytool -import -alias <use the same certificate alias> -file <public certificate name>.pem -keystore clienttruststore.jks -storepass wso2carbon

  • Import a Public Certificate into the Server Keystore

The server keystore must contain public certificates for service providers to decrypt SAML and validate digital signatures. A public certificate can be imported into the server keystore using the following command:

keytool -import -alias <certificate alias> -file <public certificate name>.pem -keystore <server keystore>.jks

These are the general EIS settings for the database, LDAP, and keystore.

Starting Ellucian Identity Service for the first time

To apply the bootstrap configuration and start Ellucian Identity Service in the foreground with initial database population, you must follow these steps:

1. Save changes to EIS_HOME/config/eis_config.properties file and close the file if it is still open from configuration.

2. From the EllucianIdentityService/config directory, run the appropriate command(s) for your operating system:

Linux/Unix: ANT_HOME=../apache-ant/
export ANT_HOME
../apache-ant/bin/ant config-all-xml
Windows: ..\apache-ant\bin\ant config-all-xml

The config-all-xml command takes the property settings in the eis_config.properties file and applies them to numerous XML configuration files in the EIS_HOME/repository/conf directory and its sub-directories. You can make changes to the property settings and run the config-all-xml command as many times as needed.

Note: If any manual edits are made to XML files within the conf directory hierarchy, running the ant configuration command risks losing the manual edits. Either check all manual edits after using the ant configuration command, or refrain from using the property settings and ant configuration command after making any manual edits.

3. In a command prompt, change to the EIS_HOME/bin directory.

4. Start Ellucian Identity Service for the first time using the appropriate command for your operating system:

Linux/Unix: sh wso2server.sh -Dsetup
Windows: wso2server.bat -Dsetup

Note: the -Dsetup parameter is only used when you start the EIS server for the first time; it creates about 80 objects under the eis_admin schema in the EIS database.

Alternatively, you can run “sh wso2server.sh start” and “sh wso2server.sh stop” to run EIS in background mode on Linux. That requires the database objects to have been created manually beforehand.

For further instructions on running EIS as a Windows service or a background Unix/Linux process, refer to the following link:
https://docs.wso2.com/display/IS500/Running+the+Product



Ellucian Identity Service (Part 1) – Installation

To implement Single Sign-On (SSO) in a new Ellucian XE environment, Ellucian Identity Service (EIS) must be installed and properly configured with LDAP.

Before installing Ellucian Identity Service, it is recommended to install Java JDK 7 update 79. Later versions (Java 7 update 80 and Java 8) do not work with Ellucian Identity Service at the time of writing.

My earlier post “Installing Java JDK and Tomcat server on Linux” is a good reference for the Java installation.

1. Installing Ellucian Identity Service (EIS)

The base version of Ellucian Identity Service (EIS) is 1.0.0. After downloading the file “EllucianIdentityService_100.zip”, simply copy it to the installation folder and extract it as below.

-bash-4.2$ unzip EllucianIdentityService_100.zip

Once it’s done, a new folder “EllucianIdentityService” is created.

That’s it. Ellucian Identity Service (EIS) 1.0.0 is installed. Before running it, it’s recommended to upgrade or patch it to higher versions.

At the time of writing, the highest version of EIS is 1.1.4. The path to get there is:

  • upgrade it to 1.1.0 (EllucianIdentityService_upgrade-1.1.0.zip)
  • patch it to 1.1.1 (EllucianIdentityService_patch-1.1.1.zip)
  • patch it to 1.1.2 (EllucianIdentityService_patch-1.1.2.zip)
  • patch it to 1.1.3 (EllucianIdentityService_patch-1.1.3.zip)
  • patch it to 1.1.4 (EllucianIdentityService_patch-1.1.4.zip)
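
Since the packages have to be applied in this order, the remaining steps for a given installation can be worked out mechanically. The sketch below, in Python, does that from the list above. The names UPGRADE_PATH and packages_to_apply are made up for this example; only the version numbers and file names come from the list.

```python
UPGRADE_PATH = [
    ("1.1.0", "EllucianIdentityService_upgrade-1.1.0.zip"),
    ("1.1.1", "EllucianIdentityService_patch-1.1.1.zip"),
    ("1.1.2", "EllucianIdentityService_patch-1.1.2.zip"),
    ("1.1.3", "EllucianIdentityService_patch-1.1.3.zip"),
    ("1.1.4", "EllucianIdentityService_patch-1.1.4.zip"),
]


def packages_to_apply(current, target="1.1.4"):
    """List, in order, the packages that take EIS from current to target."""
    versions = ["1.0.0"] + [version for version, _ in UPGRADE_PATH]
    if current not in versions or target not in versions:
        raise ValueError("unknown EIS version")
    # UPGRADE_PATH[i] moves the installation from versions[i] to versions[i + 1].
    start, end = versions.index(current), versions.index(target)
    return [package for _, package in UPGRADE_PATH[start:end]]
```

For an installation already at 1.1.2, packages_to_apply("1.1.2") returns the 1.1.3 and 1.1.4 patch files, in that order.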

Applying these packages is not difficult either, but there is one tricky point. Let’s take patch 1.1.2 as an example.

The readme.txt gives the following steps to apply patch 1.1.2.

1. Extract the update ZIP file to the Ellucian Identity Service server and copy the
EllucianIdentityService_patch-1.1.2 directory to the same location where the
EllucianIdentityService directory is installed.

The directory structure, from the parent directory will look like the following:
/----
|--EllucianIdentityService
|--EllucianIdentityService_patch-1.1.2

2. In a command prompt, change directory to the 'EllucianIdentityService_patch-1.1.2' directory and run the appropriate command for your operating system:

* Linux/Unix :  ANT_HOME=../EllucianIdentityService/apache-ant
export ANT_HOME
../EllucianIdentityService/apache-ant/bin/ant 

Following these instructions, I got a failure when running the last command. It returned:

-bash-4.2$ ANT_HOME=../EllucianIdentityService/apache-ant/; export ANT_HOME
-bash-4.2$ ../EllucianIdentityService/apache-ant/bin/ant
-bash: ../EllucianIdentityService/apache-ant/bin/ant: Permission denied

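My workaround was to run the ant script through the shell's “.” (source) command, which reads the script into the current shell and does not need execute permission on the file.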

-bash-4.2$ . $ANT_HOME/bin/ant
Buildfile: /ellucian-EIS/EllucianIdentityService_patch-1.1.2/build.xml
[echo]
[echo] Configuring EIS at /ellucian-EIS/EllucianIdentityService_patch-1.1.2/../EllucianIdentityService

check-prereqs:
[echo] Checking Prerequisites

copy-artifacts:
[echo] Copying artifacts
[copy] Copying 57 files to /ellucian-EIS/EllucianIdentityService

update-webxml:
[echo] Updated: authenticationendpoint/WEB-INF/web.xml

apply-patches:
[echo] Completed.

BUILD SUCCESSFUL
Total time: 0 seconds

2. EIS 1.1.x and TLS Compatibility

“Ellucian Identity Service 1.1.0 included TLS enhancements that were recommended by WSO2 and Ellucian’s own security testing tools. These enhancements disable certain SSL-related vulnerabilities, such as a Poodle Attack, and focus on TLS using Java 7.

Due to these enhancements, applications that require SSLv3 or older cipher suites may not be able to open secure communication channels directly with EIS.

For Ellucian applications running on WebLogic or Tomcat, the application server must be configured to use TLS and Java 7. For more information about TLS configuration used by Ellucian Identity Service, see the section “Transport Layer Security Enhancements for Tomcat” in the Setting Up Ellucian Identity Service 1.1 guide.

As a temporary workaround, EIS 1.1.1 provides a new configuration script that lets you switch between the SSLv3 settings that were included prior to EIS 1.1.0 and the TLS settings delivered with EIS 1.1.0. This workaround will allow you to temporarily disable the TLS enhancements and test protocols, such as CAS and SAML, until your integrating applications can communicate using TLS.” (EIS Patch 1.1.1 readme.txt)

In my experience, it is better not to enable TLS until you are comfortable that all integrations work over TLS. Simply enabling TLS can cause communication issues, such as the EIS management console failing to load properly.

3. Running EIS as a background process

EIS can be run as a background process as below.

  • Start Server:   $EIS_HOME/bin/wso2server.sh start
  • Stop Server:    $EIS_HOME/bin/wso2server.sh stop

If you experience the following error, you might need to apply the workaround below.

Error:    -bash-4.2$ $EIS_HOME/bin/wso2server.sh start

                 -bash: ps: write error: Bad file descriptor

Workaround: 

This happens because of a redirection error in wso2server.sh. Line 177 of wso2server.sh reads:

if ps -p $PID >&- ; then

The redirection to “&-” closes standard output, but this usage is more or less obsolete on newer operating systems, as mentioned at [1].

So I changed that line to the following:

if ps -p $PID > /dev/null ; then

Once that is done, the following environment variables need to be added to /etc/profile.

export JAVA_HOME=/usr/java/jdk1.7.0_79
export CLASSPATH=/usr/java/jdk1.7.0_79/lib
export PATH=$JAVA_HOME/bin:$PATH

export EIS_HOME=/ellucian-eis/EllucianIdentityService

 



SQL Repair Advisor: Further Step to Optimize SQL Performance

A few days ago, a user reported an issue with a regularly running report. It did not generate any output and instead raised an ORA-04031 error.

Upon seeing ORA-04031, my first thought was that the large pool in the SGA was too small. For specific reasons, we use Automatic Shared Memory Management (ASMM) instead of Automatic Memory Management (AMM). After reviewing the init parameter LARGE_POOL_SIZE, I simply increased it to the size the Memory Advisor suggested. Under ASMM, that value becomes the minimum size of the large pool. However, the user kept getting the error.

Further investigation through Oracle Enterprise Manager (OEM) showed that the running process was holding the top 10 sessions. OEM displayed them in pink, which shows that this was not regular activity like the others.

Analyzing that session made it obvious that a “TABLE ACCESS FULL” on one specific table, which has over 7 million records, was the root cause of the performance degradation.

At this point, generating an optimal execution plan should have been straightforward. However, OEM simply reported that the existing plan was already the optimal one based on current statistics. Re-collecting statistics on this specific table did not help either.

I then went a step further and used SQL Repair Advisor as a workaround. Once the SQL Profile was generated, the user could run the report with the expected result.

SQL Profiles are rather more mysterious than SQL Plan Baselines, as few details about them are disclosed. For a better understanding of SQL Profiles, the following two are good resources.

1. What is the difference between SQL Profiles and SQL Plan Baselines? pdf

2. What is the difference between SQL Profile and SPM baseline? pdf