Post Login Script in DB2

One of less known but very useful feature of DB2 is the post login script after an application makes a connection.

DB2 database has a configuration parameter known as CONNECT_PROC which is name of the database procedure which will be executed everytime someone connects to the database. This is equivalent to a login script but generalized at the database level. Purist may doubt this but this is a very useful feature.

For example:

Automatic Backup when USB disk is plugged in

Management of my shoe-closet computer center requires that I should take some steps to protect the data by taking a backup. In last 10 years, I never took any backup and luckily I was never in the situation to have lost the data. The servers kept on running and runinng without any issue.

The only protection I had was RAID-1 using Linux mdadm commands but even  though disks never failed albeit I changed the disks every 4 years in anticipation that they might fail one day.

cups-1.6.3 RPMs for SLES 11 SP2

Usually enterprise distributions like SLES or Redhat are slow to adopt latest in consumer products such as CUPS used for printing. The version of cups on SLES 11 SP2 is 1.3 when the latest version at the time of writing this is 1.6.2. The older version 1.2 uses older version of IPP protocol and newer client machines such as Mac or Windows use newer IPP V2.0 with a result that cups does not work well when using newer mac etc.

Security for the Paranoid

In today's world, due to some dishonest populace who are like parasites living on somebody's else blood, implemenation of security comes at a heavy price and it has more problems than solutions. It is also a great business model for the companies to exploit the fear mongering mentality created by hackers and dishonest people who like to steal things or create havoc in your computer.

So, I tend to build security like a paranoid does but lately I have come to a realization that these all security implementations are nuisance rather than solutions.

Few examples:

DB2 Backup - DB2_BCKP_PAGE_VERIFICATION

When taking a DB2 backup, if you get the following error:

$ db2 backup database <dbname> to /backup compress
SQL1224N  The database manager is not able to accept new requests,
has terminated all requests in progress, or has terminated the specified request
because of an error or a forced interrupt.  SQLSTATE=00000

Try skipping the check on pages by using the following registry variable.

db2set DB2_BCKP_PAGE_VERIFICATION=OFF
db2stop
db2start
<retry the backup>

 

DB2 Database Backup Pending State

When database is converted from circular log to archive log, it will force the backup of the database which is a good thing.

But, sometime, we do not want to do it. Say, when we are in test environment like testing upgrade from DB2 ESE to DB2 pureScale.

If database is small, backup database <db> to /dev/null works but when the database is large, it still takes a lot of time to complete the backup even on null device.

DB2 pureScale - Role of netmon.cf

Best practice or things that appear to work well in a two nodes cluser or a single node (just with one DB2 and a CF).

The cthats subsystem handles heartbeating between nodes. When in steady state, there is a ring formed where each member is sending heartbeats to their downstream neighbor, and monitoring for heart beats arriving from their upstream neighbor (a higher IP address is your upstream neighbor). But, in the case with only two nodes, both members of the heartbeat ring are sending to and receiving from the same neighbor and RSCT is not really aware of that fact.

Commands to use:

# lsrsrc -Ab IBM.NetworkInterface Name OpState --> OpState 2 is a problem
# lssrc -ls cthats

If you have only 2 node pureScale cluster and there are not that many network adapters (My case of VMware pureScale where I only have one adapter), it is important that we have an IP address defined in /var/ct/cfg/netmon.cf so that RSCT can use that for the heart beat purpose. For a two node cluster, when only one node is alive and thus has nobody else to heartbeat with and in that case the IP address defined in /var/ct/cfg/netmon.cf  comes handy. Generally, we use the switch IP address. Please make sure that you update this file to reflect the new IP address should the switch IP address change in future. No need to restart the cluster as this file is picked up automatically when you modify it.  

Another example of the importance of netmon.cf is only one node DB2 pureScale cluster in which one host has both CF and a DB2 member.

After install of DB2 pureScale on one node (CF and a Member colocated), you may notice that the CF was not able to start. The DB2 installer shows the following DB2 SQL error.

ERROR: 04/22/2013 14:51:52   128   0   SQL1677N  DB2START or DB2STOP processing failed due to a DB2 cluster services error.
04/22/2013 14:51:53     0   0   SQL1685N  An error was encountered during DB2START processing of DB2 member with identifier "0" because the database manager failed to start one or more CFs.
SQL1677N  DB2START or DB2STOP processing failed due to a DB2 cluster services error.
ERROR: An error occurred while trying to start the "db2inst1" instance. The
return code is "55" and the SQL Message is:   "PROCESS_ERROR"

The db2diag.log for the CF (128) may show the following error.

2013-04-22-14.51.51.275440-420 I3963E448             LEVEL: Severe
PID     : 52505                TID : 47845314865664  PROC : ca-wdog 128 [db2inst1]
INSTANCE: db2inst1             NODE : 128
HOSTNAME: purescale1.chq.ei
FUNCTION: DB2 UDB, high avail services, rocmCAWatchDog, probe:2331
MESSAGE : ZRC=0x80050801=-2147153919=SQLE_RC_ADAPTER_NOT_FOUND
          "Adapter not found"
DATA #1 : String, 40 bytes
netmon.cf validation fails on this host.

2013-04-22-14.51.52.087100-420 I4412E454             LEVEL: Severe
PID     : 52416                TID : 47918334802432  PROC : db2rocme 128 [db2inst1]
INSTANCE: db2inst1             NODE : 128
HOSTNAME: purescale1.chq.ei
FUNCTION: DB2 UDB, high avail services, rocmCAStart, probe:3087
MESSAGE : ZRC=0x80050801=-2147153919=SQLE_RC_ADAPTER_NOT_FOUND
          "Adapter not found"
DATA #1 : String, 48 bytes
Error starting up ca-server. Attempting cleanup.

The above is all due to the heart-beating failure since this is a single node and there is no ring formed for the RSCT to do the heart-beating for the cluster with other members. The IP addresses defined in the /var/ct/cfg/netmon.cf are used for the heart-beating purposes when there are not enough hosts or members to do the heartbeat.

Follow these simple to use rules to add entries in your netmon.cf file.

  1. Find out the IP address of the 10GbE or Infiniband Switch. If two switches are used, find out both IP addresses. Generally command show interface ip may work. It all depends upon the switch manufacturer.
  2. Use /sbin/route command to find the subnet used by the 10GbE or InfiniBand adpaters (like eth8 or eth9) and for each interface, add the IP address of the switch to the netmon.cf file. 

The format of the netmon.cf file is:

!REQD <adapter> <ip>

For example: If the IP address of the switch is 192.168.100.1 and the IP subnet for 10GbE adpater (eth8) is 192.168.100.0. So, you should add entry

!REQD eth8 192.168.100.1

OPM for DB2 pureScale and HADR if primary is down

For example, we have a two (or more) nodes DB2 cluster having a shared database and we want to configure Optim Perfromance Manager (OPM) for the database monitoring. The questions comes - which node should OPM connect to and what if that node is down for any reason. Is there a way that OPM could connect to another node automatically?

The answer is YES. This is how you do it.

Step-1: Register your DB2 pureScale database in OPM and use the IP address or host name of the first DB2 member.

Step-2: In same registration database form, specify second database member information in additional JDBC properties as shown in the picture below.

The additional JDBC properties defined are :

  1. clientRerouteAlternateServerName
  2. clientRerouteAlternatePortNumber
  3. maxRetriesForClientReroute
  4. retryIntervalForClientReroute

But, we have a catch here: The DB2 table which holds this additional JDBC property is only 255 bytes long and if we specify all above properties, it may not fit into the field so we have to increase the size of the field. To do this:

Go to your PERFDB database and change the length of the column to 1024.

db2 connect to perfdb
db2 "ALTER TABLE IBMPDQ.PROFILE_PROPS ALTER COLUMN PROPERTY_VALUE SET DATA TYPE VARCHAR(1024)"
db2 connect reset

After above is done, the additional JDBC properties can be defined as:

clientRerouteAlternateServerName=10.77.19.13,10.77.19.12;
clientRerouteAlternatePortNumber=50001,50001;
maxRetriesForClientReroute=1;
retryIntervalForClientReroute=5

Please make sure that all 4 field definitions are in the same line and substitute IP addresses and port number accordingly. You can use host names also if DNS is set properly. And, you can add as many hosts as there are DB2 members.

Once above is done, you will have the following behavior.

If connection failure occurs for the first DB2 member, JCC driver will wait for 5 seconds and attempt to connect to the first IP address given in the list and it will try only for the 1st attempt since we have defined maxRetriesForClientReroute=1 and retryIntervalForClientReroute=5. You can tweak these as per your requirements.

By using above, OPM can connect to the second or other members if the connection to the member it connected to fails. This way, you can have OPM connected to the DB2 pureScale regardless the availability of members.

DB2 pureScale - Switch Role of CF

In DB2 pureScale, when you run the command db2instance -list, it shows the DB2 pureScale topology and which member is up and running and which CF is in PRIMARY mode and which one is in PEER mode.

If you want to change the role of CF from PRIMARY to PEER or from PEER to PRIMARY, you have two ways to do this.

DB2 way

Run db2instance -list command and find out the CF host which is in PRIMARY mode. You need to find out the node number which will be either 128 or 129.

JBoss and DB2 pureScale

Work Load Balancing

In order to utilize DB2 automatic Work Load Balancing for JBoss application, some changes need to be made in the JBoss JDBC driver manager configuration.

Myth - Most people think that work load balancing is done by the DB2 pureScale and it is not true. The work load balancing is done by the DB2 drivers. DB2 pureScale does only one thing - It returns a list of DB2 members available and how busy they are and then DB2 JCC driver does the workload balancing based upon this information. DB2 pureScale member once it receives a work (S/I/U/D), it has no way of rerouting that work to another DB2 member so the real work load balancing is done by the drivers and not by DB2 pureScale or even DB2.

Locate the XML file that describes the driver properties and add connection level properties so that JDBC driver does the Work Load Balancing.

Here is an example:

<?xml version="1.0" encoding="UTF-8"?>
<datasources>
    <local-tx-datasource>
      <jndi-name>DB2Prod</jndi-name>
        <driver-class>com.ibm.db2.jcc.DB2Driver</driver-class>
        <connection-url>jdbc:db2://10.10.10.10:53000/PSDB</connection-url>
        <!-- Set Current Schema -->
        <connection-property name="currentSchema">PS</connection-property>
        <!-- JDBC Tracing - Comment when not needed. Can generate huge files -->
        <connection-property name="traceDirectory">/tmp</connection-property>
        <connection-property name="traceFile">jcc.trace</connection-property>
        <connection-property name="traceLevel">-1</connection-property>
        <!-- Workload Balancing Parameters -->
        <connection-property name="enableSysplexWLB">true</connection-property>
        <!-- Client Identification Parameters -->
        <connection-property name="clientProgramName">MyPureScaleApp</connection-property>
        <connection-property name="clientWorkstation">Serv1</connection-property>
        <!-- Automatic Client Reroute Parameters -->
        <connection-property name="clientRerouteAlternateServerName">
              10.10.10.11,10.10.10.12,10.10.10.13
        </connection-property>
        <connection-property name="clientRerouteAlternatePortNumber">
              53000,53000,53000
        </connection-property>
        <connection-property name="maxRetriesForClientReroute">1</connection-property>
        <connection-property name="retryIntervalForClientReroute">1</connection-property>
        <user-name>PS</user-name>
        <password>password</password>
        <idle-timeout-minutes>15</idle-timeout-minutes>
        <min-pool-size>10</min-pool-size>
        <max-pool-size>200</max-pool-size>
        <!-- sql to call on an existing pooled connection when it is obtained from pool -->
        <new-connection-sql>select 1 from sysibm.sysdummy1</new-connection-sql>
        <!-- <security-domain>EncryptedSuperPnrDbRealm</security-domain> -->
     <metadata>
        <type-mapping>DB2</type-mapping>
     </metadata>
  </local-tx-datasource>
</datasources>

Excel Microsecond Timestamp Difference

DB2 timestamp can show micro seconds and I was trying to generate an excel report showing elapsed time between 2 DB2 timestamps and had to work a little bit to write this excel formulae.

Timestamp (Cell B3) = 2013-03-04-13.38.26.677483

Timestamp (Cell C3) = 2013-03-04-13.40.49.790675

Elapsed Time (Cell D3) =((LEFT(C3,10)-LEFT(B3,10) + TIME(MID(C3,12,2),MID(C3,15,2),MID(C3,18,2)) - TIME(MID(B3,12,2),MID(B3,15,2),MID(B3,18,2))) *24*60*60 + (RIGHT(C3,6)-RIGHT(B3,6)) / 1000000)

uDAPL Version

On AIX, how to find out the uDAPL version?

# lslpp -la "*udapl*"
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  udapl.rte                 7.1.1.15  COMMITTED  uDAPL

Path: /etc/objrepos
  udapl.rte                 7.1.1.15  COMMITTED  uDAPL

But the above only tells the IBM AIX packager version and not the version of the uDAPL that is being used.

To find out the uDAPL version being used, it is easier to just look at the header files.

# grep Version /usr/include/dat/dat.h
 * 3) under the terms of the "GNU General Public License (GPL) Version 2" a
 *              Library, Version: 1.2"

The above tells that we have uDAPL  version 1.2 support.

But let us check if we have uDAPL version 2.0 support also

# grep Version /usr/include/dat2/dat.h
 * 3) under the terms of the "GNU General Public License (GPL) Version 2" a
 *              Library, Version: 2.0"

The above confirms that we have uDAPL  version 2.0 support also.

WAITING_FOR_FAILBACK and Failed Offline

In DB2 pureScale if a node goes down, it is started on another host in a light mode just to do the crash recovery. When the home host becomes available, the host should fail back to its home host seamlessly.

For example:

This is a sample output from the db2instance -list command which shows that things are working pretty normal.

DB2 pureScale Tie Breaker for GPFS

After you install DB2 pureScale, it is a good idea to check the GPFS tie breaker and to see what DB2 install did to set the GPFS tie breaker. Run the following command.

# db2cluster -cfs -list -tiebreaker
The current quorum device is of type Majority Node Set.

The above means that the surviving group of hosts with the majority will have an operational quorum and that is OK. Normally, you would see the above if you did the pureScale install using the command line and not by using db2setup command.

A GPFS may use several disks so how do you know which disk is used as a tie-breaker disk. You can run db2cluster -cfs -list -filesystem command to find out all file systems that are in use.

DB2 pureScale Tie Breaker for Cluster Manager

During DB2 pureScale install, you must have given the tie-breaker disk device name. It is a good practice to check if that disk was in fact used as a tie-breaker disk or not.

# db2cluster -cm -list -tiebreaker

If the output from the above command is "Operator", the disk was not used for the tie-breaker. You will not see the problem until some hosts fail and the proper failover would not happen as an "Operator" is supposed to interfere.

Run the command to set the tie breaker again.

# db2cluster -cm -set -tiebreaker -disk /dev/dm-0

Where /dev/dm-0 is the tie-breaker disk. In my case, I used a 100MB LUN to be used as a tie-breaker disk. The error probably would show as why the disk was not used as a tie-breaker disk. Please refer to my earlier article for how to check if your storage supports SCSI-3 PR or not.

db2prereqcheck utility

While installing DB2 pureScale, if you receive the error: DBT3542E  An instance management task failed because the db2prereqcheck utility encountered an internal error. Reason code: "2", the fix is simple.

Edit your /etc/hosts file and check if you have tabs separating the entries in the file. Replace those tabs with spaces and you should be good to go. It looks that the db2prereqcheck did not do a good job of parsing the /etc/hosts properly.

Fun Things with DB2 10.1

As I am exploring new features of DB2 10.1, there are some fun things that I can do.

Example-1:

A table has rows in it. You want to make those rows invisible to all including you. Do this.

$ db2 "select count(*) from ps_table" --> You see rows in this table.

$ db2 "alter table ps_table activate row access control"

$ db2 "select count(*) from ps_table" --> Rows have become invisible now as you get 0 count and this is for all.

Your co-workers are doing some work in development and you just want to give them some fun time. Do this and laugh at the water cooler when you meet them. Do not do this in your production otherwise chance of a pink slip is great. A smart DBA or a cop can find who did this ALTER TABLE with activation and deactivation.

Example - 2:

Quiesce at Instance and Database Level

There are two types of DBAs

Ones who are control freak - This article is not meant for you as you have already taken pains to grant privilges to an application user without granting DBADM. You have a very good mechanism to control how applications connect and what they can access. You are following good practices so you can stop reading further

Lazy DBAs like me who are not control freaks but at the same time do things in an order that meet the requirements of the business by using least path of resistance and doing the work.

So, I like to give DBADM to the application user so that I am not bothered to slice and dice things as to what object an application can connect or do something. Basically, I do not want to take the burden of granting individual privilges to the objects. I trust the application folks and have means to determine what they did if they did something wrong so that I can nail them down. Like they dropped an object and no one knows who did this. I can find who did. So, with power responsibility comes and application folks are responsible people who are the bedrock of running an actual business whereas DBAs are like commodities.

DB2 pureScale, Fast I/O Fencing and GPFS

This must be read in conjunction with this post.

The fast I/O fencing needs to be enabled in DB2 pureScale. This is a feature of GPFS in conjunction with your storage firmware or its driver.

In a nutshell, the following procedure must be followed:

The file prcapdevices should be generated by the tsprinquiry command.

prcapdevices file should be copied on all nodes in /var/mmfs/etc directory.

mmchconfig usePersistentReserve=yes should be executed and this requires DB2 pureScale cluster (RSCT and GPFS) to be stopped.

The following script is an attempt to automate the process of enabling usePersistentReserve feature of GPFS, which of course depends upon the underlying storage.

Connect Routers with a Wire and use same SSID

This is how I did a setup of two wireless routers with same SSID.

Host Router: Linksys WRT54GS (I am replacing that with a Linksys E4200) connect WAN port to the modem.

Client Router: Linksys E1000 connected through an ethernet cable with the main router. Do not connect to the WAN port on the second router but connect LAN to LAN port.

Both routers are flashed with DD-WRT firmware and please consult http://dd-wrt.com and use their step-by-step process for how to flash your router with their firmware.

Purpose: Set up both routers so that they emit same SSID so that we have devices connected on the same subnet for easy sharing.

Host Router Set up

http://192.168.100.101

Setup > Basic Setup

Setup > Advanced Routing

Wireless -> Basic Settings

Wireless > Wireless Security

Security > Firewall

Connect Routers Wirelessly and use same SSID

These instructions worked for me to setup 2 wireless routers having same SSID with a single Internet connection.

Note: Both routers connect to each other wirelessly so this setup is useful when it is not possible to connect both routers through a wire.

My host wireless router is Cisco Linksys WRT54GS on which I installed DD-WRT firmware. This router is connected to the cable modem for the WAN connectivity.

Note: You should consult http://dd-wrt.com for how to flash your router with DD-WRT and pay attention to step-by-step process.

I have another Cisco Linksys E1000 router which is not connected to the cable modem and we want to connect it wirelessly to the host router. I installed DD-WRT on this router also.

This setup also works for 2 homes who want to share the cost of the same internet connection. This may be useful for countries where internet cost is high.

Run Slim and Effective

On my new Thinkpad W530 with 32 GB RAM and dual 500GB SSD, I am running multiple virtual machines using VMware 9 workstation. So far, I have tested 6 VMs running with a resonable performance. My aim is to run DB2 pureScale using minimum 4 VMs and 2 VMs for the applications.

Few things that I am doing differently on this machine.

I have abandoned the use of Norton Antivirus and Windows defender completely from my host and other services so that I have bare minimum things running on my laptop. Without an anti-virus software, how do I make sure that I protect my computer when I go to the internet.

I set a rules for myself that I will never surf internet from my host Windows 8 machine. I simply have a self-imposed rule to not to do it.

I created a 1GB RAM Virtual Machine running Ubuntu 12.04 and installed flash plugin in Firefox. I installed VMware tools so that I can easily copy files to and fro between host and VM. I took a VM snapshot of the machine.

Windows 8 Manage Wireless Connections - Cisco LEAP Wireless

The first thing I noticed in Windows 8 is the missing "Manage Wireless Network". I tried taking help from google but I found same issues raised by other people.

I could not connect to my work wireless that uses Cisco LEAP and here are the steps that I followed.

Press Windows Key - On my Thinkpad W530 this key is between CTRL and ALT keys on the left hand side.

Start typing network and this will show up the search window.

Click Settings. This is in between Apps and Files

Click Setup a Connection or Network

Click Manually connect to a wireless network

Type your network name, Security Type, Encryption Type. Click Next.

Click Change Connection Settings

Click Security Tab

Choose Network Authentication Method. My example is Cisco:LEAP

Click Settings

Priority Aging in DB2

The Work Load Management (WLM) feature of DB2 is mostly viewed as one of the best but least understood feature. The WLM has two main themes.

1. Use system defined resources such as CPU or I/O usage to do a task faster or slower

2. Use DB2 defined control such as thresholds or concurrency controls to build a most effective WLM.

The first approach is the one used by almost all vendors in some capacity or another. The second approach is unique to DB2 and at present no database vendors beat this functionality. This WLM feature has its root in mainframe DB2 and if you do this right, no workload can bring down your DB2 database.

However, I am going to discuss only the first approach of WLM. Sometime in a system, there is a query running and takling lots of resources and having an impact on the system. A DBA wants to demote this query so that the system resources are freed up.

In other databases, a DBA has to do the baby-sitting to identify and demote such queries but in DB2, the concept is "Set it and forget it."

Optim Performance Manager - checkLicense Fix

After you install Optim Performance Manager 5.1.1.1, start the web console and try to add a database for monitoring.

After establishing the connection, you may notice an opmConfigurationAdapter.checkLicense error and you may wonder that you specified the license during install and why this error is being thrown.

I do not know the internals but after making some changes in the startup scripts, I did not see this error come again and it appears that it has to do something the way memory is assigned to the Web Client and the OPM Server.

For example:

Your server has plenty of RAM and the server is not running anything at the moment. The maximum memory logic calculation in pestart file allows a large chunk of memory to the OPM server and that seems to be the problem.

DB2 pureScale - Pending Online

RSCT is pretty smart to take care of automation but when unsual things happen and may be that logic is not yet there, you may have to intervene as a DBA and that is why you have a niche for yourself.

Take this case: The output from lssam shows that two members of the DB2 pureScale are in pending offline and they stay that way for a long time. It means that there is something wrong.

DB2 pureScale - Applications WLB, ACR and CA

DB2 pureScale has one of the best features known as WLB, ACR and CA and a brief introduction is here.

  • WLB (Work Load Balancing) - at the unit of work level is the capability unmatched in the industry and built at the driver level where it can distribute the work to different DB2 members without any application code change.
  • ACR (Automatic Client Reroute) - It is not only for DB2 pureScale but can be used with Standalone DB2, HADR, Q-Replication etc. This is one of the best piece of technology built at the driver level. DB2 also provides server level ACR by using UPDATE ALTERNATE SERVER but I am not that much fan of the server level ACR since the client has to connect at least once to cache that information and that is why my personal preference is to use it through driver. Once it is at the clinet, it becomes part of the release process of the software and guaranteed to work.
  • CA (Client Affinity) - Let a client have affinity towards a particulatr member with ACR capability.

Know that WLB and CA are mutually exclusive but WLB and ACR can coexist and same way ACR and CA can coexists but not WLB and CA.

Pages