Because I always forget how.

In any enterprise-level application environment, you’ll find that your tiers are segregated by firewalls.

In some cases, you may see this type of architecture

FIREWALL -> WEB -> FIREWALL -> APP -> FIREWALL -> DB

or even

FIREWALL -> WEB -> FIREWALL -> APP/DB

In both designs, you may run into keepalive issues.

Keepalives are essentially messages sent between two devices at a specified interval to verify the state of the connection between them. If a message is not acknowledged by the receiving device, the transmitting device assumes the connection is down and finds another way to route data until that connection is re-established (if it ever is, which it usually isn’t).

Keepalives are essential in environments where you’re using connection pools, because pooled connections can sit idle for long stretches. Web servers may sometimes use a connection pool to talk to an application server like Tomcat or WebLogic. Application servers frequently use database connection pools to keep performance optimal.

Most connection pools have a keepalive setting, so you should leverage that when you can. Some connection pools do not. Mod_weblogic, for example, doesn’t have its own keepalive value. It can be enabled or disabled, but by default it will use the system keepalive time, which on RHEL/CentOS systems is set to 7200 seconds (two hours).
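For context, a pool’s keepalive setting usually comes down to a few socket options set on each connection. Here is a minimal Python sketch of that idea, assuming Linux (the TCP_KEEP* constants are not available everywhere). The function name and the default values are my own placeholders, not something taken from any particular pool.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, probes=20):
    """Turn on TCP keepalive for one socket and override the system timers.

    idle, interval and probes play the same role as tcp_keepalive_time,
    tcp_keepalive_intvl and tcp_keepalive_probes. The defaults here are
    illustrative assumptions.
    """
    # Without SO_KEEPALIVE the kernel sends no probes on this socket,
    # whatever the system-wide values are.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket overrides are Linux-specific, so guard them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

A socket that enables SO_KEEPALIVE but sets none of the overrides falls back to the system-wide values, which is the situation described above.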

To check your current system keepalive settings

# sysctl -a | grep net.ipv4.tcp_keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

net.ipv4.tcp_keepalive_intvl is the interval, in seconds, between successive keepalive probes once probing has started.
net.ipv4.tcp_keepalive_probes tells your system how many unacknowledged keepalive probes to send before considering the connection to be dead.
net.ipv4.tcp_keepalive_time tells your system how long, in seconds, a connection must sit idle after the last packet before the first keepalive probe is sent. This is the biggie!
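To see how the three values combine, here is a rough Python sketch that reads them from /proc/sys/net/ipv4 (where Linux exposes the same sysctls as files) and adds up how long an idle connection to a dead peer lingers before the kernel gives up on it. The function names are made up for illustration.

```python
from pathlib import Path

KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return the three keepalive sysctls as a dict of ints."""
    return {key: int((Path(base) / key).read_text().strip()) for key in KEYS}


def seconds_until_declared_dead(settings):
    """Idle time before the first probe, plus one interval per unanswered probe."""
    return (settings["tcp_keepalive_time"]
            + settings["tcp_keepalive_intvl"] * settings["tcp_keepalive_probes"])
```

With the stock values shown above, `seconds_until_declared_dead(read_keepalive_settings())` works out to 7200 + 75 * 9 = 7875 seconds, a bit over two hours and eleven minutes.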

I don’t understand why 7200 seconds was chosen as the default. In my environment here, the firewall can drop idle connections after one hour, and sometimes even sooner depending on how big the connection table gets (I’m looking at you, Check Point).

So I normally trim these down so that the keepalive time is lower and the number of probes is higher. The interval is also reduced a bit, but that’s not really important. You would normally make these changes on the server that initiates the connection: a web server or an application server, and sometimes a DB server.

In /etc/sysctl.conf, add these lines (or modify them if they’re already there):


net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_keepalive_time = 300
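If you want to sanity-check the file before loading it, something like this Python sketch would do. It parses sysctl.conf-style text and checks that the first probe goes out before the firewall’s idle timeout. The 3600-second timeout is an assumption based on the one-hour figure from my environment, and the function names and sample text are made up.

```python
def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def first_probe_beats_firewall(settings, firewall_idle_timeout=3600):
    """True if the first keepalive probe is sent before the firewall's idle timeout."""
    return int(settings["net.ipv4.tcp_keepalive_time"]) < firewall_idle_timeout


# Sample file contents, mirroring the lines above.
SAMPLE = """\
# keepalive tuning
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_keepalive_time = 300
"""

conf = parse_sysctl_conf(SAMPLE)
print(first_probe_beats_firewall(conf))  # True, since 300 < 3600
```

The same check against the stock value of 7200 comes back False, which is exactly the problem.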

To put these settings into effect, run


sysctl -p /etc/sysctl.conf

Then recheck with sysctl -a to confirm the new values.

Once set, you will need to restart your webserver or app server so it sees the new settings. This allows you to start with a fresh set of connections that you can actually monitor using netstat.

On both ends of the connection, you should be able to corroborate the ports, state and number of connections, which tells you that things are A-OK!
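For the netstat comparison, a small Python sketch like the one below can tally connection states for one remote port from the text printed by `netstat -tn`. Run it against the output from each end and compare the counts. The sample output, addresses and port 7001 are invented for illustration.

```python
from collections import Counter


def count_states(netstat_output, remote_port):
    """Count TCP connection states for connections to remote_port.

    Expects the text printed by `netstat -tn`.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp 0 0 local:port remote:port STATE
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        port = fields[4].rsplit(":", 1)[-1]
        if port == str(remote_port):
            states[fields[5]] += 1
    return states


# Made-up sample in the usual netstat -tn layout.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:41822          10.0.0.9:7001           ESTABLISHED
tcp        0      0 10.0.0.5:41823          10.0.0.9:7001           ESTABLISHED
tcp        0      0 10.0.0.5:41830          10.0.0.9:7001           TIME_WAIT
tcp        0      0 10.0.0.5:22             10.0.0.77:50211         ESTABLISHED
"""

print(count_states(SAMPLE, 7001))
```

On the server end of the same connections, the port to match on sits in the local address column instead, so the field index would need to change from 4 to 3.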

Hope this helps.

November 16th, 2010

My vim settings

From time to time, I find some settings for vi on remote systems that really kind of freak me out. The one I found recently was ‘incsearch’ so I decided to use this opportunity to note down the settings I use on a daily basis. Hope you find some of these useful.


syntax on  
set hlsearch
set incsearch
set ruler
set showmatch

syntax on is pretty obvious. If you’re writing code, it’s pretty smart about highlighting the code so it’s easier to read. It can be odd at first but I find it really useful and after a while, it becomes second nature.

set hlsearch highlights your search terms so they’re easy to see. I like this option a lot. Not everyone does.

set incsearch searches as you type. It’s new to me so I’m still getting used to it but I think I can already see some uses for it.

set ruler shows you where your cursor is at all times. I like this option a lot, if only to tell me what line number I’m on. set number will also do this, but I find it irritating because it interferes with my copy/paste habits.

set showmatch is really useful if you’re a coder. If you’ve got somewhat complicated conditional statements or loops, this feature will show you where brackets match so you can find missing brackets and close the proper blocks.

Hope these help. I’ll update these as I find more.

Spacewalk, it’s pretty damn awesome. Or at least, I think it is. It’s an open source Linux systems management solution from Red Hat (GPLv2).

Once you get Spacewalk up and running, you’ll be amazed by some of the things it does. It can push config files and packages, inventory systems, group them and let you work exclusively with those groups in a very easy way. That’s only scratching the surface of what Spacewalk is capable of.

I like it because I can set up custom channels where I can push custom software to each of my servers. From time to time though, I notice that the repos don’t rebuild automatically. If you look at the “details” section of your channel, you may notice that the times shown there don’t match. That probably means that the taskomatic daemon is not running, or is running but isn’t pulling tasks from the database.

To verify, log in to sqlplus and run this query:

sqlplus spacewalk/spacewalk@xe

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Nov 13 14:14:00 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> select * from rhnTaskQueue;

ORG_ID TASK_NAME
---------- ----------------------------------------------------------------
TASK_DATA PRIORITY EARLIEST
---------- ---------- ---------
1 update_errata_cache_by_channel
143 0 13-NOV-10

1 update_errata_cache_by_channel
122 0 12-NOV-10

1 update_errata_cache_by_channel
208 0 13-NOV-10

ORG_ID TASK_NAME
---------- ----------------------------------------------------------------
TASK_DATA PRIORITY EARLIEST
---------- ---------- ---------
1 update_errata_cache_by_channel
122 0 13-NOV-10

Notice how some tasks are older? This table should almost always be empty or only have data for a small period of time as the name suggests.
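If you’d rather not eyeball the dates, the check can be sketched in a few lines of Python. The rows below are copied from the query output above. The function name is made up, and since EARLIEST is only printed with day granularity here, this is a coarse check that flags anything queued before today. It also assumes English month abbreviations.

```python
from datetime import date, datetime


def stale_tasks(rows, today):
    """Return rows whose EARLIEST date is before `today`.

    rows: (task_name, task_data, earliest) tuples, with earliest in the
    DD-MON-YY form that sqlplus printed.
    """
    stale = []
    for task_name, task_data, earliest in rows:
        # title() turns "12-NOV-10" into "12-Nov-10" so that %b matches.
        queued = datetime.strptime(earliest.title(), "%d-%b-%y").date()
        if queued < today:
            stale.append((task_name, task_data, earliest))
    return stale


ROWS = [
    ("update_errata_cache_by_channel", 143, "13-NOV-10"),
    ("update_errata_cache_by_channel", 122, "12-NOV-10"),
    ("update_errata_cache_by_channel", 208, "13-NOV-10"),
    ("update_errata_cache_by_channel", 122, "13-NOV-10"),
]

print(stale_tasks(ROWS, today=date(2010, 11, 13)))
```

For the day the query was run, this picks out the single task queued on 12-NOV-10.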

Restarting taskomatic is as simple as


[root@spacewalk init.d]# ./taskomatic stop
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
[root@spacewalk init.d]# ./taskomatic start
Starting RHN Taskomatic...

Wait about 10 minutes, because that’s the polling time for taskomatic, and then check the database again. There should be no rows:


SQL> select * from rhnTaskQueue;

no rows selected
SQL>

Also check the Spacewalk UI. The times in the “details” section of your channel should now match.

Spacewalk is still very much in its infancy, but it shows great promise, and there is a great community of people who are willing to help and are dedicated to making it a rich, full-featured product. Spacewalk 1.1 was released recently and we haven’t had a chance to upgrade yet, but I continue to see great things coming from Spacewalk and that makes me happy.

