sinfo - a monitoring tool for networked computers
by
Jürgen Rinas
 
Contents of this page
The Problem
The Solution - How it works
Some Screenshots
Documentation
Preconditions
ChangeLog
Download
Published on
 
The Problem

Do you know this problem?

1.) There are many computers connected to your local network, and you have (scientific) programs that could use their computational power.

But the computers are too heterogeneous for dedicated clustering software, so you are on your own!

The first problem you have to solve is getting an overview of the available computers and their current load. sinfo is one approach to solving this problem.

2.) You are responsible for a laboratory of publicly accessible computers, and you need a tool that provides a quick overview of all nodes and the (insane) processes running on them. sinfo can be your choice.

sinfo has been running on our local network for about two years and has become a tool for daily use.

I'm running sinfo on the following systems:

  • Linux 2.6.32
  • FreeBSD 8.2 (it compiles again, after a long time)
  • Solaris 2.8, 2.9
    From version 0.0.15 on I can't test sinfo on Solaris, since we recently scrapped our last UltraSPARCs. I would be grateful to hear whether sinfo still works on Solaris, or if you could provide me with an account on one of your Solaris machines.
 
The Solution - How it works
Figure (idea.jpg): schematic information flow of sinfo

The sinfo system is split into two parts: a daemon and a user program.

1.) The daemon (sinfod) distributes system information using UDP broadcasts on the local network. Each daemon also receives the UDP broadcasts of all other daemons and maintains a list of the most recent information.

2.) The user program (sinfo) connects to the daemon via the local loopback interface and displays the up-to-date information using the ncurses library.
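
The broadcasting side of this scheme can be sketched roughly as follows. This is a minimal Python illustration of the idea, not sinfod's actual implementation (which is written in C++). The port number, the JSON wire format, the staleness timeout and the names encode_report, broadcast_report, NodeTable and receive_loop are all assumptions made up for this example.

```python
import json
import socket
import time

PORT = 60000            # hypothetical port, not taken from sinfod
STALE_AFTER = 30.0      # assumed: drop nodes silent for this many seconds


def encode_report(hostname, load1):
    """Serialize one node's report; JSON is an assumed wire format."""
    return json.dumps({"host": hostname, "load1": load1}).encode("utf-8")


def broadcast_report(payload, port=PORT):
    """Send one report to every host on the local network segment."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, ("255.255.255.255", port))


class NodeTable:
    """Keeps only the most recent report received from each node."""

    def __init__(self):
        self._nodes = {}

    def update(self, payload, now=None):
        report = json.loads(payload.decode("utf-8"))
        report["seen"] = time.time() if now is None else now
        self._nodes[report["host"]] = report   # newer report replaces older

    def current(self, now=None):
        """Return the reports that are not older than STALE_AFTER."""
        now = time.time() if now is None else now
        return {host: r for host, r in self._nodes.items()
                if now - r["seen"] <= STALE_AFTER}


def receive_loop(table, port=PORT):
    """Listen for broadcasts from all daemons and record them."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        while True:
            payload, _addr = sock.recvfrom(65535)
            table.update(payload)
```

In such a design every daemon would call broadcast_report periodically and run receive_loop alongside it, so each node ends up holding the same table that a local client can query over the loopback interface.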

This scheme has the advantage that it produces minimal network load. If each node broadcasts its information in a cooperative manner, the network load is O(N), where N is the number of nodes in your network.

Other systems for monitoring your cluster load (e.g. rup(1)) use a polling scheme in which every node has to ask every other node for its system information. In that case the network load is O(N**2).
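
To make the difference concrete, the following small Python sketch counts the messages sent per update interval under each scheme. It assumes one broadcast packet per node per interval for the cooperative scheme, and one request plus one reply for every ordered pair of nodes for the polling scheme. The function names are made up for illustration.

```python
def broadcast_messages(n):
    """Each of the n nodes sends one broadcast per interval: O(N)."""
    return n


def polling_messages(n):
    """Each node queries each of the other n - 1 nodes and gets a reply,
    giving 2 * n * (n - 1) messages per interval: O(N**2)."""
    return 2 * n * (n - 1)


for n in (10, 100):
    print(n, broadcast_messages(n), polling_messages(n))
# prints:
# 10 10 180
# 100 100 19800
```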

The broadcast information includes:

  • The number of CPUs and their speed.
  • The network node hostname, the hardware type, the host processor type, the operating system name, the operating system release and the operating system version (everything uname provides).
  • The uptime of the system.
  • The load average.
  • The current load - split by user, nice, system and idle times.
  • The memory usage of the RAM and the swap space.
  • The network traffic sent and received by the network card.
  • Information about the top 5 processes.
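
As an illustration of what such a report might contain, the sketch below gathers a subset of these fields with the Python standard library. It is not sinfod's data layout: the NodeInfo class, its field names and the collect function are assumptions for this example, and os.getloadavg() is only available on Unix-like systems.

```python
import os
import platform
from dataclasses import dataclass, asdict


@dataclass
class NodeInfo:
    """Hypothetical subset of the per-node report."""
    hostname: str        # network node hostname
    machine: str         # hardware type
    system: str          # operating system name
    release: str         # operating system release
    version: str         # operating system version
    cpu_count: int       # number of CPUs
    load_average: tuple  # 1, 5 and 15 minute load averages


def collect():
    """Gather the fields above for the local machine."""
    u = platform.uname()
    return NodeInfo(
        hostname=u.node,
        machine=u.machine,
        system=u.system,
        release=u.release,
        version=u.version,
        cpu_count=os.cpu_count() or 1,
        load_average=os.getloadavg(),   # Unix only
    )


if __name__ == "__main__":
    print(asdict(collect()))
```

Memory usage, network traffic and per-process data are left out here because the standard library has no portable call for them.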
 
Some Screenshots
 
Documentation

For documentation, see the man pages of sinfo and sinfod.

 
Preconditions

The following tools and libraries are necessary to compile and use sinfo. Most of them should already be available on any modern Linux system.

Necessary libraries:

  • ncurses - libraries for terminal handling (tested with version 5.7)
  • boost - peer-reviewed portable C++ source libraries; sinfo uses Boost.Asio, Boost.Bind and Boost.Signals (tested with version 1.42)
 
ChangeLog

sinfo 0.0.47 - Sun, 08 Jul 2012 08:42:55 +0200

  • switched from standalone asio.hpp to boost/asio.hpp bundled with boost in order to simplify library requirements

sinfo 0.0.46 - Mon, 14 May 2012 19:41:06 +0200

  • some compile fixes for gcc 4.7 (name lookup changes, see http://gcc.gnu.org/gcc-4.7/porting_to.html)
  • fixed cursor handling by using the asio::null_buffers() concept to get a trigger when input is available, following the example http://www.boost.org/doc/libs/1_45_0/doc/html/boost_asio/example/nonblocking/third_party_lib.cpp

sinfo 0.0.45 - Tue, 13 Mar 2012 07:07:27 +0100

  • corrected README compile hint for FreeBSD
  • added configure flag --disable-IPv6 to disable IPv6 support

sinfo 0.0.44 - Tue, 13 Dec 2011 18:32:33 +0100

  • added reconnect for TCP connections
  • added LIBADD to make it --as-needed linkable tnx to T.Harder

sinfo 0.0.43 - Thu, 01 Sep 2011 09:00:13 +0200

  • fixed printing bug (integer underflow) when using sinfo -L or -W tnx to J.Erkkilae
 
Download

I've prepared a Debian package for this program. If you configure your system to include my Debian repository, you can install it via
    apt-get update; apt-get install sinfo

 
Published on

Lately I've seen some sites out there linking to sinfo's homepage. If you want your site to be listed here, or know of a site not listed here, please drop me a line.
