#
# Copyright (c) 2000, 2001, 2002 Jarkko Turkulainen. All rights reserved.
#
# SMA is copyrighted software. See the file COPYRIGHT which can be found
# at the top level of the sma distribution.
#
# $Id: README,v 1.7 2002/11/14 19:21:33 jt Exp $
#

		SMA - SENDMAIL LOG ANALYSER

PLEASE READ THIS CAREFULLY BEFORE YOU START USING THIS SOFTWARE

In many countries, including Finland and many other European countries,
it might be illegal to read mail logs and produce mail log summaries.
Or, at least, it is illegal to use mail logs for anything else but problem 
solving. Of course, if you run a mail service of your own, nobody's
interested in what you do with your logs. But on the other hand, if you
run a mail service that is used by other persons (no matter how many)
you simply cannot publish the names and top lists. You cannot even show
the report to your clients!!

If you are not sure about the law in your country, make sure that you
always run SMA with the command line option (-d) or the configuration file
keyword "ShowUsers no" (the default in 1.3.2 or later). Or even better,
disable all the envelope analyses.


1. OVERVIEW

SMA is a program that analyses mail log files and produces a nice summary
of mail activity. It works by taking its input from files or standard input 
and outputting the results to standard output or file. All error messages and 
debugging information are printed to standard error.

As of version 0.12.0, SMA prints the results as a nicely formatted
ASCII report. The HTML report can still be produced with the command line
option (-w). The HTML report relies heavily on tables, so lynx is not the
best browser for the job. If you MUST use a text browser, try using links
instead.


2. SOURCE INSTALLATION

Unpack the distribution. Copy one of the Makefiles (Makefile, Makefile.w32
or Makefile.msvcp) as Makefile. Modify Makefile and conf.h as needed. Type

$ make
(or nmake or gmake or whatever your system requires)

at the top level of the distribution. After successful compilation, just
copy the binary file sma to your favourite directory or type

# make install

as root. This also installs the manpage as MANPATH/sma.8. You can also copy
the default configuration file "sma.conf" as defined in conf.h.

The program is small and simple, and it should compile on most UNIX-style
environments. At least the following systems are known to work:

	AIX 4.1.3, 4.3 (gcc)
	Digital Unix 4.0 (gcc, DECcc)
	FreeBSD/i386 3.4, 3.5.1, 4.x (gcc)
	HP-UX 10.20 (gcc, HPcc)
	IRIX 6.5 (gcc)
	NetBSD/sparc 1.4.2 (gcc)
	OpenBSD/i386 2.7 - 3.2 (gcc)
	Red Hat Linux 5.2 - 7.2 (gcc)
	Solaris 2.5.1, 2.6, 7 and 8 (gcc)
	Solaris/i386 8 (gcc)
	Mac OS X (Darwin 6.2)
	Win32 (cygwin-1.3.3)
	Win32 (mingw-1.1) (*)
	Win32 (MS Visual C++) (**)

(*) Compile with Makefile.w32
(**) Compile with Makefile.msvcp

Systems with known problems:

	o on Tru64 UNIX cluster (DECcc) malloc() returns errno 22 (EINVAL)
	  which indicates that the requested space is out of range.

The flag -DUSE_REGEXP in CFLAGS may cause problems on systems that do
not conform to the XPG3 definition of regular expressions. In that case,
remove the flag from the Makefile and recompile. If USE_REGEXP is not
defined, the standard strstr() function is used by the filtering routines.

RPM BUILDING

The RPM spec file (contrib/sma.spec) can be used to build a binary RPM
package. Here are the instructions:

- Copy sma-x.tar.gz to /usr/src/RPM/SOURCES/ (x refers to sma version,
  for example 1.3)
- Unpack sma-x.tar.gz 
  $ tar zxvf sma-x.tar.gz
- Build RPM 
  # rpm -bb sma-x/contrib/sma.spec
- Install RPM from /usr/src/RPM/RPMS/

Note that you may need to replace /usr/src/RPM with /usr/src/redhat on
some Red Hat-based systems.


3. WIN32 BINARY INSTALLATION

The Win32 binary distribution is compiled with MinGW, see
http://www.mingw.org for more information. The SMA binary should run
on all 32-bit Windows versions (95, 98, NT, 2000, XP, ...) that
use Microsoft's standard C runtime library (MSVCRT.DLL).

Here are the general instructions on how to get the SMA binary working.

- Unpack the distribution sma-x-win32.zip where 'x'
  refers to version number of the sma distribution.

- Make a directory (C:\sma etc.) 
  If you run Sendmail for NT, it might be easiest to install
  the binary in that directory (C:\Program Files\Sendmail etc.)

- Copy sma.exe and sma.conf to your directory. Edit sma.conf
  with your favourite text editor. Make sure your editor can
  handle ASCII files correctly. MS Word doesn't.

If you run Sendmail for NT, you must enable logging to files. SMA
cannot read the NT event log. In the Sendmail Control Panel, set the
sendmail logging level to '9' (or higher) and define the log file (probably
C:\Program Files\Sendmail\smlog.txt or something).


4. HOW TO USE

sma [-Fcdhinpsqvw] [-D date1,date2] [-b color] [-C string] [-f file]
    [-L string] [-o file] [-O format] [-l num] [-r num] [-t value]
    [files ...]

Generally, you run SMA on one or more log files from /var/log (or wherever
they are) and redirect the output to a file. The program tries to open a
configuration file named "sma.conf" in the current working directory.
This can be overridden with the flags (-f and -F) or with the compile-time
option DEFAULT_CONF, defined in the file conf.h.

Note that the behavior of sendmail logging changed in version 8.12 so
that the MSP and MTA deliveries are logged as separate entries. The
syslog tag should be set with the option (-L) in both sendmail and SMA.
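
As a rough illustration of what selecting lines by syslog tag means, the
sketch below pulls the tag out of each log line and keeps only the lines
logged under one tag. It is written in Python and is not SMA's own code.
The line layout ("Mon DD HH:MM:SS host tag[pid]: message"), the tags
sm-mta and sm-msp, the sample lines and the helper names are all
assumptions made for this example.

```python
import re

# Assumed BSD-syslog layout: "Oct 25 04:24:56 host tag[pid]: message".
# The pattern is illustrative only; SMA's own parser may differ.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:"
)

def syslog_tag(line):
    """Return the syslog tag of a log line, or None if it does not parse."""
    m = LINE_RE.match(line)
    return m.group(1) if m else None

def lines_with_tag(lines, tag):
    """Keep only the lines logged under the given tag (the idea behind -L)."""
    return [ln for ln in lines if syslog_tag(ln) == tag]

# Made-up sample lines; the tags and queue IDs are hypothetical.
sample = [
    "Oct 25 04:24:56 mx1 sm-mta[1234]: f9P1Ou01234: from=<a@example.org>, size=10",
    "Oct 25 04:24:57 mx1 sm-msp[1240]: f9P1Ov01240: to=b@example.org, stat=Sent",
]
mta_only = lines_with_tag(sample, "sm-mta")
```

With both sendmail instances logging under different tags, selecting one
tag keeps the MSP and MTA entries from being mixed in the same report.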


Command line options:

-b color	Set the background color of the HTML report as "color".
		This is a six-digit RGB value.

-C string	Set report header as "string".

-c		Print the copyright notice and exit.

-D date1,date2  Process log entry only if the date is between "date1"
		and "date2". The format of the date is as follows:

		[[[[[cc]yy]mm]dd]HH]MM[.SS] where

		   yy      Year in abbreviated form (for years 1969-2068).
                           The format ccyymmddHHMM is also permitted, for
			   non-ambiguous years.
                   mm      Numeric month, a number from 1 to 12.
                   dd      Day, a number from 1 to 31.
                   HH      Hour, a number from 0 to 23.
                   MM      Minute, a number from 0 to 59.
                   SS      Second, a number from 0 to 61 (59 plus a maximum of
                           two leap seconds).

		Everything but the minute is optional. The dates must be
		separated by a comma, without any whitespace characters.
		If either of the dates is missing, the current date is used.

-d		Analyse sender/receiver domains instead of full
		e-mail addresses; e.g. domain.com instead of
		joe@domain.com.

-f file		Read the configuration from "file" instead of the default
		configuration (./sma.conf). Some of the configuration
		options are only available from the configuration file.
		You should read the file "sma.conf" for more information.

-F		Do not use default configuration file even if it exists.

-h 		Print help message and exit.

-i 		Include the ASCII report in HTML comment field. This option
		requires HTML reporting (-w, -O html or "Format html").

-L string	Process only lines with syslog tag "string".

-n 		Do not report the time distribution.

-o file		Write the report to "file". If not given, print to stdout.

-O format	Output format: ascii, html or clog. See CUSTOM LOGGING for
		more information on the clog format.

-p 		Print current configuration to stdout.

-s 		Sort by transfers. Default is by number of messages.

-q		Do not print any warning messages. Sometimes SMA may
		be noisy. Use this switch if you see too many 
		"skipping useless line.." messages.

-l num		Number of the senders/recipients that are printed
		in the summary. Default is 10.

-r num		Number of the input/output relay domains that are
		printed in the summary. Defaults to 5.

-t value	Adjust the internal hash table size. Possible values are:
		"normal", "big", "huge" or custom, comma-separated values.

-v		Print some debugging information for each parsed line.
		Be careful with big files and slow terminals.

-w		Print the report in HTML.
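
The date format accepted by (-D) can be sketched as follows. This is a
Python illustration of the rules listed above, not SMA's own parser. The
function names are hypothetical, missing leading fields are taken from the
current time, and a missing second defaulting to 0 is an assumption.

```python
from datetime import datetime

def parse_sma_date(text, now=None):
    """Parse [[[[[cc]yy]mm]dd]HH]MM[.SS]; missing fields come from `now`."""
    now = now or datetime.now()
    digits, _, sec = text.partition(".")
    if not digits.isdigit() or len(digits) not in (2, 4, 6, 8, 10, 12):
        raise ValueError("bad date: %r" % text)
    if sec and not (sec.isdigit() and len(sec) == 2):
        raise ValueError("bad seconds: %r" % text)
    # Two-digit fields, reversed so that index 0 is always the minute.
    fields = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    fields.reverse()  # MM, HH, dd, mm, yy, cc
    minute = fields[0]
    hour = fields[1] if len(fields) > 1 else now.hour
    day = fields[2] if len(fields) > 2 else now.day
    month = fields[3] if len(fields) > 3 else now.month
    if len(fields) == 6:
        year = fields[5] * 100 + fields[4]
    elif len(fields) == 5:
        # Abbreviated years cover 1969-2068, as stated above.
        year = 1900 + fields[4] if fields[4] >= 69 else 2000 + fields[4]
    else:
        year = now.year
    # datetime rejects leap seconds (60, 61); they raise ValueError here.
    return datetime(year, month, day, hour, minute, int(sec) if sec else 0)

def parse_range(arg, now=None):
    """Split "date1,date2"; a missing date falls back to the current time."""
    now = now or datetime.now()
    first, _, second = arg.partition(",")
    start = parse_sma_date(first, now) if first else now
    end = parse_sma_date(second, now) if second else now
    return start, end
```

For instance, parse_sma_date("1630") gives 16:30:00 on the current day, and
parse_sma_date("200101010000.00") gives midnight on January 1st, 2001.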

Examples:

- Print the results to txtdocs/report.txt:
  $ sma maillog > txtdocs/report.txt

- Print only relay domains and sort them by transfers. 
  Output format is HTML:
  $ sma -nsw -l 0 maillog > wwwdocs/report.html

- Print the ASCII report to file report.txt and errors/debugging to
  file debug.txt:
  $ sma -v maillog > report.txt 2> debug.txt

- Read configuration from file /usr/local/etc/sma.conf and read the
  output file name from command line:
  $ sma -f /usr/local/etc/sma.conf -o report.html maillog

- Read log file smlog.txt and write output to file 
  WebUI\reports\index.html:
  C:\Program Files\Sendmail> sma -o WebUI\reports\index.html smlog.txt

- Read from stdin and write to stdout :-)
  $ sma

- Read only logs between minutes 15 and 45 this hour
  $ sma -D 15,45 maillog

- Read only logs between 16:30 and 16:50 today
  $ sma -D 1630,1650 maillog

- Read only logs before 25th day this month
  $ sma -D ,250000 maillog

- Read only logs after 15:25:10 (hour is 15, minute 25 and second 10)
  $ sma -D 1525.10 maillog

- Read logs between 2001 and 2002, with full dates
  $ sma -D 200101010000.00,200201010000.00

- Print configuration to file sma.conf
  $ sma -p [your favourite command line options] > sma.conf

- Use big hash tables
  $ sma -t big

- Set address hash table size as 10000 and relay table size as 3000:
  $ sma -t 10000,3000


5. FILTERS

SMA filters help you answer questions like "how many
messages are passed through a specific relay host?" or "how many
messages were sent to @some.domain.com during a certain time interval?"

Filters are invoked from the configuration file with the following keys:

Key				Value
---------------------------------------------------
EnvelopeSenderFilter		*
EnvelopeRecipientFilter		*
RelaySenderFilter		*
RelayRecipientFilter		*
StartTime			YYYY/MM/DD-HH:MM:SS
EndTime				YYYY/MM/DD-HH:MM:SS

The meaning of the keys should be clear: four of the six keys filter
envelopes and relays (input and output) and the remaining two set the
start and end times. The values are tested as a simple substring match.
The only special pattern is (*), which means "any". If compiled with
USE_REGEXP, all the standard, egrep-style extended regular expressions
may be used.

All filters are ANDed together - you cannot generate a report with filter
"all mails sent to some.domain OR all mails sent from some.domain".
But you can always run the same file several times with a different set
of filters. If compiled with USE_REGEXP, filters may also contain
alternatives (|).

The meaning of a filter can be reversed by placing '!' as the first
character. All other '!' characters are taken literally (or as part of
the regexp).
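
The matching rules above can be sketched in Python as follows. This is an
illustration only, not SMA's filtering code: the function names and the
record field names are hypothetical, Python's re module stands in for the
POSIX extended regular expressions used with USE_REGEXP, and treating a
lone "*" as "any" in regexp mode as well is an assumption.

```python
import re

def match_filter(pattern, value, use_regexp=False):
    """Test one value against one filter pattern."""
    # Only a leading '!' reverses the filter; later ones are literal.
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]
    if pattern == "*":
        hit = True                       # "*" means "any"
    elif use_regexp:
        hit = re.search(pattern, value) is not None
    else:
        hit = pattern in value           # plain substring match
    return hit != negate

def passes(record, filters, use_regexp=False):
    """All filters are ANDed: every one of them must match."""
    return all(match_filter(pat, record[key], use_regexp)
               for key, pat in filters.items())

# Hypothetical record and filter set, for illustration only.
record = {"envelope_sender": "joe@domain.com",
          "relay_recipient": "mx.other.net"}
filters = {"envelope_sender": "@domain.com",
           "relay_recipient": "!some.domain"}
accepted = passes(record, filters)
```

Because the filters are ANDed, an "either sender or recipient" question
needs two runs over the same log, one per filter set.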



6. CUSTOM LOGGING

Custom log format (clog) is one of the output formatting options. Unlike
ASCII and HTML, which are reporting formats, clog is a sort of log file
filter. Its main function is to convert the multi-line sendmail log file
to a simple, one-line-per-delivery format. This simple log file may then
be further analysed with another log analyzer, for example the excellent
analog (http://www.statslab.cam.ac.uk/~sret1/analog).

Custom logging is invoked with command line option (-O clog) and/or
configured using the following configuration file keywords:

Format		clog
ClogFormat	FORMATSTRING

The value FORMATSTRING controls the information and how it is formatted.
It consists of ordinary characters and various two-character sequences which
are replaced with built-in variables as follows:

	%U time in UNIX time format
	%D time in form "Wed Jun 30 21:49:08 1993"
	%y year, four digits
	%m month, in digits
	%M month, three letter English
	%n minute
	%s second
	%d day
	%h hour
	%H hostname
	%z size in bytes
	%f envelope sender
	%t envelope recipient
	%F relay sender
	%T relay recipient
	%S status (1 = sent, 0 = error)
	%% %-character
	\n newline
	\t tab stop
	\\ single backslash

For example, the following format string

  ClogFormat	"%D: from=%f, to=%t, size=%z"

produces output like

  Thu Oct 25 04:24:56 2001: from=sender1, to=recipient1, size=10
  Thu Oct 25 04:24:57 2001: from=sender2, to=recipient2, size=20
  Thu Oct 25 04:24:58 2001: from=sender3, to=recipient3, size=30
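
The substitution of the two-character sequences can be sketched in Python
as follows. This is an illustration, not SMA's formatter: the function
name and the record keys are hypothetical, and the zero-padding of the
numeric fields (%m, %d, %h, %n, %s) is a guess, since the padding is not
specified here.

```python
import re
import time
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}

def expand_clog(fmt, rec):
    """Expand a ClogFormat-style string for one delivery record."""
    ts = rec["time"]  # a datetime.datetime, taken as local time
    values = {
        "U": str(int(time.mktime(ts.timetuple()))),
        "D": time.asctime(ts.timetuple()),
        "y": "%04d" % ts.year,
        "m": "%02d" % ts.month,
        "M": MONTHS[ts.month - 1],
        "n": "%02d" % ts.minute,
        "s": "%02d" % ts.second,
        "d": "%02d" % ts.day,
        "h": "%02d" % ts.hour,
        "H": rec["host"],
        "z": str(rec["size"]),
        "f": rec["from"],
        "t": rec["to"],
        "F": rec["relay_from"],
        "T": rec["relay_to"],
        "S": "1" if rec["sent"] else "0",
        "%": "%",
    }

    def repl(m):
        # Unknown sequences are left untouched in this sketch.
        if m.group(1) is not None:
            return values.get(m.group(1), m.group(0))
        return ESCAPES.get(m.group(2), m.group(0))

    return re.sub(r"%(.)|\\(.)", repl, fmt)

# Hypothetical record matching the first line of the example output.
rec = {"time": datetime(2001, 10, 25, 4, 24, 56), "host": "mx1",
       "size": 10, "from": "sender1", "to": "recipient1",
       "relay_from": "relay1", "relay_to": "relay2", "sent": True}
line = expand_clog("%D: from=%f, to=%t, size=%z", rec)
```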

Unlike ASCII and HTML reporting, custom logging is done in real time and
runs with a very small memory footprint. Piping the log of a running
sendmail daemon through SMA might be a very interesting application:

$ tail -f /var/log/maillog | sma -O clog
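
Because the clog output has one line per delivery, it is easy to process
further. The Python sketch below sums the message sizes per envelope
sender. It assumes the example format string shown earlier
("%D: from=%f, to=%t, size=%z") and that addresses contain no commas; the
function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines produced by the format "%D: from=%f, to=%t, size=%z".
# The date written by %D is always 24 characters long.
LINE_RE = re.compile(
    r"^(?P<date>.{24}): from=(?P<sender>[^,]*), "
    r"to=(?P<rcpt>[^,]*), size=(?P<size>\d+)$"
)

def bytes_per_sender(lines):
    """Sum the size field per envelope sender; skip lines that do not parse."""
    totals = defaultdict(int)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            totals[m.group("sender")] += int(m.group("size"))
    return dict(totals)

sample = [
    "Thu Oct 25 04:24:56 2001: from=sender1, to=recipient1, size=10",
    "Thu Oct 25 04:24:57 2001: from=sender2, to=recipient2, size=20",
    "Thu Oct 25 04:24:58 2001: from=sender1, to=recipient3, size=30",
]
totals = bytes_per_sender(sample)
```

The same function accepts any iterable of lines, so a finished clog file
can be passed in as an open file object.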



7. ACKNOWLEDGEMENTS

Adam Beaumont <admin at a-q dot co dot uk> - thanks for constructive ideas
Dirk Meyer <dirk dot meyer at dinoex dot sub dot org> - code cleanup patch
Nicos Nicolaou <nicosn at logosnet dot cy dot net> - ideas and feedback
Mario Pino Uceda <mpino at cica dot es> - support for big log files
Pekka Honkanen <phonkane at cc dot hut dot fi> - testing and feedback
Stephane Lentz <Stephane dot Lentz at ansf dot alcatel dot fr> - RPM spec file

And many others not mentioned here (see the file HISTORY) for reporting bugs,
giving feedback, etc.


8. CONTACT

All comments/suggestions/diffs via email to sma@www.klake.org

Home