This document contains notes on new features in Amanda 2.3 that may not yet be fully documented elsewhere. * INDEXING BACKUPS FOR EASIER RESTORE Read more about this in the file named INDEXING. * SAMBA SUPPORT Read more about this in the file named SAMBA. * GNUTAR SUPPORT Amanda now supports dumps made via Gnu TAR. To use this, set your dumptypes set the program name to "GNUTAR": dumptype tar-client { .... program "GNUTAR" } Since Gnu TAR does not maintain a dumpdates file itself, nor give an estimate of backup size, those need to be done within Amanda. Amanda maintains an /etc/amandates file to track the backup dates analogously to how dump does it. NOTE: if your /etc directory is not writable by your dumpuser, you'll have to create the empty file initially by hand, and make it writable by your dumpuser ala /etc/dumpdates. NOTE: Since tar traverses the directory heirarchy and reads files as a regular user would, it must run as root. The two new amanda programs "calcsize" and "runtar" therefore must be installed setuid root. I've made them as simple as possible to to avoid potential security holes. * MULTIPLE BACKUPS IN PARALLEL FROM ONE CLIENT HOST A new "maxdumps" parameter for the conf file gives the default value for the amount of parallelism per client: maxdumps 2 # default max num. dumps to do in parallel per client If this default parameter is not specified, the default for the default :-0 is 1. Then, you can override the parameter per client through the dumptype, eg: dumptype fast-client { .... maxdumps 4 } If the "maxdumps" parameter isn't given in the dumptypes, the default is used. The idea is that maxdumps is set roughly proportional to the speed of the client host. You probably wont get much benefit from setting it very high, but all but the slowest hosts should be able to handle a maxdumps of at least 2. Amanda doesn't really have any per-host parameters, just per-disk, so the per-client-host maxdumps is taken from the last disk listed for that host. Just to make things more complicated, I've added the ability to specify a "spindle number" for each filesystem in the disklist file. For example: wiggum / fast-comp-user 0 wiggum /usr fast-comp-user 0 wiggum /larry fast-comp-user 1 wiggum /curly fast-comp-user 1 wiggum /moe fast-comp-user 1 wiggum /itchy fast-comp-user 2 wiggum /scratchy fast-comp-user 3 The spindle number represents the disk number, eg every filesystem on sd0 can get a spindle number of 0, everything on sd1 gets spindle 1, etc (but there's no enforced requirement that there be a match with the underlying hardware situation). Now, even with a high maxdumps, amanda will refrain from scheduling two disks on the same spindle at the same time, which would just slow them both down by adding a lot of seeks. The default spindle if you don't specify one is -1, which is defined to be a spindle that doesn't interfere with itself. That is if you don't specify any spindle numbers, any and all filesystems on the host can be scheduled concurrently up to the maxdumps. Just to be clear, there's no relation between spindle numbers and maxdumps: number the spindles by the disks that you have, even if that's more than maxdumps. Also, I'm not sure that putting spindle numbers everywhere is of much value: their purpose is to prevent multiple big dumps from being run at the same time on two partitions on the same disk, on the theory that the extra seeking between the partitions would cause the dumps to run slower than they would if they ran sequentially. But, given the client-side compression and network output that must occur between blocks read from the disk, there may be enough slack time at the disk to support the seeks and have a little parallelism left over to do some good. * MULTIPLE TAPES IN ONE RUN I've rewritten the taper - it now supports one run spanning multiple tapes if you have a tape-changer. The necessary changes in support of this have also been made to driver and reporter - planner already had support. There are a couple other places that should probably be updated, like amcheck. Dumps are not split across tapes - when taper runs into the end of a tape, it loads the next tape and tells driver to try sending the dump again. If you are feeling brave, set "runtapes" to something other than 1. The new taper also keeps the tape open the entire time it is writing the files out - no more having amchecks or other accesses/rewinds in the middle of the run screw you royally if they hit when the tape is closed for writing a filemark. * BOTTLENECK DETERMINATION I've made some experimental changes to Driver to determine what the bottleneck is at any time. Since Amanda tries to do many things at once, it's hard to pinpoint a single bottleneck, but I _think_ I've got it down well enough to say something useful. For now it just outputs the current bottleneck as part of its "driver: state" line in the debug output, but once I'm comfortable with its conclusions, I'll output it to the log file and have the reporter generate a nice table. The current choices are: not-idle - if there were dumps to do, they got done no-dumpers - there were dumps to do but no dumpers free no-hold - there were dumps to do and dumpers free but the dumps couldn't go to the holding disks (no-hold conf flag) no-diskspace - no diskspace on holding disks no-bandwidth - ran out of bandwidth client-constrained - couldn't start any dumps because the clients were busy * 2 GB LIMIT REMOVED I've fixed the 2-gig limits by representing sizes in Kbytes instead of bytes everywhere. This gives us a new 2 TB dump-file size limit (on 32bit machines), which should last us a couple more years. This seemed preferable to me than going to long-long or some other non-portable type for the size. * AMADMIN IMPORT/EXPORT Amadmin now has "import" and "export" commands, to convert the curinfo database to/from text format, for: moving an amanda server to a different arch, compressing the database after deleting lots of hosts, or editing one or all entries in batch form or via a script. --------------------------------------------------------------------------- Here's the old 2.2.x stuff from this file. I'm pretty sure most of this is in the mainline documentation already. --------------------------------------------------------------------------- This document contains notes on new features in Amanda 2.2 that may not yet be fully documented elsewhere. * CLIENT SIDE SETUP HAS CHANGED The new /etc/services lines are: amanda 10080/udp # bsd security amanda daemon kamanda 10081/udp # krb4 security amanda daemon The new /etc/inetd.conf lines are: amanda dgram udp wait /usr/local/libexec/amanda/amandad amandad kamanda dgram udp wait /usr/local/libexec/amanda/amandad amandad -krb4 (you don't need the vanilla amanda lines if you are using kerberos for everything, and vice-versa) * VERSION SUFFIXES ON EXECUTABLES The new USE_VERSION_SUFFIXES define in options.h controls whether to install the Amanda executables with the version number attached to the name, eg "amdump-2.2.1". I recommend that you leave this defined, since the this allows multiple versions to co-exist - particularly important while amanda 2.2 is under development. You can always symlink the names without the version suffix to the version you want to be your "production" version. * KERBEROS Read the comments in file docs/KERBEROS for how to configure the kerberos version. With KRB4_SECURITY defined, there are two new dumptype options: krb4-auth use krb4 auth for this host (you can mingle krb hosts & bsd .rhosts in one conf) kencrypt encrypt this filesystem over the net using the krb4 session key. About 2x slower. Good for those root partitions containing your keyfiles. Don't want to give away the keys to an ethernet sniffer! * MULTIPLE HOLDING DISKS You can have more than one holding disk for those really big installations. Just add extra diskdir and disksize lines to your amanda.conf: diskdir "/amanda2/amanda/work" # where the holding disk is disksize 880 MB # how much space can we use on it diskdir "/dumps/amanda/work" # a second holding disk! disksize 1500 MB Amanda will load-balance between the two disks as long as there is space. Amanda now also actually stats files to get a more accurate view of available and used disk space while running. * REMOTE SELF-CHECKS Amcheck will now cause self-checks to run on the client hosts, quickly detecting which hosts are up and communicating, which have permissions problems, etc. This is amazingly fast for what it does: here it checks more than 130 hosts in less than a minute. My favorite gee-whiz new feature! The new -s and -c options control whether server-only or client-only checks are done. * MMAP SUPPORT System V shared memory primitives are no longer required on the server side, if your system has a version of mmap() that will allocate anonymous memory. BSD 4.4 systems (and OSF/1) have an explicitly anonymous mmap() type, but others (like SunOS) support the trick of mmap'ing /dev/zero for the same effect. Amanda should work with both varieties. Defined HAVE_SYSVSHM or HAVE_MMAP (or both) in config.h. If you have both, SYSVSHM is selected (simply because this code in Amanda is more mature, not because the sysv stuff is better). * GZIP SUPPORT This was most requested feature #1; I've finally slipped it in. Define HAVE_GZIP in options.h. See options.h-vanilla for details. There are two new amanda.conf dumptype options "compress-fast" and "compress-best". The default is "compress-fast". With gzip, compress-fast seems to always do better than the old lzw compress (in particular it will never expand the file), and runs faster too. Gzip's compress-best does very good compression, but is about twice as slow as the old lzw compress, so you don't want to use it for filesystems that take a long time to full-dump anyway. * MOUNT POINT NAMES IN DISKLIST Most requested feature #2: You can specify mount names in the disklist instead of dev names. The rule is, if the filesystem name starts with a slash, it is a mount point name, if it doesn't, it is a dev name, and has DEVDIR prepended. For example: obelix sd0a # dev-name: /dev/sd0a obelix /obelix # mount name: /obelix, aka /dev/sd0g * INITIAL TAPE CHANGER SUPPORT INCLUDED A new amanda.conf parameter, tpchanger, controls whether Amanda communicates with a tape changer program to load tapes rather than just opening the tapedev itself. The tpchanger parameter is a string which specifies the name of a program that follows the API specified in docs/TAPE.CHANGERS. Read that doc for more information. * GENERIC TAPE CHANGER WRAPPER SCRIPT An initial tape-changer glue script, chg-generic.sh, implements the Amanda changer API using an array of tape devices to simulate a tape changer, with the device names specified via a conf file. This script can be quickly customized by inserting calls tape-changer-specific programs at appropriate places, making support for new changers painless. If you know what command to execute to get your changer to put a particular tape in the drive, you can get Amanda to support your changer. The generic script works as-is for sites that want to cascade between two or more tape drives hooked directly up to the tape server host. It also should work as-is with tape-changer drivers that use separate device names to specify the slot to be loaded, wheres simply opening the slot device causes the tape from that slot to be loaded. chg-generic has its own small conf file. See example/chg-generic.conf for a documented sample. * NEW "amtape" COMMAND The amtape is the user front-end to the amanda tape changer support facilities. The operators can use amtape to load tapes for restores, position the changer, see what amanda tapes are loaded in the tape rack, and see which tape would be picked by taper for the next amdump run. No man page yet, but running amtape with no arguments gives a detailed usage statement. See docs/TAPE.CHANGERS for more info. * CHANGER SUPPORT ADDED TO "amlabel" COMMAND The amlabel command now takes an optional slot argument for labeling particular tapes in the tape rack. See docs/TAPE.CHANGERS for more info. * TAPE CHANGER SUPPORT IMPROVED The specs in docs/TAPE.CHANGERS has been updated, and the code changed to match. The major difference is that Amanda no longer assumes slots in the tape rack are numbered from 0 to N-1. They can be numbered or labeled in any manner that suits your tape-changer, amanda doesn't care what the actual slot names are. Also added "first" and "last" slot specifiers, and an -eject command. The chg-generic.sh tape changer script now has new "firstslot", "lastslot", and "needeject" parameters for the chg-generic.conf file. It now keeps track of whether the current slot is loaded into the drive, so that it can issue an explicit eject command for those tape changers that need one. See example/chg-generic.conf for more info. * A FEW WORDS ABOUT MULTI-TAPE RUNS I'm still holding back on support for multiple tapes in one run. I'm not yet completely happy with how Amanda should handle splitting dumps across tapes (eg when end-of-tape is encountered in the middle of a long dump). For example, this creates issues for amrestore, which currently doesn't know about configurations or tape changers --- on purpose, so that you can do restores on any machine with a tape drive, not just the server, and so that you can recover with no online databases present. However, because the current snapshot DOES support tape changers, and multiple runs in one day, some of the benefit of multi-tape runs can be had by simply running amanda several times in a row. Eg, to fill three tapes per night, you can put amdump ; amdump ; amdump in you crontab. On the down side, this will generate three reports instead of one, will do more incremental dumps than necessary, and will run slower. Not very satisfying, but if you *need* to fill more than one tape per day NOW, it should work. * BIG PLANNER CHANGES The support for writing to multiple tapes in one run is almost finished now. See docs/MULTITAPE for an outline of the design. The planner support for this is included in this snapshot, but the taper part is not. There is a new amanda.conf variable "runtapes" which specifies the number of tapes to use on each amdump run. For now this should stay at 1, the default. Also, the old "mincycle" and "maxcycle" amanda.conf variables are deprecated, but still work for now. "maxcycle" was never used, and "mincycle" is now called "dumpcycle". There are two visible differences in the new planner: First, Planner now thinks in real-time, rather than by the number of tapes as before. That is, a filesystem is due for a full backup once every days, regardless of how many times amanda is run in that interval. As a consequence, you need to make sure the dumpcycle variable marks real time instead of the number of days. For example, previously "mincycle 10" worked for a two week cycle if you ran amdump only on weekdays (for 10 runs in a cycle). Now a two week cycle must be specified as "dumpcycle 14" or "dumpcycle 2 weeks". The "2 weeks" specifier works with both the old and new versions of planner, because previously "weeks" multiplied by 5, and now it multiplies by 7. Second, planner now warns about impending overwrites of full backups. If a filesystem's last full backup is on a tape that is due to be overwritten in the next 5 runs, planner will give you a heads-up about it, so that you can restore the filesystem somewhere, or switch that tape out of rotation (substitute a new tape with the same label). This situation often occurs after a hardware failure brings a machine or disk down for some days. * LEVEL-0 DUMPS ALLOWED WITH NO TAPE If there is no tape present (or the tape drive fails during dumping), Amanda switches to degraded mode. In degraded mode, level-0 dumps are not allowed. This can be a pain for unattended sites over the weekend (especially when there is a large holding disk that can hold any necessary dumps). Amanda now supports a new configuration file directive, "reserve". This tells Amanda to reserve that percentage of total holding disk space for degraded mode dumps. Example: your total holding disk space adds up to 8.4GB. If you specify a reserve of 50, 4.2GB (50%) of the holding disk space will be allowed to be used for regular dumps, but if that limit is hit, Amanda will switch to degraded dumps. For backward compatibility, if no 'reserve' keyword is present, 100 will be assumed (e.g. never do full dumps if degraded mode is in effect). NOTE: this percentage applies from run to run, so, as in the previous example, when Amanda runs the next day, if there is 3.8GB left on the holding disk, 1.9GB will be reserved for degraded mode dumps (e.g. the percentage keeps sliding).