Building better scripts Part I

Over the years, as a sysadmin, I’ve had the pleasure of building many scripts in a variety of administrative languages. I’ve seen even more, some of which exceeded 10,000 lines and really pushed the boundary between what should be done inside a script and what needs a more deliberate software-engineering approach. I’ve seen far more badly written scripts than good ones, so I’d like to take some time to discuss what separates superior automation from the rest. Consider the following output:

C:\>ping blarg
Ping request could not find host blarg. Please check the name and try again.

C:\>echo %errorlevel%
1

C:\>ping 8.8.8.8

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=48ms TTL=118
Reply from 8.8.8.8: bytes=32 time=36ms TTL=118
Reply from 8.8.8.8: bytes=32 time=39ms TTL=118
Reply from 8.8.8.8: bytes=32 time=34ms TTL=118

Ping statistics for 8.8.8.8:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 34ms, Maximum = 48ms, Average = 39ms

C:\>echo %errorlevel%
0

C:\>

What I have just demonstrated is how the OS starts a process (ping.exe with some arguments, in this example) and, once that process ends, hands a return code back to the caller (here, our command prompt). If everything ran as expected, the return code should be zero. If something didn’t execute correctly, it’s up to the author of the executable to have planned for this and to have built other, non-zero return codes into the program (hopefully in a documented or predictable way). The same is true on Linux and every other operating system.
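The mechanism is identical in a POSIX shell, where the special parameter `$?` plays the role of `%errorlevel%`. A minimal sketch (the paths are just illustrations):

```shell
# `$?` holds the exit status of the most recently executed command,
# just as %errorlevel% does in cmd.exe.
ls /no/such/directory >/dev/null 2>&1
echo "exit code after a failing command: $?"      # non-zero

ls / >/dev/null 2>&1
echo "exit code after a successful command: $?"   # prints 0
```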

Sometimes we care about the exit code of the process (the command) we just executed, and sometimes we don’t. My observation, though, has been that the majority of scripts out there blindly execute some commands and happily exit, leaving the user with no log of what happened and no indication that the commands even succeeded. I understand that for a new sysadmin, it may be a difficult process just to pull together a .bat file that appears to produce the desired results. However, the next step in developing those automation skills is to develop a script that adds logging and captures the results of the (important) commands that get executed.
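Staying with the Linux analogue for a moment, the "log it and check it" habit described above can be sketched in POSIX shell like this (the file and log names are made up for the example; a real script would finish with `exit "$status"` so the caller sees the result):

```shell
#!/bin/sh
# Run a command, append its stdout AND stderr to a log file,
# then record whether it succeeded based on its exit code.
LOG=/tmp/nightly.log
: > "$LOG"                                   # start a fresh log for this run

printf 'hello\n' > /tmp/source.txt           # something for the job to copy
cp /tmp/source.txt /tmp/dest.txt >>"$LOG" 2>&1
status=$?

if [ "$status" -eq 0 ]; then
    echo "copy succeeded (exit code 0)" >>"$LOG"
else
    echo "copy failed with exit code $status" >>"$LOG"
fi
```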

Consider the following .bat file:

rem Start with both "failure" bits set; clear one per command that succeeds.
set result=3

rem %~dp0 expands to the directory this .bat file lives in.
%~dp0blarg.exe -blarg "blarg" -moreBlarg "blargBlarg" >>%~dp0log.txt 2>&1
set blargResult=%errorlevel%

%~dp0blarg2.exe >>%~dp0log.txt 2>&1
set blarg2Result=%errorlevel%

rem 0 = both ok, 1 = blarg failed, 2 = blarg2 failed, 3 = both failed.
if "%blargResult%"=="0" set /a result=%result%-1
if "%blarg2Result%"=="0" set /a result=%result%-2

exit %result%

In the above example, we’ve aggregated the results of the two executables we are calling (blarg.exe and blarg2.exe presumably do something important), and we return a non-zero code if either of them fails. Specifically: if blarg.exe exits with a non-zero return code, our return code is 1; if blarg2.exe exits with a non-zero return code, our return code is 2; and if both processes fail, a return code of 3 is passed from the .bat to whatever called it. The other thing we’re doing is redirecting STDOUT and STDERR to a log file, so the sysadmin running the .bat file down the road can investigate why things failed.
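For readers on Linux, the same aggregation trick translates directly to POSIX shell. In this self-contained sketch, `true` and `false` stand in for blarg.exe and blarg2.exe (one succeeding, one failing), and the log path is made up:

```shell
#!/bin/sh
# Start with all failure bits set (3) and clear one bit per command
# that exits successfully, mirroring the .bat file above.
LOG=/tmp/blarg.log

run_jobs() {
    result=3
    true  >>"$LOG" 2>&1                      # stand-in for blarg.exe (succeeds)
    [ $? -eq 0 ] && result=$((result - 1))
    false >>"$LOG" 2>&1                      # stand-in for blarg2.exe (fails)
    [ $? -eq 0 ] && result=$((result - 2))
    return "$result"                         # 0=both ok, 1=first failed,
}                                            # 2=second failed, 3=both failed

run_jobs
echo "aggregate exit code: $?"               # prints: aggregate exit code: 2
```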

In other words, even with a simple .bat file, there is no excuse not to have a basic verification and logging mechanism that records the output. This is super-basic, but for an entry-level sysadmin looking to expand their scripting and automation capabilities, it is most certainly the place to start.
