Python for Shell Scripting

Posted on December 4, 2012



CC-by-SA

Recently I had to come up with a development environment for unit and regression testing of legacy embedded C code, which required writing good old integration/glue sysadmin-like scripts.

One thing I’ve always believed is that once my shell scripting code begins to accumulate hundreds of SLOCs across shell scripts, I’ve gone past the modular capacity of shell scripting.

“Who cares?” some of you will say, for it’s just glue code, throw-away code. Unfortunately, code is rarely truly throw-away. Stuff tends to stick and promote itself into production/support code. Such is the nature of software. And even when it is throw-away, it is never written in one single shot: bugs (and their consequences when executed) have to be fixed.


All hail teh Python!

So I decided once and for all to give Python a try for writing sysadmin/infrastructure support code.  I chose Python over Ruby or Lua since (as far as I can tell) Python has better capabilities to integrate and interact with the underlying OS. I’m talking superficialities here, btw.

Similarly, the argument applies to my choice of Python over Groovy or Scala: they are nice for scripting, but interacting with the OS (in particular, their ability to fork processes) is a little too clunky for my taste.

Caveat emptor obviously, but I like the results of using Python for these types of tasks.

Granted, it will always be easier to write small algorithms in Bash with fewer lines of code. Furthermore, Bash shell scripts have the simplest and cleanest notation for I/O redirection and pipelining (something that can still be remedied by exploiting Python’s modular capabilities).
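As a taste of what that remedy looks like (a sketch I am adding purely for illustration, not part of the original scripts): the Bash pipeline `printf 'foo\nbar\n' | grep foo` can be rebuilt with subprocess. Verbose, yes, but every plumbing decision becomes explicit:

```python
import subprocess

# the Bash pipeline `printf 'foo\nbar\n' | grep foo`, rebuilt by hand
producer = subprocess.Popen(['printf', 'foo\nbar\n'],
                            stdout=subprocess.PIPE)
consumer = subprocess.Popen(['grep', 'foo'],
                            stdin=producer.stdout,
                            stdout=subprocess.PIPE)
producer.stdout.close()  # let printf receive SIGPIPE if grep exits early
out = consumer.communicate()[0]  # b'foo\n'
```

Wrapping this boilerplate in a small helper module is exactly the kind of modular capability I mean.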

Having said that, since I’m looking for readability and maintainability over the lifespan of the thing in question, a language like Python truly delivers.

Some of the stuff I had to develop, as I mentioned before, involved setting up a development environment with specific requirements. This involves installing certain Ubuntu packages, followed by cleanup afterwards and a modicum of security (thou shalt exec as root).

Secondly, even if it is throw-away code, it is never written bug-free the first time. “Throw-away” means the code might only be executed once or a few times to complete a job; it does not mean it is written once and then delivered.

For starters, the gory details of it would go into a module, which, for illustration purposes, I shall call  “teh_project_devsetup” … and nope, ‘teh’ is not a typo 😉

Then the “main” logic would look as follows:

#!/usr/bin/python

import teh_project_devsetup as setup  # the python module with all the stuff in it

def main():
	setup.enforce_run_as_root()
	setup.install_dependencies()
	setup.remove_unused_packages()
	setup.auto_clean_packages_archive()

	print "--- done ---"

if __name__=="__main__":
	main()

 

In this strategy, the first thing we do is ensure this whole enchilada runs as root, via su or sudo perhaps (the typical way to run apt-get). Failure to meet that precondition causes the process to terminate before making any changes to the underlying system.

def enforce_run_as_root():
	"""
	since we use apt-get and other root-level stuff, we use this to enforce an
	EUID of root (or bomb out the process if not)

	"""

	if os.geteuid() != 0:
		print 'You need to sudo/run as root. Cowardly exiting...'
		sys.exit(1)
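If several entry points in the module end up assuming root, the check could even be folded into a reusable decorator. This is a hypothetical sketch, not something in the actual module; the `geteuid` parameter is injectable purely so the guard can be exercised without actually being root:

```python
import functools
import os
import sys

def requires_root(func, geteuid=os.geteuid):
    """wrap func so it refuses to run unless the effective UID is root"""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if geteuid() != 0:
            sys.stderr.write('You need to sudo/run as root. '
                             'Cowardly exiting...\n')
            sys.exit(1)  # bail out before touching the system
        return func(*args, **kwargs)
    return wrapper
```

The injectable check is a testability nicety; in production code the default `os.geteuid` does the work.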

Following that, we would like to execute apt-get for installing new packages (the setup.install_dependencies() function), as well as for auto-removing (setup.remove_unused_packages()) and auto-cleaning (setup.auto_clean_packages_archive()).

The specifics of these apt-get actions are beyond the scope of this blog post, and can easily be found in this age of google 😉

But how do we do this? If we pay close attention, and approach this just as we would when developing non-trivial systems, we realize that we can generalize this to the problem of launching and controlling processes, any processes (not just apt-get).

So we need a way to 1) execute a process, and 2) interact with it. For that, we could use a function like the one below. Nothing fancy. We use the subprocess.Popen object to launch a process; pass arguments and an environment to it; bind its I/O to ours; and return the exit status back to the caller:

#!/usr/bin/python
# teh_project_devsetup.py

import subprocess
import sys
import os

def run_process_and_wait(args_string) :
	"""
	creates a process (passing zero or more arguments),
	mapping its i/o streams to
	the current process i/o streams, and waits for its completion

	"""

	## handle/shortcut to wrap the current process' streams.
	f = {
			'in' : sys.stdin,
			'out' : sys.stdout,
			'err' : sys.stderr
		}

	## defines the process environment
	e = os.environ

	## add a proxy to the environment as needed by apt-get,
	## if you are behind a firewall of course.

	e.update( # obviously, use your actual proxy ip:port
			{'http_proxy' : "http://192.168.253.16:8080"}
			)
	exit_code = subprocess.Popen(
	  args_string,
	  shell=True,
	  env=e, ## actually hand Popen the environment we prepared above
	  stdin=f['in'], stdout=f['out'], stderr=f['err']
	).wait() ## wait for it, mucho importante

	return exit_code
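As a quick sanity check of the pattern (a simplified sketch of my own; the proxy and explicit stream-binding details are omitted, since Popen already inherits the parent's streams by default), the exit status really does come back from wait():

```python
import subprocess

def run_and_wait(command):
    # simplified run_process_and_wait: launch via the shell (the child
    # inherits our stdin/stdout/stderr by default) and block until done
    return subprocess.Popen(command, shell=True).wait()

ok = run_and_wait('true')     # exit status 0
fail = run_and_wait('false')  # non-zero exit status
```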

Then, we can reuse this nugget for defining functions that execute apt-get for installing, auto-removing and auto-cleaning as follows (in this example, also within the same module).

We start with a base function that calls apt-get.  For this exercise, and for my actual case, I needed to run it silently without prompting the user for options. I also wanted it to fix broken dependencies:

def apt_get(package_list) :
	"""
	shell for invoking the apt-get process without prompting
	the user (attempt to fix broken dependencies autonomously).

	"""
	return run_process_and_wait(
		' '.join(
			['apt-get', '-y', '-f'] + package_list))
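It is worth noting what the join produces: a single command string that the shell then re-parses (recall shell=True in run_process_and_wait). A quick illustration, with a couple of package names picked from the list further down:

```python
# the argument list is flattened into one shell command string
args = ['apt-get', '-y', '-f'] + ['install', 'libcunit1-dev', 'lcov']
command = ' '.join(args)  # 'apt-get -y -f install libcunit1-dev lcov'
```

This is fine for trusted, hard-coded package names; anything user-supplied would call for the list form of Popen instead.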

Then we can create auto-removing and auto-cleaning functions based on the apt-get function above. For my specific needs (YMMV) they were the simpler cases:

def remove_unused_packages() :
	""" used for auto-cleaning of unused packages """

	print '--- remove unused packages after install...'
	apt_get(['autoremove'])

def auto_clean_packages_archive() :
	""" used for auto-cleaning downloaded archives """

	print '--- remove old downloaded archives...'
	apt_get(['autoclean'])

Now the fun part is having the function that does the actual installation:

def apt_get_install(package_list) :
	""" shell for installing a list of debian packages """
	print '--- installing\n\t' + ',\n\t'.join(package_list)
	apt_get(['--force-yes', 'install'] + package_list)

def install_dependencies() :
	"""
	creates and waits for an apt-get install process
	which installs all the stuff needed by the harness

	"""
	## packages required for my development environment
	packages=[
		'ddd', # gdb debugger front-end
		'xxgdb',  # older gdb debugger front-end
		'lcov',  # command-line gcov visualizer
		'ggcov', # GUI gcov visualizer
		'graphviz', # graph displayer/renderer
		'swig', # language-binding generator
		'python-pip',  # python package manager
		'valgrind', # memory leak profiler
		'unifdef', # partial/selective C preprocessor
		'libcunit1', # CUnit libraries
		'libcunit1-doc', # CUnit man pages
		'libcunit1-dev', # CUnit headers
		'cdecl', # command-line utility for decoding C/C++ type declarations
		'cproto', # C prototype utility
		'indent', # general indenter
		'cppcheck', # tool for static C/C++ code analysis
		'splint', # secure C static code analyzer
		'doxygen', # source code documenter
		'gperf'    # hash generator
		]
	apt_get_install(packages)

The entire module file is as follows:

#!/usr/bin/python

""" Module: teh_project_devsetup.py """

import subprocess
import sys
import os

def run_process_and_wait(args_string) :
	"""
	creates a process (passing zero or more arguments),
	mapping its i/o streams to
	the current process i/o streams, and waits for its completion

	"""

	## handle/shortcut to wrap the current process' streams.
	f = {
			'in' : sys.stdin,
			'out' : sys.stdout,
			'err' : sys.stderr
		}

	## defines the process environment
	e = os.environ

	## add a proxy to the environment as needed by apt-get,
	## if you are behind a firewall of course.

	e.update( # obviously, use your actual proxy ip:port
			{'http_proxy' : "http://192.168.253.16:8080"}
			)
	exit_code = subprocess.Popen(
	  args_string,
	  shell=True,
	  env=e, ## actually hand Popen the environment we prepared above
	  stdin=f['in'], stdout=f['out'], stderr=f['err']
	).wait() ## wait for it, mucho importante

	return exit_code

def apt_get(package_list) :
	"""
	shell for invoking the apt-get process without prompting
	the user (attempt to fix broken dependencies autonomously).

	"""

	return run_process_and_wait(
		' '.join(
			['apt-get', '-y', '-f'] + package_list))

def apt_get_install(package_list) :
	""" shell for installing a list of debian packages """
	print '--- installing\n\t' + ',\n\t'.join(package_list)
	apt_get(['--force-yes', 'install'] + package_list)

def install_dependencies() :
	"""
	creates and waits for an apt-get install process
	which installs all the stuff needed by the harness

	"""
	## packages required for the SWiSS test harness
	packages=[
		'ddd', # gdb debugger front-end
		'xxgdb',  # older gdb debugger front-end
		'lcov',  # command-line gcov visualizer
		'ggcov', # GUI gcov visualizer
		'graphviz', # graph displayer/renderer
		'swig', # language-binding generator
		'python-pip',  # python package manager
		'valgrind', # memory leak profiler
		'unifdef', # partial/selective C preprocessor
		'libcunit1', # CUnit libraries
		'libcunit1-doc', # CUnit man pages
		'libcunit1-dev', # CUnit headers
		'cdecl', # command-line utility for decoding C/C++ type declarations
		'cproto', # C prototype utility
		'indent', # general indenter
		'cppcheck', # tool for static C/C++ code analysis
		'splint', # secure C static code analyzer
		'doxygen', # source code documenter
		'gperf'    # hash generator
		]
	apt_get_install(packages)

def remove_unused_packages() :
	""" used for auto-cleaning of unused packages """

	print '--- remove unused packages after install...'
	apt_get(['autoremove'])

def auto_clean_packages_archive() :
	""" used for auto-cleaning downloaded archives """

	print '--- remove old downloaded archives...'
	apt_get(['autoclean'])

def enforce_run_as_root():
	"""
	since we use apt-get and other root-level stuff, we use this to enforce an
	EUID of root (or bomb out the process if not)

	"""

	if os.geteuid() != 0:
		print 'You need to sudo/run as root. Cowardly exiting...'
		sys.exit(1)
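One refinement worth considering (my own sketch, not part of the module above): since run_process_and_wait hands back the exit status, the main sequence could abort as soon as one step fails instead of blindly plowing ahead. The idea, with stand-in thunks where the real code would wrap apt_get calls:

```python
def run_steps(steps):
    # run a sequence of (description, thunk) pairs, stopping at the
    # first one whose exit status is non-zero; return that status
    for description, thunk in steps:
        print('--- %s' % description)
        status = thunk()
        if status:  # non-zero means failure
            return status
    return 0

# usage sketch with hypothetical stand-in thunks
status = run_steps([
    ('step one', lambda: 0),
    ('step two', lambda: 2),    # fails here
    ('step three', lambda: 0),  # never reached
])
```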

It is true that for such a contrived (but not-so-contrived) example, I could knock it out faster using Bash shell scripting. However, development is not just about getting things to run (unless you absolutely have to in an emergency). Development is about creating software, however short its lifespan might be, that is understandable, modifiable, and reusable.

If you do production/development support scripting on a regular basis, chances are you run into similar problem/solution combinations just as often. Such situations typically scream for systematic reuse beyond mere copy/pasting.

For those cases, give Python (or a higher-level scripting language) a try. You might end up being both pleasantly surprised and more productive.

