Automating Application Installation on Linux with Python

Perhaps you have a shiny new web application. It responds to HTTPS requests, delivering pure awesome in response. You would like to install the application on a Linux server, perhaps in Amazon EC2.

Performing the installation by hand is a Bad Idea™. Manual installation means you cannot easily scale across multiple machines, you cannot recover from failure, and you cannot iterate on the machine configuration.

You can automate the installation process with Python. The following are examples of procedures that introduce principles for automation. This is not a step-by-step guide for an entire deployment, but it will give you the tools you need to build your own.

Connecting via SSH with Paramiko

A manual installation process might involve executing lots of commands inside an SSH session. For example:

$ sudo apt update
$ sudo apt install nginx

All of your hard-won SSH skills can be transferred to a Python automation. The Paramiko library offers SSH interaction inside Python programs. I like to shorthand my use of Paramiko by wrapping it in a little container:

"""ssh_session.py"""

from paramiko import SSHClient
from paramiko import SFTPClient
from paramiko import AutoAddPolicy

class SSHSession:
    """Abstraction of a Paramiko SSH session"""
    def __init__(
        self,
        hostname: str,
        keyfile: str,
        username: str
    ):

        self._ssh = SSHClient()
        self._ssh.set_missing_host_key_policy(
            AutoAddPolicy()
        )
        self._ssh.connect(
            hostname,
            key_filename=keyfile,
            username=username
        )

        return

    def execute(self, command: str) -> str:
        """Return the stdout of an SSH command"""
        _, stdout, _ = self._ssh.exec_command(
            command
        )
        return stdout.read().decode('utf-8')

    def open_sftp(self) -> SFTPClient:
        """Return an SFTP client"""
        return self._ssh.open_sftp()

We use paramiko.AutoAddPolicy to accept the server’s host key automatically when it is not already known. This effectively answers ‘yes’ to the prompt you would see when initiating a first-time connection in an interactive terminal:

The authenticity of host 'some.fqdn.com
(172.16.101.244)' can't be established. ECDSA key
fingerprint is 
SHA256:PsldJAyjANGYeLiHNPknfI95CNxvaCmeC4HWSEe6+Y.
Are you sure you want to continue connecting
(yes/no)?

You should only do this if you have otherwise secured the network path to your server. If you have not, connect manually first via a terminal and check the key fingerprint.

We initialise an SSHSession instance with a set of parameters that conveniently match what you might already have in your SSH config file. For example:

$ cat ~/.ssh/config
Host some_friendly_server_name
    HostName some.fqdn.com
    User hugh
    IdentityFile ~/.ssh/some_private_key

The matching Paramiko session would be:

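from os.path import expanduser

from ssh_session import SSHSession

# Keyword arguments guard against mixing up the order.
# Expand '~' ourselves rather than rely on Paramiko.
SSH = SSHSession(
    hostname='some.fqdn.com',
    keyfile=expanduser('~/.ssh/some_private_key'),
    username='hugh'
)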

We now have a convenient little object that can run SSH commands for us. Note that the object discards the stderr stream returned by paramiko.SSHClient.exec_command() and never checks whether the command succeeded. While this is convenient when we are confident of our commands, it makes debugging difficult. I recommend debugging in an interactive SSH session rather than in Python.
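
If you would rather have failures surface in Python, one option is to check the exit status of each command. The following is a minimal sketch, not part of the SSHSession class above: run_checked and CommandError are hypothetical names, and client is assumed to be a connected paramiko.SSHClient. It relies on stdout.channel.recv_exit_status(), which waits for the remote command to finish. It suits commands with modest output, because stdout is read in full before stderr.

```python
class CommandError(RuntimeError):
    """Raised when a remote command exits non-zero."""


def run_checked(client, command: str) -> str:
    """Return stdout of a command, raising if it failed.

    `client` is assumed to be a connected
    paramiko.SSHClient (hypothetical helper).
    """
    _, stdout, stderr = client.exec_command(command)
    # Drain both streams before asking for the status
    out = stdout.read().decode('utf-8')
    err = stderr.read().decode('utf-8')
    status = stdout.channel.recv_exit_status()
    if status != 0:
        raise CommandError(
            f'{command!r} exited with {status}: '
            f'{err.strip()}'
        )
    return out
```

The same check could live inside SSHSession.execute() if you prefer every command to fail loudly.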

Installing Dependencies

Let’s start by installing Nginx and Git. You could substitute these with any dependency of your application.

_ = SSH.execute('sudo apt update')
_ = SSH.execute('sudo apt install nginx -y')
_ = SSH.execute('sudo apt install git -y')

Note the ‘-y’ at the end of the apt install command. Without it, the session will hang at the apt confirmation prompt:

After this operation, 4,816 kB of additional disk
space will be used.

Do you want to continue? [Y/n]

The requirement to bypass interactive prompts will be a common thread throughout this article. When automating your process, step through it manually and take careful note of where interactive prompts are required.
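
On Debian-family systems, some packages also ask configuration questions during installation, which ‘-y’ alone does not answer. A hedged sketch of one way to suppress those as well: apt_install_command is a hypothetical helper, and it assumes the DEBIAN_FRONTEND=noninteractive setting is acceptable for your packages (they will be configured with their defaults).

```python
import shlex


def apt_install_command(*packages: str) -> str:
    """Build a prompt-free apt-get install command."""
    if not packages:
        raise ValueError('at least one package required')
    quoted = ' '.join(shlex.quote(p) for p in packages)
    # `env` sets the variable for apt-get, after sudo
    return (
        'sudo env DEBIAN_FRONTEND=noninteractive '
        'apt-get install -y ' + quoted
    )


print(apt_install_command('nginx', 'git'))
```

The resulting string can be passed to SSH.execute() in place of the two separate install commands above.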

Creating a Linux User

Our application should, of course, run under its own user. Let’s automate that process:

APP_USER = 'farquad'
command = 'sudo adduser --system --group '
command += APP_USER
_ = SSH.execute(command)

Note that we establish the username as a constant, and don’t hardcode it into our command. This external definition, whether through a constant or a function parameter or however else it is done, is important for several reasons.

  1. It allows you to re-use the command with multiple parameters. For example, perhaps your application requires multiple users.
  2. It implements the Don’t-Repeat-Yourself (DRY) principle. We will likely need the username elsewhere, and by defining it externally we have created a single source of authority.
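
To illustrate the first point, here is a small sketch that turns the command into a function. The name adduser_command and the second username are hypothetical; shlex.quote is used so that an unusual username cannot be misread by the remote shell.

```python
import shlex


def adduser_command(username: str) -> str:
    """Build the command that creates a system user."""
    return (
        'sudo adduser --system --group '
        + shlex.quote(username)
    )


# 'duloc' is a made-up second user for illustration
for user in ('farquad', 'duloc'):
    print(adduser_command(user))
```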

Automating File Transfer using Paramiko SFTP

Suppose your application is stored in a Git repository on a hosting service like Bitbucket or GitHub, and that the repository is private. It is no use having an automated installation process if you need to answer an HTTPS password prompt when pulling a repository.

Instead, let’s automate the process by using SSH and installing a repository key on the machine. First, the transfer process:

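from os.path import expanduser

# open() does not expand '~', so expand it explicitly
KEY_FILEPATH = expanduser('~/some/key')
with open(KEY_FILEPATH, 'r') as keyfile:
    key = keyfile.read()

sftp = SSH.open_sftp()
# SFTP does not expand '~' either. A relative path is
# resolved against the directory the session starts in,
# normally the connecting user's home directory.
remote_file = sftp.file('repository_key', 'w')
remote_file.write(key)
remote_file.flush()
remote_file.close()
sftp.close()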

Note that we first SFTP’d the key into our privileged user’s home directory, rather than directly into the application user’s directory. This is because our privileged user does not have permission to write into the application user’s home directory without sudo elevation, which we can’t do in the SFTP session.

Let’s move it into the appropriate place, and modify permissions appropriately:

command = 'sudo mkdir /home/' + APP_USER + '/.ssh'
_ = SSH.execute(command)

command = 'sudo mv ~/repository_key'
command += ' /home/' + APP_USER + '/.ssh/'
_ = SSH.execute(command)

command = 'sudo chmod 600 /home/' + APP_USER
command += '/.ssh/repository_key'
_ = SSH.execute(command)

The file is now in the appropriate location, with the appropriate permissions. We repeat the process to install an SSH configuration file. I won’t lay out the entire process, but the principle is the same: open an SFTP session, plop the file on the server, then move it and re-permission it as necessary.

There is one important consideration. Because we have been creating directories as our privileged user, we need to turn over those directories to the application user:

command = 'sudo chown -R '
command += APP_USER + ':' + APP_USER
command += ' /home/' + APP_USER + '/.ssh'
_ = SSH.execute(command)
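
Since the same move, chmod and chown sequence repeats for every file, it can be collected into one place. The following is a sketch only: install_key_commands is a hypothetical helper that returns the commands rather than running them, and it assumes the file was uploaded to the privileged user’s home directory as described above. The -p flag on mkdir is an addition that makes re-runs harmless.

```python
import shlex


def install_key_commands(
    app_user: str,
    uploaded_name: str
) -> list:
    """Commands that move an uploaded file into the
    application user's .ssh directory."""
    ssh_dir = '/home/' + app_user + '/.ssh'
    target = ssh_dir + '/' + uploaded_name
    owner = app_user + ':' + app_user
    q = shlex.quote
    return [
        'sudo mkdir -p ' + q(ssh_dir),
        # '~' stays unquoted so the shell expands it
        'sudo mv ~/' + q(uploaded_name)
        + ' ' + q(ssh_dir) + '/',
        'sudo chmod 600 ' + q(target),
        'sudo chown -R ' + q(owner) + ' ' + q(ssh_dir),
    ]
```

Each returned string can then be passed to SSH.execute() in order, once per uploaded file.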

In the end, there should be an SSH configuration file on the server owned by the application user. Here is an example, using all the same names we have been using so far:

$ cat /home/farquad/.ssh/config
Host bitbucket.org
    HostName bitbucket.org
    User git
    IdentityFile ~/.ssh/repository_key
$ ls -la /home/farquad
//
drwxrwxr-x 2 farquad farquad 4096 Mar 23 18:47 .ssh
//

Pulling the Repository

The next step is easy mode. You’ve set things up such that your wonderful application can be pulled down in a single command:

REPOSITORY = 'git@bitbucket.org:super/app.git'
command = 'cd /home/' + APP_USER
command += '; sudo -u ' + APP_USER
command += ' git clone ' + REPOSITORY
_ = SSH.execute(command)

Well, maybe almost easy mode. There’s a bit going on here. Note the separation of commands via a semicolon. Consider your Python SSH connection to be a very ‘loose’ one. It won’t retain environment information, including current directory, between executions. Therefore, to use conveniences like cd, we chain commands with semicolons.

Also note the sudo -u farquad. We do this so that the git repository is pulled down as the property of our application user, not our privileged user. This saves us all the dicking about with permissions that plagued the SFTP steps above.
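
One caveat with semicolons is that the second command runs even if the first one fails, so a failed cd would clone the repository into the wrong directory. A possible variation, sketched here with a hypothetical chain helper, is to join the commands with the shell’s && operator so that each step runs only if the previous one succeeded:

```python
def chain(*commands: str) -> str:
    """Join commands so each runs only if the
    previous one succeeded."""
    return ' && '.join(commands)


APP_USER = 'farquad'
REPOSITORY = 'git@bitbucket.org:super/app.git'
command = chain(
    'cd /home/' + APP_USER,
    'sudo -u ' + APP_USER + ' git clone ' + REPOSITORY
)
print(command)
```

This prints cd /home/farquad && sudo -u farquad git clone git@bitbucket.org:super/app.git, which is still a single string for a single execution.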

Paramiko and Virtual Environments, like Virtualenv

The ‘loose’ nature of the Paramiko session referenced above becomes particularly important when working with virtual environments. Consider the following:

$ virtualenv -p python3 /home/farquad/app
$ cd /home/farquad/app
$ source bin/activate
(app) $ pip install gunicorn

If executed as distinct commands via a Paramiko SSH session, the activation does not carry over to the pip command, and the gunicorn library will end up installed via the systemwide pip. If you then attempt to run the application inside the virtual environment, say inside a systemd unit file…

//
[Service]
User=farquad
Group=farquad
WorkingDirectory=/home/farquad/app
Environment="PATH=/home/farquad/app/bin"
//

… then your application will fail because gunicorn is missing from the virtual environment. Instead, be sure to execute commands that require a particular environment atomically, in a single execution.
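
One way to keep such a step atomic is to skip activation altogether and call the virtual environment’s own pip by its full path, which installs into that environment. This is a sketch under that assumption; venv_pip_install is a hypothetical helper, and the paths reuse the names from the example above.

```python
import shlex


def venv_pip_install(
    venv_path: str,
    app_user: str,
    *packages: str
) -> str:
    """Build one command that installs packages with
    the virtual environment's own pip."""
    pip = venv_path.rstrip('/') + '/bin/pip'
    quoted = ' '.join(shlex.quote(p) for p in packages)
    return (
        'sudo -u ' + shlex.quote(app_user) + ' '
        + shlex.quote(pip) + ' install ' + quoted
    )


print(venv_pip_install(
    '/home/farquad/app', 'farquad', 'gunicorn'
))
```

Because the whole step is one command string, nothing depends on state surviving between executions.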

Your Move!

Once your application deployment is automated, you have freed yourself from having to trudge through SSH command sequences every time you want to adjust your deployment. The fear of breaking a server disappears, because you can fire up a replacement at will. Enjoy the freedom!