May 22, 2016
Developer Skills You Must Know (But Didn't Learn in College) | Part 4: Libraries and Package Managers

On March 22nd I gave a presentation to ASU's Software Developers Association which I called "What My Professors Didn't Teach Me - Developer Skills: What They Are, Why They're Important and What You Should Know About Them". The presentation provided an overview of seven different "developer skills" that I believe are necessary to succeed as a professional software developer. I define "developer skills" as those aptitudes between the "hard" technical skills of programming / hacking and the "soft" interpersonal skills of teamwork and communication. They are the things that a developer needs to know to succeed at the job outside of actually building software.

Over the next several weeks I will be writing a post on each of these proficiencies; each post will describe what the developer skill is, why I consider it to be necessary knowledge for the workplace, and what specific things you should learn. I plan to have links to tutorials or other websites that you can use to follow up.

The posts will be as follows:

Part 0: Introduction
Part 1: Version Control
Part 2: Ticketing Software
Part 3: Multi-Branch Development Workflow
Part 4: Libraries and Package Managers
Part 5: Working with Remote Computers
Part 6: Communication Between Software
Part 7: Securing Your Communication

The first three skills we've discussed so far (Version Control, Ticketing Software and Multi-Branch Development Workflow) all fall under the same category of "Project Management" skills. The remaining four, including today's topic, might be called "General Tech Skills". These are mostly items that a developer should have a general awareness and understanding of. Often you'll need to utilize one or more of these items as part of normal development, so you will need to know what they are and how they can be used; but until you actually have a use case I don't think it's necessary to dig too deep into them.

As usual, I don't find that these items are really taught in a normal CS degree. These topics are all about utilizing code or functionality beyond what you yourself write or have access to, but this is anathema to a college assignment, for obvious reasons.

The subject of this post is libraries and package managers - how we use "packages" of common functionality written by other people, and how we actually get them.

What Is This?

A good developer should always make a practice of not reinventing the wheel. If a piece of functionality has already been written, and you can reuse it without legal or ethical consequences, then reuse it. This saves you time, and therefore saves your employer money. The importance and benefits of code reuse were recognized long ago, and the programming field has long made functionality available in the form of "packages" or "libraries" that you can download and use in your own projects.

So what kind of libraries are there? To my mind there are two (not mutually exclusive) types:

  1. System Interfaces: these are libraries that make it easy for you to work with a certain system. For example, most major languages have packages to make it very easy to communicate with different databases - in fact, often these packages are maintained by the same folks who wrote the database itself. Another example might be a package that makes it easier to send or receive HTTP messages. The goal here is to simplify and abstract complicated communication procedures.
  2. General Functionality: in contrast to the system interfaces packages, general functionality modules are more about providing common functionality so you don't have to write it yourself. An example might be a library that performs a sort algorithm or encrypts / decrypts certain data. 

Note that there is a distinction between "libraries" and functionality built into a system. Libraries usually refer to things that you download and apply to an existing system; if it comes with the system, it's not a library. However, this distinction can get quite blurry, as many systems provide a way to automatically download and add packages - more on this later.
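To make the two categories concrete, here's a small Python sketch. One caveat: I'm using modules that ship with Python (sqlite3, hashlib and the built-in sorted), so by the distinction above they're technically "built in" rather than downloaded libraries. I chose them only so the example runs without installing anything; the table and names are made up, and hashing stands in for the encryption example.

```python
import hashlib
import sqlite3

# "System interface": sqlite3 hides the details of talking to a SQLite database.
conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("carol",), ("alice",), ("bob",)],
)
names = [row[0] for row in conn.execute("SELECT name FROM users")]
conn.close()

# "General functionality": sorting and hashing that somebody else already wrote.
sorted_names = sorted(names)
digest = hashlib.sha256(b"hello").hexdigest()

print(sorted_names)  # ['alice', 'bob', 'carol']
print(len(digest))   # 64 hex characters for a SHA-256 digest
```

Notice that at no point did I have to know how SQLite stores rows or how SHA-256 works internally - that's the whole appeal.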

In addition to alleviating the need to reinvent the wheel, major, well-written libraries also tend to be far more secure and stable than anything you yourself could realistically write. These libraries are often community curated and open source, meaning that you're getting the expertise of the entire development community with a package. Active use of these libraries by many different people means that bugs tend to be found - and fixed - quickly.

So how do we install libraries? One way is to simply download the code and integrate it into your project - but many languages and systems include a package manager to assist with this. The package manager automatically downloads a package, integrates it into your project, and keeps track of the package, usually all with a single command.
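If it helps to see those three jobs (download, integrate, keep track) spelled out, here's a toy sketch in Python. To be clear, this is not how any real package manager is implemented: the REGISTRY dictionary, the package names and the install function are all hypothetical stand-ins I made up for illustration.

```python
# Hypothetical registry: package name -> available versions, oldest first.
# A real package manager would query a remote server instead.
REGISTRY = {
    "example-db": ["1.0.0", "1.1.0"],
    "example-http": ["2.3.1"],
}


def install(name, manifest, installed):
    """Toy 'install': look the package up, 'download' it, and record it."""
    if name not in REGISTRY:
        raise KeyError(f"no such package: {name}")
    version = REGISTRY[name][-1]                      # newest version is listed last
    installed[name] = f"<code of {name} {version}>"   # stands in for the download
    manifest[name] = version                          # remember what the project depends on
    return version


manifest = {}   # what the project says it depends on
installed = {}  # what has actually been fetched
install("example-db", manifest, installed)
print(manifest)  # {'example-db': '1.1.0'}
```

The manifest is the "keeps track" part: it's what lets a teammate (or a build server) reinstall exactly the same set of packages later.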

Many package managers also allow you to download certain utilities related to a language or system; these utilities are usually not integrated into a particular project, but instead are used to help you with building that project in various ways.

Most modern languages and systems have package managers. Here are a few examples:

  • npm for Node.js
  • rubygems for Ruby
  • pear for PHP
  • homebrew for OSX
  • apt-get for Debian-based Linux (including Ubuntu)
  • yum for Red Hat-based Linux

Just a few of the many package managers out there

As you can see, some of these are linked to a specific language / runtime. Others are linked to an operating system / environment.

Why Is This Important?

We discussed why libraries and packages are important above - they vastly increase what you as a developer are able to accomplish. You no longer need to be an expert in every subject - you can instead use the work of experts in your own projects. Packages alleviate a significant amount of concern and stress about functionality and security, when used properly. In fact, many packages are the only way to realistically perform a certain task - for example, some services may only allow you to connect to them through their provided packages.

Even if a package is not the only way to perform something, it is often the easiest way. From a business perspective, using packages is a great idea - and you will be expected as a developer to use them. The first thing you'll often be asked when starting a project is, does anything out there already exist that can perform what we need?

Package managers matter primarily because, once again, they alleviate stress and concern. Packages and libraries usually come in frequently released versions. Many of these version changes provide security fixes or other things that you really want to have. That said, you don't want to spend all day at work checking whether a new version of a package is out. Package managers do this for you - they'll let you know when an upgrade is available, and make it easy to apply that upgrade.
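The "is an upgrade available?" check boils down to comparing version numbers. Here's a minimal Python sketch, assuming simple dotted numeric versions like 1.10.0; the package names and the outdated function are hypothetical, and real package managers handle far messier version schemes than this.

```python
def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def outdated(installed, latest):
    """Return {name: (installed_version, latest_version)} for packages with a newer release."""
    report = {}
    for name, current in installed.items():
        newest = latest.get(name)
        if newest is not None and parse_version(newest) > parse_version(current):
            report[name] = (current, newest)
    return report


installed = {"example-db": "1.9.0", "example-http": "2.3.1"}
latest = {"example-db": "1.10.0", "example-http": "2.3.1"}
print(outdated(installed, latest))  # {'example-db': ('1.9.0', '1.10.0')}
```

The parsing step is there for a reason: compared as plain strings, "1.10.0" sorts before "1.9.0", which is exactly the kind of detail you'd rather let a package manager worry about.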

In general, you'll be expected to work with libraries and package managers at any serious job. In fact, attempting to develop without using these things will often reflect negatively on your competency. So, use them!

Things You Should Know

This one's simple. If you are working with a system, know how to use that system's package manager! For example, if you're working with Node.js, learn how to use npm.

Beyond this, I would recommend that you know how to reliably determine whether a package is safe to use or not. We haven't yet discussed the potential downside of packages - you didn't write them, so you don't always know what's in them. A package could very well hide malware. How do you protect yourself against this? Stick to well-used, community-reviewed packages. Every system I know of has a community that reviews and rates libraries - if a library does something bad, it will tend to be reviewed poorly. Don't rely on reviews alone; keep an eye on how many people are actually using the package as well. If that number is high, you can be reasonably confident that the package is safe.
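As a rough illustration of "check both the reviews and the usage", here's a toy Python sketch. Everything in it is an assumption on my part: the PackageInfo fields, the example packages and especially the thresholds are made up, and no real registry exposes exactly this data. Treat it as a way of thinking, not a security tool.

```python
from dataclasses import dataclass


@dataclass
class PackageInfo:
    name: str
    weekly_downloads: int
    average_rating: float  # hypothetical scale from 0.0 to 5.0


def looks_trustworthy(pkg, min_downloads=10_000, min_rating=4.0):
    """Require BOTH wide usage and good reviews; the thresholds are arbitrary."""
    return pkg.weekly_downloads >= min_downloads and pkg.average_rating >= min_rating


popular = PackageInfo("example-popular", weekly_downloads=250_000, average_rating=4.6)
obscure = PackageInfo("example-obscure", weekly_downloads=12, average_rating=5.0)

print(looks_trustworthy(popular))  # True
print(looks_trustworthy(obscure))  # False: a perfect rating, but almost nobody uses it
```

The second package is the interesting case: a glowing rating from a handful of users tells you very little, which is why usage numbers matter alongside reviews.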

This concludes packages and package managers. The final three posts in this series will all have a unifying theme - dealing with two systems (whether software, or computers themselves) communicating with one another. We live in an interconnected world; since this is the case, you'll absolutely need to know how to (safely) use this communication. Look forward to the next entry sometime soon!