Categories
Software engineering

A Python toolbox for the modern software engineer

In recent years, Python has become one of the most popular programming languages on this planet. This trend was partially driven by the rise of machine learning and data science, two areas in which Python excels. (The beginner-friendly syntax didn’t hurt either.)

This focus on Python as a data science tool or scripting language sometimes overshadows its potential for software engineering. When I started out with Python, I had a hard time finding material on how to structure a larger project and which packages to use. So here is my complete list of “must haves” for each new project.

IDE

There are many Python IDEs out there, and it is important to realize that not all of them are equally good for each use case. While Jupyter is great for rapid prototyping or visualization and Anaconda’s Spyder has powerful features when it comes to data analysis, neither of them is particularly good for software engineering. Here are my favorites:

  • Visual Studio Code: Free, cross-platform, cross-language, many extensions, great Python support and very lightweight. The best choice if you have projects in multiple languages and don’t want to switch your IDE all the time.
  • PyCharm: A pure Python IDE, which is available as a free Community edition or as a commercial Professional edition. The Professional edition comes with nice additional features like built-in performance profiling.

Personally, I prefer Visual Studio Code.

Environment and dependency management

Package management is one of the first things new Pythonistas have to learn before they can create reasonably sized projects or contribute to them.

Basics
  • venv: Each new project should start with a new virtual environment. This allows you to isolate the dependencies of the package you’re creating.
  • virtualenvwrapper: A very handy little package that makes it easier to create, activate and switch between multiple virtual environments.
Extensions
  • Poetry is a Swiss Army knife when it comes to organizing your project. It helps you to create a basic directory structure, to organize the package dependencies and to publish your package to PyPI. It comes with a bit of a learning curve though.
  • Pipenv is a commonly used alternative to Poetry and simplifies the dependency management. It automatically tracks the dependencies in your virtual environment and helps you to lock them down for deterministic builds (e.g. to make sure you’re deploying the same setup that you tested before).

I personally like to have a lot of control over the structure of my projects and usually stick to the use of basic pip + venv without any of the extensions. But they are definitely interesting projects and worth checking out.
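
To make the basic venv workflow concrete, here is a minimal sketch that creates a virtual environment from Python itself using the standard library’s venv module. The helper name create_project_env and the .venv folder name are my own choices for illustration, not a convention the module enforces.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create a virtual environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"  # folder name is an assumption, pick any
    # with_pip=False keeps this sketch fast and offline; pass with_pip=True
    # in a real project so that pip is available inside the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_project_env(Path(tmp))
        # Every environment created by venv contains a pyvenv.cfg marker file.
        print((env / "pyvenv.cfg").exists())
```

On the command line, python -m venv .venv does the same thing (including pip), and is what you would normally type at the start of a project.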

Code quality & static code checking

One thing that makes Python different from many other programming languages is that there is a clear definition of what good code looks like: PEP 8, the official style guide for Python. Many code style linters build on it, and I highly recommend using at least one in each of your projects. Besides the style linter, make sure you have a security linter enabled.

Basics
  • Pylint is the default linter I have enabled for any code I write. It is highly configurable and nicely integrated in many IDEs, including my preferred Visual Studio Code.
  • Bandit is a security linter that checks your code for common pitfalls like SQL injection, weak encryption methods, etc.
  • If you like beautiful code, but are too lazy to write it yourself, you’ll love black. The uncompromising code formatter takes even the ugliest code and auto-formats it to be compliant with PEP 8.
Extensions
  • Flake8 is a common alternative to Pylint. I’ve even seen people use both style linters in parallel to make sure their code really shines.
  • SonarLint is a powerful extension for VS Code and other IDEs that takes care of style and security linting at the same time. It is a bit too resource intensive for my taste, but does a very good job at checking your code.

If you want to go all the way when it comes to ensuring the quality of your product, you might also have a look at products like SonarQube or SonarCloud. These server-side solutions scan your entire code base and give you an overall status report, including any vulnerabilities they found.
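
To illustrate the kind of pitfall a security linter is meant to catch, here is a small sketch of the SQL injection problem mentioned above. It uses the built-in sqlite3 module and a made-up users table. The function names are hypothetical, and I’m not claiming that any particular linter reports exactly these lines.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Pitfall: user input is pasted into the SQL text, so it can change the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Fix: the value is passed separately from the SQL text as a parameter.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


def demo_connection():
    """Create an in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    malicious = "x' OR '1'='1"
    print(find_user_unsafe(conn, malicious))  # the injected condition matches every row
    print(find_user_safe(conn, malicious))    # treated as a plain name, matches nothing
```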

Unit testing

Python comes with the built-in unittest framework. However, there is a very popular and more powerful third-party alternative called pytest.

There are a few important differences between pytest and unittest. Most importantly, pytest can run test cases written in the unittest framework, but not the other way around. So you don’t have to migrate your entire test suite if you decide to go with pytest.

I recommend giving pytest a shot, especially because of the large number of available extensions for every scenario imaginable.

Both test frameworks integrate nicely with all commonly used IDEs, so you don’t have to worry about compatibility.
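
Here is a small sketch of the two styles side by side. The slugify function is a made-up example to have something to test. Note that the pytest-style test needs no import at all: it is just a function with a bare assert.

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical function under test: lower-case a title and join its words with dashes."""
    return "-".join(title.lower().split())


class SlugifyTestCase(unittest.TestCase):
    # unittest style: a TestCase subclass that uses the assert* methods.
    def test_spaces_become_dashes(self):
        self.assertEqual(slugify("Hello World"), "hello-world")


def test_extra_whitespace_is_collapsed():
    # pytest style: a plain function with a bare assert statement.
    assert slugify("  Hello   World ") == "hello-world"
```

Saved as test_slugify.py (a hypothetical file name), pytest collects and runs both tests, while python -m unittest only picks up the TestCase class.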

Setup, build and continuous integration

This is a very large topic and I’m only going to scratch the surface here. Make sure to read up on anything that doesn’t sound familiar.

  • Check out the tox automation project. This essentially gives you an isolated CI pipeline on your local machine and allows you to test and build your Python package in multiple environments. It takes some time to configure, but once you have it set up you’ll never want to work without it again.
  • When you’re planning on developing your package continuously, make sure to adhere to the official version numbering scheme for Python packages as specified in PEP 440. This will make your life easier when you start pushing alpha and beta versions to PyPI.
  • Versioneer: “Like a rocketeer, but for versions!” You just have to love their slogan. This nice little helper takes care of the version numbers for you by scanning your Git repository for appropriate tags.
  • Wheels are the “new” (2012) recommended distribution format for Python packages. They are more lightweight and speed up the installation process. An excellent read about this topic is “What are Python wheels and why should you care”.

Web frameworks

When it comes to web frameworks, you’ll have to choose between two categories: microframeworks and full-stack web frameworks. I’ll provide you with the most common example for each:

  • Flask is a very lightweight framework which is pretty easy to learn. The hello world example has only 5 lines of code.
    This minimalistic approach shouldn’t be confused with limited functionality. Flask relies heavily on a variety of extensions that you can install on demand. This is nice in many ways: It makes the deployed web service more lightweight, it simplifies your code base and it allows you to make use of powerful third-party extensions written by fans.
    If you’re looking into developing a RESTful API, I recommend having a look at Flask-RESTx, which essentially adds a Swagger UI to your service for free.
  • Django is a very powerful, but also quite heavy full-stack web framework. It comes with a lot of configuration options and relies heavily on conventions. Learning it takes significantly longer compared to Flask, but it will give you a more complete feature set out of the box.

If you’re just starting out with web development, my recommendation is to go with Flask.

Databases

To set up a connection to a relational database you essentially have two options:

  • pyodbc is a widely used package to establish a database connection via ODBC and is compatible with all common database providers that offer an ODBC driver. It includes all the basic features like transaction management (as long as the ODBC driver supports it), execution of multiple queries and many more.
  • SQLAlchemy is your go-to solution if you’re looking for a pythonic ORM. It sits on top of a database driver such as pyodbc and exposes the core functionality to you as a user, so you’re not losing any flexibility, which is really nice.

If you’re looking to establish a connection to a NoSQL database or an HDFS cluster, you’ll find the proper tool set easily with the search engine of your choice. There usually is a dominant package for each of them and the decision won’t be difficult.
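
To show what transaction management looks like in practice, here is a minimal sketch. It uses the built-in sqlite3 module as a stand-in because it runs without any driver setup. pyodbc follows the same DB-API conventions (connection, cursor, commit, rollback and “?” placeholders), so the pattern carries over, but the accounts table and the transfer function are made up for this example.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts inside a single transaction."""
    cur = conn.cursor()
    try:
        cur.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source))
        cur.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target))
        cur.execute("SELECT balance FROM accounts WHERE name = ?", (source,))
        if cur.fetchone()[0] < 0:
            raise ValueError("insufficient funds")
        conn.commit()    # make both updates permanent together
    except Exception:
        conn.rollback()  # undo both updates if anything went wrong
        raise


def demo_connection():
    """Create an in-memory database with two made-up accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    conn = demo_connection()
    transfer(conn, "alice", "bob", 30)
    print(balances(conn))
```

A transfer that would overdraw the source account raises a ValueError and leaves both balances untouched, which is exactly the guarantee you want from a transaction.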


This has been a lot and if you made it all the way here I admire your perseverance. You’ve earned yourself a good cup of tea or coffee.

While you’re enjoying your hot beverage: What are your favorite tools for software engineering in Python? Are you missing anything in the list or do you disagree with my selection? Please let me know in the comments! I’m looking forward to the discussion.

* Credits for the image go to Pexels on Pixabay


Connecting to SharePoint using Python

The API

We will use the SharePoint REST service. If you’re not familiar with the concept of REST services and how to construct REST requests, you can check out this nicely written and hands-on tutorial.

Also, we’ll only be looking at classic on-premises installations of SharePoint. If you want to connect to SharePoint Online, you can give the Office 365 REST Python Client a try. There are plenty of code samples on the intro page.
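
If you have never built a SharePoint REST request by hand, here is a rough sketch of what the URLs look like. It only composes the address for reading the items of a list and does not send anything. The helper name, the site URL and the list title are made up for illustration, and the Accept header is the one I’d assume for getting JSON back from an on-premises installation.

```python
from urllib.parse import quote, urlencode

# Assumption: ask for JSON instead of the default XML response.
JSON_HEADERS = {"Accept": "application/json;odata=verbose"}


def list_items_url(site_url, list_title, select=None, top=None):
    """Build the REST URL for reading the items of a SharePoint list."""
    # Inside an OData string literal, a single quote is escaped by doubling it.
    escaped_title = list_title.replace("'", "''")
    path = "/_api/web/lists/getbytitle('{}')/items".format(quote(escaped_title, safe="'"))

    params = {}
    if select:
        params["$select"] = ",".join(select)  # only return these fields
    if top is not None:
        params["$top"] = top                  # limit the number of items
    query = "?" + urlencode(params, safe="$,") if params else ""

    return site_url.rstrip("/") + path + query


if __name__ == "__main__":
    # Hypothetical site and list names
    print(list_items_url("https://mysharepoint.com/sites/team", "Project Docs",
                         select=["Title", "Modified"], top=5))
```

The resulting URL and JSON_HEADERS can be passed to requests.get as url and headers, together with the auth and verify arguments from the authentication samples further down.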

Your options
  • Connect using BASIC Authentication
  • Connect using NTLM Authentication
  • Connect using Kerberos Authentication (Negotiate)

Kerberos is generally considered the safest option. So if you’re working in a corporate environment, it is likely that your administrators have locked down the other authentication types.

SSL certificate

For any of these options, you need a valid SSL certificate to connect to SharePoint. Please don’t do insecure HTTP requests! No, not even for the early phase of your project. Just don’t. 🙂

To get the SSL certificate, open the browser of your choice and navigate to the SharePoint site. Then open the certificate.

Open the certificate in Chrome

What we want to do is download the CA certificate and use it in our Python program to validate the secured connection to the server. So select the root certificate (at the top of the chain) and download it as a file. When you’re asked for the file type, select Base-64 encoded X.509.

Depending on the server configuration you might only need the root (top-level) certificate. But there are some cases where you have to validate the entire chain. So if you get SSL errors, try to download each certificate in the chain individually and simply concatenate them into one .cer file using a text editor of your choice as shown below:

-----BEGIN CERTIFICATE-----
[...]
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
[...]
-----END CERTIFICATE-----
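
If you’d rather not copy and paste in a text editor, the concatenation can also be scripted. This is a minimal sketch under the assumption that every certificate was exported as Base-64 encoded X.509. The function name and the file names are placeholders.

```python
from pathlib import Path


def build_ca_bundle(cert_files, bundle_file):
    """Concatenate Base-64 encoded certificates into a single bundle file."""
    blocks = []
    for cert_file in cert_files:
        text = Path(cert_file).read_text(encoding="ascii").strip()
        # Guard against binary (DER) exports, which cannot be concatenated as text.
        if not text.startswith("-----BEGIN CERTIFICATE-----"):
            raise ValueError(f"{cert_file} is not a Base-64 encoded certificate")
        blocks.append(text)
    bundle_path = Path(bundle_file)
    bundle_path.write_text("\n".join(blocks) + "\n", encoding="ascii")
    return bundle_path


if __name__ == "__main__":
    # Placeholder file names: one exported file per certificate in the chain
    bundle = build_ca_bundle(["intermediate.cer", "root.cer"], "certificate.cer")
    print(bundle)
```

The path of the resulting bundle is what goes into the verify argument of the requests calls.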

Basic Authentication

If you’re coding against a SharePoint site that is open to visitors outside of your Active Directory, or if the security requirements of your company are not too strict, you can try to connect with Basic Authentication:

import requests
from requests.auth import HTTPBasicAuth

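# Raw string, so that the backslashes are not read as escape sequences
cert = r'path\to\certificate.cer'
user = 'DOMAIN\\User'
password = 'MyPassword'

# Use https: the certificate is only checked on encrypted connections
response = requests.get(
    url='https://mysharepoint.com/_api',
    auth=HTTPBasicAuth(user, password),
    verify=cert)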

print(response.status_code)

NTLM Authentication

For the NTLM authentication, you need to install the requests_ntlm package first. The adjusted code sample looks like this:

import requests
from requests_ntlm import HttpNtlmAuth

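# Raw string, so that the backslashes are not read as escape sequences
cert = r'path\to\certificate.cer'
user = 'DOMAIN\\User'
password = 'MyPassword'

# Use https: the certificate is only checked on encrypted connections
response = requests.get(
    'https://mysharepoint.com/_api',
    auth=HttpNtlmAuth(user, password),
    verify=cert)

print(response.status_code)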

Kerberos Authentication (Negotiate)

For the Kerberos authentication, the setup depends on your operating system.

Windows users:

Install the requests_negotiate_sspi package and you’re ready to go. The package makes use of the Windows SSPI interface to get the credentials of the currently logged-in user (a.k.a. integrated authentication).

import requests
from requests_negotiate_sspi import HttpNegotiateAuth

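# Raw string, so that the backslashes are not read as escape sequences
cert = r'path\to\certificate.cer'

# No user or password needed: the credentials come from the Windows session
response = requests.get(
    'https://mysharepoint.com/_api',
    auth=HttpNegotiateAuth(),
    verify=cert)

print(response.status_code)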

Linux users:

You first have to add the correct authentication package, requests_negotiate, to your environment. Additionally, you need to install the appropriate Kerberos distribution on your system from the MIT Kerberos Distribution page. After the installation, make sure to add the folder containing the binary to your PATH.

Now, you can use the kinit command to retrieve a valid Kerberos ticket. Once this works, you’re properly set up and ready to connect:

import requests
from requests_negotiate import HTTPNegotiateAuth

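# Use forward slashes for the certificate path on Linux
cert = 'path/to/certificate.cer'

# No user or password needed: the Kerberos ticket from kinit is used
response = requests.get(
    'https://mysharepoint.com/_api',
    auth=HTTPNegotiateAuth(),
    verify=cert)

print(response.status_code)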


The Zen of Python

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren’t special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one– and preferably only one –obvious way to do it.

Although that way may not be obvious at first unless you’re Dutch.

Now is better than never.

Although never is often better than *right* now.

If the implementation is hard to explain, it’s a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let’s do more of those!


And should you ever need a reminder – try typing the following in your next Python session:

>>> import this

Credit for the featured image goes to Devanath on Pixabay.