Speed Up your Python Skills

Seven tips to take you to the next level


Python is the most widely used programming language in the data science domain, and its popularity continues to grow. The entire data science field has grown enormously in recent years.

In this article, we show you seven tips to improve your Python skills. It’s often the little things that make a big difference, and you can put every one of these tips into practice right now. They will enrich your life as a Data Scientist. Be curious!

As a Data Scientist, you often have to deal with large amounts of data. For this reason, you must code efficiently in terms of run time and memory. Your Python code should also be well-structured and easy to read. The tips will help you to write efficient and readable Python code.

🎓 Our Online Courses and recommendations

Tip 1: Speed up NumPy

NumPy is a Python library for working efficiently with arrays. It offers fast, optimized vectorized operations. However, NumPy evaluates an element-wise expression such as np.sin(a) / np.cos(b) on a single thread and allocates a temporary array for every intermediate result. For such expressions, you can use NumExpr alongside NumPy.

NumExpr can achieve significantly better performance than NumPy on these expressions because it supports multi-threading. Furthermore, it avoids allocating memory for intermediate results.

First, you have to install the packages NumPy and NumExpr. For example:

$ pip install numpy numexpr

Look at the example and try it out.

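import numpy as np
import numexpr as ne

# Two arrays with 2**27 (about 134 million) random floats each
var1 = np.random.random(2**27)
var2 = np.random.random(2**27)

# %timeit is an IPython/Jupyter magic command, so run this in a notebook or an IPython shell
%timeit np.sin(var1) / np.cos(var2)
# 2.73 s

%timeit ne.evaluate("sin(var1) / cos(var2)")
# 566 ms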

Wow! In this measurement, the statement runs approximately 5x faster with NumExpr (566 ms versus 2.73 s). So if you want to speed up your NumPy statements, this gives you a way to do it.

NumExpr works best with large arrays. It splits the array operands into small chunks that easily fit into the CPU cache and distributes these chunks among the available CPU cores, allowing parallel execution. It therefore reaches its maximum performance on a powerful computer with many cores. We recommend NumExpr when both conditions are present. For small arrays, you can simply stay with NumPy, as the performance differences are minimal.
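To make the chunking idea more tangible, here is a minimal sketch in plain NumPy. It is not how NumExpr is implemented internally. The helper name chunked_sin_over_cos and the chunk size are assumptions chosen for illustration. The sketch only shows why small blocks help: the temporaries never grow beyond one chunk.

```python
import numpy as np

def chunked_sin_over_cos(a, b, chunk_size=2**16):
    """Compute sin(a) / cos(b) block by block (hypothetical helper)."""
    out = np.empty_like(a)
    for start in range(0, a.size, chunk_size):
        stop = start + chunk_size
        # The temporaries here hold at most chunk_size elements,
        # so they are far smaller than the full arrays.
        out[start:stop] = np.sin(a[start:stop]) / np.cos(b[start:stop])
    return out

rng = np.random.default_rng(0)
var1 = rng.random(10**6)
var2 = rng.random(10**6)

result = chunked_sin_over_cos(var1, var2)

# The chunked result matches the one-shot NumPy expression.
print(np.allclose(result, np.sin(var1) / np.cos(var2)))
```

NumExpr additionally hands such chunks to several threads, which this single-threaded sketch does not attempt.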

If you want to learn more about NumExpr, check out NumExpr’s GitHub repository.

Tip 2: Fast alternative to pandas apply()

The pandas apply() function can execute functions along an axis of a data frame. Many programmers use the apply() function in combination with lambda functions. But how can you increase the performance of an apply() function?

You can use the swifter package. It applies functions to data frames or series very quickly. The pandas apply() function runs on one core, whereas swifter provides multi-core support.

First, you need to install the swifter package.

$ pip install swifter

After the installation, you can try it out directly.

import pandas as pd
import numpy as np
import swifter

df = pd.DataFrame({'a': np.random.randint(7, 10, 10**7)})

# pandas apply()
%timeit df.apply(lambda x: x**7)
# 54 ms

# swifter.apply()
%timeit df.swifter.apply(lambda x: x**7)
# 37.5 ms

This simple example shows that swifter.apply() has a faster run time: 37.5 ms versus 54 ms in our measurement, which is about 1.4 times faster. The difference is particularly noticeable on powerful computers with multiple cores. If you need a performance boost in your next project, consider the swifter package.
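Before you parallelize apply(), it can be worth checking whether the operation is already vectorized. The following sketch uses only pandas and NumPy and a much smaller data frame than the example above. The variable names are made up for illustration. It shows that DataFrame.apply() hands each column to the function as a whole Series, while Series.apply() calls the function once per value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randint(7, 10, 1000, dtype=np.int64)})

# DataFrame.apply() passes each column as a Series, so col**7 is one vectorized operation.
column_wise = df.apply(lambda col: col**7)

# Series.apply() calls the lambda once for every single value.
element_wise = df['a'].apply(lambda value: value**7)

# The plain vectorized expression needs no apply() at all.
vectorized = df['a']**7

# All three approaches produce the same numbers.
print(bool((column_wise['a'] == vectorized).all()))
print(bool((element_wise == vectorized).all()))
```

Multi-core support tends to pay off for functions that have to run once per value or per row. When a vectorized expression exists, it is usually the first thing to try.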

Tip 3: Using Built-in Python Functions

Often you implement a function without knowing that it already exists in Python, especially if you come from other programming languages such as C or C++. So first, always check whether a built-in function already exists. In CPython, the built-in functions are implemented in C and are usually much faster than custom Python implementations, so prefer them whenever you can. The following example demonstrates this.

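import numpy as np
from time import perf_counter

result_list = []
company_list = ["Tesla", "Block", "Palantir", "Apple"]
company_list_sample = np.repeat(company_list, 10**7)

# Custom loop: convert every entry to lower case one by one
start = perf_counter()
for company in company_list_sample:
    result_list.append(company.lower())
print(perf_counter() - start)
# 17.13 s

# Built-in map() with str.lower
# Note: map() is lazy in Python 3; wrap it in list() to materialize all results
start = perf_counter()
result_list = map(str.lower, company_list_sample)
print(perf_counter() - start)
# 0.97 s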

In the code above, np.repeat() repeats each of the four entries 10 million times, so we get an array of 40 million entries. Then we convert the strings to lower case. In our measurement, the built-in function is about 17 times faster. Especially with large amounts of data, this tip brings an enormous increase in performance. So use built-in functions!

There are many more built-in functions, such as min(), max(), all(), etc. Do your own research if you need a specific Python function. It’s worth it!
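Because map() returns a lazy iterator in Python 3, a fair comparison has to consume it, for example with list(). The following sketch does that with the standard-library timeit module on a smaller list. The function names lower_with_loop and lower_with_map are made up for illustration, and the timings depend on your machine, so none are shown.

```python
from timeit import timeit

companies = ["Tesla", "Block", "Palantir", "Apple"] * 100_000

def lower_with_loop(items):
    """Custom implementation: append one lower-cased string per iteration."""
    result = []
    for item in items:
        result.append(item.lower())
    return result

def lower_with_map(items):
    """Built-in map(); list() forces the lazy iterator to do the work now."""
    return list(map(str.lower, items))

# Both approaches must produce the same list before we compare timings.
assert lower_with_loop(companies) == lower_with_map(companies)

print("loop:", timeit(lambda: lower_with_loop(companies), number=5))
print("map: ", timeit(lambda: lower_with_map(companies), number=5))

# A few more built-ins that replace hand-written loops
lengths = [len(name) for name in companies[:4]]
print(min(lengths), max(lengths), all(n > 0 for n in lengths))  # 5 8 True
```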

Tip 4: Use list comprehension instead of loops

Programmers often use lists in combination with loops to store calculated results. However, this approach is not efficient in terms of run time. For this reason, it is better to use a list comprehension, which performs better. The following example shows the difference in performance.

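from time import perf_counter

result_list_loop = []
result_list_com = []

number_round = 10000000

# for-loop with append()
start = perf_counter()
for i in range(number_round):
    result_list_loop.append(i*i)
print(perf_counter() - start)
# 1.47 s

# list comprehension
start = perf_counter()
result_list_com = [i*i for i in range(number_round)]
print(perf_counter() - start)
# 0.69 s

print(result_list_com[10])
# 100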

What do we learn from this example? Use list comprehension when possible. List comprehension is somewhat controversial in programming. Some programmers find the syntax hard to read, as one line of code expresses all statements. In our opinion, the syntax is clear and concise. It is a matter of taste, but the performance is better with a list comprehension.

A list comprehension begins with an opening bracket [. Next comes the expression that the for-loop calculates in its body. Then comes the loop header with three elements: the keyword for, the loop variable, and, after the keyword in, the iterable to loop over. The list comprehension ends with a closing bracket ]. Once you understand the syntax, you can write for-loops much more compactly.
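The anatomy described above can be seen side by side in a small example. The variable names are made up for illustration. The last line also shows an optional trailing condition, which the text above does not cover.

```python
numbers = range(10)

# Classic loop: create the list, then append one result per iteration.
squares_loop = []
for i in numbers:
    squares_loop.append(i * i)

# Same result as a list comprehension: [expression for variable in iterable]
squares_comp = [i * i for i in numbers]

# An optional trailing condition filters the items (here: even numbers only).
even_squares = [i * i for i in numbers if i % 2 == 0]

print(squares_comp == squares_loop)  # True
print(even_squares)                  # [0, 4, 16, 36, 64]
```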

But what about memory usage? How can we reduce the memory footprint? This question matters especially for large lists on which we want to perform further operations. In our example, we store 10000000 values in the list. But do we have to store all entries up front, or do we only need each one when it is required?

In these cases, we can use generators. A generator produces each item only when it is needed. As a result, a generator requires far less memory, and creating it is almost instant because no values are calculated until you iterate over it. Take a look at the following example.

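import sys
from itertools import islice
from time import perf_counter

# result_list_com and number_round come from the previous example
print(sys.getsizeof(result_list_com), 'bytes')
# 89095160 bytes

start = perf_counter()
result_gen = (i*i for i in range(number_round))
print((perf_counter() - start) * 1000, 'ms')
# 0.22 ms

print(sys.getsizeof(result_gen), 'bytes')
# 112 bytes

# A generator cannot be indexed, so we step forward to the item at position 10
print(next(islice(result_gen, 10, None)))
# 100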

We can iterate over the results just as in the previous example. The only difference in the code is that we now use () instead of []. Instead of a list, we store a generator. This approach is more memory efficient, but note that a generator cannot be indexed and can be consumed only once. Check if you can use list comprehension or generators in your projects. They can improve performance and reduce memory.
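The following sketch illustrates how a generator behaves differently from a list. It uses a smaller number of items than the example above, and the variable names are made up for illustration.

```python
import sys
from itertools import islice

number_round = 1_000_000

squares_list = [i * i for i in range(number_round)]
squares_gen = (i * i for i in range(number_round))

# The list stores a reference to every result; the generator only keeps its current state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Generators cannot be indexed, but islice() can skip ahead to a position.
print(next(islice(squares_gen, 10, None)))  # 100

# The first 11 items are now used up, so the sum covers only the rest.
remaining = sum(squares_gen)
print(remaining == sum(squares_list[11:]))  # True

# A second pass yields nothing because the generator is exhausted.
print(list(squares_gen))  # []
```

If you need the values more than once or need random access, a list is the better fit. If you stream through the values a single time, for example with sum(), a generator saves the memory.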

Tip 5: Merge dicts with double asterisk syntax **

How do you merge dictionaries? You can do that with a one-liner that uses the double asterisk syntax **. The following example shows how it works.

dict_1 = {'company': 'Tesla', 'founding': 2002}
dict_2 = {'company': 'Tesla', 'founding': 2003, 'CEO': 'Elon Musk'}

dict_merged = {**dict_1, **dict_2}
print(dict_merged)
# {'company': 'Tesla', 'founding': 2003, 'CEO': 'Elon Musk'}

First, we define two dictionaries that share some keys and differ in others. Tesla was founded in 2003, so dict_2 is more up-to-date. If both dictionaries contain the same key with different values, the value of the last dictionary is used. After merging, the new dictionary contains all three key-value pairs. The syntax is concise and compact, so merging is very easy. And the best thing is that you can merge three or more dictionaries in the same expression. This trick can save a lot of time.

Another method is the update method. This method updates the first dictionary and does not create a copy. Take a look at the following example.

dict_1 = {'company': 'Tesla', 'founding': 2002}
dict_2 = {'company': 'Tesla', 'founding': 2003, 'CEO': 'Elon Musk'}

# update() modifies dict_1 in place and returns None
dict_1.update(dict_2)
print(dict_1)
# {'company': 'Tesla', 'founding': 2003, 'CEO': 'Elon Musk'}

The disadvantage of the update method is that each call accepts only one dictionary, and it changes the first dictionary in place. If you want to merge dictionaries in the future, remember this tip.
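Here is a short sketch that merges three dictionaries both ways. The third dictionary and its 'industry' entry are made up for illustration.

```python
dict_1 = {'company': 'Tesla', 'founding': 2002}
dict_2 = {'founding': 2003, 'CEO': 'Elon Musk'}
dict_3 = {'industry': 'automotive'}  # illustrative third dictionary

# Later dictionaries win when keys collide, so 'founding' ends up as 2003.
merged = {**dict_1, **dict_2, **dict_3}
print(merged)
# {'company': 'Tesla', 'founding': 2003, 'CEO': 'Elon Musk', 'industry': 'automotive'}

# update() takes one dictionary per call, so several sources need several calls.
in_place = dict(dict_1)  # copy first, so dict_1 itself stays unchanged
for source in (dict_2, dict_3):
    in_place.update(source)
print(in_place == merged)  # True

# dict_1 was not modified by either approach.
print(dict_1)  # {'company': 'Tesla', 'founding': 2002}
```

On Python 3.9 and later, the expression dict_1 | dict_2 | dict_3 gives the same merged result.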

Tip 6: Do not import unnecessary modules

You may have heard this tip many times, but it can improve the performance of your code. Often you need only a few functions of a library, not all of it. Importing a large library that you never use makes your program slower to start because the whole library has to load first. In addition, if you import only the module, you have to access each function via dot notation. Every such call performs an extra attribute lookup, which adds up when the function is called many times, for example in a tight loop. Note that from math import exp still loads the whole math module; the gain comes from skipping the attribute lookup. The following examples demonstrate this.

import math
from time import perf_counter

start = perf_counter()
variable = math.exp(7)
print(perf_counter() - start)
# 8.47e-05 s

In this example, we call the math.exp() function with dot notation, so Python first looks up math and then its exp attribute on every call. Also, we have imported the entire math module, although we only need the exp() function.

from math import exp
from time import perf_counter

start = perf_counter()
variable = exp(7)
print(perf_counter() - start)
# 4.51e-05 s

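In this example, we import the exp() function directly and call it without dot notation. In our measurement, this trick roughly halves the run time of the call. Wow. That’s great! Keep in mind that a single timed call is noisy, so repeat the measurement many times before you rely on the numbers.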
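For a more robust comparison, you can repeat each call many times with the standard-library timeit module. This is a minimal sketch. The number of repetitions is an arbitrary choice, and the resulting timings depend on your machine and Python version, so none are shown here.

```python
import math
from math import exp
from timeit import timeit

number = 1_000_000

# Each call looks up the name 'math' and then its attribute 'exp'.
time_dotted = timeit(lambda: math.exp(7), number=number)

# Each call looks up only the name 'exp'.
time_direct = timeit(lambda: exp(7), number=number)

print(f"math.exp(7): {time_dotted:.3f} s for {number} calls")
print(f"exp(7):      {time_direct:.3f} s for {number} calls")

# Both names refer to the same function object, so the results are identical.
print(math.exp is exp)  # True
```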

Tip 7: Use just-in-time compiler

Numba is a just-in-time (JIT) compiler that works well with NumPy loops, arrays, and functions. You use decorators to tell Numba which functions to compile. Numba compiles decorated functions just-in-time into machine code so that all or part of the code runs at the speed of native machine code.

First, we have to install Numba via pip.

$ pip install numba

After successful installation, you can use Numba. Take a look at the following example:

import numpy as np
from numba import jit

var = np.random.random(10**7)
num_loop = 10000

# Plain Python version
def foo(var):
    result = 0
    for i in range(num_loop):
        result += 1
    result = np.sin(var) + np.cos(var)
    return result

%timeit foo(var)
# 154 ms

# Same function, compiled by Numba on its first call
@jit(nopython=True)
def foo_numba(var):
    result = 0
    for i in range(num_loop):
        result += 1
    result = np.sin(var) + np.cos(var)
    return result

%timeit foo_numba(var)
# 76.3 ms

You can see that the decorator above the foo_numba function speeds up the code. The argument nopython=True indicates that the function is compiled to run without the involvement of the Python interpreter. Numba speeds up the execution of the loop and the NumPy trigonometric functions. However, it cannot be used with all Python functions. The following are the disadvantages and advantages of Numba:

Cons:

  • Numba does not support pandas.

  • Unsupported code is executed via the interpreter and has the additional Numba overhead.

Pros:

  • Very good support for NumPy arrays, functions, and loops.

  • Support for Nvidia CUDA. It can be used well for the development of neural networks based on NumPy.

  • Official support on M1/Arm64.

The cons and pros show that Numba should be used primarily for NumPy operations. In addition, you should always check at the beginning whether Numba is suitable for the respective implementation.


Conclusion

In this article, we have shown how to increase the efficiency of your code in terms of run time and memory.

Lessons learned:

  • NumPy evaluates element-wise expressions on a single thread. You can use NumExpr to parallelize them.

  • The pandas apply() function can be accelerated with swifter.

  • Check if there are built-in functions.

  • Use list comprehension instead of loops. Check if generators are suitable for your project.

  • Merge dicts with double asterisk syntax **.

  • Do not import unnecessary modules.

  • If you have run time problems, you can use just-in-time compilers. Just-in-time compilers speed up your code.

Thanks so much for reading. Have a great day!

* Disclosure: The links are affiliate links, which means we will receive a commission if you purchase through these links. There are no additional costs for you.