Universal Functions In Pandas

1. UNIVERSAL FUNCTIONS: INDEX PRESERVATION

All NumPy ufuncs (universal functions) work on Pandas Series and DataFrame objects.

First, let's create a Pandas Series of random integers

import numpy as np
import pandas as pd 

# creating random state
rand = np.random.RandomState(42)

# Creating Pandas Series of random integers
ser1 = pd.Series(rand.randint(10, size=4))
print(ser1)

0    6
1    3
2    7
3    4
dtype: int64

Second, create a Pandas DataFrame of random integers

# Creating Pandas DataFrame
df1 = pd.DataFrame(rand.randint(10,size=(3,4)),
                   columns=['a','b','c','d'])
                   
print(df1)

   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4

Now, if we apply any NumPy ufunc to one of these objects (Series or DataFrame), the result is another Pandas object with the indices preserved

# Taking the exponential of every element in the Series, ser1
np.exp(ser1)

0     403.428793
1      20.085537
2    1096.633158
3      54.598150
dtype: float64

# Doing arithmetic on each element of the DataFrame, df1
print(np.multiply(df1,10))

    a   b   c   d
0  60  90  20  60
1  70  40  30  70
2  70  20  50  40
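
Index preservation is easier to see when the labels are not the default integers. The following is a minimal sketch, assuming a made-up Series named `angles` with string labels (not part of the examples above):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels instead of the default 0..n-1 index
angles = pd.Series([0.0, np.pi / 2, np.pi], index=["x", "y", "z"])

# Applying a NumPy ufunc returns a Series carrying the same labels
result = np.sin(angles)

print(type(result).__name__)   # Series
print(list(result.index))      # ['x', 'y', 'z']
```

The values change, but the labels 'x', 'y', 'z' travel with them.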

2. UNIVERSAL FUNCTIONS: INDEX ALIGNMENT

2.1. Index Alignment in Series

When we add two Series whose indices are not identical, Pandas aligns the values by index label, and the index of the result is the union of the two indices

# First, define two series whose indices are not identical
A = pd.Series([1,2,3], index=[0,1,2]) #index[0,1,2]
B = pd.Series([10,20,30], index=[1,2,3]) #index[1,2,3]

# Second, perform addition of these two series
print(A); print(B)
print(A.add(B))

0    1
1    2
2    3
dtype: int64
1    10
2    20
3    30
dtype: int64
0     NaN
1    12.0
2    23.0
3     NaN
dtype: float64

As the example above shows, the result contains every index label from both Series, and values are added only where a label exists in both.
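
To check this behaviour directly, the sketch below rebuilds A and B and compares the index of the sum with the union of the two indices. It uses the `+` operator, which aligns on the index in the same way as `.add()`.

```python
import pandas as pd

A = pd.Series([1, 2, 3], index=[0, 1, 2])
B = pd.Series([10, 20, 30], index=[1, 2, 3])

total = A + B  # aligned on index labels, like A.add(B)

# The index of the result is the union of both indices
union = A.index.union(B.index)
print(total.index.equals(union))   # True

# Labels present in only one Series produce NaN
print(total.isna().tolist())       # [True, False, False, True]
```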

add() method with fill_value

When Pandas doesn't find a value under the same index label in both Series, it returns NaN for that label.
For example, Series A has a value at index 0, but Series B has no index 0.
To handle these NaN values, we can pass the keyword argument fill_value to the Pandas .add() method. The missing entry is then replaced by fill_value before the addition.

A.add(B, fill_value=0)

0     1.0
1    12.0
2    23.0
3    30.0
dtype: float64

2.2. Index Alignment in DataFrame

When we add two DataFrames whose indices or columns are not identical, Pandas aligns the values on both the index and the columns, and the result covers the union of each

# First, define two DataFrames whose indices and columns are not identical
C = pd.DataFrame(rand.randint(10, size=(2,2)),
                columns=['a','b'])

D = pd.DataFrame(rand.randint(10, size=(3,3)),
                columns=['a','b','c'])

print(C); print(D)

# Secondly, we add these two dataframes and see how results are handled
print(C.add(D))

      a    b   c
0   5.0  7.0 NaN
1  10.0  9.0 NaN
2   NaN  NaN NaN

add() method with fill_value

When Pandas doesn't find a value at the same index and column in both DataFrames, it returns NaN for that position.
For example, DataFrame D has a value at index 0, column 'c', but DataFrame C has no column 'c'.
We can pass the keyword argument fill_value to the Pandas .add() method to handle these NaN values

print(C.add(D, fill_value=0))

      a    b    c
0   5.0  7.0  9.0
1  10.0  9.0  0.0
2   9.0  2.0  6.0
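
One detail worth knowing: fill_value only stands in for a value that is missing on one side. If a position is missing in both DataFrames, the result there is still NaN. The sketch below illustrates this with two small made-up frames, `P` and `Q` (hypothetical names, not used elsewhere on this page):

```python
import numpy as np
import pandas as pd

# Hypothetical frames: P has column 'a' on rows 0-1, Q has column 'b' on rows 1-2
P = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
Q = pd.DataFrame({"b": [10, 20]}, index=[1, 2])

filled = P.add(Q, fill_value=0)
print(filled)

# (0, 'a') exists only in P, so Q's missing value is treated as 0
print(filled.loc[0, "a"])             # 1.0

# (2, 'a') and (0, 'b') exist in neither frame, so they stay NaN
print(np.isnan(filled.loc[2, "a"]))   # True
print(np.isnan(filled.loc[0, "b"]))   # True
```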

2.3. Python Operators and their equivalent Pandas Methods

Python operator    Pandas method(s)

+                  add()
-                  sub(), subtract()
*                  mul(), multiply()
/                  div(), divide(), truediv()
//                 floordiv()
%                  mod()
**                 pow()
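
As a quick sanity check of these equivalences, the sketch below applies each operator and its method counterpart to the Series A and B from section 2.1 and compares the results. `Series.equals` treats NaN values in the same positions as equal, which makes it suitable here.

```python
import pandas as pd

A = pd.Series([1, 2, 3], index=[0, 1, 2])
B = pd.Series([10, 20, 30], index=[1, 2, 3])

# Each entry pairs the operator form with the equivalent method form
pairs = {
    "+": (A + B, A.add(B)),
    "-": (A - B, A.sub(B)),
    "*": (A * B, A.mul(B)),
    "/": (A / B, A.div(B)),
    "//": (A // B, A.floordiv(B)),
    "%": (A % B, A.mod(B)),
    "**": (A ** B, A.pow(B)),
}

for symbol, (via_operator, via_method) in pairs.items():
    print(symbol, via_operator.equals(via_method))
```

The method forms are mainly useful when extra arguments such as fill_value or axis are needed, since the operators cannot take them.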

3. UNIVERSAL FUNCTIONS: OTHER OPERATIONS

3.1. Understanding ‘axis’ keyword argument

One way to look at `axis` kwarg:

With axis=0 or axis='index', the operation is performed column-wise; with axis=1 or axis='columns', the operation is performed row-wise.

Another way to look at `axis` kwarg:

axis=0 or axis='index' means performing the operation over all the rows in each column
axis=1 or axis='columns' means performing the operation over all the columns in each row
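
A sum is a convenient way to see the two readings of `axis` side by side. The following is a small sketch, assuming a frame `df` that holds only columns 'a' and 'b' of df1:

```python
import pandas as pd

# Columns 'a' and 'b' of df1 from section 1
df = pd.DataFrame({"a": [6, 7, 7], "b": [9, 4, 2]})

# axis=0 (or 'index'): walk down the rows, one result per column
col_totals = df.sum(axis=0)
print(col_totals.tolist())   # [20, 15]

# axis=1 (or 'columns'): walk across the columns, one result per row
row_totals = df.sum(axis=1)
print(row_totals.tolist())   # [15, 11, 9]

# The string names are interchangeable with the integers
print(df.sum(axis="index").equals(col_totals))     # True
print(df.sum(axis="columns").equals(row_totals))   # True
```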

3.2. Operations on Self

Let's subtract the first row of df1 from every row of df1. Here the kwarg axis keeps its default value, 1 (or 'columns'), so the row's labels are matched against the column labels of df1

print(df1)
print(df1.subtract(df1.iloc[0]))

   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4
   a  b  c  d
0  0  0  0  0
1  1 -5  1  1
2  1 -7  3 -2

However, if we would like to apply this arithmetic operation index-wise, we can use axis=0 or axis='index'

print(df1.subtract(df1['a'], axis=0))

   a  b  c  d
0  0  3 -4  0
1  0 -3 -4  0
2  0 -5 -2 -3
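
Because these operations align on labels, they also behave sensibly when the Series covers only some of the columns. The sketch below rebuilds df1 by hand under the assumed name `df`, confirms that the `-` operator matches `subtract()` with the default axis, and then subtracts a partial row holding only columns 'a' and 'c':

```python
import pandas as pd

# Same values as df1 above
df = pd.DataFrame([[6, 9, 2, 6], [7, 4, 3, 7], [7, 2, 5, 4]],
                  columns=["a", "b", "c", "d"])

# The operator form matches subtract() with the default axis='columns'
by_operator = df - df.iloc[0]
by_method = df.subtract(df.iloc[0], axis="columns")
print(by_operator.equals(by_method))   # True

# A partial row: only every second column ('a' and 'c') of the first row
partial_row = df.iloc[0, ::2]
partial = df - partial_row
print(partial)

# Columns missing from partial_row come out as NaN
print(partial["b"].isna().all())       # True
print(partial["a"].tolist())           # [0.0, 1.0, 1.0]
```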

3.3. Operation between Series and DataFrame

Operations between a DataFrame and a Series are similar to operations between a two-dimensional and a one-dimensional NumPy array

# Series
ser11 = pd.Series(rand.randint(12, size=3))
ser11

0     2
1     9
2    11
dtype: int64

# DataFrame
df11 = pd.DataFrame(rand.randint(10,size=(3,4)),
                  columns=['a','b','c','d'] )
print(df11)

   a  b  c  d
0  7  5  7  8
1  3  0  0  9
2  3  6  1  2

Let's add the Series to the DataFrame with the kwarg axis=0 (or axis='index'), which matches the Series against the DataFrame's index. Both ser11 and df11 have an identical index

print(df11.add(ser11, axis=0))

    a   b   c   d
0   9   7   9  10
1  12   9   9  18
2  14  17  12  13
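
To make the NumPy comparison concrete, the sketch below repeats the addition with plain arrays. The values of df11 and ser11 are typed in by hand, under the assumed names `arr` and `vec`, so that the snippet does not depend on the random state.

```python
import numpy as np
import pandas as pd

# Values copied from df11 and ser11 above
arr = np.array([[7, 5, 7, 8],
                [3, 0, 0, 9],
                [3, 6, 1, 2]])
vec = np.array([2, 9, 11])

# NumPy: turn the 1-D array into a column so it broadcasts across each row
numpy_result = arr + vec[:, np.newaxis]

# Pandas: axis=0 matches the Series against the DataFrame's index instead
df = pd.DataFrame(arr, columns=["a", "b", "c", "d"])
ser = pd.Series(vec)
pandas_result = df.add(ser, axis=0)

print(np.array_equal(numpy_result, pandas_result.to_numpy()))   # True
```

The difference is that NumPy matches by position and shape, while Pandas matches by label, so the Pandas version keeps working even if the rows are reordered.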


Last updated 2 years ago
