# python – How to check whether a pandas DataFrame is empty?

## The Question

- Doesn't `len()` work? It should return 0 for an empty DataFrame.

## The Answer 1

You can use the attribute `df.empty` to check whether it's empty or not:

    if df.empty:
        print('DataFrame is empty!')

Source: Pandas Documentation
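As a minimal sketch of how `df.empty` behaves, the snippet below builds three small frames (the variable names are made up for illustration). It shows that a frame with no rows counts as empty, while a frame whose only value is `NaN` does not.

```python
import numpy as np
import pandas as pd

# Hypothetical frames, named only for this illustration.
no_rows_no_cols = pd.DataFrame()                # 0 rows, 0 columns
no_rows = pd.DataFrame({"A": []})               # 0 rows, 1 column
only_nan = pd.DataFrame({"A": [np.nan]})        # 1 row holding NaN

print(no_rows_no_cols.empty)  # True
print(no_rows.empty)          # True
print(only_nan.empty)         # False: NaN still occupies a row
```

If `NaN`-only frames should also count as empty for your purposes, `df.empty` alone is not enough.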

## The Answer 2

I use the `len` function. It's much faster than `empty`. `len(df.index)` is even faster.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(10000, 4), columns=list('ABCD'))

    def empty(df):
        return df.empty

    def lenz(df):
        return len(df) == 0

    def lenzi(df):
        return len(df.index) == 0

    '''
    %timeit empty(df)
    %timeit lenz(df)
    %timeit lenzi(df)

    10000 loops, best of 3: 13.9 µs per loop
    100000 loops, best of 3: 2.34 µs per loop
    1000000 loops, best of 3: 695 ns per loop

    len on index seems to be faster
    '''
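The timings above come from IPython's `%timeit` magic. A rough equivalent for a plain script, using the standard-library `timeit` module, might look like the sketch below. The `checks` and `results` names are illustrative, and the absolute numbers will vary with the machine and the pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10000, 4), columns=list("ABCD"))

# The three emptiness checks compared in this answer.
checks = {
    "df.empty": lambda: df.empty,
    "len(df) == 0": lambda: len(df) == 0,
    "len(df.index) == 0": lambda: len(df.index) == 0,
}

calls = 10000
results = {}
for label, check in checks.items():
    # Take the best of 3 runs and convert to seconds per call.
    best = min(timeit.repeat(check, number=calls, repeat=3))
    results[label] = best / calls
    print(f"{label:<20} {results[label] * 1e9:10.0f} ns per call")
```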

## The Answer 3

I prefer going the long route. These are the checks I follow to avoid using a try-except clause:

- check that the variable is not None,
- then check that it's a DataFrame, and
- make sure it's not empty.

Here, `DATA` is the suspect variable:

    DATA is not None and isinstance(DATA, pd.DataFrame) and not DATA.empty
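The three checks can be wrapped in a small helper so the expression is not repeated at every call site. This is only a sketch; the function name `is_nonempty_dataframe` is made up for illustration.

```python
import pandas as pd


def is_nonempty_dataframe(obj):
    """Return True only if obj is a pandas DataFrame that is not empty."""
    return obj is not None and isinstance(obj, pd.DataFrame) and not obj.empty


print(is_nonempty_dataframe(None))                        # False
print(is_nonempty_dataframe([1, 2, 3]))                   # False: not a DataFrame
print(is_nonempty_dataframe(pd.DataFrame()))              # False: empty
print(is_nonempty_dataframe(pd.DataFrame({"A": [1]})))    # True
```

Since `isinstance(None, pd.DataFrame)` is already False, the explicit `None` check is mainly there for readability.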

## The Answer 4

To see if a dataframe is empty, I argue that one should test for the **length of a dataframe’s columns index**:

    if len(df.columns) == 0:
        pass  # placeholder: handle the DataFrame without columns here

## Reason:

According to the Pandas Reference API, there is a distinction between:

- an empty dataframe with 0 rows and *0 columns*
- an empty dataframe with rows containing `NaN`, hence *at least 1 column*

Arguably, they are not the same. The other answers are imprecise in that `df.empty`, `len(df)`, and `len(df.index)` make no distinction between the two: in both cases the **index length is 0** and **empty is True**.

## Examples

Example 1: An empty dataframe with 0 rows and 0 columns

    In [1]: import pandas as pd
            df1 = pd.DataFrame()
            df1
    Out[1]: Empty DataFrame
            Columns: []
            Index: []

    In [2]: len(df1.index)  # or len(df1)
    Out[2]: 0

    In [3]: df1.empty
    Out[3]: True

Example 2: A dataframe which is emptied to 0 rows but still retains `n` columns

    In [4]: df2 = pd.DataFrame({'AA' : [1, 2, 3], 'BB' : [11, 22, 33]})
            df2
    Out[4]:    AA  BB
            0   1  11
            1   2  22
            2   3  33

    In [5]: df2 = df2[df2['AA'] == 5]
            df2
    Out[5]: Empty DataFrame
            Columns: [AA, BB]
            Index: []

    In [6]: len(df2.index)  # or len(df2)
    Out[6]: 0

    In [7]: df2.empty
    Out[7]: True

In both of the previous examples, the *index length is 0* and *empty is True*. Reading the **length of the columns index** tells them apart: for the first dataframe, df1, it returns 0 columns, proving that it is indeed empty.

    In [8]: len(df1.columns)
    Out[8]: 0

    In [9]: len(df2.columns)
    Out[9]: 2

**Critically**, while the second dataframe df2 contains no data, it is **not completely empty**: its columns persist, and `len(df2.columns)` reports how many of them remain.
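The distinction between the two cases can also be read from `df.shape`, which reports rows and columns together. The helper below is a sketch under that assumption; the name `describe_emptiness` and its return strings are invented for illustration.

```python
import pandas as pd


def describe_emptiness(frame):
    """Classify a DataFrame by which of its axes are empty."""
    n_rows, n_cols = frame.shape
    if n_cols == 0:
        return "no columns"
    if n_rows == 0:
        return "columns but no rows"
    return "has rows and columns"


df1 = pd.DataFrame()
df2 = pd.DataFrame({"AA": [1, 2, 3], "BB": [11, 22, 33]})
df2 = df2[df2["AA"] == 5]  # filter that matches nothing

print(df1.shape, describe_emptiness(df1))  # (0, 0) no columns
print(df2.shape, describe_emptiness(df2))  # (0, 2) columns but no rows
```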

## Why it matters

Let’s add a new column to these dataframes to understand the implications:

    # As expected, the empty column displays 1 series
    In [10]: df1['CC'] = [111, 222, 333]
             df1
    Out[10]:     CC
             0  111
             1  222
             2  333

    In [11]: len(df1.columns)
    Out[11]: 1

    # Note the persisting series with rows containing `NaN` values in df2
    In [12]: df2['CC'] = [111, 222, 333]
             df2
    Out[12]:     AA   BB   CC
             0  NaN  NaN  111
             1  NaN  NaN  222
             2  NaN  NaN  333

    In [13]: len(df2.columns)
    Out[13]: 3

It is evident that the original columns in df2 have re-surfaced. Therefore, it is prudent to instead read the **length of the columns index** with `len(df.columns)` (the `columns` attribute of a `pandas.core.frame.DataFrame`) to see if a dataframe is empty.

## Practical solution

    # New dataframe df
    In [1]: df = pd.DataFrame({'AA' : [1, 2, 3], 'BB' : [11, 22, 33]})
            df
    Out[1]:    AA  BB
            0   1  11
            1   2  22
            2   3  33

    # This data manipulation approach results in an empty df
    # because of a subset of values that are not available (`NaN`)
    In [2]: df = df[df['AA'] == 5]
            df
    Out[2]: Empty DataFrame
            Columns: [AA, BB]
            Index: []

    # NOTE: the df is empty, BUT the columns are persistent
    In [3]: len(df.columns)
    Out[3]: 2

    # And accordingly, the other answers on this page
    In [4]: len(df.index)  # or len(df)
    Out[4]: 0

    In [5]: df.empty
    Out[5]: True

    # SOLUTION: conditionally check for empty columns
    In [6]: if len(df.columns) != 0:  # <--- here
                # Do something, e.g.
                # drop any columns containing rows with `NaN`
                # to make the df really empty
                df = df.dropna(how='all', axis=1)
            df
    Out[6]: Empty DataFrame
            Columns: []
            Index: []

    # Testing shows it is indeed empty now
    In [7]: len(df.columns)
    Out[7]: 0

Adding a new data series now works as expected, without the empty columns re-surfacing (that is, without any series whose rows contained only `NaN`):

    In [8]: df['CC'] = [111, 222, 333]
            df
    Out[8]:     CC
            0  111
            1  222
            2  333

    In [9]: len(df.columns)
    Out[9]: 1
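For a script rather than an interactive session, the same idea might be packaged as a function. The sketch below reuses the `dropna(how='all', axis=1)` call from this answer; the function name `drop_columns_if_no_rows` is hypothetical.

```python
import pandas as pd


def drop_columns_if_no_rows(frame):
    """Drop leftover columns from a frame that has columns but no rows."""
    if len(frame.index) == 0 and len(frame.columns) != 0:
        # With zero rows, every column counts as all-NaN and is dropped.
        return frame.dropna(how="all", axis=1)
    return frame


df = pd.DataFrame({"AA": [1, 2, 3], "BB": [11, 22, 33]})
filtered = df[df["AA"] == 5]           # no rows, but AA and BB persist
cleaned = drop_columns_if_no_rows(filtered)

print(len(filtered.columns))  # 2
print(len(cleaned.columns))   # 0
```

A frame that still has rows is returned unchanged, so the helper should be safe to call after any filter.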

## The Answer 5

1. If a DataFrame may contain both `NaN` and non-null values and you want to find out whether it holds any data at all, try this code.
2. When can this situation happen? It happens when a single function is used to plot more than one DataFrame, each passed as a parameter. In that case the function tries to plot the data even when a DataFrame is empty, and so plots an empty figure. It makes more sense to simply display a 'DataFrame has no data' message.
3. Why? If a DataFrame is empty (i.e. contains no data at all; mind you, a DataFrame with only `NaN` values is still considered non-empty by `df.empty`), then it is desirable not to plot but to put out a message.

Suppose we have two DataFrames, df1 and df2. The function myfunc takes any DataFrame (df1 or df2 in this case) and prints a message if the DataFrame is empty (instead of plotting):

    df1                  df2
    col1    col2         col1    col2
    NaN     2            NaN     NaN
    2       NaN          NaN     NaN

and the function:

    def myfunc(df):
        # count the total number of non-NaN values; equal to 0 if the DataFrame is empty
        if df.count().sum() > 0:
            print('not empty')
            df.plot(kind='barh')
        else:
            # display a message instead of plotting if it is empty
            print('empty')
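To try the count-based check without plotting anything, the sketch below isolates it in a helper and applies it to the two example frames. The name `has_data` is made up for illustration; `df1` and `df2` are rebuilt from the table above.

```python
import numpy as np
import pandas as pd


def has_data(frame):
    """Return True if the frame holds at least one non-NaN value."""
    return bool(frame.count().sum() > 0)


df1 = pd.DataFrame({"col1": [np.nan, 2], "col2": [2, np.nan]})
df2 = pd.DataFrame({"col1": [np.nan, np.nan], "col2": [np.nan, np.nan]})

print(has_data(df1))   # True: two non-NaN values
print(has_data(df2))   # False: only NaN
print(df2.empty)       # False: df.empty does not look at the values
```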