Reindexing only valid with uniquely valued Index objects – How To Fix It?

If you are encountering the error InvalidIndexError: Reindexing only valid with uniquely valued Index objects in pandas and are having difficulty fixing it, follow this guide. The explanation and examples below show how to solve it.

How does the error “InvalidIndexError: Reindexing only valid with uniquely valued Index objects” occur?

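Generally, this error occurs when pandas has to align DataFrames on an index that contains duplicate labels. With pd.concat(..., axis = 1), that means duplicated row labels in one of the DataFrames; with axis = 0, duplicated column names can trigger it as well.

The example below reproduces the error: the index of df1 contains the date 2022-10-1 twice.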

import pandas as pd

# df1 has a duplicated index label: '2022-10-1' appears twice.
df1 = pd.DataFrame(
    {
        'Title': ['Tutorial', 'Learn To Share IT', 'Fix Error', 'Guide'],
        'Author': ['crvt4722', 'noname', 'hq09', 'LSI'],
        'Comments': [9999, 6789, 5678, 4722]
    },
    index=pd.DatetimeIndex(['2022-10-1', '2022-6-1', '2022-10-1', '2022-5-6'])
)

# df2 has a unique index, but it differs from df1's index.
df2 = pd.DataFrame(
    {
        'New follow': [99, 88, 77, 66, 55],
        'Interaction': ['Good', 'Bad', 'Excellent', 'Not Good', 'Not Bad'],
        'Reader': [
            'Cristiano Ronaldo',
            'Lionel Messi',
            'Neymar',
            'Marcus Rashford',
            'Harry Kane'
        ],
        'Donation': [100000, 400000, 400000, 400000, 400000]
    },
    index=pd.DatetimeIndex(
        ['2022-10-1', '2022-6-1', '2022-1-2', '2022-5-6', '2022-5-1']
    )
)

# Aligning the frames on their indexes fails because df1's index is not unique.
result = pd.concat([df1, df2], axis=1)
print(result)

Output

pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
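
Before choosing a fix, it can help to confirm which DataFrame actually holds the duplicated labels. The snippet below is a minimal sketch of that check; it rebuilds only the Comments column of df1 from the example above, and the variable names idx and dup_mask are ours, chosen for illustration.

```python
import pandas as pd

# Same index as df1 in the example above: '2022-10-1' appears twice.
idx = pd.DatetimeIndex(['2022-10-1', '2022-6-1', '2022-10-1', '2022-5-6'])
df1 = pd.DataFrame({'Comments': [9999, 6789, 5678, 4722]}, index=idx)

# is_unique is False as soon as one label is repeated.
print(df1.index.is_unique)  # False

# Boolean mask; keep=False marks every occurrence of a repeated label.
dup_mask = df1.index.duplicated(keep=False)

# Show only the rows that share an index label with another row.
print(df1[dup_mask])
```

If is_unique is True for every DataFrame you pass to pd.concat(), the index values are probably not the cause, and the column names are worth checking next.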

How to solve this error?

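Solution 1: Use the DataFrame.reset_index() method

This method does not remove any data. It replaces the index with the default integer index (0, 1, 2, ...) and moves the old index into a regular column named index. Because the new index has no duplicates, pd.concat() can align the two DataFrames, but rows are now matched by position rather than by date.

Syntax:

df.reset_index()

Parameters:

  • df: The data frame.

Return value: DataFrame, or None if inplace = True is passed.

Look at the example below.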

import pandas as pd

df1 = pd.DataFrame(
    {
        'Title': ['Tutorial', 'Learn To Share IT', 'Fix Error', 'Guide'],
        'Author': ['crvt4722', 'noname', 'hq09', 'LSI'],
        'Comments': [9999, 6789, 5678, 4722]
    },
    index=pd.DatetimeIndex(['2022-10-1', '2022-6-1', '2022-10-1', '2022-5-6'])
)

# Replace the duplicated dates with a unique integer index (0..3).
df1 = df1.reset_index()

df2 = pd.DataFrame(
    {
        'New follow': [99, 88, 77, 66, 55],
        'Interaction': ['Good', 'Bad', 'Excellent', 'Not Good', 'Not Bad'],
        'Reader': [
            'Cristiano Ronaldo',
            'Lionel Messi',
            'Neymar',
            'Marcus Rashford',
            'Harry Kane'
        ],
        'Donation': [100000, 400000, 400000, 400000, 400000]
    },
    index=pd.DatetimeIndex(
        ['2022-10-1', '2022-6-1', '2022-1-2', '2022-5-6', '2022-5-1']
    )
)

# Do the same for df2 so that both frames share an integer index (0..4).
df2 = df2.reset_index()

# Rows are now paired by position, not by date.
result = pd.concat([df1, df2], axis=1)
print(result)

Output

       index              Title  ...             Reader  Donation
0 2022-10-01           Tutorial  ...  Cristiano Ronaldo    100000
1 2022-06-01  Learn To Share IT  ...       Lionel Messi    400000
2 2022-10-01          Fix Error  ...             Neymar    400000
3 2022-05-06              Guide  ...    Marcus Rashford    400000
4        NaT                NaN  ...         Harry Kane    400000

[5 rows x 9 columns]
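
Note that the result above contains two columns named index, one from each DataFrame. If you do not need the old dates, passing drop = True to reset_index() discards them instead. The snippet below is a small sketch of that variant on cut-down versions of the two DataFrames; the names left and right are ours.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'Title': ['Tutorial', 'Fix Error']},
    index=pd.DatetimeIndex(['2022-10-1', '2022-10-1']),
)
df2 = pd.DataFrame(
    {'Reader': ['Neymar', 'Harry Kane', 'Lionel Messi']},
    index=pd.DatetimeIndex(['2022-1-2', '2022-5-1', '2022-6-1']),
)

# drop=True discards the old index instead of adding it as an 'index' column.
left = df1.reset_index(drop=True)
right = df2.reset_index(drop=True)

# Rows are paired by position (0, 1, 2); the shorter frame is padded with NaN.
result = pd.concat([left, right], axis=1)
print(result)
```

Keep in mind that pairing by position is only meaningful when the rows of both DataFrames are already in matching order.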

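Solution 2: Use the DataFrame.loc indexer

DataFrame.loc accesses a group of rows and columns by label(s) or a boolean array. Combined with Index.duplicated(), it keeps only the first row for each index label and drops the later duplicates, so the original dates remain as the index. Note that this discards data: in the example, the 'Fix Error' row is dropped because its date repeats the date of 'Tutorial'.

Syntax:

df.loc[~df.index.duplicated(keep = 'first')]

Return value: Scalar, Series, or DataFrame (a DataFrame when a boolean array is passed, as here).

Look at the example below to learn more about this method.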

import pandas as pd

df1 = pd.DataFrame(
    {
        'Title': ['Tutorial', 'Learn To Share IT', 'Fix Error', 'Guide'],
        'Author': ['crvt4722', 'noname', 'hq09', 'LSI'],
        'Comments': [9999, 6789, 5678, 4722]
    },
    index=pd.DatetimeIndex(['2022-10-1', '2022-6-1', '2022-10-1', '2022-5-6'])
)

# Keep the first row for each date; the second '2022-10-1' row is dropped.
df1 = df1.loc[~df1.index.duplicated(keep='first')]

df2 = pd.DataFrame(
    {
        'New follow': [99, 88, 77, 66, 55],
        'Interaction': ['Good', 'Bad', 'Excellent', 'Not Good', 'Not Bad'],
        'Reader': [
            'Cristiano Ronaldo',
            'Lionel Messi',
            'Neymar',
            'Marcus Rashford',
            'Harry Kane'
        ],
        'Donation': [100000, 400000, 400000, 400000, 400000]
    },
    index=pd.DatetimeIndex(
        ['2022-10-1', '2022-6-1', '2022-1-2', '2022-5-6', '2022-5-1']
    )
)

# df2's index is already unique, so this line leaves it unchanged.
df2 = df2.loc[~df2.index.duplicated(keep='first')]

# Both indexes are unique now, so the frames can be aligned by date.
result = pd.concat([df1, df2], axis=1)
print(result)

Output

                        Title    Author  ...             Reader  Donation
2022-01-02                NaN       NaN  ...             Neymar    400000
2022-05-01                NaN       NaN  ...         Harry Kane    400000
2022-05-06              Guide       LSI  ...    Marcus Rashford    400000
2022-06-01  Learn To Share IT    noname  ...       Lionel Messi    400000
2022-10-01           Tutorial  crvt4722  ...  Cristiano Ronaldo    100000

[5 rows x 7 columns]
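
Which duplicate survives depends on the keep argument of Index.duplicated(). The sketch below compares the three accepted values on the Title column of df1; the variable names first, last, and none are ours.

```python
import pandas as pd

idx = pd.DatetimeIndex(['2022-10-1', '2022-6-1', '2022-10-1', '2022-5-6'])
df1 = pd.DataFrame(
    {'Title': ['Tutorial', 'Learn To Share IT', 'Fix Error', 'Guide']},
    index=idx,
)

# keep='first' keeps 'Tutorial' and drops 'Fix Error' for 2022-10-01.
first = df1.loc[~df1.index.duplicated(keep='first')]

# keep='last' keeps 'Fix Error' and drops 'Tutorial' instead.
last = df1.loc[~df1.index.duplicated(keep='last')]

# keep=False drops every row whose index label is repeated.
none = df1.loc[~df1.index.duplicated(keep=False)]

print(first['Title'].tolist())
print(last['Title'].tolist())
print(none['Title'].tolist())
```

All three variants leave a unique index, so any of them lets pd.concat(..., axis = 1) run; they differ only in which rows are lost.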

Summary

You can use the reset_index() method or DataFrame.loc[] with Index.duplicated() to solve the error “Reindexing only valid with uniquely valued Index objects”. reset_index() keeps every row but matches rows by position, while the loc approach keeps the original index but drops the duplicated rows. Choose the solution that suits your data. We hope this tutorial is helpful to you. Have an exciting learning experience. Thanks!
