## The Question

*550 people think this question is useful*

I’ve created a Pandas DataFrame

df = DataFrame(index=['A','B','C'], columns=['x','y'])

and got this

x y
A NaN NaN
B NaN NaN
C NaN NaN

Then I want to assign a value to a particular cell, for example row ‘C’ and column ‘x’.
I expected to get this result:

x y
A NaN NaN
B NaN NaN
C 10 NaN

with this code:

df.xs('C')['x'] = 10

but the contents of `df` haven’t changed. The DataFrame still contains only `NaN`s.

Any suggestions?


## The Answer 1

*688 people think this answer is useful*

RukTech’s answer, `df.set_value('C', 'x', 10)`, is far and away faster than the options I’ve suggested below. However, it has been **slated for deprecation**.

Going forward, the recommended method is `.iat/.at`.

**Why `df.xs('C')['x']=10` does not work:**

`df.xs('C')` by default returns a new object (here a Series) with a copy of the data, so

df.xs('C')['x']=10

modifies this new object only.

`df['x']` returns a view of the `df` dataframe, so

df['x']['C'] = 10

modifies `df` itself.

**Warning**: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with “chained indexing”.

So the recommended alternative is

df.at['C', 'x'] = 10

which *does* modify `df`.

In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop
In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop
In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
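As a minimal sketch of the scalar accessors recommended above, the following rebuilds the DataFrame from the question and sets single cells with `.at` (by label) and `.iat` (by integer position). The value `20` and the choice of the second cell are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Same shape as the DataFrame in the question: all cells start as NaN
df = pd.DataFrame(index=["A", "B", "C"], columns=["x", "y"])

# Label-based scalar setter: row label 'C', column label 'x'
df.at["C", "x"] = 10

# Position-based scalar setter: third row (2), second column (1)
# (the value 20 is only an illustration)
df.iat[2, 1] = 20

print(df)
```

Both calls assign directly on `df`, so there is no intermediate object that could turn out to be a copy.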

## The Answer 2

*235 people think this answer is useful*

Update: The `.set_value` method is going to be deprecated. `.iat/.at` are good replacements; unfortunately, pandas provides little documentation for them.

The fastest way to do this is using `set_value`. This method is ~100 times faster than the `.ix` method. For example:

`df.set_value('C', 'x', 10)`

## The Answer 3

*120 people think this answer is useful*

You can also use a conditional lookup using `.loc`, as seen here:

df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>

where `<some_column_name>` is the column you want to check the `<condition>` variable against and `<another_column_name>` is the column you want to add to (can be a new column or one that already exists). `<value_to_add>` is the value you want to add to that column/row.

This example doesn’t precisely match the question at hand, but it might be useful for someone who wants to add a specific value based on a condition.
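A small sketch of the conditional `.loc` pattern, using made-up data. The column names `score` and `grade` and the threshold are hypothetical, chosen only to show the mask selecting rows and the assignment creating a new column.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1, 5, 9]})

# Rows where the condition holds receive the value;
# 'grade' does not exist yet, so it is created and the other rows stay NaN
df.loc[df["score"] > 4, "grade"] = "pass"

print(df)
```

Rows that do not match the mask are left untouched, which is why the first row of the new column remains missing.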

## The Answer 4

*42 people think this answer is useful*

The recommended way (according to the maintainers) to set a value is:

df.ix['C','x']=10

Using ‘chained indexing’ (`df['x']['C']`) may lead to problems.

See:

## The Answer 5

*40 people think this answer is useful*

Try using `df.loc[row_index,col_indexer] = value`

## The Answer 6

*28 people think this answer is useful*

This is the only thing that worked for me!

df.loc['C', 'x'] = 10

Learn more about `.loc` here.

## The Answer 7

*16 people think this answer is useful*

`.iat/.at` is a good solution.
Suppose you have this simple data_frame:

A B C
0 1 8 4
1 3 9 6
2 22 33 52

If you want to modify the value of the cell `[0,"A"]`, you can use either of these:

`df.iat[0,0] = 2`

`df.at[0,'A'] = 2`

And here is a complete example of how to use `iat` to get and set the value of a cell:

def prepossessing(df):
    for index in range(0,len(df)):
        df.iat[index,0] = df.iat[index,0] * 2
    return df

y_train before:

0
0 54
1 15
2 15
3 8
4 31
5 63
6 11

y_train after calling the `prepossessing` function, which uses `iat` to multiply the value of each cell by 2:

0
0 108
1 30
2 30
3 16
4 62
5 126
6 22
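For reference, here is a self-contained sketch of the same `iat` loop run on the values shown above, next to a column-wise alternative that is not part of the original answer. The function name `double_first_column` and the variable `y_vectorized` are hypothetical.

```python
import pandas as pd

def double_first_column(df):
    # Read and write each cell of the first column by integer position
    for index in range(len(df)):
        df.iat[index, 0] = df.iat[index, 0] * 2
    return df

y_train = pd.DataFrame([54, 15, 15, 8, 31, 63, 11])
y_vectorized = y_train.copy()

y_train = double_first_column(y_train)

# Alternative (an assumption, not from the answer): update the whole column at once
y_vectorized.iloc[:, 0] = y_vectorized.iloc[:, 0] * 2

print(y_train[0].tolist())  # [108, 30, 30, 16, 62, 126, 22]
```

The loop is useful for showing how `iat` reads and writes one cell; for a whole column, the single `iloc` assignment is likely the simpler choice.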

## The Answer 8

*10 people think this answer is useful*

To set values, use:

df.at[0, 'clm1'] = 0

- The fastest recommended method for setting values. `set_value` and `ix` have been deprecated.
- No warning, unlike `iloc` and `loc`.

## The Answer 9

*6 people think this answer is useful*

You can use `.iloc`:

df.iloc[[2], [0]] = 10

## The Answer 10

*6 people think this answer is useful*

In my example, I just change the value in the selected cell:

for index, row in result.iterrows():
    if np.isnan(row['weight']):
        result.at[index, 'weight'] = 0.0

‘result’ is a DataFrame with a column ‘weight’.
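The loop can be tried on a small frame as sketched below. The sample values are invented, and the masked `.loc` assignment at the end is a different, vectorized technique added for comparison rather than something the answer proposes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'result' and 'weight' follow the answer's naming
result = pd.DataFrame({"weight": [1.5, np.nan, 3.0, np.nan]})
vectorized = result.copy()

# Row-by-row version, as in the answer
for index, row in result.iterrows():
    if np.isnan(row["weight"]):
        result.at[index, "weight"] = 0.0

# Vectorized alternative: a boolean mask selects every NaN row at once
vectorized.loc[vectorized["weight"].isna(), "weight"] = 0.0

print(result["weight"].tolist())
print(vectorized["weight"].tolist())
```

Both approaches should leave the same data behind; the masked assignment avoids the Python-level loop.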

## The Answer 11

*4 people think this answer is useful*

`set_value()` is deprecated.

Starting from the release 0.23.4, Pandas “*announces the future*”…

>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 245.0
2 Chevrolet Malibu 190.0
>>> df.set_value(2, 'Prices (U$)', 240.0)
__main__:1: FutureWarning: set_value is deprecated and will be removed in a future release.
Please use .at[] or .iat[] accessors instead
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 245.0
2 Chevrolet Malibu 240.0

Considering this advice, here’s a demonstration of how to use them:

**by row/column integer positions**

>>> df.iat[1, 1] = 260.0
>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 260.0
2 Chevrolet Malibu 240.0

**by row/column labels**

>>> df.at[2, "Cars"] = "Chevrolet Corvette"
>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 260.0
2 Chevrolet Corvette 240.0


## The Answer 12

*4 people think this answer is useful*

I tested, and the result is that `df.set_value` is a little faster, but the official method `df.at` looks like the fastest non-deprecated way to do it.

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(100, 100))
%timeit df.iat[50,50]=50 # ✓
%timeit df.at[50,50]=50 # ✔
%timeit df.set_value(50,50,50) # will deprecate
%timeit df.iloc[50,50]=50
%timeit df.loc[50,50]=50
7.06 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.52 µs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.68 µs ± 80.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
98.7 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
109 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Note that this sets the value of a single cell. For vectors, `loc` and `iloc` should be better options since they are vectorized.
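To illustrate the multi-cell case, here is a short sketch of setting several cells in one call with `loc` and `iloc`. The frame and the values are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical 5x3 frame of zeros
df = pd.DataFrame(np.zeros((5, 3)), columns=["a", "b", "c"])

# loc slices by label and includes both endpoints: rows 1, 2 and 3
df.loc[1:3, "b"] = 7.0

# iloc slices by position and excludes the stop: rows 0 and 1, third column
df.iloc[0:2, 2] = [1.0, 2.0]

print(df)
```

Each statement updates a whole block of cells, which is where `loc` and `iloc` tend to pay off compared with repeated single-cell calls.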

## The Answer 13

*3 people think this answer is useful*

Here is a summary of the valid solutions provided by all users, for data frames indexed by integer and string.

df.iloc, df.loc and df.at work for both types of data frames. df.iloc only works with row/column integer positions, while df.loc and df.at support setting values using column names and/or integer indices.

When the specified index does not exist, both df.loc and df.at append the new rows/columns to the existing data frame, but df.iloc raises “IndexError: positional indexers are out-of-bounds”. A working example tested in Python 2.7 and 3.7 is as follows:

import numpy as np, pandas as pd
df1 = pd.DataFrame(index=np.arange(3), columns=['x','y','z'])
df1['x'] = ['A','B','C']
df1.at[2,'y'] = 400
# rows/columns specified does not exist, appends new rows/columns to existing data frame
df1.at['D','w'] = 9000
df1.loc['E','q'] = 499
# using df[<some_column_name>] == <condition> to retrieve target rows
df1.at[df1['x']=='B', 'y'] = 10000
df1.loc[df1['x']=='B', ['z','w']] = 10000
# using a list of index to setup values
df1.iloc[[1,2,4], 2] = 9999
df1.loc[[0,'D','E'],'w'] = 7500
df1.at[[0,2,"D"],'x'] = 10
df1.at[:, ['y', 'w']] = 8000
df1
>>> df1
x y z w q
0 10 8000 NaN 8000 NaN
1 B 8000 9999 8000 NaN
2 10 8000 9999 8000 NaN
D 10 8000 NaN 8000 NaN
E NaN 8000 9999 8000 499.0
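The enlargement behaviour described in this answer can be reduced to a scalar-only sketch like the one below. It assumes a reasonably recent pandas and uses made-up labels; the exact error message from `iloc` may differ between versions, so only the exception type is checked.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0]}, index=["A", "B"])

# Assigning to a row label that does not exist appends a new row
df.loc["C", "x"] = 3.0
df.at["D", "x"] = 4.0

# iloc is purely positional and cannot enlarge the frame
try:
    df.iloc[10, 0] = 5.0
    iloc_error = None
except IndexError as exc:
    iloc_error = exc

print(df)
print(type(iloc_error).__name__)
```

Only scalar keys are passed to `.at` here, since that is the documented use of the accessor.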

## The Answer 14

*3 people think this answer is useful*

One way to use an index with a condition is to first get the index of all the rows that satisfy your condition and then use those row indexes in a number of ways:

conditional_index = df.loc[ df['col name'] <condition> ].index

Example conditions look like

==5, >10 , =="Any string", >= DateTime

Then you can use these row indexes in a variety of ways, such as:

- Replace the value of one column for conditional_index

df.loc[conditional_index , [col name]]= <new value>

- Replace the value of multiple columns for conditional_index

df.loc[conditional_index, [col1,col2]]= <new value>

- One benefit of saving the conditional_index is that you can assign the value of one column to another column with the same row index

df.loc[conditional_index, [col1,col2]]= df.loc[conditional_index,'col name']

This is all possible because .index returns an array of index labels, which .loc can use for direct addressing, so it avoids repeated traversals.
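A concrete sketch of the saved-index pattern, with hypothetical column names `a`, `b` and `c` standing in for the placeholders above:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"a": [1, 5, 10, 5], "b": [0, 0, 0, 0], "c": [0, 0, 0, 0]})

# Compute the matching row labels once and reuse them
conditional_index = df.loc[df["a"] == 5].index

# Replace the value of one column for those rows
df.loc[conditional_index, "b"] = 99

# Copy the value of one column into another for the same rows
df.loc[conditional_index, "c"] = df.loc[conditional_index, "a"]

print(df)
```

Because the right-hand side in the last assignment carries the same row labels, pandas aligns it with the target rows.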

## The Answer 15

*2 people think this answer is useful*

`df.loc['C','x']=10`

This changes the value at row `C` and column `x`.

## The Answer 16

*2 people think this answer is useful*

I would suggest:

df.loc[index_position, "column_name"] = some_value

## The Answer 17

*1 people think this answer is useful*

In addition to the answers above, here is a benchmark comparing different ways to add rows of data to an already existing dataframe. It shows that using `at` or `set_value` is the most efficient way for large dataframes (at least for these test conditions).

- Create a new dataframe for each row and…
  - … append it (13.0 s)
  - … concatenate it (13.1 s)
- Store all new rows in another container first, convert to a new dataframe once and append…
  - container = list of lists (2.0 s)
  - container = dictionary of lists (1.9 s)
- Preallocate the whole dataframe, iterate over new rows and all columns and fill using
  - … at (0.6 s)
  - … set_value (0.4 s)

For the test, an existing dataframe comprising 100,000 rows and 1,000 columns of random numpy values was used. To this dataframe, 100 new rows were added.

The code is below:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Nov 21 16:38:46 2018
@author: gebbissimo
"""
import pandas as pd
import numpy as np
import time
NUM_ROWS = 100000
NUM_COLS = 1000
data = np.random.rand(NUM_ROWS,NUM_COLS)
df = pd.DataFrame(data)
NUM_ROWS_NEW = 100
data_tot = np.random.rand(NUM_ROWS + NUM_ROWS_NEW,NUM_COLS)
df_tot = pd.DataFrame(data_tot)
DATA_NEW = np.random.rand(1,NUM_COLS)
#%% FUNCTIONS
# create and append
def create_and_append(df):
    for i in range(NUM_ROWS_NEW):
        df_new = pd.DataFrame(DATA_NEW)
        df = df.append(df_new)
    return df
# create and concatenate
def create_and_concat(df):
    for i in range(NUM_ROWS_NEW):
        df_new = pd.DataFrame(DATA_NEW)
        df = pd.concat((df, df_new))
    return df
# store as list of lists and append once
def store_as_list(df):
    lst = [[] for i in range(NUM_ROWS_NEW)]
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            lst[i].append(DATA_NEW[0,j])
    df_new = pd.DataFrame(lst)
    df_tot = df.append(df_new)
    return df_tot
# store as dict of lists and append once
def store_as_dict(df):
    dct = {}
    for j in range(NUM_COLS):
        dct[j] = []
        for i in range(NUM_ROWS_NEW):
            dct[j].append(DATA_NEW[0,j])
    df_new = pd.DataFrame(dct)
    df_tot = df.append(df_new)
    return df_tot
# preallocate and fill using .at
def fill_using_at(df):
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            #print("i,j={},{}".format(i,j))
            df.at[NUM_ROWS+i,j] = DATA_NEW[0,j]
    return df
# preallocate and fill using set_value
def fill_using_set(df):
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            #print("i,j={},{}".format(i,j))
            df.set_value(NUM_ROWS+i,j,DATA_NEW[0,j])
    return df
#%% TESTS
t0 = time.time()
create_and_append(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
t0 = time.time()
create_and_concat(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
t0 = time.time()
store_as_list(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
t0 = time.time()
store_as_dict(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
t0 = time.time()
fill_using_at(df_tot)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
t0 = time.time()
fill_using_set(df_tot)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
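The script above relies on `DataFrame.append` and `set_value`, both of which were removed in later pandas releases (`set_value` in 1.0, `append` in 2.0). As a rough sketch of the "collect rows first, convert once" strategy on a current pandas, `pd.concat` can take the place of `append`. The sizes here are much smaller than in the benchmark and are illustrative only, so no timings are implied.

```python
import numpy as np
import pandas as pd

NUM_ROWS, NUM_COLS, NUM_ROWS_NEW = 1000, 10, 100  # illustrative sizes

df = pd.DataFrame(np.random.rand(NUM_ROWS, NUM_COLS))

# Collect the new rows in a plain list first
new_rows = [np.random.rand(NUM_COLS) for _ in range(NUM_ROWS_NEW)]

# Convert once and concatenate once
df_new = pd.DataFrame(new_rows)
df_tot = pd.concat([df, df_new], ignore_index=True)

print(df_tot.shape)  # (1100, 10)
```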

## The Answer 18

*1 people think this answer is useful*

So, your question is how to change the NaN at [‘C’, ‘x’] to the value 10.

The answer is:

df['x'].loc['C':]=10
df

Alternative code is:

df.loc['C', 'x']=10
df

## The Answer 19

*0 people think this answer is useful*

If you want to change values for only some columns rather than the whole row:

x = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
x.iloc[1] = dict(A=10, B=-10)

## The Answer 20

*0 people think this answer is useful*

From version 0.21.1 you can also use the `.at` method. There are some differences compared to `.loc`, as mentioned here – pandas .at versus .loc, but it’s faster for single-value replacement.

## The Answer 21

*-4 people think this answer is useful*

I too was searching for this topic and I put together a way to iterate through a DataFrame and update it with lookup values from a second DataFrame. Here is my code.

src_df = pd.read_sql_query(src_sql,src_connection)
for index1, row1 in src_df.iterrows():
    for index, row in vertical_df.iterrows():
        src_df.set_value(index=index1,col=u'etl_load_key',value=etl_load_key)
        if (row1[u'src_id'] == row['SRC_ID']) is True:
            src_df.set_value(index=index1,col=u'vertical',value=row['VERTICAL'])
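For comparison, the same kind of lookup can be sketched without nested loops by mapping one column through a Series built from the second DataFrame. This is a different technique from the answer's, and the data and the `etl_load_key` value below are hypothetical stand-ins for the SQL query results.

```python
import pandas as pd

# Hypothetical stand-ins for the two DataFrames in the answer
src_df = pd.DataFrame({"src_id": [1, 2, 3]})
vertical_df = pd.DataFrame({"SRC_ID": [1, 3], "VERTICAL": ["retail", "finance"]})
etl_load_key = 42  # assumed constant, as in the answer's loop

# A scalar assignment fills the whole column
src_df["etl_load_key"] = etl_load_key

# Build a lookup Series keyed by SRC_ID and map src_id through it;
# ids without a match become NaN
lookup = vertical_df.set_index("SRC_ID")["VERTICAL"]
src_df["vertical"] = src_df["src_id"].map(lookup)

print(src_df)
```

This assumes the `SRC_ID` values are unique; with duplicates, a merge would be the more appropriate tool.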