python – How to add pandas data to an existing csv file?

The Question :

308 people think this question is useful

I want to know if it is possible to use the pandas to_csv() function to add a dataframe to an existing csv file. The csv file has the same structure as the loaded data.

The Question Comments :
  • I think method suggested by @tlingf is better only because he is using build-in functionality of pandas library. He suggests define mode as “a” . “A” stands for APPEND ‘df.to_csv(‘my_csv.csv’, mode=’a’, header=False)’
  • The answer from @KCzar considers both the cases when the CSV file is not there (i.e. add the column header) and when the CSV is already there (so add just the data rows without headers). In any case it uses the “append” mode and a custom separator, along with checks on the number of columns.

The Answer 1

635 people think this answer is useful

You can specify a python write mode in the pandas to_csv function. For append it is ‘a’.

In your case:

df.to_csv('my_csv.csv', mode='a', header=False)

The default mode is ‘w’.

The Answer 2

261 people think this answer is useful

You can append to a csv by opening the file in append mode:

with open('my_csv.csv', 'a') as f:
    df.to_csv(f, header=False)

If this was your csv, foo.csv:

,A,B,C
0,1,2,3
1,4,5,6

If you read that and then append, for example, df + 6:

In [1]: df = pd.read_csv('foo.csv', index_col=0)

In [2]: df
Out[2]:
   A  B  C
0  1  2  3
1  4  5  6

In [3]: df + 6
Out[3]:
    A   B   C
0   7   8   9
1  10  11  12

In [4]: with open('foo.csv', 'a') as f:
             (df + 6).to_csv(f, header=False)

foo.csv becomes:

,A,B,C
0,1,2,3
1,4,5,6
0,7,8,9
1,10,11,12

The Answer 3

73 people think this answer is useful
with open(filename, 'a') as f:
    df.to_csv(f, header=f.tell()==0)

  • Create file unless exists, otherwise append
  • Add header if file is being created, otherwise skip it

The Answer 4

21 people think this answer is useful

A little helper function I use with some header checking safeguards to handle it all:

def appendDFToCSV_void(df, csvFilePath, sep=","):
    import os
    if not os.path.isfile(csvFilePath):
        df.to_csv(csvFilePath, mode='a', index=False, sep=sep)
    elif len(df.columns) != len(pd.read_csv(csvFilePath, nrows=1, sep=sep).columns):
        raise Exception("Columns do not match!! Dataframe has " + str(len(df.columns)) + " columns. CSV file has " + str(len(pd.read_csv(csvFilePath, nrows=1, sep=sep).columns)) + " columns.")
    elif not (df.columns == pd.read_csv(csvFilePath, nrows=1, sep=sep).columns).all():
        raise Exception("Columns and column order of dataframe and csv file do not match!!")
    else:
        df.to_csv(csvFilePath, mode='a', index=False, sep=sep, header=False)

The Answer 5

4 people think this answer is useful

Initially starting with a pyspark dataframes – I got type conversion errors (when converting to pandas df’s and then appending to csv) given the schema/column types in my pyspark dataframes

Solved the problem by forcing all columns in each df to be of type string and then appending this to csv as follows:

with open('testAppend.csv', 'a') as f:
    df2.toPandas().astype(str).to_csv(f, header=False)

The Answer 6

3 people think this answer is useful

A bit late to the party but you can also use a context manager, if you’re opening and closing your file multiple times, or logging data, statistics, etc.

from contextlib import contextmanager
import pandas as pd
@contextmanager
def open_file(path, mode):
     file_to=open(path,mode)
     yield file_to
     file_to.close()


##later
saved_df=pd.DataFrame(data)
with open_file('yourcsv.csv','r') as infile:
      saved_df.to_csv('yourcsv.csv',mode='a',header=False)`

Add a Comment