Unable to assign values to a pandas DataFrame column from a list using iteration
I have reduced my original code to a much simpler version, but this is where the main problem occurs.
Using the following code:
import pandas as pd

Sp = pd.DataFrame()
l1 = ['a', 'b', 'c']
for i in l1:
    Sp['col1'] = i
Gives me the result Sp as:
col1
I want col1 to have the values a, b and c. Could anyone please suggest why this is happening and how to fix it?
EDIT:
For every value in my list, I use it to open a different file using os (the file names are made up of the list values). After reading the csv file, I compute values such as the mean, deviation etc. of its data and assign those values to Sp in other columns. My final Sp should look something like this:
col1  Mean  Median  Deviation
a     1     1.1     0.5
b     2     2.1     0.5
c     3     3.1     0.5
1 Answer
EDIT: If you need to create a DataFrame in each iteration and process it, append the resulting DataFrame to a list of DataFrames inside the loop. At the end, concat all the aggregated DataFrames together:
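import pandas as pd

dfs = []
l1 = ['a', 'b', 'c']
for i in l1:
    file = f'{i}.csv'  # hypothetical path; build the real one from i
    df = pd.read_csv(file)
    df = df.groupby('col').agg({'col1': 'mean', 'col2': 'sum'})
    # other processing here
    dfs.append(df)

Sp = pd.concat(dfs, ignore_index=True)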
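As a rough sketch of how this pattern could produce the layout asked for in the question: each iteration builds a one-row DataFrame of summary statistics, and the rows are concatenated once after the loop. The in-memory CSV text, the `fake_files` dict and the `value` column name are assumptions standing in for the real files, which the question does not show.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV files on disk: name -> CSV text.
fake_files = {
    'a': "value\n1\n2\n3\n",
    'b': "value\n2\n4\n6\n",
    'c': "value\n10\n20\n30\n",
}

l1 = ['a', 'b', 'c']
rows = []
for i in l1:
    # In real code this would be pd.read_csv(<path built from i>).
    df = pd.read_csv(io.StringIO(fake_files[i]))
    # One-row DataFrame holding the statistics for this file.
    rows.append(pd.DataFrame({
        'col1': [i],
        'Mean': [df['value'].mean()],
        'Median': [df['value'].median()],
        'Deviation': [df['value'].std()],
    }))

# Combine all one-row frames in a single step after the loop.
Sp = pd.concat(rows, ignore_index=True)
print(Sp)
```

The loop is kept, so other per-file operations can still run in each iteration; only the growing of `Sp` is moved out of it.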
Old answer:
I think you need to call the DataFrame constructor with the list:
Sp = pd.DataFrame({'col1':l1})
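A short sketch of what seems to happen in the loop from the question: assigning a scalar to a column broadcasts it over the existing rows, and an empty DataFrame has none, so every iteration just creates or overwrites a zero-length column. Passing the whole list is what gives the frame its rows. The names `empty` and `filled` are only for illustration.

```python
import pandas as pd

l1 = ['a', 'b', 'c']

# The loop from the question: each scalar is broadcast over zero rows.
empty = pd.DataFrame()
for i in l1:
    empty['col1'] = i
print(len(empty))   # the column exists, but no rows were added

# Passing the full list creates one row per element.
filled = pd.DataFrame({'col1': l1})
print(len(filled))
```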
If you really need the loop, this works, but it is the slowest possible solution:
6) updating an empty frame a-single-row-at-a-time. I have seen this method used WAY too much. It is by far the slowest. It is probably common place (and reasonably fast for some python structures), but a DataFrame does a fair number of checks on indexing, so this will always be very slow to update a row at a time. Much better to create new structures and concat.
Sp = pd.DataFrame()
l1 = ['a', 'b', 'c']
for j, i in enumerate(l1):
    Sp.loc[j, 'col1'] = i

print(Sp)
  col1
0    a
1    b
2    c
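If the loop has to stay, one commonly used alternative to updating `Sp` a row at a time is to collect plain Python dicts in a list and build the DataFrame once after the loop. A minimal sketch; the `Length` column is a hypothetical placeholder for whatever is actually computed in each iteration:

```python
import pandas as pd

l1 = ['a', 'b', 'c']
records = []
for i in l1:
    # Other per-iteration work would go here; 'Length' is a made-up example value.
    records.append({'col1': i, 'Length': len(i)})

# A single constructor call instead of one indexed write per row.
Sp = pd.DataFrame(records)
print(Sp)
```

Appending to a list is cheap, so the indexing checks that make `.loc` updates slow are paid only once, when the frame is built.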
Thanks. It is a large data set, so I wouldn't want it to slow down. Is there any other way to still use the loop?
– A.DS
Jun 29 at 10:40
@A.DS - Is the loop necessary? Can you explain more?
– jezrael
Jun 29 at 10:41
Use assign, which adds new columns to a DataFrame and returns a new object (a copy) with the new columns added to the original ones: Sp = Sp.assign(col1=l1). See pandas.pydata.org/pandas-docs/stable/generated/…
– Vikash Kumar
Jun 29 at 10:42
For every value in my list, I use it to open a different file using os. After reading the csv file, I compute values such as the mean, deviation etc. of its data and assign those values to Sp in another column.
– A.DS
Jun 29 at 10:43
@A.DS - I think I understand, please check edited answer.
– jezrael
Jun 29 at 11:19
@jezrael I am not using the loop just for the assignment. There are other operations too in every iteration of the loop.
– A.DS
Jun 29 at 10:30