Unable to assign values to a pandas data frame column from a list using iteration





I have reduced my original code to a much simplified version, but it still shows the main problem.
I am using the following code:


import pandas as pd

Sp = pd.DataFrame()
l1 = ['a', 'b', 'c']
for i in l1:
    Sp['col1'] = i



This gives me Sp with only the column header and no rows:


col1



I want col1 to hold the values a, b and c. Could anyone please explain why this is happening and how to fix it?



EDIT:



For every value in my list, I use it to open a different file using os (the file names are built from the list values). After reading the csv file, I compute values such as the mean, deviation etc. of its data and assign those values to Sp in other columns. My final Sp should look something like this:


col1 Mean Median Deviation
a 1 1.1 0.5
b 2 2.1 0.5
c 3 3.1 0.5





@jezrael I am not using the loop only for assignment. There are other operations in every iteration of the loop too.
– A.DS
Jun 29 at 10:30





1 Answer



EDIT: If each iteration needs to create a DataFrame and process it, append the resulting DataFrame to a list of DataFrames inside the loop. At the end, concat all the aggregated DataFrames together:




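import pandas as pd

dfs = []
l1 = ['a', 'b', 'c']
for i in l1:
    file = f'{i}.csv'  # hypothetical path; build it from the list value to match your file names
    df = pd.read_csv(file)
    df = df.groupby('col').agg({'col1':'mean', 'col2':'sum'})
    # other per-file processing goes here
    dfs.append(df)

Sp = pd.concat(dfs, ignore_index=True)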
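As a rough sketch of how the collect-in-a-list idea above could produce the table from the question: compute one row of statistics per list value, keep the rows in a plain Python list, and build the DataFrame once at the end. The in-memory CSV text in `fake_files` and the column name `value` are assumptions standing in for the real files, which the question does not show.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the per-item CSV files; real code would read from disk.
fake_files = {
    'a': "value\n0.5\n1.0\n1.5\n",
    'b': "value\n1.5\n2.0\n2.5\n",
    'c': "value\n2.5\n3.0\n3.5\n",
}

l1 = ['a', 'b', 'c']
rows = []
for i in l1:
    df = pd.read_csv(io.StringIO(fake_files[i]))
    # One dict per list value; other per-file work can happen here too.
    rows.append({
        'col1': i,
        'Mean': df['value'].mean(),
        'Median': df['value'].median(),
        'Deviation': df['value'].std(),  # sample standard deviation
    })

# Build the frame once, after the loop, instead of growing it row by row.
Sp = pd.DataFrame(rows)
print(Sp)
```

The loop stays, so any extra per-file work still fits inside it, but the DataFrame itself is only constructed once.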



Old answer:



I think you need to call the DataFrame constructor with the list:




Sp = pd.DataFrame({'col1':l1})
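A small sketch of why the loop in the question leaves col1 empty while the constructor call does not: assigning a scalar to a column broadcasts it over the rows that already exist, and an empty DataFrame has none. The variable name `empty` is only for illustration.

```python
import pandas as pd

l1 = ['a', 'b', 'c']

# Scalar assignment fills the existing rows; an empty frame has zero rows,
# so each iteration only (re)creates the column without adding data.
empty = pd.DataFrame()
for i in l1:
    empty['col1'] = i
print(len(empty), list(empty.columns))

# Passing the whole list creates one row per element.
Sp = pd.DataFrame({'col1': l1})
print(Sp['col1'].tolist())
```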



If you really need to fill it inside the loop, this works, but it is the slowest possible solution:



6) updating an empty frame a-single-row-at-a-time. I have seen this method used WAY too much. It is by far the slowest. It is probably common place (and reasonably fast for some python structures), but a DataFrame does a fair number of checks on indexing, so this will always be very slow to update a row at a time. Much better to create new structures and concat.


Sp = pd.DataFrame()
l1 = ['a', 'b', 'c']
for j, i in enumerate(l1):
    Sp.loc[j, 'col1'] = i

print (Sp)
  col1
0    a
1    b
2    c





Thanks. It is a large data set, so I wouldn't want it to slow down. Is there any other way that still uses the loop?
– A.DS
Jun 29 at 10:40





@A.DS - Is the loop necessary? Can you explain more?
– jezrael
Jun 29 at 10:41





Use [assign][1], which assigns new columns to a DataFrame and returns a new object (a copy) with the new columns added to the original ones: Sp = Sp.assign(col1=l1) [1]: pandas.pydata.org/pandas-docs/stable/generated/…
– Vikash Kumar
Jun 29 at 10:42





For every value in my list, I use it to open a different file using os. After reading the csv file, I compute values such as the mean, deviation etc. of its data and assign those values to Sp in another column.
– A.DS
Jun 29 at 10:43






@A.DS - I think I understand, please check edited answer.
– jezrael
Jun 29 at 11:19





