Home » Python » Combining two Series into a DataFrame in pandas

Combining two Series into a DataFrame in pandas

Posted by: admin November 1, 2017 Leave a comment


I have two Series s1 and s2 with the same (non-consecutive) indices. How do I combine s1 and s2 to being two columns in a DataFrame and keep one of the indices as a third column?


I think concat is a nice way to do this. If they are present it uses the name attributes of the Series as the columns (otherwise it simply numbers them):

In [1]: s1 = pd.Series([1, 2], index=['A', 'B'], name='s1')

In [2]: s2 = pd.Series([3, 4], index=['A', 'B'], name='s2')

In [3]: pd.concat([s1, s2], axis=1)
   s1  s2
A   1   3
B   2   4

In [4]: pd.concat([s1, s2], axis=1).reset_index()
  index  s1  s2
0     A   1   3
1     B   2   4

Note: This extends to more than 2 Series.


Pandas will automatically align these passed in series and create the joint index
They happen to be the same here. reset_index moves the index to a column.

In [2]: s1 = Series(randn(5),index=[1,2,4,5,6])

In [4]: s2 = Series(randn(5),index=[1,2,4,5,6])

In [8]: DataFrame(dict(s1 = s1, s2 = s2)).reset_index()
   index        s1        s2
0      1 -0.176143  0.128635
1      2 -1.286470  0.908497
2      4 -0.995881  0.528050
3      5  0.402241  0.458870
4      6  0.380457  0.072251


Why don’t you just use .to_frame if both have the same indexes?



Example code:

a = pd.Series([1,2,3,4], index=[7,2,8,9])
b = pd.Series([5,6,7,8], index=[7,2,8,9])
data = pd.DataFrame({'a': a,'b':b, 'idx_col':a.index})

Pandas allows you to create a DataFrame from a dict with Series as the values and the column names as the keys. When it finds a Series as a value, it uses the Series index as part of the DataFrame index. This data alignment is one of the main perks of Pandas. Consequently, unless you have other needs, the freshly created DataFrame has duplicated value. In the above example, data['idx_col'] has the same data as data.index.


Not sure I fully understand your question, but is this what you want to do?

pd.DataFrame(data=dict(s1=s1, s2=s2), index=s1.index)

(index=s1.index is not even necessary here)