熊猫:减去两个日期列,结果为整数
问题内容:
我在熊猫数据框中有两列是日期。
我正在寻找从另一列中减去一列,结果是天数之差 为整数 。
看一下数据:
df_test.head(10)
Out[20]:
First_Date Second Date
0 2016-02-09 2015-11-19
1 2016-01-06 2015-11-30
2 NaT 2015-12-04
3 2016-01-06 2015-12-08
4 NaT 2015-12-09
5 2016-01-07 2015-12-11
6 NaT 2015-12-12
7 NaT 2015-12-14
8 2016-01-06 2015-12-14
9 NaT 2015-12-15
我成功创建了一个新列,区别在于:
df_test['Difference'] = df_test['First_Date'].sub(df_test['Second Date'], axis=0)
df_test.head()
Out[22]:
First_Date Second Date Difference
0 2016-02-09 2015-11-19 82 days
1 2016-01-06 2015-11-30 37 days
2 NaT 2015-12-04 NaT
3 2016-01-06 2015-12-08 29 days
4 NaT 2015-12-09 NaT
但是我无法获得结果的数字版本:
df_test['Difference'] = df_test[['Difference']].apply(pd.to_numeric)
df_test.head()
Out[25]:
First_Date Second Date Difference
0 2016-02-09 2015-11-19 7.084800e+15
1 2016-01-06 2015-11-30 3.196800e+15
2 NaT 2015-12-04 NaN
3 2016-01-06 2015-12-08 2.505600e+15
4 NaT 2015-12-09 NaN
问题答案:
您可以将的列dtype
timedelta
除以np.timedelta64(1, 'D')
,但输出不是int
,而是float
,因为NaN
值是:
df_test['Difference'] = df_test['Difference'] / np.timedelta64(1, 'D')
print (df_test)
First_Date Second Date Difference
0 2016-02-09 2015-11-19 82.0
1 2016-01-06 2015-11-30 37.0
2 NaT 2015-12-04 NaN
3 2016-01-06 2015-12-08 29.0
4 NaT 2015-12-09 NaN
5 2016-01-07 2015-12-11 27.0
6 NaT 2015-12-12 NaN
7 NaT 2015-12-14 NaN
8 2016-01-06 2015-12-14 23.0
9 NaT 2015-12-15 NaN
变频器。