I have two CSV files with the following schemas:
CSV1 columns:
"Id","First","Last","Email","Company"
CSV2 columns:
"PersonId","FirstName","LastName","Em","FavoriteFood"
If I load each of them into a pandas DataFrame and run newdf = df1.merge(df2, how='outer', left_on=['Last', 'First'], right_on=['LastName', 'FirstName'])
then the CSV export of the joined DataFrame has the following schema:
"Id","First","Last","Email","Company","PersonId","FirstName","LastName","Em","FavoriteFood"
What I want is an output schema more like this:
"Id","First","Last","Email","Company","PersonId","Em","FavoriteFood"
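Most relational database software I'm familiar with does it this way (the left-side join column names win the naming war). Does pandas have syntax to tell it to do this?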
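I can do df1.merge(df2.rename(columns={'LastName': 'Last', 'FirstName': 'First'}), how='outer', on=['Last', 'First'])
but, stylistically, hard-coding the same column names twice in my source drives me crazy. If I ever change the column names in the CSV files, that is one more place to fix.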
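For reference, the rename approach can be written so that each column name appears only once, by keeping the right-to-left key mapping in a single dict. This is only a minimal sketch: the sample rows and the name `key_map` are made up for illustration and are not part of my real data.

```python
import pandas as pd

# Hypothetical sample rows that mirror the two CSV schemas.
df1 = pd.DataFrame({
    "Id": [1, 2],
    "First": ["Ann", "Bob"],
    "Last": ["Lee", "Ray"],
    "Email": ["ann@example.com", "bob@example.com"],
    "Company": ["Acme", "Initech"],
})
df2 = pd.DataFrame({
    "PersonId": [10, 11],
    "FirstName": ["Ann", "Cy"],
    "LastName": ["Lee", "Fox"],
    "Em": ["ann@example.org", "cy@example.org"],
    "FavoriteFood": ["Pie", "Soup"],
})

# Single source of truth: right-hand key name -> left-hand key name.
key_map = {"LastName": "Last", "FirstName": "First"}

# Rename the right-hand keys, then join on the shared (left-hand) names.
newdf = df1.merge(
    df2.rename(columns=key_map),
    how="outer",
    on=list(key_map.values()),
)

print(list(newdf.columns))
```

With this layout, a change to the CSV headers only touches `key_map`.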
One way is to merge exactly as you do now, then drop the columns you don't want.
newdf.drop(columns=['LastName', 'FirstName'], inplace=True)
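Here is a small end-to-end sketch of the merge-then-drop idea. The sample rows and the names `left_keys` and `right_keys` are assumptions made for illustration. One caveat, as far as I can tell: with how='outer' and left_on/right_on, rows that exist only in df2 end up with missing values in 'Last' and 'First', so the sketch copies the right-hand key values across before dropping them. If you don't need those rows, that step can be skipped.

```python
import pandas as pd

# Hypothetical sample rows that mirror the two CSV schemas.
df1 = pd.DataFrame({
    "Id": [1, 2],
    "First": ["Ann", "Bob"],
    "Last": ["Lee", "Ray"],
    "Email": ["ann@example.com", "bob@example.com"],
    "Company": ["Acme", "Initech"],
})
df2 = pd.DataFrame({
    "PersonId": [10, 11],
    "FirstName": ["Ann", "Cy"],
    "LastName": ["Lee", "Fox"],
    "Em": ["ann@example.org", "cy@example.org"],
    "FavoriteFood": ["Pie", "Soup"],
})

# Key names are written once and reused for both the merge and the drop.
left_keys = ["Last", "First"]
right_keys = ["LastName", "FirstName"]

newdf = df1.merge(df2, how="outer", left_on=left_keys, right_on=right_keys)

# Fill left-hand keys from the right-hand ones for rows found only in df2.
for left, right in zip(left_keys, right_keys):
    newdf[left] = newdf[left].fillna(newdf[right])

# Drop the now-redundant right-hand key columns.
newdf = newdf.drop(columns=right_keys)

print(list(newdf.columns))
```

Reassigning the result of `drop` instead of using `inplace=True` is only a style choice here; both leave the same columns.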