Python: data argument can't be an iterator


Question

I am trying to replicate the code provided here: https://github.com/IdoZehori/Credit-Score/blob/master/Credit%20score.ipynb

The function given below fails to run and raises an error. Can someone help me fix it?

def replaceOutlier(data, method = outlierVote, replace='median'):
    '''replace: median (auto)
                'minUpper' which is the upper bound of the outlier detection'''
    vote = outlierVote(data)
    x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
    if replace == 'median':
        replace = x.debt.median()
    elif replace == 'minUpper':
        replace = min([val for (val, vote) in list(zip(data, vote)) if vote == True])
        if replace < data.mean():
            return 'There are outliers lower than the sample mean'
    debtNew = []
    for i in range(x.shape[0]):
        if x.iloc[i][1] == True:
            debtNew.append(replace)
        else:
            debtNew.append(x.iloc[i][0])

    return debtNew

Function call:

incomeNew = replaceOutlier(df.annual_income, replace='minUpper')

Error:

x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
TypeError: data argument can't be an iterator

PS: I know this question has been asked before. I tried the techniques suggested there, but the error persists.


Answer:

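zip cannot be passed directly: in Python 3 it returns an iterator, which the DataFrame constructor rejects here. Convert the result to a list first, i.e.: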

x = pd.DataFrame(list(zip(data, vote)), columns=['annual_income', 'outlier'])
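To see why the list() call matters, here is a minimal sketch. The values in data and vote are made up for illustration and stand in for df.annual_income and the output of outlierVote, neither of which is shown in the question.

```python
import pandas as pd

# Hypothetical sample values standing in for df.annual_income and the outlier votes.
data = [30000, 45000, 1000000]
vote = [False, False, True]

pairs = zip(data, vote)
# In Python 3, zip returns a lazy iterator, not a list:
# calling iter() on an iterator returns the same object.
is_iterator = iter(pairs) is pairs

# list() materializes the iterator into concrete row tuples that
# the DataFrame constructor accepts regardless of pandas version.
x = pd.DataFrame(list(pairs), columns=['annual_income', 'outlier'])
print(x)
```

Because list() consumes the iterator, pairs is empty afterwards. Call zip again if the pairs are needed a second time.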

Edit (from bayethierno's answer): as of pandas 0.24.0, there is no need to build a list from zip. The following statement works:

x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
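If the code has to run on both older and newer pandas versions, one option is to avoid zip altogether. The sketch below is not the notebook author's code. It rewrites the function with a boolean mask and Series.where, under some assumptions. outlier_vote_stub is a hypothetical stand-in for outlierVote, which the question does not show. The 'median' branch uses data.median(), because the original reads x.debt although the frame has no debt column. The error case raises ValueError instead of returning a string.

```python
import pandas as pd

def outlier_vote_stub(data):
    # Hypothetical stand-in for the notebook's outlierVote (not shown in the
    # question): flags values more than 1.5 * IQR above the third quartile.
    q1, q3 = data.quantile(0.25), data.quantile(0.75)
    return data > q3 + 1.5 * (q3 - q1)

def replace_outlier(data, method=outlier_vote_stub, replace='median'):
    # Align the votes with the data's index so the mask lines up row by row.
    vote = pd.Series(method(data), index=data.index).astype(bool)
    if replace == 'median':
        value = data.median()
    elif replace == 'minUpper':
        # Smallest value that was flagged as an outlier.
        value = data[vote].min()
        if value < data.mean():
            raise ValueError('There are outliers lower than the sample mean')
    else:
        raise ValueError("replace must be 'median' or 'minUpper'")
    # Keep values where the vote is False, substitute `value` where it is True.
    return data.where(~vote, value).tolist()

# Usage with made-up numbers; 100 is the only value flagged by the stub.
income = pd.Series([10, 12, 11, 13, 12, 100])
cleaned = replace_outlier(income)
capped = replace_outlier(income, replace='minUpper')
```

The mask-based version never builds the intermediate two-column frame, so the iterator question does not arise, and it skips the row-by-row iloc loop.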