import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def bootstrapMethod(percents, npHappiness):
    # Bootstrap configuration for 20 percent of the sample
    n_iterations = 1000
    n_size = int(len(npHappiness) * percents)
    # Start the bootstrap method
    stats = list()
    for i in range(n_iterations):
        train = resample(npHappiness, n_samples=n_size)
        test = np.array([x for x in npHappiness if x.tolist() not in train.tolist()])
        model = DecisionTreeClassifier()
        model.fit(train[:, :-1], train[:, -1])
        predictions = model.predict(test[:, :-1])
        score = accuracy_score(test[:, -1], predictions)
        print(score)
        stats.append(score)
    # Plot the scores
    plt.hist(stats)
    plt.show()
    alpha = 0.95
    p = ((1.0 - alpha) / 2.0) * 100
    lower = max(0.0, np.percentile(stats, p))
    p = (alpha + ((1.0 - alpha) / 2.0)) * 100
    upper = min(1.0, np.percentile(stats, p))
    print('%.1f confidence interval %.1f%% and %.1f%%' % (alpha*100, lower*100, upper*100))
Here I compute a confidence interval using the bootstrap method. On these lines:
    model.fit(train[:, :-1], train[:, -1])
    predictions = model.predict(test[:, :-1])
I get an error:
IndexError: too many indices for array
What causes this error, and how can I fix it?
What are train and test? What are their dimensions? For questions like this it is customary to provide a small but reproducible data sample. I also recommend reading: How to most effectively ask a question related to data processing and/or analysis (for example, with Pandas / NumPy / SciPy / scikit-learn / SQL) - MaxU
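To make the question about dimensions concrete, here is a minimal sketch of such a check. The data is a made-up stand-in for npHappiness (10 rows, 3 feature columns plus a label column), and the names data_2d, data_1d and empty_test are hypothetical. It only illustrates one possible cause: two-index slicing such as [:, :-1] works on a 2-D array, but raises this IndexError if the array is actually 1-D, which is also what np.array produces from an empty list.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical stand-in for npHappiness: 3 feature columns + 1 label column.
rng = np.random.default_rng(0)
features = rng.random((10, 3))
labels = rng.integers(0, 2, size=(10, 1))
data_2d = np.hstack([features, labels])

# With a 2-D input, resample returns a 2-D array and the slicing works.
train = resample(data_2d, n_samples=4, random_state=0)
print("data:", data_2d.shape, "train:", train.shape)
X_train, y_train = train[:, :-1], train[:, -1]

# A 1-D array cannot be indexed with two indices.
data_1d = data_2d[:, 0]
try:
    data_1d[:, :-1]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# An empty list comprehension also yields a 1-D array (shape (0,)),
# so an empty test set would fail in the same way.
empty_test = np.array([row for row in data_2d if False])
print("empty test ndim:", empty_test.ndim)
```

Printing npHappiness.shape, train.shape and test.shape inside the loop in the same way should show which array is not 2-D.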