Good afternoon, colleagues! I am loading a file:
from pandas import read_table

df = read_table('TQq.txt', sep='\t', encoding='utf_8', decimal='.')

The Russian column headings and the text in the table are displayed correctly. Then I continue with the analysis:
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=91)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

param = {'bst:max_depth': 3, 'bst:eta': 0.1, 'silent': 1,
         'objective': 'binary:logistic', 'eval_metric': 'map', 'seed': 1}
param['nthread'] = 40

# Evaluate on both sets during training
evallist = [(dtest, 'test'), (dtrain, 'train')]
num_round = 10
bst = xgb.train(param, dtrain, num_round, evallist)

And when I try to get the feature importances, I get
xgb.plot_importance(bst)

Where else do I need to specify the encoding? I also tried the cp1251 encoding, with the same result.
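For reference, the read step can be checked on its own, without XGBoost. Below is a minimal sketch of how the same bytes decode under utf_8 and cp1251. The header row in it is hypothetical (the real column names in TQq.txt are not shown in this post), and it uses only the standard library:

```python
# Hypothetical header row; the real column names in TQq.txt are an assumption here.
header = "Возраст\tДоход\tЦель"

# The bytes as they would be stored in a UTF-8 encoded file.
raw = header.encode("utf_8")

# Decoding with the matching codec restores the Cyrillic names.
as_utf8 = raw.decode("utf_8").split("\t")

# Decoding the same bytes as cp1251 produces different, unreadable names.
# errors="replace" avoids an exception on bytes that cp1251 does not define.
as_cp1251 = raw.decode("cp1251", errors="replace").split("\t")

print(as_utf8)
print(as_cp1251)
```

If the headings in df.columns look like the first list rather than the second, the read step is probably not where the names get damaged.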
