Good afternoon, colleagues! I am loading a file:

df = read_table('TQq.txt', sep = '\t', encoding='utf_8', decimal='.') 

The Russian column headings and the text in the table are displayed correctly. Then I continue with the analysis:

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=91)
    dtrain = xgb.DMatrix(X_train, label=y_train)
    dtest = xgb.DMatrix(X_test, label=y_test)
    param = {'bst:max_depth': 3, 'bst:eta': 0.1, 'silent': 1, 'objective': 'binary:logistic', 'eval_metric': 'map', 'seed': 1}
    param['nthread'] = 40
    evallist = [(dtest, 'test'), (dtrain, 'train')]
    num_round = 10
    bst = xgb.train(param, dtrain, num_round, evallist)

But when I try to plot the feature importances, I get this:

 xgb.plot_importance(bst) 

[image: the resulting feature importance plot]

Where else do I need to specify the encoding? I also tried reading the file with the cp1251 encoding, and the result is the same.

  • The default font does not support Cyrillic; changing the font may help: stackoverflow.com/questions/10960463/… . xgboost uses matplotlib internally for its plots. - Lebedev Ilya
  • Thanks, Ilya, that helped. And yes, I probably should have mentioned matplotlib rather than xgboost. - Edward
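
The font change suggested in the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the font name "DejaVu Sans" (bundled with matplotlib 2.0 and later, and containing Cyrillic glyphs) is my choice rather than something stated in the thread, and the feature names and scores are made-up stand-ins for the real model output. The plot is drawn with plain matplotlib so that the example runs without a trained model.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Assumption: "DejaVu Sans" is available; any installed font with Cyrillic glyphs would do.
plt.rcParams["font.family"] = "DejaVu Sans"

# Hypothetical feature names and scores, standing in for the booster's real importances.
importances = {"возраст": 12, "доход": 7, "город": 3}
names = list(importances)
scores = [importances[name] for name in names]
positions = list(range(len(names)))

fig, ax = plt.subplots()
ax.barh(positions, scores)
ax.set_yticks(positions)
ax.set_yticklabels(names)  # Cyrillic labels are rendered with the font chosen above
ax.set_xlabel("F score")
ax.set_title("Важность признаков")
fig.savefig("importance.png")
```

Since xgboost draws through matplotlib, setting the font in rcParams before calling xgb.plot_importance(bst) should have the same effect on its labels; an existing axes object can also be passed via the ax argument, as in xgb.plot_importance(bst, ax=ax).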
