## Prediction Model 01 on processed Designation text
- Fill NA values with "" to avoid errors (open question: do the NAs come from stemming?)
- TfidfVectorizer
- Fit GradientBoostingClassifier
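
The steps above can be sketched roughly as follows. This is a minimal illustration, not the original notebook code: the toy DataFrame, the column names `designation` and `label`, the split parameters, and the default hyperparameters are all assumptions.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the real DataFrame and column names are not shown in these notes.
df = pd.DataFrame({
    "designation": ["blue pool pump", None, "kids pool toy", "garden pump hose",
                    "pool filter pump", "toy car kids", None, "kids toy robot"] * 5,
    "label": [2583, 2583, 1280, 2583, 2583, 1280, 1280, 1280] * 5,
})

# Step 1: replace missing text with "" so the vectorizer does not fail on NaN.
text = df["designation"].fillna("")

X_train, X_test, y_train, y_test = train_test_split(
    text, df["label"], test_size=0.25, random_state=0, stratify=df["label"]
)

# Step 2: TF-IDF features, fitted on the training split only.
vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

# Step 3: fit the gradient boosting classifier on the sparse TF-IDF matrix.
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_train_tfidf, y_train)

# Per-class precision, recall, f1-score and support, as in the results listing.
y_pred = clf.predict(X_test_tfidf)
report = classification_report(y_test, y_pred, zero_division=0)
print(report)
```

Passing `zero_division=0` only silences the warning for classes that are never predicted; such classes still show up with precision 0.00.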
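Results of PM01

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 10 | 0.38 | 0.01 | 0.03 | 612 |
| 40 | 0.47 | 0.27 | 0.34 | 521 |
| 50 | 0.51 | 0.21 | 0.29 | 357 |
| 60 | 0.88 | 0.33 | 0.48 | 161 |
| 1140 | 0.85 | 0.11 | 0.19 | 539 |
| 1160 | 0.73 | 0.55 | 0.63 | 786 |
| 1180 | 0.74 | 0.34 | 0.47 | 146 |
| 1280 | 0.58 | 0.28 | 0.37 | 961 |
| 1281 | 0.00 | 0.00 | 0.00 | 424 |
| 1300 | 0.85 | 0.76 | 0.81 | 974 |
| 1301 | 0.96 | 0.67 | 0.79 | 169 |
| 1302 | 0.90 | 0.45 | 0.60 | 507 |
| 1320 | 0.74 | 0.26 | 0.39 | 672 |
| 1560 | 0.72 | 0.46 | 0.56 | 1013 |
| 1920 | 0.85 | 0.67 | 0.75 | 841 |
| 1940 | 0.85 | 0.26 | 0.39 | 137 |
| 2060 | 0.78 | 0.27 | 0.40 | 1029 |
| 2220 | 0.88 | 0.09 | 0.16 | 170 |
| 2280 | 0.11 | 0.97 | 0.19 | 942 |
| 2403 | 0.47 | 0.37 | 0.42 | 986 |
| 2462 | 0.00 | 0.00 | 0.00 | 306 |
| 2522 | 0.64 | 0.34 | 0.44 | 991 |
| 2582 | 0.45 | 0.21 | 0.29 | 462 |
| 2583 | 0.95 | 0.67 | 0.79 | 2047 |
| 2585 | 0.58 | 0.32 | 0.41 | 525 |
| 2705 | 0.35 | 0.03 | 0.06 | 517 |
| 2905 | 0.98 | 0.94 | 0.96 | 189 |
| accuracy | | | 0.42 | 16984 |
| macro avg | 0.64 | 0.36 | 0.41 | 16984 |
| weighted avg | 0.65 | 0.42 | 0.46 | 16984 |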
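
For reading the two summary rows: in a scikit-learn classification report, `macro avg` is the unweighted mean over classes and `weighted avg` weights each class by its support. The sketch below recomputes both for the f1-score column from the PM01 per-class rows above. Since the inputs are the already rounded values, the results only match the reported 0.41 and 0.46 approximately; the variable names are illustrative.

```python
# (f1-score, support) per class, copied from the PM01 report above.
rows = [
    (0.03, 612), (0.34, 521), (0.29, 357), (0.48, 161), (0.19, 539),
    (0.63, 786), (0.47, 146), (0.37, 961), (0.00, 424), (0.81, 974),
    (0.79, 169), (0.60, 507), (0.39, 672), (0.56, 1013), (0.75, 841),
    (0.39, 137), (0.40, 1029), (0.16, 170), (0.19, 942), (0.42, 986),
    (0.00, 306), (0.44, 991), (0.29, 462), (0.79, 2047), (0.41, 525),
    (0.06, 517), (0.96, 189),
]

total_support = sum(support for _, support in rows)

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1 for f1, _ in rows) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in rows) / total_support

print(total_support, macro_f1, weighted_f1)
```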
## Prediction Model 02 on processed Designation text
- Fill NA values with "" to avoid errors (open question: do the NAs come from stemming?)
- TfidfVectorizer
- SMOTE for handling class imbalance
- Flattening of y_train to avoid an error message
- Fit GradientBoostingClassifier
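Results of PM02

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 10 | 0.07 | 0.94 | 0.14 | 612 |
| 40 | 0.49 | 0.25 | 0.33 | 521 |
| 50 | 0.53 | 0.19 | 0.28 | 357 |
| 60 | 0.52 | 0.60 | 0.56 | 161 |
| 1140 | 0.72 | 0.31 | 0.43 | 539 |
| 1160 | 0.74 | 0.55 | 0.63 | 786 |
| 1180 | 0.50 | 0.40 | 0.44 | 146 |
| 1280 | 0.95 | 0.06 | 0.12 | 961 |
| 1281 | 0.37 | 0.12 | 0.19 | 424 |
| 1300 | 0.88 | 0.80 | 0.84 | 974 |
| 1301 | 0.86 | 0.72 | 0.78 | 169 |
| 1302 | 0.75 | 0.62 | 0.68 | 507 |
| 1320 | 0.67 | 0.43 | 0.52 | 672 |
| 1560 | 0.66 | 0.12 | 0.21 | 1013 |
| 1920 | 0.86 | 0.25 | 0.39 | 841 |
| 1940 | 0.44 | 0.39 | 0.42 | 137 |
| 2060 | 0.40 | 0.29 | 0.34 | 1029 |
| 2220 | 0.63 | 0.62 | 0.63 | 170 |
| 2280 | 0.42 | 0.07 | 0.12 | 942 |
| 2403 | 0.54 | 0.39 | 0.45 | 986 |
| 2462 | 0.51 | 0.38 | 0.43 | 306 |
| 2522 | 0.64 | 0.39 | 0.48 | 991 |
| 2582 | 0.44 | 0.06 | 0.10 | 462 |
| 2583 | 0.94 | 0.73 | 0.82 | 2047 |
| 2585 | 0.65 | 0.30 | 0.42 | 525 |
| 2705 | 0.08 | 0.04 | 0.05 | 517 |
| 2905 | 0.98 | 0.94 | 0.96 | 189 |
| accuracy | | | 0.40 | 16984 |
| macro avg | 0.60 | 0.41 | 0.43 | 16984 |
| weighted avg | 0.64 | 0.40 | 0.44 | 16984 |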
