Experiments
Naive Bayes (simple)
Class >50K: P(C) = 0.24082548
Attribute age
Mean: 0.37328548 Standard Deviation: 0.14409627
Attribute workclass
Private 0.64821102
Self-emp-not-inc 0.09467224
Self-emp-inc 0.08135283
Federal-gov 0.04857665
Local-gov 0.08069992
State-gov 0.04622617
Without-pay 0.00013058
Never-worked 0.00013058
Attribute fnlwgt
Mean: 0.11934095 Standard Deviation: 0.06964166
Attribute education
Bachelors 0.28280514
Some-college 0.17665776
11th 0.00776378
HS-grad 0.21331297
Prof-school 0.05396462
Assoc-acdm 0.03385516
Assoc-voc 0.04607356
9th 0.0035637
7th-8th 0.00521828
12th 0.00432735
Masters 0.12218404
1st-4th 0.00089093
10th 0.00801833
Doctorate 0.03907344
5th-6th 0.00216368
Preschool 0.00012728
Attribute education-num
Mean: 0.70744373 Standard Deviation: 0.15900867
Attribute marital-status
Married-civ-spouse 0.85282875
Divorced 0.05912334
Never-married 0.06269113
Separated 0.00853721
Widowed 0.01095821
Married-spouse-absent 0.00445973
Married-AF-spouse 0.00140163
Attribute occupation
Tech-support 0.03705637
Craft-repair 0.12134656
Other-service 0.01800626
Sales 0.12839248
Exec-managerial 0.25691545
Prof-specialty 0.24269311
Handlers-cleaners 0.01135177
Machine-op-inspct 0.03275052
Adm-clerical 0.06628392
Farming-fishing 0.0151357
Transport-moving 0.04188413
Priv-house-serv 0.00026096
Protective-serv 0.0276618
Armed-Forces 0.00026096
Attribute relationship
Wife 0.09506818
Own-child 0.00866573
Husband 0.75430101
Not-in-family 0.10921371
Other-relative 0.00484262
Unmarried 0.02790875
Attribute race
White 0.90721387
Asian-Pac-Islander 0.03530461
Amer-Indian-Eskimo 0.00471578
Other 0.00331379
Black 0.04945195
Attribute sex
Female 0.15045263
Male 0.84954737
Attribute capital-gain
Mean: 0.04006186 Standard Deviation: 0.14570527
Attribute capital-loss
Mean: 0.0447662 Standard Deviation: 0.13670517
Attribute hours-per-week
Mean: 0.45380636 Standard Deviation: 0.11237728
Attribute native-country
United-States 0.92709411
Cambodia 0.00103413
England 0.00400724
Puerto-Rico 0.00168046
Canada 0.00517063
Germany 0.00581696
Outlying-US(Guam-USVI-etc) 0.00012927
India 0.0052999
Japan 0.00323164
Greece 0.00116339
South 0.00219752
China 0.00271458
Cuba 0.00336091
Iran 0.00245605
Honduras 0.00025853
Philippines 0.00801448
Italy 0.00336091
Poland 0.00168046
Jamaica 0.00142192
Vietnam 0.00077559
Mexico 0.00439504
Portugal 0.00064633
Ireland 0.00077559
France 0.00168046
Dominican-Republic 0.0003878
Laos 0.0003878
Ecuador 0.00064633
Taiwan 0.00271458
Haiti 0.00064633
Columbia 0.0003878
Hungary 0.00051706
Guatemala 0.00051706
Nicaragua 0.0003878
Scotland 0.00051706
Thailand 0.00051706
Yugoslavia 0.00090486
El-Salvador 0.00129266
Trinadad&Tobago 0.0003878
Peru 0.0003878
Hong 0.00090486
Holand-Netherlands 0.00012927
Class <=50K: P(C) = 0.75917452
Attribute age
Mean: 0.27101011 Standard Deviation: 0.19205599
Attribute workclass
Private 0.76827102
Self-emp-not-inc 0.07875926
Self-emp-inc 0.02144435
Federal-gov 0.02555994
Local-gov 0.06398648
State-gov 0.04098254
Without-pay 0.00064983
Never-worked 0.00034658
Attribute fnlwgt
Mean: 0.12092736 Standard Deviation: 0.07231787
Attribute education
Bachelors 0.12673836
Some-college 0.23872089
11th 0.04511643
HS-grad 0.35684832
Prof-school 0.00622574
Assoc-acdm 0.03246281
Assoc-voc 0.0413163
9th 0.01972833
7th-8th 0.02453913
12th 0.01621119
Masters 0.03092658
1st-4th 0.00658959
10th 0.03525226
Doctorate 0.00436611
5th-6th 0.01285576
Preschool 0.0021022
Attribute education-num
Mean: 0.57300421 Standard Deviation: 0.16240983
Attribute marital-status
Married-civ-spouse 0.33505884
Divorced 0.1609981
Never-married 0.41222146
Separated 0.03882396
Widowed 0.03676143
Married-spouse-absent 0.01557002
Married-AF-spouse 0.00056618
Attribute occupation
Tech-support 0.02798718
Craft-repair 0.13737978
Other-service 0.13685989
Sales 0.1155879
Exec-managerial 0.09093666
Prof-specialty 0.09886492
Handlers-cleaners 0.05567109
Machine-op-inspct 0.07594663
Adm-clerical 0.14140889
Farming-fishing 0.03812495
Transport-moving 0.05536782
Priv-house-serv 0.00645525
Protective-serv 0.01901915
Armed-Forces 0.00038991
Attribute relationship
Wife 0.03332524
Own-child 0.20229718
Husband 0.29426515
Not-in-family 0.30130227
Other-relative 0.03821888
Unmarried 0.13059128
Attribute race
White 0.8372093
Asian-Pac-Islander 0.0308999
Amer-Indian-Eskimo 0.01116279
Other 0.00998989
Black 0.11073812
Attribute sex
Female 0.38803495
Male 0.61196505
Attribute capital-gain
Mean: 0.00148753 Standard Deviation: 0.00963147
Attribute capital-loss
Mean: 0.01219994 Standard Deviation: 0.07133971
Attribute hours-per-week
Mean: 0.38612455 Standard Deviation: 0.125704
Attribute native-country
United-States 0.9044565
Cambodia 0.00053445
England 0.00250781
Puerto-Rico 0.0042345
Canada 0.00341227
Germany 0.0038645
Outlying-US(Guam-USVI-etc) 0.00061667
India 0.00250781
Japan 0.00160335
Greece 0.00090446
South 0.00267226
China 0.00230225
Cuba 0.00291893
Iran 0.0010689
Honduras 0.00053445
Philippines 0.00567341
Italy 0.00201447
Poland 0.00201447
Jamaica 0.00296004
Vietnam 0.00259003
Mexico 0.02511922
Portugal 0.0013978
Ireland 0.00082223
France 0.00074001
Dominican-Republic 0.0028367
Laos 0.0006989
Ecuador 0.00102779
Taiwan 0.00131557
Haiti 0.00168558
Columbia 0.00238448
Hungary 0.00045223
Guatemala 0.00254892
Nicaragua 0.00135668
Scotland 0.00041112
Thailand 0.00065779
Yugoslavia 0.00045223
El-Salvador 0.00402894
Trinadad&Tobago 0.00074001
Peru 0.00123335
Hong 0.00061667
Holand-Netherlands 0.00008222
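The listing above gives, for each class, a prior P(C), a mean and standard deviation for every numeric attribute, and a probability for every value of each nominal attribute. A minimal sketch of how such parameters are usually combined into a class posterior follows. It assumes a Gaussian likelihood for numeric attributes and that numeric values are scaled to [0, 1] (the reported means suggest this, but the log does not say so). Only three attributes are used to keep the example short, and the names MODEL, gaussian_pdf and posterior are illustrative, not taken from the tool that produced this output.

```python
import math

# Parameters copied from the listing above. Only age, sex and two
# marital-status values are included; a full model would use every attribute.
MODEL = {
    ">50K": {
        "prior": 0.24082548,
        "age": (0.37328548, 0.14409627),  # (mean, standard deviation)
        "sex": {"Female": 0.15045263, "Male": 0.84954737},
        "marital-status": {
            "Married-civ-spouse": 0.85282875,
            "Never-married": 0.06269113,
        },
    },
    "<=50K": {
        "prior": 0.75917452,
        "age": (0.27101011, 0.19205599),
        "sex": {"Female": 0.38803495, "Male": 0.61196505},
        "marital-status": {
            "Married-civ-spouse": 0.33505884,
            "Never-married": 0.41222146,
        },
    },
}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(instance):
    """Return P(class | instance) under the naive independence assumption."""
    log_scores = {}
    for label, params in MODEL.items():
        score = math.log(params["prior"])
        score += math.log(gaussian_pdf(instance["age"], *params["age"]))
        score += math.log(params["sex"][instance["sex"]])
        score += math.log(params["marital-status"][instance["marital-status"]])
        log_scores[label] = score
    # Normalize in a numerically stable way (subtract the max log score).
    top = max(log_scores.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical instances; age is assumed to be on the same [0, 1] scale.
married_man = {"age": 0.37, "sex": "Male", "marital-status": "Married-civ-spouse"}
single_woman = {"age": 0.27, "sex": "Female", "marital-status": "Never-married"}
print(posterior(married_man))
print(posterior(single_woman))
```

Working in log space avoids underflow once all fourteen attributes are multiplied together, which is why the sketch sums logarithms instead of multiplying raw probabilities.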
=== Error on training data ===
Correctly Classified Instances 27134 83.3328 %
Incorrectly Classified Instances 5427 16.6672 %
Mean absolute error 0.1742
Root mean squared error 0.3738
Relative absolute error 47.6361 %
Root relative squared error 87.4192 %
Total Number of Instances 32561
=== Confusion Matrix ===
    a     b   <-- classified as
 4048  3793 |     a = >50K
 1634 23086 |     b = <=50K
=== Error on test data ===
Correctly Classified Instances 13512 82.9924 %
Incorrectly Classified Instances 2769 17.0076 %
Mean absolute error 0.1758
Root mean squared error 0.3759
Relative absolute error 48.3955 %
Root relative squared error 88.4921 %
Total Number of Instances 16281
=== Confusion Matrix ===
    a     b   <-- classified as
 1945  1901 |     a = >50K
  868 11567 |     b = <=50K
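The accuracy figures reported above can be recomputed from the two confusion matrices, and the same counts also give per-class figures that the summary lines do not show. The sketch below treats >50K as the positive class and reads rows as actual classes and columns as predicted classes, as in the matrices above. The function name summarize and the majority-class baseline are additions for illustration, not part of the original output.

```python
def summarize(matrix):
    """Compute summary metrics from a 2x2 confusion matrix.

    matrix[0] is the actual >50K row, matrix[1] the actual <=50K row;
    the first column is predicted >50K, the second predicted <=50K.
    """
    (tp, fn), (fp, tn) = matrix
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision_gt50k": tp / (tp + fp),
        "recall_gt50k": tp / (tp + fn),
        # Accuracy of always predicting the larger class.
        "majority_baseline": max(tp + fn, fp + tn) / total,
    }


# Counts copied from the confusion matrices above.
train = ((4048, 3793), (1634, 23086))
test = ((1945, 1901), (868, 11567))

for name, matrix in (("train", train), ("test", test)):
    metrics = summarize(matrix)
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

On both sets roughly half of the actual >50K instances are classified as <=50K (3793 of 7841 on training data, 1901 of 3846 on test data), so overall accuracy should be read alongside the recall for the >50K class.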