Classification by KNN
Dr. Ken Riggs, FAMU
Spring 2004

K-Nearest-Neighbors
- The classification problem may also be approached by comparing an unknown case to known cases
- Need a measure of "distance" between cases
- Normally the features are real-valued
- We use the old example, which is NOT real-valued
- The class predicted is the most prevalent class among the K nearest neighbors

K-NN Algorithm
    // predict the class of unknown instances
    Choose a K
    For each unknown instance X
        For each known instance A
            Measure distance(X,A)
        Find A1..Ak with the minimum distances
        Predict <- most frequent class in A1..Ak

K-NN as RULES
- Unknown item A and known item B => (distance A B dist(A,B))
    - Need to calculate the distance "functions" here
- Select the minimum distance facts
    - Adjust ranks
    - Delete instances with rank > k
- Predict class
    - Collect distances
    - Find most frequent class and predict it

Distances
- Are normally real numbers, or at least ordinals, for KNN
- Scales
    - Nominal: classes: only = or ≠
    - Ordinal: 1st, 2nd, 3rd, ...: <, >, =, ≠; no 0 or +1
    - Numeric (integer or real): measurements: <, >, =, <>, arithmetic
- Distance (Euclidean)
    - $\mathrm{dist}((x,y),(r,s)) = \sqrt{(x-r)^2 + (y-s)^2}$

Problem From Last Time
- We have 3 features
    - Shape is nominal
    - Size is ordinal
    - Color is nominal (?)
- Therefore K-NN won't work too well
- Nevertheless, we use it as an example

Distances
- Nominal distances are either 1 or 0
    - If Color A = Color B then distC(A,B) is 0, else 1
    - If Shape A = Shape B then distSh(A,B) is 0, else 1
- Ordinal distances may sometimes be assumed equally separated
    - Size order: 1-small, 2-medium, 3-large
    - DistSz(A,B) = |ordinal(A) - ordinal(B)|
- Total distance
    - Could try Euclidean distance
    - Could try Manhattan distance

Pair Distance
    ;; Want facts (distance ?unKnwn ?Knwn ?dist ?rank)
    (defrule compare
       (item ?n1 ?sz1 ?c1 ?sh1 ?)
       (predict ?n1)                ;; ?n1 is to PREDICT
       (item ?n2 ?sz2 ?c2 ?sh2 ?)
       (not (predict ?n2))          ;; ?n2 is KNOWN
       =>
       (bind ?dist (distance ?sz1 ?c1 ?sh1 ?sz2 ?c2 ?sh2))
       (assert (distance ?n1 ?n2 ?dist 1)))   ;; rank set to 1

Ranking by Rules
- Assign each distance an initial rank of 1
    - Already done in the distance fact
- If two distances have the same rank r, assign the larger (or equal) one rank r+1
    - See demote, following
- If any distance has rank r > k, forget it
    - See drop, following

DEMOTE
    ;; If two distances have the same rank, assign
    ;; a higher rank number to the larger (or =)
    (defrule demote
       ?f <- (distance ?n ?n1 ?d1 ?r)
       (distance ?n ?n2&~?n1 ?d2 ?r)
       (test (>= ?d1 ?d2))
       =>
       (retract ?f)
       (assert (distance ?n ?n1 ?d1 (+ ?r 1))))

DROP
    ;; -- drop any distance with rank > k
    (defrule drop
       (declare (salience 10))
       ?f <- (distance ? ? ? ?r)
       (test (> ?r ?*K*))
       =>
       (retract ?f))

Collect the Classes Remaining
- AFTER all distances are ranked
- Start a list of all classes with distances remaining
- When all are collected, find the most common class
- Predict the unknown will be in that class

Collect known classes - start
    ;; for each unknown, start collection
    (defrule collect0
       (declare (salience -10))     ;; wait for ranks
       (predict ?n)
       =>
       (assert (collect ?n)))

Collect known classes - cont.
    ;; add the class of each ?n1 with remaining distance
    ;; dist(?n,?n1) to the list of classes
    (defrule collect
       ?g <- (collect ?n $?classes)
       ?f <- (distance ?n ?n1 $?)
       (item ?n1 $? ?classN1)
       =>
       (retract ?f ?g)
       (assert (collect ?n $?classes ?classN1)))

Predict the Unknown's Class
    (defrule predict
       (collect ?n $?classes)
       (not (distance ?n $?))       ;; all distances collected
       =>
       (bind ?prediction (mostfrequent $?classes))
       (printout t ?n " is predicted as in " ?prediction))

9/10 Testing
- A standard way to test learning is to partition the learning set into two parts
    - Select, at random, 1/10 of the data as test data
    - Use the other 9/10 as training data
    - We then "learn" on the 9/10 and test on the 1/10
    - We repeat this for different tenths
- The example has too little data
    - Choose 1 test case (leaving 6 knowns)
    - Repeat for each possible test case

Results on 6
    Dist from 6 to 7 is 1.0
    Dist from 6 to 5 is .666666666666667
    Dist from 6 to 4 is 2.0
    Dist from 6 to 3 is 2.0
    Dist from 6 to 2 is 1.66666666666667
    Dist from 6 to 1 is .33333333333333
    6 is predicted as in no
    6 real class is: no

Results on Example
    Item   Predicted   Actual   Result
    1      No          Yes      FAIL
    2      No          Yes      FAIL
    3      No          Yes      FAIL
    4      Yes         Yes      Pass
    5      No          No       Pass
    6      No          No       Pass
    8      No          No       Pass
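
The rule pipeline in this deck (pair distances, keep the K smallest, predict the most frequent class) can also be sketched outside CLIPS. The following is a minimal Python sketch, not the deck's code. The item tuples are hypothetical, and summing the three per-feature distances (the Manhattan option) is an assumption, since the slides leave the choice of total distance open. It is not expected to reproduce the distances printed on the "Results on 6" slide.

```python
from collections import Counter

# Ordinal positions for Size, as on the "Distances" slide.
SIZE_ORDER = {"small": 1, "medium": 2, "large": 3}


def distance(a, b):
    """Total distance between two (size, color, shape) tuples.

    Assumption: the per-feature distances are simply added (Manhattan).
    """
    size_a, color_a, shape_a = a
    size_b, color_b, shape_b = b
    dist_sz = abs(SIZE_ORDER[size_a] - SIZE_ORDER[size_b])  # ordinal gap
    dist_c = 0 if color_a == color_b else 1                 # nominal: 0 or 1
    dist_sh = 0 if shape_a == shape_b else 1                # nominal: 0 or 1
    return dist_sz + dist_c + dist_sh


def most_frequent(classes):
    """Most common class label; ties go to the label seen first."""
    return Counter(classes).most_common(1)[0][0]


def knn_predict(unknown, known, k):
    """Predict the class of `unknown` from `known`, a list of (features, label)."""
    ranked = sorted(known, key=lambda item: distance(unknown, item[0]))
    return most_frequent(label for _, label in ranked[:k])


# Hypothetical items, invented for illustration only (not the deck's data).
KNOWN = [
    (("small", "red", "round"), "yes"),
    (("small", "red", "square"), "yes"),
    (("large", "blue", "square"), "no"),
    (("medium", "blue", "square"), "no"),
    (("large", "green", "round"), "no"),
]

print(knn_predict(("small", "red", "round"), KNOWN, k=3))
```

Sorting and slicing stands in for the demote and drop rules: both leave only the K nearest known items before the classes are counted.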
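
The "choose 1 test case, leaving the rest as knowns, and repeat for each possible test case" procedure from the 9/10 Testing slide can be sketched as a short loop. This is a minimal Python sketch under stated assumptions: the items are hypothetical, and `majority_class` is a placeholder classifier used only to keep the example self-contained. Any K-NN predictor with the same `(features, known)` signature could be passed in its place.

```python
from collections import Counter


def leave_one_out(items, predict):
    """Hold each item out in turn and predict it from the remaining items.

    items:   list of (name, features, actual_class)
    predict: function (features, known_items) -> predicted class
    Returns rows of (name, predicted, actual, result).
    """
    rows = []
    for i, (name, features, actual) in enumerate(items):
        known = items[:i] + items[i + 1:]  # everything except the test case
        predicted = predict(features, known)
        result = "Pass" if predicted == actual else "FAIL"
        rows.append((name, predicted, actual, result))
    return rows


def majority_class(features, known):
    """Placeholder classifier (an assumption): ignores the features and
    returns the most common class among the known items."""
    return Counter(label for _, _, label in known).most_common(1)[0][0]


# Hypothetical items, invented for illustration only (not the deck's data).
ITEMS = [
    (1, ("small", "red"), "yes"),
    (2, ("large", "blue"), "no"),
    (3, ("large", "red"), "no"),
    (4, ("small", "blue"), "no"),
]

for row in leave_one_out(ITEMS, majority_class):
    print(row)
```

Each row has the same columns as the "Results on Example" table: item, predicted class, actual class, and Pass or FAIL.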