The R implementation

Here is the R source code of the main algorithm:

GSP  <- function (d,I,MIN_SUP){
  f <- NULL
  c[[1]] <- CreateInitalPrefixTree(NULL)
  len4I <- GetLength(I)
  for(idx in 1:len4I){
    SetSupportCount(I[idx],0)
    AddChild2Node(c[[1]], I[idx],NULL)
  }
  k <- 1
  while( !IsEmpty(c[[k]]) ){
     ComputeSupportCount(c[[k]],d)
     while(TRUE){
       r <- GetLeaf(c[[k]])
       if( r==NULL ){
         break
       }
       if(GetSupport(r)>=MIN_SUP){
         AddFrequentItemset(f,r,GetSupport(r))
       }else{
         RemoveLeaf(c[[k]],s)
       }
    }
    c[[k+1]] <- ExtendPrefixTree(c[[k]])
    k <- K+1
  }
  f
}

The SPADE algorithm

Sequential Pattern Discovery using Equivalent classes (SPADE) is a vertical sequence-mining algorithm applied to sequence patterns; it has a depth-first approach. Here are the features of the SPADE algorithm: ...

Get R: Data Analysis and Visualization now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.