Corpus
Twitter provides us with vast amounts of user-generated language data, which is a dream for anyone wanting to conduct textual analysis. The twitteR library provides access to Twitter data and is marked as the 'official' way to download tweets. An attractive and easy-to-use alternative to the official route is the rtweet package, which is also more actively updated. This set of slides offers an easy-to-follow tutorial showing the pipeline that you need.
Twitter's link to create Twitter applications is https://developer.twitter.com/en/apps. You need to be logged in to Twitter to create a new app. This provides a set of 5 items related to the application: app, consumerKey, consumerSecret, accessToken and accessSecret. Both accessToken and accessSecret need to be activated after receiving the consumerKey and consumerSecret. These five parameters are used in the final authentication function call, create_token().
token <- create_token(
app = app,
consumer_key = consumer_key,
consumer_secret = consumer_secret,
access_token = access_token,
access_secret = access_secret
)
Once the authentication is done, the tweets of any user or hashtag can be retrieved and converted to a corpus. In this case, I have decided to build a corpus with the tweets of two mobile game accounts. As they are similar games, classifying their tweets will be a challenging task. Only the last 1000 tweets of each account are retrieved.
Therefore, we have a binary classification problem, where the class is clashroyale or clashofclans. As we are working with text, the predictive features that we have are related to words.
library(rtweet)
# retrieve user tweets
n <- 1000
clashroyale_tweets <- get_timeline("clashroyale", n = n)
clashofclans_tweets <- get_timeline("clashofclans", n = n)
In the first 5 tweets of each dataset we can see that the tweets don't only contain words: there are also links and emojis, for example. In the next section we will have to decide what to do with those elements. Apart from the text, much other data is returned by the previous function. In total, there are 90 columns, but we will only use a few of them. The most important one is the text column. We will use some other features, such as the date, for visualization.
head(clashroyale_tweets, n = 5L)
head(clashofclans_tweets, n = 5L)
clashroyale_tweets$text[1:5]
[1] "Not tried a new deck out yet? ⚔️ \n\nThe 1v1 Showdown is the perfect Party Mode for this and is here all season long! https://t.co/Hgq6QIurmM"
[2] "Welcome to the year of the Tiger 🐯 https://t.co/BU8EXwAyNc"
[3] "Thank YOU for an amazing 2021 year! 🎊\n\nHere's to 2022 👇\nhttps://t.co/wQ8sqE0Q4j https://t.co/qUpS7kjrqm"
[4] "Happy New Year! 🥳\nMay 2022 be full of Victories and Crowns! 👑👑👑 https://t.co/VJdu458cRH"
[5] "☃️ ❄️ 🥶 https://t.co/huDJSbQNra"
clashofclans_tweets$text[1:5]
[1] "Caption this! \n\nThe face you and your Clan mates make when... 👇 https://t.co/mbNP1RIGVB"
[2] "New year, new Village layout? 🧐 \n\nThen check out @ClashChamps, and use the Advanced Search tool to find top layouts! Narrow down by Town Hall level, base type, sort by Most Downloaded, Most Recent, Highest Rated, and more!\n\nhttps://t.co/pykPfv0CWy"
[3] "Let's gooooooooo! https://t.co/RkzQs3z1sp https://t.co/1fD64uRb3s"
[4] "Predictions? 🧐🍿📺 https://t.co/RsNxCCAsw0"
[5] "The 2021 #ClashOfClans World Championships brought us SO MANY incredible matches...but there were also some surprising turn arounds and losses. \n\nWhich was the one World Champion match that didn't happen you wished we could've seen? \n\nComment below! https://t.co/KYMQ5FlD8Z"
We can use the tm library to build a corpus for each class. Each tweet will be a document in this corpus. Then we can merge them to obtain a single corpus. Building a corpus is recommended because the tm package offers many transformations for preprocessing text.
library(tm)
Loading required package: NLP
# combine both frames in a single, binary, annotated set
tweets <- rbind(clashroyale_tweets, clashofclans_tweets)
# interpreting each element of the annotated vector as a document
clashroyale_docs <- VectorSource(clashroyale_tweets$text)
clashofclans_docs <- VectorSource(clashofclans_tweets$text)
# convert to a corpus: supervised classification to be applied in future steps
clashroyale_corpus <- VCorpus(clashroyale_docs)
clashofclans_corpus <- VCorpus(clashofclans_docs)
# merge (concatenate) both corpora
corpus <- c(clashroyale_corpus, clashofclans_corpus)
Visualization
Visualizing the data is important to understand our corpus. In this section there are various time series plots and wordclouds.
Time Series Plot
We can use the rtweet package to get a time series plot with the frequencies of tweets. In these examples, I analyse the frequencies of both accounts by month, week and day. The tweet frequencies are similar, although Clash Royale has more tweets.
ts_plot(dplyr::group_by(tweets, screen_name), "month") +
ggplot2::theme_minimal() +
ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
ggplot2::labs(
x = "Date", y = "Count",
title = "Frequency of Tweets from Clash Royale and Clash of Clans",
subtitle = "Tweet counts aggregated by month"
)
Attaching package: ‘ggplot2’
The following object is masked from ‘package:NLP’:
annotate
ts_plot(dplyr::group_by(tweets, screen_name), "week") +
ggplot2::theme_minimal() +
ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
ggplot2::labs(
x = "Date", y = "Count",
title = "Frequency of Tweets from Clash Royale and Clash of Clans",
subtitle = "Tweet counts aggregated by week"
)
ts_plot(dplyr::group_by(tweets, screen_name), "day") +
ggplot2::theme_minimal() +
ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
ggplot2::labs(
x = "Date", y = "Count",
title = "Frequency of Tweets from Clash Royale and Clash of Clans",
subtitle = "Tweet counts aggregated by day"
)
Initial Wordclouds
Before learning the machine learning models presented later, let's build a wordcloud with the wordcloud package [3]. Its wordcloud() command needs the list of words and their frequencies as parameters. As the words appear in the columns of the document-term matrix, the colSums command is used to calculate the word frequencies. To complete the needed calculations, note that the document-term matrix has to be cast to a regular matrix with the as.matrix operator. This initial document-term matrix is very sparse: it contains 2000 documents and 7846 terms.
We can see that the generated wordclouds are not very informative. The reason is that the most common words are English stop words. These words are very frequent, but they don't carry any meaning. That's why we should remove them from our corpus.
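For reference, we can peek at the English stop word list that tm bundles; a quick check of its first entries (assuming tm is loaded, as above):
# the English stop word list bundled with tm
head(stopwords("english"), 10)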
corpus_dtm_init <- DocumentTermMatrix(corpus)
corpus_dtm_init
<<DocumentTermMatrix (documents: 2000, terms: 7846)>>
Non-/sparse entries: 26582/15665418
Sparsity : 100%
Maximal term length: 33
Weighting : term frequency (tf)
library(wordcloud)
Loading required package: RColorBrewer
word_freqs <- sort(colSums(as.matrix(corpus_dtm_init)[1:n, ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 100, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
word_freqs <- sort(colSums(as.matrix(corpus_dtm_init)[(n + 1):(n + n), ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 100, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
Better Wordclouds
To make a better wordcloud, we can pass the text directly: a corpus is then generated and stop words are removed automatically. However, this time emojis are kept, and we can see that some of them are quite common. The following wordclouds are much more informative, and we can already see some differences and similarities between the corpora.
wordcloud(clashroyale_tweets$text, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
Warning in tm_map.SimpleCorpus(corpus, tm::removePunctuation) :
transformation drops documents
Warning in tm_map.SimpleCorpus(corpus, function(x) tm::removeWords(x, tm::stopwords())) :
transformation drops documents
wordcloud(clashofclans_tweets$text, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
Warning in tm_map.SimpleCorpus(corpus, tm::removePunctuation) :
transformation drops documents
Warning in tm_map.SimpleCorpus(corpus, function(x) tm::removeWords(x, tm::stopwords())) :
transformation drops documents
Warning in wordcloud(clashofclans_tweets$text, max.words = 50, scale = c(3.5, :
complete could not be fit on page. It will not be plotted.
Hashtag Wordclouds
Finally, we can create another wordcloud that only contains the hashtags. We can see that hashtags are not very common, but they differ between the two corpora. We will have to decide whether to keep or remove them in the next section.
# the hashtags column is a list-column; flatten it to character strings
clashroyale_tweets$hashtags <- as.character(clashroyale_tweets$hashtags)
# as.character() leaves a "c(" prefix on multi-hashtag entries; strip it
clashroyale_tweets$hashtags <- gsub("c\\(", "", clashroyale_tweets$hashtags)
wordcloud(clashroyale_tweets$hashtags, min.freq = 1, scale = c(3.5, .5), max.words = 50, random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
Warning in tm_map.SimpleCorpus(corpus, tm::removePunctuation) :
transformation drops documents
Warning in tm_map.SimpleCorpus(corpus, function(x) tm::removeWords(x, tm::stopwords())) :
transformation drops documents
# same flattening for the Clash of Clans hashtags
clashofclans_tweets$hashtags <- as.character(clashofclans_tweets$hashtags)
clashofclans_tweets$hashtags <- gsub("c\\(", "", clashofclans_tweets$hashtags)
wordcloud(clashofclans_tweets$hashtags, min.freq = 1, scale = c(3.5, .5), max.words = 50, random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
Warning in tm_map.SimpleCorpus(corpus, tm::removePunctuation) :
transformation drops documents
Warning in tm_map.SimpleCorpus(corpus, function(x) tm::removeWords(x, tm::stopwords())) :
transformation drops documents
Preprocessing
As we have said before, some preprocessing is needed so that we get better results when classifying the documents. First, we will apply some transformations such as removing stop words to the text. Then, we will remove sparse words and outlier documents from the corpus. Finally, we will display the final wordclouds so that we can compare them with the initial ones.
Apply Transformations
Transformation operators are applied to the corpus via the tm_map function, which applies (maps) a function to all elements of the corpus. The transformations are applied to the whole corpus, which contains documents of both classes. Apart from the transformations available in the tm package, some custom transformations are also applied with the content_transformer function.
First, some elements are removed from the corpus: numbers, punctuation, URLs, mentions, hashtags, newlines and emojis. Then, all the words are converted to lowercase. Next, the previously mentioned English stop words are removed. After that, multiple whitespace characters are collapsed into a single one. Finally, all the words are stemmed to reduce the vocabulary size. We can print the first few tweets of each corpus to see the difference with respect to the initial ones.
# helper transformations based on regular expressions
remove_urls <- function(text) {
  gsub("http\\S*", "", text)
}
remove_mentions <- function(text) {
  gsub("@\\S*", "", text)
}
remove_hashtags <- function(text) {
  gsub("#\\S*", "", text)
}
remove_newlines <- function(text) {
  gsub("\\\n", " ", text)
}
remove_emojis <- function(text) {
  # removes every non-ASCII character, which covers emojis
  gsub("[^\x01-\x7F]", "", text)
}
# remove numbers
corpus_trans <- tm_map(corpus, removeNumbers)
# remove punctuation
corpus_trans <- tm_map(corpus_trans, removePunctuation)
# remove urls
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_urls))
# remove mentions
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_mentions))
# remove hashtags
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_hashtags))
# remove newlines
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_newlines))
# remove emojis
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_emojis))
# convert to lowercase
corpus_trans <- tm_map(corpus_trans, content_transformer(tolower))
# remove english stop words
corpus_trans <- tm_map(corpus_trans, removeWords, stopwords("english"))
# strip whitespace
corpus_trans <- tm_map(corpus_trans, stripWhitespace)
# to access Porter's word stemming algorithm
library(SnowballC)
corpus_trans <- tm_map(corpus_trans, stemDocument)
for (i in 1:5) {
print(corpus_trans[[i]]$content)
}
[1] "tri new deck yet v showdown perfect parti mode season long"
[1] "welcom year tiger"
[1] "thank amaz year here"
[1] "happi new year may full victori crown"
[1] ""
for (i in (n + 1):(n + 6)) {
print(corpus_trans[[i]]$content)
}
[1] "caption face clan mate make"
[1] "new year new villag layout check clashchamp use advanc search tool find top layout narrow town hall level base type sort download recent highest rate"
[1] "let gooooooooo"
[1] "predict"
[1] "clashofclan world championship brought us mani incred matchesbut also surpris turn around loss one world champion match didnt happen wish couldv seen comment"
[1] "first season challeng enjoy amaz perk reward complet multipl challeng unlock shadow queen skin month gold pass first hero skin shadow set"
Remove Sparse Terms
After transforming the corpus, a common approach in text mining is to create a document-term matrix from it. This document-term matrix is the starting point for applying machine learning techniques such as classification and clustering. Different operations can be applied over this matrix. For example, we can obtain the terms that occur at least 50 times. We can also consult the terms that correlate with the term "mainten" by at least 0.3. We can see that the correlated words make sense: "short maintenance break soon", "server upkeep".
corpus_dtm <- DocumentTermMatrix(corpus_trans)
corpus_dtm
<<DocumentTermMatrix (documents: 2000, terms: 3073)>>
Non-/sparse entries: 17495/6128505
Sparsity : 100%
Maximal term length: 22
Weighting : term frequency (tf)
findFreqTerms(corpus_dtm, 50)
[1] "amp" "attack" "back" "battl"
[5] "best" "break" "can" "card"
[9] "challeng" "champion" "chang" "check"
[13] "chief" "clan" "clash" "clashesport"
[17] "clashworld" "come" "complet" "day"
[21] "esportsroyaleen" "final" "first" "fix"
[25] "game" "get" "happi" "hey"
[29] "king" "know" "let" "live"
[33] "look" "mainten" "make" "month"
[37] "new" "next" "now" "one"
[41] "play" "player" "reward" "royal"
[45] "season" "see" "server" "soon"
[49] "start" "super" "supercel" "take"
[53] "team" "thank" "time" "today"
[57] "troop" "tune" "unlock" "updat"
[61] "war" "well" "will" "win"
[65] "world" "year"
findAssocs(corpus_dtm, term = "mainten", corlimit = 0.3)
$mainten
break server upkeep well soon minut short hey
0.63 0.53 0.48 0.48 0.39 0.38 0.32 0.31
We have removed nearly 5000 terms from the initial document-term matrix. However, it still has a huge degree of sparsity: a very low proportion of non-zero elements. Thus, one of the most important operations is to remove sparse terms, that is, terms occurring in very few documents. The sparse parameter of the removeSparseTerms function refers to the maximum sparseness allowed: the smaller its value, the fewer terms are retained. A trial and error approach will finally return a proper number of terms. The resulting matrix will be the starting point for building further machine learning models.
After trying multiple values, we decide to keep terms with a maximum sparseness of 0.99. This seems very high, but it reduces the number of terms drastically; in fact, with lower sparseness values the number of remaining terms is too low.
corpus_dtm_95 <- removeSparseTerms(corpus_dtm, sparse = 0.95)
corpus_dtm_95
<<DocumentTermMatrix (documents: 2000, terms: 14)>>
Non-/sparse entries: 1860/26140
Sparsity : 93%
Maximal term length: 7
Weighting : term frequency (tf)
barplot(as.matrix(corpus_dtm_95),
xlab = "terms", ylab = "number of occurrences",
main = "Most frequent terms (sparseness=0.95)"
)
corpus_dtm_97 <- removeSparseTerms(corpus_dtm, sparse = 0.97)
corpus_dtm_97
<<DocumentTermMatrix (documents: 2000, terms: 42)>>
Non-/sparse entries: 3976/80024
Sparsity : 95%
Maximal term length: 11
Weighting : term frequency (tf)
barplot(as.matrix(corpus_dtm_97),
xlab = "terms", ylab = "number of occurrences",
main = "Most frequent terms (sparseness=0.97)"
)
corpus_dtm_99 <- removeSparseTerms(corpus_dtm, sparse = 0.99)
corpus_dtm_99
<<DocumentTermMatrix (documents: 2000, terms: 181)>>
Non-/sparse entries: 8723/353277
Sparsity : 98%
Maximal term length: 15
Weighting : term frequency (tf)
terms <- dim(corpus_dtm_99)[2]
barplot(as.matrix(corpus_dtm_99),
xlab = "terms", ylab = "number of occurrences",
main = "Most frequent terms (sparseness=0.99)"
)
Outlier Detection
Outlier detection can be used to detect and remove outlier documents from the corpus. We test the Isolation Forest method. I decided not to remove any document to simplify the next steps.
Isolation Forest builds an ensemble of random trees that recursively partition the samples, trying to isolate each one from the rest. As outliers are easier to isolate, they receive higher isolation scores. We then plot the anomaly scores and decide a cut-off threshold.
library(solitude)
Registered S3 method overwritten by 'data.table':
method from
print.data.table
# Empty tree structure
iso <- isolationForest$new()
# convert dtm to dataframe
corpus_df_99 <- as.data.frame(as.matrix(corpus_dtm_99))
# Learn the IsolationForest for our data
iso$fit(corpus_df_99)
INFO [18:05:20.026] dataset has duplicated rows
INFO [18:05:20.101] Building Isolation Forest ...
INFO [18:05:21.757] done
INFO [18:05:21.780] Computing depth of terminal nodes ...
INFO [18:05:22.867] done
INFO [18:05:22.977] Completed growing isolation forest
# predict for our data
p <- iso$predict(corpus_df_99)
# plot anomaly score
plot(density(p$anomaly_score), main = "Anomaly Score Density")
# Based on the plot, decide the cut-off point
which(p$anomaly_score > 0.62)
[1] 200 210 456 1006 1008 1104 1593 1827 1846 1962
Final Wordclouds
Finally, the wordclouds of the reduced document-term matrix are plotted. We can see the difference with respect to the initial wordclouds. The terms of each wordcloud are now significantly different.
# calculate the frequency of words and sort in descending order.
word_freqs <- sort(colSums(as.matrix(corpus_dtm_99)[1:n, ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
word_freqs <- sort(colSums(as.matrix(corpus_dtm_99)[(n + 1):(n + n), ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
Clustering
Clustering Words
We try to find clusters of words with hierarchical clustering, a popular clustering technique which builds a dendrogram to iteratively group pairs of similar objects. To do so, a matrix with the sparse terms removed is needed. We select the 0.97 sparsity matrix so that the terms can be visualized. After the application of the matrix-casting operator, the numbers of occurrences are scaled.
We need to calculate the distance between pairs of terms. The dist operator performs this calculation between pairs of rows of the provided matrix. As terms appear in the columns of the document-term matrix (corpus_dtm_97), it needs to be transposed by means of the t operator. The clustering dendrogram is built with the hclust operator. It needs as input the calculated distance matrix between pairs of terms and a criterion to decide which pair of clusters is consecutively joined in the bottom-up dendrogram. In this case, the "complete" criterion takes into account the maximum distance between any pair of terms of the two clusters to be merged. Height in the dendrogram denotes the distance between a merged pair of clusters.
dist_matrix <- dist(t(scale(as.matrix(corpus_dtm_97))))
term_clustering <- hclust(dist_matrix, method = "complete")
plot(term_clustering)
Clustering Documents
Another popular task is to construct clusters of similar documents based on the frequencies of word occurrences. Here we select a small subset of the initial corpus: the roughly 15 documents on each side of the class boundary. We then apply a method similar to the previous one and try to divide the documents into two clusters.
dist_matrix <- dist(scale(as.matrix(corpus_dtm_99)[(n - 15):(n + 15), ]))
groups <- hclust(dist_matrix, method = "ward.D")
plot(groups, cex = 0.9, hang = -1)
rect.hclust(groups, k = 2)
Data Splitting
Before learning a classification model we have to define the subsets of samples (documents) used to train and test our models. We first need to create a data frame from the document-term matrix.
Create Data Frame
The 0.99-sparseness document-term matrix is our starting point. This matrix has 181 features, which correspond to the most frequent terms. We first need to append the class vector as the last column of the matrix. There are 1000 documents of each class, 2000 documents in total.
dim(corpus_dtm_99)
[1] 2000 181
type <- c(rep("clashroyale", n), rep("clashofclans", n)) # create the type vector
corpus_dtm_99 <- cbind(corpus_dtm_99, type) # append
dim(corpus_dtm_99) # consult the updated number of columns
[1] 2000 182
This new matrix is the starting point for supervised classification. However, we first need to convert it to a data frame. The name of the last column is updated, all the values are converted to numeric and the last column is converted to a factor.
corpus_df_99 <- as.data.frame(as.matrix(corpus_dtm_99))
# name the appended class column
colnames(corpus_df_99)[terms + 1] <- "type"
# make type a factor first, so as.numeric() maps its levels to 1/2
corpus_df_99$type <- as.factor(corpus_df_99$type)
# cbind coerced the term counts to character; convert all columns to numeric
corpus_df_99 <- as.data.frame(sapply(corpus_df_99, as.numeric))
corpus_df_99[is.na(corpus_df_99)] <- 0
# rebuild the class column as a factor with readable labels
corpus_df_99$type <- as.factor(corpus_df_99$type)
levels(corpus_df_99$type) <- c("clashofclans", "clashroyale")
Create Data Partition
The createDataPartition function produces a train-test partition of our corpus that will be maintained during the whole pipeline of analysis. Test samples won't be used for any modeling decision; we will only use them at the end to predict their class and create a confusion matrix. A list of randomly sampled numbers (in_train) is used to partition the whole corpus: 75% of the samples are used for training and the remaining 25% for testing.
library(caret)
Loading required package: lattice
set.seed(107) # a random seed to enable reproducibility
in_train <- createDataPartition(y = corpus_df_99$type, p = .75, list = FALSE)
str(in_train)
int [1:1500, 1] 1 2 3 4 5 6 7 8 9 11 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Resample1"
training <- corpus_df_99[in_train, ]
testing <- corpus_df_99[-in_train, ]
nrow(training)
[1] 1500
Similarly, createResample can be used to make simple bootstrap samples, creating resamples of the size of the corpus with repeated documents. createFolds can be used to generate balanced cross-validation groupings from a set of data.
resamples <- createResample(y = corpus_df_99$type)
str(resamples)
List of 10
$ Resample01: int [1:2000] 1 1 3 3 4 4 4 5 5 5 ...
$ Resample02: int [1:2000] 1 1 2 2 2 4 4 4 6 7 ...
$ Resample03: int [1:2000] 1 1 1 1 2 2 5 5 6 6 ...
$ Resample04: int [1:2000] 1 1 3 3 4 5 5 10 14 16 ...
$ Resample05: int [1:2000] 1 5 5 6 6 7 7 9 10 10 ...
$ Resample06: int [1:2000] 1 1 3 3 3 3 4 4 5 6 ...
$ Resample07: int [1:2000] 3 3 4 4 5 7 11 11 12 13 ...
$ Resample08: int [1:2000] 2 3 5 5 6 7 8 9 10 12 ...
$ Resample09: int [1:2000] 2 2 4 4 5 6 6 7 7 8 ...
$ Resample10: int [1:2000] 1 1 2 2 2 3 4 5 5 9 ...
folds <- createFolds(y = corpus_df_99$type)
str(folds)
List of 10
$ Fold01: int [1:200] 5 7 10 17 26 32 34 40 46 50 ...
$ Fold02: int [1:200] 20 38 64 69 81 83 94 112 134 139 ...
$ Fold03: int [1:200] 6 18 29 30 35 42 55 63 67 74 ...
$ Fold04: int [1:200] 3 19 31 36 43 58 61 65 66 85 ...
$ Fold05: int [1:200] 9 37 49 52 62 68 72 75 87 88 ...
$ Fold06: int [1:200] 1 4 8 21 22 54 57 96 105 107 ...
$ Fold07: int [1:200] 2 14 41 48 59 76 84 98 103 117 ...
$ Fold08: int [1:200] 13 15 25 44 51 71 82 92 93 127 ...
$ Fold09: int [1:200] 11 12 27 33 39 60 95 99 126 202 ...
$ Fold10: int [1:200] 16 23 24 28 45 47 56 70 73 78 ...
Classification
The caret [4, 5] package is the reference tool for building supervised classification and regression models in R. It covers all the steps of a classic pipeline: data preprocessing, model building, accuracy estimation, prediction of the type of new samples, and statistical comparison between the performance of different models. This caret cheatsheet illustrates its main functions in a single page: https://github.com/CABAH/learningRresources/blob/main/cheatsheets/caret.pdf.
Our objective is to learn a classifier that predicts the type of future documents based on term occurrences. We have a two-class supervised classification problem.
We can now start training and testing different supervised classification models. The train function implements the building process.
The form parameter is used with the expression type ~ . to denote the variable to be predicted, followed by the set of predictors; the point indicates that the rest of the variables are used as predictors. The data parameter supplies the training data.
The method parameter fixes the type of classification algorithm to be learned. caret supports more than 150 supervised classification and regression algorithms. Taking into account the large dimensionality of classic NLP datasets, we have to use classifiers capable of dealing with it. In this work we choose Linear Discriminant Analysis (LDA) and Boosted Logistic Regression (LR).
The metric parameter fixes the score used to assess and validate the goodness of each model. A large set of metrics is offered, and we test the following ones: Accuracy, Kappa, ROC, Sensitivity and Specificity.
The trControl parameter defines the method to estimate the error of the classifier. The trainControl function allows the use of different performance estimation procedures such as k-fold cross-validation, bootstrapping, etc. We apply a 10-fold cross-validation, repeated 3 times. This is an adequate option because it creates 30 results that can later be used to compare algorithms statistically.
Linear Discriminant Analysis
LDA is used to find a linear combination of features that characterizes or separates two or more classes. The resulting combination can be used as a linear classifier, or for dimensionality reduction. This time we will use it as a classifier. We will see a similar unsupervised method called Principal Component Analysis (PCA) for dimensionality reduction in the Feature Extraction section.
Accuracy and Kappa are the default metrics used to evaluate algorithms on binary and multi-class classification datasets in caret. As we are performing binary classification, these metrics are adequate. Our classes are completely balanced, which makes analysing the metrics easier.
Accuracy is the percentage of correctly classified instances out of all instances. It is more useful on binary than on multi-class classification problems, because with more classes it is less clear how the accuracy breaks down across them; a confusion matrix helps there.
Kappa is similar to accuracy, but it is normalized at the baseline of random chance on our dataset. It is a more useful measure on problems with a class imbalance. For example, with a 70-30 split between classes 0 and 1, you can achieve 70% accuracy by predicting that every instance belongs to class 0. As our classes are completely balanced, 50% accuracy is obtained by predicting either class for all instances.
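As a quick sanity check of this relationship, Kappa can be computed by hand from the observed and chance-expected accuracies. With two balanced classes the expected accuracy is 0.5; plugging in the LDA accuracy reported below recovers its Kappa:
# kappa = (observed accuracy - chance accuracy) / (1 - chance accuracy)
observed <- 0.7924444 # LDA accuracy from the resampling results below
expected <- 0.5       # chance accuracy with two balanced classes
(observed - expected) / (1 - expected) # ~0.5849, the Kappa reported below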
The obtained accuracy is not very good, but this is expected because the problem is not an easy one. The kappa metric also reflects that our classifier is quite bad.
# fixing the performance estimation procedure
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3)
lda_3x10cv <- train(type ~ ., data = training, method = "lda", trControl = train_ctrl)
lda_3x10cv
Linear Discriminant Analysis
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results:
Accuracy Kappa
0.7924444 0.5848889
Another metric that is only suitable for binary classification problems is ROC. The area under the ROC curve represents a model's ability to discriminate between the positive and negative classes. An area of 1.0 corresponds to a model that makes all predictions perfectly; an area of 0.5 corresponds to a model that is as good as random.
ROC can be broken down into sensitivity and specificity, and a binary classification problem is really a trade-off between the two. Sensitivity is the true positive rate, also called recall: the proportion of instances from the positive (first) class that are predicted correctly. Specificity, also called the true negative rate, is the proportion of instances from the negative (second) class that are predicted correctly.
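To make these definitions concrete, here is a small sketch that computes both rates from a 2x2 confusion matrix; the counts are the ones the LDA classifier produces on the test set in the Testing section:
# rows = predicted class, columns = reference class
conf <- matrix(c(163, 87, 34, 216), nrow = 2,
               dimnames = list(predicted = c("clashofclans", "clashroyale"),
                               reference = c("clashofclans", "clashroyale")))
sensitivity <- conf[1, 1] / sum(conf[, 1]) # TP / (TP + FN) = 163/250 = 0.652
specificity <- conf[2, 2] / sum(conf[, 2]) # TN / (TN + FP) = 216/250 = 0.864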
To use this metric we have to select it in the function parameters. Moreover, extra parameters must be added to the trainControl function: in binary classification problems the twoClassSummary option displays the area under the ROC curve, sensitivity and specificity. To do so, the classProbs option also needs to be activated, which saves the class probabilities that the classifier assigns to each sample.
Looking at these numbers, we realise that the second class is predicted correctly more often than the first one: the first class is predicted correctly 67% of the time and the second one 90% of the time. This will also become evident when we calculate a confusion matrix while testing the model.
library(pROC)
Type 'citation("pROC")' for a citation.
Attaching package: ‘pROC’
The following objects are masked from ‘package:stats’:
cov, smooth, var
train_ctrl <- trainControl(method = "repeatedcv",repeats=3, classProbs=TRUE, summaryFunction=twoClassSummary)
lda_roc_3x10cv <- train(type ~ ., data = training, method = "lda", metric="ROC", trControl = train_ctrl)
lda_roc_3x10cv
Linear Discriminant Analysis
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results:
ROC Sens Spec
0.8668474 0.6786667 0.9057778
Boosted Logistic Regression
Logistic Regression is used to model the probability of a certain class. It applies the logistic function to a linear combination of the independent variables to obtain probabilities. If we define a cut-off probability, it can be used as a binary or multi-class classification model. Boosted LR is an additive logistic regression model: it uses an ensemble of LR models to make predictions.
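As a minimal illustration of the logistic link (plogis is base R's logistic function; the scores are made up):
score <- c(-2, 0, 2) # hypothetical linear combinations of the predictors
plogis(score)        # probabilities: 1 / (1 + exp(-score))
plogis(score) > 0.5  # a 0.5 cut-off turns probabilities into class labels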
While the linear LDA classifier does not have parameters, LR has the key parameter nIter, which indicates the number of boosting iterations of the model. By default, without changing the value of the parameter, caret evaluates 3 values. The tuneLength option of the train function fixes the number of values of each parameter to be checked. For example, if a classifier has 2 parameters and tuneLength is left at its default, 3 x 3 = 9 models are evaluated.
train_ctrl <- trainControl(
method = "repeatedcv", repeats = 3
)
lr_3x10cv <- train(type ~ .,
data = training, method = "LogitBoost", trControl = train_ctrl
)
lr_3x10cv
Boosted Logistic Regression
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results across tuning parameters:
nIter Accuracy Kappa
11 0.7051111 0.4102222
21 0.7548889 0.5097778
31 0.7704444 0.5408889
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was nIter = 31.
plot(lr_3x10cv)
If we increase tuneLength to 15 we can evaluate more models and check whether the accuracy increases. We can see that the accuracy improves up to a point and then stays nearly constant. Therefore, it is not worth increasing the value of nIter further.
train_ctrl <- trainControl(
method = "repeatedcv", repeats = 3
)
lr_tunel_3x10cv <- train(type ~ .,
data = training, method = "LogitBoost", trControl = train_ctrl, tuneLength = 15
)
lr_tunel_3x10cv
Boosted Logistic Regression
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results across tuning parameters:
nIter Accuracy Kappa
11 0.7055556 0.4111111
21 0.7553333 0.5106667
31 0.7702222 0.5404444
41 0.7788889 0.5577778
51 0.7775556 0.5551111
61 0.7828889 0.5657778
71 0.7835556 0.5671111
81 0.7837778 0.5675556
91 0.7846667 0.5693333
101 0.7862222 0.5724444
111 0.7862222 0.5724444
121 0.7877778 0.5755556
131 0.7886667 0.5773333
141 0.7888889 0.5777778
151 0.7882222 0.5764444
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was nIter = 141.
plot(lr_tunel_3x10cv)
We can also try the ROC metric to have more information about the performance of our model. We get similar results to the LDA classifier, with a much higher Specificity than Sensitivity.
train_ctrl <- trainControl(method = "repeatedcv",repeats=3, classProbs=TRUE, summaryFunction=twoClassSummary)
lr_roc_3x10cv <- train(type ~ ., data=training, method="LogitBoost", trControl=train_ctrl, metric="ROC", tuneLength=15)
lr_roc_3x10cv
Boosted Logistic Regression
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results across tuning parameters:
nIter ROC Sens Spec
11 0.7524148 0.4217778 0.9906667
21 0.8164059 0.5240000 0.9804444
31 0.8346933 0.5768889 0.9746667
41 0.8385096 0.6102222 0.9528889
51 0.8425156 0.6146667 0.9533333
61 0.8396770 0.6177778 0.9448889
71 0.8464533 0.6311111 0.9440000
81 0.8493867 0.6328889 0.9466667
91 0.8508533 0.6337778 0.9448889
101 0.8531674 0.6386667 0.9440000
111 0.8557422 0.6400000 0.9453333
121 0.8568889 0.6466667 0.9431111
131 0.8563644 0.6466667 0.9417778
141 0.8577393 0.6448889 0.9466667
151 0.8563704 0.6457778 0.9426667
ROC was used to select the optimal model using the largest value.
The final value used for the model was nIter = 141.
plot(lr_roc_3x10cv)
The tuneGrid option offers the possibility to specify the exact set of parameter values to be tuned and tested.
train_ctrl <- trainControl(
method = "repeatedcv", repeats = 3
)
tune_grid <- expand.grid(
nIter = seq(100, 120, 2)
)
lr_tuneg_3x10cv <- train(type ~ .,
data = training, method = "LogitBoost", trControl = train_ctrl, tuneGrid = tune_grid
)
lr_tuneg_3x10cv
Boosted Logistic Regression
1500 samples
181 predictor
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 1350, 1350, 1350, 1350, 1350, 1350, ...
Resampling results across tuning parameters:
nIter Accuracy Kappa
100 0.9099078 0.8102130
102 0.9114335 0.8140331
104 0.9112186 0.8128037
106 0.9074133 0.8050636
108 0.9062959 0.8035127
110 0.9089574 0.8082699
112 0.9093845 0.8090195
114 0.9090713 0.8091313
116 0.9087313 0.8075768
118 0.9083232 0.8067941
120 0.9075655 0.8061344
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was nIter = 102.
plot(lr_tuneg_3x10cv)
Subsampling
Our initial corpus is completely balanced: it has 1000 samples of each class. However, we can create an unbalanced corpus by removing some samples; for example, a corpus that has 1000 samples of one class and 250 of the other. If the class-label distribution is unbalanced in our corpus, a resampling method will try to improve the recovery rate of the minority class.
This test will only be performed with the LR classifier. First, a plain classifier is trained. Then, multiple resampling methods are tested and compared with the base classifier. ROC is an adequate metric in this case because we can compare the sensitivity and specificity of each subsampling method.
We expect to get very high specificity but low sensitivity; therefore, our aim is to increase sensitivity. Downsampling and upsampling improve the sensitivity a bit, while the hybrid method gets worse results.
# unbalanced subset: 1000 clashroyale samples + 250 clashofclans samples
corpus_df_99_un <- corpus_df_99[1:(n + n / 4), ]
in_train_un <- createDataPartition(y = corpus_df_99_un$type, p = .75, list = FALSE)
str(in_train_un)
training_un <- corpus_df_99_un[in_train_un, ]
testing_un <- corpus_df_99_un[-in_train_un, ]
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs=TRUE, summaryFunction=twoClassSummary)
lda_un_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric="ROC", trControl = train_ctrl)
lda_un_3x10cv
Boosted Logistic Regression
938 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 844, 844, 844, 845, 844, 844, ...
Resampling results across tuning parameters:
nIter ROC Sens Spec
11 0.7196043 0.3411306 0.9835556
21 0.7734756 0.4455166 0.9728889
31 0.7980253 0.5037037 0.9644444
ROC was used to select the optimal model using the largest value.
The final value used for the model was nIter = 31.
Downsampling
Downsampling randomly subsets all the classes in the training set so that their class frequencies match the least prevalent class. For example, suppose that 80% of the training set samples are the first class and the remaining 20% are in the second class. Down-sampling would randomly sample the first class to be the same size as the second class (so that only 40% of the total training set is used to fit the model).
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs=TRUE, summaryFunction=twoClassSummary, sampling="down")
lda_down_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric="ROC", trControl = train_ctrl)
lda_down_3x10cv
Boosted Logistic Regression
938 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 844, 844, 844, 845, 844, 844, ...
Addtional sampling using down-sampling
Resampling results across tuning parameters:
nIter ROC Sens Spec
11 0.7018311 0.3384016 0.9648889
21 0.7510702 0.4434698 0.9480000
31 0.7826244 0.5300195 0.9293333
ROC was used to select the optimal model using the largest value.
The final value used for the model was nIter = 31.
Upsampling
Upsampling randomly samples the minority class to be the same size as the majority class.
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs=TRUE, summaryFunction=twoClassSummary, sampling="up")
lda_up_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric="ROC", trControl = train_ctrl)
lda_up_3x10cv
Boosted Logistic Regression
938 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 844, 844, 845, 844, 844, 845, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
nIter ROC Sens Spec
11 0.7280273 0.3691033 0.9764444
21 0.7836797 0.4647173 0.9648889
31 0.7983190 0.5143275 0.9520000
ROC was used to select the optimal model using the largest value.
The final value used for the model was nIter = 31.
Hybrid
A hybrid method downsamples the majority class and synthesizes new data points for the minority class.
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs=TRUE, summaryFunction=twoClassSummary, sampling="smote")
lda_smote_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric="ROC", trControl = train_ctrl)
lda_smote_3x10cv
Boosted Logistic Regression
938 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 844, 844, 844, 844, 845, 844, ...
Addtional sampling using SMOTE
Resampling results across tuning parameters:
nIter ROC Sens Spec
11 0.6861157 0.2978558 0.9764444
21 0.7571027 0.4308967 0.9648889
31 0.7837505 0.4765107 0.9631111
ROC was used to select the optimal model using the largest value.
The final value used for the model was nIter = 31.
Feature Selection
Most approaches for reducing the number of features can be placed into two main categories: wrappers and filters.
Wrapper methods evaluate multiple models using procedures that add and/or remove predictors to find the optimal combination that maximizes model performance. In essence, wrapper methods are search algorithms that treat the predictors as the inputs and utilize model performance as the output to be optimized.
Filter methods evaluate the relevance of the predictors outside of the predictive models and subsequently model only the predictors that pass some criterion. Each predictor is evaluated individually to check if there is a plausible relationship between it and the observed classes. Only predictors with important relationships would then be included in a classification model.
The functions are applied to the entire training set and also to different resampled versions of the data set. From this, generalizable estimates of performance can be computed that properly take into account the feature selection step.
In our case we will test a univariate filter and two wrapper methods: Recursive Feature Elimination and Simulated Annealing. We will apply these methods to both classifiers and compare the results at the end.
Univariate Filter
Predictors can be filtered by conducting some sort of sample test to see if the mean of the predictor is different between the classes. Predictors that have statistically significant differences between the classes are then used for modeling.
On average, fewer than 80 variables are selected and the accuracy of the classifiers improves. Therefore, this method is a great option in this case.
sbf_ctrl <- sbfControl(functions = rfSBF, method = "repeatedcv", repeats = 3)
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE)
lr_sbf_3x10cv <- sbf(type ~ ., data = training, method = "LogitBoost", trControl = train_ctrl, sbfControl = sbf_ctrl)
lr_sbf_3x10cv
Selection By Filter
Outer resampling method: Cross-Validated (10 fold, repeated 3 times)
Resampling performance:
Using the training set, 81 variables were selected:
action, art, attack, avail, back...
During resampling, the top 5 selected variables (out of a possible 106):
action (100%), attack (100%), best (100%), bit (100%), can (100%)
On average, 78.8 variables were selected (min = 74, max = 82)
lda_sbf_3x10cv <- sbf(type ~ ., data = training, method = "lda", trControl = train_ctrl, sbfControl = sbf_ctrl)
lda_sbf_3x10cv
Selection By Filter
Outer resampling method: Cross-Validated (10 fold, repeated 3 times)
Resampling performance:
Using the training set, 81 variables were selected:
action, art, attack, avail, back...
During resampling, the top 5 selected variables (out of a possible 104):
action (100%), attack (100%), back (100%), best (100%), bit (100%)
On average, 79.1 variables were selected (min = 74, max = 82)
Recursive Feature Elimination
First, the algorithm fits the model to all predictors. Each predictor is ranked using its importance to the model. At each iteration of feature selection, the top-ranked predictors are retained, the model is refit and its performance is assessed. The number of predictors giving the best performance is determined, and the top predictors are used to fit the final model. In this case subsets of 4, 8, 16 and all 181 predictors are tested.
The accuracy of the classifiers improves, so this method is also a great option in this case.
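The calls that produced the RFE models below are not shown in the notebook; a minimal sketch of how they could be fit with caret's rfe function (the caretFuncs helper and the sizes vector, caret's default subset sizes, are assumptions based on the description above):
# sketch: recursive feature elimination wrapped around caret models
rfe_ctrl <- rfeControl(functions = caretFuncs, method = "repeatedcv", repeats = 3)
x <- training[, setdiff(names(training), "type")]
lr_rfe_3x10cv <- rfe(x = x, y = training$type, sizes = c(4, 8, 16),
                     rfeControl = rfe_ctrl, method = "LogitBoost")
lda_rfe_3x10cv <- rfe(x = x, y = training$type, sizes = c(4, 8, 16),
                      rfeControl = rfe_ctrl, method = "lda")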
lr_rfe_3x10cv
Recursive feature selection
Outer resampling method: Cross-Validated (10 fold, repeated 3 times)
Resampling performance over subset size:
The top 5 variables (out of 181):
devourlick, chief, redditclash, clashworld, clashquest
lda_rfe_3x10cv
Recursive feature selection
Outer resampling method: Cross-Validated (10 fold, repeated 3 times)
Resampling performance over subset size:
The top 5 variables (out of 181):
devourlick, chief, redditclash, clashworld, clashquest
Simulated Annealing
Simulated annealing is a global search method that makes small perturbations to an initial candidate solution. If the performance value of the perturbed candidate is better than that of the previous solution, the new solution is accepted. If not, an acceptance probability is determined based on the difference between the two performance values and the current iteration of the search. In the context of feature selection, a solution is a binary vector that describes the current subset, and the subset is perturbed by randomly changing a small number of its members.
Using this method the accuracy of the models decreases a lot, so it is not a good option.
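Again, the fitting code is not shown; a plausible sketch with caret's safs function, assuming caretSA as the helper and iters = 10 (matching the "Maximum search iterations: 10" in the output below):
# sketch: simulated annealing feature search wrapped around caret models
safs_ctrl <- safsControl(functions = caretSA, method = "repeatedcv", repeats = 3)
x <- training[, setdiff(names(training), "type")]
lr_safs_3x10cv <- safs(x = x, y = training$type, iters = 10,
                       safsControl = safs_ctrl, method = "LogitBoost")
lda_safs_3x10cv <- safs(x = x, y = training$type, iters = 10,
                        safsControl = safs_ctrl, method = "lda")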
lr_safs_3x10cv
Simulated Annealing Feature Selection
1500 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
Maximum search iterations: 10
Internal performance values: Accuracy, Kappa
Subset selection driven to maximize internal Accuracy
External performance values: Accuracy, Kappa
Best iteration chose by maximizing external Accuracy
External resampling method: Cross-Validated (10 fold, repeated 3 times)
During resampling:
* the top 5 selected variables (out of a possible 181):
short (46.7%), avail (40%), back (40%), final (40%), tomorrow (40%)
* on average, 40.4 variables were selected (min = 37, max = 46)
In the final search using the entire training set:
* 36 features selected at iteration 10 including:
anoth, art, attack, avail, back ...
* external performance at this iteration is
Accuracy Kappa
0.6236 0.2471
lda_safs_3x10cv
Simulated Annealing Feature Selection
1500 samples
181 predictors
2 classes: 'clashofclans', 'clashroyale'
Maximum search iterations: 10
Internal performance values: Accuracy, Kappa
Subset selection driven to maximize internal Accuracy
External performance values: Accuracy, Kappa
Best iteration chose by maximizing external Accuracy
External resampling method: Cross-Validated (10 fold, repeated 3 times)
During resampling:
* the top 5 selected variables (out of a possible 181):
around (46.7%), royal (43.3%), credit (40%), match (40%), teamquesogg (40%)
* on average, 40.9 variables were selected (min = 37, max = 46)
In the final search using the entire training set:
* 42 features selected at iteration 10 including:
best, bug, card, chanc, chang ...
* external performance at this iteration is
Accuracy Kappa
0.6451 0.2902
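The PCA variants (lda_pca_3x10cv, lr_pca_3x10cv) tested in the next sections come from the Feature Extraction step, whose code is not reproduced here; a minimal sketch using caret's built-in PCA preprocessing (the 0.95 variance threshold is an assumption):
# sketch: PCA feature extraction as a caret preprocessing step
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3,
                           preProcOptions = list(thresh = 0.95))
lda_pca_3x10cv <- train(type ~ ., data = training, method = "lda",
                        preProcess = "pca", trControl = train_ctrl)
lr_pca_3x10cv <- train(type ~ ., data = training, method = "LogitBoost",
                       preProcess = "pca", trControl = train_ctrl)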
Testing
In order to predict the class value of the unseen documents of the test partition, caret uses, for each classifier, the parameter configuration with the best accuracy estimation. The predict function implements this functionality. Its type parameter controls the output: the probs value outputs the probability of each test sample belonging to each class, while the raw value outputs the class value with the largest probability. By means of the raw option the confusion matrix can be calculated: it crosses, for each test sample, the predicted class value with the real one.
All the previously learned classifiers are tested on the test partition. There are 10 classifiers in total: the two main types plus their feature selection and extraction variants. As expected, the accuracy on the test partition is a bit lower than on the training partition. Specificity is higher than Sensitivity in all cases, which means that our models are better at predicting samples of the second class, clashroyale. This can also be seen in the confusion matrices. The performance of the algorithms will be compared in more detail in the next section.
LDA
lda_pred <- predict(lda_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 163 34
clashroyale 87 216
Accuracy : 0.758
95% CI : (0.718, 0.7949)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.516
Mcnemar's Test P-Value : 2.276e-06
Sensitivity : 0.6520
Specificity : 0.8640
Pos Pred Value : 0.8274
Neg Pred Value : 0.7129
Prevalence : 0.5000
Detection Rate : 0.3260
Detection Prevalence : 0.3940
Balanced Accuracy : 0.7580
'Positive' Class : clashofclans
LDA SBF
lda_sbf_pred <- predict(lda_sbf_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_sbf_pred$pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 170 25
clashroyale 80 225
Accuracy : 0.79
95% CI : (0.7516, 0.8249)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.58
Mcnemar's Test P-Value : 1.365e-07
Sensitivity : 0.6800
Specificity : 0.9000
Pos Pred Value : 0.8718
Neg Pred Value : 0.7377
Prevalence : 0.5000
Detection Rate : 0.3400
Detection Prevalence : 0.3900
Balanced Accuracy : 0.7900
'Positive' Class : clashofclans
LDA RFE
lda_rfe_pred <- predict(lda_rfe_3x10cv, newdata = testing)
confusionMatrix(data = lda_rfe_pred$pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 178 37
clashroyale 72 213
Accuracy : 0.782
95% CI : (0.7432, 0.8174)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.564
Mcnemar's Test P-Value : 0.001128
Sensitivity : 0.7120
Specificity : 0.8520
Pos Pred Value : 0.8279
Neg Pred Value : 0.7474
Prevalence : 0.5000
Detection Rate : 0.3560
Detection Prevalence : 0.4300
Balanced Accuracy : 0.7820
'Positive' Class : clashofclans
LDA SAFS
lda_safs_pred <- predict(lda_safs_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_safs_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 115 40
clashroyale 135 210
Accuracy : 0.65
95% CI : (0.6064, 0.6918)
No Information Rate : 0.5
P-Value [Acc > NIR] : 9.513e-12
Kappa : 0.3
Mcnemar's Test P-Value : 1.197e-12
Sensitivity : 0.4600
Specificity : 0.8400
Pos Pred Value : 0.7419
Neg Pred Value : 0.6087
Prevalence : 0.5000
Detection Rate : 0.2300
Detection Prevalence : 0.3100
Balanced Accuracy : 0.6500
'Positive' Class : clashofclans
LDA PCA
lda_pca_pred <- predict(lda_pca_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_pca_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 160 26
clashroyale 90 224
Accuracy : 0.768
95% CI : (0.7285, 0.8043)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.536
Mcnemar's Test P-Value : 4.933e-09
Sensitivity : 0.6400
Specificity : 0.8960
Pos Pred Value : 0.8602
Neg Pred Value : 0.7134
Prevalence : 0.5000
Detection Rate : 0.3200
Detection Prevalence : 0.3720
Balanced Accuracy : 0.7680
'Positive' Class : clashofclans
BLR
lr_pred <- predict(lr_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 125 10
clashroyale 125 240
Accuracy : 0.73
95% CI : (0.6888, 0.7685)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.46
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.5000
Specificity : 0.9600
Pos Pred Value : 0.9259
Neg Pred Value : 0.6575
Prevalence : 0.5000
Detection Rate : 0.2500
Detection Prevalence : 0.2700
Balanced Accuracy : 0.7300
'Positive' Class : clashofclans
BLR SBF
lr_sbf_pred <- predict(lr_sbf_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_sbf_pred$pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 169 27
clashroyale 81 223
Accuracy : 0.784
95% CI : (0.7453, 0.8193)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.568
Mcnemar's Test P-Value : 3.398e-07
Sensitivity : 0.6760
Specificity : 0.8920
Pos Pred Value : 0.8622
Neg Pred Value : 0.7336
Prevalence : 0.5000
Detection Rate : 0.3380
Detection Prevalence : 0.3920
Balanced Accuracy : 0.7840
'Positive' Class : clashofclans
BLR RFE
lr_rfe_pred <- predict(lr_rfe_3x10cv, newdata = testing)
confusionMatrix(data = lr_rfe_pred$pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 179 37
clashroyale 71 213
Accuracy : 0.784
95% CI : (0.7453, 0.8193)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.568
Mcnemar's Test P-Value : 0.001496
Sensitivity : 0.7160
Specificity : 0.8520
Pos Pred Value : 0.8287
Neg Pred Value : 0.7500
Prevalence : 0.5000
Detection Rate : 0.3580
Detection Prevalence : 0.4320
Balanced Accuracy : 0.7840
'Positive' Class : clashofclans
BLR SAFS
lr_safs_pred <- predict(lr_safs_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_safs_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 79 10
clashroyale 171 240
Accuracy : 0.638
95% CI : (0.5942, 0.6802)
No Information Rate : 0.5
P-Value [Acc > NIR] : 3.52e-10
Kappa : 0.276
Mcnemar's Test P-Value : < 2.2e-16
Sensitivity : 0.3160
Specificity : 0.9600
Pos Pred Value : 0.8876
Neg Pred Value : 0.5839
Prevalence : 0.5000
Detection Rate : 0.1580
Detection Prevalence : 0.1780
Balanced Accuracy : 0.6380
'Positive' Class : clashofclans
BLR PCA
lr_pca_pred <- predict(lr_pca_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_pca_pred, testing$type)
Confusion Matrix and Statistics
Reference
Prediction clashofclans clashroyale
clashofclans 154 62
clashroyale 96 188
Accuracy : 0.684
95% CI : (0.6412, 0.7246)
No Information Rate : 0.5
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.368
Mcnemar's Test P-Value : 0.008656
Sensitivity : 0.616
Specificity : 0.752
Pos Pred Value : 0.713
Neg Pred Value : 0.662
Prevalence : 0.500
Detection Rate : 0.308
Detection Prevalence : 0.432
Balanced Accuracy : 0.684
'Positive' Class : clashofclans
Comparison
As a final step, we will compare the 10 models that we have trained. First, we will compare the results in a table. Then, we will create some plots to compare the performance of the algorithms visually. Finally, we will perform statistical significance tests to know whether there is a significant difference between pairs of classifiers.
Summary Tables
This is the easiest comparison we can make: simply call the summary function and pass it the resamples result. It creates a table with one algorithm per row and one evaluation metric per column.
By looking at those values we can get an idea of which classifiers are the best ones. If we look at the base classifiers, LDA is better than LR. However, applying SBF or RFE feature selection improves the results of both classifiers and makes them similar. The other feature selection and extraction methods make the results of both classifiers worse.
resamps <- resamples(list(lr = lr_3x10cv, lr_sbf = lr_sbf_3x10cv, lr_rfe = lr_rfe_3x10cv, lr_safs = lr_safs_3x10cv, lr_pca = lr_pca_3x10cv, lda = lda_3x10cv, lda_sbf = lda_sbf_3x10cv, lda_rfe = lda_rfe_3x10cv, lda_safs = lda_safs_3x10cv, lda_pca = lda_pca_3x10cv))
summary(resamps)
Call:
summary.resamples(object = resamps)
Models: lr, lr_sbf, lr_rfe, lr_safs, lr_pca, lda, lda_sbf, lda_rfe, lda_safs, lda_pca
Number of resamples: 30
Accuracy
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
lr 0.7200000 0.7550000 0.7733333 0.7704444 0.7866667 0.8066667 0
lr_sbf 0.7466667 0.7816667 0.8066667 0.8082222 0.8400000 0.8666667 0
lr_rfe 0.7666667 0.7866667 0.8000000 0.8062222 0.8266667 0.8533333 0
lr_safs 0.5600000 0.5883333 0.6200000 0.6235556 0.6466667 0.7400000 0
lr_pca 0.6066667 0.6733333 0.6866667 0.6855556 0.7066667 0.7466667 0
lda 0.7200000 0.7733333 0.7900000 0.7924444 0.8133333 0.8466667 0
lda_sbf 0.7533333 0.7800000 0.8000000 0.8031111 0.8133333 0.8666667 0
lda_rfe 0.7533333 0.7733333 0.8100000 0.8097778 0.8533333 0.8733333 0
lda_safs 0.5533333 0.6166667 0.6566667 0.6451111 0.6716667 0.6933333 0
lda_pca 0.7133333 0.7683333 0.7833333 0.7831111 0.8050000 0.8400000 0
Kappa
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
lr 0.4400000 0.5100000 0.5466667 0.5408889 0.5733333 0.6133333 0
lr_sbf 0.4933333 0.5633333 0.6133333 0.6164444 0.6800000 0.7333333 0
lr_rfe 0.5333333 0.5733333 0.6000000 0.6124444 0.6533333 0.7066667 0
lr_safs 0.1200000 0.1766667 0.2400000 0.2471111 0.2933333 0.4800000 0
lr_pca 0.2133333 0.3466667 0.3733333 0.3711111 0.4133333 0.4933333 0
lda 0.4400000 0.5466667 0.5800000 0.5848889 0.6266667 0.6933333 0
lda_sbf 0.5066667 0.5600000 0.6000000 0.6062222 0.6266667 0.7333333 0
lda_rfe 0.5066667 0.5466667 0.6200000 0.6195556 0.7066667 0.7466667 0
lda_safs 0.1066667 0.2333333 0.3133333 0.2902222 0.3433333 0.3866667 0
lda_pca 0.4266667 0.5366667 0.5666667 0.5662222 0.6100000 0.6800000 0
Box and Whisker Plots
This is a useful way to look at the spread of the estimated accuracies for different methods and how they relate. Note that the boxes are ordered from highest to lowest mean accuracy. They are useful to look at the mean values (dots) and the boxes (middle 50% of results). We can extract the same conclusions we extracted by looking at the table easier by lookin at this plot.
scales <- list(x=list(relation="free"), y=list(relation="free"))
bwplot(resamps, scales=scales)
Density Plots
We can show the distribution of model accuracy as density plots. This is a useful way to evaluate the overlap in the estimated behavior of algorithms. They are also to look at the differences in the peaks as well as the variance of the distributions.
scales <- list(x=list(relation="free"), y=list(relation="free"))
densityplot(resamps, scales=scales, pch = "|")
Dot Plots
These are useful plots as the show both the mean estimated accuracy as well as the 95% confidence interval. They are useful to compare the means and the overlap of the spreads between algorithms. We can compare algorithms like we did with the boxplot.
scales <- list(x=list(relation="free"), y=list(relation="free"))
dotplot(resamps, scales=scales)
Scatterplot Matrix
This creates a scatterplot matrix of all results for an algorithm compared to the results for all other algorithms. These are useful to compare pairs of algorithms.
splom(resamps)
Pairwise xyPlots
We can zoom in on one pair-wise comparison of the accuracy for two algorithms with an xyplot. For example, we can compare the two main algorithms to see that LDA is better than LR.
xyplot(resamps, what = "BlandAltman", models = c("lr", "lda"))
Another useful comparison is to check the effect of feature selection and extraction. For the Logistic Regression algorithm, Univariate Filters and Recursive Feature Elimination improve the accuracy. However, Simulated Annealing and Principal Component Analysis get worse results.
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_sbf"))
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_rfe"))
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_safs"))
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_pca"))
Statistical Significance Tests
Note than in our case, due to the 3 repetitions of the 10-fold cross-validation process, there are 30 resampling results for each classifier. The same paired cross-validation subsets of samples were used for all classifiers. We have to use a paired t-test to calculate the significance of the differences between both classifiers.
Using the diff
function over the resamps
object calculates the differences between all pairs of classifiers. The output shows, for each metric (accuracy and kappa), the difference of the mean (positive or negative) between both classifiers. The p-value of the whole t-test is 0, which indicates that there is a significant difference between some classifiers. Therefore, we can discard the null hypothesis that says that there is no difference between classifiers.
The interpretation of the p-value is the key point. It is related with the risk of erroneously discarding the null-hypothesis of similarity between compared classifiers, when there is no real difference. Roughly speaking, it can also be interpreted as the degree of similarity between both classifiers. A p-value smaller than 0.05 alerts about statistically significant differences between both classifiers. That is, when the risk of erroneously discarding the hypothesis of similarity between both classifiers is low, we assume that there is a statistically significant difference between classifiers.
The lower diagonal of the table shows p-values for the null hypothesis. The upper diagonal of the table shows the estimated difference between the distributions. We can see that is come cases the p-value is bigger than 0.05 and therefore we can not discard the null hypothesis. In some other cases, the p-value is smaller than 0.05 so we can surely discard the null hypothesis.
We can see that all the ideas that we had before when comparing classifiers are confirmed with the statistical test. Some classifiers are significantly better than others. The base LDA is better than the base LR, applying SBF and RFE improves the results and applying SAFS and PCA makes results worse.
diffs <- diff(resamps)
summary(diffs)
Call:
summary.diff.resamples(object = diffs)
p-value adjustment: bonferroni
Upper diagonal: estimates of the difference
Lower diagonal: p-value for H0: difference = 0
Accuracy
lr lr_sbf lr_rfe lr_safs lr_pca lda lda_sbf
lr -0.037778 -0.035778 0.146889 0.084889 -0.022000 -0.032667
lr_sbf 0.0056665 0.002000 0.184667 0.122667 0.015778 0.005111
lr_rfe 0.0005784 1.0000000 0.182667 0.120667 0.013778 0.003111
lr_safs 6.967e-15 3.491e-16 < 2.2e-16 -0.062000 -0.168889 -0.179556
lr_pca 1.892e-09 1.372e-11 4.566e-13 0.0001458 -0.106889 -0.117556
lda 0.1057663 1.0000000 1.0000000 3.965e-15 3.230e-15 -0.010667
lda_sbf 0.0034098 1.0000000 1.0000000 6.978e-16 2.278e-15 1.0000000
lda_rfe 0.0088474 1.0000000 1.0000000 2.379e-15 9.486e-14 1.0000000 1.0000000
lda_safs 1.204e-13 2.051e-15 < 2.2e-16 1.0000000 0.0002075 6.769e-15 < 2.2e-16
lda_pca 1.0000000 0.5988760 0.3064779 6.841e-13 3.971e-12 1.0000000 0.1333848
lda_rfe lda_safs lda_pca
lr -0.039333 0.125333 -0.012667
lr_sbf -0.001556 0.163111 0.025111
lr_rfe -0.003556 0.161111 0.023111
lr_safs -0.186222 -0.021556 -0.159556
lr_pca -0.124222 0.040444 -0.097556
lda -0.017333 0.147333 0.009333
lda_sbf -0.006667 0.158000 0.020000
lda_rfe 0.164667 0.026667
lda_safs 3.805e-16 -0.138000
lda_pca 0.3630450 1.123e-12
Kappa
lr lr_sbf lr_rfe lr_safs lr_pca lda lda_sbf
lr -0.075556 -0.071556 0.293778 0.169778 -0.044000 -0.065333
lr_sbf 0.0056665 0.004000 0.369333 0.245333 0.031556 0.010222
lr_rfe 0.0005784 1.0000000 0.365333 0.241333 0.027556 0.006222
lr_safs 6.967e-15 3.491e-16 < 2.2e-16 -0.124000 -0.337778 -0.359111
lr_pca 1.892e-09 1.372e-11 4.566e-13 0.0001458 -0.213778 -0.235111
lda 0.1057663 1.0000000 1.0000000 3.965e-15 3.230e-15 -0.021333
lda_sbf 0.0034098 1.0000000 1.0000000 6.978e-16 2.278e-15 1.0000000
lda_rfe 0.0088474 1.0000000 1.0000000 2.379e-15 9.486e-14 1.0000000 1.0000000
lda_safs 1.204e-13 2.051e-15 < 2.2e-16 1.0000000 0.0002075 6.769e-15 < 2.2e-16
lda_pca 1.0000000 0.5988760 0.3064779 6.841e-13 3.971e-12 1.0000000 0.1333848
lda_rfe lda_safs lda_pca
lr -0.078667 0.250667 -0.025333
lr_sbf -0.003111 0.326222 0.050222
lr_rfe -0.007111 0.322222 0.046222
lr_safs -0.372444 -0.043111 -0.319111
lr_pca -0.248444 0.080889 -0.195111
lda -0.034667 0.294667 0.018667
lda_sbf -0.013333 0.316000 0.040000
lda_rfe 0.329333 0.053333
lda_safs 3.805e-16 -0.276000
lda_pca 0.3630450 1.123e-12
Bibliography
[1] Ingo Feinerer. tm: Text Mining Package, 2012. R package version 0.5-7.1.
[2] Ingo Feinerer, Kurt Hornik, and David Meyer. Text mining infrastructure in R. Journal of Statistical Software, 25(5):1-54, 3 2008.
[3] Ian Fellows. wordcloud: Word Clouds, 2014. R package version 2.5.
[4] M. Kuhn and K. Johnson. Applied Predictive Modeling. Springer, 2013.
[5] Max Kuhn. Contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, and the R Core Team. caret: Classification and Regression Training, 2014. R package version 6.0-35.
---
title: 'Preprocessing, clustering and classification of tweets in R'
author: 'Julen Etxaniz'
output:
  html_notebook: 
    toc: yes
    toc_float: yes
    number_sections: yes
---

# Corpus

Twitter provides us with vast amounts of user-generated language data, which is a dream for anyone wanting to conduct textual analysis. The `twitteR` library provides access to Twitter data. Twitter marks its use as the ‘official’ way to download its tweets. An attractive and ‘easy-to-use’ alternative to Twitter’s ‘official rules’ is based on the use of the `rtweet` package. The [following link](https://github.com/ropensci/rtweet) seems to be a more updated package. This [set of slides](https://mkearney.github.io/nicar_tworkshop/) offers an easy-to-follow tutorial, showing the pipeline that you need.

Twitter’s link to create Twitter applications is https://developer.twitter.com/en/apps. You need to be logged in to Twitter to create a new app. This will provide you with a set of 5 items related to the application called `app`, `consumerKey`, `consumerSecret`, `accessToken` and `accessSecret`. Both `accessToken` and `accessSecret` need to be activated after receiving the `consumerKey` and `consumerSecret`. Five parameters need to be used in the final authentication function call, `create_token()`.

```
token <- create_token(
    app = app,
    consumer_key = consumer_key,
    consumer_secret = consumer_secret,
    access_token = access_key,
    access_secret = access_secret
)
```

Once the authentication is done, tweets of any user or hashtag can be retrieved and converted to a corpus. In this case, I have decided to make a corpus with the tweets of two mobile game accounts. As they are similar games, performing classification of tweets will be a challenging task. Only the last 1000 tweets of each account are retrieved.

Therefore, we have a binary classification problem, where the class is `clashroyale` or `clashofclans`. As we are working with text, the predictive features that we have are related to words.

```{r}
library(rtweet)
# retrieve user tweets
n <- 1000
clashroyale_tweets <- get_timeline("clashroyale", n = n)
clashofclans_tweets <- get_timeline("clashofclans", n = n)
```

In the first 5 tweets of each dataset we can see that the tweets don't only have words. There are also links and emotes, for example. In the next section we will have to decide what we want to do with those elements. Apart from the text, much other data is returned by the previous function. In total, there are 90 columns, but we will only use a few of them. The most important one is the `text` column. We will use some other features such as the date for visualization.

```{r}
head(clashroyale_tweets, n = 5L)
head(clashofclans_tweets, n = 5L)
clashroyale_tweets$text[1:5]
clashofclans_tweets$text[1:5]
```

We can use the `tm` library to build a corpus for each class. Each tweet will be a document in this corpus. Then we can merge them to have a single corpus. Building a corpus is recommended because the `tm` package offers many transformations for preprocessing text.

```{r}
library(tm)
# combine both frames in a single, binary, annotated set
tweets <- rbind(clashroyale_tweets, clashofclans_tweets)
# interpreting each element of the annotated vector as a document
clashroyale_docs <- VectorSource(clashroyale_tweets$text)
clashofclans_docs <- VectorSource(clashofclans_tweets$text)
# convert to a corpus: supervised classification to be applied in future steps
clashroyale_corpus <- VCorpus(clashroyale_docs)
clashofclans_corpus <- VCorpus(clashofclans_docs)
# merge, concatenate both groups-corpuses
corpus <- c(clashroyale_corpus, clashofclans_corpus)
```

# Visualization

Visualizing the data is important to understand our corpus. In this section there are various time series plots, donut plots and wordclouds.

## Time Series Plot

We can use the `rtweet` package to get a time series plot with the frequencies of tweets. In these examples, I analyse the frequencies of both accounts by month, week and day. The tweet frequencies are similar, although Clash Royale has more tweets.

```{r}
ts_plot(dplyr::group_by(tweets, screen_name), "month") +
    ggplot2::theme_minimal() +
    ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
    ggplot2::labs(
        x = "Date", y = "Count",
        title = "Frequency of Tweets from Clash Royale and Clash of Clans",
        subtitle = "Tweet counts aggregated by month"
    )
```

```{r}
ts_plot(dplyr::group_by(tweets, screen_name), "week") +
    ggplot2::theme_minimal() +
    ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
    ggplot2::labs(
        x = "Date", y = "Count",
        title = "Frequency of Tweets from Clash Royale and Clash of Clans",
        subtitle = "Tweet counts aggregated by week"
    )
```

```{r}
ts_plot(dplyr::group_by(tweets, screen_name), "day") +
    ggplot2::theme_minimal() +
    ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
    ggplot2::labs(
        x = "Date", y = "Count",
        title = "Frequency of Tweets from Clash Royale and Clash of Clans",
        subtitle = "Tweet counts aggregated by day"
    )
```

## Tweet Types Chart

Analysing the ratio of quotes, replies, retweets and organic tweets can tell us the type of tweets we are analysing. We could choose to only keep organic tweets for our corpus. Removing retweets might reduce the variability of the data and, therefore, make it easier to classify. This time we will keep all tweet types, but we will still visualize the types in a donut chart.

As a first step, we have to divide each account's tweets into the previously mentioned subsets.

```{r}
tweet_types <- function(tweets) {
    # Remove retweets
    organic <- tweets[tweets$is_retweet == FALSE, ]
    # Remove replies
    organic <- subset(organic, is.na(organic$reply_to_status_id))
    # Remove quotes
    organic <- organic[organic$is_quote == FALSE, ]
    # Keeping only the retweets
    retweets <- tweets[tweets$is_retweet == TRUE, ]
    # Keeping only the replies
    replies <- subset(tweets, !is.na(tweets$reply_to_status_id))
    # Keeping only the quotes
    quotes <- tweets[tweets$is_quote == TRUE, ]
    types_list <- list(organic, retweets, replies, quotes)
    return(types_list)
}
```

```{r}
# get clashroyale tweet types
clashroyale_types <- tweet_types(clashroyale_tweets)
clashroyale_organic <- clashroyale_types[[1]]
clashroyale_retweets <- clashroyale_types[[2]]
clashroyale_replies <- clashroyale_types[[3]]
clashroyale_quotes <- clashroyale_types[[4]]

# get clashofclans tweet types
clashofclans_types <- tweet_types(clashofclans_tweets)
clashofclans_organic <- clashofclans_types[[1]]
clashofclans_retweets <- clashofclans_types[[2]]
clashofclans_replies <- clashofclans_types[[3]]
clashofclans_quotes <- clashofclans_types[[4]]
```

Then, we create a separate data frame containing the number of organic tweets, retweets, replies and quotes. We have to prepare the data frame for a donut chart. This includes adding columns that calculate the ratios and percentages, and making some visualisation tweaks such as specifying the legend and rounding the data.

```{r}
type_data <- function(organic, retweets, replies, quotes) {
    # Creating a data frame
    data <- data.frame(
        category = c("Organic", "Retweets", "Replies", "Quotes"),
        count = c(dim(organic)[1], dim(retweets)[1], dim(replies)[1], dim(quotes)[1])
    )

    # Adding columns
    data$fraction <- data$count / sum(data$count)
    data$percentage <- data$count / sum(data$count) * 100
    data$ymax <- cumsum(data$fraction)
    data$ymin <- c(0, head(data$ymax, n = -1))

    # Rounding the data to two decimal points
    data[, -1] <- round(data[, -1], 2)
    return(data)
}
```

```{r}
library(ggplot2)
clashroyale_data <- type_data(clashroyale_organic, clashroyale_retweets, clashroyale_replies, clashroyale_quotes)
type <- paste(clashroyale_data$category, clashroyale_data$percentage, "%")
ggplot(clashroyale_data, aes(ymax = ymax, ymin = ymin, xmax = 4, xmin = 3, fill = type)) +
    geom_rect() +
    coord_polar(theta = "y") +
    xlim(c(2, 4)) +
    theme_void() +
    theme(legend.position = "right") +
    labs(title = "Clash Royale Tweet Types")
```

```{r}
clashofclans_data <- type_data(clashofclans_organic, clashofclans_retweets, clashofclans_replies, clashofclans_quotes)
type <- paste(clashofclans_data$category, clashofclans_data$percentage, "%")
ggplot(clashofclans_data, aes(ymax = ymax, ymin = ymin, xmax = 4, xmin = 3, fill = type)) +
    geom_rect() +
    coord_polar(theta = "y") +
    xlim(c(2, 4)) +
    theme_void() +
    theme(legend.position = "right") +
    labs(title = "Clash of Clans Tweet Types")
```

## Initial Wordclouds

Before learning the machine learning models presented later, let’s build a wordcloud with the following package [3]. Its `wordcloud()` command needs the list of words and their frequencies as parameters. As the words appear in columns in the document-term matrix, the `colSums` command is used to calculate the word frequencies. In order to complete the needed calculations, note that the document-term matrix needs to be transformed (casted) to a matrix form with the `as.matrix` cast-operator. This initial document-term matrix is very sparse: it contains 2000 documents and 7854 terms.

We can see that the generated wordclouds are not very informative. The reason for this is that the most common words are English stop words. These words are very common, but don't carry any meaning. That's why we should remove them from our corpus.
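
For reference, we can peek at the stop word list that the `tm` package ships; a quick look at its first entries (the package is already loaded at this point):

```{r}
# first entries of the English stop word list provided by tm
head(stopwords("english"), 10)
```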

```{r}
corpus_dtm_init <- DocumentTermMatrix(corpus)
corpus_dtm_init
```

```{r}
library(wordcloud)
word_freqs <- sort(colSums(as.matrix(corpus_dtm_init)[1:n, ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 100, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

```{r}
word_freqs <- sort(colSums(as.matrix(corpus_dtm_init)[(n + 1):(n + n), ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 100, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

## Better Wordclouds

To make a better wordcloud, we can pass the text directly. A corpus will be generated and stop words will be removed automatically. However, this time emotes are kept, and we can see that some of them are quite common. We can see that the following wordclouds are much more informative. We can already see some differences and similarities between the corpora.

```{r}
wordcloud(clashroyale_tweets$text, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

```{r}
wordcloud(clashofclans_tweets$text, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

## Hashtag Wordclouds

Finally, we can create another wordcloud that only contains the hashtags. We can see that hashtags are not very common, but they are different between the two corpora. We will have to decide if we want to keep or remove them in the next section.

```{r}
clashroyale_tweets$hashtags <- as.character(clashroyale_tweets$hashtags)
clashroyale_tweets$hashtags <- gsub("c\\(", "", clashroyale_tweets$hashtags)
wordcloud(clashroyale_tweets$hashtags, min.freq = 1, scale = c(3.5, .5), max.words = 50, random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
```

```{r}
clashofclans_tweets$hashtags <- as.character(clashofclans_tweets$hashtags)
clashofclans_tweets$hashtags <- gsub("c\\(", "", clashofclans_tweets$hashtags)
wordcloud(clashofclans_tweets$hashtags, min.freq = 1, scale = c(3.5, .5), max.words = 50, random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))
```

# Preprocessing

As we have said before, some preprocessing is needed so that we get better results when classifying the documents. First, we will apply some transformations such as removing stop words to the text. Then, we will remove sparse words and outlier documents from the corpus. Finally, we will display the final wordclouds so that we can compare them with the initial ones.

## Apply Transformations

Transformation operators are applied to the corpus via the `tm_map` function, which applies (maps) a function to all elements of the corpus. The transformations will be applied to the whole corpus, which contains documents of both classes. Apart from the transformations that are available in the `tm` package, some custom transformations are also applied with the function `content_transformer`.

First, some elements are removed from the corpus: numbers, punctuation, urls, mentions, hashtags, newlines and emojis. Then, all the words are converted to lowercase. Next, the previously mentioned English stop words are removed. Afterwards, multiple whitespace characters are collapsed into a single one. Finally, all the words are stemmed to reduce the number of words. We can print the first 5 tweets of each corpus to see the difference with the initial ones.

```{r}
remove_urls <- function(text) {
    gsub("http\\S*", "", text)
}
remove_mentions <- function(text) {
    gsub("@\\S*", "", text)
}
remove_hashtags <- function(text) {
    gsub("#\\S*", "", text)
}
remove_newlines <- function(text) {
    gsub("\\\n", " ", text)
}
remove_emojis <- function(text) {
    gsub("[^\x01-\x7F]", "", text)
}
```

```{r}
# remove numbers
corpus_trans <- tm_map(corpus, removeNumbers)
# remove punctuation
corpus_trans <- tm_map(corpus_trans, removePunctuation)
# remove urls
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_urls))
# remove mentions
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_mentions))
# remove hastags
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_hashtags))
# remove newlines
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_newlines))
# remove emojis
corpus_trans <- tm_map(corpus_trans, content_transformer(remove_emojis))
# convert to lowercase
corpus_trans <- tm_map(corpus_trans, content_transformer(tolower))
# remove english stop words
corpus_trans <- tm_map(corpus_trans, removeWords, stopwords("english"))
# strip whitespace
corpus_trans <- tm_map(corpus_trans, stripWhitespace)
# to access Porter's word stemming algorithm
library(SnowballC)
corpus_trans <- tm_map(corpus_trans, stemDocument)
```

```{r}
for (i in 1:5) {
    print(corpus_trans[[i]]$content)
}
for (i in (n + 1):(n + 5)) {
    print(corpus_trans[[i]]$content)
}
```

## Remove Sparse Terms

After corpus transformation, a common approach in text mining is to **create a document-term matrix** from a corpus. This document-term matrix is the starting point to apply machine-learning modelization techniques such as classification and clustering. Different operations can be applied over this matrix. We can obtain the terms that occur at least 50 times. We can also consult the **terms that are associated** with the term "mainten" by at least a 0.3 correlation degree. We can see that the correlated words make sense: "short maintenance break soon", "server upkeep".

```{r}
corpus_dtm <- DocumentTermMatrix(corpus_trans)
corpus_dtm
findFreqTerms(corpus_dtm, 50)
findAssocs(corpus_dtm, term = "mainten", corlimit = 0.3)
```

We have removed nearly 4000 words from the initial document-term matrix. However, it still has a huge degree of sparsity: a low proportion of non-zero elements. Thus, one of the most important operations is to remove sparse terms, i.e. terms occurring in very few documents. The `sparse` parameter of the `removeSparseTerms` function sets the maximum sparseness allowed: the smaller its value, the fewer terms are retained. A trial and error approach will finally return a proper number of terms. This matrix will be the starting point for building further machine learning models.

After trying multiple values, we decide to keep terms with a maximum sparseness of `0.99`. This seems very high, but it already reduces the number of terms drastically. In fact, with lower sparseness values the number of retained terms is too low.

```{r}
corpus_dtm_95 <- removeSparseTerms(corpus_dtm, sparse = 0.95)
corpus_dtm_95
barplot(as.matrix(corpus_dtm_95),
    xlab = "terms", ylab = "number of occurrences",
    main = "Most frequent terms (sparseness=0.95)"
)
corpus_dtm_97 <- removeSparseTerms(corpus_dtm, sparse = 0.97)
corpus_dtm_97
barplot(as.matrix(corpus_dtm_97),
    xlab = "terms", ylab = "number of occurrences",
    main = "Most frequent terms (sparseness=0.97)"
)
corpus_dtm_99 <- removeSparseTerms(corpus_dtm, sparse = 0.99)
corpus_dtm_99
terms <- dim(corpus_dtm_99)[2]
barplot(as.matrix(corpus_dtm_99),
    xlab = "terms", ylab = "number of occurrences",
    main = "Most frequent terms (sparseness=0.99)"
)
```

## Outlier Detection

Outlier detection can be used to detect and remove outlier documents from the corpus. We test the Isolation Forest method. I decided not to remove any document to simplify the next steps.

Isolation Forest builds random trees that try to isolate each document from the rest. As outliers are easier to isolate, their anomaly score is high. We have to plot the outlierness and decide on a threshold.

![Isolation Forest](../images/isolation_forest.png)

```{r}
library(solitude)
# Empty tree structure
iso <- isolationForest$new()

# convert dtm to dataframe
corpus_df_99 <- as.data.frame(as.matrix(corpus_dtm_99))

# Learn the IsolationForest for our data
iso$fit(corpus_df_99)

# predict for our data
p <- iso$predict(corpus_df_99)

# plot anomaly score
plot(density(p$anomaly_score), main = "Anomaly Score Density")

# Based on the plot, decide the cut-off point
which(p$anomaly_score > 0.62)
```

## Final Wordclouds

Finally, the wordclouds of the reduced document-term matrix are plotted. We can see the difference with the initial wordclouds. The terms of each wordcloud are now significantly different.

```{r}
# calculate the frequency of words and sort in descending order.
word_freqs <- sort(colSums(as.matrix(corpus_dtm_99)[1:n, ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

```{r}
word_freqs <- sort(colSums(as.matrix(corpus_dtm_99)[(n + 1):(n + n), ]), decreasing = TRUE)
wordcloud(words = names(word_freqs), freq = word_freqs, max.words = 50, scale = c(3.5, 0.25), random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

# Clustering

## Clustering Words

We try to find clusters of words with hierarchical clustering, a popular clustering technique which builds a dendrogram to iteratively group pairs of similar objects. To do so, a matrix with the sparse terms removed is needed. We select the 0.97 sparsity matrix so that the terms can be visualized. After the application of the matrix-casting operator, the number of occurrences is scaled.

We need to calculate the distance between pairs of terms. The `dist` operator performs this calculation between pairs of rows of the provided matrix. As terms appear in the columns of the document-term matrix (`corpus_dtm_97`), it needs to be transposed by means of the `t` operator. The clustering dendrogram is built with the `hclust` operator. It needs as input the calculated distance matrix between pairs of terms and a criterion to decide which pair of clusters to consecutively join in the bottom-up dendrogram. In this case, the "complete" criterion takes into account the maximum distance between any pair of terms of both clusters to be merged. Height in the dendrogram denotes the *distance* between a merged pair of clusters.

```{r}
dist_matrix <- dist(t(scale(as.matrix(corpus_dtm_97))))
term_clustering <- hclust(dist_matrix, method = "complete")
plot(term_clustering)
```

## Clustering Documents

Another type of popular task is to construct clusters of similar documents based on the frequencies of word occurrences. Here we select a small subset of the initial corpus, 15 documents from each class. We then apply a similar method to the previous one and try to divide documents into two clusters.

```{r}
dist_matrix <- dist(scale(as.matrix(corpus_dtm_99)[(n - 15):(n + 15), ]))
groups <- hclust(dist_matrix, method = "ward.D")
plot(groups, cex = 0.9, hang = -1)
rect.hclust(groups, k = 2)
```

# Data Splitting

Before learning a classification model we have to define the subsets of samples (documents) to train and test our model. We first need to create a data frame from the document-term matrix.

## Create Data Frame

The 0.99 sparseness value document-term matrix is our starting point. This matrix has 181 features, which correspond to the most frequent terms. We first need to append the class vector as the last column of the matrix. There are 1000 documents of each class, 2000 documents in total.

```{r}
dim(corpus_dtm_99)
type <- c(rep("clashroyale", n), rep("clashofclans", n)) # create the type vector
corpus_dtm_99 <- cbind(corpus_dtm_99, type) # append
dim(corpus_dtm_99) # consult the updated number of columns
```

This new matrix is the starting point for supervised classification. However, we first need to convert it to a dataframe. The name of the last column is updated. All the values are converted to numeric and the last column is converted to factor.

```{r}
# convert the matrix to a data frame and name the class column
corpus_df_99 <- as.data.frame(as.matrix(corpus_dtm_99))
colnames(corpus_df_99)[terms + 1] <- "type"
corpus_df_99$type <- as.factor(corpus_df_99$type)
# convert all columns (term counts) to numeric
corpus_df_99 <- as.data.frame(sapply(corpus_df_99, as.numeric))
corpus_df_99[is.na(corpus_df_99)] <- 0
# restore the class column as a factor with readable labels
corpus_df_99$type <- as.factor(corpus_df_99$type)
levels(corpus_df_99$type) <- c("clashofclans", "clashroyale")
```

## Create Data Partition

The `createDataPartition` function produces a train-test partition of our corpus. This will be maintained during the whole pipeline of analysis. Test samples won't be used for any modeling decision. We will only use them at the end to predict their class and create a confusion matrix. A list of randomly sampled numbers (`in_train`) is used to partition the whole corpus. 75% of the samples are used for training and the remaining 25% for testing.

```{r}
library(caret)
set.seed(107) # a random seed to enable reproducibility
in_train <- createDataPartition(y = corpus_df_99$type, p = .75, list = FALSE)
str(in_train)
training <- corpus_df_99[in_train, ]
testing <- corpus_df_99[-in_train, ]
nrow(training)
```

Similarly, `createResample` can be used to make simple bootstrap samples. This creates resamples of the size of the corpus with repeated documents. `createFolds` can be used to generate balanced cross-validation groupings from a set of data.

```{r}
resamples <- createResample(y = corpus_df_99$type)
str(resamples)
```

```{r}
folds <- createFolds(y = corpus_df_99$type)
str(folds)
```

# Classification

The `caret` [4, 5] package is the reference tool for building supervised classification and regression models in R. It covers all the steps of a classic pipeline: data preprocessing, model building, accuracy estimation, prediction of the type of new samples, and statistical comparison between the performance of different models. This caret cheatsheet illustrates its main functions in a single page: https://github.com/CABAH/learningRresources/blob/main/cheatsheets/caret.pdf.

Our objective is to learn a classifier that predicts the type of future documents based on terms occurrences. We have a two-class supervised classification problem.

We can now start training and testing different supervised classification models. The `train` function implements the building process; its main parameters are the following, and a generic call is sketched after the list.

* `form` parameter is used with the expression `type ~ .` to denote the variable to be predicted, followed by the set of predictors. A point indicates that the rest of the variables are used as predictors. `data` parameter provides the training data.

* `method` parameter fixes the type of classification algorithm to be learned. `caret` supports more than 150 supervised classification and regression algorithms. Taking into account the large dimensionality of classic NLP datasets, we have to use classifiers capable of dealing with it. In this work we choose Linear Discriminant Analysis (LDA) and Boosted Logistic Regression (LR).

* `metric` parameter fixes the score used to assess the goodness of each model. A large set of metrics is offered and we test the following ones: Accuracy, Kappa, ROC, Sensitivity and Specificity.

* `trControl` parameter defines the method to estimate the error of the classifier. The `trainControl` function allows the use of different performance estimation procedures such as k-fold cross-validation, bootstrapping, etc. We apply a 10-fold cross-validation, repeated 3 times. This is an adequate option because it creates 30 results that can later be used to compare algorithms statistically.
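
Putting these pieces together, the generic shape of a training call looks as follows. This is only an illustrative sketch (the concrete models are trained in the next subsections), with the `method` and `metric` values chosen arbitrarily:

```
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3) # error estimation procedure
model <- train(type ~ ., # class variable and predictors
    data = training, # training partition
    method = "lda", # classification algorithm
    metric = "Accuracy", # score used to select the best model
    trControl = train_ctrl
)
```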

## Linear Discriminant Analysis

![Linear Discriminant Analysis](../images/linear_discriminant_analysis.png)

LDA is used to find a linear combination of features that characterizes or separates two or more classes. The resulting combination can be used as a linear classifier, or for dimensionality reduction. This time we will use it as a classifier. We will see a similar unsupervised method called Principal Component Analysis (PCA) for dimensionality reduction in the Feature Extraction section.

Accuracy and Kappa are the default metrics used to evaluate algorithms on binary and multi-class classification datasets in caret. As we have to do binary classification, these metrics are adequate. Our classes are completely balanced, and that makes analysing the metrics easier.

Accuracy is the percentage of correctly classified instances out of all instances. It is more useful on binary classification than on multi-class classification problems, because in the latter it is less clear how the accuracy breaks down across the classes. This could be seen with a confusion matrix.

Kappa is similar to accuracy, but it is normalized at the baseline of random chance on our dataset. It is a more useful measure on problems that have an imbalance in the classes. For example, with a 70-30 split for classes 0 and 1, you can achieve 70% accuracy by predicting class 0 for all instances. As our classes are completely balanced, 50% accuracy is obtained by always predicting the same class.
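
As a minimal sketch of this arithmetic with a hypothetical confusion matrix, Kappa normalizes the observed accuracy by the agreement expected from the class marginals alone:

```{r}
# hypothetical confusion matrix counts
tp <- 40; fn <- 10 # positive class: correctly / incorrectly predicted
tn <- 35; fp <- 15 # negative class: correctly / incorrectly predicted
total <- tp + fn + fp + tn
accuracy <- (tp + tn) / total
# agreement expected by chance, from the predicted and real class marginals
expected <- ((tp + fn) / total) * ((tp + fp) / total) +
    ((fp + tn) / total) * ((fn + tn) / total)
kappa <- (accuracy - expected) / (1 - expected)
c(accuracy = accuracy, kappa = kappa)
```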

The obtained accuracy is not very good, but this is expected because the problem is not an easy one. The kappa metric also reflects that our classifier is quite bad.

```{r}
# fixing the performance estimation procedure
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3)
lda_3x10cv <- train(type ~ ., data = training, method = "lda", trControl = train_ctrl)
lda_3x10cv
```

Another metric that is only suitable for binary classification problems is ROC. The area under the ROC curve represents a model's ability to discriminate between positive and negative classes. An area of 1.0 represents a model that makes all predictions perfectly. An area of 0.5 represents a model as good as random.

ROC can be broken down into sensitivity and specificity. A binary classification problem is really a trade-off between sensitivity and specificity. Sensitivity is the true positive rate, also called recall. It is the proportion of instances from the positive (first) class that are predicted correctly. Specificity is also called the true negative rate. It is the proportion of instances from the negative (second) class that are predicted correctly.
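
With the same kind of hypothetical counts as above, both rates reduce to simple ratios:

```{r}
tp <- 40; fn <- 10 # positive class: correctly / incorrectly predicted
tn <- 35; fp <- 15 # negative class: correctly / incorrectly predicted
sensitivity <- tp / (tp + fn) # true positive rate (recall)
specificity <- tn / (tn + fp) # true negative rate
c(sensitivity = sensitivity, specificity = specificity)
```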

To use this metric we have to select it in the function parameters. Moreover, extra parameters must be added to the `trainControl` function. In binary classification problems the `twoClassSummary` option displays the area under the ROC curve, sensitivity and specificity metrics. To do so, activating the `classProbs` option is also needed, which saves the class probabilities that the classifier assigns to each sample.

Looking at these numbers, we realise that the second class is predicted correctly more often than the first one. The first class is predicted correctly 67% of the time and the second one 90% of the time. This will also be evident if we calculate a confusion matrix when testing the model.

```{r}
library(pROC)
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary)
lda_roc_3x10cv <- train(type ~ ., data = training, method = "lda", metric = "ROC", trControl = train_ctrl)
lda_roc_3x10cv
```

## Boosted Logistic Regression

![Logistic Regression](../images/logistic_regression.png)

Logistic Regression is used to model the probability of a certain class. It uses a linear combination of independent variables, and applies the logistic function at the end to obtain probabilities. If we define a cut-off probability, it can be used as a binary or multi-class classification model. Boosted LR is an additive logistic regression model. It uses an ensemble of similar LR models to make predictions.
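
As a minimal sketch of the logistic link, a linear score (the values here are made up) is squashed into a probability, and a 0.5 cut-off turns it into a class label:

```{r}
scores <- c(-2.1, -0.3, 0.4, 1.8) # hypothetical linear combinations
probs <- 1 / (1 + exp(-scores)) # logistic function
ifelse(probs > 0.5, "clashroyale", "clashofclans") # 0.5 cut-off (labels chosen for illustration)
```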

While the linear LDA classifier does not have tunable parameters, Boosted LR has the key `nIter` parameter, which indicates the number of boosting iterations of the model. By default, without changing the value of the parameter, `caret` evaluates 3 models. The `tuneLength` option of the `train` function fixes the number of values of each parameter to be checked. For example, if the classifier has 2 parameters and the `tuneLength` parameter is not changed, 3 x 3 = 9 models are evaluated.

```{r}
train_ctrl <- trainControl(
    method = "repeatedcv", repeats = 3
)
lr_3x10cv <- train(type ~ .,
    data = training, method = "LogitBoost", trControl = train_ctrl
)
lr_3x10cv
plot(lr_3x10cv)
```

If we increase the `tuneLength` to `15` we can evaluate more models, and check if the accuracy increases. We can see that the accuracy improves up to some point and then it is nearly constant. Therefore, it is not worth increasing the value of `nIter` further.

```{r}
train_ctrl <- trainControl(
    method = "repeatedcv", repeats = 3
)
lr_tunel_3x10cv <- train(type ~ .,
    data = training, method = "LogitBoost", trControl = train_ctrl, tuneLength = 15
)
lr_tunel_3x10cv
plot(lr_tunel_3x10cv)
```

We can also try the ROC metric to have more information about the performance of our model. We get similar results to the LDA classifier, with a much higher Specificity than Sensitivity.

```{r}
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary)
lr_roc_3x10cv <- train(type ~ ., data = training, method = "LogitBoost", trControl = train_ctrl, metric = "ROC", tuneLength = 15)
lr_roc_3x10cv
plot(lr_roc_3x10cv)
```

The `tuneGrid` option makes it possible to specify the exact set of parameter values to be tested.

```{r}
train_ctrl <- trainControl(
    method = "repeatedcv", repeats = 3
)
tune_grid <- expand.grid(
    nIter = seq(100, 120, 2)
)
lr_tuneg_3x10cv <- train(type ~ .,
    data = training, method = "LogitBoost", trControl = train_ctrl, tuneGrid = tune_grid
)
lr_tuneg_3x10cv
plot(lr_tuneg_3x10cv)
```

# Subsampling

Our initial corpus is completely balanced, it has 1000 samples of each class. However, we can create an unbalanced corpus by removing some samples. For example, we can create a corpus that has 1000 samples of one class and 250 from the other class. If class-label distributions are unbalanced in our corpus, a resampling method will try to improve the recovery rate in the minority class.

This test will only be performed with the LR classifier. First, a normal classifier will be trained. Then multiple resampling methods will be tested and compared with the base classifier. ROC is an adequate metric in this case because we can compare the sensitivity and specificity for each subsampling method.

We expect to have very high specificity but low sensitivity. Therefore, our aim is to increase sensitivity. Downsampling and upsampling improve the sensitivity a bit, while the hybrid method gets worse results.

```{r}
corpus_df_99_un <- corpus_df_99[1:(n + n / 4), ]
in_train_un <- createDataPartition(y = corpus_df_99_un$type, p = .75, list = FALSE)
str(in_train_un)
training_un <- corpus_df_99_un[in_train_un, ]
testing_un <- corpus_df_99_un[-in_train_un, ]
```

```{r}
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary)
lr_un_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric = "ROC", trControl = train_ctrl)
lr_un_3x10cv
```

## Downsampling

Downsampling randomly subsets all the classes in the training set so that their class frequencies match the least prevalent class. For example, suppose that 80% of the training set samples are the first class and the remaining 20% are in the second class. Down-sampling would randomly sample the first class to be the same size as the second class (so that only 40% of the total training set is used to fit the model).
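
Besides the `sampling` option of `trainControl`, caret also provides a standalone `downSample` helper. As a quick sketch, we can apply it to our unbalanced training set just to inspect the resulting class frequencies:

```{r}
down_train <- downSample(x = training_un[, -ncol(training_un)], y = training_un$type, yname = "type")
table(down_train$type)
```

Note that the `sampling = "down"` option used below performs this step inside each resampling iteration instead, which avoids overly optimistic performance estimates.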

```{r}
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary, sampling = "down")
lr_down_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric = "ROC", trControl = train_ctrl)
lr_down_3x10cv
```

## Upsampling

Upsampling randomly samples the minority class to be the same size as the majority class.

```{r}
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary, sampling = "up")
lr_up_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric = "ROC", trControl = train_ctrl)
lr_up_3x10cv
```

## Hybrid

A hybrid method downsamples the majority class and synthesizes new data points in the minority class.

```{r}
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE, summaryFunction = twoClassSummary, sampling = "smote")
lr_smote_3x10cv <- train(type ~ ., data = training_un, method = "LogitBoost", metric = "ROC", trControl = train_ctrl)
lr_smote_3x10cv
```

# Feature Selection

Most approaches for reducing the number of features can be placed into two main categories: wrappers and filters.

Wrapper methods evaluate multiple models using procedures that add and/or remove predictors to find the optimal combination that maximizes model performance. In essence, wrapper methods are search algorithms that treat the predictors as the inputs and utilize model performance as the output to be optimized.

Filter methods evaluate the relevance of the predictors outside of the predictive models and subsequently model only the predictors that pass some criterion. Each predictor is evaluated individually to check if there is a plausible relationship between it and the observed classes. Only predictors with important relationships would then be included in a classification model.

The functions are applied to the entire training set and also to different resampled versions of the data set. From this, generalizable estimates of performance can be computed that properly take into account the feature selection step.

In our case we will test Univariate Filter and 2 wrapper methods: Recursive Feature Elimination and Simulated Annealing. We will apply these methods to both classifiers and we will compare the results at the end.

## Univariate Filter

Predictors can be filtered by conducting some sort of sample test to see if the mean of the predictor is different between the classes. Predictors that have statistically significant differences between the classes are then used for modeling.
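
To illustrate the idea, a simplified sketch with a plain per-column t-test (the `sbf` machinery used below is more elaborate, as it wraps the filter inside the resampling procedure):

```{r}
# p-value of a two-sample t-test for each term in the training partition
pvals <- apply(training[, -ncol(training)], 2, function(x) {
    tryCatch(t.test(x ~ training$type)$p.value, error = function(e) 1)
})
# number of terms whose mean frequency differs significantly between classes
sum(pvals < 0.05)
```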

On average, fewer than 80 variables are selected and the accuracy of the classifiers is improved. Therefore, this method is a great option in this case.

```{r}
library(randomForest)
sbf_ctrl <- sbfControl(functions = rfSBF, method = "repeatedcv", repeats = 3)
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE)
lr_sbf_3x10cv <- sbf(type ~ ., data = training, method = "LogitBoost", trControl = train_ctrl, sbfControl = sbf_ctrl)
lr_sbf_3x10cv
lda_sbf_3x10cv <- sbf(type ~ ., data = training, method = "lda", trControl = train_ctrl, sbfControl = sbf_ctrl)
lda_sbf_3x10cv
```

## Recursive Feature Elimination

First, the algorithm fits the model to all predictors. Each predictor is ranked using its importance to the model. At each iteration of feature selection, the top ranked predictors are retained, the model is refit and performance is assessed. The number of predictors with the best performance is determined and the top predictors are used to fit the final model. In this case 4, 8, 16 and 181 predictors are tested.
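
The candidate subset sizes can also be fixed explicitly through the `sizes` argument of `rfe`; a sketch with arbitrary sizes, mirroring the call below:

```
lr_rfe_sizes <- rfe(type ~ ., data = training, method = "LogitBoost",
    sizes = c(4, 8, 16, 32, 64), trControl = train_ctrl, rfeControl = rfe_ctrl)
```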

The accuracy of the classifiers is improved. Therefore, this method is also a great option in this case.

```{r}
rfe_ctrl <- rfeControl(functions = rfFuncs, method = "repeatedcv", repeats = 3)
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3)
lr_rfe_3x10cv <- rfe(type ~ ., data = training, method = "LogitBoost", trControl = train_ctrl, rfeControl = rfe_ctrl)
lr_rfe_3x10cv
lda_rfe_3x10cv <- rfe(type ~ ., data = training, method = "lda", trControl = train_ctrl, rfeControl = rfe_ctrl)
lda_rfe_3x10cv
```

## Simulated Annealing

Simulated annealing is a global search method that makes small perturbations to an initial candidate solution. If the performance value for the perturbed value is better than the previous solution, the new solution is accepted. If not, an acceptance probability is determined based on the difference between the two performance values and the current iteration of the search. In the context of feature selection, a solution is a binary vector that describes the current subset. The subset is perturbed by randomly changing a small number of members in the subset.
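
As a minimal sketch of the generic acceptance rule (caret's `safs` implements its own internal schedule; this only illustrates the textbook form, with an assumed decaying temperature):

```{r}
# generic simulated annealing acceptance rule (illustrative only)
accept <- function(old_perf, new_perf, iter, temp0 = 1) {
    if (new_perf > old_perf) return(TRUE) # better solutions are always accepted
    temp <- temp0 / iter # temperature decays as the search advances
    runif(1) < exp((new_perf - old_perf) / temp) # worse solutions, with decreasing probability
}
accept(old_perf = 0.80, new_perf = 0.78, iter = 10)
```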

Using this method the accuracy of the models decreases a lot, so it is not a good option.

```{r}
safs_ctrl <- safsControl(functions = caretSA, method = "repeatedcv", repeats = 3)
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3)
lr_safs_3x10cv <- safs(x = training[, -ncol(training)], y = training$type, method = "LogitBoost", trControl = train_ctrl, safsControl = safs_ctrl)
lr_safs_3x10cv
lda_safs_3x10cv <- safs(x = training[, -ncol(training)], y = training$type, method = "lda", trControl = train_ctrl, safsControl = safs_ctrl)
lda_safs_3x10cv
```

# Feature Extraction

Unlike feature selection, a set of new features is constructed from the original ones, commonly as linear combinations of them. There are multiple methods to do feature extraction such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Unlike PCA, LDA is a supervised method that can also be used for classification. This time we will only apply PCA to both classifiers, because one of our classifiers is LDA.

## Summary Table

It is easy to learn a PCA in R with the `prcomp` function. First, we will print the summary of the principal components. We can see that there are 181 principal components. The principal components are not very good: their proportion of explained variance is generally very low. We would have to select many principal components to get a high proportion of variance.

```{r}
pca_res <- prcomp(scale(training[, -ncol(training)]))
summary(pca_res)
```

## Variance Plots

We can visualize the previous values in different plots to get a better idea of the variance of the principal components.

```{r}
pca_res_var <- pca_res$sdev^2
pca_res_pvar <- pca_res_var / sum(pca_res_var)

plot(pca_res_pvar, xlab = "Principal component", ylab = "Proportion of variance explained", ylim = c(0, 1), type = "b")
plot(cumsum(pca_res_pvar), xlab = "Principal component", ylab = "Cumulative proportion of variance explained", ylim = c(0, 1), type = "b")
screeplot(pca_res, type = "l")
```

## Main Components Plot

We visualize the first two components, those that retain the largest variability of the original data, in a 2-D graph. The aim is to find an intuitive separation of the problem classes. As expected, there is no clear separation between the classes. The variance captured by the principal components is too low to separate the two classes.

```{r}
plot(main="Principal Components", pca_res$x[,1:2], col = training$type)
```

## Classification

Finally, we can use `caret` to test the performance of the two models if we apply PCA as a preprocessing option. The `preProcess` parameter defines the preprocessing steps to be applied. They are popular with classic numeric variables, for steps such as imputation of missing values, centering and scaling, etc. As NLP datasets have their own preprocessing tools, they have not been applied until now. However, caret offers `pca` as a preprocessing option. Two more preprocessing functions are applied: `center` and `scale`.

As we expected, applying PCA does not improve the results of the classifiers. In fact, the results are worse for both classifiers.

```{r}
# fixing the performance estimation procedure
train_ctrl <- trainControl(method = "repeatedcv", repeats = 3)
lr_pca_3x10cv <- train(type ~ ., data = training, method = "LogitBoost", preProcess = c("center", "scale", "pca"), trControl = train_ctrl)
lr_pca_3x10cv
lda_pca_3x10cv <- train(type ~ ., data = training, method = "lda", preProcess = c("center", "scale", "pca"), trControl = train_ctrl)
lda_pca_3x10cv
```

# Testing

In order to predict the class value of unseen documents of the test partition, caret uses the parameter configuration of each classifier that showed the best accuracy estimation. The `predict` function implements this functionality. Its `type` parameter, by means of its `prob` value, outputs the probability of each test sample belonging to each class. On the other hand, the `raw` value outputs the class value with the largest probability. By means of the `raw` option the confusion matrix can be calculated: it crosses, for each test sample, the predicted and real class values.
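
For instance, before committing to a label, we can inspect the class probabilities of the first test documents; a quick sketch with the base LDA model trained above:

```{r}
head(predict(lda_3x10cv, newdata = testing, type = "prob"))
```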

All the previously learned classifiers are tested on the test partition. There are 10 different classifiers in total: the two main types plus their variations with feature selection and extraction. As expected, the accuracy on the test partition is a bit lower than on the train partition. Specificity is higher than Sensitivity in all cases, which means that our models are better at predicting samples of the second class, clashroyale. This can also be seen in the confusion matrices. The performance of each algorithm will be compared in more detail in the next section.

## LDA

```{r}
lda_pred <- predict(lda_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_pred, testing$type)
```

## LDA SBF

```{r}
lda_sbf_pred <- predict(lda_sbf_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_sbf_pred$pred, testing$type)
```

## LDA RFE

```{r}
lda_rfe_pred <- predict(lda_rfe_3x10cv, newdata = testing)
confusionMatrix(data = lda_rfe_pred$pred, testing$type)
```

## LDA SAFS

```{r}
lda_safs_pred <- predict(lda_safs_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_safs_pred, testing$type)
```

## LDA PCA

```{r}
lda_pca_pred <- predict(lda_pca_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lda_pca_pred, testing$type)
```

## BLR

```{r}
lr_pred <- predict(lr_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_pred, testing$type)
```

## BLR SBF

```{r}
lr_sbf_pred <- predict(lr_sbf_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_sbf_pred$pred, testing$type)
```

## BLR RFE

```{r}
lr_rfe_pred <- predict(lr_rfe_3x10cv, newdata = testing)
confusionMatrix(data = lr_rfe_pred$pred, testing$type)
```

## BLR SAFS

```{r}
lr_safs_pred <- predict(lr_safs_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_safs_pred, testing$type)
```

## BLR PCA

```{r}
lr_pca_pred <- predict(lr_pca_3x10cv, newdata = testing, type = "raw")
confusionMatrix(data = lr_pca_pred, testing$type)
```

# Comparison

As a final step, we will compare the 10 models that we have trained. First, we will compare the results in a table. Then, we will create some plots to compare performance of the algorithms visually. Finally, we will perform a statistical significance test to know if there is a significant difference between pairs of classifiers. 

## Summary Tables

This is the easiest comparison that we can do, simply call the `summary` function and pass it the `resamples` result. It will create a table with one algorithm for each row and evaluation metrics for each column. 

By looking at those values we can have an idea of which classifiers are the best ones. If we look at the base classifiers, LDA is better than LR. However, applying SBF or RFE feature selection improves the results of both classifiers and makes them similar. The other feature selection and extraction methods make the results of both classifiers worse.

```{r}
resamps <- resamples(list(lr = lr_3x10cv, lr_sbf = lr_sbf_3x10cv, lr_rfe = lr_rfe_3x10cv, lr_safs = lr_safs_3x10cv, lr_pca = lr_pca_3x10cv, lda = lda_3x10cv, lda_sbf = lda_sbf_3x10cv, lda_rfe = lda_rfe_3x10cv, lda_safs = lda_safs_3x10cv, lda_pca = lda_pca_3x10cv))
summary(resamps)
```

## Box and Whisker Plots

This is a useful way to look at the spread of the estimated accuracies for different methods and how they relate. Note that the boxes are ordered from highest to lowest mean accuracy. They are useful to look at the mean values (dots) and the boxes (middle 50% of results). This plot makes it easier to reach the same conclusions that we extracted from the table.

```{r}
scales <- list(x=list(relation="free"), y=list(relation="free"))
bwplot(resamps, scales=scales)
```

## Density Plots

We can show the distribution of model accuracy as density plots. This is a useful way to evaluate the overlap in the estimated behavior of algorithms. They are also useful to look at the differences in the peaks as well as the variance of the distributions.

```{r}
scales <- list(x=list(relation="free"), y=list(relation="free"))
densityplot(resamps, scales=scales, pch = "|")
```

## Dot Plots

These are useful plots as they show both the mean estimated accuracy as well as the 95% confidence interval. They are useful to compare the means and the overlap of the spreads between algorithms. We can compare algorithms like we did with the boxplot.

```{r}
scales <- list(x=list(relation="free"), y=list(relation="free"))
dotplot(resamps, scales=scales)
```

## Scatterplot Matrix

This creates a scatterplot matrix of all results for an algorithm compared to the results for all other algorithms. These are useful to compare pairs of algorithms.

```{r fig.height=10, fig.width=10}
splom(resamps)
```

## Pairwise xyPlots

We can zoom in on one pair-wise comparison of the accuracy for two algorithms with an xyplot. For example, we can compare the two main algorithms to see that LDA is better than LR.

```{r}
xyplot(resamps, what = "BlandAltman", models = c("lr", "lda"))
```

Another useful comparison is to check the effect of feature selection and extraction. For the Logistic Regression algorithm, Univariate Filters and Recursive Feature Elimination improve the accuracy. However, Simulated Annealing and Principal Component Analysis get worse results.

```{r}
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_sbf"))
```

```{r}
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_rfe"))
```

```{r}
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_safs"))
```

```{r}
xyplot(resamps, what = "BlandAltman", models = c("lr", "lr_pca"))
```

## Statistical Significance Tests

Note that in our case, due to the 3 repetitions of the 10-fold cross-validation process, there are 30 resampling results for each classifier. The same paired cross-validation subsets of samples were used for all classifiers. We have to use a paired t-test to calculate the significance of the differences between each pair of classifiers.

Using the `diff` function over the `resamps` object calculates the differences between all pairs of classifiers. The output shows, for each metric (accuracy and kappa), the difference of the mean (positive or negative) between both classifiers. The p-value of the whole t-test is 0, which indicates that there is a significant difference between some classifiers. Therefore, we can discard the null hypothesis that says that there is no difference between classifiers.

The interpretation of the p-value is the key point. It is related to the risk of erroneously discarding the null hypothesis of similarity between the compared classifiers when there is no real difference. Roughly speaking, it can also be interpreted as the degree of similarity between both classifiers. A p-value smaller than 0.05 alerts about statistically significant differences between both classifiers. That is, when the risk of erroneously discarding the hypothesis of similarity between both classifiers is low, we assume that there is a statistically significant difference between them.

The lower diagonal of the table shows p-values for the null hypothesis. The upper diagonal of the table shows the estimated difference between the distributions. We can see that in some cases the p-value is bigger than 0.05, and therefore we cannot discard the null hypothesis. In other cases, the p-value is smaller than 0.05, so we can confidently discard the null hypothesis.

We can see that all the ideas that we had before when comparing classifiers are confirmed with the statistical test. Some classifiers are significantly better than others. The base LDA is better than the base LR, applying SBF and RFE improves the results and applying SAFS and PCA makes results worse.

```{r}
diffs <- diff(resamps)
summary(diffs)
```
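
As a complement, a single pairwise test can be obtained directly with caret's `compare_models` function; a quick sketch for the two base classifiers:

```{r}
compare_models(lr_3x10cv, lda_3x10cv)
```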

# Bibliography

[1] Ingo Feinerer. tm: Text Mining Package, 2012. R package version 0.5-7.1.

[2] Ingo Feinerer, Kurt Hornik, and David Meyer. Text mining infrastructure in R. Journal of Statistical Software, 25(5):1-54, 3 2008.

[3] Ian Fellows. wordcloud: Word Clouds, 2014. R package version 2.5.

[4] M. Kuhn and K. Johnson. Applied Predictive Modeling. Springer, 2013.

[5] Max Kuhn. Contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, and the R Core Team. caret: Classification and Regression Training, 2014. R package version 6.0-35.
