Match cheaters

compare_txt(txt1, txt2, n_grams = 10, across = c("both", "txt1", "txt2"))

Arguments

txt1, txt2

character vectors to compare, each of length 1.

n_grams

Size of the n-grams used for the comparison (default 10). See the ngram package.

across

How should the proportion of overlap be computed? One of "both", "txt1", or "txt2".

Value

The proportion of overlap between the two texts, a number between 0 and 1.

Examples

text1 <- "My horse is large and white, and I ride it every day."
text2 <- "My mule is large and brown, and I ride it most days."
compare_txt(text1, text2, n_grams = 3)
#> [1] 0.3
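
For intuition about where a value like this can come from, here is a minimal base R sketch of a word n-gram overlap. It is not the implementation of compare_txt: the helpers word_ngrams() and overlap_sketch() are hypothetical names, the tokenization (lowercasing, splitting on non-alphanumeric characters) is an assumption, and so are the denominators chosen for each across value.

```r
# Hypothetical helper: split a string into word n-grams using base R only.
# Assumption: words are lowercased and separated by any run of characters
# other than letters, digits, and apostrophes.
word_ngrams <- function(txt, n) {
  words <- strsplit(tolower(txt), "[^a-z0-9']+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical overlap measure (NOT compare_txt itself).
# Assumption: "txt1"/"txt2" divide the number of shared n-grams by the
# number of distinct n-grams in that text; "both" uses the mean of the two.
overlap_sketch <- function(txt1, txt2, n = 3,
                           across = c("both", "txt1", "txt2")) {
  across <- match.arg(across)
  g1 <- unique(word_ngrams(txt1, n))
  g2 <- unique(word_ngrams(txt2, n))
  shared <- length(intersect(g1, g2))
  denom <- switch(across,
                  both = (length(g1) + length(g2)) / 2,
                  txt1 = length(g1),
                  txt2 = length(g2))
  if (denom == 0) return(NA_real_)  # a text shorter than n words has no n-grams
  shared / denom
}

text1 <- "My horse is large and white, and I ride it every day."
text2 <- "My mule is large and brown, and I ride it most days."
overlap_sketch(text1, text2, n = 3)
#> [1] 0.3
```

Under these assumptions each sentence yields 10 distinct trigrams, of which 3 are shared ("is large and", "and i ride", "i ride it"), giving 3 / 10 = 0.3. That the number agrees with the example above suggests a similar computation, but it does not confirm how compare_txt works internally.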