9.3 recommenderlab package
We will use the recommenderlab package to build recommender systems. This package is designed for developing and testing recommender systems in a lab setting rather than in production. The package is available on CRAN.
library(recommenderlab)
library(ggplot2)
library(dplyr)
9.3.1 MovieLense data
We will use the MovieLense data set, which is bundled with recommenderlab. MovieLense contains user ratings for movies on a scale of 1 to 5, where 5 means excellent.
We can load the data using the data() function.
data("MovieLense")
class(MovieLense)
## [1] "realRatingMatrix"
## attr(,"package")
## [1] "recommenderlab"
We have not seen an object of class realRatingMatrix before, because this class is defined by the recommenderlab package. Let's take a look at its structure.
str(MovieLense, vec.len = 2)
## Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
## ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## .. .. ..@ i : int [1:99392] 0 1 4 5 9 ...
## .. .. ..@ p : int [1:1665] 0 452 583 673 882 ...
## .. .. ..@ Dim : int [1:2] 943 1664
## .. .. ..@ Dimnames:List of 2
## .. .. .. ..$ : chr [1:943] "1" "2" ...
## .. .. .. ..$ : chr [1:1664] "Toy Story (1995)" "GoldenEye (1995)" ...
## .. .. ..@ x : num [1:99392] 5 4 4 4 4 ...
## .. .. ..@ factors : list()
## ..@ normalize: NULL
MovieLense looks like a list, but we usually access list elements with the $ sign rather than the @ sign. So what is different about realRatingMatrix? According to the documentation, recommenderlab is implemented using formal classes in the S4 class system (footnote 51). In S4, the components of an object are called slots, and they are accessed with @ instead of $. We can formally check that an object is S4 using the isS4() function from base R.
isS4(MovieLense)
## [1] TRUE
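To make the difference between $ and @ concrete, here is a minimal sketch that contrasts a plain list with a small S4 object. The class ToyRating and its slots are hypothetical and exist only for illustration; they are not part of recommenderlab.

```r
library(methods)  # S4 tools; attached by default in interactive sessions

# A plain list: components are accessed with $
rating_list <- list(user = "1", rating = 5)
rating_list$rating

# A hypothetical S4 class: components are stored in slots
setClass("ToyRating", slots = c(user = "character", rating = "numeric"))
toy <- new("ToyRating", user = "1", rating = 5)

toy@rating            # slot access with @
slot(toy, "rating")   # equivalent functional form
slotNames(toy)        # names of all slots in the object

isS4(rating_list)     # FALSE: a list is not an S4 object
isS4(toy)             # TRUE
```

The same pattern applies to MovieLense: slotNames(MovieLense) lists its slots, although in practice it is usually safer to rely on the accessor methods a package provides than to read slots directly.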
As this is a new class for us, it is important to understand which methods apply to it. We can list them using the methods() function from the utils package, which ships with base R.
methods(class = class(MovieLense))
## [1] [ [<- binarize
## [4] calcPredictionAccuracy coerce colCounts
## [7] colMeans colSds colSums
## [10] denormalize dim dimnames
## [13] dimnames<- dissimilarity evaluationScheme
## [16] getData.frame getList getNormalize
## [19] getRatingMatrix getRatings getTopNLists
## [22] image normalize nratings
## [25] Recommender removeKnownRatings rowCounts
## [28] rowMeans rowSds rowSums
## [31] sample show similarity
## see '?methods' for accessing help and source code
realRatingMatrix has several methods associated with it. For example, we can calculate row and column counts directly using the rowCounts() and colCounts() functions, respectively. Similarly, the normalize() function can be used to mean-center each user's ratings.
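The following sketch tries these methods on a tiny rating matrix so the results are easy to verify by hand. The matrix m and its user and movie labels are made up for illustration, and the sketch assumes that NA entries in an ordinary matrix are treated as missing ratings when it is coerced to a realRatingMatrix.

```r
library(recommenderlab)

# Hypothetical toy ratings: 3 users x 4 movies, NA = not rated
m <- matrix(c( 5,  4, NA,  1,
               4, NA, NA,  1,
              NA,  2,  5, NA),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("u", 1:3), paste0("m", 1:4)))
r <- as(m, "realRatingMatrix")

rowCounts(r)   # number of movies rated by each user
colCounts(r)   # number of ratings received by each movie

# Default normalization centers each user's ratings on that user's mean
r_centered <- normalize(r)
rowMeans(r_centered)   # approximately 0 for every user
```

Because the counts and means are computed only over observed ratings, users who rated few movies are not penalized for the movies they skipped.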
Footnote 51: Read more about the S4 class system here: http://adv-r.had.co.nz/OO-essentials.html