If you've been playing with DocumentTermMatrix from the tm package in R, you might have encountered this error when converting the result with as.matrix:
Error in vector(typeof(x$v), nr * nc) : vector size cannot be NA
In addition: Warning message:
In nr * nc : NAs produced by integer overflow
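The warning can be reproduced on its own in base R. The snippet below is only a sketch: the dimensions are made up for illustration and are not taken from any real corpus.

```r
# Hypothetical dimensions: 50,000 documents by 60,000 terms.
nr <- 50000L
nc <- 60000L

.Machine$integer.max             # 2147483647, the largest value an R integer can hold
cells <- nr * nc                 # NA, with the warning "NAs produced by integer overflow"
is.na(cells)                     # TRUE

# The same product in double precision shows how many cells were requested.
as.numeric(nr) * as.numeric(nc)  # 3e+09
```

Once the product is NA, the allocation of the dense result has no valid length, which is what the "vector size cannot be NA" part of the message reports.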
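The document-term matrix is too large to be converted to an ordinary dense matrix: the number of cells, nr * nc, is more than R's integer type can hold. How do we get around this? We need to remove sparse terms, meaning terms that appear in only a handful of documents, before converting.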
library(tm)
library(magrittr)   # provides %>%
library(wordcloud)
corp <- Corpus(VectorSource(x)) %>% tm_map(content_transformer(tolower)) %>% tm_map(stripWhitespace) %>% tm_map(stemDocument) %>% tm_map(removePunctuation)   # x is a character vector, one document per element
dtm <- DocumentTermMatrix(corp)
density <- vector("numeric", length = ncol(dtm))
for(i in 1:ncol(dtm))
  density[i] <- length(dtm[, i]$j)   # number of documents containing term i
r <- which(density > 10)   # keep terms that appear in more than 10 documents
m <- as.matrix(dtm[, r])
v <- sort(colSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
wordcloud(words = d$word, freq = d$freq)
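The per-term loop above can be slow when there are many terms. As a rough sketch of an alternative, the same counts can be read straight from the column indices of the sparse representation with base R's tabulate. The object dtm_like below is a hand-built, hypothetical stand-in for a DocumentTermMatrix (a list with the same i, j, v, nrow and ncol fields), and the threshold of 2 is chosen only to suit the toy data.

```r
# Hypothetical stand-in for a DocumentTermMatrix. tm stores the matrix as
# sparse triplets: i = document index, j = term index, v = count.
dtm_like <- list(
  i = c(1L, 1L, 2L, 3L, 3L, 3L),
  j = c(1L, 2L, 1L, 1L, 3L, 4L),
  v = c(2, 1, 1, 4, 1, 1),
  nrow = 3L,
  ncol = 4L
)

# Number of documents each term appears in, computed without a loop.
# Assumes each (document, term) pair is stored at most once.
density <- tabulate(dtm_like$j, nbins = dtm_like$ncol)

# Keep terms that appear in at least 2 documents.
r <- which(density >= 2)
```

With a real dtm, tabulate(dtm$j, nbins = ncol(dtm)) should give the same counts as the loop. The tm package also provides removeSparseTerms(dtm, sparse), which drops terms by the proportion of documents they are missing from rather than by an absolute count.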