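I come mainly from a Matlab background and have recently started learning the open-source R. I currently work in data mining and machine learning. I see that many machine learning algorithms are implemented in R, and I am still exploring the different packages available in R.
I have a quick question: for data mining applications, how would you compare R with Matlab in terms of popularity, pros and cons, industry and academic acceptance, and so on? Which one would you choose, and why?
I have gone through various comparisons of Matlab and R against various metrics, but I am specifically interested in their applicability to data mining and ML.
I would appreciate any suggestions.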
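For the past three years or so, I have used R daily, and the largest share of that use has gone to machine learning/data mining problems.
I was an exclusive Matlab user while in university; at the time I thought it was
the Neural Network Toolbox, the Optimization Toolbox, the Statistics Toolbox,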
My Top 5 list for Learning ML/Data Mining in R:
This refers to a couple of things. First, a group of R packages whose names all begin with arules (available from CRAN); you can find the complete list (arules, arulesViz, etc.) on the project homepage. Second, all of these packages are based on a data-mining technique known as Market-Basket Analysis, or alternatively as Association Rules. In many respects, this family of algorithms is the essence of data mining: exhaustively traverse large transaction databases and find above-average associations or correlations among the fields (variables or features) in those databases. In practice, you connect them to a data source and let them run overnight. The central R package in the set mentioned above is called arules; on the CRAN package page for arules, you will find links to a couple of excellent secondary sources (vignettes, in R's lexicon) on the arules package and on the Association Rules technique in general.
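To make the idea of "above-average associations" concrete, here is a minimal, package-independent sketch in plain Python. It is not the arules implementation and does not use its API; the toy transactions, the function names `support` and `pair_rules`, and the thresholds are all made up for illustration. It brute-forces single-item rules and scores them with the usual support, confidence, and lift measures.

```python
from itertools import combinations

# Toy transaction database (made-up data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Brute-force all rules {lhs} -> {rhs} between single items.

    Returns tuples (lhs, rhs, support, confidence, lift), sorted by
    descending lift. Real implementations prune the search space
    (e.g. Apriori); this sketch simply checks every pair.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: how often rhs appears when lhs is present
            confidence = pair_support / support({lhs}, transactions)
            # lift > 1 means lhs and rhs co-occur more often than
            # they would if they were independent
            lift = confidence / support({rhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


for lhs, rhs, sup, conf, lift in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A rule with lift above 1 is an "above-average" association in the sense described above; on a real transaction database the exhaustive search is the expensive part, which is why such jobs are often left to run overnight.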
The most current edition of this book is available in digital form for free. Likewise, at the book's website (linked to just above) are all data sets used in ESL, available for free download. (As an aside, I have the free digital version; I also purchased the hardback version from BN.com; all of the color plots in the digital version are reproduced in the hardbound version.) ESL contains thorough introductions to at least one exemplar from most of the major ML rubrics, e.g., neural networks, SVM, KNN; unsupervised techniques (LDA, PCA, MDS, SOM, clustering); numerous flavors of regression; CART; Bayesian techniques; as well as model aggregation techniques (Boosting, Bagging) and model tuning (regularization). Finally, get the R package that accompanies the book from CRAN (which will save you the trouble of having to download and enter the datasets).
The 3,500+ packages available for R are divided by domain into about 30 package families, or 'Task Views'. Machine Learning is one of these families. The Machine Learning Task View contains about 50 packages. Some of these packages are part of the core distribution, including e1071 (a sprawling ML package that includes working code for quite a few of the usual ML categories).
With particular focus on the posts tagged with Predictive Analytics
A thorough study of the code would, by itself, be an excellent introduction to ML in R.
One final resource that I think is excellent, but that didn't make the top five:
posted on the blog A Beautiful WWW