SVDFeature : A Toolkit for Informative Collaborative Filtering and Ranking
Project Description
- Recommender systems have been used more and more frequently in many applications in recent years. As the available information grows, not only in quantity but also in variety, how to leverage this rich information to build a better recommender system becomes a natural problem. Most traditional approaches design a specific model for each scenario, which demands great effort in developing and modifying models.
This project provides a toolkit for solving recommendation problems using feature-based matrix factorization. This model is an abstraction of many variants of matrix factorization models. New types of information can be used by simply defining new features, without modifying a single line of code. See the technical report in the Download section for more details.
News
2011-12-02 (new): Added a lite version of SVDFeature, intended as example code for an SGD implementation.
2011-11-28 (new): Released our experiment on KDD Cup 2011 track1 using SVDFeature; see the release page.
2011-11-21 (new): Added support for a common parameter space for user/item features. This allows interesting uses such as joint matrix factorization and symmetric relation prediction.
2011-11-02 (new): Released our experiment on KDD Cup 2011 track2 using SVDFeature; see the release page.
- 2011-10-12: Fixed a bug in rank-model. More robust input buffering for the user-grouped format, and lower memory cost for loading input data during SVD++ and rank training.
- 2011-10-07: Added detailed regularization settings; fixed another bug that only occurs on Windows.
- 2011-09-21: Added source code documentation of the interface, fixed a bug on Windows, and added more regularization options.
- 2011-09-17: Released version 1.1, with efficient computation for models with implicit/explicit information (SVD++) and better support for rank-model. A user manual is now available.
2011-07-01: We took 3rd place in track1 of KDD Cup 2011. Inner Peace is the SJTU-HKUST joint team. We reported the best single method (RMSE=22.12 without blending) in track1.
Features
- Large-scale data handling: The toolkit buffers the training data on disk, so memory cost does not grow with training data size.
- Feature input: We support feature input similar to the SVM format.
- Strong descriptive power: Many variants of matrix factorization can be described as feature-based matrix factorization. One can try new approaches by generating the corresponding features, with no modification of the code required.
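As a rough illustration of how features drive the model, the sketch below computes one prediction in the style of feature-based matrix factorization as we understand it from the technical report: a global mean, plus feature-weighted biases, plus the inner product of the feature-weighted user and item factors. It is not the toolkit's code. All names (`predict`, `weighted_sum`, `mu`, `bias_g`, `bias_u`, `bias_i`, `P`, `Q`) are hypothetical, and the parameter values are made up for the example.

```python
def weighted_sum(vectors, feats, dim):
    """Return sum_j value_j * vectors[id_j] for sparse (id, value) features."""
    out = [0.0] * dim
    for fid, val in feats:
        for d in range(dim):
            out[d] += val * vectors[fid][d]
    return out


def predict(mu, bias_g, bias_u, bias_i, P, Q, g_feats, u_feats, i_feats):
    """Sketch of a feature-based matrix factorization prediction.

    mu      : global mean (hypothetical name)
    bias_*  : per-feature bias weights for global/user/item features
    P, Q    : per-feature latent factor vectors for user/item features
    *_feats : sparse features as lists of (feature id, value) pairs
    """
    dim = len(P[0])
    bias = sum(v * bias_g[j] for j, v in g_feats)
    bias += sum(v * bias_u[j] for j, v in u_feats)
    bias += sum(v * bias_i[j] for j, v in i_feats)
    p = weighted_sum(P, u_feats, dim)  # combined user factor
    q = weighted_sum(Q, i_feats, dim)  # combined item factor
    return mu + bias + sum(a * b for a, b in zip(p, q))


# Basic matrix factorization is the special case with no global features,
# a single user feature (the user id) and a single item feature (the item id).
P = [[1.0, 2.0]]                # factors for user 0
Q = [[0.0, 0.0], [3.0, 4.0]]    # factors for items 0 and 1
score = predict(3.0, [], [0.5], [0.0, 0.25], P, Q,
                g_feats=[], u_feats=[(0, 1.0)], i_feats=[(1, 1.0)])
print(score)
```

Under this view, a different model variant corresponds to passing different feature lists rather than changing `predict` itself, for example adding neighborhood or temporal indicators as global features.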
Input Format
- The input is in a sparse feature format similar to the SVM format. For each training sample, we need to specify three kinds of features as well as the prediction target. The format is as follows:
line:= r k n m <global features> <user features> <item features>
r := prediction target
k := number of nonzero global features
n := number of nonzero user features
m := number of nonzero item features
<global features> := gid[1]:gvalue[1] ... gid[k]:gvalue[k]
<user features> := uid[1]:uvalue[1] ... uid[n]:uvalue[n]
<item features> := iid[1]:ivalue[1] ... iid[m]:ivalue[m]
For example, with the basic matrix factorization model, the sample for user 0 and item 10 with rating 5 is written as follows (no global features, one user feature, one item feature):
5 0 1 1 0:1 10:1
See the demo folder and the manual in the download archive for more examples.
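The format above can be produced and checked with a few lines of script. The following is a minimal sketch, assuming whitespace-separated tokens and integer feature ids as in the example line; the helper names `format_line` and `parse_line` are hypothetical and are not part of the toolkit.

```python
def format_line(target, g_feats, u_feats, i_feats):
    """Build one input line: r k n m <global> <user> <item>.

    Each *_feats argument is a list of (feature id, value) pairs.
    """
    head = f"{target:g} {len(g_feats)} {len(u_feats)} {len(i_feats)}"
    pairs = [f"{fid}:{val:g}"
             for feats in (g_feats, u_feats, i_feats)
             for fid, val in feats]
    return " ".join([head] + pairs)


def parse_line(line):
    """Split one input line back into (r, global, user, item) features."""
    tokens = line.split()
    target = float(tokens[0])
    k, n, m = (int(t) for t in tokens[1:4])
    pairs = []
    for tok in tokens[4:]:
        fid, val = tok.split(":")
        pairs.append((int(fid), float(val)))
    if len(pairs) != k + n + m:
        raise ValueError("feature count does not match the k, n, m header")
    return target, pairs[:k], pairs[k:k + n], pairs[k + n:]


# User 0 rates item 10 with 5 under basic matrix factorization.
line = format_line(5, [], [(0, 1)], [(10, 1)])
print(line)  # 5 0 1 1 0:1 10:1
print(parse_line(line))
```

Checking that the number of `id:value` pairs equals k + n + m is a cheap way to catch malformed lines before training.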
Usage Examples
- We provide demo scripts in the SVDFeature demo folder.
Non-trivial example on KDD Cup 2011 track2 using SVDFeature.
Non-trivial example on KDD Cup 2011 track1 using SVDFeature.
Download
Toolkit (in tar): svdfeature-1.1.6.tar.gz
Toolkit (in zip): svdfeature-1.1.6.zip
Technical report about toolkit: Feature-based Matrix Factorization. arXiv:1109.2271v2
- User manual of toolkit: in download archive.
Our paper for KDD Cup 2011 (PDF)
Change Log
- 2011-07-11: version 1.0, first release:)
- 2011-09-17: version 1.1, efficient computation for models with implicit/explicit information (SVD++), better support for rank-model
Citation
BibTeX format:
@TECHREPORT{APEX-TR-2011-07-11,
AUTHOR = "Chen, Tianqi and Zheng, Zhao and Lu, Qiuxia and Zhang, Weinan and Yu, Yong",
INSTITUTION = "Apex Data \& Knowledge Management Lab, Shanghai Jiao Tong University",
TITLE = "Feature-Based Matrix Factorization",
NUMBER = "APEX-TR-2011-07-11",
YEAR = "2011",
MONTH = "July"
}
Contact
- Please contact us if you have any advice/comments on SVDFeature:)
Tianqi Chen: tqchen [AT] apex.sjtu.edu.cn
Zhao Zheng: zhengzhao [AT] apex.sjtu.edu.cn
Qiuxia Lu: luqiuxia [AT] apex.sjtu.edu.cn
Weinan Zhang: wnzhang [AT] apex.sjtu.edu.cn
