
Content Recommendation with Spark MLlib

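In many real-world scenarios we only have access to implicit feedback (for example views, clicks, purchases, likes, and shares). The approach MLlib uses for this kind of data comes from the paper Collaborative Filtering for Implicit Feedback Datasets. In essence, the approach treats the data as a combination of a binary preference and a preference strength, rather than modeling the rating matrix directly. The values are therefore tied to the strength of the observed user preference, not to an explicit score the user gave an item. The model then tries to find latent factors that predict a user's preference for an item.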
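As a rough illustration of the "binary preference plus preference strength" idea, the sketch below follows the formulation in the cited paper: a preference of 1 when any interaction was observed and 0 otherwise, and a confidence of 1 + alpha * r that grows with the observed count r. The class name `ImplicitFeedbackSketch`, its method names, and the sample click counts are assumptions made for this example and are not part of MLlib. The value 0.01 for alpha is simply the one passed to `trainImplicit` in the listing that follows.

```java
public class ImplicitFeedbackSketch {

    // Binary preference: 1 if any interaction was observed, otherwise 0.
    public static double preference(double observed) {
        return observed > 0 ? 1.0 : 0.0;
    }

    // Confidence in that preference grows linearly with the observed strength.
    // alpha is a tuning parameter.
    public static double confidence(double observed, double alpha) {
        return 1.0 + alpha * observed;
    }

    public static void main(String[] args) {
        double alpha = 0.01;          // same alpha as the trainImplicit call in the article
        double[] clicks = {0, 1, 5};  // hypothetical click counts for one user
        for (double r : clicks) {
            System.out.println("observed=" + r
                + " preference=" + preference(r)
                + " confidence=" + confidence(r, alpha));
        }
    }
}
```

An item the user never touched still enters the model, with preference 0 and the minimum confidence of 1. That is what separates this setup from explicit-rating factorization, which only looks at observed ratings.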

package org.apache.spark.examples.mllib;

// $example on$
import scala.Tuple2;

import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.recommendation.ALS;
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel;
import org.apache.spark.mllib.recommendation.Rating;
import org.apache.spark.SparkConf;
// $example off$

public class JavaRecommendationExample {
  public static void main(String args[]) {
    // $example on$
    SparkConf conf = new SparkConf().setAppName("Java Collaborative Filtering Example");
    JavaSparkContext jsc = new JavaSparkContext(conf);

    // Load and parse the data; each line has the form "user,product,rating"
    String path = "../data/mllib/als/test.data";
    JavaRDD<String> data = jsc.textFile(path);
    JavaRDD<Rating> ratings = data.map(
      new Function<String, Rating>() {
        public Rating call(String s) {
          String[] sarray = s.split(",");
          return new Rating(Integer.parseInt(sarray[0]), Integer.parseInt(sarray[1]),
            Double.parseDouble(sarray[2]));
        }
      }
    );

    // Build the recommendation model using ALS
    int rank = 10;
    int numIterations = 10;
    // Train on the explicit rating values (the last argument is lambda)
    MatrixFactorizationModel model = ALS.train(JavaRDD.toRDD(ratings), rank, numIterations, 0.01);
    // Alternative: treat the data as implicit feedback instead of explicit ratings
    // (the last two arguments are lambda and alpha)
    //MatrixFactorizationModel model = ALS.trainImplicit(JavaRDD.toRDD(ratings), rank, numIterations, 0.01, 0.01);

    // Evaluate the model on rating data
    JavaRDD<Tuple2<Object, Object>> userProducts = ratings.map(
      new Function<Rating, Tuple2<Object, Object>>() {
        public Tuple2<Object, Object> call(Rating r) {
          return new Tuple2<Object, Object>(r.user(), r.product());
        }
      }
    );
    // Predicted ratings keyed by (user, product)
    JavaPairRDD<Tuple2<Integer, Integer>, Double> predictions = JavaPairRDD.fromJavaRDD(
      model.predict(JavaRDD.toRDD(userProducts)).toJavaRDD().map(
        new Function<Rating, Tuple2<Tuple2<Integer, Integer>, Double>>() {
          public Tuple2<Tuple2<Integer, Integer>, Double> call(Rating r){
            return new Tuple2<Tuple2<Integer, Integer>, Double>(
              new Tuple2<Integer, Integer>(r.user(), r.product()), r.rating());
          }
        }
      ));
    // Join actual and predicted ratings on (user, product)
    JavaRDD<Tuple2<Double, Double>> ratesAndPreds =
      JavaPairRDD.fromJavaRDD(ratings.map(
        new Function<Rating, Tuple2<Tuple2<Integer, Integer>, Double>>() {
          public Tuple2<Tuple2<Integer, Integer>, Double> call(Rating r){
            return new Tuple2<Tuple2<Integer, Integer>, Double>(
              new Tuple2<Integer, Integer>(r.user(), r.product()), r.rating());
          }
        }
      )).join(predictions).values();
    double MSE = JavaDoubleRDD.fromRDD(ratesAndPreds.map(
      new Function<Tuple2<Double, Double>, Object>() {
        public Object call(Tuple2<Double, Double> pair) {
          Double err = pair._1() - pair._2();
          return err * err;
        }
      }
    ).rdd()).mean();
    System.out.println("Mean Squared Error = " + MSE);

    // Save and load model
    model.save(jsc.sc(), "target/tmp/myCollaborativeFilter");
    MatrixFactorizationModel sameModel = MatrixFactorizationModel.load(jsc.sc(),
      "target/tmp/myCollaborativeFilter");

    // Use the model to recommend the top 3 products for user 1
    Rating[] recommendations = sameModel.recommendProducts(1, 3);
    for (int i = 0; i < recommendations.length; i++) {
      System.out.println("Recommended product: " + recommendations[i].product());
    }
    // $example off$
  }
}
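The evaluation step in the listing above joins actual and predicted ratings and averages the squared differences. The same arithmetic can be sketched without Spark, which may make it easier to check by hand. The class name `MseSketch`, the method `meanSquaredError`, and the sample values are assumptions made for this illustration; they do not come from the article or from MLlib.

```java
public class MseSketch {

    // Mean squared error over paired actual and predicted ratings.
    public static double meanSquaredError(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double err = actual[i] - predicted[i];
            sum += err * err;
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        double[] actual = {5.0, 1.0, 3.0};     // hypothetical observed ratings
        double[] predicted = {4.0, 1.0, 5.0};  // hypothetical model predictions
        System.out.println("Mean Squared Error = " + meanSquaredError(actual, predicted));
    }
}
```

Note that the article's listing evaluates on the same ratings it trained on, so the reported error is a training error. Holding out part of the data would give a more honest estimate.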

Tags: java, big data, data mining


Posted by IT瘾 on April 11, 2016 at 05:52:00 AM
