Home · JohnLangford/vowpal_wabbit Wiki · GitHub

Tags: home johnlangford vowpal | Published: 2017-07-31 21:09 | Author:

Vowpal Wabbit

The Vowpal Wabbit (VW) project is a fast out-of-core learning system sponsored by Microsoft Research and (previously) Yahoo! Research. Support is available through the mailing list.

There are two ways to have a fast learning algorithm: (a) start with a slow algorithm and speed it up, or (b) build an intrinsically fast learning algorithm. This project is about approach (b), and it's reached a state where it may be useful to others as a platform for research and experimentation.

Several optimization algorithms are available, with the baseline being sparse gradient descent (GD) on a loss function (several loss functions are available). The code should be easily usable. Its only external dependency is the boost library, which is often installed by default.
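As a rough illustration of the baseline described above, here is a minimal sketch of one sparse gradient descent step on squared loss. It is not VW's actual implementation: the function name `sgd_update`, the dictionary-based weight store, and the fixed learning rate are all assumptions made for this example.

```python
from typing import Dict


def sgd_update(
    weights: Dict[str, float],
    features: Dict[str, float],
    label: float,
    learning_rate: float = 0.1,
) -> float:
    """Run one sparse gradient descent step on squared loss.

    Only the weights of features present in this example are read or
    written, so the cost is proportional to the number of non-zero features.
    Returns the prediction made before the update.
    """
    # Linear prediction over the non-zero features only.
    prediction = sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    # Derivative of 0.5 * (prediction - label)**2 with respect to prediction.
    gradient_scale = prediction - label
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) - learning_rate * gradient_scale * value
    return prediction
```

Calling `sgd_update` once per example as the data streams by gives an online learner: the prediction returned before each update can be compared with the label to track progressive error without a separate holdout set.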

To build vw from source in various environments, please follow the instructions in the README.md file.


There are several features that (in combination) can be powerful.
  1. Input Format. The input format for the learning algorithm is substantially more flexible than might be expected. Examples can have features consisting of free-form text, which is interpreted in a bag-of-words way. There can even be multiple sets of free-form text in different namespaces.
  2. Speed. The learning algorithm is pretty fast, similar to the few other online algorithm implementations out there. As one data point, it can be effectively applied to learning problems with a sparse terafeature (i.e. 10^12 sparse features). As another example, it's about a factor of 3 faster than Leon Bottou's svmsgd on the RCV1 example in wall clock execution time.
  3. Scalability. This is not the same as fast. Instead, the important characteristic here is that the memory footprint of the program is bounded independent of data. This means the training set is not loaded into main memory before learning starts. In addition, the size of the set of features is bounded independent of the amount of training data using the hashing trick.
  4. Feature Pairing. Subsets of features can be internally paired so that the algorithm is linear in the cross-product of the subsets. This is useful for ranking problems. David Grangier seems to have a similar trick in the PAMIR code. The alternative of explicitly expanding the features before feeding them into the learning algorithm can be both computation and space intensive, depending on how it's handled.
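The hashing trick and feature pairing from the list above can be sketched together. This is a minimal illustration under stated assumptions, not VW's code: `zlib.crc32` stands in for whatever hash function VW uses internally, the 18-bit table size is an arbitrary choice, and the names `hash_feature` and `hashed_example` are hypothetical.

```python
import zlib
from typing import Dict, Iterable, List, Tuple

NUM_BITS = 18  # assumed table size: 2**18 weight slots, whatever the data size


def hash_feature(namespace: str, token: str, num_bits: int = NUM_BITS) -> int:
    """Map a (namespace, token) pair to a bounded index via hashing."""
    key = f"{namespace}^{token}".encode("utf-8")
    # crc32 is a stand-in hash chosen for determinism; masking bounds the index.
    return zlib.crc32(key) & ((1 << num_bits) - 1)


def hashed_example(
    namespaces: Dict[str, List[str]],
    pairs: Iterable[Tuple[str, str]] = (),
    num_bits: int = NUM_BITS,
) -> Dict[int, float]:
    """Build a sparse bag-of-words vector with optional namespace pairing."""
    vector: Dict[int, float] = {}
    # Plain bag-of-words features, one count per token occurrence.
    for namespace, tokens in namespaces.items():
        for token in tokens:
            index = hash_feature(namespace, token, num_bits)
            vector[index] = vector.get(index, 0.0) + 1.0
    # Paired features: the cross-product is generated on the fly and hashed,
    # so it is never materialized in the input data.
    for left, right in pairs:
        for a in namespaces.get(left, []):
            for b in namespaces.get(right, []):
                index = hash_feature(f"{left}*{right}", f"{a}*{b}", num_bits)
                vector[index] = vector.get(index, 0.0) + 1.0
    return vector
```

For instance, `hashed_example({"query": ["cheap", "flights"], "doc": ["airline", "deals", "today"]}, pairs=[("query", "doc")])` yields five word features plus six query-document cross features, all with indices below `2**18`. Distinct features may collide in the table; that is the price paid for a memory bound that does not grow with the training data.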

Many people have contributed to the project at this point. John Langford, Alekh Agarwal, Miroslav Dudik, Daniel Hsu, Nikos Karampatziakis, Olivier Chapelle, Paul Mineiro, Matt Hoffman, Jake Hofman, Sudarshan Lamkhede, Shubham Chopra, Ariel Faigon, Lihong Li, Gordon Rios, and Alex Strehl have all worked on VW. Many others have contributed via feature requests, bug reports, or bug patches.


VW is also a vehicle for advanced research. The first public version containing hashing, caching, and true online learning was released in 2007. Since then, many different algorithms and results have influenced its design, including:
  1. Kai-Wei Chang, He He, Hal Daumé III, John Langford, Stephane Ross, A Credit Assignment Compiler for Joint Prediction, NIPS 2016.
  2. Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford, Learning to Search Better Than Your Teacher, ICML 2015.
  3. Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, John Langford, A Reliable Effective Terascale Linear Learning System, 2011.
  4. M. Hoffman, D. Blei, F. Bach, Online Learning for Latent Dirichlet Allocation, in Neural Information Processing Systems (NIPS) 2010.
  5. Alina Beygelzimer, Daniel Hsu, John Langford, and Tong Zhang, Agnostic Active Learning Without Constraints, NIPS 2010.
  6. John Duchi, Elad Hazan, and Yoram Singer, Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, JMLR 2011 & COLT 2010.
  7. H. Brendan McMahan, Matthew Streeter, Adaptive Bound Optimization for Online Convex Optimization, COLT 2010.
  8. Nikos Karampatziakis and John Langford, Importance Weight Aware Gradient Updates, UAI 2010.
  9. Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, Josh Attenberg, Feature Hashing for Large Scale Multitask Learning, ICML 2009.
  10. Qinfeng Shi, James Petterson, Gideon Dror, John Langford, Alex Smola, and SVN Vishwanathan, Hash Kernels for Structured Data, AISTATS 2009.
  11. John Langford, Lihong Li, and Tong Zhang, Sparse Online Learning via Truncated Gradient, NIPS 2008.
  12. Leon Bottou, Stochastic Gradient Descent, 2007.
  13. Avrim Blum, Adam Kalai, and John Langford, Beating the Holdout: Bounds for KFold and Progressive Cross-Validation, COLT 1999, pages 203-208.
  14. Nocedal, J. (1980). "Updating Quasi-Newton Matrices with Limited Storage". Mathematics of Computation 35: 773–782.
