中文分词算法代码大全
- - 鲁塔弗的博客做中文搜索,关键词提取,文档分类都离不开中文分词,能用的代码包有如下. 单字切分 sphinx只要把min_word_len设置为1,并配置charset_table,默认就是单字切分,lucene用StandardAnalyzer. CJKAnalyzer lucene自带,两两分词,就是把 ABCD 分成 AB,BC,CD 3段.
Keith Schwarz是一个斯坦福大学计算机科学系的讲师。他对编程充满了热情。他的主页上他自己正在实现各种各样的有意思的算法和数据结构,http://www.keithschwarz.com/interesting/, 目前这个网页上有88个(见下面的列表),但这位大哥要干135个,你可以看看他的To-Do List。
从这个列表上,我们可以看到,他从去年7月份就在自己实现这些东西了,我把他实现的这些算法转过来,
Name | Link | Date Added | Language | Description |
---|---|---|---|---|
Binomial Heap | (link) | 7‑24‑2010 | C++ | An implementation of a binomial heap data structure for use as a priority queue. |
Bounded Priority Queue | (link) | 7‑24‑2010 | C++ | An implementation of a priority queue with a fixed upper limit to its size.. |
Matrix | (link) | 7‑24‑2010 | C++ | A collection of classes for manipulating matrices. |
VList | (link) | 8‑16‑2010 | Java | An implementation of the List abstraction backed by a VList. |
Function Wrapper | (link) | 8‑16‑2010 | C++ | A C++ wrapper class around unary functions. |
String | (link) | 8‑17‑2010 | C++ | An implementation of a string abstraction that uses the small string optimization. |
nstream | (link) | 8‑31‑2010 | C++ | An stream class that sends and receives data over a network. |
Snake | (link) | 8‑31‑2010 | C++ | An implementation of the game Snake with a rudimentary AI. |
Mergesort | (link) | 9‑14‑2010 | C++ | An implementation of the mergesort algorithm. |
Next Permutation | (link) | 10‑6‑2010 | C++ | An implementation of the next_permutation STL algorithm. |
Interval Heap | (link) | 10‑17‑2010 | Java | An implementation of a double-ended priority queue using an interval heap. |
Linear-Time Selection | (link) | 10‑18‑2010 | C++ | A deterministic, linear-time selection algorithm using the median-of-medians algorithm. |
Heapsort | (link) | 10‑18‑2010 | C++ | An implementation of the heapsort algorithm. |
Union-Find | (link) | 10‑19‑2010 | Java | An implementation of a disjoint-set data structure using a disjoint set forest. |
Radix Sort | (link) | 10‑19‑2010 | C++ | An implementation of the radix sort algorithm. |
Rational | (link) | 10‑23‑2010 | C++ | A data structure representing a rational number. |
DPLL | (link) | 10‑23‑2010 | Haskell | An implementation of the DPLL algorithm for solving CNF-SAT. |
Smoothsort | (link) | 10‑27‑2010 | C++ | An implementation of the smoothsort algorithm, an adaptive heapsort variant. |
Extendible Array | (link) | 10‑28‑2010 | Java | A dynamic array class with O(1) worst-case runtime lookup and append. |
In-Place Merge | (link) | 10‑29‑2010 | C++ | An implementation of a merge algorithm that runs in-place. |
Random Shuffle | (link) | 10‑29‑2010 | C++ | An algorithm for generating a random permutation of a set of elements. |
Random Sample | (link) | 10‑29‑2010 | C++ | An O(n) time, O(1) space algorithm for randomly choosing k elements out of a stream with uniform probability. |
Natural Mergesort | (link) | 10‑30‑2010 | C++ | An implementation of natural mergesort, an adaptive variant of mergesort. |
Interpolation Search | (link) | 10‑31‑2010 | C++ | An implementation of the interpolation search algorithm. |
Introsort | (link) | 10‑31‑2010 | C++ | An implementation of the introsort algorithm, a fast hybrid of quicksort, heapsort, andinsertion sort. |
Hashed Array Tree | (link) | 11‑3‑2010 | Java | An implementation of a dynamic array backed by a hashed array tree. |
Recurrence Solver | (link) | 11‑13‑2010 | C++ | A fast algorithm for generating terms of a sequence defined by a linear recurrence relation. |
Fibonacci Heap | (link) | 11‑15‑2010 | Java | An implementation of a priority queue backed by a Fibonacci heap. |
Dijkstra’s Algorithm | (link) | 11‑16‑2010 | Java | An implementation of Dijkstra’s algorithm for single-source shortest paths. |
Prim’s Algorithm | (link) | 11‑17‑2010 | Java | An implementation of Prim’s algorithm for computing minimum spanning trees. |
Kruskal’s Algorithm | (link) | 11‑17‑2010 | Java | An implementation of Kruskal’s algorithm for computing minimum spanning trees. |
Majority Element | (link) | 11‑17‑2010 | C++ | A fast, linear-time algorithm for finding the majority element of a data set. |
Haar Transform | (link) | 11‑17‑2010 | C++ | A set of functions to decompose a sequence of values into a sum of Haar wavelets. |
Argmax | (link) | 11‑19‑2010 | C++ | A pair of functions to compute the arg min or max of a function on some range. |
Derivative | (link) | 11‑19‑2010 | C++ | A function object that approximates the derivative of a function. |
Levenshtein Distance | (link) | 11‑19‑2010 | C++ | An algorithm for computing the Levenshtein distance between two sequences. |
Skiplist | (link) | 11‑20‑2010 | C++ | An implementation of a skip list, a randomized data structure for maintaining a sorted collection. |
van Emde Boas Tree | (link) | 11‑26‑2010 | C++ | An implementation of a sorted associative array backed by a van Emde Boas tree. |
Cuckoo HashMap | (link) | 11‑27‑2010 | Java | An implementation of a hash table using cuckoo hashing. |
Needleman-Wunsch Algorithm | (link) | 11‑28‑2010 | C++ | An implementation of the Needleman-Wunsch algorithm for optimal string alignment. |
Treap | (link) | 11‑28‑2010 | C++ | An implementation of a sorted associative array backed by a treap. |
Floyd-Warshall Algorithm | (link) | 12‑10‑2010 | Java | An implementation of the Floyd-Warshall algorithm for all-pairs shortest paths in a graph. |
Power Iteration | (link) | 12‑10‑2010 | C++ | An implementation of the power iteration algorithm for finding dominant eigenvectors. |
Edmonds’s Matching Algorithm | (link) | 12‑15‑2010 | Java | An implementation of Edmonds’s matching algorithm for finding maximum matchings in undirected graphs. |
Kosaraju’s Algorithm | (link) | 12‑15‑2010 | Java | An implementation of Kosaraju’s algorithm algorithm for finding strongly connected components of a directed graph. |
2-SAT | (link) | 12‑15‑2010 | Java | A linear-time algorithm for solving 2-SAT. |
Bellman-Ford Algorithm | (link) | 12‑17‑2010 | Java | An implementation of the Bellman-Ford algorithm for single-source shortest paths. |
Topological Sort | (link) | 12‑17‑2010 | Java | An algorithm for computing a topological sort of a directed acyclic graph. |
Graham Scan | (link) | 12‑19‑2010 | C++ | An implementation of the Graham scan for finding convex hulls in 2D space. |
Bipartite Testing | (link) | 12‑19‑2010 | Java | A linear-time algorithm for checking whether a directed graph is bipartite. |
Johnson’s Algorithm | (link) | 12‑19‑2010 | Java | An implementation of Johnson’s algorithm for all-pairs shortest paths. |
Strassen Algorithm | (link) | 12‑20‑2010 | C++ | An implementation of the Strassen algorithm for fast matrix multiplication. |
Cartesian Tree Sort | (link) | 12‑21‑2010 | C++ | An implementation of Cartesian tree sort, an adaptive, out-of-place heapsort variant. |
Ford-Fulkerson Algorithm | (link) | 12‑21‑2010 | Java | An implementation of the Ford-Fulkerson maximum-flow algorithm. |
Scaling Ford-Fulkerson | (link) | 12‑22‑2010 | Java | An modification of the Ford-Fulkerson maximum-flow algorithm that uses scaling to achieve polynomial time.. |
Splay Tree | (link) | 12‑27‑2010 | C++ | An implementation of a sorted associative array backed by a splay tree. |
Ternary Search Tree | (link) | 12‑28‑2010 | C++ | An implementation of a sorted set of strings backed by a ternary search tree. |
Ring Buffer | (link) | 12‑30‑2010 | Java | An implementation of a FIFO queue using a ring buffer. |
AVL Tree | (link) | 12‑30‑2010 | C++ | A sorted associative container backed by an AVL tree. |
Rabin-Karp Algorithm | (link) | 1‑1‑2011 | C++ | An implementation of the Rabin-Karp algorithm for string matching. |
RPN Evaluator | (link) | 1‑18‑2011 | C++ / strain | A library to tokenize and evaluate simple arithmetic expressions in reverse Polish notation. |
Shunting-Yard Algorithm | (link) | 1‑18‑2011 | C++ / strain | An implementation of Dijkstra’s shunting-yard algorithm for converting infix expressions to reverse-Polish notation. |
Skew Binomial Heap | (link) | 1‑20‑2011 | C++ | An implementation of a priority queue backed by a skew binomial heap. |
2/3 Heap | (link) | 3‑1‑2011 | C++ | An implementation of a priority queue whose branching factor alternates at different levels to maximize performance. |
Zeckendorf Logarithm | (link) | 3‑10‑2011 | C++ | An algorithm based on Zeckendorf representations that efficiently computes logarithms. |
Factoradic Permutations | (link) | 3‑17‑2011 | C++ | A set of algorithms for generating permutations using the factoradic number system. |
Binary Cyclic Subsets | (link) | 3‑20‑2011 | C++ | A set of algorithms for generating subsets in lexicographical order using binary numbers and cyclic shifts. |
Fibonacci Iterator | (link) | 3‑22‑2011 | C++ | An STL-style iterator for iterating over the Fibonacci numbers. |
Fibonacci Search | (link) | 3‑22‑2011 | C++ | An implementation of the Fibonacci search algorithm. |
Euclid’s Algorithm | (link) | 4‑18‑2011 | Haskell | An implementation of Euclid’s algorithm and applications to continued fractions and the extended Euclidean algorithm. |
Find Duplicate | (link) | 4‑18‑2011 | Python | An algorithm to find a repeated element in an array using Floyd’s cycle-finding algorithm. |
Permutation Generator | (link) | 4‑19‑2011 | Python | A generator for producing all permutations of a list of elements. |
Matrix Find | (link) | 4‑19‑2011 | Python | A solution to the classic interview question of searching a sorted matrix for a particular value. |
Binary GCD | (link) | 4‑23‑2011 | Scheme | An implementation of the binary GCD algorithm for computing greatest common divisors of nonnegative integers. |
Knuth-Morris-Pratt Algorithm | (link) | 5‑3‑2011 | Python | An implementation of the Knuth-Morris-Pratt algorithm for fast string matching. |
Kadane’s Algorithm | (link) | 5‑7‑2011 | C++ | An implementation of Kadane’s algorithm for solving the maximum-weight subarray problem. |
Karatsuba’s Algorithm | (link) | 8‑15‑2011 | Python | An implementation of Karatsuba’s algorithm for fast integer multiplication. |
Min-Stack | (link) | 8‑15‑2011 | C++ | An implementation of a LIFO stack that supports O(1) push, pop, and find-minimum. |
Random Bag | (link) | 8‑15‑2011 | Python | A data structure that supports insertion and removal of a uniformly-random element. |
Min-Queue | (link) | 8‑15‑2011 | C++ | An implementation of a FIFO queue that supports O(1) push, pop, and find-minimum. |
Lights-Out Solver | (link) | 8‑29‑2011 | C++ | A solver for the game Lights Out using Gaussian elimination over GF(2). |
Maximum Single-Sell Profit | (link) | 11‑9‑2011 | Python | Four algorithms for the maximum single-sell profit problem, each showing off a different algorithmic technique. |
Generalized Kadane’s Algorithm | (link) | 11‑10‑2011 | C++ | A generalization of Kadane’s algorithm for solving the maximum subarray problem subject to a length restriction. |
Longest Range | (link) | 11‑19‑2011 | Java | An algorithm for solving the longest contiguous range problem. |
Egyptian Fractions | (link) | 11‑20‑2011 | Python | An implementation of the greedy algorithm for finding Egyptian fractions. |
LL(1) Parser Generator | (link) | 11‑21‑2011 | Java | An LL(1) parser generator. |
LR(0) Parser Generator | (link) | 11‑23‑2011 | Java | An LR(0) parser generator. |
Word Ladders | (link) | 11‑27‑2011 | JavaScript | A program for finding word ladders between two words. |
(全文完)
您可能也喜欢: | ||||
C++和JAVA传统中积极的一面 |
一个排序算法比较的网站 |
一些重要的算法 |
一些有意思的文章和资源 |
可视化的数据结构和算法 |
无觅 |