Achieving 100k+ TPS for Single-Table Inserts in MySQL
Posting this for bragging rights:
Hardware and OS information for the MySQL server machine (no FlashCache in use):
# Aspersa System Summary Report ##############################
       Uptime | 84 days, 19:00, 3 users, load average: 0.00, 0.21, 0.16
     Platform | Linux
      Release | Red Hat Enterprise Linux Server release 5.4 (Tikanga)
       Kernel | 2.6.18-164.el5
 Architecture | CPU = 64-bit, OS = 64-bit
    Threading | NPTL 2.5
     Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-44).
      SELinux | Disabled
  Virtualized | No virtualization detected
# Processor ##################################################
   Processors | physical = 2, cores = 8, virtual = 16, hyperthreading = yes
       Speeds | 16×2261.058
       Models | 16xIntel(R) Xeon(R) CPU E5520 @ 2.27GHz
       Caches | 16×8192 KB
# Memory #####################################################
        Total | 23.53G
         Free | 2.16G
         Used | physical = 21.38G, swap = 240.00k, virtual = 21.38G
      Buffers | 1.03G
       Caches | 13.60G
        Dirty | 156 kB
      UsedRSS | 6.1G
   Swappiness | vm.swappiness = 60
  DirtyPolicy | vm.dirty_ratio = 40, vm.dirty_background_ratio = 10
# Mounted Filesystems ########################################
  Filesystem   Size Used Type  Opts Mountpoint
  /dev/sda10   766G  11% ext3  rw   /uxx
  /dev/sda1    122M  16% ext3  rw   /boot
  /dev/sda2     15G  67% ext3  rw   /
  /dev/sda3     15G  76% ext3  rw   /usr
  /dev/sda5    8.6G   2% ext3  rw   /tmp
  tmpfs         12G   0% tmpfs rw   /dev/shm
# Disk Schedulers And Queue Size #############################
          sda | [cfq] 128
# Disk Partioning ############################################
Device       Type Start      End        Size
============ ==== ========== ========== ==================
# Kernel Inode State #########################################
 dentry-state | 297447 276749 45 0 0 0
      file-nr | 3570 0 2390094
     inode-nr | 220730 32
# LVM Volumes ################################################
WARNING: Running as a non-root user. Functionality may be unavailable.
# RAID Controller ############################################
   Controller | LSI Logic MegaRAID SAS
        Model | , interface, ports
        Cache | Memory, BBU
          BBU | % Charged, Temperature C, isSOHGood=
VirtualDev Size      RAID Level Disks SpnDpth Stripe Status  Cache
========== ========= ========== ===== ======= ====== ======= =========
PhysiclDev Type State   Errors Vendor  Model        Size
========== ==== ======= ====== ======= ============ ===========
# Network Config #############################################
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
   Controller | Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
  FIN Timeout | net.ipv4.tcp_fin_timeout = 60
   Port Range | net.ipv4.ip_local_port_range = 32768 61000
# Interface Statistics #######################################
interface  rx_bytes   rx_packets rx_errors tx_bytes   tx_packets tx_errors
=========  ========== ========== ========= ========== ========== =========
lo         600000000  500000     0         600000000  500000     0
eth0       0          0          0         0          0          0
eth1       0          0          0         0          0          0
eth2       0          0          0         0          0          0
eth3       0          0          0         0          0          0
eth4       1000000000 600000000  0         2000000000 450000000  0
eth5       0          0          0         0          0          0
eth6       1250000000 15000000   0         0          0          0
eth7       0          0          0         0          0          0
sit0       0          0          0         0          0          0
bond0      2500000000 600000000  0         2000000000 450000000  0
# The End ####################################################
MySQL is Percona Server 5.5.18.
The relevant my.cnf settings:
default-storage-engine = INNODB
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 100
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_size = 5G
innodb_log_buffer_size = 800M
innodb_log_file_size = 200M
innodb_log_files_in_group = 4
innodb_file_io_threads = 4
innodb_thread_concurrency = 32
innodb_max_dirty_pages_pct = 90
innodb_data_file_path = ibdata1:1G:autoextend
innodb_sync_spin_loops = 0
innodb_spin_wait_delay = 0
tdh_socket_thread_num = 8
tdh_socket_slow_read_thread_num = 64
tdh_socket_io_thread_num = 4
tdh_socket_write_thread_num = 16
tdh_socket_optimize_on = 1
tdh_socket_cache_table_num_for_thd = 40
The schema of the table being inserted into:
CREATE TABLE `test` (
  `id` INT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  `k` INT(20) DEFAULT NULL,
  `i` INT(20) NOT NULL,
  `c` CHAR(120) DEFAULT NULL,
  `kc` INT(20) DEFAULT '1',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=gbk;
The load-test script (Groovy):
package benchmark.insert

import benchmark.StressTest
import com.taobao.tdhs.client.TDHSClient
import com.taobao.tdhs.client.TDHSClientImpl
import com.taobao.tdhs.client.response.TDHSResponse
import com.taobao.tdhs.client.response.TDHSResponseEnum

import java.util.concurrent.atomic.AtomicLong

/**
 * @author <a href="mailto:[email protected]">文通</a>
 * @since 12-2-9 3:31 PM
 */

// Which ID segment this process owns, passed on the command line.
int index = Integer.valueOf(args[0])
// Size of each ID segment; its definition is not shown in the original post,
// so the value below is an assumption.
int step = 1000000
// Starting ID for this process: segment i covers [(i - 1) * step + 1, i * step].
AtomicLong id = new AtomicLong((index - 1) * step + 1)

TDHSClient client = new TDHSClientImpl(new InetSocketAddress("10.232.31.25", 9999), 2)

def s = new StressTest(count: step)
s.add({
    def _id = id.incrementAndGet()
    String table = "test"
    TDHSResponse response = client.createStatement(index).insert()
            .use("benchmark_insert").from(table)
            .value("id", _id.toString())
            .value("k", "10000")
            .value("i", _id.toString())
            .value("c", _id.toString() + "_abcdefghijklmnopqrstuvwxyz")
            .insert()
    if (response != null && response.getStatus() != TDHSResponseEnum.ClientStatus.OK) {
        System.out.println(response)
    }
}, 100)
s.run()
client.shutdown()
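Note how the script already bakes in the ID segmentation discussed below: each process is started with a different index argument, and because its starting ID is (index - 1) * step + 1, process i only ever writes IDs from its own block of step consecutive values and never collides with its neighbours.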
The keys to reaching this level of insert TPS with TDH_SOCKET:

- TDH_SOCKET provides group commit:
  - Grouping commits cuts the number of redo-log fsync calls, so IOPS stops being the bottleneck.
- TDH_SOCKET supports concurrent writes.
- The auto-increment ID generation strategy must allocate IDs in segments:
  - With a fully sequential ID strategy (i.e. InnoDB's default auto-increment), highly concurrent inserts run into a write lock on the index block.
  - Because IDs are strictly sequential, concurrently inserted rows all land on the same page, and every thread writing that page takes rw_lock_x_lock_func(&(block->lock), 0, file, line) on it. That write lock becomes the bottleneck and caps TPS, so the ID generation strategy has to change:
  - Allocate IDs in segments: for example, one writer thread grabs the range 1-10000 in one go, the next writer thread grabs 10001-20000, and so on.
  - Each writer thread then generates IDs by incrementing within its own range. Writes stay roughly sequential, yet the page write lock is no longer hammered by concurrent writers (see the sketch after this list).
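To make the segmented allocation concrete, here is a minimal Groovy sketch. It is my own illustration, not code from the post; SegmentedIdAllocator and leaseStart are made-up names. A shared atomic counter leases disjoint ID ranges, and each writer thread then increments privately inside its own range.

import java.util.concurrent.atomic.AtomicLong

// Sketch only: a shared counter hands out disjoint ID segments, mirroring the
// (index - 1) * step + 1 scheme used by the benchmark script above.
class SegmentedIdAllocator {
    private final AtomicLong nextStart = new AtomicLong(1L)
    private final long segmentSize

    SegmentedIdAllocator(long segmentSize) {
        this.segmentSize = segmentSize
    }

    // Lease the starting ID of the next free segment: 1, 10001, 20001, ...
    long leaseStart() {
        return nextStart.getAndAdd(segmentSize)
    }
}

def segmentSize = 10000L
def allocator = new SegmentedIdAllocator(segmentSize)

// Two writer threads, each owning its own ID range. Shared state is touched
// only once per segment lease, never per individual ID.
def writers = (1..2).collect {
    Thread.start {
        long start = allocator.leaseStart()
        for (long id = start; id < start + segmentSize; id++) {
            // insert the row with primary key `id` here, e.g. via the TDHS client
        }
    }
}
writers.each { it.join() }

The point of the design is that IDs within each segment are still sequential, so writes remain mostly append-like, but concurrent writers target different index pages instead of all contending for the write lock on the right-most one.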
Once you are above 100k TPS, you will notice that CPU usage is still not particularly high. The bottleneck at that point is the log_sys->mutex lock instead: every insert also has to write undo log records, and those changes pass through the redo log buffer that log_sys->mutex protects.
As for what TDH_SOCKET actually is... I'll keep you in suspense for now~
© 2012, 淘宝文通. All rights reserved.