Optimizing and testing MySQL over NFS with NetAPP

In: General|Linux|MySQL

1 Jun 2011

Currently I have been trying to migrate a 4 node production MySQL database to a blade G5/G7 with vmware ubuntu instances on top of netapp. Typically I would not use a virtual machine nor try to run MySQL on top of NFS, however this needs to fit within the clients infrastructure.

Baseline

Base line performance unoptimized vms-gu-mysql04

root@vms-gu-mysql04:~# time dd if=/dev/zero of=/mnt/storage/io.test bs=16k count
16384+0 records in
16384+0 records out
268435456 bytes (268 MB) copied, 79.1257 s, 3.4 MB/s

real    1m19.202s
user    0m0.010s
sys     1m8.810s

Base line performance on current production that it is being migrated from on a raid 10 sas disks.

db1gorkana:~# time dd if=/dev/zero of=/mnt/mysql.tmp/io.test bs=16
16384+0 records in
16384+0 records out
268435456 bytes (268 MB) copied, 1.22437 s, 219 MB/s

real    0m1.227s
user    0m0.008s
sys     0m1.200s

Whilst dd tests only single-threaded disk access with a sequential write followed by a large sequential read, it is a very primitive test and fairly inline with MySQL IO usage.

Optimized

Optimized NFS mount options and sysctl.

mount

  • For best performance, OLTP databases can benefit from multiple mountpoints and distribution of the I/O load across these mountpoints. The performance improvement is generally from 2% to 9%. To accomplish this, on the host system create multiple mountpoints to the same file system and/or spread database data/index/log files across multiple volumes on the NetApp? storage controller(s).
  • For best performance, logs should be on a separate mountpoint. Generally for I/O tuning, the idea is to minimize database writes and optimize any writes (such as to log files) to avoid contention.

Mount mysql.tmp as a ramdisk and allocate sufficient swap.

root@vms-gu-mysql04:~# cat /etc/fstab 
# /etc/fstab: static file system information.
#
# Use 'blkid -o value -s UUID' to print the universally unique identifier
# for a device; this may be used with UUID= as a more robust way to name
# devices that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    nodev,noexec,nosuid 0       0
/dev/mapper/vms--gu--mysql04-root /               ext4    errors=remount-ro 0       1
# /boot was on /dev/sda1 during installation
UUID=9251f0b6-b3ea-4370-b1ae-368de078242f /boot           ext2    defaults        0       2
/dev/mapper/vms--gu--mysql04-swap_1 none            swap    sw              0       0
/dev/fd0        /media/floppy0  auto    rw,user,noauto,exec,utf8 0       0

#tc2-nas01:/vol/vms_gu_mysql04_dbStorage /mnt/storage nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp
tc2-nas01:/vol/vms_gu_mysql04_dbStorage /mnt/storage nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp,noatime

tc2-nas01:/vol/vms_gu_mysql04_dbStorage/mysql.tmp /mnt/mysql.tmp nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp,noatime
tc2-nas01:/vol/vms_gu_mysql04_dbStorage/mysql.log /mnt/mysql.log nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp,noatime
tc2-nas01:/vol/vms_gu_mysql04_dbStorage/mysql /mnt/mysql nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp,noatime

# UAT-NAS02.durrants.uat:/vol/vms_gu_mysql01_dbStorage /mnt/storage_uat nfs rw,hard,nointr,rsize=65536,wsize=65536,bg,vers=3,proto=tcp

#/.swap swap swap defaults 0 0

sysctl

root@vms-gu-mysql04:~# cat /etc/sysctl.conf
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables.
# See sysctl.conf (5) for information.
#

#kernel.domainname = example.com

##############################################################3
# Functions previously found in netbase
#

# Uncomment the next two lines to enable Spoof protection (reverse-path filter)
# Turn on Source Address Verification in all interfaces to
# prevent some spoofing attacks
net.ipv4.conf.default.rp_filter=1
#net.ipv4.conf.all.rp_filter=1

# Uncomment the next line to enable TCP/IP SYN cookies
net.ipv4.tcp_syncookies=0

###################################################################
# Additional settings - these settings can improve the network
# security of the host and prevent against some network attacks
# including spoofing attacks and man in the middle attacks through
# redirection. Some network environments, however, require that these
# settings are disabled so review and enable them as needed.
#
# Ignore ICMP broadcasts
#net.ipv4.icmp_echo_ignore_broadcasts = 1
#
# Ignore bogus ICMP errors
#net.ipv4.icmp_ignore_bogus_error_responses = 1
# 
# Do not accept ICMP redirects (prevent MITM attacks)
#net.ipv4.conf.all.accept_redirects = 0
#net.ipv6.conf.all.accept_redirects = 0
# _or_
# Accept ICMP redirects only for gateways listed in our default
# gateway list (enabled by default)
# net.ipv4.conf.all.secure_redirects = 1
#
# Do not send ICMP redirects (we are not a router)
#net.ipv4.conf.all.send_redirects = 0
#
# Do not accept IP source route packets (we are not a router)
net.ipv4.conf.all.accept_source_route = 0
#net.ipv6.conf.all.accept_source_route = 0
#
# Log Martian Packets
#net.ipv4.conf.all.log_martians = 1

###################################################################
# Enable packet forwarding for IPv4 & IPv6
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1

###################################################################
# Stops low-level messages on console
kernel.printk = 4 4 1 7

###################################################################
# Semaphores & IPC for optimizations in innodb
kernel.shmmax=2147483648
kernel.shmall=2147483648
kernel.msgmni=1024
kernel.msgmax=65536
kernel.sem=250 32000 32 1024

###################################################################
# Swap
vm.swappiness = 0
vm.vfs_cache_pressure = 50

###################################################################
# Increase max open files limit, this causes problems when opening
# a number of files typically set to 1024
fs.file-max=65536

###################################################################
# Optimization for netapp/nfs increased from 64k, @see http://tldp.org/HOWTO/NFS-HOWTO/performance.html#MEMLIMITS
net.core.wmem_default=262144
net.core.rmem_default=262144
net.core.wmem_max=262144
net.core.rmem_max=262144

net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_no_metrics_save = 1

# Guidelines from http://media.netapp.com/documents/mysqlperformance-5.pdf
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=0
sunrpc.tcp_slot_table_entries=128
#nvfail on
#fcp.enable on
#iscsi.enable on

#nfs.v3.enable on
nfs.tcp.enable on
nfs.tcp.recvwindowsize 65536
nfs.tcp.xfersize 65536

#iscsi.iswt.max_ios_per_session 128
#iscsi.iswt.tcp_window_size 131400
#iscsi.max_connections_per_session 16

net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 10000

kernel.sysrq=0

net.ipv4.neigh.default.gc_thresh1 = 4096
net.ipv4.neigh.default.gc_thresh2 = 8192
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.base_reachable_time = 86400
net.ipv4.neigh.default.gc_stale_time = 86400

Testing optimization of mounts and sysctl

root@vms-gu-mysql04:/mnt# time dd if=/dev/zero of=/mnt/storage/io.test bs=16k count=16384
16384+0 records in
16384+0 records out
268435456 bytes (268 MB) copied, 7.41033 s, 36.2 MB/s

real	0m7.457s
user	0m0.000s
sys	0m0.320s

mysql

#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html

# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# escpecially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs
# The following values assume you have at least 32M ram

# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket          = /var/run/mysqld/mysqld.sock
nice            = 0
open_files_limit=16384

[mysqld]
#
# * Basic Settings
#
datadir         = /mnt/mysql
tmpdir          = /mnt/mysql.tmp/

performance_schema

skip_name_resolve
default-time-zone='UTC'
ft_min_word_len=2
group_concat_max_len=8192

myisam-recover          = BACKUP
log_slow_queries        = /mnt/mysql.log/slow.log

max_allowed_packet=128M
key_buffer = 512M
table_cache = 2000
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 64M

# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 16

max_connections=1000
wait_timeout = 120
connect_timeout = 10
tmp_table_size = 64M
max_connect_errors = 1000

## INNODB
innodb_file_per_table
innodb_table_locks=0
innodb_doublewrite=0
innodb_open_files=4096
innodb_support_xa=off
innodb_log_files_in_group=2
innodb_adaptive_checkpoint=keep_average
innodb_thread_concurrency=0
innodb_flush_method             = fsync
Innodb_checksums=0
innodb_read_ahead = none
innodb_flush_neighbor_pages = 0
#innodb_write_io_threads=16
#innodb_read_io_threads=16
innodb_io_capacity=2000

innodb_buffer_pool_size = 7G
innodb_additional_mem_pool_size = 256M
innodb_lock_wait_timeout = 5
innodb_flush_log_at_trx_commit = 1

innodb_data_file_path=ibdata1:100M:autoextend 
#innodb_log_file_size = 900M
## REPLICATION
log-bin=mysql-bin
server-id=7
sync_binlog=1
relay-log = relay.log
relay-log-info-file = relay-log.info
relay-log-index = relay-log.index
log-error=repl.log

binlog_cache_size = 10M
expire-logs-days=5


[mysqldump]
quick
quote-names
max_allowed_packet      = 64M

[mysql]
default-character-set=utf8

[isamchk]
key_buffer              = 16M

#
# * IMPORTANT: Additional settings that can override those from this file!
#   The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/

Testing

In order to test traffic from our production environment we setup stunnel for both replicated and piped mysql traffic. I setup MySQL on a single node as a slave whilst the other acted as a test box for optimization. The replicated data would only test the write capacity of the new environment, however this would not simulate queries running through our production. Additionally routing traffic from production via iptables would not allow me to filter only selects.

Whilst maatkit provides mk-log-player this didn’t cover my needs to test the new infrastructure. As such I rewrote parts of mysqlsniffer and stripped it down to execute the statements on the other box and filtering select and creation of temporary tables and inserts into them.

Summary

After testing for a period of time, we decided against switching to use netapp and simply add disk packs to the blade datacenters. There were further issues with their configuration on there network/netapp which the client is looking into. It was expected that we could issue reads/writes at 120mb a second, however only managed to get up to 40mb a second. Contrasting 40mb/s to direct disk access on other machines that were optimized we could achieve 500mb/s. The optimizations improved reads/writes by up to 900%.

Share

1 Response to Optimizing and testing MySQL over NFS with NetAPP

Avatar

darkfader

June 30th, 2011 at 7:10 pm

Very nice you put this up for reading!
VMWare networking is slow, unless tuned by absolute experts, and the Linux NFS code sucks beyond description, one of the reasons why Oracle implemented pNFS support right into the database kernel instead of relying on the OS. They made it go to wirespeed rate over multiple links to a single NetAPP filer.
But its smart to know when to not ride a dead horse (as in – aligning with Customer environment if it just ain’t fit!

Comment Form

About this blog

Andrew Johnstone is a software engineer / lead developer working at Everlution Software.

Photostream

  • Multiple Ucarp on the same host | PaoloBertasi: […] http://ajohnstone.com/achives/running-several-vips-on-the-same-interface-with-ucarp-and-ha [...]
  • mohit: I am working the same on windows server 2008 and mysql version 5.1.39 ,but it is not working and i a [...]
  • Steve: Hi there... sorry that this is old, but I'm trying to use your script to check for my usage per seco [...]
  • vinodh: great. thanks . My boss wanted me to configure multiple mysql instances on same physical machine . i [...]
  • andrew.johnstone: In the example above this was in fact using spl_autoload_register. I've never debugged this properly [...]

GitHub

GitHub Octocat

ajohnstone @ GitHub

  • Status updating...