Skip to content

Getting SQL from a SPAN port

Recently I needed the query stream hitting a very busy master. Normally I would have been using the MySQL Proxy to collect queries, but on a very busy machine the Proxy is as much of a problem as it is part of the solution, so I chose a different approach.

I had a SPAN port configured for the master, which is Ciscospeak for a port on a switch which mirrors all traffic of one or more other ports. I had an otherwise idle machine listening to the SPAN port on a spare network card. That way it is possible to collect traffic to and from the master without any interference with the master.

On the listener box, I had tcpflow collecting data to my master (and only traffic to, not from the master):
CODE:
tcpflow -i eth1 dst master and port 3306
These tcpflow files now need to be processed into a slow-log like format for further processing. For that I wrote a very simple processor in C after some experimentation with tcpdump and mk-query-digest had been shown as being too slow to keep up.

The processor is called extract_queries and it's souce can be found below. It would be used like so:
CODE:
# mkdir flow
# cd flow
# tcpflow -i eth1 dst master and port 3306
(wait 1h)
(break)
# cd ..
# find flow -print0 | xargs -0 extract_queries -u > slow
# mysqldumpslow -s c slow > stats


The Source: (extract_queries.c)

DELETE, innodb_max_purge_lag and a case for PARTITIONS

Where I work, Merlin is an important tool for us and provides a lot of insight that other, more generic monitoring tools do not provide. We love it, and in fact love it such much that we have about 140 database agents reporting into Merlin 2.0 from about 120 different machines. That results in a data influx of about 1.2G a day without using QUAN, and in a data influx of about 6G a day using QUAN on a set of selected machines.

It completely overwhelms the Merlin data purge process, so the merlin database grows out of bounds, which is quite unfortunate because our disk space is in fact very bounded.

The immediate answer to our purge problem was to disable the merlin internal purge and with the kind help of MySQL support to create a script which generates a list of record ids to delete. These ids end up in a number of delete statements with very large WHERE ... IN (...) clauses that do the actual delete.

This is a band-aid fix, which does work in a way, but also has unintended consequences, though. Or, as we use to say around here: 'That also breaks, but in a different and interesting way.' Continue reading "DELETE, innodb_max_purge_lag and a case for PARTITIONS"

Sysadmins Nightly Mental Pain (SNMP)

Dear Diary, today I ventured into one of the darkest realms of the sysadmin profession: I started playing with SNMP. My goal was very simple and quite clearly stated: Make the output of "SHOW GLOBAL STATUS" available to a SNMP client. One would think that this is a reasonable and easily fulfilled wish.

Little did I know of the madness and despair that linger in the depths which are guarded by the dread named ASN.1 and where the noxious fumes of the organisation no longer called CCITT can still strongly be smelled.

But let us begin this story at the beginning - with a clean install of Suse Linux 10.0 and my trusty apt4rpm and me. It was my thought that the perl support of net-snmp might me handy to get me where I wanted to me: perl is easily the more convenient language for prototyping that C or C++ and getting to the data source using perl-DBI would be pretty easy. So "apt install net-snmp net-snmp-devel perl-SNMP" it was. So with my spirits high and a song on my iPod I ventured out into the documentation...

Continue reading "Sysadmins Nightly Mental Pain (SNMP)"