CodeIgniter Database Memory Use In Long Running Scripts

PHP and associated frameworks aren’t optimised for running long scripts so when your app needs to do a lot of data crunching, there are often memory problems. In this case, I discovered CodeIgniter was keeping all my queries tucked away in its internal structures. Apparently it does this so you can access profiling information when the script finishes.

I was working on my hacky app for analysing a log file today when I loaded in a biggish log file and hit the dreaded memory limit exceeded problem:

PHP Fatal error: Allowed memory size of 33554432 bytes exhausted

I googled and stackoverflowed for PHP memory profiling solutions and pretty much saw only two options: xdebug and the built-in memory_get_usage(). I considered using xdebug but in this case it would mean installing it on a production server and potentially breaking things or else copying some big files to the office and trying to reproduce the problem on our development server which would also be a big hassle. I was sure I could find the issue more easily than that. Using memory_get_usage in a log statement, I was able to get some clues as to what was going on but it’s hard to use this as well because while I could see where the memory was growing, I couldn’t tell if the large arrays were being garbage collected at a later stage (which I’m pretty sure they were).

The way the log file processing works is like this. I iterate through the file a line at a time and grep for markers that let me see what type of log message it is. Depending on the grep result, I call into the appropriate method for parsing the log message and storing the result into the matching array for that log message type. Once any of these arrays gets to a certain number of elements, I call into the method which creates a bulk insert to the DB and clears out the array.

As best as I could tell, the big arrays would have been marked for garbage collection at this point in my method:


$this->tagset_events = array();
/* the former contents of tagset_events should
now be free for garbage collection */


 

After a bit of reading I came across this StackOverflow post which mentioned the database library and that got me thinking about whether CodeIgniter was keeping my queries hanging around and using up all my memory.

As a first experiment, I changed my bulk insert calls to use simple_query() instead of query() and ran my script again. The results were immediately a success! The script finished running and the memory use reported by the profiler was only 3MB as opposed to 32MB. From this I am concluding that CodeIgniter does keep your SQL queries tucked away somewhere by default which can use up memory if you’re processing a lot of queries (in this case it was in the order of 3000 queries each seeming to use about 8K of memory)

Leave a Comment

Optimising and Profiling a CodeIgniter App

CodeIgniter has a handy profiling class which automatically times queries and can be used to time blocks of code. There is no need to live with the pain of a slow app when often you can get improvements in the order of 100 times faster just by adding an index.

I have a hacked together CodeIgniter (aka CI) based web app here at work that I use to analyse some logfiles. The basic workflow of this app is to grep through the log files looking for certain events and create database rows for these events that I can then run queries against and produce a graphical representation of what’s going on (I use SVG for the graphics but I’m not going to talk about that in this post).

Since I wrote this app, I’ve had abysmally bad performance which has gradually worsened as I’ve added more log data to the system. Today it got bad enough that I decided to bite the bullet and sort the mess out.

My first attempt to locate the issue was to place log comments with time stamps at various points in the code. By looking at the timestamps on these messages in the CodeIgniter log, I was able to locate the problem methods in my app. It turned out I had two methods that were causing the issues. I picked the first of these methods and put some debug messages in the inner for loop to see if I could break things down further but unfortunately, the granularity of the log timestamps is only one second so I couldn’t get any further using this method unless I started rolling my own timing methods.

Rather than rolling my own at this point, I decided to investigate the CodeIgnitor profiling options, assuming it had some built in help. It turned out that yes indeed CI has a cool and useful profiling class built in. Not only does it automatically print the timings of your SQL queries, and overall page load time, it lets you mark points in the code and print out the execution times of sections of code (these are called benchmark points in CI). I broke my problem loop into three timed sections and enabled the profiling as per the CodeIgniter User Manual

At this point I saw that two of the sections in my loop had ~300ms execution time. I also noticed in the SQL query section that the queries launched by these methods also had ~300ms execution time so I knew that these queries were the problem as they were being called inside a for loop so the 300ms was quickly multiplying out to tens of seconds in my total page load time.

The query in question looked like this:


SELECT value FROM niotag_events
WHERE logfile_id = 14 AND uri='our://funny/url/scheme/id/thing'
AND evtime < '2011-04-29 11:02:34.345' AND resource='M123B'
ORDER BY evtime DESC LIMIT 1

 

As It turns out, I hadn’t put any indexes on my tables so it made sense that the queries were running slow. I’m not a database optimisation expert so I just decided to add an index that covered the columns in my where clause:


CREATE INDEX niotag_events_idx_1
  ON niotag_events
  USING btree
  (logfile_id, resource, uri, evtime DESC);

 

Just this index reduced the query to less than a millisecond and I immediately felt the page load speed halve. I then tackled three other queries on different tables using the same method with similar improvements.

The upshot of this exercise was that I saw my page load time go from 25 seconds to about 250ms, that’s a factor of a 100 times speed increase. Think about the minutes I was spending waiting for results of my log crunching app while I could have been clicking through to the next page to find what I was looking for. Think of the amount of boredom that was causing me and even worse, the windows for distraction as I clicked over to twitter or elsewhere just waiting for pages to load. This exercise was well worth the effort and my only question I have now is why didn’t I fix this sooner?

Leave a Comment