Tuesday, July 5, 2011

Stored procedures and Triggers


Having multiple ways to achieve a task is something we all enjoy as developers and DBAs. We find, develop and learn new ways to do things better and faster all the time.

At the risk of starting a debate, I am curious on others opinions or practices when it comes to Stored Procedures and Triggers. To use them or not versus code based functions ? Best case use versus worst case use? There is no real wrong answer here as it depends on your development application. Certainly some lean one way over another and there are more than enough valid reasons on both sides of the debate.

Here are couple of my thoughts on the topic....

I come from the dot.com bubble era , and from that I rarely use stored procedures or triggers. Back then PHP was still new, Perl dominated websites with the cgi-bin and MySQL did not have stored procedures or triggers. Thank goodness things have changed. Developing in those days, forced developers to develop in certain ways. Hopefully, we all used functions and classes so we could reuse code as much as possible. Yet we all survived without stored procedures or triggers just fine by having the logic in our code. But did that make it better?
During this time I had to interact with a SQL server and the DBA that ran it. My LAMP stack used MySQL except for the one call to the finance department's SQL server for a yes/no value. ( I actually used FreeTDS for this connection and it worked like a champ, my side of things anyway. No comment on the speed or other issues of the SQL server needed here. ) I did not need or care about the table structures and layout of the finance SQL system. I only needed a yes or no answer from it. It was the finance teams system and they protected it, rightfully so. So in that case it made perfect sense, for a stored procedure. I handled the code on my side , made a call into the SQL server got the answer and got right out.

Now days of course we have stored procedures in MySQL, do we use them? Is it just an old school development versus modern development in terms of who uses them? I have seen them used at times when it just made it harder to track down a problem. Tracing code across six different functions and cursing the entire way at the missing comments in the code only to find a CALL statement. So we end up going back to the DB to find the query and hopefully do not break the site as we test changes. At the same time, I think they are great if you do need to compartmentalize your logic away from others.

It really all comes down to your application. Where do you want your logic to live? The biggest thing we need to take into account when we decide this is, what do your benchmarks say about your query and do you value the results? Do you want to take advantage of not passing the data across a network and keeping it local to the database? Are you ok with the a separate cache per connection to a stored procedure versus a globally cached query? Depending on the query and how often it is used plays a big part in that decision.

A simple example of stored procedure using the “menagerie database”  is below:

DROP PROCEDURE IF EXISTS addnewpet;
DELIMITER //
CREATE PROCEDURE addnewpet(IN petname VARCHAR(20) , ownername VARCHAR(20), speciestype VARCHAR(20), gender CHAR(1) , bday date )
BEGIN
INSERT INTO pet (name,owner,species,sex,birth) VALUES (petname,ownername,speciestype,gender,bday);
INSERT INTO event VALUES (petname, date_format(NOW(),'%Y-%m-%d'), 'first visit', NULL) ;
END //
DELIMITER ;
CALL addnewpet('Clifford','Emily','dog','m',date_format(NOW(),'%Y-%m-%d') );

This simple call will add the dog name to both tables when used.

You will notice the one of the problems with stored procedures is that you might end up having to pass a lot of variables that might already be defined in your class.

Triggers are another example where history has had developers learn to query data after insert and/or have scripts check the status of data to then make changes as needed. Triggers avoid this for you and should be used whenever possible. Let MySQL handle the adjustments for you immediately and avoid having to write code just for a simple update. Granted as situations get more complex the need might arise for function based code as well. Make sure that all code is optimized before used in the trigger as trigger options are limited.

Replication concerns with triggers are addressed here:

A quick example can be shown using the “menagerie database

mysql> SELECT TRIGGER_NAME, EVENT_MANIPULATION, EVENT_OBJECT_TABLE, ACTION_STATEMENT
FROM INFORMATION_SCHEMA.TRIGGERS
WHERE TRIGGER_SCHEMA='menagerie';
No triggers so we will create one. In this example we can assume that people forget to update the death of pets in the pet table. They note it in events but that is it.
So a simple trigger will keep this updated for us :
delimiter ;;
CREATE TRIGGER ed_update AFTER INSERT ON event
for each row
begin
UPDATE pet p
INNER JOIN event e ON p.name = e.name
SET p.death = e.date
WHERE e.remark = 'death' AND p.death IS NULL;
end;;
delimiter ;

This trigger just updated the pets table for us quickly.


To boil down my thoughts on stored procedures and triggers.... When you are considering a stored procedure make sure to benchmark and test to confirm your ok with the cache results. A stored procedure is best when the global cache is not going to be more effective overall. If tests show that the results are faster handling large data sets with stored procedures versus code base changes then also a good option to use stored procedures. Testing all options is critical.

Triggers can be a great addition to clean up additional scripting work that can be done immediately upon data insert. Again test and confirm to be sure your aware of all the options of using them. Options with triggers are limited so do all the explains ahead of time.

More links here: