In my blog yesterday I wrote a simple cursor example for a client who is learning MySQL the results of which should have been a distinct list of customerIDs. Unfortunately the final value in the result set was repeated! After first assuming that the cursor SELECT DISTINCT query had failed, I later verified that the query was working correctly. In fact the final repeated value in the result set only had one entry in the source table to begin with! What is going on? Here is the original cursor:
CREATE PROCEDURE cid_orders()
BEGIN
-- Declare local variables
DECLARE done BOOLEAN DEFAULT 0;
DECLARE id INT;
-- Declare the cursor
DECLARE customer_ids CURSOR
FOR
SELECT DISTINCT customerID FROM orders ORDER BY customerID ASC;
-- so we are asking for DISTINCT customerID's (so if we have
-- a duplicate customerID we shouldn't get it) and
-- are ordering the result set in ASCENDING order by
-- the customerID column
-- Declare continue handler
-- when no more data remains to be fetched the '02000' state
-- is fired and done is set, causing the loop to end
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=1;
-- Create a table to store the results
CREATE TABLE IF NOT EXISTS order1 (id INT);
-- Open the cursor
OPEN customer_ids;
-- Loop through all rows
REPEAT
-- Get the next id number
FETCH NEXT FROM customer_ids INTO id;
INSERT INTO order1(id) VALUES(id);
-- End of loop
UNTIL done = 1 END REPEAT;
-- Close the cursor
CLOSE customer_ids;
END;
So what is happening is that when no more data is seen in the cursor result set during the FETCH statement, SQLSTATE is set to ’02000′ and the continue handler sets done to 1. However the loop doesn’t immediately exit. As a result, the INSERT statement following the FETCH is executed with the old customerID value and thus the final piece of data is written to the output table twice!
Yesterday I solved the problem before checking if the SELECT DISTINCT query was actually working correctly. By the time I thought to check, I had already written a nice solution that fixed the problem – I kept track of the previous customerID and wrapped the INSERT statement in an decision statement to keep the code from inserting a customerID twice. The only problem is that I solved the wrong problem (and constantly checking for a previous customerID is wasteful).
The actual problem is that the INSERT statement is fired following a FETCH in the original code regardless of the SQLSTATE. Unfortunately it doesn’t appear possible to check the SQLSTATE in the loop itself. However we can check the done variable and that is the first correct solution:
CREATE PROCEDURE cid_orders()
BEGIN
-- Declare local variables
DECLARE done BOOLEAN DEFAULT 0;
DECLARE id INT;
-- Declare the cursor
DECLARE customer_ids CURSOR
FOR
SELECT DISTINCT customerID FROM orders ORDER BY customerID ASC;
-- so we are asking for DISTINCT customerID's (so if we have
-- a duplicate customerID we shouldn't get it) and
-- are ordering the result set in ASCENDING order by
-- the customerID column
-- Declare continue handler
-- when no more data remains to be fetched the '02000' state
-- is fired and done is set, causing the loop to end
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=1;
-- Create a table to store the results
CREATE TABLE IF NOT EXISTS order1 (id INT);
-- Open the cursor
OPEN customer_ids;
-- Loop through all rows
REPEAT
-- Get the next id number
FETCH NEXT FROM customer_ids INTO id;
IF done <> 1 THEN
INSERT INTO order1(id) VALUES(id);
END IF;
-- End of loop
UNTIL done END REPEAT;
-- Close the cursor
CLOSE customer_ids;
END;
The second (better) correct solution is to unconditionally exit the loop after the FETCH statement has raised the ’02000′ SQLSTATE. This requires changing the REPEAT … UNTIL loop to a labeled LOOP:
CREATE PROCEDURE cid_orders3()
BEGIN
-- Declare local variables
DECLARE done BOOLEAN DEFAULT 0;
DECLARE id INT;
-- Declare the cursor
DECLARE customer_ids CURSOR
FOR
SELECT DISTINCT customerID FROM orders ORDER BY customerID ASC;
-- so we are asking for DISTINCT customerID's (so if we have
-- a duplicate customerID we shouldn't get it) and
-- are ordering the result set in ASCENDING order by
-- the customerID column
-- Declare continue handler
-- when no more data remains to be fetched the '02000' state
-- is fired and done is set, causing the loop to end
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=1;
-- Create a table to store the results
CREATE TABLE IF NOT EXISTS order1 (id INT);
-- Open the cursor
OPEN customer_ids;
-- Loop through all rows
loop1: LOOP
-- Get the next id number
FETCH NEXT FROM customer_ids INTO id;
IF done = 1 THEN
LEAVE loop1;
END IF;
INSERT INTO order1(id) VALUES(id);
-- End of loop
END LOOP loop1;
-- Close the cursor
CLOSE customer_ids;
END;