IPC 2008: Impressions

In the speaker shuttle to the (far away) hotel:

“Stop moving your knees!”
“This is not my knee…”
“OH MY GOD!”

Refactoring your legacy code - Part four: Fixture this!

Another day in paradise. The right time to continue with our small series about refactoring. Today Tomas and me will talk a bit about the right (and wrong) way of preparing your data for the tests - the so called “Fixtures”. If you will read in the text the context “we” then Tomas wrote this part about his experience with the OXID refactoring.

Remember - we learned how to test simple 2×2 functions but when we tried to test more complex ones initialy we felt like stalled with no or negative progress. We tried to identify what is the difference between simple functions returning a value from supplied parameters and the real life ones.
The problem with those functions is that they depend in most cases on your application state (script globals, config vars, session vars, object state, existing data etc..).

As already shown in the previous postings we are able to separate all the code into a few abstraction layers (db access, fetching config vars,..) and we did that, but finally we still wanted to test in a simple way the high-level functions. Eg. createOrderFromBasket()->saveOrder()->loadOrder() flow. Lets identify the issues preventing us from simply testing these functions and see the way how OXID handled them.

Time consuming fixture creation:

It is quite easy and fast to prepare a small data set for tests, but when you try to test a bit more complex functions like the calculation of values in a shopping basket, then you have to prepare the full basket and define all the needed values (products, amounts, vat… ). And if you have to do this for every test, this is a pretty boring task.
OXID decided to solve this issue by creating a test database filled with all the common data: users, orders, various combination’s of products, discounts etc. For each test where we needed this data we created on the fly fixtures out of this database - some sort of “fixture” generator with a static source (our database). This helped a lot in speeding up fixture creation. I’ve heard some opinions and complaints that our big demo data is a big NO, not a really valid “fixture” and you should prepare fixtures for every test separately, but why not to consider our test data as one fixture - only bigger? Leave us a comment if you think different!

Actually the way described above violate the theory behind TDD. In the swoodoo team we do it a bit different - but still I want to share this with you, as it’s a real world example and - not to forget - a very successful one.

Test data is altered via another test:

And here we have the next problem. When you have one big test data chunk, it gets altered by other tests. One of the (very wrong) ways would be adjusting tests, depending on which order they are run. But it’s clear you will reach a dead end very soon. Do not even try this. Wrong!

How did OXID solve this? First of all we created a rule: use test data for reading relaxed (actually the nature of our business logic is that it is “read data” oriented) and - if we need to change test data for test – make sure we change it back on tearDown(). This is still dangerous as one failed test may result of other test failing. Tracking this kind of failures is most difficult task. So additionally we use a small script which is run after each test (test class), which diffs existing db with initial and restores it.

Actually the much better way is to create the fixtures each time manually with tools like yaml. This way is the correct way and it can be solved also without too much work if you create yourself a generator for yaml. Doing so will avoid many problems like the ones described above. On the other hand - this is pure theory. And refactoring an existing application is something totally different than creating a new one with the latest state of the art technology.

Refactoring your legacy code - Part Three: View me!

And there we are again - today we will take a deeper look into the views.

It is important to decide which classes are critical to test and which classes only deal with retrieving information. The MVC pattern is the answer. It separates logic from presentation. View classes only “reroute” the information and Controller classes host all the logic.

Decoupling Models from the Views is vital for your tests. For the model you need to have nice unit tests. The Views should be tested by acceptance tests. Some people do also run unit tests on the Views - it is possible - but my personal opinion on this issue is that you really should try to avoid logic in the view and then it’s sufficient to test them via selenium tests.

This is why we created a new rule in effect – no logics in view classes, just getters and setters for controllers and super global wrappers.

Let me give you a good hint: do never ever use the accessor methods for the super globals from the model classes. Let it be $REQUEST, $COOKIES or even $SESSION. All these parameter should be collected in the view and passed on to the model. Doing so gives you the chance to have small, clear and decoupled functions and also your tests can be done very easy.

Let’s see how this could look like before:

class displayUserDetails()
{
    /**
     *  Processes input and sends user first name, last name to display;
     */
    function show() {
        global $dbLink;
        global $templateEngine;
        $itemId = (int) $_REQUEST['user_id'];
 
        $firstName = $dbLink->getOne("select first_name from users where id = $itemId");
        $lastName = $dbLink->getOne("select last_name from users where id = $itemId");
 
        $templateEngine->addTemplateVar('firstName', $firstName);
        $templateEngine->addTemplateVar('lastName', $lastName);
        $templateEngine->display();
    }
}

and after refactoring you should get something similar:

/**
 * A view class responsible for displaying user details. 
 */
class userView()
{
    /**
     * Loads user object and sends first name, last name to display
     */
    public function show()
    {
        $userId = $this->_inputProcessor->getParameter("user_id");
 
        $this->templateEngine->addTemplateVar('user', $this->model->loadUser(userId));
        $this->templateEngine->display();
    }
}
 
/**
 * And the corresponding model
 */
class userModel()
{
	public function loadUser($userId) 
	{
		$user = new User( $userId );
 
        return array( 'firstName' => $user->getFirstName(),
        			  'lastName' => $user->getLastName());	
	}
}

And - as usual thanks to Tomas who provided this time the examples.

Refactoring your legacy code - Part Two: Global Problems ?

Today we want to address a global problem. No - not the global warming up - instead the misuse of global scope in variables and functions. This is a followup to my previous posting. And again (as every posting in this series) some advertisement for our talk at the phpconference.

Legacy Applications are often done in PHP4. Which means no really object orientation. All class variables are public - and this is how they are used. Over the time many members are set/get in different places.
If you start writing a test - you need to make sure that you only test specific functions. All subsequent calls, data etc. need to be stubbed == faked.

Quite common would be that you have to deal with many global variables. In OOP applications it shouldn’t  be that the output of the function is influenced by global parameters. And for correct testing you need to refactor this.

Both cases need to be addressed by you - and this is where the refactoring starts. As I wrote in the last posting - while writing the test you have to refactor otherwise you can’t create proper tests.

But what to do with PHP super globals $_GET, $_POST, $_SERVER, $_COOKIE etc.? They are globals by PHP nature. You could create a wrapper (proxy) myInput::getParameter(‘getParam’) and use it everywhere in the code. No exceptions allowed. Doing so you can stub these calls also. And furthermore it adds a bit more security as you can add in this function the very basic and idiotic security checks. They won’t be enough for sure - but it’s a start.

The same applies to the global configuration parameter. The most important here is – you must be able to supply your own parameter in tests here. This way you can write tests for different request parameters and config options. On test tearDown() all parameters should be restored to defaults.

Let’s see how this could look like. You might find such settings in your code:

class someOtherClass {
    var $setting;
 
    function calculateSomething($a, $b) {
        return $a+$b;
    }
}
 
class myOldNastyClass {
 
    function needToTestThisFunction() {
 
        $class = new someOtherClass();
 
        $z = $_GET['input'];
 
        // ....
 
        return $class->calculateSomething( $class->setting, $z);
    }
}

This code has more or less all the pitfalls you will explore while dealing with globals. Before you can even think of writing a test for this you need to refactor it. And this could lead to:

class someOtherClass {
    private $setting;
 
    public function calculateSomething($a, $b) {
        return $a+$b;
    }
 
    public function setSetting($set) {
        $this->setting = $set;
    }
 
    public function getSetting() {
        return $this->setting;
    }
}
 
class myInput {
    public function getParameter($name) {
        return $_GET[$name];
    }
}
 
class myOldNastyClass {
 
    private $input; // set e.g. in constructor
 
    public function needToTestThisFunction(someOtherClass &$class, $z) {
 
        $z = $input->getParameter('input');
        // ....
 
        return $class->calculateSomething( $class->getSetting(), $z);
    }
}

Now we have some code we can test. Let’s see how the tests could be:

class myStub extends someOtherClass {
 
    public function getSetting() {
 
        // fixture
        return 99;
    }
}
 
class myInputStub extends myInput {
     public function getParameter($name) {
        // fixture
        return 999;
    }
}
 
// in the test
 
$test = myOldNastyClass->needToTestThisFunction( new myStub(), 88);
// ....

I hope I could point you in the correct direction ;)

Refactoring your legacy code - Part One: In the beginning there was….

As promised Tomas and me will try to give a rough overview what you need to consider when you plan to refactor your old (legacy) applications. It won’t be a detailed guideline nor very much sorted. More or less our toughts and experiences over the last years.

If you want a more detailed (including sourcecode level) introduction you should attend our workshop on phpconference. As I will run this workshop together with two of the best known PHP developers ( Johann Peter Hartmann and Thorsten Rinne) you can expect much more details.

For sure I will upload the presentation after the IPC - but not before ;)

Now - refactoring. Hmm… Refactoring means per definition modifying (cleaning up) it without changing its behavior. But how do you make sure that you won’t change funtionality if there are no tests ? This means - you only can refactor when there are tests. And this is where the problem starts - no tests, or ?

And there we go. Let’s assume you do have a larger project - growed over the years. And now you are in the lucky situation that your boss agree’s with you about the need of tests. But where to start ? The team has no experience in writing tests. Nor Test Driven Development what is even worse.

You will need to invest some time into the team so that they learn about writing tests. And you will experience that knowing and understanding are two different things. From my personal opinion there is only one thing which can be compared. The difference between procedural and object oriented programming. Strange example ? No.

Let me explain. It is very easy to get a book and learn a bit about classes, objects etc. But only after using it for a while you will deeply understand the concept of OOP  and get the sense. Same is for TDD - writing a test is more or less a matter of minutes. Or let’s say hours if its your first. But understanding why tests are important and how to use them. This needs time. And you will have to invest this time.

After I pushed the team into this direction ( and hey - this was not so easy as most developers tend to be very conservative ) they did the tests - simply because the boss said so. But only a few weeks one Developer told me “Hey! Tests are cool! I found a bug I would have never found before”. This is the point where you need to get your team to.

So much about the theory. But how about the tests. Where to start? The answer is easy. You need to start with the worst, nastiest and largest file you can find in your project. I know I know. They want to start with the easier files, where tests are done fast and you see some progress. Developers will really fear this file. But what will happen if you start with the easy ones? Simply - you will see some progress and get the false feeling that things are alright. Your team still won’t understand the tests fully - and they leave the nasty piece of code what to the end. You have them to force to adress their demon in the beginning. This will take a while. But then you can be sure that the rest will be a piece of cake. Otherwise you will move the risky part to the end of the refactoring period - and this is not a good idea.

You might think “this guy is nuts where are the problems in writing tests…”? Hey! We are talking about old, grown ( == spaghetti ) code. This code is interconnected massively. Globals, mixed classes, wild calls between modules and objects etc. It is very difficult to write tests for this.

Actually while writing tests you will note that you need to refactor the code. You simply have to. As otherwise you can’t write a test. Or - let’s be honest - you can. If you write mocks and stubs which are triple the size of your code. This is exactly what unexperienced TDD developers will do. If you see this - kick them. And then again just because it feels good ;)

Each tests leads to refactoring. Like the gordian knot you will start to pull the worm and you will pull, pull and pull. And after some while refactoring is done and the first test can be written. Ok just kidding. But there is some truth in this statement. You need to refactor the code first - then write the test. No large stubs.

And this will be a lot of work - especially in the beginning as everything is interconnected. And these dependencies need to be addressed and solved. First. Seperate the model and the view. In the View leave no logic - just I/O. And then test the model. Some people even test the View with unit tests. I don’t. I do prefer testing the model, having small views without logic and leave the rest to the selenium tests.

Nasty. Takes long. Much work. But believe me - there is no way around.

While reading you might get the idea that you will get away with acceptance tests ( selenium ). Do you think this could do the deal? The idea is thrilling. If you will write selenium tests for your application fully - then refactor - and you did not break any tests you could be sure that you did not change any functionality. Nice idea? Won’t work. Sorry.

Remember - I was talking about a larger grown application. This one will have tons of functionality. This means 100s or 1000s of tests. And these tests can last long. I know it. I went down this road - and it was a oneway road. We ended with one full test running 10 hours. This is not Test Driven Development. For doing that you need instant - instant - feedback. Not on the next day. Nobody can work like this.

Let’s continue in Part Two.

Refactoring legacy Code ?

It has been a while since my last posting - sorry about that. Actually I was - and am - really busy with the new milestone of swoodoo.com which we will (hopefully) release soon.
We did refactor nearly the whole system - starting from the database layer, over to the API and finally the GUI. Everything will be brand new - or at least massivly refactored.

As we do agile development since many years we are quite experienced with managing such large refactoring sessions. More or less the same happend in OXID eSales - the new eShop Version was refactored massively. Under the hood (and not only there) many things changed.

And how about you ? Over the last years I heard many developers, managers and companies stating that they would love to change/refactor their big ball of mud - but they won’t as they fear to fail. And in deed - the risk to fail is definitly there.

On the upcoming php Conference near Frankfurt Johann Peter Hartmann, Thorsten Rinne and me will run a full day workshop where we try to show you, where the devil in the details is creeping out of his hole.

For the ones who can’t or don’t want to attend the php Conference (which is a huge mistake but hey - it’s your decision) my friend Tomas Liubinas (Senior Developer at OXID eSales) and myself will try to give you a brief overview over the pitfalls and learnings we had in refactoring OXID eShop.

This will be a small series of articles - as otherwise it would be a monster posting :)

Check back soon.