Refactoring your legacy code - Part One: In the beginning there was…
As promised, Tomas and I will try to give a rough overview of what you need to consider when you plan to refactor your old (legacy) applications. It won’t be a detailed guideline, nor very well sorted. It is more or less our thoughts and experiences over the last years.
If you want a more detailed introduction (down to the source code level) you should attend our workshop on phpconference. As I will run this workshop together with two of the best known PHP developers (Johann Peter Hartmann and Thorsten Rinne) you can expect much more detail.
For sure I will upload the presentation after the IPC - but not before.
Now - refactoring. Hmm… Refactoring means, by definition, modifying (cleaning up) code without changing its behavior. But how do you make sure that you won’t change functionality if there are no tests? This means you can only refactor when there are tests. And this is where the problem starts - there are no tests, are there?
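One common way to get a first safety net is a test that simply records what the old code does today, before anyone touches it. The following is only a minimal sketch in Python (the idea is the same in PHP with PHPUnit); `legacy_price` and its discount rule are hypothetical stand-ins for a piece of old code, not something from a real project.

```python
import unittest


def legacy_price(quantity, unit_price):
    """Hypothetical stand-in for an old function nobody fully understands."""
    total = quantity * unit_price
    if quantity > 10:
        # Undocumented bulk discount that somebody added years ago.
        total = total - total / 10
    return total


class LegacyPriceBehavior(unittest.TestCase):
    """Pins down the current behavior so a refactoring cannot change it unnoticed."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 5), 10)

    def test_bulk_order_gets_ten_percent_off(self):
        self.assertEqual(legacy_price(20, 5), 90)
```

The expected values are not taken from a specification. They are whatever the code returns right now, and that is exactly the point: if a cleanup changes them, you changed behavior.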
And there we go. Let’s assume you have a larger project that has grown over the years. And now you are in the lucky situation that your boss agrees with you about the need for tests. But where to start? The team has no experience in writing tests, and none in Test Driven Development either, which is even worse.
You will need to invest some time into the team so that they learn about writing tests. And you will experience that knowing and understanding are two different things. In my opinion there is only one thing this can be compared to: the difference between procedural and object-oriented programming. Strange example? No.
Let me explain. It is very easy to get a book and learn a bit about classes, objects etc. But only after using them for a while will you deeply understand the concept of OOP and see the sense in it. The same goes for TDD - writing a test is more or less a matter of minutes. Or let’s say hours if it’s your first. But understanding why tests are important and how to use them takes time. And you will have to invest this time.
After I pushed the team in this direction (and hey - this was not so easy, as most developers tend to be very conservative) they wrote the tests - simply because the boss said so. But only a few weeks later one developer told me “Hey! Tests are cool! I found a bug I would never have found before”. This is the point you need to get your team to.
So much for the theory. But how about the tests? Where to start? The answer is easy. You need to start with the worst, nastiest and largest file you can find in your project. I know, I know. The developers want to start with the easier files, where tests are done fast and you see some progress. They will really fear this file. But what will happen if you start with the easy ones? Simple - you will see some progress and get the false feeling that things are alright. Your team still won’t fully understand the tests - and they will leave the nasty piece of code to the end. You have to force them to address their demon in the beginning. This will take a while. But then you can be sure that the rest will be a piece of cake. Otherwise you will move the risky part to the end of the refactoring period - and this is not a good idea.
You might think “this guy is nuts, where are the problems in writing tests…?” Hey! We are talking about old, grown ( == spaghetti ) code. This code is massively interconnected. Globals, mixed classes, wild calls between modules and objects etc. It is very difficult to write tests for this.
Actually, while writing tests you will notice that you need to refactor the code. You simply have to, as otherwise you can’t write a test. Or - let’s be honest - you can, if you write mocks and stubs which are triple the size of your code. This is exactly what inexperienced TDD developers will do. If you see this - kick them. And then again, just because it feels good.
Each test leads to refactoring. As with the Gordian knot, you start to pull on one thread and you pull, pull and pull. And after a while the refactoring is done and the first test can be written. OK, just kidding. But there is some truth in this statement. You need to refactor the code first - then write the test. No large stubs.
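To make “refactor first, then test, no large stubs” a bit more concrete, here is a small sketch in Python; the same move works in PHP. Everything in it (`CONFIG`, `gross_price`, `gross_price_from_config`) is a made-up example, and it assumes the only thing standing in the way of a test is a read from a global.

```python
# Hypothetical global that the old code reads from everywhere.
CONFIG = {"vat_percent": 19}


def gross_price(net_cents, vat_percent):
    """Refactored core: everything it needs comes in as a parameter.

    Works on integer cents so the result is exact.
    """
    return net_cents * (100 + vat_percent) // 100


def gross_price_from_config(net_cents):
    """Thin wrapper with the old behavior, so existing callers keep working."""
    return gross_price(net_cents, CONFIG["vat_percent"])


# The test for the core needs no stub and no global setup at all.
assert gross_price(10000, 19) == 11900
assert gross_price(10000, 7) == 10700
```

The wrapper stays as ugly as before, but it is now one line. The logic that matters sits in a function that can be tested in isolation, and the next global can be pulled out the same way.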
And this will be a lot of work - especially in the beginning, as everything is interconnected. These dependencies need to be addressed and resolved. First: separate the model and the view. Leave no logic in the view - just I/O. And then test the model. Some people even test the view with unit tests. I don’t. I prefer testing the model, having small views without logic, and leaving the rest to the Selenium tests.
Nasty. Takes long. Much work. But believe me - there is no way around it.
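As a rough illustration of that split, here is a sketch in Python rather than PHP. `Cart`, `render_cart` and the free shipping threshold are invented for this example and only show the shape: all decisions live in the model, the view just turns them into text.

```python
from dataclasses import dataclass


@dataclass
class Cart:
    """Model: holds the data and all the logic, knows nothing about output."""
    items: list  # list of (name, price_in_cents, quantity) tuples

    def total_cents(self):
        return sum(price * qty for _, price, qty in self.items)

    def is_free_shipping(self, threshold_cents=5000):
        return self.total_cents() >= threshold_cents


def render_cart(cart):
    """View: only formats what the model provides, no business rules here."""
    lines = [f"{name} x{qty}" for name, _, qty in cart.items]
    lines.append(f"Total: {cart.total_cents() / 100:.2f}")
    return "\n".join(lines)


# Unit tests go against the model only.
cart = Cart([("book", 1500, 2), ("pen", 200, 1)])
assert cart.total_cents() == 3200
assert not cart.is_free_shipping()
```

With a view this thin there is little left to break in it, which is why it can reasonably be left to the browser-level tests.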
While reading you might get the idea that you can get away with acceptance tests (Selenium) alone. Do you think this could do the trick? The idea is tempting. If you cover your application fully with Selenium tests, then refactor, and no test breaks, you could be sure that you did not change any functionality. Nice idea? Won’t work. Sorry.
Remember - I was talking about a larger, grown application. It will have tons of functionality. This means hundreds or thousands of tests. And these tests can take a long time. I know it. I went down this road - and it was a one-way road. We ended up with one full test run taking 10 hours. This is not Test Driven Development. For that you need instant - instant - feedback. Not on the next day. Nobody can work like this.
Let’s continue in Part Two.