Imagine that 90% of your job is merely to triage issues on a very massive, very broken website. Imagine that this website is written in the most tightly coupled, least cohesive PHP code you’ve ever seen, the type of code that would add the original developers to your “slap on sight” list. Imagine that this web application is made up of 4 very disparate parts (1 commercial, 2 “repurposed”, and 1 custom) and a crap-ton of virtual duct tape and shims. Imagine that it contains the type of programming practices in which major components of the website actually rely on things NOT working properly, and fixing these broken things usually breaks other things. Imagine that you know from too many bad experiences that changing one seemingly innocuous part of the website, such as splitting a “name” field into two separate “first” and “last” fields, will bring the site to its knees and require hours of rollbacks, merges and patches. Imagine pleading with the customer for years to just ditch the code and start all over but being met with Enterprise-Grade despair and hand wringing. Then imagine getting ASAP/EMERGENCY tickets to implement new features that in any other web site would take 4 hours but you know better with this site so you quote 40 hours, then blow right by that and bill 80 hours, but it’s OK because the client is used to that with their website.
Here are some other things that you should also imagine:
- there are no tests at all right now
- there are googleteen different layers of logins. Some customers actually have 3 different accounts for different sections of the website
- when I say “tightly coupled”, I mean the loops of include/require statements would probably map out like a celtic knot
- when I say “least cohesive” I mean some stuff is organized sort of like MVC, but it’s not really MVC. In some cases it may take you several hours just to find out how URI A is mapped to file B
- the UI was written like “obtrusive” and “inaccessible” were the buzzwords of the day
Imagining all that, is it even worth trying to achieve even a moderate level of test coverage? Or should you, in this imaginary scenario, just keep doing the best you can with what you’ve been given and hoping, praying, maybe even sacrificing, that the client will agree to a rewrite one of these days and THEN you can start writing tests?
Since many of you brought it up: I have approached the possibility of a re-write at every chance I’ve had to date. The marketing people I work with know that their code is crap, and they know it’s the fault of the “lowest bid” firm they went with originally. I’ve probably overstepped my bounds as a contractor by pointing out that they spend a crap ton of money on me to provide hospice care for this site, and that by redeveloping it from scratch they would see an ROI very quickly. I’ve also said that I refuse to rewrite the site as-is, because it doesn’t really do what they want it to do anyway. The plan is to rewrite it BDD style, but getting all the key players in one place is tough, and I’m still not sure they know what they need. In any case, I fully expect that to be A Very Big Project.
Thanks for all the feedback so far!
You’ll likely not get full coverage for some time. But you can write tests for new code/features you implement. Start small. Don’t try to do everything all at once.
And perhaps the book “Working Effectively with Legacy Code” is worth a read?
I would also recommend watching this presentation from Uncle Bob, which touches on this scenario and on how to transform a bad code base into a good one using “Progressive Widening”.
Start by finding any self-contained functions: functions that don’t reference anything but the arguments passed in. Move and organize them into helper classes. This is likely just temporary, as many will end up in different classes later, but it will help identify some duplicated code and start getting things organized. Then look at how these functions are used and write tests based on those uses. And pat yourself on the back. You’ve now started making your code maintainable.
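As a rough illustration of that first step, here is a minimal sketch. It is written in Python for brevity even though the site in question is PHP; the idea is the same in either language. The class `TextHelpers`, the function `format_display_name`, and the call sites mentioned in the comments are all hypothetical and not taken from the real code base.

```python
class TextHelpers:
    """Temporary home for self-contained functions pulled out of page scripts."""

    @staticmethod
    def format_display_name(first: str, last: str) -> str:
        # Uses only its arguments: no globals, no session, no database.
        return f"{last.strip()}, {first.strip()}"


# Tests written from the ways the function is (hypothetically) already called.
def test_plain_names():
    assert TextHelpers.format_display_name("Ada", "Lovelace") == "Lovelace, Ada"


def test_padded_form_input():
    # Assumed call site: raw form input that still carries stray whitespace.
    assert TextHelpers.format_display_name(" Ada ", "Lovelace\n") == "Lovelace, Ada"


if __name__ == "__main__":
    test_plain_names()
    test_padded_form_input()
    print("all helper tests passed")
```

Because the function touches nothing outside its arguments, these tests need no setup at all, which is what makes this kind of code a cheap place to start.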
With great timing, InfoQ just posted another article, How To Do Large Scale Refactoring, which is about specifically this sort of thing, and there is an older article called Refactor or Rewrite? There are also techniques like the Mikado Method, which starts from the realization that you can’t always make the move you want in one step; you have to make other moves first to set up and reach your goal.
If you say that multiple things rely on other things specifically not working then how can you even begin to test it?
Personally I would say scrap it and start over. Four hour features that take 80? I hope this is an exaggeration. The headaches you must have.
I would start with a very firm proposal to re-write the code base. Hand-wringing clients must be told the blunt truth sometimes. How many other developers will work with a broken base? Make some pretty cost / benefit charts.
By all means write tests for code you write. Don’t neglect that. I’m saying I wouldn’t try to write tests on the existing code base.
Give it a go
Writing tests enables you to refactor. If you write your tests at a high-enough level, you might manage to make it so you can refactor without having to re-write the tests every time.
It’s at least worth a go on a small part of the site (I know you won’t be able to fully isolate any part, but you can still target one).
Maybe set yourself a time budget (it’s down to you to work out what’s affordable/worth it), then have a go with some tests and some refactorings. If it doesn’t work out, roll back. If it does, carry on.
First, if your customer is used to your estimates being half of what the work actually takes, fix your estimates! It is nice that the customer is ‘OK’ with the estimates being off, but it is critical that you get your estimates more in line with the effort actually needed. Without that, what customer would ever consent to a major refactoring, let alone a rewrite? So build up some history of being right with estimates, then move to rework the project.
As for writing tests: that is even more vital for what you describe than for a green-field project. In every piece of code you touch, ask yourself whether it is possible to decouple the behavior that should be there from the behavior that is there. Write the code the way it should be (with tests) and then add a layer of abstraction to make it the way it currently is (and test that too!). You will feel like you’re adding to the mess, and you will be, but slowly, over time, the tests will give you confidence in these areas.
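To make the “should be” versus “currently is” split concrete, here is a small sketch built on the name-field example from the question. It is in Python rather than PHP, and everything in it is an assumption for illustration: the functions `split_name` and `legacy_name_field`, and in particular the trailing-space quirk that the adapter preserves.

```python
from typing import Tuple


def split_name(full_name: str) -> Tuple[str, str]:
    """The behaviour we want: first word is the first name, the rest is the last name."""
    parts = full_name.split()
    if not parts:
        return ("", "")
    return (parts[0], " ".join(parts[1:]))


def legacy_name_field(first: str, last: str) -> str:
    """Adapter for code that still expects a single 'name' value.

    Assumed quirk: old pages always received 'first<space>last', with the
    trailing space kept when the last name is empty, so the adapter does too.
    """
    return f"{first} {last}"


def test_split_name():
    assert split_name("Ada Lovelace") == ("Ada", "Lovelace")
    assert split_name("  Jean  Luc Picard ") == ("Jean", "Luc Picard")
    assert split_name("") == ("", "")


def test_legacy_adapter_keeps_old_shape():
    assert legacy_name_field(*split_name("Ada Lovelace")) == "Ada Lovelace"
    assert legacy_name_field(*split_name("Cher")) == "Cher "  # quirk preserved on purpose


if __name__ == "__main__":
    test_split_name()
    test_legacy_adapter_keeps_old_shape()
    print("clean code and legacy adapter both behave as expected")
```

The clean function and the adapter are tested separately, so the quirk is written down in one place and can be deleted once nothing depends on it.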
If it’s anything like what I’ve been dealing with, it will be on the order of pulling a single method out into a helper class and patching it back into the existing code. That hardly seems worth it, but it does pay off every time you have to touch that part of the system again. Like they say, “leave it better than you found it”, and you’ll start finding it in better shape each time you come back to it. Tests are the best way to leave it better than you found it.
But seriously, getting the client confident in the accuracy of your estimates is required before they will be fully confident in your ability to handle a rework.
Absolutely write tests. Especially in a tightly coupled environment, the tests are going to be even more critical (since a bug fix in one area can drastically affect other areas due to the tight coupling).
Now, realize that this will likely not be a trivial task. In order to write tests, you’ll need to modify the code to be testable. In order to modify the code, you need to have tests. So you’re caught in a dependency cycle…
However, look at the potential benefits. That should tell you if it is really worth it or not.
If you do start out, start small. Pick one tiny piece that looks loosely-coupled, and test that. Then find something else that’s not that tangled. Test all the loosest pieces first (the low hanging fruit). Then, once you get to the really tight parts, you’ll both feel more comfortable and (hopefully) have more insight as to what you really need to do.
Remember, you don’t need 100% coverage to reap the benefits. Each test adds meaning…
You can’t scrap it. The customer isn’t going to let you, and it might not be the best path anyways.
So instead of quoting 40 hours for a fix that should have taken minutes… quote 60. The customer seems A-OK with that. Use 40 to fix, and 20 to refactor… and write tests on what you refactor. If the 60 runs to 100, then spend 120; 80 to fix, and 40 to refactor/test.
Build in time to improve the thing into your normal estimates, or find new work; the current situation, it sounds like, will drive you into hating our field.
This sounds like in order to make it testable at all, you’d have to rewrite parts of the system from scratch – unavoidably causing tons of bugs in the process.
From what you describe, the old system is not worth putting that kind of effort into.
I would under no circumstances try and introduce testing for this, but try to get permission to rewrite as soon as possible.
If your client doesn’t see the light, consider whether refactoring the project is worth giving some time of your own: Working with clean code is so much better for one’s well-being…
The most important thing (after buying Working Effectively with Legacy Code) is to start small. I work on several projects, each several thousand lines of PHP long and often without a single function (and don’t even think of objects), and whenever I have to change code I try to refactor that part into a function and write a test for it. This is combined with extensive manual testing of that part so I can be sure it works as before. When I have multiple functions for similar things, I move them as static methods into a class and then, step by step, replace them with proper object-oriented code.
Every step, from moving code into a function to changing it into a real class, is surrounded by unit testing (not very good unit testing, as 90% of the code is SQL queries and it’s nearly impossible to set up a reliable testing database, but I can still test the behaviour).
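One way to test behaviour without a reliable test database, sketched here in Python rather than PHP, is to pass the query call into the extracted function so a test can hand it canned rows. This is only one possible approach and not necessarily what the answer’s author does; the function `active_customer_emails`, the `customers` table, and its columns are all made up for the example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def active_customer_emails(fetch_rows: Callable[[str], List[Row]]) -> List[str]:
    """Logic lifted out of a page script; the database call is passed in."""
    rows = fetch_rows("SELECT email, active FROM customers")
    return sorted(str(row["email"]).lower() for row in rows if row["active"])


def test_only_active_customers_are_listed():
    def fake_fetch(sql: str) -> List[Row]:
        # Canned rows stand in for the real database.
        return [
            {"email": "Zed@Example.com", "active": 1},
            {"email": "amy@example.com", "active": 1},
            {"email": "gone@example.com", "active": 0},
        ]

    assert active_customer_emails(fake_fetch) == ["amy@example.com", "zed@example.com"]


if __name__ == "__main__":
    test_only_active_customers_are_listed()
    print("behaviour verified without a database")
```

The SQL itself stays untested here; what the test pins down is what the code does with the rows once it has them.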
Since a lot of code repeats (I found a single SQL query repeated 13 times in a single file and many times more in the other 50 files of that project), I could change all the other places, but I don’t, since those are neither tested nor can I be sure the surrounding code doesn’t depend on that code in some weird way (think global). That code can be changed as soon as I have to touch it anyway.
It’s long and tedious work, and every time I see the code I feel a step (or rather a leap) closer to a mental breakdown, but the code quality will improve (slowly but mostly reliably).
Your situation seems to be quite similar, so maybe my ideas might help you with your code.
Start small, change only what you work on and begin to write only limited unit tests and expand them the more you learn about the system.
Start by doing black-box functional testing of connected parts, or of bits and pieces here and there. This makes continued development and refactoring/rewriting much easier.
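A black-box test of this kind can be as simple as recording what the system outputs today and flagging any difference later. The sketch below is in Python and assumes a lot: `legacy_render` is a hypothetical stand-in for whatever actually produces a page (in practice it might be an HTTP request to a test copy of the site), and the snapshot file name is arbitrary.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def legacy_render(uri: str) -> str:
    """Hypothetical stand-in for the legacy system; treated as a black box."""
    return f"<h1>{uri.strip('/').title() or 'Home'}</h1>"


def check_against_snapshot(uris: List[str], snapshot_path: Path) -> List[str]:
    """Record output on the first run; on later runs return the URIs that changed."""
    current = {uri: legacy_render(uri) for uri in uris}
    if not snapshot_path.exists():
        snapshot_path.write_text(json.dumps(current, indent=2))
        return []
    recorded = json.loads(snapshot_path.read_text())
    return [uri for uri in uris if recorded.get(uri) != current[uri]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "snapshot.json"
        check_against_snapshot(["/", "/contact"], snapshot)  # first run records
        changed = check_against_snapshot(["/", "/contact"], snapshot)  # second run compares
        print("changed pages:", changed)
```

Such a test says nothing about whether the recorded output is correct, only that it has not changed, which is often the guarantee you need before refactoring.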
Been there, doing that.
Took a while till we could start adding unit testing, but got there eventually.
It’s still far from bulletproof, but all the developers are much more willing to change or fix things knowing that there is a test suite waiting to verify their code changes.
From your scenario, you should have a long list of fragile areas of the code that tend to be affected by innocuous changes (or at least areas that absolutely must work). If you can write tests against this list, you have a quick way to find out when a change you’re implementing has broken something.
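One possible shape for such a list, sketched in Python, is a table of named checks and a runner that reports which ones fail. The two checks here are hypothetical placeholders that do no real work; on the actual site each would exercise one of the known fragile areas.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[], bool]]


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a failure
        if not ok:
            failed.append(name)
    return failed


# Hypothetical placeholder checks; real ones would hit the fragile code paths.
def login_still_works() -> bool:
    return True


def name_field_is_populated() -> bool:
    return "name" in {"name": "Ada Lovelace"}


FRAGILE_AREAS: List[Check] = [
    ("customer login", login_still_works),
    ("name field on profile page", name_field_is_populated),
]

if __name__ == "__main__":
    failures = run_checks(FRAGILE_AREAS)
    print("broken:", failures if failures else "nothing")
```

Every bad experience can then become one more entry in the list, so the list grows along with your knowledge of where the site breaks.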
In theory, definitely. The more tightly coupled and bug-ridden the maintenance process, the more important the tests. In practice, walk away and live another day!
If things behave reliably, you can test them, right? Your system works the majority of the time, so you can test for those success conditions.
...innocuous part of the website, such as splitting a “name” field into two separate “first” and “last” fields, will bring the site to its knees and require hours of rollbacks
Splitting a field apart, such as first and last name, sounds like a potentially massive thing, but it sounds like you’ve learned your lesson. At least try to get some funding for a full-size test system and put procedures in place to make moving production data to it automatic, so you can fully test this thing.
Sounds pretty horrible though. Time to dust off the ole resume?
You might want to consider billing another 40 hours/iteration to create a nice BDD (domain) model of how the application works or better: should work. That creates a nice framework where you can document the needed features. When the model is complete enough, you can estimate how much time you’d need to convert it to a working application.