God classes sound nice, but they are the Devil

We have different names for it, but every software developer has seen them: one or more classes that seem to know everything and follow absolutely no programming principle. The class has 1000+ lines of code, class fields, static fields, methods with DB calls, static methods, if statements in an if statement in a double loop...Any question you could ask in the domain is answered by one of these classes. Any class you look into seems to depend on them. You can’t change any class in the system without breaking the tests for a god class, if it even has tests.

How to tackle them? First, the most obvious approach is not to write them :-) don't be the guy that:

  • leaves code with comments like this: //When I wrote this, only God and I understood what I was doing //Now, God only knows
  • says “Management put impossible deadlines, so I needed to cut corners”

  • says “I will commit these 500 lines of code here temporary, and refactor it later”

If they are already created there are many ways to deal with them, non is a unique solution, but here are some general suggestions.

  • read the book “Clean code” by Robert Martin
  • Write tests
  • break methods into smaller methods and break the class into smaller classes (extract till you drop)
  • follow the SOLID principles, especially the single responsibility principle
  • implement Continuous Integration and Continuous Delivery

How did it all look for me? Some developers from my team, including myself, were given the task of cleaning a pure back-end application that ran as a scheduled job, so that new features could be added or existing ones changed. Since adding new or changing existing features was almost impossible because non of us was the author of the application and the code was such a mess that it was hard to see what was going on under the hood. In short we were dealing with poorly maintained legacy code.

As most developers would start, we said “Ok, let's look at the tests to get to know the application features”
No tests. No problem. I will start looking at the code in order to see all the things the application is doing and rewrite it.

This isn't gonna work. What now?
We decided to cut the classes by extracting. But, until we could do so, we needed some insurance that we are not changing the existing logic. When you have code like we had, the only valid tests you could make where black box tests. What we basically did was take input data from production, run it on the application and compare is the output the same as the output on production.

Now the refactoring begins. The first candidates for heavy refactoring where of course the god classes. We have tried many different approaches, some where a success, some not so much.
My recipe for success was as follows. First, if you see a group of methods that can be exported as a separate class, do it instantly. Secondly, extract till you drop. Cut methods into smaller methods, then cut classes into smaller classes. While doing so try to separate the code that works with database calls from the rest of the code. With this you can clearly see at what point do you read data, and when do you write it. Also, it is a good idea while refactoring to change method and field names if the existing ones are confusing or wrong.

Next, class fields. Usually, class fields from god classes can be exported into separate objects or data structures and used as method input or output parameters (for instance, domain objects, DTOs and similar).

Last big point is abstraction. For instance, create interfaces and later worry about the implementation. Keep in mind that most IDE-s have tools that can help you refactor, I use IntelliJ and it has really good options for refactoring. After all of the mentioned is done your code should look like something that is readable and you could start writing unit tests and adding finishing touches.
Apart from coding, Continuous Integration and Continuous Delivery plays a major factor in this process. If you have your code in Git or similar source control, always create a separate branch on which you will do small changes. At the same time, do small commits. When you are done, if tests pass, merge the branch. This way you don't depend on the changes you made. For instance, get too tangled into code that even you do not know what is done. Also, if you don't like the direction it's going simply drop the changes and start fresh. Do not be afraid of your application, deploy after every change. Of course, have a rollback strategy! You will never be totally sure that the changes done are good until they are in production.

Conclusion. Nobody likes god classes, but somehow they get created. In my opinion god classes are a synonym for really bad coding. They are usually created because of time pressure or simply developers not taking care of their code. Remember, maintenance of code might look as waste of time in the beginning, but in a long run it will save a lot of time. If an application is already in a poor state and has tangled god classes do not try to add additional code and make a bigger mess than it already is. Write tests, so you can know if the logic is the same as when you started. Start refactoring, in small steps. Make your code a little better with every commit, following some standard principles, until it begins to look like the code you would want to see. Have a CI and CD strategy, it's a bad idea to push code into production after months of working on it, because if somethings is wrong, it will take you ages to figure it out. Hope this article can help ease your pain when working with train wreck code.

Let's end with a nice quote from stackoverflow:

Tight deadline leads to pressure, pressure leads to rash thinking, rash thinking leads to tight coupling, tight coupling leads to unmaintainable code, unmaintainable code leads to....suffering.

comments powered by Disqus