Long functions without reusable parts

You have a long function (like 100-300 lines) that describes a sequence of steps that only make sense together.

What do you do?

  • Add some comments to show what the steps are, but keep it in the same function. No point introducing extra indirection and jumping around the code. If a part is ever reused, it can still be split at that time.
  • Split it into multiple functions. Methods should stay short so as to give a faster overview. Who knows, maybe it'll be re-used one day in the future.

Bonus question: if the function is a method and you're splitting it, do you just share data through instance fields, or pass everything as arguments?

EDIT: I'm thinking about compiled languages with the possibility to force the compiler to inline these methods, so no performance cost. But performance considerations may be useful for other readers.

Split it


Keep it together


I like turtles


Comments (4)

Marco Alka's photo

As always, depends. If it's a hot path, keeping the function or method together might yield performance benefits. Other than that, I'd split it into logical groups. That way, the code is even self-documenting (but you should add comments nevertheless!)

if the function is a method and you're splitting it, do you just share data through instance fields, or pass everything as arguments?

It's just one method. Keep your method's business out of the class/object. Especially when going parallel, having one field for all threads is a bottleneck. It's usually better to take a more functional approach (even in OOP) by using helper functions. Hell, I'd not even write the parts into methods, I'd write them into functions, which are encapsulated inside of the module (which I already do in my JS and Rust projects, btw).

import foo from 'Foo';

export default class Bar {
  doSth() {
    // very long method here

  // ...

would become

import foo from 'Foo';

export default class Bar {
  doSth() {
    feedThePigeons(123, 'bread');

  // ...

const feedThePigeons = function(pigeonId, food) {
  // small function

const spankTheMonkey = function(monkeyId) {
  // small function

const lickTheLizard = function(lizardId) {
  // small function
Hipkiss's photo

I would suggest it depends on the use case. Personally, in javascript, I have created a few classes that are a few hundred lines long, in some cases or two methods that are a couple hundred lines long and my reasoning for doing this is to improve performance when the code is running. Obviously, because of this I have commented well and used useful naming conventions.

If you're thinking of creating a class/set of methods that aren't going to be reused somewhere else then I wouldn't worry about splitting it. Splitting something for the sake of it can be counter intuitive as it can introduce unnecessary obfuscation, which slows down the code and also can make it harder to follow when debugging.

As for variables, I would suggest putting as much as possible in the constructor, some might disagree with this (obviously this has limits and should be taken on a variable by variable basis). My reasoning is that, in my opinion, it's cleaner. Especially if the variables are used in multiple methods. Furthermore I find that doing this allows easier tracking of the variables, meaning an improved consistency when in use as well as checking for debugging.

So that's what I would probably do given the limited information provided; create a class where you can setup the variables in a constructor then put the function in a single method and determine if it should be split based on complexity, future use cases and who's going to also be looking at this code.

Jos Fabre's photo

If it can be split, split!

Reusability of bits of code is an important reason for splitting. In "good code practice", a single method doesn't have to make sense at all. It gets meaning by being called, thus getting a cognitive/explanatory environment.

If your enormous code block makes sense because of what you do in sequence, the same sense will come from calling the separate (explanatory named) methods that you created (probably also in sequence).

Passing data as arguments follows the black box principle, where a function always has input, processing, and output (which is just processed input). By using this practice, you will never have dependency and thus no scoping problems, because there's always just the input to work with.

That being said, when coding javascript I regularly use variables in the global scope, something that is similar to (or could be called) instance fields.

Jason Knight's photo

For me this question REALLY depends on what it's doing. Breaking it into functions introduces overhead -- the push of the current address to the stack, push of any parameters or pointers to parameters to the stack, the far call, the popping of all that off the stack, and that's assuming a compiled language. Interpreted (and/or loosely typed) can often be worse if they do things like "locking" to prevent mutability or transformation by type.

You have to weigh performance against code clarity, but really if you're not re-using any of it I would just comment each block and make DAMNED sure you've got your indentation consistent. Remember, whitespace is your FRIEND.

Though I kind-of worry when people call 100 lines a "long function". Probably because I've written code in assembly, where 1000 lines without a single call or non-conditional jump is the norm.

That said, I would look long and hard at it to try and figure out if there's an easier/faster/smaller way... but then SO many people's codebases these days seem to use ten times the code I would to do the same job. OR WORSE rely on megabytes of frameworks to have written the same amount of code or more as I'd have had WITHOUT the framework involved.

But what do I know? I'm the nut who's been doing this on collections in JS for two decades:

for (var i = 0, element; element = collection[i]; i++) {