It doesn’t scale

“It doesn’t scale” is something I say often. How do we know something doesn’t scale? Software and organizations are systems; systems have inputs and create outputs. The amount of output in relation to the inputs gives an idea of the throughput of the system. It’s not necessary to get super scientific or mathematical about this, just understand the thought process. If we increase one of the input variables and the system can’t keep up, this is a failure to scale.

It’s the role of a manager/operator to understand the bottlenecks in a system, devise solutions for them, and have the foresight to plan for future scaling bottlenecks.

A simple exercise is to take each of the input variables and increase it individually by the next order of magnitude (e.g. 10x the customers, 10x the page requests, or 10x the transactions), then work out what would happen to the system. Next, identify the parts of the system that would need to compensate for the increase in input and determine whether that compensation is linear.

Parts of the system that scale linearly in relation to the inputs are the constraints of the system. Perhaps your input is “customers” and you discover that scaling customers requires a corresponding linear increase in employees (e.g. each customer requires 5 additional staff). If you want to 10x the number of customers, it would require a corresponding 10x increase in employees. To illustrate why this is not a good relationship, let’s assume you have 10 customers and 50 employees. Would you be able to hire an additional 450 employees to support 100 customers? Maybe you could, but what’s the lead time for that, and what systems and processes would you need to put in place for it to happen? Growing from 50 employees to 500 employees is fraught with problems you have never encountered before. It’s for you to determine whether that’s reasonable, or whether there’s a better solution.
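The customers-to-staff example above can be sketched in a few lines. This is only an illustration of the thought process, assuming the 5-staff-per-customer ratio from the example; the function name `required_staff` is made up for this sketch.

```python
# Assumption: staffing grows linearly with customers, at the
# 5-staff-per-customer ratio used in the example above.
STAFF_PER_CUSTOMER = 5


def required_staff(customers: int, staff_per_customer: int = STAFF_PER_CUSTOMER) -> int:
    """Total staff needed if every customer requires a fixed number of staff."""
    return customers * staff_per_customer


# Walk the input up by an order of magnitude at a time and see what the
# system would need to compensate.
for customers in (10, 100, 1000):
    print(f"{customers} customers -> {required_staff(customers)} staff")
```

Each 10x step in customers demands a 10x step in headcount, which is the linear relationship to look for when hunting for constraints.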

Solving for bottlenecks involves understanding the fundamental relationships between parts of the system. Merely doing something faster may not yield the desired increase in output. I often seek leverage, which occurs when effort has a multiplier effect on the output.

Bespoke things generally do not scale, but if those bespoke things can have leverage, then there’s a multiplier effect. Take a tailor on Savile Row. A tailor who creates bespoke suits might produce a product of very high quality, but the work relies on the tailor’s singular skill to design a suit for each client. It requires hands-on measuring, fitting, and re-fitting. The only ways the tailor can grow the business are by charging more or by working longer hours. Both have limits, and this is a real-world bottleneck. Technology has always provided avenues to solve bottlenecks: it can help decouple parts of the process, introduce parallelism, or automate steps. As an exercise, think of how you might scale this business. Determine what you are trying to solve for and what you are willing to compromise on to get a result that meets those objectives.

TIL – You can’t wire from TD to TD

Over the weekend, I tried to wire funds from my TD bank account to another TD account. Yesterday I learned it had bounced back.

At the time I sent the wire, I noted to the teller that it was going to another TD account, and she didn’t seem to have any concerns. This morning, I went to the branch at Brookfield Place to ask the branch manager to investigate.

While it’s nice that I get to save on the wire transfer fees, this seems like a broken abstraction. There’s no reason why TD’s wire department couldn’t detect this case and just do an internal account-to-account transfer. To them, this should just be an internal accounting issue. Another thing I learned through this process was that this limitation is little known at TD, across multiple branches.

Educating tellers and staff is not high leverage, because humans easily forget edge cases they don’t encounter often. Their systems should either have abstracted away this limitation or informed the staff that this should be done via an account-to-account transfer. Catching exceptions early, or making them transparent to the end client, leads to better end-user experiences.
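As a rough sketch of what “abstracting the limitation” could look like, the check below routes a same-bank transfer internally instead of sending it out as a wire. Everything here is hypothetical: the `Account` type, the `institution` field, and the `choose_transfer_method` function are invented for illustration and say nothing about how TD’s systems actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    institution: str  # hypothetical bank identifier, e.g. "TD"
    number: str


def choose_transfer_method(source: Account, destination: Account) -> str:
    """Pick the transfer rail up front, before the request can bounce."""
    # Same institution on both sides: handle it as an internal
    # account-to-account transfer rather than a wire.
    if source.institution == destination.institution:
        return "internal_transfer"
    return "wire"


mine = Account(institution="TD", number="111")
theirs = Account(institution="TD", number="222")
print(choose_transfer_method(mine, theirs))
```

The point is that the edge case is caught by the system at the moment of the request, so neither the teller nor the client has to remember it.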