In many organizations, the expanded role of the QA (quality assurance) engineer, the person who tests the software that developers write, has led to a shift of accountability similar to the one that came to a head in banking in 2007, in effect laying out the welcome mat for risk and sloppiness.
This dawned on me when someone was explaining the importance of Agile as a software development model (this being the week of the national Agile2016 conference), and why the old waterfall model many of us knew in the 90s is ill-suited to current needs.
I get it.
But in describing the journey, they mentioned something that made me realize that part of the massive mess many development organizations are in today (both software companies and companies with an IT department that does development), a mess that includes waste, risk, and stress, has a lot in common with the 2007 financial crisis in terms of root cause. I expect Freakonomics authors Levitt and Dubner would see the connection.
Let me back up and offer a simple explanation of the financial crisis.
The 2007 financial crisis hurt millions of people all over the world, and as Michael Lewis told us in The Big Short and Matt Damon narrated in Inside Job, it was because of some now painfully obvious decisions by a relatively small number of people. If you haven’t seen the movies or read the books, I can’t recommend them enough, but (spoiler alert) you won’t like the ending. At all. The short of it, no pun intended, is that banking, specifically lending, used to be an extremely simple two-party transaction between a borrower and a lender, like the image below.
When the borrower was too risky for the lender, they wouldn’t get a loan, which makes perfect sense, and because banks usually did a pretty good job of assessing risk, most loans were safe for a long time. If you haven’t gotten a loan before, the bank asks a series of pretty basic (oversimplified here) questions:
- What is your income level? Having a job and making a certain amount of money is an important way to tell whether you can afford a loan.
- What are your monthly expenses? If you make $5,000 a month and your expenses are already $4,500 a month, you probably shouldn’t have a mortgage payment higher than $500 a month.
- What’s your net worth? If your income changes, is there another way for you to pay the mortgage?
- Do you pay your taxes? They don’t really ask the question this way, but part of the test of loaning someone money, in addition to seeing how much income they have told the government they make, is whether they are honest and timely with their tax payments.
These seem like very reasonable questions to ask.
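To show how little it takes to ask those four questions, here is a minimal sketch in Python. The names (`Applicant`, `passes_basic_screen`) and the six-month reserve rule for net worth are my own assumptions for illustration, not how any real bank scores a loan.

```python
from dataclasses import dataclass

# Hypothetical rule of thumb: net worth should cover this many payments.
RESERVE_MONTHS = 6


@dataclass
class Applicant:
    monthly_income: float
    monthly_expenses: float
    net_worth: float
    taxes_current: bool  # filed honestly and on time


def max_affordable_payment(applicant: Applicant) -> float:
    """Income left each month after existing expenses (never negative)."""
    return max(applicant.monthly_income - applicant.monthly_expenses, 0.0)


def passes_basic_screen(applicant: Applicant, proposed_payment: float) -> bool:
    """Ask the four questions in order; failing any one rejects the loan."""
    if applicant.monthly_income <= 0:
        return False  # question 1: is there any income at all?
    if proposed_payment > max_affordable_payment(applicant):
        return False  # question 2: do expenses leave room for the payment?
    if applicant.net_worth < proposed_payment * RESERVE_MONTHS:
        return False  # question 3: is there a fallback if income changes?
    return applicant.taxes_current  # question 4: are taxes in order?
```

With the numbers from the list, a borrower who makes $5,000 a month and already spends $4,500 can carry a $500 payment but not a $600 one.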
Then things started to get complicated. Other organizations, like investment banks, started buying loans in bulk from the lending banks and then selling shares in those bulk blocks of loans like shares of stock, which are securities, and thus was born the mortgage-backed security. On paper that’s still sane and on the up-and-up. Then the people bundling all of these mortgages started to put some less valuable, riskier mortgages into the bundles. But as they stuffed the bundles with worse and worse loans, no one seemed to notice, and the bundles kept their very high safety ratings (there was no motivation for the ratings agencies to downgrade them, and that was also part of the problem). This is what Inside Job describes (in great detail) as the Securitization Food Chain, depicted below.
So while banks used to be rigorous in choosing who they would, and would not, lend to, some banks like Countrywide figured out that someone would buy the mortgages as soon as they were signed. There was really no risk to them in signing, so they decided to be much less rigorous in choosing who got loans. In many cases, they didn’t even ask the first question above. So people who should not have been approved for loans got approved. Sloppy. That’s a big part of why millions of people lost their jobs as a result of the crisis.
As I said at the outset, in many organizations the expanded role of the QA (quality assurance) engineer, the person who tests the software that developers write, has led to a shift of accountability similar to the one that came to a head in banking in 2007, in effect laying out the welcome mat for risk and sloppiness.
Back in the 90s I was a developer writing software for a machine called an AS/400, and like most developers writing software for internal use at a big company (things like order entry, accounts receivable, general ledger, and customer relationship management), I was also expected to test that it did what it was supposed to do (usually captured in a requirements document). There were several levels of testing:
- Unit Test. This is making sure it does what it is supposed to do, all by itself.
- String Test. Most software applications, especially complex internal systems, communicate with other applications, which means passing information back and forth in the right format. String test usually included just the immediate “neighbor” applications with which that application communicated.
- Integration Test. Often applications have to communicate with other applications at other companies, in other places, where you have less control and visibility over changes. Integration test is a way to ensure the software passes information back and forth successfully with those “outside” applications. A simplistic example would be that many companies process payroll through another company, and that means sending lots of information about people, money, timing, and so forth, to the payments company, in a timeframe that works to complete payroll successfully.
- System Test. System Test is where you run all applications in a given ecosystem at the same time – often simulating entire weeks or months of transactions.
- Stress/Volume/Load/Failover Test. Once you are confident the software does what it is supposed to do, you have to make sure it keeps doing it under pressure. Some organizations process millions of transactions a day now, and that means making sure there is enough hardware and network bandwidth to handle the load successfully all day, every day.
Both of the above lists, the loan questions and the tests, are simplified, but they make sense even if you aren’t a developer. When I was developing, the last test on the list (Stress/Volume/Load/Failover) was usually done by a different group of specialists, but each person was accountable for testing their own stuff through System Test, and if it failed, it reflected badly on you, and you felt you had let the whole team down.
Fast forward to today, and a lot of developers aren’t even expected to do the first test on the list, Unit Test, for the code they write. Once they finish coding, they hand the code off to QA testers, whose job it is to see if the code works or if it has bugs, and then there are even more QA people to do the string, integration, and system tests. And when they find a bug, of course they send it back to the person who wrote the software to fix it, right?
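For anyone who hasn’t written one, here is a rough sketch of what the first level, Unit Test, looks like today, using Python’s built-in `unittest` module. The `order_total_cents` routine is a made-up stand-in for a piece of an order entry system, not code from any real application.

```python
import unittest


def order_total_cents(quantity: int, unit_price_cents: int) -> int:
    """Hypothetical order-entry routine: price one order line, in cents."""
    if quantity < 0:
        raise ValueError("quantity cannot be negative")
    return quantity * unit_price_cents


class OrderTotalUnitTest(unittest.TestCase):
    """Checks the routine all by itself, with no other application involved."""

    def test_prices_a_normal_line(self):
        # 3 items at $19.99 each should come to $59.97.
        self.assertEqual(order_total_cents(3, 1999), 5997)

    def test_empty_line_costs_nothing(self):
        self.assertEqual(order_total_cents(0, 1999), 0)

    def test_rejects_negative_quantity(self):
        with self.assertRaises(ValueError):
            order_total_cents(-1, 1999)
```

Saved in a file, this runs with `python -m unittest`. The point is that the person who wrote `order_total_cents` is the one best placed to write these checks, and it takes minutes.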
No, they don’t.
They document it as a bug and it’s sent to a different group of people to fix it.
Which is why, when someone was telling me they need to hire more QA engineers to ensure they are producing high-quality code, I realized it has gone too far. Developers aren’t being held accountable for their software; their only motivation is to write hundreds of lines of code, even if every line just says “All work and no play” like it did in The Shining.
So I get the importance of Agile, but if at the same time we don’t hand accountability back to the software developers who write the code, we are no less crazy than Jack in The Shining if we expect to realize all of the risk, efficiency, and cost benefits promised by Agile. Well, maybe not that crazy.
Just as the banking errors that led to the 2007 crisis are now incredibly obvious, this shift of accountability should be equally obvious to people leading development teams, and the steps needed to fix it aren’t too complex.
Next I will point out that many organizations think they are Agile, or have started to try, but haven’t gotten there, and that, in part because of the mountains of additional data they are going to have to start collecting very soon, they are running out of runway to become truly Agile before their maintenance costs and bug-related expenses spin completely out of control.