Thoughts On: The Pragmatic Programmer

Preface, and A Pragmatic Philosophy

Software Entropy

Software has the tendency to devolve into disorder over time, also known as "software rot"

From my own experience, over time, and through the combination of new feature requests and bug fixes, applications slowly become such a bloated, tangled mess of code that one inevitably wants to just burn them down and start again. Couple this with changes in technology and design trends and the desire is only magnified.

The first problem can be partially mitigated through better upfront planning and better documentation. It's been a hard lesson learned, but spending the extra few minutes on good documentation pays off over the long-term. No matter how self-explanatory the code seems at the time of writing it, when you revisit it a year later (or even a month later!), it isn't always clear what it does, why it was added, or how it integrates into the larger project scope.

To the second problem, it's all too easy to avoid improving something that already works. Technology doesn't change overnight, but it does over years. Often you can see problems, such as outdated libraries, miles down the road. If you don't spend the time to fix them now – when they don't need fixing – you'll eventually have to fix them when they become an emergency. And the problem with emergencies is that, as a rule, they happen at the worst possible times.

As a policy, I try to make small fixes and cleanups to code any time I open a file. It takes just minutes a day, but those minutes add up and keep your code up-to-date and constantly improving.

Good-Enough Software

Perfect is the enemy of good

There is a diminishing return on time invested. Sometimes (usually) it's better to get something out the door rather than to make it 'perfect'. It's not even clear what perfect is. Minimum viable products and 'failing fast' both rest on the idea that it's better to get a product out there and improve upon it based on real user feedback, rather than spending time developing features users may not want at all.

Your Knowledge Portfolio

The internet moves fast. 20 years of experience means 3 years of relevant experience and 17 years of experience in outdated technologies and practices

Being a software developer means constantly learning new skills and abandoning old technologies. That's not to say that the old knowledge is without value, however. Skills such as problem-solving and even the ability to learn new skills will aid you in your path of constant growth.

This section contained helpful suggestions on ways to foster the learning habit such as setting measurable goals for reading books or learning new languages (such as OCaml!).

Additionally, there's the often overlooked and undervalued knowledge investment of 'who you know'. In an age where all the information you need is at your fingertips, it's even more important to create processes for investing in relationships – it's all too easy to stay head-down in the code and not make the social investments.

Communicate!

Know your audience and know what you want to say

Jargon can be an efficient way of summarizing complex ideas into a single word, but it's not efficient if it's not understood by the listener. Additionally, in all forms of communication (emails, meetings, technical documentation, etc.) organized, well-structured ideas are far more efficient and persuasive than sprawling streams of consciousness.

A Pragmatic Approach

Evils of Duplication

I've always thought of DRY (Don't repeat yourself) in terms of being an efficiency model – If you're going to do something more than once, you should automate it to save time. DRY has values beyond time savings, notably that without it, changes have to be made in multiple locations, so we should be mindful of the ways duplication sneaks in.

One practice I'm guilty of is verbose code documentation (something I've considered a positive but which can be a form of duplication). The risk of over-documenting code is that when the code is updated, the documentation may fall out of sync. Good code should document itself.

The second type of duplication I often succumb to is when I want to make a modified version of a function that already exists. Under the guise of expediency, I just copy the old function and change a line or two to add the functionality I'm looking for. This creates a lot of duplication and, potentially, the need to find and update two instances of code when a future modification is needed.
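As a minimal sketch of the alternative (written in Python purely for illustration; the price-formatting task and both function names are hypothetical), the line or two that would have differed in the copy can become parameters of the original function:

```python
# Hypothetical example: one function with parameters instead of two
# near-identical copies that would have to be kept in sync.

def format_price(amount, currency_symbol="$", decimals=2):
    """Format a number as a price string, e.g. 1234.5 -> '$1,234.50'."""
    return f"{currency_symbol}{amount:,.{decimals}f}"


def format_price_whole_euros(amount):
    # The "modified version" is just a call with different arguments,
    # so a future fix to the formatting logic happens in one place.
    return format_price(amount, currency_symbol="€", decimals=0)


print(format_price(1234.5))               # $1,234.50
print(format_price_whole_euros(1234.56))  # €1,235
```

The thin wrapper is optional; callers could pass the arguments directly. Either way there is only one body of formatting logic to maintain.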

Orthogonality

In a system, components should be independent. Non-orthogonality can be found in organizations where people have overlapping responsibilities, or in code where components rely on other components to work.

Two areas in my work where I should be mindful of non-orthogonality are the expectation of dependencies being present, and of the reliance on a global state.

Ideally, components should be stand-alone pieces of code. This allows them to be tested independently and to be more reusable. If a component relies on state, it should be set internally.

A very clear sign that code is non-orthogonal is when modifying one component requires making changes across many other components. This can be seen if fixing a bug requires working across many files.
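The reliance on global state mentioned above can be illustrated with a small sketch (Python for illustration; the tax calculation and all names are assumptions, not from the book). The first function depends on a module-level value; the second takes everything it needs as arguments:

```python
# Hypothetical example contrasting a function coupled to global state
# with one whose inputs fully determine its output.

TAX_RATE = 0.2  # module-level state that any other code could change


def total_with_global(prices):
    # Coupled: the result silently changes if TAX_RATE is modified elsewhere.
    return sum(prices) * (1 + TAX_RATE)


def total_with_tax(prices, tax_rate):
    # Independent: can be tested and reused without any setup.
    return sum(prices) * (1 + tax_rate)


print(total_with_tax([10, 20], tax_rate=0.5))  # 45.0
```

The second form is easier to test in isolation because a test never has to set up or restore shared state first.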

Future-proofing

Decoupling systems makes it easier to make changes in the future. An example that comes to mind is data handling. If an application relies on data being formatted a certain way and that data structure changes, every component would have to be rewritten. Separating data handling into a dedicated component, or interfacing with an API rather than directly with a database, would allow you to make changes in only one place if the data source were to change.
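A rough sketch of that dedicated data-handling component might look like the following (Python for illustration; the JSON layout, the field names, and the load_users and greeting functions are all hypothetical). Only load_users knows how the source is structured; everything else sees one agreed shape:

```python
import json


def load_users(raw_json):
    """Translate the source's format into the shape the app expects.

    If the source changes (renamed fields, different nesting), this is
    the only function that should need editing.
    """
    records = json.loads(raw_json)
    return [
        {"name": r["full_name"], "email": r["contact"]["email"]}
        for r in records
    ]


def greeting(user):
    # Components only know about the "name" and "email" keys.
    return f"Hello, {user['name']} <{user['email']}>"


# Hypothetical source data for the example.
source = '[{"full_name": "Ada", "contact": {"email": "ada@example.com"}}]'
for user in load_users(source):
    print(greeting(user))  # Hello, Ada <ada@example.com>
```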

Tracer-code vs Prototyping

Tracer-Code

Tracer-code is a means of progressive development. By having independent components, very early on you can string together an application using real components and then build each one out in time to handle more complex cases. This approach allows you to user-test early and frequently without having to write disposable code (or at least not too much).

Tracer-code is a better solution than prototyping for its ability to be continually built upon, and for when discrete aspects of an app need to be tested in tandem, such as UI and data output when dummy data just won't do.

Prototypes

Prototypes are quick to develop and change. They can be UI mockups in image editors, whiteboard planning of architecture, quick and dirty code, etc. They allow for rapid planning and testing but lack detail.

One caveat to prototypes is that it should be made clear that these are disposable. Don't build from your prototypes or you'll have a weak foundation.

Writing in a domain's language

Coding in a domain's language means writing code in plain language that can be understood by end-users. This may require parsers or handlers to translate terms into language-specific commands, but being able to understand the code makes for easier development and maintenance.

Consider config files: it's far easier to define a program in terms of natural language than in obscure parameter names. Another benefit of having code in a natural language is that error messages become easier to read.
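To make this concrete, here is a minimal sketch of a tiny domain language for scheduling rules (Python for illustration; the "every <number> <unit>" vocabulary and the parse_rule function are assumptions invented for this example). A rule written in plain words is translated into a number the program can use, and mistakes are reported in the same plain words:

```python
# Hypothetical mini-language: rules such as "every 2 hours".

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rule(rule):
    """Translate a rule like 'every 2 hours' into an interval in seconds."""
    words = rule.lower().split()
    if len(words) != 3 or words[0] != "every" or not words[1].isdigit():
        # The error is phrased in the domain's terms, not the parser's.
        raise ValueError(f"Expected 'every <number> <unit>', got: {rule!r}")
    count = int(words[1])
    unit = words[2].rstrip("s")  # accept both "hour" and "hours"
    if unit not in UNIT_SECONDS:
        raise ValueError(f"Unknown unit {words[2]!r} in rule: {rule!r}")
    return count * UNIT_SECONDS[unit]


print(parse_rule("every 2 hours"))     # 7200
print(parse_rule("every 30 minutes"))  # 1800
```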

Estimates

Helpful tips for giving an estimate

Clearly define the scope
Don't be too specific with your estimate
Ask someone who's done a similar project
Break the project into smaller components and estimate each of those

If you develop incrementally, you can track how long one feature takes to develop and continually readjust from there.

How long will a project take

The only right answer when asked how long something will take is "I'll get back to you".

When backed against the wall, my rule of thumb is to take everything I know, make the most informed estimate I can, then double that number.

The Basic Tools

Overall this section is a bit out of date. Most of its recommendations have become common practice. That being said...

Plain Text

If there ever were a debate about writing in plain text language versus machine code, I think plain text won a long time ago. I'm sure it still exists in niche industries, but any language I've ever come across has been written in human language and then compiled down after the fact.

Shell Games

In direct opposition to the previous plain text argument, Shell Games argues the merits of terminal commands, one of the least human-readable syntaxes I've had to deal with. That being said, the book raises some strong arguments for actually learning it.

Power Editing

This one is also out of date: full-featured editors have become the norm. I use VS Code for everything (even my long-form writing in markdown). I did, however, meet someone recently who told me their favorite editor was Vim – they obviously don't need convincing of the merits of the command line.

Source Code Control (or source control as we call it now)

The Pragmatic Programmer was clearly ahead of its time. Source control has become ubiquitous. I can't remember the last time I did work that wasn't source controlled.

Debugging

This is the one evergreen section of the chapter. Programming is and will always be an endless debugging process. Don't panic; I've learned from experience that rash decisions only make more bugs. Don't blame, but do own up. We all make coding errors, so there's no shame in taking responsibility. Don't blame user error. Sure, it's the user's fault 90% of the time, but if you go directly there you won't catch the legitimate bugs.

The Debugging process

Try to trace down the error through logs or print statements that trace the data through the application.
Talk it out - just the act of explaining it out loud may make you realize the problem.
Walk back through changes you made recently to find the breaking point.
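The first step, tracing data through the application, can be sketched with Python's standard logging module (the two-stage pipeline and its function names are hypothetical). Each stage logs what it received and what it produced, so the point where the data goes wrong shows up in the log:

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("pipeline")


def parse(line):
    """Split a comma-separated line into fields."""
    fields = line.strip().split(",")
    log.debug("parse: %r -> %r", line, fields)
    return fields


def to_total(fields):
    """Add up the numeric fields."""
    total = sum(int(f) for f in fields)
    log.debug("to_total: %r -> %d", fields, total)
    return total


print(to_total(parse("1,2,3\n")))  # 6
```

Unlike scattered print statements, the debug lines can stay in the code and be silenced by raising the log level.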

Text Manipulation

When there are large amounts of data to reformat, I usually buckle down and copy and paste. I have in the past written parsers, but the data has to be huge. If I were more proficient in writing text manipulators, my threshold for when it's easier than doing it manually would be lower.
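For a sense of how small such a text manipulator can be, here is a sketch (Python for illustration; the "Last, First" input format and the reformat function are made up for the example) that would replace a round of manual copying and pasting:

```python
# Hypothetical input: names as "Last, First" with inconsistent spacing.
raw = """Hopper,  Grace
Lovelace,Ada
  Turing , Alan"""


def reformat(text):
    """Turn each 'Last, First' line into 'First Last'."""
    lines = []
    for line in text.splitlines():
        # Split on the first comma only, then trim stray whitespace.
        last, first = [part.strip() for part in line.split(",", 1)]
        lines.append(f"{first} {last}")
    return "\n".join(lines)


print(reformat(raw))
```

A line without a comma would raise an error here; for a one-off cleanup that is arguably a feature, since it flags the rows that need a manual look.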

Code Generators

I've never thought about code generators. It's an interesting idea. Although from the examples shown in the book, it looks like it's only applicable to incredibly boring industries. I once met a guy who worked on the software behind tracking labels – making sure tracking codes were unique, and addresses were formatted properly. So boring industries like that.

Pragmatic Paranoia

Design by Contract

A function should be thought of as a contract – agreed-upon content in, an agreed-upon state during, and agreed-upon content out. Formalized, this is defined as:

Preconditions
Postconditions
Invariants

The degree to which this is enforced can vary from strict, via exception handling, to loose, via simple commenting.

For my coding practice, it seems like bloat to always do strict checking, but I see the value in adopting the concept. Outlining the 'loopy states' before writing a function clearly defines what needs to be implemented, and afterward, it acts as a verbal contract with future users.
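A minimal sketch of both ends of that spectrum in one function (Python for illustration; the withdraw function and its rules are hypothetical). The docstring alone is the loose, comment-only contract; the checks underneath are the strict version:

```python
def withdraw(balance, amount):
    """Return the balance left after withdrawing amount.

    Preconditions:  amount > 0 and amount <= balance
    Postcondition:  result == balance - amount
    Invariant:      a balance is never negative
    """
    # Strict enforcement of the preconditions via exceptions.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("amount must not exceed balance")

    new_balance = balance - amount

    # Invariant check: cheap, and it documents the promise in code.
    assert new_balance >= 0, "balance went negative"
    return new_balance


print(withdraw(100, 30))  # 70
```

Deleting the checks and keeping only the docstring gives the loose form: the contract is still written down, but nothing enforces it.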

Dead Programs Tell No Lies

The principle here is that if a program is going to fail, have it do so on the first error. I've often worked on programs that throw errors but keep on chugging. The fail-fast idea would state that rather than powering on, the program should crash. This would help with debugging because then you would know when the problem happened and, more importantly, prevent the program from continuing to run but outputting the wrong results.
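The difference can be sketched with a small example (Python for illustration; the age-parsing task and both function names are hypothetical). The lenient version swallows the error and keeps chugging, producing a plausible but wrong result; the strict version stops at the first bad value:

```python
def parse_ages_lenient(values):
    """Keeps going on bad input, quietly substituting 0."""
    ages = []
    for v in values:
        try:
            ages.append(int(v))
        except ValueError:
            ages.append(0)  # hides the problem; later results are wrong
    return ages


def parse_ages_strict(values):
    """Fails fast: int() raises ValueError on the first bad value."""
    return [int(v) for v in values]


data = ["30", "forty", "50"]

# Lenient: no error, but the average is computed from [30, 0, 50].
print(sum(parse_ages_lenient(data)) / len(data))

# Strict: the program stops where the problem actually is.
try:
    parse_ages_strict(data)
except ValueError as err:
    print(f"stopped early: {err}")
```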

Assertive Programming

The gist of this and the next few sections is that no matter how sure you are an error can't happen, things go weird in the real world. It's a judgment call, but if they don't impact performance, leave assertions in the production code.

The book warns about cases where an assertion might be saving your code or breaking it. An example of the former is an assertion that stops further code from running; the example given of the latter is an assertion that prevents garbage clean-up.

Although the latter doesn't feel relevant to me because I don't foresee myself ever working in a language without automatic garbage collection, the book does go on to discuss another area that does concern me, memory leaks.

I have, on projects, found myself passing large amounts of data from one component to another without ever considering when one of them should be unmounted. As earlier in this book, where I found myself questioning my methods of accessing that data (directly vs through a handler), I now find myself rethinking memory usage. Perhaps both problems could be solved through some better data handling.

Metaprogramming

Separating specifics from the abstract.

As projects grow more complex, I feel like I'm spending more and more time writing config files for build processes. Without knowing anything about the internal workings of the modules, I string together chains of dependencies, customized to each project through JSON config files.

Before reading this, I hadn't thought of how I might apply this to my code practice, but I can see value in this. I think one roadblock to following this approach is that we often think the details of an application are finalized when we start working on it. The truth is, there are always a bunch of reshuffles at the end that could be simplified by separating abstract and specific details. Outside of programming, this applies to web design as well where I try to separate components from layout to increase usability.
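As a rough sketch of what separating the specific from the abstract could look like in application code (Python for illustration; the report settings, keys, and render_report function are all hypothetical), the details live in a JSON config while the code only knows how to render:

```python
import json

# Hypothetical settings; in practice this would live in its own file.
CONFIG_TEXT = """
{"title": "Sales Report", "columns": ["region", "total"], "max_rows": 2}
"""


def render_report(rows, config):
    """Generic code: it knows how to render, not what to render."""
    lines = [config["title"]]
    for row in rows[: config["max_rows"]]:
        lines.append(" | ".join(str(row[col]) for col in config["columns"]))
    return "\n".join(lines)


config = json.loads(CONFIG_TEXT)
rows = [
    {"region": "North", "total": 120, "owner": "A"},
    {"region": "South", "total": 95, "owner": "B"},
    {"region": "East", "total": 80, "owner": "C"},
]
print(render_report(rows, config))
```

A last-minute reshuffle, such as adding the owner column or changing the title, then becomes an edit to the config rather than to the code.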

Temporal Coupling

Things don't always happen in the order you expect. JavaScript (which is my language of choice) tends to be fairly linear, but not always. Timing-based issues tend to happen when you make external calls. To solve this, there is a laborious process by which you can wrap functions in 'promises' so that callbacks run only after the call has returned. Another timing-based error comes from dependency file loading. The latter case has been mitigated to some degree by compilers, which bundle all dependencies together so you can be sure they're available at runtime. I have also experienced race-condition bugs when writing files, which could have been avoided through a token system, but who would have thought processes that take milliseconds to execute would be called by different users at the same time? This just demonstrates that if it can go wrong, it eventually will.
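The token idea for file writes can be sketched as follows (in Python rather than JavaScript, purely for illustration; the counter file and function name are hypothetical, and a lock stands in for the token). Only the caller holding the token may read and rewrite the file, so simultaneous callers cannot overwrite each other's update:

```python
import os
import tempfile
import threading

write_token = threading.Lock()  # only the holder may touch the file


def increment_counter(path):
    # Read-modify-write is several steps; without the lock, two callers
    # can read the same value and one of the updates is lost.
    with write_token:
        with open(path) as f:
            value = int(f.read())
        with open(path, "w") as f:
            f.write(str(value + 1))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "counter.txt")
    with open(path, "w") as f:
        f.write("0")

    # Simulate 20 "users" calling at roughly the same time.
    threads = [
        threading.Thread(target=increment_counter, args=(path,))
        for _ in range(20)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    with open(path) as f:
        final_value = int(f.read())

print(final_value)  # 20
```

A threading.Lock only coordinates threads inside one process. If the callers are separate processes or servers, the same idea would need an OS-level file lock or an external lock service instead.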

It’s Just a View

Model. The abstract data model representing the target object. The model has no direct knowledge of any views or controllers.
View. A way to interpret the model. It subscribes to changes in the model and logical events from the controller.
Controller. A way to control the view and provide the model with new data. It publishes events to both the model and the view.

I have found it difficult in the past to know exactly where to draw the line between model and view. Case in point, I was working on data visualization in JavaScript. The model is the data, the view is the visualization, and the controller handles loading data into the model and triggering the views. Different views required the data to be structured differently, so I included the data reshaping in the view. But then, as the project grew, I found multiple views repeating the same data manipulation process. I probably should have refactored at that point and included the different data structures as output options from the model.
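A minimal sketch of that refactor (Python for illustration; the SalesModel class, its methods, and the text-based chart are all hypothetical). The model offers its data in each shape the views need, so the reshaping is written once and the views only draw:

```python
class SalesModel:
    """Holds the data and offers it in the shapes different views need."""

    def __init__(self, records):
        self._records = list(records)

    def as_rows(self):
        # Shape for a table view: a list of (region, amount) tuples.
        return [(r["region"], r["amount"]) for r in self._records]

    def totals_by_region(self):
        # Shape for a bar-chart view: one total per region.
        totals = {}
        for r in self._records:
            totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
        return totals


def bar_chart_view(model):
    # The view only draws; it does not reshape the data itself.
    return "\n".join(
        f"{region}: {'#' * total}"
        for region, total in model.totals_by_region().items()
    )


model = SalesModel([
    {"region": "North", "amount": 3},
    {"region": "South", "amount": 2},
    {"region": "North", "amount": 1},
])
print(bar_chart_view(model))
```

Any second view that needs per-region totals can call the same model method instead of repeating the aggregation.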

Blackboards

This section outlines a concept for keeping a system independent of asynchronous events. Each object exists independently in a state of incompleteness. As data comes into an object, new processes may be triggered. The logical end of this is that when an object is done, it triggers an alert notification to the controller.

While You Are Coding

Programming by Coincidence

When I first started programming, a lot of it (most) was through blindly poking in the dark. I would find someone else's code and, through trial and error, make small changes until it did what I wanted – usually never understanding how it worked. I still tend to do that when I'm learning something new, only now I'd like to think I'm doing that as a discovery process.

Algorithm Speed

Big O notation:

Simple loops tend to be O(n)
Nested loops tend to be O(n^2)
Binary chop tends to be O(log(n))
Divide and conquer tends to be O(n log(n))
Combinatoric algorithms are bad

Learning about Big O notation and algorithm speed was a big "aha!" moment for me. I can remember running home to rewrite a recursive loop search function to only loop once and use a temporary dictionary list to eliminate the need for a second loop iteration on each result. My app sped up by tens of seconds and it wasn't even dealing with a particularly large set of data.
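The general shape of that rewrite can be sketched like this (Python for illustration; this is not the original code, and the orders/customers data and function names are hypothetical). The first version rescans one list for every item in the other; the second builds a dictionary once and then makes a single pass:

```python
def match_orders_nested(orders, customers):
    # O(n * m): scans the customer list again for every order.
    result = []
    for order in orders:
        for customer in customers:
            if customer["id"] == order["customer_id"]:
                result.append((order["id"], customer["name"]))
                break
    return result


def match_orders_indexed(orders, customers):
    # O(n + m): build a lookup dictionary once, then one pass over orders.
    # Assumes customer ids are unique.
    by_id = {c["id"]: c["name"] for c in customers}
    return [
        (o["id"], by_id[o["customer_id"]])
        for o in orders
        if o["customer_id"] in by_id
    ]


customers = [{"id": "a", "name": "Ada"}, {"id": "b", "name": "Bob"}]
orders = [
    {"id": 1, "customer_id": "a"},
    {"id": 2, "customer_id": "b"},
    {"id": 3, "customer_id": "z"},  # no matching customer
]
print(match_orders_indexed(orders, customers))  # [(1, 'Ada'), (2, 'Bob')]
```

Both functions return the same result here; the difference only shows up in running time as the lists grow.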

Although I often forget to think about algorithm speed when I'm programming, when things go slow, considering big O times is in my debugging arsenal and I'm always happy to refactor.

Refactoring

When to refactor:

You've discovered Duplication (DRY)
You've discovered Non-orthogonal design
Outdated knowledge: requirements change or you just know more about the problem
Performance - moving functionality from one area of the app to another

I love refactoring; it's my favorite part of programming, largely because of point number three. Given enough time (usually about a year) there's either a better new technology available, or I've improved as a developer. I'm always seeing ways I can redo my previous work and make it cleaner, shorter, more reusable. It's hard to justify to a client going back into something that's already done rather than starting something new, but for personal projects, there's no end to the joy of a fresh refactor.

Code That’s Easy to Test

We all know testing is important. Every class in school stresses testing and places excessive grading weight on it. In the real world though, I don't test. This chapter raises some interesting ideas for keeping component tests within the component. Within the component, they'd be seen when you're making changes and remind you to run whatever external file strings them together. My todo after this write-up: google "react unit testing".
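One concrete form of keeping tests inside the component is Python's standard doctest module, sketched here with a hypothetical slugify function. The examples sit in the docstring, right next to the code they check, and doctest.testmod() runs them:

```python
import doctest


def slugify(title):
    """Turn a title into a URL-friendly slug.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Tracer   Code ")
    'tracer-code'
    """
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Runs every docstring example in this file and reports failures.
    print(doctest.testmod())
```

Because the examples are visible whenever the function is edited, they double as documentation and as a reminder to rerun the tests.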

Another interesting idea is that code will eventually be tested – if not by you, then by users.

Before the Project

Digging for Requirements

(Rather than gathering requirements) Users rarely know what they want, no matter what they say. What they do know is that something's not working for them, and it's the designer/developer's job to interpret what they're saying and find a solution to that problem. Further, management often has requirements that aren't in line with users' needs at all. User-centric design should be at the center of any project plan.

This section offers some good advice on keeping the requirement documentation abstract enough that it's not mixing opinionated solutions into the requirement outline. This allows for more flexible problem solving further down the line.

Two very helpful ideas suggested are:

Documenting the rationale behind a requirement as well as the requirement itself. This will help with future decisions, because just like with code documentation, what seems obvious today may be a mystery just a couple weeks from now.
Documenting feature creep with who added a requirement and why.

Solving Impossible Puzzles

The deeper I dig in to solve a problem, the more entrenched I become. Invariably, the solution comes as the result of one of two things:

Rubber ducking it. Though I have never actually talked to a rubber duck, the process of vocalizing the problem usually exposes the gaps in my thinking.
Taking a walk. Often, if I take a ten-minute break with a change of scenery, even if I'm not actively thinking about the problem, a new approach to the solution will pop into my head on its own.

Not Until You’re Ready & The Specification Trap

"Software development is still not a science. Let your instincts contribute to your performance."

Being too rigid in specifications removes the art of discovery from the coding practice. It's interesting to hear the authors refer to coding as hacking out code; they describe it more as an art than a science. That's how I code, but I figured that's because I was still learning the process. I'd always imagined that the better one got at programming, the less "hacking" would be going on. I thought the end goal was to be able to robotically meet specification requirements.

Circles and Arrows

Closing out the programming-is-an-art argument, a case is raised against formal development project management models. Though some of the terms are passé now, having been replaced with kanban and scrum as the models du jour, the argument holds. Following a project management methodology too strictly can hamper the creative development process and create knowledge silos. Most interestingly, a 1999 study showed that adopting these processes hurts productivity due to the time required to learn the new framework.

This matches my experience working in any productivity or project management tool. It invariably is too rigid to support the specifics of my use-case and I end up dropping it because it's more effort than it's worth. That's not to say learning these methodologies is without merit – there are insights to be gleaned and used in your ad hoc practice.

Pragmatic Projects

Many of the topics discussed previously apply to the team as they did to the individual.

No Broken Windows: The whole team should be committed to fighting entropy
Communicate: Teams need to have a clear, consistent voice
DRY: members of the team shouldn't be doing duplicate work
Orthogonality: Decisions of one team shouldn't have effects across the organization.

Ubiquitous Automation

Some automation has been built into modern web development since this book was written. If you're doing it properly, version control, automatic builds, and deployment should be integrated. Modern web hosts can watch your project's git repo for new source code commits and automatically run build scripts and deploy from the build folder.

Potentially, one could include in the build scripts some automatic tests which would block the build if they fail, and auto-comment extractors to build the docs. I'm sure this must have been done and is documented somewhere.

Ruthless Testing

Unit testing: testing individual components
Integration testing: system testing of integrated components
Validation and verification: beyond "does this work?", does this solve the problem posed?
Resource exhaustion, errors, and recovery: real world, real data testing
Performance testing: From my experience, stress testing has been the most overlooked test. I have worked on multiple projects that appeared to work fine but failed at scale.
Usability testing: Usability needs to be done before, after, and during a project. It should be part of requirements gathering, making informed development decisions, and part of the final QA before releasing an app.

Tips for testing

Regression testing

Rerun the previous tests and compare against their earlier results to see if anything changed.

Test Data

Tests should be run on real-world data as well as synthetic data. Artificial data allows you to test against edge cases, specific scenarios, as well as larger data sets than may be available currently. Testing against real data covers cases you may not have thought of.

When to Test

Tests are often saved until the end when the project is up against a deadline. This will result in incomplete tests. Tests should be run throughout, and before committing code to the repo.

Tightening the Net

If you find a bug that's slipped through the cracks into production, fix it, and write a test for it. This should be easy because you've already discovered it, and the test would prevent that bug from ever slipping into the system again.
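As a sketch of what that looks like in practice (Python for illustration; the average function and the empty-list bug are a made-up scenario), the fix and the test that pins it down go in together:

```python
def average(values):
    """Return the mean of values, or 0.0 for an empty list."""
    # Hypothetical bug that reached production: an empty list used to
    # cause a ZeroDivisionError here. The guard below is the fix.
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list_is_zero():
    # Reproduces the original bug report; fails if the guard is removed.
    assert average([]) == 0.0


def test_average_of_typical_values():
    assert average([2, 4, 6]) == 4.0


test_average_of_empty_list_is_zero()
test_average_of_typical_values()
print("regression tests passed")
```

The tests are called directly here to keep the sketch self-contained; in a real project they would more likely live in a test file picked up by a test runner.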

It’s All Writing

There are two types of documentation:

Internal documentation, which includes source code comments, decision rationales, development procedures, etc.
External documentation which is user instructions.

Internal documentation

The right amount of comment to code is a delicate balance. Some things to keep out of comments are:

Stay away from redundant comments like variable types in a typed language.
Don't document info that will become out of date or could be automated, like the filename or a list of functions that use this function.
Follow the DRY principles and not include code descriptions that could

That said, code can be too minimally documented as well. It's my opinion that the commenting rubric for this course is overly restrictive. Documenting decisions should have a place in code comments, lest you destine yourself to forever having to relearn the same decisions. Also, starting every comment with "returns" becomes redundant and makes for poor readability.

External documentation

On a large project, this could be a user manual or help docs, but on a smaller project, this could be as simple as the readme in a GitHub repo. If the code is meant to be used by anyone beyond yourself, someone will need some instructions. Being on the user end of a lot of code, I very much appreciate solid documentation. Let's use something as simple as a node module as an example – I don't need to know the internal workings, but I do need to know how to use it, how to configure it, and what it outputs.

Great Expectations

Not listening to the user is a recipe for failure. As was discussed earlier, this doesn't just mean having the user tell you what they want and then building it, but listening to the user's request, parsing it to discover what they actually want, and only then delivering.

As the book puts it, "...the success of a project is measured by how well it meets the expectations of its users. A project that falls below their expectations is deemed a failure, no matter how good the deliverable is in absolute terms. However, like the parent of the child expecting the cheap doll, go too far and you’ll be a failure, too."

This is an interesting example of thinking we know better than the user. Something of which I am often guilty.

Pride and Prejudice

Pride in ownership can be a valuable motivator. When I first started working on shared projects I used to sign my name to everything. With more experience, I lost the feeling of being 'special' for having written code and stopped signing my name. I thought of this as an act of humility.

I now see another value in signing code in a project. It encourages others to sign their code and take pride in their work. It also fosters an atmosphere of respect in a project. It's not about code ownership, but I know when I'm going to edit someone else's code, I let them know first, and I appreciate the same courtesy in return.