Thursday, April 14, 2022

Fix Problems, Not Symptoms


When software fails, you have an obligation to fully understand the cause of the failure, not just to do a cursory analysis and apply a quick fix to what you think is the cause.

Suppose you are trying to trace the cause of a software failure. You have noticed that every time a specific component transmits a value, it is exactly twice the desired value. A quick and dirty solution is to divide the generated value by two just before it is transmitted. This solution is inappropriate because 1) it may not work for all cases, and 2) it leaves the program with what is essentially two errors that compensate for each other, rendering the program virtually unmaintainable in the future. An even worse quick and dirty solution is for the recipient to divide the value it receives by two before using it. This solution has all the problems of the first one, plus it causes all future components that invoke the faulty component to receive the wrong value. The correct solution is to examine the program, determine why the value is consistently doubled, and then fix that.
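Here is a minimal sketch of the difference, assuming a hypothetical component that is supposed to transmit a one-way latency but mistakenly sends the round-trip time. The function names and numbers are invented for illustration and are not taken from any real system.

```python
def transmit_buggy(out_ms: float, back_ms: float) -> float:
    """Hypothetical faulty component: sends the round trip, not the one-way time."""
    return out_ms + back_ms


def transmit_symptom_fix(out_ms: float, back_ms: float) -> float:
    """Quick and dirty: halve the value just before it is transmitted."""
    return transmit_buggy(out_ms, back_ms) / 2


def transmit_root_cause_fix(out_ms: float, back_ms: float) -> float:
    """Root-cause fix: transmit the quantity that was actually wanted."""
    return out_ms


# On a symmetric link the bug looks like "exactly twice the desired value",
# so the symptom fix appears to work.
symmetric = (10.0, 10.0)
# On an asymmetric link the halved value is still wrong.
asymmetric = (10.0, 30.0)

for out_ms, back_ms in (symmetric, asymmetric):
    print(
        "buggy:", transmit_buggy(out_ms, back_ms),
        "symptom fix:", transmit_symptom_fix(out_ms, back_ms),
        "root-cause fix:", transmit_root_cause_fix(out_ms, back_ms),
    )
```

The halving patch only matches the desired value while both directions happen to be equal, which is the "may not work for all cases" problem in miniature.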

Unfortunately, you are going to deal with software that already contains both of the bad fixes described above. You'll know it because your coworkers will warn you that making unrequested fixes might break other things. This is why it is so important to report issues and get sign-off from everyone involved before changing code.

Reference:

McConnell, S., Code Complete, Redmond, WA: Microsoft Press, 1993.

Tuesday, March 22, 2022

Principles of Distributed System Design

Three garage bays to represent a distributed system.

Every day, software engineers face the task of designing new systems or maintaining existing ones. Whether those systems need to be distributed for performance or for reliability hardly matters. Either way, distributed system design is easier to reason about when it is broken into a limited number of principles, so that the tradeoffs and costs can be adequately assessed.

Below are 10 principles of distributed system design that I think do a good job of summarizing and separating the problem. These are the principles that Amazon used when designing their S3 service (see reference at bottom).

▸ Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.

▸ Asynchrony: The system makes progress under all circumstances.

▸ Autonomy: The system is designed such that individual components can make decisions based on local information.

▸ Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.

▸ Controlled concurrency: Operations are designed such that no or limited concurrency control is required.

▸ Failure tolerant: The system considers the failure of components to be a normal mode of operation and continues operation with no or minimal interruption.

▸ Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.

▸ Decompose into small, well-understood building blocks: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.

▸ Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.

▸ Simplicity: The system should be made as simple as possible, but no simpler.
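To make a couple of these principles concrete, here is a small sketch of "Failure tolerant" and "Symmetry" working together. It is my own toy illustration, not how S3 is implemented; the Replica class, the read function, and the sample data are all hypothetical.

```python
class Replica:
    """Hypothetical storage node; every replica offers the same interface."""

    def __init__(self, name: str, data: dict, up: bool = True) -> None:
        self.name = name
        self.data = data
        self.up = up

    def get(self, key: str) -> str:
        if not self.up:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.data[key]


def read(replicas: list, key: str) -> str:
    """Try each replica in turn; a failed node is skipped, not treated as fatal."""
    for replica in replicas:
        try:
            return replica.get(key)
        except ConnectionError:
            continue  # a down node is a normal mode of operation
    raise RuntimeError("no replica could serve the request")


data = {"photo.jpg": "bytes..."}
cluster = [Replica("a", data, up=False), Replica("b", data), Replica("c", data)]
print(read(cluster, "photo.jpg"))
```

Because the nodes are interchangeable, the caller needs no node-specific logic: it simply moves on to the next replica when one is down.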


Reference:

Amazon Web Services Launches

Saturday, March 05, 2022

If it ain't broke, don't fix it

A photo of the electronics inside of a microwave.

Of course, the advice of "if it ain't broke, don't fix it" is applicable to many aspects of life, but it is particularly applicable to software. By its very name, software is considered malleable and easily modified. Don't be fooled into thinking that it is easy either to see or to repair a "break" in software.

Suppose you are maintaining a system. You are examining the source code of a component. You are either trying to enhance it or seeking the cause of an error. While examining it, you detect what you believe is another error. Do not try to "repair" it. The probability is very high that you will introduce an error, not fix one. Instead, file a change request. Hopefully, the configuration control and associated technical reviews will determine if it is an error and what priority its repair should be given.


Reference:
Reagan, R., as quoted in Bentley, J., More Programming Pearls, Reading, MA: Addison-Wesley, 1988.

Tuesday, March 01, 2022

Validation and Verification


Large software developments need as many checks and balances as possible to ensure a quality product. One proven technique is the use of an organization independent of the development team to conduct validation and verification (V&V). Verification is the process of examining each intermediate product of software development to ensure that it conforms to the previous product. For example, verification ensures that software requirements meet system requirements, that high-level software design satisfies all the software requirements (and none other), that algorithms satisfy the component's external specification, that code implements the algorithms, and so on. Validation is the process of examining each intermediate product of software development to ensure that it satisfies the requirements.

You can think of V&V as a solution to the children's game of telephone. In telephone, a group of children form a chain and a specific oral message is whispered down the line. At the end, the last child tells what he or she heard, and it is rarely the same as the initial message. Verification would cause each child to ask the previous child, "Did you say x?" Validation would cause each child to ask the first child, "Did you say x?"

On a project, V&V should be planned early. It can be documented in the quality assurance plan or it can exist in a separate V&V plan. In either case, its procedures, players, actions, and results should be approved at roughly the same time the software requirements specification is approved.


Reference:

Wallace, D. and Fujii, R., "Software Verification and Validation: An Overview," IEEE Software, May 1989.

Monday, February 28, 2022

BIOS Update

The day before going on a week-long work trip, I updated my laptop's BIOS. This image is what came up on restart. Luckily, it wasn't an issue, but I did start to sweat a little when I saw that unexpected message.

Sunday, February 13, 2022

Twitter Card Meta Tags

My girlfriend likes to say that I'm "such a fanboy" of Mary Roach. Maybe so. I was going to make a Twitter post referencing Mary's website and I noticed that there was no twitter card! 🙀

The Twitter Card Validator showing that maryroach.net doesn't have any twitter meta tags.

So, I'm going to use my tech skills and Mary Roach's website to [hopefully] do some good and provide a walkthrough on making a better website. Yeah, yeah, I know what you are thinking. Someone using blogspot can't talk about making a better website. My response to that is that it is an example of finding balance between cost, effort/time, and quality - the holy grail in software engineering.

I love the look of Mary's website. Simple, clean design with easy to read source code. But I've noticed a couple things that I'm confident that she'll want to address:

  • No twitter card meta tags.
  • No favicon.

Twitter Card Meta Tags

Mary has been very active on Twitter lately. Or maybe she was always active and I've only recently noticed. Either way, it is clearly a social media platform that she cares about, so it would be really nice if the appropriate twitter:card displayed whenever a tweet includes her website URL.

To do this Mary will need to ask her website company, Coconut Moon, to add the following meta tags to each of her website pages:
  • <meta name="twitter:card" content="summary_large_image">
  • <meta name="twitter:creator" content="@mary_roach">
  • <meta name="twitter:description" content="Mary Roach, Author of Fuzz, Grunt, Packing for Mars, Stiff, Spook and Bonk.">
    • The content here should be different on every webpage. Ideally, it would be different from the meta description tag too. BTW, Mary, you're missing that tag too and you'll want it to improve your SEO score/ranking. This tag is an opportunity to make custom descriptions just for Twitter.
  • <meta name="twitter:title" content="Mary Roach, Author of Fuzz, Grunt, Packing for Mars, Stiff, Spook and Bonk">
    • Again, the title should be different on every webpage. Side note: Mary, I'm so sad to see that you didn't use an Oxford comma in the title of your index page... I've read Spook and I've read Bonk, but I have yet to find a copy of "Spook and Bonk." Before you say anything, yes, I have heard "Oxford Comma" by Vampire Weekend. I love their sound.


  • <meta name="twitter:image" content="https://maryroach.net/images/books/Fuzz_350.jpg">
    • I'm going to keep saying it - use a different image for every webpage, and if you are going to use an og:image tag for Facebook as well, then take the opportunity to make it different, make it special and specific to the social media platform. The users there will notice. Like a rockstar yelling out the name of a city before a concert.
  • <meta name="twitter:image:alt" content="Book cover for Fuzz: When Nature Breaks the Law. It has an arrow head style logo clearly intended to resemble the National Park Service logo with a mountain lion, a bear, and an elephant on it with pine trees behind them and birds above.">
    • I would love to feed you some story about how this content will help your website rank better. It will. But that is not why you should add it. You should do it because our world is tough enough for the blind, and if a blind person comes to your website then help them out. Mary, the alt text on your website's images is all one-word descriptions. You can do better. Please do better.
There are other tags but these are the ones that really matter. The rest can be found at https://developer.twitter.com/en/docs/twitter-for-websites/cards/overview/markup.

I almost forgot, and it is very useful: use the Twitter Card Validator to test your webpages.
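The validator is the authoritative check. As a rough local complement, the sketch below uses Python's standard html.parser module to report which of the tags listed above are missing from a page's HTML. The class and function names are my own, and the list of expected tags simply mirrors this post rather than any official requirement.

```python
from html.parser import HTMLParser

# The tags recommended in this post (an assumption, not an official list).
EXPECTED_TAGS = (
    "twitter:card",
    "twitter:creator",
    "twitter:description",
    "twitter:title",
    "twitter:image",
    "twitter:image:alt",
)


class TwitterMetaParser(HTMLParser):
    """Collects the content of every <meta name="twitter:..."> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("twitter:"):
            self.found[name] = attributes.get("content") or ""


def missing_twitter_tags(html: str) -> list:
    """Return the expected tags that are absent or have empty content."""
    parser = TwitterMetaParser()
    parser.feed(html)
    return [name for name in EXPECTED_TAGS if not parser.found.get(name)]


sample = """
<html><head>
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:creator" content="@mary_roach">
</head><body></body></html>
"""
print(missing_twitter_tags(sample))
```

Feed it the saved source of any page and it lists what still needs to be added before you head over to the validator.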

Favicon

A favicon is a small icon that serves as branding for your website. Its main purpose is to help visitors locate your page when they have multiple tabs open. 

Creating a favicon is a small but important step to setting up a website. It adds legitimacy to your site and helps boost your online branding as well as trust from users. Favicons are an immediate visual marker for the website which enables easy and quick identification for users.
  • <link rel="icon" href="/favicon.png" type="image/png">
    • This is one case where you should use the same image for your entire site.

Lighthouse

To go a bit further, I also recommend that anyone reading this to improve their website use Google Lighthouse to analyze it. Do it for both desktop and mobile. It'll give a high-level score for each fundamental area of your website and detailed recommendations to improve those scores.

Google Lighthouse report of maryroach.net. Performance, accessibility, and SEO are above 90 but best practices was 62.


Wednesday, January 26, 2022

Keep Track of Every Change

Sunset on Missouri river at Hermann, MO with court house on the left and bridge spanning the river on the right.

Every change has the potential to cause problems. Three common problems are:

  1. The change did not fix the problem for which it was intended.
  2. The change fixed the problem but caused others.
  3. At a future date, the change is noticed and nobody can figure out why it was made or by whom.
In all three cases the prevention is to track every change. Tracking entails recording:
  • The original request for change. This might be a customer's request for a new feature, a customer's complaint about a malfunction, a developer's detection of a problem, or a developer's desire to add a new feature.
  • The approval process used to approve the change (who, when, why, in what release).
  • The changes to all intermediate products (who, what, when).
  • Appropriate cross-references among the change request, change approval, and changes themselves.
Such an audit trail enables you to easily back out, redo, and/or understand changes.
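As a minimal sketch of what such an audit trail might record, here is one possible data model. The class names, fields, and sample values are hypothetical; a real project would normally get this from its issue tracker and version control rather than from hand-rolled code.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ChangeRequest:
    request_id: str
    origin: str        # e.g. customer request, complaint, developer finding
    description: str


@dataclass
class Approval:
    request_id: str    # cross-reference to the change request
    approved_by: str
    approved_on: date
    reason: str
    release: str


@dataclass
class Change:
    request_id: str    # cross-reference to the change request
    product: str       # e.g. requirements, design, code, tests
    author: str
    made_on: date
    summary: str


@dataclass
class AuditTrail:
    requests: dict = field(default_factory=dict)
    approvals: dict = field(default_factory=dict)
    changes: list = field(default_factory=list)

    def history(self, request_id: str):
        """Return the request, its approval, and every change made for it."""
        related = [c for c in self.changes if c.request_id == request_id]
        return self.requests.get(request_id), self.approvals.get(request_id), related


trail = AuditTrail()
trail.requests["CR-1"] = ChangeRequest("CR-1", "customer complaint", "Total is doubled")
trail.approvals["CR-1"] = Approval(
    "CR-1", "product manager", date(2022, 1, 20), "affects invoices", "2.4.1"
)
trail.changes.append(
    Change("CR-1", "code", "dev-a", date(2022, 1, 24), "Remove duplicate accumulation")
)
request, approval, changes = trail.history("CR-1")
print(request.description, approval.release, len(changes))
```

The shared request_id is the cross-reference: given any one record, you can find who asked for the change, who approved it, and what was touched.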


Reference:
Bersoff, E., Henderson, V., and Siegel, S., Software Configuration Management, Englewood Cliffs, NJ: Prentice Hall, 1980.

Wednesday, December 22, 2021

Control Baselines

Software Configuration Management

It is the responsibility of software configuration management (SCM) to hold the agreed-upon specifications and regulate changes to them. You might not have a board in charge of SCM. I once worked as a technical product manager (TPM) and controlled the backlog for several teams. Regardless of the name, there is someone or a group that sets priorities for tasks yet to be completed.

While repairing or enhancing a software component, a software engineer will occasionally discover something else that can be changed, perhaps to fix a yet unreported bug or to add some quick new feature. This kind of uncontrolled (and untracked) change is intolerable. SCM should avoid incorporating such changes into the new baseline. The correct procedure is for the software engineer to make a change request (CR). This CR is then processed along with the others from development, marketing, testing, and the customers by a configuration control board, which prioritizes and schedules the change requests. Only then can the engineer be allowed to make the change and only then can the SCM accept the change.
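The gatekeeping described above can be sketched in a few lines. This is a toy model under my own assumptions: ControlBoard, Baseline, and the CR identifiers are hypothetical names, and a real team would enforce the same rule through its tracker and code review settings.

```python
class ControlBoard:
    """Hypothetical stand-in for the configuration control board."""

    def __init__(self) -> None:
        self._submitted = {}
        self._approved = set()

    def submit(self, cr_id: str, description: str) -> None:
        self._submitted[cr_id] = description

    def approve(self, cr_id: str) -> None:
        if cr_id not in self._submitted:
            raise KeyError(f"{cr_id} was never submitted")
        self._approved.add(cr_id)

    def is_approved(self, cr_id: str) -> bool:
        return cr_id in self._approved


class Baseline:
    """Accepts a change only when it is tied to an approved change request."""

    def __init__(self, board: ControlBoard) -> None:
        self.board = board
        self.accepted = []

    def accept(self, cr_id, change: str) -> None:
        if cr_id is None or not self.board.is_approved(cr_id):
            raise PermissionError("change is not tied to an approved change request")
        self.accepted.append((cr_id, change))


board = ControlBoard()
baseline = Baseline(board)

try:
    baseline.accept(None, "drive-by fix found while working on something else")
except PermissionError as error:
    print("rejected:", error)

board.submit("CR-7", "Fix unreported rounding bug")
board.approve("CR-7")
baseline.accept("CR-7", "Correct rounding in invoice totals")
print(baseline.accepted)
```

The engineer's discovery is not lost; it just enters the baseline through the same queue as every other request.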

Configuration control board is a bit of an old, formal name. I think most companies these days just use the representatives of the stakeholders - product managers, engineering directors, and architects.

So, this practice is immensely frustrating to the go-getter developers who see problems and want to fix them. But it is immensely practical: one man's bug is another man's expected functionality, and changing things breaks expectations, which leads to upset customers.


Reference:

Bersoff, E., Henderson, V., and Siegel, S., Software Configuration Management, Englewood Cliffs, NJ: Prentice Hall, 1980.

Saturday, November 13, 2021

Independence Hall

Where America started...

Independence Hall
Construction on the Pennsylvania State House,
now called Independence Hall, began in 1732
and was completed 21 years later in 1753.

Katy and I had the opportunity to visit Philadelphia, PA in April 2017.

Located in Philadelphia's Center City, the area between Fifth and Sixth streets, and between Market and Chestnut, is home to the spirit of US history. Independence National Historical Park covers an area of 45 acres with approximately twenty buildings, including what was once called the Pennsylvania State House and which we now call Independence Hall. The Declaration of Independence was planned, discussed, and signed here.

Independence Hall Tower

The stately two-story redbrick building has a steeple with a clock in it. It used to house the 2,080-pound Liberty Bell, which was rung on July 8, 1776, to announce the first public reading of the Declaration of Independence.

Liberty Bell

"Proclaim Liberty Throughout All the
Land 
Unto All the Inhabitants thereof"


The Liberty Bell has its own home on the park grounds with a nice view of Independence Hall behind it.



Monday, August 23, 2021

Rotate People Through Product Assurance

 

Utah Skyline

In many organizations, people are moved into product assurance teams as a first assignment or after they have demonstrated poor performance at engineering software. Product assurance, however, requires the same level of engineering quality and discipline as designing and coding. As an alternative, rotate the best engineering talent through the product assurance team. A good guideline might be that every excellent engineer spends six months in the product assurance organization every two or three years. The expectation of all such engineers is that they will make significant improvements to product assurance during their "visit." Such a policy must clearly state that the job rotation is a reward for excellent performance.


Reference:

Mendis, K., "Personnel Requirements to Make Software Quality Assurance Work," in Handbook of Software Quality Assurance, New York: Van Nostrand Reinhold, 1987.