Testing as Proof

April 27, 2009 by bhushit

Often I think that the ideal language for implementation of test automation control logic is Prolog!
Now, you’d say only professors talk of Prolog.
Let us see what I mean here.
Testing is after all proving that the system works or not.
So,
System works if component A works and Component B works and …
This is very similar to Prolog programming, isn’t it?
That means test automation scripts must be controlled by Prolog kind of an engine.
I know there are more details to be discussed. I am just sowing a seed here.

Some useful links

July 12, 2007 by bhushit

Array of function pointers is next best to virtual functions. Function pointers are less discussed and hence less used (rather than the other way).

Here is a very good tutorial on the topic.

Relational switches are my dream. Here is a discussion and Denis Ritchi’s comments on this concept.

Summary

April 21, 2007 by bhushit

Looking  back, I have covered most topics that I wanted to discuss so far. The presentation has not gone exactly in the way planned. However, this is the time to offer the summary. This summary is written with a refactory point of view.

1. There are two types of decisions. Existence and environment decisions AND value decisions.

2. Most bugs and inflexible designs arise because decisions are spread in the design and the code.

3.  Make sure all “not”s are substituted by “if-else”s. Break decision clauses into larger chain of conditions if needed.

4. Convert all “equality” value “if-else if-else” into switch-cases. Split all “inequality” value “if-else if-else” into explicit rule base and simplify the implementation.

5. Once rule base is separated, try to convert as many switch-cases into function pointers as possible. This is going to separate “what” from “how” and will keep errors limited to  “what” parts. If you reach this step, most of your program will function without bugs and maintenance will start becoming easy.

6. If you are allowed to take a step further, see that the function pointer array that was generated in previous statement is nothing but class heirarchy. Convert them to sub classes. At this point you should see a collapse in the “how” code size and a drastic reduction in number of bugs.

7. If you are still allowed to take a step further, you are at door steps of aspect oriented programming. At this moment, you will be able to see environment variables changing the behavior across classes.  This is the time to write cross-cutting concerns and eliminate further bugs and induce further versatality in the code

8. Make sure your next code starts with these ideas followed rigidly. Introduce these ideas as check marks in code review. [e.g. "All possible if's are converted into switches", "all possible switches are converted into function pointers", "all function pointers are converted into subclasses with inherited interface",  ...]

9. And yes, participate in language design issues and favor introduction of switch like control structures, associative arrays, late bindings, OO and AO – dependent on the context of the language.

My next series will be on testing algorithms. Until then, good bye!

“value if”s to “switch-case”s – ideas and issues

April 21, 2007 by bhushit

*****

Then then a lot of questions arise. The first one is: “Why should we have “switch-case” instead of “if”s?”
From earlier postings, we have made it clear that the flow of “switch-case” is much easier to understand than that of “if-elseif-else” for a number of reasons.

1. “Switch-case” announces its concern first unlike “if-else if-else” where parameters of decisions are unknown till you read each phrase

2. “Switch-case” has minimal and uniform logic unlike “if-else if-else” where unrestricted strings of logic contain a lot of surprise and inflexibility for the code maintainer

3. “Switch-case” organize ANDing and ORing as code flow itself and keep “apparent” logic simple unlike “if-else if-else” where flow of code through unrestricted strings of logic has to be reconstructed in the maintainer’s brain

4. “Switch-case” uses “default”, which can be expressed in the problem domain language easier than De Morgan Law firing “not”s

5. “Switch-case” logic substituting “value if”s usually point to higher chances of abstraction such as pushing the logic into the caller function and to higher cohesion

*****

The second question is: “Are all “if”s are convertible to “switch-case”s?”

The answer is obviously “Unfortunately, Not”. For example, how do you convert the following into a “switch-case” statement?

if (temperature == 0 ) {

printf ( “Kelvin rules”);

} else if (temperature < 237) {

printf (“Water is frozen”);

} else if (temperature < 337) {

printf (“Steam is liquid”);

} else {

printf (“Ice is vapor”);

}

The only way to introduce a switch-case statement here is to set some state variable somehow and then switch. If your library provider does not set up temperature range state, you have no choice but to implement “if-else if-else” chained rule base and then take a decision based on switch-case.

Interestingly, in this case, you could also eliminate the switch-case altogether and have rule base nicely detached from execution logic. For example,

/* in the physical chemist’s file */

enum state = {zero, ice, water, steam};

char *message[4] = { “Kelvin rules”, “Water is frozen”, “Steam is liquid”, “Ice is vapor” };

enum state my_state;

if (temperature == 0 ) { my_state = zero;

} else if (temperature < 237) {

my_state = ice;

} else if (temperature < 337) {

my_state = water;

} else {

my_state = steam;

}

/* In the plant engineer’s file */

enum state my_state;

printf (message[(int)my_state]);
Where is the saving? Try adding the fact that if the tempearture exceeds “Planck’s temperature”, the message should read “Planck rules”. You will realize that even with the same “if-else if-else”, the concerns are separated.

Now assume that in our language, we have a relational switch statement that can switch based on given entities, the physical chemist’s file will look like this:

/* in the physical chemist’s file */

enum state = {zero, ice, water, steam};

char *message[4] = { “Kelvin rules”, “Water is frozen”, “Steam is liquid”, “Ice is vapor” };

enum state my_state;

switch (temperature) {
case == 0 : my_state = zero;

case < 237: my_state = ice;

case < 337: my_state = water;

case < PLANCK_T: my_state = vapor;

default: my_state = planck;

}

Suddenly the code starts looking like a physical chemist’s handbook and we have not changed the plant engineer’s code just like the previous case!

Unfortunately, C does not support relational switch as stated above! AFAIK no other language support such a statement either.

That is why I love Tcl. The only language I know that lets user define his/her own control structures. Expect is perhaps the most well-known control structure defined later than its language.

Another example is from the design pattern book from Go4. [The code appears as a sample code for Singleton pattern. Copied here for review.]

MazeFactory* MazeFactory::Instance () {

if ( _instance == 0 ) {

const char* mazeStyle = getenv(“MAZESTYLE”);

if ( strcmp (mazeStyle, “bombed”) == 0 ) {

_instance = new BombedMazeFactory;

} else if ( strcmp (mazeStyle, “enchanted”) == 0) {

_instance = new EnchantedMazeFactory;

}
// … other possible subclasses

} else {

_instance = new MazeFactory;

}

return _instance;

}

A C programmer has to live with this implementation. However, Tcl programmer can split the rule and the implementation, leaving implementation to be eval’ed later.

# Rule base

set mazeConstructor(“bombed”) “BombedMazeFactory”

set mazeConstructor(“enchanted”) “EnchantedMazeFactory”

set mazeConstructor(“default”) “MazeFactory”

# Implementation

set _instance $mazeConstructor(“default”)

if { [ info exists [mazeConstructor([$env(MAZESTYLE)]) } {

    set _instance [ new $mazeConstructor($env(MAZESTYLE)) ]

}

Yet another comparison could be:

if ( hare.distance == FULL_DISTANCE )

    declareWinner(hare);

else if (tortoise.distance == FULL_DISTANCE )

    declareWinner(tortoise); 

This scenario arises when a number of variables are compared against a constant value. In that sense it is “variable switch” rather than traditional “value switch”.

If we could have such a variable switch, our code could look like:

variableSwitch (FULL_DISTANCE) {

case hare.distance: declareWinner(hare);

                                   break;

case tortoise.distance: declareWinner(tortoise);

                                   break;

}

Two types of “if”s and their ideal places

December 30, 2006 by bhushit

While I am writing this blog, I am also working full time. As I write the blog, my own ideas clarify against my experiences. This is one such evolutionary realization.

I am not sure whether I wrote earlier:

In ideal programming, decisions belong either to the top of the class hierarchy or at the leaf.

“if” statements come in two flavors: the “existential if” and the “value if”.

The existential if statements reflect hard realities of computation. “Whether the file pointer was NULL” or “Whether the malloc succeeded” or “Whether the password matched”. There is hardly anything one can do in the “else if” or “else” parts of this condition. These statements are the least attractive candidates to be pushed into a “switch-case” structures.

The value if statements are different. “Whether current temperature is within limits”, “Whether the passed parameter is an upper case letter”, “Whether this packet is of IPX protocol” and so on. Under normal circumstances, such statements also have a very strong “else if” or “else” parts. These are ideal candidates to convert into “switch-case” statements.

*****

As a matter of fact, a reasonably written C code will have 90% of the control statements as “value if” – and that is where the problem lies. A lot of these decisions are just type decisions - if the value of the variable a is x, the type must be y and z function needs to be called.

That is where OO starts scoring. In a good OO code, type decisions are minimal. The same situation would simply translate into: invoke z method of a. There is no decision to be taken!

This factor is often ignored when embedded systems programmers berate OO as slow because VFT lookup may slow down the program. However, we just now avoided a decision using OO and made the execution *faster*. In my experience, OO avoids a lot of decisions and hence compensates for the standard (and hence optimised) VFT lookup.

In one of the (scripting) OO codes I have written, only one existential if was needed per thousand lines of code – and no value if was ever needed in 20k line of code. The code ran 25% faster than its procedural counterpart. I am yet not factoring in the design flexibility OO brought with itself.

“if” – only if …

August 6, 2006 by bhushit

Our crusade against control structures is progressing. We have seen places how we can eliminate many of instances of “not”,”goto”,”else” and “else if”.

Can we eliminate the mother of all logic – branching itself – or “if”?

Interestingly, my observation is that in most cases when we take a decision, it is more because the way our data is laid out. Most “if”’s are avoidable!

No, I am not drunk.

—–

If you are following the blog in chronological fashion, by now you must be convinced that wherever a decision tests equivalence (or !=), it must be converted to a switch-case statement and not into an “if”.

—–

The simplest way of reducing “if”s is to attack the condition block and start reducing the complexity.

If you have a keen eye, you can see that

1. Just because someone chose not to write a function for the action block, all the conditions were forced into the condition block.

2. Again, in order to get all the parameters for those conditions, your current function has a large list of parameters.

3. The larger the list of parameter becomes, the more complicated gets the contract of the function. Who is going to do error checks on each parameter that comes in.

This sounds like perfect disaster recipe – and familiar, isn’t it?
How would converting the action block into a function help?

If it is an ORing

if (x == 3 || y == 5) {

/* code equivalent to foo(); */

}

Call to foo(); can be actually made where x becomes 3 AND y becomes 5.

In case of ANDing, it is similar logic.

if (x == 3 && y == 5) {

/* code equivalent to foo(); */

}

In many such cases, foo() could be called when x became 3 OR y became 5.

And yes, I am ignoring NOT. We have eliminated NOT.

—–

“Life is not so simple!” you will say.

Let us attack the problem from another angle.

A procedural programmer often uses “if”s to determine the type of the data. “If the first two bits are 01, the caller is my girlfriend else my mother-in-law.”

On the other hand, an OO programmer will eliminate this dilemma by putting proper class hierarchy and design pattern. Sub-typing takes care of most decision needs.

If you can subtype, you can not only avoid most “if”s but also most of void pointers, type indicating parameters, typedefs and ?:’s.

—–

Then there is a question of efficiency. Many of my friends argue that calling a function, more so a virtual function, is a wastage of time.

They fail to notice that functions with smaller signatures (and less contract) are likely to execute faster because need of communication through stack operations goes down significantly – and there is less code to check the contract. Even in case of RISCs where pushing on a stack is not so common, frames do rearrange data between registers.

As a matter of fact, after converting one of Tcl procedural applications into STOOOP code, we have seen 20% increase in code efficiency. My guess is that efficiency conscious C to C++ gain must be more impressive than implementation conscious Tcl to STOOOP gain.

“else if” – just in case … – and switches

July 30, 2006 by bhushit

Believe it or not, I have seen a code like this in one of the most reputable products:

if ( x == 0 ) {

… /* 50 lines of code */

} else if ( x == 1 ) {

… /* another two “page down” */

} …

When I asked the developer why he chose not to use switch cases instead, his answer was: “It has very unintutive syntax.”

What? Switches are perhaps the best control structures procedural programming could have. Just because most times you use “if”, you shouldn’t use them always!

Then there is OO crowd, which eliminates most switch structures.

Personally I favor “flat code” with minimal decisions – and that is why I love OO. Still I find there are cases where switches are necessary. I am going to talk about those conditions later. Currently let us focus on procedural programming.

The biggest advantage of switch is that it declares the focus of the code that follows. If I am not interested in the variable on which the code switches, I will simply skip it to the next matching paranthesis.

Yet another advantage of switch will be clear if we use our logical parallel notation |||. Assuming there is no “fall through”,
switch (x) {

case 0: foo[0]; ||| case 1: foo[1]; ||| …

}

Arranging the code this way will inspire you to think which code is common across the segments. You will be tempted to push that code out of the control structure and make code smaller and more maintainable.

[Please don't assume people always reduce the code to logical minimal. I have actually converted experienced programmers' code to switches and showed them the redundancy. When code becomes large, this is inevitable.]

Yet another interesting point is that once you put your code in ||| parallelism inside a switch, you often realize that you might not need a lot of logic at all.

Yet another advantage is that when you get a “null code fall through”, you are essentially coding for an “OR” condition. To “OR” challenged person like me, this is fantastic.

switch 5:

switch 6:

switch 7: foo();

I am sure that when x becomes any of 5, 6 or 7, the same code is going to be executed. That is graphically intutive. I don’t need to decipher this:

if ( x==5 || x==6 || x==7 ) foo();

Yet another advantage is also graphical. The “AND” condition is nothing but a switch within a switch. The code gets longer but the condition block becomes so small that it is very comprehensible.

Compare this:

if ( x==3 && y==4 && z==5) foo(); /* and fifteen such decisions in a tree */

with

switch (x) {

3: switch (y) {

4: switch (z) {

5: foo();

}

}

}

The code has gone “beyond horizon” but the comprehension has become idiot proof. The huge condition block has reduced to single condition at every step.

At last, the “NOT” is taken care. If a code has “not” logic, it will not exist in a block.

Still, switch structure in C are far from perfect. I will discuss that in a later posting.

“else” – minimize surprize

July 30, 2006 by bhushit

Have you ever felt irritated calling up an 800 number when the option you are often interested is at the last and the line does not recognize button press till it is done speaking.

The same is true for “else” statements. So often we see that we are interested in default behavior of the code – and it is available only at the end of 500 line if-else if-else chain.

if ( condition1 )

… /* take action responding to the condition */

else if ( condition2 )

… /* take action responding to the condition*/

else

foo(); /* default action */
Good code should not have last “else” part. Why? Because an unconditional “else” – “for every other case” – often spells unthought out situations. Such situations turn Sphinx and pose riddles when “else” part of the code needs to be further split for feature additions. Also, “else” fires at most unopporune moments when none of above conditions were met and you were truly clueless about “else”’s side effects.

Instead, I suggest eliminating “else”. This is how.

bar (); /* prepare for default behavior – set flags to describe the situation */

if ( condition1 )

… /* set flags to describe the situation */

else if ( condition2 )

/* set flags to describe the situation */

/* code to handle the conditions set by above blocks */
And the code inside conditions is modified in such a way that they signal a common code segment following the block rather than taking any direct action.

Ideally the action block that follows the situation is nothing but a function pointer, executing a piece of code fixed by the decisions.

This approach has two parts – modifications in decision as to raise only the flags and coming up with an action block that caters to all conditions.

There may be a doubt that when one of “non else” conditions could be true, we are executing more than necessary – first assigning defaults and then assigning condition specific code.

However, if you go thorugh the above approach carefully, it is not too much of execution overload as all you are doing is setting appropriate flag.

The second problem is with too long an “if-else if” chain, which might in turn result into huge function table. I acknowledge that a large array of function pointers might be difficult to debug for those who don’t like function pointers.

However, we should also consider that creating an array of function pointers isolates the “business logic” from the “implementation logic” – or separates “rules” from “data”.

This separation is invaluable. Maintenance of such a code becomes a breeze. As a matter of fact, if you are lucky to have a project in a language which supports associative arrays (like Tcl or Perl or PHP), you can eliminate the entire “if-else if-else” chain altogether. This is a topic of yet another discussion.

“not” – Think Positive

April 23, 2006 by bhushit

Self help books never get tired of singing praise of positive thinking.

I have investigated why positive thinking succeeds. I have realized that it is much easier to think positive than to think negative.

If that is also your opinion, what do you think of all the !’s spread in the code? Is it compassionate to the maintainer to force to think in negative term?

Also, think about all the double negatives! “If the customer is not “not logged out”, “

Theoreticians will jump with a gun in their hand for me! How can I think of logical system without a “not”! Forget about Boolean algebra, even a Group structure can’t stand without an inverse! Have I gone nuts?

The world has come a long way from Leibnitz like notations to express logic. The same Boolean logic can be expressed without explicit not in most cases. For example:

if (not )
else

can be easily converted into

if ()
else

That way, even if your hardware design has negative fields, your code does not need to go double negative.

Many of us will remember the famous negative:
while (!feof(fp))

This idiom has been so ingrained in our thinking, one of our very good engineer used it against my proposal. His main line of argument was:

“Use ‘not’ if it falls in normal language – just like what we do in case of feof.”

I say it is reverse case! feof is an example of how lower layer convenience went up the problem domain. If I were K&R, feof was the best way to avoid additional logic because eof was so detectable a condition!

How do we really think while we debug in this case?

“How does the software behave when data is available?” In other words, we think of “fif” (file pointer in file”).

Frankly, I am convinced a little care to reflect the problem domain language could avoid a lot of inverted logic.

The only case where I find “not” useful is in cutting short error handling i.e. in situations where “catch” is unavailable. For example, if you want to know whether an object existed in Tcl, rather than coding:

if { [ info exists x ] } {

# action of 100 lines

} else {

# error handling

}

and leaving the maintainer guessing about the error handling, we can error handle it before and terminate his interest earlier on. It will be only a few lines to skip if he is interested in what happens if x existed.

if { ![ info exists x ] } {

# error handling

} else {

}

Programming by contract will cut this kind of negative logic significantly.

Someday I may write why it is so difficult to think negative. However, that is not current focus.

“goto” – think of slimmer functions

April 18, 2006 by bhushit

I confess my writing lacks theoretical rigor.  I was never a Computer Science student and I am not an academician. So I lack requisite research as well as formal training to deal with some of the issues. I also risk repeating what the research literature might have proven decades ago.

However, I also want to note that I am not alone. I have seen people from all vocations joining the great software talent drag. None of us gave basic thought to make our programs better. I hope to serve my crowd.

I request reader community to point me to rigorous places and prove or disprove me wherever they feel the need. 

For simpletons like me, goto statements are considered harmful.

Where you can goto
However, one of my friends, who have written many more lines of code of C, insists that goto statements are actually useful in one circumstance. He points out the following condition:

int foo (…) {

/* resource allocation */

/* condition1 */ 

/* need return */

/* condition 2 */

/* some code */

/* free resources */

return <return value>;

In this case, if we return in the action block of "condition 1", we risk resource leaks. Instead, according to his experience, it makes sense to jump to the "free resources" block, we are sure all allocated resources are free.

Let us understand that he does not insist that goto is necessary. He says goto is very convenient in this case.

How to avoid even this goto

To say it bluntly, I consider programs like above badly designed.

The "resource allocation no matter what" and freeing up of those resources should ideally be pushed to constructors and destructors of this object or the objects that collaborate with this object.

If you are working with procedural paradigm, you could have designed functions similar to *nix device driver interface and pushed all unconditional allocations and deallocations to equivalent of "create" and "destroy" functions of this object. As an alternate, you could have pushed them in similar functions of other objects and used their pointers in the code.

Deeper problem when you needed goto

goto might contribute to bad flow of the code. However, there is a deeper problem too.

Let me introduce two fuzzy concepts in my mind: depth of a program and width of a program.

"Depth of a program" is the inherent algorithmic terseness or the "stuff". Most of us would recognize this very easily. The books of Data Structures, Algorithms or Encoding are full of "stuffed" algorithms. Even most device drivers fall into this category. Such programs are characterized by:

  1. Well known, small number of inputs and outputs
  2. Most data manipulation is done using inputs
  3. Most statements in the code are dependent on the previous statement in the code

"Width of a program" is the informal measure of decoherence or the "fluff". Not many books talk about this. Why? Because such programs are "practical" and can't be represented in books satisfactorily. Moreover, *all* the languages lack the construct that can underscore the fluff. Also, the concept falls in the negative learning – "what not to do".

To elucidate my point, consider an imaginary construct "|||" – that means "execute left or right in any order. Consider the following piece of code:

int foo ( int *a, int b, int c) {

x = *a;

y = *a + b;

z = b * c;

*a =  x + z / y;

return *a;

Using the rudimentary (and insufficient) notation we defined earlier, the same code can be re-written as:

int foo ( int *a, int b, int c) {

x = *a; ||| y = *a + b; ||| z = b * c;

*a =  x + z / y;

return *a;

}

Did you see that the code got wider? IMHO that should be the measure of decoherence or "fluff". The wider the code is on average, the worse the functional decomposition is.

In limiting case, you may end up with multiple return paths joined by ||| – essentially, the function may be effectively split into many!

Let us consider our sympathetic case to goto. The fact that there were many returns from the function also told us that the code was decoherent or fluffier. If proper functional decomposition is applied to achieve slimmer functions, we will not be forced to use goto to avoid resource leaks.

0