About management systems and test automation frameworks

(These are my views alone. They have nothing to do with my current or past employers, and the narration here does not reflect my experiences with, or the existing practices of, my current or past employers.)

Eventually, every networking product grows a management software, either in the form of an element management system (EMS) or a network management system (NMS). Similar approaches are taken by the storage industry, the bio-medical device industry and so on. Essentially, every embedded software needs some management software.

It is also true that in this combination, the larger portion of the intellectual property lies within the embedded product. It is usually the first to be developed and sold; the management software comes later, for margin extension.

If it were not for margin extension, most companies would settle for an OpenNMS-like solution. A proprietary management system also allows vendor lock-in.

However, there also exists a keen, silent awareness that a management software is a secondary product. Fewer resources are allocated to its engineering and testing. If a management software product were developed and tested fully, it could be a real USP – but that seldom happens.

On the other hand, the same step-motherly treatment is given to the core embedded product’s test automation framework. Actually, being in charge of a test automation framework is even worse. The work demands the maturity of an architect but is usually treated as even less important than testing or test automation.

If you take a step back and look, both pieces of software are doing the same thing – using application software to control the core embedded software.

A management system typically cares about FCAPS (Fault, Configuration, Accounting, Performance and Security). Many commercial systems also allow scheduled tasks.

A test automation framework does the same! It should have a scheduling mechanism, a configuration mechanism, fault monitoring to raise errors, accounting and performance measurement modules for system testing and, of course, a security framework – to protect itself and also in the form of penetration testing.

That means a large overlap exists in the requirement spheres of the two. Ideally, common libraries and even common front-ends should be written for both. When the software runs in customer mode, it should act as a management system; when it runs in testing mode, it should act as a test automation framework.

Not only are development resources saved that way, we can save a lot of testing effort too. Say 60% of the management system’s code is common with the test automation framework; during automation runs, that 60% overlap will be hammered like anything!
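To make the overlap concrete, here is a minimal sketch of the idea. Nothing here comes from a real product – `Mode`, `Task` and `CommonCore` are names I made up – it only illustrates how one shared core (scheduling plus fault reporting) could serve both a management front-end and a test-automation front-end:

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// One shared core; only the thin front-end differs between the two modes.
enum class Mode { Management, TestAutomation };

// A common FCAPS-style task: a scheduled NMS job and an automated test case
// both reduce to "run an action against the device, report faults".
struct Task {
    std::string name;
    std::function<bool()> action;   // true = success
};

class CommonCore {
public:
    explicit CommonCore(Mode m) : mode_(m) {}

    void schedule(Task t) { tasks_.push_back(std::move(t)); }

    // Fault monitoring: the management front-end would raise an alarm,
    // the test front-end would mark a test case as failed.
    int runAll() {
        int faults = 0;
        for (const auto& t : tasks_) {
            if (!t.action()) {
                ++faults;
                std::cout << (mode_ == Mode::Management ? "ALARM: " : "FAIL: ")
                          << t.name << "\n";
            }
        }
        return faults;
    }

private:
    Mode mode_;
    std::vector<Task> tasks_;
};
```

The same `schedule`/`runAll` pair would be exercised in every automation run, so the shared 60% gets tested for free.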

Then the question is, why is this not a practice? The answers are surprisingly not technical.

  1. As I mentioned earlier, management systems are usually implemented later than the embedded software. Often their development starts even later than the test automation framework’s – at best they are developed in parallel with it. Program management to absorb both of them together is hard.
  2. Also, as I mentioned, there is a 60% overlap in functionality – but there is also a gap of 40% on each side. That means you have to spend 140% to get both. Most of the time, senior management is (rightly) unwilling to risk so much at the same point in time – when one of them is secondary software and the other isn’t even going to fetch any top line!
  3. Here is the dirty secret of the hi-tech industry (or, as it is called in India, the IT industry): it has a caste system. Hardware engineering is held in the highest esteem, followed by embedded software, then application software, then test automation, then testing, then support. This hierarchy reflects in the choice of tools – hardware is tested in C++, and applications may be written in C++; embedded software is written in C/C++ (and its programmers often feel offended if asked to try a higher-level language) – yet all of these are tested using either scripting languages or human labor (manpower). In this hierarchy, the development manager of a management system, a Java guy, usually finds it extremely offensive to agree that his/her problem space has so much in common with “something as mundane as a test automation framework”, which is “nothing but a bunch of scripts cobbled together”. No hierarchy, no caste system, runs on the feeling of inferiority in “I am sitting lower than x”; it runs on the feeling of superiority in “I am sitting higher than y”. The world needs everyone with fairly equal importance – but human pride makes such hierarchies sub-optimal yet rigid. Writing common software for the use of high-caste developers and low-caste testers usually becomes unthinkable, unspeakable or at least impractical.

As I mentioned in one of my jokes, sadly, technical problems are often the simplest:

Q: What is harder than to colonize Mars?

A: To get the budget approval for it!

Various levels of test automation

QA/Test managers tend to divide their QA/testing operation into two parts – manual and automated. In such a simple world, automation typically comes later than manual testing. In a manager’s mental image, automation replaces manpower one test case at a time – till he can “free up” his manpower to “do more productive things” (euphemisms for “fire” and “save the budget and get that big promotion”, respectively).

This dream is never achieved because of reasons I mentioned in one of my previous posts.

This is what happens in a typical product’s life:

  1. The product starts; everyone is in a rush to deliver features
  2. Features have shallow bugs – test cases proliferate
  3. In a few versions, features stabilize (in other words, the code learns the test cases)
  4. There is more income and time
  5. Someone remembers automation because the “juicy” bugs stopped coming
  6. Automation is started by a bunch of enthusiasts
  7. Automation reaches 40%+ of testing; test work is greatly reduced. Managers make heroes of the automators
  8. A few releases go by; the UI technology changes. Automation breaks
  9. Automation is faster than manpower – so it loads system I/O in a manner the system isn’t designed for. Developers come into friction with the automation guys
  10. Automation manpower is dragged into manual testing because the brand-new UI requires intensive testing
  11. Automation plays a catch-up game
  12. The manual testing lead/manager revolts against the blue-eyed boy called automation. He has undeniable arguments: a. Automation doesn’t find bugs, manual testing does (obviously! The code has learnt the automation!) and b. Automation maintenance is expensive (because it broke with the change in the UI). Developers join in because “Automation finds irrelevant bugs. The system wasn’t designed for 50,000 commands per second.”
  13. Automation continues at a slower speed
  14. Back to step #8

The cure is to see that automation is neither the goal, nor the means to achieve the goal. Automation is no silver bullet.

The realistic strategy is to divorce “test” and “automation” in the phrase “test automation”.

TASKS related to testing should be automated at various levels, not test cases.

To give you an example, take an embedded product like a set-top box (STB). An STB can have many bugs initially. Let us take a list of bugs:

  1. The STB does not respond to the remote control
  2. The STB crashes every now and then
  3. The STB does not have a new interface (say, HDMI) working
  4. The STB does not display on some types of TVs or DVRs
  5. The STB fails when the SNR goes down by 20 dB and the channel is flipped within 20 ms of that

Now look at the approaches to automate all the tests:

  1. The STB started responding to the remote control with version 1.2, and now we are on version 11. (Because the developer doesn’t check in code drunk) the code NEVER breaks after version 1.2. Still, someone has to make sure the remote control does work with the STB. So automation is (rightfully) verifying this in the build verification test
  2. Finding crashes is the pride of manual testing. They often take jibes at automation for not being able to find as many crashes as the humans do. However, the automation guy smiles without brilliance and tells them that automation does look for cores with every test it runs – cores just don’t happen as much during automation runs. In a private meeting with the QA manager, the support manager shows the number of cores that happened in the field that should have been caught by QA. The QA manager realizes that automation for such cases doesn’t exist – and junior manpower doesn’t look at cores very often
  3. QA finds a lot of bugs; automation doesn’t have the libraries to use the HDMI interface! Waiting and waiting on GitHub and SourceForge…
  4. Manual and automation – both approaches are at a loss. TVs and DVRs are raining into the market. Which ones should compatibility be tested with? The QA manager goes by a market survey to identify the top 5. It takes a quarter to come up with the survey. By then all the bugs are already in the support database
  5. Oops! Didn’t the picture quality test plan have it? Doesn’t the channel flipping test plan have it?

As you can see, the STB testing manager is in a crisis. What happened to all the good work the automation team has done all these years?

The right way for this team is to split the work in FIVE levels of automation. (What? Five levels? Read on.)

First of all, understand that the goal is to deliver as much quality (that is, the inverse of field-reported bugs) at as little cost (and, most probably, in as little time) as possible. Automation doesn’t matter; productivity does. Not automating at all is a perfectly valid option.

However, not automating isn’t the first option in the hierarchy. Understand that testing can be carried out by testers of varying degrees of qualification – the cheaper, the better. A tester who can understand a resource lockup isn’t needed to test compatibility with 50 TV models.

So our approaches from least to most complex code in automation are:

  1. Nautomation – no automation, rather anti-automation. Deploy a crowd of minimum-wage workers, give each of them a TV to watch and a bell. Put a large screen in front of them and another behind them. Wire up webcams, passing their outputs through a multiplexer. On the front screen, your expensive test engineer is seen demonstrating how to test a TV with your STB. If someone’s TV gives a different picture, he has to press the bell. The webcam behind that person activates and projects onto the screen at their back. Your expensive test engineer looks at the error and decides whether it is a bug or not. A collective stoppage of the crowd is less expensive than missing that bug on that TV model, which could be in the top 10. Here, whatever automation was used for the webcams and the multiplexer was used to INCREASE the headcount, not decrease it
  2. Manual testing – for new features. Until a new interface or new hardware “sets in”, investing in manual testers is cheaper than automation QA. As Krutarth quoted Google’s view in the above post, automating too early is detrimental. Also, use manpower for UI-intensive testing, because when the UI changes, you don’t want your automation to be brittle. In this approach there is zero automation, and manpower neither increases nor decreases
  3. Semi-automation – for what can be called “bull’s eye” observations, like watching for cores, process restarts and CPU usage. Give your manpower automated tools that act like an associate – checking a fixed set of criteria and warning when something goes awry. Yet another area you can automate is challenging the manpower testing a feature by changing “everything else”, or by creating unexpected conditions such as restarting a process, rebooting a box, or failing over a High Availability solution. This keeps your testing safe from the code learning the tests: combinations go out of hand really fast, so the code doesn’t get a chance to “learn and saturate”. Here automation is small and manpower decreases marginally (in a large team, you may typically save a man or two by not always watching for those cores)
  4. Test automation – for regression testing, including that of the remote control. Slowly, test automation should cover the manual tests as much as possible. Don’t use test automation for UI-intensive testing. In other matters – like being aided by observation engines, combinatorial engines or event engines – test automation is identical to manual testing. The code actually learns faster from test automation because it is more predictable. Test automation is almost linear – the more you have, the more manpower you can substitute – once again, subject to the UI limitations
  5. Meta-automation – the word most abused by theoreticians. Meta-automation is “automating the automation”. Someone on the web sells pair testing under this label; pair testing is just one of many possible meta-automation approaches. Test automation with a variable “everything else” is an obvious extension of this approach. Another could be “off-by-one”, wherein you stress the constructor/destructor and the count of every kind of class you can think of. Yet another could be what I like to call Brahma-Vishnu-Mahesh (BVM) testing, in which three independent loops try to create an object, invoke operations that use an object, and destroy an object. Given the randomness of such operations, the various life stages of an object get tested. There could be as many patterns for testing as there are Design Patterns in the famous GoF book. Here it may not be possible for the code to learn all the test scenarios. The flip side is that it may not be possible to even test all the scenarios, or to deduce the right behavior of the software under a given scenario – and, at last, it may not even be possible to recreate a bug at will with 100% confidence. However, such testing will expose the weakest assumptions in the design. Let me tell you, developers hate this testing :-). If the automation libraries are designed carefully, the effort will be on the order of the number of features (or classes), plus the number of cross-cutting concerns or aspects (like logging), times the number of patterns (or templates) of testing. Yet it will keep testing in an exponential manner. There is no point in comparing how much manpower it will save – but you can safely bet that exponential savings in manpower are possible.

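The BVM pattern can be sketched in a few lines. This is only an illustration, not code from any product: `Connection` and `bvmRun` are names I made up, and for simplicity a single randomized loop stands in for the three independent, concurrent loops a real harness would run.

```cpp
#include <cstdlib>
#include <memory>
#include <vector>

// A stand-in class under test; any class with a create/use/destroy life
// cycle fits. 'live' tracks that every creation meets exactly one destruction.
class Connection {
public:
    Connection() { ++live; }
    ~Connection() { --live; }
    bool send() const { return live > 0; }  // a trivial "use" operation
    static int live;
};
int Connection::live = 0;

// BVM driver: randomly interleave creation (Brahma), use (Vishnu) and
// destruction (Mahesh), so objects are exercised at every life stage.
// Returns the number of objects still alive after cleanup; a non-zero
// result would expose a leaked or double-destroyed object.
int bvmRun(unsigned seed, int steps) {
    std::srand(seed);
    std::vector<std::unique_ptr<Connection>> pool;
    for (int i = 0; i < steps; ++i) {
        switch (std::rand() % 3) {
        case 0:                                   // Brahma: create an object
            pool.push_back(std::make_unique<Connection>());
            break;
        case 1:                                   // Vishnu: use a random object
            if (!pool.empty()) pool[std::rand() % pool.size()]->send();
            break;
        default:                                  // Mahesh: destroy a random object
            if (!pool.empty()) pool.erase(pool.begin() + std::rand() % pool.size());
            break;
        }
    }
    pool.clear();
    return Connection::live;
}
```

With a correct class the driver finds nothing; the value of the pattern shows when a class leaks, double-frees, or mishandles operations on a half-built object.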
At various stages of my life, I have tried all five approaches and have succeeded with all five.

Once free from the dogma of “automation must save manpower linearly”, much higher levels of productivity and quality are possible.

What approaches have you seen in your experience? Is this list exhaustive?

Can you suggest more testing patterns? I am planning to embark on a wide survey of some bug databases to find more patterns.

Also, next time, I will highlight how money can be saved by intelligently clubbing administration tools and testing tools. Stay tuned!

First hundred integers in Fibonacci number system

1 1
2 10
3 100
4 101
5 1000
6 1001
7 1010
8 10000
9 10001
10 10010
11 10100
12 10101
13 100000
14 100001
15 100010
16 100100
17 100101
18 101000
19 101001
20 101010
21 1000000
22 1000001
23 1000010
24 1000100
25 1000101
26 1001000
27 1001001
28 1001010
29 1010000
30 1010001
31 1010010
32 1010100
33 1010101
34 10000000
35 10000001
36 10000010
37 10000100
38 10000101
39 10001000
40 10001001
41 10001010
42 10010000
43 10010001
44 10010010
45 10010100
46 10010101
47 10100000
48 10100001
49 10100010
50 10100100
51 10100101
52 10101000
53 10101001
54 10101010
55 100000000
56 100000001
57 100000010
58 100000100
59 100000101
60 100001000
61 100001001
62 100001010
63 100010000
64 100010001
65 100010010
66 100010100
67 100010101
68 100100000
69 100100001
70 100100010
71 100100100
72 100100101
73 100101000
74 100101001
75 100101010
76 101000000
77 101000001
78 101000010
79 101000100
80 101000101
81 101001000
82 101001001
83 101001010
84 101010000
85 101010001
86 101010010
87 101010100
88 101010101
89 1000000000
90 1000000001
91 1000000010
92 1000000100
93 1000000101
94 1000001000
95 1000001001
96 1000001010
97 1000010000
98 1000010001
99 1000010010

Fibonacci number system

(In this post, “number” means “positive integer”.)

Simply speaking, if we substitute the Fibonacci series for the powers of 2 or 10 as the basis of representation, we get a Fibonacci-“based” number system.

For example, number 16 (base 10) can be converted into Fibonacci number system like this:

16 -> the biggest Fibonacci number not exceeding it, 13, goes in 1 time -> remainder is 3

3 -> the next Fibonacci number down, 8, goes in 0 times -> remainder is 3

3 -> the next one, 5, goes in 0 times -> remainder is 3

3 -> the next one, 3, goes in 1 time -> remainder is 0

0 -> the next one, 2, goes in 0 times -> remainder is 0

0 -> the next one, 1, goes in 0 times -> remainder is 0

(and we ignore the duplicate first Fibonacci number, 1)

Output -> 16 (base 10) = 100100 (base Fibonacci).

It is easy to see that:

  1. Each number has a unique representation
  2. Each representation maps to a unique number
  3. Because Fibonacci(n) < 2 * Fibonacci(n-1) for n > 3, the representation uses only 2 symbols, 1 and 0
  4. Because Fibonacci(n) = Fibonacci(n-1) + Fibonacci(n-2), there can be no two consecutive 1s in the Fibonacci number system. 0110 (base Fibonacci) is the same as 1000 (base Fibonacci), and only the latter is valid
  5. Because 11 is an invalid sequence, the Fibonacci representation of a number will be at least as long as its base-2 representation (normal binary)

Stay tuned for the code and more observations about basic arithmetic operations – or start sharing here :-)

BTW, we can use any strictly increasing series whose first element is 1 as the basis for a number system. (That is, it is possible to come up with a factorial number system too.)

Here is the code:

#include <iostream>
#include <string>
#include <fstream>

using namespace std;

const int places = 50;        // maximum size of a representation
unsigned long int f[places];  // place value holder
int value = 100;              // convert the integers 1 through value-1

void fill(unsigned long int *f);
void convert(int i, int j, std::string &s);
int findMSP(int i);

int main(int argc, char **argv) {

    ofstream outfile;
    outfile.open("FibonacciNumbers.csv");

    // Create a long enough place value array
    fill(f);

    // Now convert the integers from 1 through 'value' to the new place value system
    for (int i = 1; i < value; i++) {
        std::string presentation;
        convert(i, findMSP(i), presentation);
        outfile << i << "," << presentation << endl;
    }

    outfile.close();
    return 0;
}

// Core logic: greedy division by the place values, most significant place first
void convert(int i, int j, std::string &s) {
    int mul = i / f[j];
    i %= f[j];
    s.append(1, (char)('0' + mul));
    if (j >= 1) convert(i, j - 1, s);
}

// Just how many places are needed for the number in the new place value system
int findMSP(int i) {
    int j = places - 1;
    for (; j >= 1; j--) {
        if (f[j] <= i) break;
    }
    return j;
}

void fill(unsigned long int *f) {
    // Fibonacci for now; any strictly increasing series starting at 1 would do
    f[0] = 1;
    f[1] = 2;
    for (int i = 2; i < places; i++) {
        f[i] = f[i-1] + f[i-2];
    }
}

Pair – A new data structure?

Inspired by dances, here is a data structure called a “pair”. Please let me know if a similar data structure already exists.

A pair has two elements – element [0] and element [1].

An element may “join” or “leave” the pair under some conditions:

  • The pair is in the “empty” state to start with
  • When an element is created, it must be in the “waiting” state; the pair goes into the “proposed” state
  • If two elements are in the “waiting” state, they enter a “bonded” state; the pair goes into the “full” state
  • Once “bonded”, if one of the elements wants to “leave”, it enters the “leave requested” state; the pair goes into the “shaky” state
  • If both elements are in the “leave requested” state, they are “debonded” from the pair and destroyed; the pair goes into the “empty” state
  • If a “waiting” element wants to “leave” (that is, the pair was “proposed”), it is directly debonded and destroyed; the pair goes into the “empty” state

Most paired dances follow this data structure. I guess one-to-one chats must also follow this structure. What other uses can you think of?

Also, there is a possibility of a directional pair – kind of like the inner/outer loops of a raasa dance.

And finally, it could be more than a pair – an n-tuple.
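The transitions above can be captured in a small state machine. This is only a sketch of the pair-level states (the element-level states fall out of them); `Pair`, `join` and `leave` are names I chose for illustration:

```cpp
#include <stdexcept>

// Pair-level states, following the bullet list above.
enum class PairState { Empty, Proposed, Full, Shaky };

class Pair {
public:
    PairState state() const { return state_; }

    // An element joins: empty -> proposed (one waiting element),
    // proposed -> full (two waiting elements bond).
    void join() {
        switch (state_) {
        case PairState::Empty:    state_ = PairState::Proposed; break;
        case PairState::Proposed: state_ = PairState::Full;     break;
        default: throw std::logic_error("pair already full");
        }
    }

    // An element asks to leave.
    void leave() {
        switch (state_) {
        case PairState::Proposed: state_ = PairState::Empty; break; // waiting element debonded
        case PairState::Full:     state_ = PairState::Shaky; break; // first leave request
        case PairState::Shaky:    state_ = PairState::Empty; break; // both requested: debond
        default: throw std::logic_error("nothing to leave");
        }
    }

private:
    PairState state_ = PairState::Empty;
};
```

A one-to-one chat session could reuse the same machine: “proposed” is an invite pending, “shaky” is one side trying to hang up.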

When two non-linearities collide

What happens if exponential growth of productivity continues?

Most tech managers will laugh at exponential growth of productivity. However, it is quite achievable. OK, OK! For now, let us assume it is possible to take steps that yield 20%-50% productivity improvements year over year. [Later I will charge a consultancy fee if you want to know how :-)]

On four separate occasions, I have experienced that while such a streak of improvements is possible, it is not practical to continue exponential growth for longer than 3 or 4 steps. Let me rephrase that for clarity: while it is technically possible to keep making things better, faster and cheaper, it doesn’t happen after 3 or 4 such steps.

Typically, I have seen such growth hitting one of the following limits:

  • Problem domain limit: For example, when subatomic particles were being discovered, they were discovered at a phenomenal rate. It was predicted that within a few decades there would be a particle for each person on earth. Two things stopped that explosion: first, human beings out-produced the papers :-) and second, the Standard Model gave a very succinct representation and the race ended. Similarly, when I set out to design game boards, I went on exponentially inventing new boards (cylindrical, helical, …) till I found the principle behind what I was doing. After that, I could predict them all, and there was an end
  • Operational limit: For example, suppose you meta-automate tests and write libraries in such a way that test cases generate themselves. I have reached the meta-automation stage 3 times in my career so far. It found bugs by the ton. However, I soon hit operational limits in the lab: I was able to generate tens of thousands of tests on demand, but I still needed physical time and setups to run them
  • Organizational limit: For example, if you follow “Decision-less Programming” to its extreme or design tools that crank out near-faultless code, other organizations (like your users or testers) may not be ready for such an assault. Once I eliminated my own team’s jobs by doing something like this: all the foreseen projects were done when our homegrown “web-page flow to code converter” and “DB to web-page designer” worked perfectly in tandem. [The pages sucked aesthetically but hey, who cared about the UI of internal tools?]
  • Financial/market limit: For example, the market for cars can’t accept cars going faster than a certain speed. Any faster, and roads or human reflexes may end up causing accidents
  • Social limit: For example, I remember someone from Siemens complaining about selling thermal power plants in India. While the plant technology had improved exponentially, to the level that a plant could be run by less than half a dozen engineers, the government insisted on specifying that more than a hundred people be employed. The power problem wasn’t so much one of power generation as of employment generation, he noted

What other limits cross your mind?