Why should history be limited to browsing and search?

Why don’t we make history a feature for replacement and swapping too?

Advertisements

Find, Replace, Mark/Highlight – have a friend, Swap!

I am writing code for (yet another) control structure in Tcl. I realize that Find/Replace/Mark in notepad++ isn’t enough.

When you are writing test cases, you vary one parameter at a time. That means a set of tests modify parameter ‘a’. Yet very similar tests modify parameter ‘b’. All we need is to swap string for ‘a’ with string for ‘b’ in scripting.

I know swap is nothing but three replacements a->c, b->a, c->b. However, swap has very crisp and clear definition and machines do that much better than three cycles of replacements initiated by a human being.

Just like in replacement, the destination string can’t be a regular expression, ignoring cases may not make sense etc. Because swap is nothing but replacement both ways, both the fields have to ignore such modifiers.

Ah! Only if I could change code editors and word processors of the world!

Fibonacci number system

(In this post, “number” means “positive integer”.)

Simply speaking, if we substitute 2’s or 10’s power system as the basis for presentation with Fibonacci series, we can get  Fibonacci “based” number system.

For example, number 16 (base 10) can be converted into Fibonacci number system like this:

16 -> The biggest smaller Fibonacci number 13 take out 1 time -> Remainder is 3

3 –> The biggest smaller Fibonacci number 8 take out 0 time –> Remainder is 3

3 –> The biggest smaller Fibonacci number 5 take out 0 time –> Remainder is 3

3 –> The biggest smaller Fibonacci number 3 take out 1 time –> Remainder is 0

0 –> The biggest smaller Fibonacci number 2 take out 0 time –> Remainder is 0

0 –> The biggest smaller Fibonacci number 1 take out 0 time –> Remainder is 0

(and ignore the first Fibonacci number 1)

Output -> 16 (base 10) = 100100 (base Fibonacci).

It is easy to see that:

  1. Each number has unique representation
  2. Each representation has unique number
  3. Because for n>3, Fibonacci(n) < 2 * Fibonacci(n-1), the representation is using only 2 symbols, 1 and 0
  4. Because Fibonacci(n) = Fibonacci(n-1) + Fibonacci(n-2), there could be no two consecutive 1s in Fibonacci number system. 0110 (base Fibonacci) is the same as 1000 (base Fibonacci) and the only latter one is valid
  5. Because 11 is an invalid sequence, Fibonacci presentation of a number will be longer than base 2 presentation (normal binary)

Stay tuned for the code and more observations about basic arithmetic operations – or start sharing here 🙂

BTW, we can use any strictly increasing monotonic series as the basis for a number system which has first element as 1. (That is, it is possible to come up with factorial number system also.)

Here is the code:

using namespace std;
#include <iostream>
#include <string>
#include <fstream>

const int size = 50; // maximum size of presentation
unsigned long int f[size]; // place value holder
int value = 100; // number of integers to convert

void fill (unsigned long int *f);
void convert (int i, int j, std::string &s);
int findMSP (int i);

int main(int argc, char** argv) {

ofstream outfile;
outfile.open(“FibonacciNumbers.csv”);

// Create a long enough place value array
fill(f);

// now convert integers from 1 through ‘value’ to the new place value system
for (int i = 1; i < value; i++) {

std:string presentation;
convert(i, findMSP(i), presentation);
outfile << i << “,” << presentation << endl;

}

outfile.close();

return 0;

}

// core logic

void convert (int i, int j, std::string &s) {

int mul = i/f[j];

i %= f[j];

s.append(1,(char)mul+’0′);

if (j >= 1) convert (i, –j, s);
else return;

}

// just how many places are needed for the number in the new place value system
int findMSP (int i) {

int j = size – 1;
for (; j >=1; j–) {

if (f[j] <= i) break;

}
return j;

}

void fill (unsigned long int *f) {

// fibonacci for now
f[0] = 1;
f[1] = 2;
for (int i = 2; i < size; i++) {

f[i] = f[i-1] + f[i-2];

}

}

Pair – A new data structure?

Inspired by dances, here is a data structure called “pair”. Please let me know if similar data structure exists.

A pair has two elements – element [0] and element [1].

An element may “join” or “leave” the pair under some conditions:

  • The pair is in “empty” state to start with
  • When created, an element must be in “waiting” state. The pair goes into “proposed” state
  • If two elements are in “waiting” state, they enter in a “bonded” state. The pair goes into “full” state.
  • Once “bonded”, if one of the element wants to “leave”, it enters “leave requested” state. The pair goes into “shaky” state
  • If both the elements are in “leave requested” state, they are “debonded” from the pair and destroyed. The pair goes into “empty” state
  • If a “waiting” element wants to “leave” [that is, the pair was “proposed”], it is directly debonded and destroyed. The pair goes  into “empty” state

Most paired dances follow this data structure. I guess one-to-one chats also must be following this structure. What other uses can you think of?

Also, there is a possibility of a directional pair – kind of inner/outer loops of a raasa dance.

And finally, it could be more than a pair, a n-tuplet.

Fascination to 2-D continues

I was editing some Excel sheets. Being a manager, I am just good at this piece of software it seems 🙂 People also warn that if I become executive, Excel will be substituted with Powerpoint.

Anyway, let us return to Excel. It lets me copy-paste in handsome manner. I can paste with/without format, with/without formula and also paste transverse (rows and columns flipped). As a user of Emacs (!a manager using Emacs!) I remember pasting in inverted order also. Excel notably misses this part.

However, yesterday I realized need for one more way of pasting – bottom-up pasting. It is easy to confuse it with Emacs’ inverted pasting. Here is the difference:

Emacs’ inverted pasting still pastes in  “forward” direction, from the cursor rightwards.

What I wanted was to paste from cursor leftwards or upwards.

That means:

  1. “Pasting information” direction may be straight or inverted (like that Emacs’ option)
  2. Pasting direction (as in my requirement) is different and independent from “Pasting information” direction
  3. Pasting direction may be rightwards (as usually done) or leftwards. In 1-D it may not make difference but in 2-D it will make huge difference
  4. Transverse pasting is yet another, independent activity

BTW, are we slowly gathering requirements for Excel++ here? Is 2-D programming really emerging here?

Learning Curve – Beyond Search Engines

Nobody doubts that Internet search has revolutionized human endeavor of gathering information. I have been using Internet for 20 years for now. Over the period of time, I have started realizing limits of just a “Search Engine”.

If you look back at the most prominent search engine Google, it has changed in only the following ways from black box point of view:

  1. Predictive search, which in turn killed “I’m feeling lucky”
  2. Tag based search on images etc.
  3. Infinite scrolling of search results rather than pagination
  4. More relevant search

Nothing of above was non-trivial achievement. However, it is still search, search and search. Wolform Alpha has a new approach to search – and some other search engines may have some more angles to search.

However, as an intelligent being, searching for the information isn’t enough. A lot of times another longer intellectual activity may be going in the end user’s mind. This activity is called “learning”.

My wonder is, can an engine be designed that takes all the random information on the web and draw a Learning Curve for a given concept? That means, out of given concept (keyword), can the Web be rendered in ordered manner to introduce that concept and gradually leading the reader to deeper and deeper concepts? I will call such an engine Learning Curve

Outcome of such an engine on human intelligent activity will be dramatic. Suddenly searching will sound too consumerist. If we can draw such a curve (or curves) for a given concept – and associate learning sessions to a tool like browser, potential of human civilization can be realized like it has never been. If we can make such an engine (and popularize it 🙂 human knowledge growth curve will be much steeper than current “search and get lucky” approach. If Learning Curve approach is like stable agriculture, searching is like hunting-gathering.

Next question is, how do we order the ever-increasing –knowledge-in-the-web in the order of easy to hard?

First negatives:

  • User rating of a standalone page is likely to be defrauded, misjudged
  • Asking administrators to rate the page is likely to be misjudged or devoid of motivation
  • Based on human learning habits, one way to learn may not exist

Now how it may be achieved:

  • Absolute rating of pages doesn’t make sense but relative rating does. A browser plugin may display two pages about a concept side by side with end user rating easier-harder
  • An engine storing each page’s ranking with each concept with other rating (now, that is computationally suicidal)
  • Upon next request to serve the page, the engine serves pages by rank. The user can adjust rank like chess engine according to her/his level in a particular subject. The service may not be serving pages like rank 1, 2, 3 and so on. The service may bunch pages in links for 1-50, 51-100, …

Do you think it may work? If not, what can work? Is this idea worth pursuing at all?

BTW, commercialization of this idea is easy 🙂

 

 

Testing a feature in presence of many possible combinations of environment

Take for example testing of a router where we list features in columns of test areas:

L1 L2 L3
10BasedT STP RIPv1
100BasedT RSTP RIPv2
1G PVST OSPF
10G MVST ISIS
BGP

In the beginning of a product like this, most interest will be in making sure each of the protocols work. Stand alone testing of the cells will be more than enough.

Once most low hanging (and highly damaging) bugs that could be found through testing cells of the matrix alone, are weeded out of the software, QA progresses by making “experts” of testing going column-wise – L1 expert, L2 expert, L3 expert etc. This leads to more experienced testers and technically great bugs (like route redistribution having memory leaks). QA also arranges its test plans by test areas and features.

At this stage, only a section of bugs seen by a customer is eliminated and the complaint “QA isn’t testing what the customer ways” continues.

That is because the customer doesn’t use the product by columns. A typical customer environment is a vector selected one member from each column (at most). A product is likely to break at any vector.

As you can see, overall testing of the matrix above would require testing 4*4*5 = 80 environments. In reasonable products the actual number may be in 10,000’s.

***

Testing a feature in presence of many possible combinations of environment is a well-known QA problem.

Various approaches have been suggested. There have been combinatorial engines and test pairs and so on to help QA optimize this multi-dimensional problem.

The approach I discuss here is yet another algorithm to be followed by semi-automation. Just let me know your thoughts about it.

***

Let us define our terms once again:

  • Feature: A feature in the product [I know that isn’t a very great definition.]
  • Area: A set of related features
  • Product: A set of Areas
  • Environment: A vector (or a sub-vector) of Product, with each component coming from an Area
  • Environment for a Feature: A maximal length sub-vector of the Product without the Area of Feature

Please understand once again that QA expertise and test plans are structured by Area (a column in the matrix). The best way to test will be to test every test of every Feature against every “Environment for a Feature”.

This approach is simply uneconomical. So, what is the next best approach?

***

Before coming to that, we need to understand how test cases are typically structured. In a feature, typically test cases have some overlap of steps – like configuration or authentication etc or overlapping phenomena like exchange of Hello packets, establishment of neighborhood etc.

That means, there is a significant redundancy of tests from white-box point of view.

This redundancy can be exploited to assure that the product may stand reasonably well in diverse environments. As we discussed earlier, such an environment is a vector of that matrix, which in turn is a sub-vector plus a test condition.

***

Understanding so much brings us to a workable solution for testing more “customer-like” without incurring too much cost.

The key understanding from the above discussion is that Environment for a Feature can be changed independently of the test case (or the Feature or the Area).

That means if the tester can signal an “environment controller” at the end of a test case, the controller may change the DUT/SUT to another Environment for the Feature. After that change is done, the tester just continues testing the system to the next test case – till all test cases end.

Because it is less likely that number of test cases are a factor (or multiple) of number of sub-vectors, within a few test cycles reasonable amount of test steps will be tested across reasonably diverse environmental sub-vectors.

As a strategy of testing the product, QA can assure its internal customers that over a few releases, most interesting bugs that can be found economically will be found.

***

What are the downsides of this approach? For sure, the Environment for a Feature must not have any configurations about the Feature under test – or even the Area. That means the tester will have to always configure the Feature before going to the next test. If you are working on a Feature that takes VERY long to converge, you are out of luck. [VPLS in networking comes across as an example.]

Since most products don’t involve long signaling delays, let me focus on the optimization of this testing.

How can we find maximum number of bugs in a given feature (or the entire column) related to environmental changes within minimum number of steps?

The answer is obvious to the readers of this blog – by taking the sub-vectors in anti-sorted way!