Sunday, December 9, 2012

TopCoder SRM 563 Editorial


This SRM featured three pairs of problems with easy and hard versions: FoxHandle, CoinsGame, and SpellCards. The two CoinsGame problems could be solved with graph search, and the two SpellCards problems could be solved with dynamic programming. Notably, the harder version of SpellCards reduced to a trivial application of knapsack, but proving this reduction was tricky and most coders used a more obvious, but longer, dynamic programming solution to solve it. In fact, the author himself stated in this forum post that he did not know about the knapsack solution until after the contest.


This is a useful exercise in using the substr() function. Let \( n \) be the length of \( s \). Iterate over all length-\( n \) substrings of \( t \). Let this substring be called \( u \). If \( u \) is equal to \( s \) and if the string formed by removing \( u \) from \( t \) is also equal to \( s \), then return true.


There are several approaches which work. The simplest one is probably just to brute force all possible moves. The following is pseudocode for a recursive solution which implements that idea:

// xi, yi are the coordinates of coin i.
// steps is the number of steps the number of moves that we have made so far.
rec( x0, y0, x1, y1, steps )
  if( ( x0', y0' is off the grid ) && ( x1', y1' is off the grid ) )
    return INFINITY;

  if( ( x0', y0' is off the grid ) || ( x1', y1' is off the grid ) )
    return steps;

  if( steps == 10 ) return INFINITY;

  ans = INFINITY;
  for each direction d
    let x0', y0' = move in direction d from x0, y0;
    let x1', y1' = move in direction d from x1, y1;
    ans = min( ans, rec( x0', y0', x1', y1', steps + 1 ) );

  return ans;
The recursion depth is at most 10, and each recursive call branches 4 times, so the function is called at most \( 4^{10} \approx 1 \text{ million } \) times. This is fast enough to pass on TopCoder.

Alternatively, you could use breadth-first search or some more fancy algorithms, but they aren't necessary in this case.


Let's start with a simple solution, figure out why it doesn't work, and then improve on it. Suppose we have the following recurrence:

f( i )
  if( i == n ) return 0;

  // use card
  use = 0;
  if( i + lvl[i] <= n ) ans = dmg[i] + f( i + lvl[i] );

  // don't use card
  no_use = f( i + 1 );

  return max( use, no_use );
This seems simple enough: We look at each card \( i \) and decide whether to use it or not. If we use it, then we remove cards \( i, i + 1, \ldots, i + lvl_i - 1 \), and then recursively proceed from there. The problem, unfortunately, is that the cards we decide to remove may not necessarily be contiguous. Consider the following example: \[ lvl = [ 2, 1, 1 ] \\ dmg = [ 100, 1, 0 ] \] Here it's optimal to first use middle card, then use the first card, dealing a total of 101 damage. Our algorithm, on the other hand, would return 100 because it uses the first card, and then discards the first two cards.

Suppose we keep a count of cards that need to be removed, but we don't remove them right away*. Call this value \( need \). Now, examining each card from left to right, we can do the following:

  • Don't use the card. This decreases our current \( need \) value by 1.
  • Use the card. This increases the damage we do by \( dmg_i \) and increases \( need \) by \( lvl_i \).

Claim: If, after examining all the cards, we have \( need = 0 \), then the cards we chose to use are a valid combination.

Proof: Suppose the rightmost card that we decided to use was card \( i \). When we processed card \( i \), \( need = k \) for some \( k \), and afterwards, we had \( need = k + lvl_i \). Since we managed to decrement \( need \) to 0 by the time we processed all the cards, then there must be at least \( k + lvl_i - 1 \) cards to the right of \( i \). Now let's use card \( i \) and then remove cards \( i, i + 1, \ldots, i + lvl_i - 1 \) from the deck. This new deck also satisfies the \( need = 0 \) property. That is, if you look at all the cards we decided to use and increment or decrement \( need \), then you still get to \( need = 0 \) after examining all the cards. We can then inductively apply the above process. Use the rightmost card, and then remove it, along with some cards to the right of it. Eventually we will have used all the cards we chose originally without breaking any rules of the game.

Based on our above observation, we can write a new recurrence relation:

f( i, need )
  if( i == n ) return 0;

  // use card i
  use = 0;
  if( need + lvl[i] <= n ) use = dmg[i] + f( i + 1, need + lvl[i] - 1 );

  // don't use card i
  no_use = f( i + 1, max( need - 1, 0 ) )

  return max( use, no_use )
Now we just need to use dynamic programming or memoisation to make sure that each recursive call gets evaluated at most once. The complexity of this algorithm is \( O( N^2 ) \) for \( N \) cards, which is more than fast enough for the time limit.


There are several approaches to solve this problem. In this tutorial, I'll go over a simple semi-greedy approach. In general, problems that require a 'lexicographically least' solution suggest some sort of smart greedy solution, but this is not always the case.

We begin with a simple algorithm and fill in the blanks. The algorithm below keeps track of the position of the last character we used, and at each step, searches all subsequent characters and picks the 'best' one:

f( s )
  ans = "";
  last = 0;
  for t from 0 to length(s) / 2
    best = last;

    for i from last to length(s)
      if (we can append s[i] to ans) && (s[i] < s[best])
        best = i;

    ans += s[best];
    last = best;

The only missing piece is determining whether, at each step, we can append some \( s_i \) to \( ans \). As it turns out, the following condition is sufficient:

Let \( cnt( c, str ) \) denote the number of occurrences of \( c \) in some string \( str \). Then we can append \( s_i \) to \( ans \) if and only if for all \( c \), \[ cnt( c, ans ) + cnt( c, s_{ i, i + 1, \ldots, n - 1 } ) \geq cnt( c, s ) / 2 \]
The rationale for the above condition is the following. Let's examine each character that we have already added to \( ans \) (i.e. \( cnt( c, ans ) \)), as well as each character which we could potentially add to \( ans \) in the future (i.e. \( cnt( c, s_{ i, i + 1, \ldots, n - 1 } ) \) ). In order for us to produce a valid string in the end, these two numbers together must be at least equal to the number of characters we need in the final string, which is \( cnt( c, s ) / 2 \). This is both necessary and sufficient for \( s_i \) to be a valid addition to \( ans \).

Once we have the above condition, the pseudocode now looks like this:

f( s )
  ans = "";
  last = 0;
  for t from 0 to length(s) / 2
    best = last;

    for i from last to length(s)
      ok = true;
      for c from 'a'  to 'z'
        if cnt( c, ans ) + cnt( c, s.substr( i ) ) < cnt( c, s ) / 2
          ok = false;

      if ok && s[i] < s[best]
        best = i;

    ans += s[best];
    last = best;


We can solve this problem in a similar manner as we solved its division 2 version. All we have to do is to rotate the string, and compute the same recurrence relation as above. However, there is a more nifty solution that's trickier to see, but much, much easier to code. It turns out that SpellCards is equivalent to the knapsack problem, where each object \( i \) has weight \( lvl_i \) and value \( dmg_i \).

Proof: It suffices to show that a set of cards \( S = \{ x_1, x_2, \ldots, x_m \} \) can be played if and only if \[ \sum_{ x \in S } lvl_x \leq n \] One direction is clear -- all solutions to SpellCards must also be solutions to knapsack. For the converse, assume the above inequality holds for some set of cards \( S = \{ x_1, x_2, \ldots, x_m \} \).

Claim: There must be at least one card \( x_i \) that we can use without removing any other cards in \( S \) in the process.

Proof of claim: Assume the contrary -- that for each \( x_i \), there is some other \( x_j \) within the range \( [ x_i + 1, \ldots, x_i + lvl_{x_i} - 1 ] \) modulo \( n \). Then we have a collection of intervals \( [ x_i, x_i + 1, \ldots, x_i + lvl_{x_i} - 1 ] \) which not only cover all the cards, but also overlap. This implies that \[ \sum_{ x \in S } lvl_x > n \] which contradicts the assumption. Thus, there must be at least one card which we can use.

Given the claim, for every knapsack-satisfying set \( S = \{ x_1, \ldots, x_m \} \), we can use some \( x_i \in S \) without removing any of the other \( x_j \). This leaves us with \( n - lvl_{x_i} \) cards total. Furthermore, \[ \sum_{ x \in S \setminus \{ x_i \} } lvl_x = \left[ \sum_{ x \in S } lvl_x \right] - lvl_{x_i} \leq n - lvl_{x_i} \] I.e., after using \( x_i \), the set \( S \setminus \{ x_i \} \) still satisfies the knapsack inequality with the new set of cards. By induction, we can use all the cards if the initial set \( S \) satisfies the inequality. This completes the proof: we can solve SpellCards by simply applying a knapsack dynamic program with weights \( lvl_i \) and value \( dmg_i \).


The notion of an equivalence relation is a very useful tool from basic mathematics which can help us solve this problem. Equivalence relations allow us to define relationships between arbitrary subsets of a particular set \( S \) in terms of binary relations between individual elements, in effect turning \( O( 2^{|S|} ) \) amount of data into \( O( |S|^2 ) \).

In this problem, we would like to consider the following relation: for two given non-obstacle squares in the grid \( a \) and \( b \), place coins on both \( a \) and \( b \). We say \( a \sim b \) (or \( a \) is similar to \( b \)) if every sequence of moves that causes \( a \)'s coin to fall off also causes \( b \)'s coin to fall off, and vice versa. I.e., the starting configuration consisting of coins on \( a \) and \( b \) is not a good configuration. (The proof that this is an equivalence relation is left to the reader.)

Now suppose that from the above relation, we have equivalence classes \( C_1, C_2, \ldots, C_m \) that partition \( S \), the set of non-obstacle squares. By the way we defined the relation, an initial configuration is good if and only if it contains some squares in at least two classes. Then the total number of good configurations is the number of configurations containing only squares in one class, namely \[ 2^{|S|} - 1 - \sum_{i} \left( 2^{|C_i|} - 1 \right) \]

It remains to determine how we actually compute the equivalence classes. We do this by computing which pairs \( a, b \) are not similar, i.e. there exists some sequence of moves which causes \( a \)'s coin to fall off, but not \( b \)'s, or vice versa. It turns out that we can do this with a simple breath-first search (DFS will overflow the stack). Let \( prev( s, d ) \) denote the set of all squares \( t \) such that moving in direction \( d \) from \( t \) takes you to \( s \). Then the following algorithm computes whether or not pairs of squares are distinct:

boolean distinct[N][N];

list q;
for( square s0 )
for( square s1 )
  distinct[s0][s1] = false;
  for( each direction d )
    let s0' = move in direction d from s0;
    let s1' = move in direction d from s1;
    if( ( s0' is off the board ) ^ ( s1' is off the board ) )
      distinct[s0][s1] = true;
  if( distinct[s0][s1] )
    q.add( s0, s1 )

while( q is not empty )
  s0, s1 = q.pop();
  for( each direction d )
  for( t0 in prev( s0, d ) )
  for( t1 in prev( s1, d ) )
    if( !distinct[t0][t1] )
      distinct[t0][t1] = true;
      q.add( t0, t1 );
This algorithm runs in \( O( |S|^2 ) \); since \( S \leq 1600 \), this will complete in under the time limit.

Once we have computed which pairs are distinct, it is quite simple to figure out the equivalence classes and then compute the final answer.


  1. This is a very standard technique in designing amortised algorithms: you keep track of the 'costs' you incur, and then you figure out how to 'distribute' them optimally later. Resizing ArrayLists or vectors is a classic example of a commonly used amortised algorithm.

Saturday, November 17, 2012

General Impressions about InterviewStreet

As a longtime participant of programming competitions (ACM-ICPC, TopCoder, Code Jam, Codeforces, SPOJ, etc...), I initially found the idea of InterviewStreet to be promising. Like TopCoder and other similar competitions, it builds connections between companies and a wide talent pool. However, I feel the attention that InterviewStreet receives is undeserved. While it excels at attracting industry partners to sponsor it and use its platform, the quality of the competitor-facing side of the website is very lacking.

The main problem that I have with InterviewStreet is that the website's problem statements are very rough and unpolished, often lacking crucial details and having contradictory statements or examples. While it seems like the people behind InterviewStreet are familiar with other competitions, they are rather poor producing problem statements of the same quality. The writers' seem to have a poor grasp of English, and are very bad at expressing things in an unambiguous, mathematical manner, as is required by these types of competition sites. For example, consider the following screenshot of a problem statement:

When I see this image, I immediately notice a few things:

  1. The final sentence of the first paragraph asks the reader to "find out the number of zombies at every junction after 'k' timesteps". Many problem-solvers will realise that the problem asks for the expected value of the number of zombies, but to new users or to users for whom English is not a native language, this sentence can seem very confusing.
  2. It's not specified what a zombie will do in the case where there are no neighbouring junctions.
  3. The input format doesn't specify whether the numbers are separated by spaces, newlines, commas, or arbitrary characters. Once again, one can read the example input to figure this out, but this should not be necessary. The input format should completely describe all possible input.
  4. There are lots of stylistic issues. All the input variables are uppercase, except for 'k', which is lowercase. Word choice is not the best, and the grammar and spelling are terrible. All in all, these small details make the statement much harder to parse, unlike statements on other websites like TopCoder.

This is not an isolated case -- in the InterviewStreet Collegiate Challenge last spring, there were contradictions between the problem descriptions and the input, causing participants to lose valuable coding time. (Strangely enough, the results of the collegiate challenge are no longer posted on InterviewStreet's website.) Many of the other challenges on the website suffer from the same problem. The result of having poor problem statements is like having students take a poorly-written exam in a class -- less-talented students may stumble upon correct answers, which smarter students may fail. In this case success on InterviewStreet is as much dependent on being able to decipher poorly-written English as it is on having true programming skills. I do not exaggerate when I say that some of these problems contain the worst writing that I've ever seen in competitive programming. For comparison, look at the ACM-ICPC, TopCoder, Google Code Jam, or Kaggle. Should InterviewStreet wish to remain competitive (or be taken seriously, for that matter), it should put more effort into producing higher-quality problem statements.

InterviewStreet also fails to leverage the power of the programming community. One of Codeforces' first features when the launched in 2010 was to provide a blogging framework for its participants. Likewise, TopCoder has a wiki and forum, and hosts an extensive series of community-written articles. InterviewStreet has a forum, but few people seem to use it (perhaps because of the problems above).

Despite all of this, InterviewStreet does a good job in attracting industry partners, which I would attribute mostly to hype and rhetoric about its appeal as a 'startup' with potential to 'disrupt' the talent search process. If I were a company or startup looking for talent, I would turn instead to Kaggle, TopCoder, or Codeforces -- websites which provide great platforms for similar competitions, with well-established communities and some of the most impressive talent pools in the world. In the past I have heard that InterviewStreet is better for job-searching than TopCoder, but if the current trend continues, that will no longer be the case.

Thursday, October 4, 2012

Algebraic Type Systems / Combinatorial Species 101

This post is not meant to be a formal introduction into type theory or combinatorial species, but rather attempts to introduce some of the intuition behind the theory. Readers should have a basic understanding of naïve set theory, calculus, and data types in modern programming languages.


Let us begin by examining how types manifest themselves in programming. Many C-like programming languages provide basic types called int, float, char, and bool representing integers, floating point numbers, character literals, and booleans, respectively. Many languages also provide more complicated composite types, which are built up from simpler types. Simple examples include pair< x, y >, which consists of a value of type x and a value of type y, or a list< x > containing elements of type x. (In this article we will use uppercase letters \( X, Y, Z, \ldots \) to denote arbitrary types; feel free to substitute these abstract types with concrete types like int or char if it makes it easier to understand.)

In mathematical language, we could equivalently define \( P(X,Y) \) (representing a pair< x, y >) to be the set \( X \times Y \). Algebraic type theory allows us to represent these mathematical definitions as algebraic expressions, which we will call type expressions. In this case, the type expression for a pair is $$ P( X, Y ) = X \times Y = X Y $$ (Such a type is called a product type.) These expressions can be manipulated like normal algebraic expressions -- for example, a pair of two pairs pair< pair< x, x >, pair< y, z > > could be represented as $$ P( P( X, X ), P( Y, Z ) ) = ( X \times X ) \times ( Y \times Z ) = X^2 Y Z $$

A more complicated question is how to represent lists. If a list< x > contains exactly \( n \) elements, then it has the type \( X^n \); however, \( n \) is not fixed, so it is not clear how to represent lists of arbitrary length. We solve this problem by introducing another bit of notation, namely the \( + \) operator, representing alternation. The type expression \( T( A, B ) = A + B \) means that a variable of type \( T( A, B ) \) can take on a value of either type \( A \) or type \( B \). Mathematically, we could write \( T( A, B ) = A \sqcup B \), where \( \sqcup \) denotes the disjoint union operator. These sum types manifest themselves in the rarely-used union structure in C.

Now, equipped with this notation, we can write a type expression for list< x >: $$ L(X) = X^0 + X^1 + X^2 + X^3 + \ldots $$ Since \( X^n \) denotes a structure containing \( n \) values, the above expression indicates that lists can contain any nonnegative number of values. But wait, what does \( X^0 \) mean here? In normal arithmetic, \( X^0 = 1 \); as it turns out, this definition also makes sense in type expressions. We denote the symbol \( 1 \) to be a type which can take on exactly one, constant value. This is often called a unit type. Here \( X^0 = 1 \) denotes an empty list; since all empty lists can be regarded as equal (mathematically, at least), it makes sense that they form a unit type.

A unit type is trivial in the sense that it yields no additional information -- it can only be a constant value. Thus, when we write $$ Pair( 1, X ) = 1 \times X = X $$ the expression indicates that pair consisting of a constant and an \( X \) can be completely described by specifying a value in \( X \).

What else can we do with unit types? For starters, we could consider adding them together. As we all learned in kindergarten, $$ 1 + 1 = 2 $$ But what does this mean in terms of types? Well \( 2 \) seems to a type which can either be one constant, or a different constant; if we call the first constant true and the second constant false, then we see that \( 2 \) is the same as the boolean type. We can similarly represent the set of \( 32 \)-bit integers with the expression \( 1 + 1 + \ldots + 1 = 2^{32} \); we can view this expression as being able to choose one value out of \( 4294967296 \) possible options, or alternatively, fixing \( 32 \) bits, each of which can take on \( 2 \) values.

Just as we can define \( 1 \) to be the multiplicative identity, we can also define \( 0 \) to be the additive identity. I.e., for any \( X \), $$ 0 + X = X $$ \( 0 \) is referred to as the bottom type, and it is a type with no values whatsoever. Thus, the expression \( 0 + X \), meaning "either nothing or \( X \)", is the same as just \( X \).

Recursive Data Types and Generating Functions

Earlier we defined \( L( X ) \) with the following type expression: $$ L( X ) = 1 + X + X^2 + X^3 + \ldots $$ It is also possible to define lists recursively in the following manner: $$ L( X ) = 1 + X \times L( X ) $$ We read the expression thus: a list is either an empty list (the \( 1 \)), or a node which holds one element (the \( X \)) and points to a successor list (the \( L( X ) \)). This is the standard way of defining singly-linked lists in Lisp. Alternatively, in C++, this could be written

class node
 int* value;
 node* next;
typedef node* list;
list EMPTY = NULL;

What is interesting about this recursive definition is that it is exactly the same as the previous definition. If we repeatedly expand the expression, we get $$ L( X ) = 1 + X \times L( X ) = 1 + X ( 1 + X \times L( X ) ) = 1 + X + X^2 \times L( X ) = 1 + X + X^2 + X^3 \times L( X ) = \ldots $$

Furthermore, the type expression is a special polynomial called a generating function. A generating function is a polynomial, such that the coefficient of \( X^n \) is some special function of \( n \). Whenever we write a type expression for some container \( C( X ) \), the coefficient of \( X^n \) usually denotes the number of 'differently-shaped' containers that hold exactly \( n \) \( X \)'s. In this case, every singly linked list of length \( n \) has the same shape. Therefore, the coefficient of \( X^n \) is 1.

A simple data structure which could have many different shapes for some fixed size is a binary tree. For example, a binary tree of size \( 3 \) could have the following shapes:

If we let \( T( X ) \) denote the type expression for binary trees, then we know that the coefficient of \( X^3 \) in \( T( X ) \) is \( 5 \), since there are exactly \( 5 \) binary trees of size \( 3 \).

As with lists, we can write type expressions for binary trees. One possible way to define a binary tree is either an empty node, or a node containing an element and two children. This gives us the expression $$ T( X ) = 1 + X \times T( X )^2 $$ Notice that this is a standard quadratic equation in terms of \( T( X ) \). Suspending disbelief for a moment, we can actually solve for a solution to \( T( X ) \): $$ T( X ) = \frac{ 1 \pm \sqrt{ 1 - 4 X } }{ 2 X } $$ Here, the square root of \( 1 - 4 X \) obviously makes no sense in the standard numerical interpretation of the square root; rather, we take \( \sqrt{ 1 - 4 X } \) to mean the power series of \( X \), that when squared, is equal to \( 1 - 4 X \). In this case, we can simply take the Taylor series of the expression, which gives us $$ \frac{ 1 \pm \sqrt{ 1 - 4 X } }{ 2 X } = 1 \mp X \mp 2 X^2 \mp 5 X^3 \mp 14 X^4 \mp 42 X^5 \mp \ldots $$ The 'minus' root is the only one that makes sense here, so we have $$ T( X ) = 1 + X + 2 X^2 + 5 X^3 + 14 X^4 + 42 X^5 + \ldots $$

As a sanity check, we can verify that there is exactly \( 1 \) tree of size \( 0 \), \( 1 \) tree of size \( 1 \), \( 2 \) trees of size \( 2 \), and \( 5 \) trees of size \( 3 \). Astute readers may have already recognised that \( 1, 1, 2, 5, 14, 42 \) are the first few Catalan numbers, a sequence which appears often in combinatorics. One of the interesting properties of the Catalan numbers is that the \( n \)-th Catalan number corresponds exactly to the number of binary trees of size \( n \). By expanding a type expression for binary trees, we have come up with a generating function for Catalan numbers!

Differentiating Data Types

Surprisingly enough, we can actually perform a variation of differentiation on data structures. In functional programming, it is often useful to consider a data structure with a 'hole' punched in it; i.e., if we take a single value out of a container and mark the spot with a placeholder. For example, in a list< int >, we can take out an int at some point in the list and keep a pointer to the position where we removed the int. These modified data structures are called one-hole contexts because one you 'fill in' the hole with a concrete value, you get a data structure of the same shape as the original.

We begin by examining the lowly pair \( P( X, X ) \), which has the type \( X^2 \). Suppose we pluck out one of the values in the pair -- how do we represent what remains? If we remove either element, only one element remains; this can be represented with an \( X \). However, we also have to keep track of which element we removed -- a hole in the left element is not the same as a hole in the right element. Thus, the one-hole context of \( P( X, X ) \) is \( X + X = 2 X \). You may alternatively think of \( 2 X \) as a tuple containing a boolean (denoting whether the left or right element was removed) and an \( X \) (representing the remaining element). Interestingly enough, \( 2 X \) happens to be the derivative of \( X^2 \) with respect to \( X \). As it turns out, this is true in general -- it has been proven that one-hole contexts have an exact correspondence with the derivatives of type expressions!

A more complicated example is the list \( L( X ) \). First, let's compute the derivative of \( L( X ) \); we begin by rewriting $$ 1 + X + X^2 + X^3 + \ldots = \frac{ 1 }{ 1 - X } $$ Then, $$ L'( X ) = \frac{ d }{ d X } \frac{ 1 }{ 1 - X } = \left( \frac{ 1 }{ 1 - X } \right)^2 = L( X )^2 $$ Whenever we punch a hole in a list, it leaves a hole, along with the lists before and after it. Thus, the two lists in \( L( X )^2 \) represent, respectively, the prefix and suffixes of the one-hole context. We can thus 'fill in' the context and construct a new list by concatenating the prefix, some value \( X \), and the suffix.

Likewise, we can perform differentiation on binary trees: $$ T'( X ) = T( X )^2 + 2 X \times T( X ) \times T'( X ) $$ This definition is somewhat more cryptic, but it can be deciphered with the same type of analysis that we used earlier. Given the root node of a binary tree with a hole in it, either

  1. The hole lies in the root node, in which case our node has form \( T( X )^2 \).
  2. Or the hole lies in either the left or right subtree; in this case the root node can be described by specifying its content \( X \), a complete subtree \( T( X ) \), and an incomplete subtree \( T'( X ) \). Since there are two choices for where the hole lies, we multiply this term by 2.

Further Reading

  1. Conor McBride's original paper demonstrating that one-hole contexts are equivalent to derivatives.
  2. Wikipedia's articles on algebraic type theory and combinatorial species, an equivalent way of defining types.
  3. A series of two articles (1), (2) on Lab49 about differentiating data structures.

Sunday, July 22, 2012

TopCoder Arena Resizing Bug in xmonad/dwm/awesome/ratpoison (AKA The Java "Empty Grey Window" Bug)

What happened:

I recently switched to using xmonad as my main window manager, and I immediately noticed some issues with resizing the TopCoder Arena. Specifically, the main content area stayed the same size no matter how I resized the window itself. If I expanded the window, everything outside the content area would be blank grey pixels. As I soon found out, I wasn't the only person to have encountered this issue before -- the same situation is described on the TopCoder forums here and here (screenshot here).

The reason:

As it turns out, this bug applies to all Java applications running on many non-reparenting window managers (including dwm, awesome, ratpoison, etc...). According to the Arch wiki, Java has a hardcoded list of non-reparenting window managers; if you use anything outside of this list, then strange things happen.

Common fixes:

  • If you're using OpenJDK, fixing the problem should be relatively simple. You simply have to set _JAVA_AWT_WM_NONREPARENTING=1 in whichever shell you use and try logging out and logging back in. This method is independent of window manager, but I'm not sure if it works for Sun/Oracle Java.
  • You could try pretending to Java that you're using a supported non-reparenting window manager, such as LG3D. Tools such as SetWMName (xmonad only) and wmname allow you to do this.

Moar info:

  • As it turns out, there is another bug in xmonad which messes up window focus (using hotkeys, typing in text areas, etc...). A summary of the bug and some fixes can be found here. (Updated on 2012-12-02: Apparently this has been fixed by the latest commit to the DARCS version of XMonad.)
  • HaskellWiki's FAQ on xmonad
  • ArchWiki's pages on xmonad, awesome, and dwm

Monday, June 25, 2012

InterviewStreet: Going Office (AKA The Most Vital Arc Problem)


Going Office is definitely one of the tougher problems on InterviewStreet, but the main idea* behind the problem is simple: given a weighted undirected graph \( G = ( V, E ) \) and two nodes \( s, t \in V \), we wish to answer \( Q \) queries of the following form:

What is the shortest path from \( s \) to \( t \) given that edge \( e \in E \) is removed from the graph?
(This type of problem is also known as the "most vital arc" problem -- see the first footnote below.)

Let \( N = | V | \) and \( M = | E | \) as usual. In this case, we are guaranteed that \( N, M, Q \leq 200000 \).


You should definitely know Dijkstra's algorithm before attempting to solve this problem. Also, knowing about segment trees with lazy propagation would be good, but is not required to understand the solution.

Building Bridges

First of all, it should be clear that naively running Dijkstra's isn't going to run in time, because it requires \( O( Q N \log N ) \) operations. In fact, the size of the input suggests that we should look for at worst an \( O( N \log N ) \) solution. What can we do then? We may first try to solve an easier problem: can we determine whether an edge \( e \), when removed, will increase the shortest path length between \( s \) and \( t \)?

  1. An edge is optimal if it is on a shortest path from \( s \) to \( t \).
  2. An edge is a bridge* if it is on all shortest paths from \( s \) to \( t \). (Here we refer only to shortest paths in \( G \), without having removed any edges.)

Let \( d_s( u ) \) (resp. \( d_t( u ) \)) be the shortest-path distance from \( s \) (resp. \( t \)) to \( u \), and let \( OPT \) be the shortest-path distance from \( s \) to \( t \). Then we can show the following properties:

  1. \( e = ( u, v ) \) is optimal if and only \( d_s( u ) + length( e ) + d_t( v ) = OPT \).
  2. \( e = ( u, v ) \) is a bridge if and only if it is optimal and for all other optimal \( e' = ( u', v' ) \), we have either \( d_s( v' ) \leq d_s( u ) \) or \( d_s( v ) \leq d_s( u' ) \). In other words, when we travel along an optimal path from \( s \) to \( t \), then between lengths \( d_s( u ) \) and \( d_s( v ) \), we must be traveling along \( e \). By convention we denote a bridge with the letter \( b \).
  3. As a corollary of the above statement, \( e \) is a bridge if and only if, when removed from \( G \), the shortest path distance from \( s \) to \( t \) increases.
(Proofs are left to the reader.) These properties should give you some idea of how to find all the bridges in \( G \). If you still need some hints, an algorithm for computing bridges is included at the end of this post.


Now we have a way of identifying which edges, when removed, affect the shortest path length. This still leaves one major question unanswered: given a bridge \( b \), how exactly do we determine the shortest path length from \( s \) to \( t \) once \( b \) is removed? Running Dijkstra's repeatedly is too slow, since we can easily construct examples in which every edge is a bridge. (Consider a graph consisting of only a path from \( s \) to \( t \).)

Let \( K \) denote the number of bridges. Note that Lemma 2 above implies that we can uniquely order the bridges \( b_0, b_1, \ldots, b_{K-1} \) based on their relative distances from \( s \). Could we use this to help us?

Definition: Let the \( i \)-th island be defined as the set of all vertices \( v \), such that there exists a shortest path from \( s \) to \( v \) using no more than \( i \) bridges. (If \( v \) is not connected to either \( s \) or \( t \), then we can ignore it -- it lies on some strange continent somewhere far away.)
The intuition behind islands and bridges is this: bridges are the fastest way to "travel" between the islands when going from \( s \) to \( t \). If we remove an edge that is not a bridge, our shortest path is not affected. However, when we take out a bridge, we are forced to travel along another (possibly much longer) path.
Proposition: Consider the bridge \( b_i \), connecting the \( i \)-th and \( i + 1 \)-st islands. Let \( E_i \) be the set of all edges \( e \) connecting an island with index \( \leq i \) to an island with index \( > i \). Then the shortest path from \( s \) to \( t \) that bypasses \( b_i \) must have length \[ \min_{ e = ( u, v ) \in E_i } \{ d_s( u ) + length( e ) + d_t( v ) \} \]
Proof: Any path \( P \) from \( s \) to \( t \) bypassing \( b_i \) must include some edge in \( E_i \). Now consider some edge \( e = ( u, v ) \in E_i \) -- the shortest path from \( s \) to \( t \) that goes through \( e \) must have length \( d_s( u ) + length( e ) + d_t( v ) \). Lastly, by the way we have constructed the islands, we know that \( b_i \) is not on the shortest paths from \( s \) to \( u \) or from \( v \) to \( t \).

A Data Structure for Bypass Length

Let \( bypass( i ) \) denote the length of the shortest path that bypasses \( b_i \). The above observations suggest that we can run the following algorithm to compute \( bypass( i ) \):

For each non-bridge edge \( e = ( u, v ) \) connecting islands \( i \) and \( j \), where \( i < j \):
For each \( k \in \{ i, i + 1, \ldots, j - 1 \} \):
\( bypass( k ) \leftarrow \min \{ bypass( k ), d_s( u ) + length( e ) + d_t( v ) \} \)
However, there is a slight problem with this algorithm: it's too slow. Through some tricky analysis, you can show that in the worst case, we can construct a graph that will cause the above subroutine to take \( O( M^{3/2} ) \) operations.

What we need is a data structure that supports the following two operations efficiently:

  1. \( update( i, j, val ) \), which runs \( bypass( k ) \leftarrow \min \{ bypass( k ), val \} \) for all \( k \in \{ i, i+1, \ldots, j-1 \} \)
  2. \( query( i ) \), which returns \( bypass( i ) \)

One possible data structure that can support both operations in \( O( \log K ) \) time is called a segment tree with lazy propagation or a segment tree with range update. There are several tutorials online about this type of tree, and it's a good data structure to be familiar with. (The main problem right now is that the structure is rather complicated and the online tutorials are not very good. I may do an article on it some time, if there is sufficient demand.)

However, if you are lazy, there is another data structure called the Kempe tree, which is a bit simpler and can support the same sort of queries, also using \( O( \log K ) \) operations. Check the link for more information.

Putting It All Together

The previous observations suggest the following algorithm for solving the problem:

  1. Compute \( d_s \) and \( d_t \) using Dijkstra's algorithm.
  2. Find the bridges:
    1. First find all the optimal edges by iterating over all edges \( e = ( u, v ) \), and storing the edges such that \( d_s( u ) + length( e ) + d_t( v ) = OPT \).
    2. Sort all the optimal edges by \( d_s( u ) \) and \( d_s( v ) \). Use the criterion from above to find the bridges.
  3. Find the islands.
    1. One way (the hard way) is to run a modified version of Dijkstra.
    2. A smarter and shorter alternative is to just run multiple depth-first searches from \( s \) and from the far end of each bridge. It's up to the reader to figure out how to do so.
  4. Find the bypassing paths.
    1. Initialise your favourite range-update data structure.
    2. Iterate over all the non-bridge edges. If \( e = ( u, v ) \) is an edge crossing from island \( i \) to island \( j > i \), then range-update \( bypass(i), bypass(i+1), \ldots, bypass(j-1) \) such that they are no more than \( d_s( u ) + length( e ) + d_t( v ) \).
  5. Process all the queries. If the edge in question is not a bridge, then we simply return the optimal length. Otherwise, we query the range-update data structure to find the best bypass length.
All of the above steps run in require \( O( R \log R ) \) operations where \( R = max\{ N, M, Q \} \), so the algorithm has complexity \( O( R \log R ) \).

The solution took me around 150 lines to implement with a decent amount of spacing and a relatively clean coding style. I'm somewhat suspicious, though, that I've overcomplicated the solution, and that a nicer solution exists. Message me or comment here if you have any other alternatives.


  1. See this paper for an example of previous work on the subject.
  2. See also this page.

The Kempe Tree: A Simple Data Structure for Supporting Range Updates or Queries


This post describes two variations on tree structure that I've used in some programming problems for solving variations of the range query/update problem. I haven't seen anyone else use it before, so for the moment I'm going to call it the Kempe tree, named after my great-grandfather who was also named Richard Kempe.

Suppose we are given a list of data \( a_0, a_1, \ldots, a_{n-1} \). In general terms, range update data structure supports the following type of operation: \[ update( i, j, x ): a_k \leftarrow x \text{ for all } k \in \{ i, i+1, \ldots, j \} \] Similarly, a range query data structure supports the following type of operation: \[ query( i, j ) = f( a_i, a_{i+1}, \ldots, a_{j} ) \] (Here \( f \) is taken to be some function we can aggregate over a large number of values, like \( \min \), \( + \), etc....) Both of these operations are expected to be reasonably fast, usually taking at most \( O( n ) \) steps.

As an example, segment trees and Fenwick trees (or binary-indexed trees) are range query data structures. Segment trees with lazy propagation support both range queries and range updates.

The Kempe tree is a data structure that supports either fast range updates and single-element queries, or fast range queries and single-element updates, but not both simultaneously. (Okay, fine, it can support both, provided one is \( O( \log^2 n ) \). But more on that later.) Why, then, should I use a Kempe tree, you ask? The main reason is that it uses numerical tricks to make it easier to code, the same reason why the Fenwick tree is sometimes used instead of the segment tree. (This also means that, for all intents and purposes, Kempe trees will never be used outside of competition programming.)

Range Traversals

The Kempe tree, like the Fenwick tree, is an array-based tree. Suppose we have \( n \) data entries, a[0], a[1], ..., a[n-1]; then we require an array of size \( N + n \), where \( N \) is the smallest power of \( 2 \) that is \( \geq n \). Let tree[i] denote this array. The entries of tree[i] are allocated as follows:

  • Elements a[0], a[1], ..., a[n-1] are stored in entries tree[N], tree[N+1], ..., tree[N+n-1], respectively.
  • The parent node for any entry tree[i] is tree[i/2].
  • Likewise, a non-leaf node tree[i] has children tree[2*i] and tree[2*i+1].
  • The root of the tree is tree[1].
See the following image for a description of the mapping. Numbers in the tree nodes denote position in the array tree[i].

We say a tree node governs the array entries that are its descendants. For some range a[i], a[i+1], ..., a[j], the following algorithm finds the minimal set of tree nodes tree[x[0]], tree[x[1]], ..., tree[x[m-1]] that govern a[i], a[i+1], ..., a[j]:

vector find_governing_nodes( int i, int j )
  vector ans;
  i += N, j += N;
  while( i <= j )
    if( i % 2 == 1 ) ans.push_back( i );
    if( j % 2 == 0 ) ans.push_back( j );
    i = ( i + 1 ) / 2;
    j = ( j - 1 ) / 2;
  return ans;

A sketch of the proof for the correctness of the algorithm is the following. We start with i and j at the ends of the interval we are interested in. We process the interval by moving i and j up the tree, and by 'pulling' them together. If moving either i or j shrinks the interval during the next step, then we add them to the list of governing nodes. The following image illustrates the process in action:

  1. We wish to query the range \( [1, 5] \). i and j are initialised to 9 and 13, respectively.
  2. Since 9 would jump out of its subtree during the next move, we add 9 to the list of governing nodes.
  3. We move i to 5 and j to 6. Both i and j will jump out of their respective ranges during the next move, so we add both 5 and 6 to the list of governing nodes.
  4. i gets moved to 3, and j to 2. This is an invalid range, so we break and return the list [9, 5, 6].

Range Queries and Singleton Updates

We now present an example of a tree structure that can handle updates to single elements and range queries. The function we will try to query is the sum function.

// define MAX to be some really big number

int tree[MAX], N;

// initialises a tree for n data entries
int init( int n )
  N = 1;
  while( N < n ) N *= 2;
  for( int i = 1; i < N + n; i++ ) tree[i] = 0;

// compute the product a[i] + a[i+1] + ... + a[j]
int range_sum( int i, int j )
  int ans = 1;
  for( i += N, j += N; i <= j; i = ( i + 1 ) / 2, j = ( j - 1 ) / 2 )
    if( i % 2 == 1 ) ans += tree[i];
    if( j % 2 == 0 ) ans += tree[j];
  return ans;

// set a[i] = val
void update( int i, int val )
  int diff = val - tree[i+N];
  for( int j = i + N; j; j /= 2 ) tree[j] += diff;

Range Updates and Singleton Queries

The following example will allow us to increment a range, while allowing us to query the result on a singleton value.

// define MAX to be a really big number

int tree[MAX], N;

// initialises a tree for n data entries
int init( int n )
  N = 1;
  while( N < n ) N *= 2;
  for( int i = 1; i < N + n; i++ ) tree[i] = 0;

// increments a[k] by val for all k between i and j, inclusive
void range_increment( int i, int j, int val )
  for( i += N, j += N; i <= j; i = ( i + 1 ) / 2, j = ( j - 1 ) / 2 )
    if( i % 2 == 1 ) tree[i] += val;
    if( j % 2 == 0 ) tree[j] += val;

// compute a[i]
int query( int i )
  int ans = 0;
  for( int j = i + N; j; j /= 2 ) ans += tree[j];
  return ans;

Final Remarks

We can easily extend the tree to support many types of functions, such as the product function, min, max, set union, set intersection, and so forth.

Unfortunately, under the current scheme, the tree cannot simultaneously support range updates and queries. There is a way to support both operations, provided that one operation takes \( O( \log n ) \) and the other takes \( O( \log^2 n ) \). (The proof of this is left to the reader.) Thus, Kempe trees do not provide as much functionality as segment trees, but are generally somewhat easier to code.

Thursday, May 31, 2012

Numbers Are Friends, Not Food: Unfriendly Numbers from InterviewStreet

Today we're going to go over Unfriendly Numbers, an interesting InterviewStreet problem involving a little number theory.



This article assumes that you are familiar with the Euclidean algorithm, as well as the naive \( O(\sqrt{N}) \) algorithms for integer factorisation and prime sieving.

Bounding the number of prime factors

Suppose we are given some number \(N\) between \(1\) and \(MAX\), inclusive, and we want to know the maximum number of prime factors that \(N\) can have.

We define the function \(primorial(K)\) (see here) as the product of the first \(K\) primes. It should be clear that \(primorial(K)\) is the smallest number with exactly \(K\) distinct prime factors. Thus, the maximum number of prime factors that \(N\) can have is simply the largest value of \(K\) such that \(primorial(K) \leq MAX\).

Since \(primorial(K)\) grows more quickly than \(K!\), then it should be clear that the bound on the number of unique prime factors grows extremely slowly. I.e., we can expect very large numbers to have relatively few numbers of prime factors.

Bounding the number total divisors

Now let us look at a harder problem: suppose we are given a number \(N\) between \(1\) and \(MAX\), and we want to determine the maximum number of total divisors that \(N\) can have. Define the divisor function, or \(d(N)\) as the number of divisors of \(N\). According to this Wikipedia article, the divisor function grows more slowly than \( N^\epsilon \) for all \( \epsilon > 0 \).

However, in programming competitions, this may not be a good enough estimate: more specifically, can we actually compute \( max \{ d(N) \} \) for all \(N\) in our range? Let us begin by introducing some more terminology: a highly composite number is an integer which has more factors than any smaller integer. We can then restate problem as computing the number of factors of \(M\), where \(M\) is the largest highly composite number between \(1\) and \(MAX\).

As it turns out, there is a fast algorithm for computing the largest highly composite number. We begin by first examining a few lemmata. Here we adopt the convention that \(p_i\) refers to the \(i\)-th prime. I.e. \( p_1 = 2 \), \( p_2 = 3 \), \( p_3 = 5 \), and so forth.

  1. Lemma 1: Suppose \(M\) is an integer with factorisation \( p_1^{e_1} p_2^{e_2} p_3^{e_3} \ldots \). Then $$ d(M) = \prod_i ( e_i + 1 ) $$

    Proof: If \(K\) is a divisor of \(M\) with factorisation \( p_1^{f_1} p_2^{f_2} p_3^{f_3} \ldots \), then for all \(i\), we have \( 0 \leq f_i \leq e_i \). There are thus \( e_i + 1 \) possibilities for each \( f_i \), so a total of \( \prod_i ( e_i + 1 ) \) possible factors.

  2. Lemma 2: Suppose \(M\) is a highly composite number with factorisation \( p_1^{e_1} p_2^{e_2} p_3^{e_3} \ldots \). Then \( e_i \geq e_{i+1} \) for all \( i \).

    Proof: Suppose not: i.e. \( e_i < e_{i+1} \) for some \(i\). Swap the exponents and let \(K = p_1^{e_1} p_2^{e_2} \ldots p_i^{e_{i+1}} p_{i+1}^{e_i} \ldots \). Then \( K < M \), but by our previous lemma, \( d(K) = d(M) \). Thus, \(M\) is not highly composite.

By Lemma 2, we know that all highly composite numbers, when factored, have non-increasing sequences of exponents. Could we, then, take a dual approach and search over all non-increasing sequences of exponents to find highly composite numbers? The following recursive algorithm is an implementation of that idea in C++:

// algorithm for finding highly composite numbers

pair< long long, int > rec( long long bound, int index, long long prod, int max_pow )
 long long best = prod;
 int num_div = 1;
 prod *= primes[index];
 for( int i = 1; i <= max_pow && prod <= bound; i++, prod *= primes[index] )
  pair< long long, int > next = rec( bound, index + 1, prod, i );
  if( ( i + 1 ) * next.second > num_div )
   best = next.first;
   num_div = ( i + 1 ) * next.second;
 return make_pair( best, num_div );

pair< long long, int > largest_highly_composite( long long bound )
 return rec( bound, 0, 1, INF );
Here, the first term returned, \(M\), is the largest highly composite number less than or equal to bound, and the second term returned is \(d(M)\). The array primes stores a pre-generated array of primes.

The recursion depth is bounded by the number of unique prime factors of any number less than or equal to bound, which we determined earlier was very small. Furthermore, the branching factor is bound by the logarithm of bound, as well as some other factors. In practice, the algorithm runs almost instantaneously for any long long, though there are some overflow issues with the larger numbers.

InterviewStreet Challenge - Unfriendly Numbers

Problem Link

The problem statement is simple enough: count the number of factors of \(K\) that are not factors of any number in \(B_1, B_2, B_3, \ldots, B_N\). We are given the following constraints:

  • \(1 \leq N \leq 10^6\)
  • \(1 \leq K \leq 10^{13}\)
  • \(1 \leq B_i \leq 10^{18}\) for all \(i\)

First, as always, we consider the naive solution: simply generate all the factors of \(K\) and for each factor \(F_i\), we check whether or not it is a factor of each \(B_i\). Unfortunately, there can be up to a million \(B_i\), so checking each \(F_i\) is going to be relatively slow. Let us try using our previous code to estimate the maximum number of \(F_i\). Plugging in largest_highly_composite(1e13) reveals that the largest highly composite number in the range is \(9316358251200\), which has \(10752\) factors. This is a small number, but certainly not small enough to check every pair of \(F_i\) and \(B_j\). We have to try something smarter.

A digression: numbers as vectors

There are many ways to see the next step of the solution; in my case, I relied on geometric intuition, though I'm sure other people did it differently. We begin by considering each integer \( N = p_1^{e_1} p_2^{e_2} \ldots \) as a vector, where the \(i\)-th entry of the vector is \( e_i \). For example, consider numbers that are products of powers of \(2\) and \(3\) only. The vectors corresponding to these numbers have only two nonzero components, and thus can be plotted on the coordinate plane:

Here \(3\) is plotted at \( (0,1) \) indicating that \( 3 \) has the factorisation \( 2^0 3^1 \). Likewise, \( 12 = 2^2 3^1 \) is plotted at \( (2,1) \), and so on. If we include more prime factors, then we'd need more dimensions to represent the numbers. Unfortunately, I'm not too good at drawing things in 12-dimensional space, so I'll leave it as an exercise for you, the reader.

Using this geometric representation, saying that \(a\) is a divisor of \(b\) is equivalent to saying that \(a\) is contained in the box with one corner at \(0\), and another corner at \(b\). That is, if \(F(n)\) is the set of factors of \(n\) then \(F(n)\) appears on the grid as the box bounded by \(0\) and \(n\). Similarly, the greatest common divisor of \(a\) and \(b\) is the `largest' point which appears in the boxes \(F(a)\) and \(F(b)\), and the least common multiple is the `smallest' point whose box contains \(a\) and \(b\).

Now let's consider the problem statement again: we need to find the number of points in \(K\)'s box that are completely outside of the box of any \(B_i\). This is still a hard problem, but now we have a way of visualising it. Let us return to the diagram; suppose \( K = 12 \) and \( B_1 = 18 \):

We want to count the number of points in \(F(12)\) (the green region) that are not in \(F(18)\) (the blue region). In this case, the valid points are \( 12 \) and \( 4 \), and the invalid points are \(1\), \(2\), \(3\), \(6\), \(9\), and \(18\).

Note, however, that we didn't really need to consider all the points in \(F(18)\), as some of them are outside of \(F(12\)) already. Instead, it would have been equivalent to count the points inside \(F(12)\) which were outside of \( F(18) \cap F(12) = F(6) \). More generally, this means that we need to count the points in \(F(K)\) that are not in \( F(K) \cap F(B_i) \) for any \(i\). How does this help us? Well, we showed earlier that \(F(K)\) contained around \(10000\) points. Thus, all the one million \(F(B_i)\) get mapped into at most \(10000\) distinct \( F(K) \cap F(B_i) \).

By this point, you, the astute reader, may have already realised that \( F(K) \cap F(B_i) \) is simply \( F( gcd( K, B_i ) ) \). This suggests the following algorithm:

  1. First, generate all the factors of \(K\). This can be done naively in \(O(\sqrt{K})\) time.
  2. For each \( i \), compute \( gcd( K, B_i ) \) and store it in a set called \( bad \). This takes \( O(N \log K ) \) for \( N \) bad numbers.
  3. Lastly, we compute the answer by finding the number of factors in \( F(K) \) that are factors of no numbers in \( bad \). Since both sets contain at most \( F(K) \) numbers, then this takes \( O( |F(K)|^2 ) \) time.

And there you have it! As usual, being able to solve problems like these requires a solid understanding of the background material, as well as a little bit of insight or elbow grease. A little bit of number-theoretical intuition helps a great deal in programming contests, and as always, the best way to gain said intuition is just to solve more problems.


Originally I intended to also do a writeup of another excellent problem, EllysNumbers, the Division 1, Level 2 problem from TopCoder SRM 534. However, I am lazy and the people at TopCoder have already written up a good tutorial here. Try solving it yourself!

Tuesday, May 22, 2012

"Мне до лампочки": Cold War Propaganda, Language Learning, and Light Bulbs

In learning a language, it's often the case that we encounter new idioms or expressions that make little sense without historical background or further context. (E.g. this expression, and for a more extreme example, this episode of Star Trek: TNG.) Such was the case when I was learning Russian a few years ago, and I happened upon the expression "(это) мне до лампочки", lit. "it's just like the light bulb to me". Figuratively, this is taken to mean that the person does not care much about the thing to which he is referring. This idiom is apparently relatively new, taking hold within the past century or so, and two of my professors have given me rather differing information about the etymology of this expression.
  1. The first professor, an American, explained to me that during one of the many Soviet industrialisation programmes, there was a drive to provide modern lighting systems to the general populace. However, because of inadequacies in the Soviet system, this programme took far longer to complete than originally envisioned, so the phrase "мне до лампочки" entered Russian vernacular as an expression describing something pointless or worthless.
  2. In a later encounter, a second professor, a Russian, refuted the first professor's claim as American propaganda leftover from the Cold War. According to him, the true origin of the phrase is the following: In earlier times, miners often fastened mining lamps to themselves via straps or harnesses. In Russia, one method of wearing the lamp involved fastening the lamp near one's crotch. Thus, "мне до лампочки" actually arose as a miners' euphemism for a more vulgar phrase, later being adopted by the Russian-speaking population as a whole.
While both explanations are plausible, a quick Google search for "мне до лампочки этимология [etymology]" lends more support to the latter theory. This naturally leads to the question: supposing the first explanation was indeed false, how and why did it arise? This may be hard to determine without much more research, but it seems clear that even in fields such as etymology, political and cultural biases can surface.