Sunday, November 30, 2014

C++ Specific Questions - Overloading VS Overriding

Ref: http://stackoverflow.com/questions/429125/override-and-overload-in-c

Overloading generally means that you have two or more functions in the same scope having the same name. The function that better matches the arguments when a call is made wins and is called. Important to note, as opposed to calling a virtual function, is that the function that's called is selected at compile time. It all depends on the static type of the argument. If you have an overload for B and one for D, and the argument is a reference to B, but it really points to a D object, then the overload for B is chosen in C++. That's called static dispatch as opposed to dynamic dispatch. You overload if you want to do the same as another function having the same name, but you want to do that for another argument type. Example:
void print(Foo const& f) {
    // print a foo
}

void print(Bar const& bar) {
    // print a bar
}
they both print their argument, so they are overloaded. But the first prints a foo, and the second prints a bar. If you have two functions that do different things, it's considered bad style when they have the same name, because that can lead to confusion about what will happen actually when calling the functions. Another usecase for overloading is when you have additional parameters for functions, but they just forward control to other functions:
void print(Foo & f, PrintAttributes b) { 
    /* ... */ 
}

void print(Foo & f, std::string const& header, bool printBold) {
    print(f, PrintAttributes(header, printBold));
}
That can be convenient for the caller, if the options that the overloads take are often used.
Overriding is something completely different. It doesn't compete with overloading. It means that if you have a virtual function in a base class, you can write a function with the same signature in the derived class. The function in the derived class overrides the function of the base. Sample:
struct base {
    virtual void print() { cout << "base!"; }
}

struct derived: base {
    virtual void print() { cout << "derived!"; }
}
Now, if you have an object and call the print member function, the print function of the derived is always called, because it overrides the one of the base. If the function print wasn't virtual, then the function in the derived wouldn't override the base function, but would merely hide it. Overriding can be useful if you have a function that accepts a base class, and every one that's derived from it:
void doit(base &b) {
    // and sometimes, we want to print it
    b.print();
}
Now, even though at compile time the compiler only knows that b is at least base, print of the derived class will be called. That's the point of virtual functions. Without them, the print function of the base would be called, and the one in the derived class wouldn't override it.



You overload functions for three reasons:
  1. To provide two (or more) functions that perform similar, closely related things, differentiated by the types and/or number of arguments it accepts. Contrived example:
    void Log(std::string msg); // logs a message to standard out
    void Log(std::string msg, std::ofstream); // logs a message to a file
  2. To provide two (or more) ways to perform the same action. Contrived example:
    void Plot(Point pt); // plots a point at (pt.x, pt.y)
    void Plot(int x, int y); // plots a point at (x, y)
  3. To provide the ability to perform an equivalent action given two (or more) different input types. Contrived example:
    wchar_t      ToUnicode(char c);
    std::wstring ToUnicode(std::string s);
In some cases it's worth arguing that a function of a different name is a better choice than an overloaded function. In the case of constructors, overloading is the only choice.

Overriding a function is entirely different, and serves an entirely different purpose. Function overriding is how polymorphism works in C++. You override a function to change the behavior of that function in a derived class. In this way, a base class provides interface, and the derived class provides implementation.

Microsoft Onsite Interview Checklist

1. Resume preparation

2. Basic data structure/algorithm

3. Design question

4. Behavior question

5. C++ language specific question

6. Knowledge about ASP.NET

基础data structure/algorithm列表

1. Implementation of stack/queue (array and linked list)
Implement queue using array: circular array - when meet end, wrap around to the beginning.
An array with size n can implement an queue with size n-1 (related to the queue full condition)
Queue empty: head == tail
Queue full: head == tail+1 or head == tail wrap around
Enqueue: tail++ or wrap around
Dequeue: head++ or wrap around

2. Binary search tree: insertion, deletion, successor, traversal, balance.
(1) Insertion: always insert to leaf; need to find the parent of the insertion position, O(h).
(2) Deletion: assume delete z.
z has not child: just delete.
z has only one child: replace z by its child.
z has both left/right child: find leftmost node in z's right subtree y. If y==z.right, replace z by y. Otherwise replace y by y's right child, and then replace z by y.
http://geeksquiz.com/binary-search-tree-set-2-delete/
(3) Successor: http://www.geeksforgeeks.org/inorder-successor-in-binary-search-tree/
(4) Preorder, in-order, and post-order traversal using iterative methods.

3. Heap: creation, insertion.
(1) Concept: nearly complete binary tree, implemented using array
Parent-child relation: for 0-based array (A[0:n-1])
parent(i) = (i-1)/2
left(i) = 2*i+1
right(i) = 2*i+2
(2) Properties:
Max heap: A[parent(i)] >= A[i] - heap sort
Min heap: A[parent(i)] <= A[i] -  priority queue
(3) Heapify: find the largest among y = {A[i], A[left(i)], A[ right(i)]}.
If y = A[i], done; else, swap A[i] with y, and heapify y
(4) Build max heap: for each i from heap.size/2  to 0, heapify(i).

4. Hash table: array + linked list implementation, collision

5. Sorting: (1) merge sort, (2) quick sort, (3) counting sort
Merge sort:
http://en.wikibooks.org/wiki/Algorithm_Implementation/Sorting/Merge_sort
http://www.sanfoundry.com/cpp-program-implement-merge-sort/
Quick sort:
http://en.wikibooks.org/wiki/Algorithm_Implementation/Sorting/Quicksort
https://gsamaras.wordpress.com/code/quicksort-c/

6. Master theorem
Key: compare logb(a) and c: http://en.wikipedia.org/wiki/Master_theorem

1. Stack implementation using array in C++:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
#include <string>
using namespace std;
 
class Stack {
private:
      int top;
      int capacity;
      int *storage;
public:
      Stack(int capacity) {
            if (capacity <= 0)
                  throw string("Stack's capacity must be positive");
            storage = new int[capacity];
            this->capacity = capacity;
            top = -1;
      }
 
      void push(int value) {
            if (top == capacity)
                  throw string("Stack's underlying storage is overflow");
            top++;
            storage[top] = value;
      }
 
      int peek() {
            if (top == -1)
                  throw string("Stack is empty");
            return storage[top];
      }
 
      void pop() {
            if (top == -1)
                  throw string("Stack is empty");
            top--;
      }
 
      bool isEmpty() {
            return (top == -1);
      }
 
      ~Stack() {
            delete[] storage;
      }
};

2. Min Binary Heap Implementation in C++:

http://www.codeproject.com/Tips/816934/Min-Binary-Heap-Implementation-in-Cplusplus

面试中的clarifying questions

面试题经常有意无意字面上很含糊,这个时候一定需要和面世官交流搞清楚确切的意思。总结一下每个topic需要澄清的地方:

1. Array:
(1) Sorted or not?
(2) How many elements?
(3) Element type? Int, float, double?
(4) What's the range of those numbers? Positive or negative?
(5) Contain duplicates or not?
(6) Subsequence: adjacent or not?

2. Binary tree:
(1) Binary search tree or normal binary tree?
(2) Balanced or not?
(3) Complete or not?
(4) Has parent pointer or not?

3. Linked list:
(1) Singly or doubly linked list?
(2) Has duplicated nodal value or not?

4. String:
(1) Need to remove white spaces? Tab and newline?
(2) Only has digits? English letters? Upper or lower case?

5. Graph:
(1) How many nodes and edges?
(2) Directed or undirected?
(3) Edges have weights? If so, what's the range?
(4) Has loops? Negative sum loops?

6. Return value:
(1) What should my method return?
(2) If there are multiple solutions to the problem, which one should be returned?
(3) If it should return multiple values, do you have any preference on what to return?
(4) What should I do/return if the input is invalid / does not match the constraints?





Saturday, November 29, 2014

[LeetCode] Wildcard Matching

Implement wildcard pattern matching with support for '?' and '*'.
'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false



思路1:DP

和Regular Expression Matching很像,这里的'?'相当于Regular Expression中的'.',但'*'的用法不一样。这里'*'与前一个字符没有联系,并且无法消去前一个字符,但可以表示任意一串字符。递推公式的推导和Regular Expression Matching也基本类似。

p[j-1] == s[i-1] || p[j-1] == '?':dp[i][j] = dp[i-1][j-1]
p[j-1] == '*':
1. 匹配0个字符:dp[i][j] = dp[i][j-1]
2. 匹配1个字符:dp[i][j] = dp[i-1][j-1]
3. 匹配多个字符:dp[i][j] = dp[i-1][j]

先用二维数组写,发现会Memory Limit Exceeded

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        int m = strlen(s), n = strlen(p);
        vector<vector<bool>> dp(m+1, vector<bool>(n+1,false));
        dp[0][0] = true;
        for(int i=0; i<m; i++) {
            for(int j=1; j<n; j++) {
                if(p[j-1]==s[i-1] || p[j-1]=='?') 
                    dp[i][j] = i>0 && dp[i-1][j-1];
                else if(p[j-1]=='*') 
                    dp[i][j] = dp[i][j-1] || (i>0 && (dp[i-1][j-1] || dp[i-1][j]));
            }
        }
        return dp[m][n];
    }
};


然后用一维滚动数组改写,发现会Time Limit Exceeded


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        int m = strlen(s), n = strlen(p);
        vector<bool> dp(m+1, false);
        for(int i=0; i<m; i++) {
            bool diag = dp[0];
            dp[0] = i==0 ? true : false;
            for(int j=1; j<n; j++) {
                int temp = dp[j];
                if(p[j-1]==s[i-1] || p[j-1]=='?') 
                    dp[j] = i>0 && diag;
                else if(p[j-1]=='*') 
                    dp[j] = dp[j-1] || (i>0 && (diag || dp[j]));
                diag = temp;
            }
        }
        return dp[n];
    }
};


思路2:双指针扫描

用DP会TLE,那么一定有其他好的方法。



[LeetCode] Regular Expression Matching

Implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true


思路1: DP

关键在于如何处理这个'*'号。

状态:和Mininum Edit Distance这类题目一样。
dp[i][j]表示s[0:i-1]是否能和p[0:j-1]匹配。

递推公式:由于只有p中会含有regular expression,所以以p[j-1]来进行分类。
p[j-1] != '.' && p[j-1] != '*':dp[i][j] = dp[i-1][j-1] && (s[i-1] == p[j-1])
p[j-1] == '.':dp[i][j] = dp[i-1][j-1]

而关键的难点在于 p[j-1] = '*'。由于星号可以匹配0,1,乃至多个p[j-2]。
1. 匹配0个元素,即消去p[j-2],此时p[0: j-1] = p[0: j-3]
dp[i][j] = dp[i][j-2]

2. 匹配1个元素,此时p[0: j-1] = p[0: j-2]
dp[i][j] = dp[i][j-1]

3. 匹配多个元素,此时p[0: j-1] = { p[0: j-2], p[j-2], ... , p[j-2] }
dp[i][j] = dp[i-1][j] && (p[j-2]=='.' || s[i-2]==p[j-2])

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        int m = strlen(s), n = strlen(p);
        vector<vector<bool>> dp(m+1, vector<bool>(n+1,false));
        dp[0][0] = true;
        
        for(int i=0; i<=m; i++) {
            for(int j=1; j<=n; j++) {
                if(p[j-1]!='.' && p[j-1]!='*') {
                    if(i>0 && s[i-1]==p[j-1] && dp[i-1][j-1])
                        dp[i][j] = true;
                }
                else if(p[j-1]=='.') {
                    if(i>0 && dp[i-1][j-1])
                        dp[i][j] = true;
                }
                else if(j>1) {  //'*' cannot be the 1st element
                    if(dp[i][j-1] || dp[i][j-2])  // match 0 or 1 preceding element
                        dp[i][j] = true;
                    else if(i>0 && (p[j-2]==s[i-1] || p[j-2]=='.') && dp[i-1][j]) // match multiple preceding elements
                        dp[i][j] = true;
                }
            }
        }
        return dp[m][n];
    }
};


思路2: 双指针扫描

LeetCode作者给的解法,非常巧妙:

http://leetcode.com/2011/09/regular-expression-matching.html

Thursday, November 27, 2014

[LeetCode] Max Points on a Line

Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.

思路:

解这个平面几何题有3个要点:

1. 如何判断共线?
两点成一直线,所以两点没有共线不共线之说。对于点p1(x1, y1),p2(x2, y2),p3(x3, y3)来说,共线的条件是p1-p2连线的斜率与p1-p3连线的斜率相同,即
(y2-y1)/(x2-x1) = (y3-y1)/(x3-x1)
所以对共线的n点,其中任意两点连线的斜率相同。

2. 如何判断最多的共线点?
对于每个点p出发,计算该点到所有其他点qi的斜率,对每个斜率统计有多少个点符合。其中最多的个数加1(出发点本身)即为最多的共线点。

3. 特殊情况
当x1 = x2,y1!=y2时,为垂直连线。计算斜率时分母为0会出错。
当x1 = x2,y1 = y2时,两点重合。则(x2, y2)和所有(x1, y1)的连线共线。


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
class Solution {
public:
    int maxPoints(vector<Point> &points) {
        int maxPts = 0;
        for(int i=0; i<points.size(); i++) {
            int nMax = 0, nSame = 0, nInf = 0;
            unordered_map<float,int> comSlopes;
            
            for(int j=i+1; j<points.size(); j++) {
                if(points[j].x==points[i].x) {
                    if(points[j].y==points[i].y)
                        nSame++;
                    else
                        nInf++;
                    continue;
                }
                float slope = (float)(points[j].y-points[i].y)/(float)(points[j].x-points[i].x);
                comSlopes[slope]++;
                nMax = max(nMax, comSlopes[slope]);
            }
            
            nMax = max(nMax, nInf)+nSame+1;
            maxPts = max(maxPts,nMax);
        }
        return maxPts;
    }
};

[LeetCode] Word Ladder I, II

Word Ladder I

Given two words (start and end), and a dictionary, find the length of shortest transformation sequence from start to end, such that:
  1. Only one letter can be changed at a time
  2. Each intermediate word must exist in the dictionary
For example,
Given:
start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]
As one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
return its length 5.
Note:
  • Return 0 if there is no such transformation sequence.
  • All words have the same length.
  • All words contain only lowercase alphabetic characters.


Word Ladder II

Given two words (start and end), and a dictionary, find all shortest transformation sequence(s) from start to end, such that:
  1. Only one letter can be changed at a time
  2. Each intermediate word must exist in the dictionary
For example,
Given:
start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]
Return
  [
    ["hit","hot","dot","dog","cog"],
    ["hit","hot","lot","log","cog"]
  ]
Note:
  • All words have the same length.
  • All words contain only lowercase alphabetic characters.


思路:

LeetCode中为数不多的考图的难题。尽管题目看上去像字符串匹配题,但从“shortest transformation sequence from start to end”还是能透露出一点图论中最短路径题的味道。如何转化?

1. 将每个单词看成图的一个节点。
2. 当单词s1改变一个字符可以变成存在于字典的单词s2时,则s1与s2之间有连接。
3. 给定s1和s2,问题I转化成了求在图中从s1->s2的最短路径长度。而问题II转化为了求所有s1->s2的最短路径。

无论是求最短路径长度还是求所有最短路径,都是用BFS。在BFS中有三个关键步骤需要实现:

1. 如何找到与当前节点相邻的所有节点。
这里可以有两个策略:
(1) 遍历整个字典,将其中每个单词与当前单词比较,判断是否只差一个字符。复杂度为:n*w,n为字典中的单词数量,w为单词长度。
(2) 遍历当前单词的每个字符x,将其改变成a~z中除x外的任意一个,形成一个新的单词,在字典中判断是否存在。复杂度为:26*w,w为单词长度。
这里可以和面试官讨论两种策略的取舍。对于通常的英语单词来说,长度大多小于100,而字典中的单词数则往往是成千上万,所以策略2相对较优。

2. 如何标记一个节点已经被访问过,以避免重复访问。
可以将访问过的单词从字典中删除。

3. 一旦BFS找到目标单词,如何backtracking找回路径?



Word Ladder I

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
class Solution {
public:
    int ladderLength(string start, string end, unordered_set<string> &dict) {
        dict.insert(end);
        queue<pair<string,int>> q;
        q.push(make_pair(start,1));
        while(!q.empty()) {
            string s = q.front().first;
            int len = q.front().second;
            if(s==end) return len;
            q.pop();
            vector<string> neighbors = findNeighbors(s, dict);
            for(int i=0; i<neighbors.size(); i++) 
                q.push(make_pair(neighbors[i],len+1));
        }
        return 0;
    }
    
    vector<string> findNeighbors(string s, unordered_set<string> &dict) {
        vector<string> ret;
        for(int i=0; i<s.size(); i++) {
            char c = s[i];
            for(int j=0; j<26; j++) {
                if(c=='a'+j) continue;
                s[i] = 'a'+j;
                if(dict.count(s)) {
                    ret.push_back(s);    
                    dict.erase(s);    
                }
            }
            s[i] = c;
        }
        return ret;
    }
};


Word Ladder II