Yelp Design

P1 设计一个load balancer

太阁项目秀 Load Balancer原理和实战 太阁微项目 018 LoadBalancer(二)

P2 设计一个tiny url

https://zhuanlan.zhihu.com/p/21425634

class Solution {
public:

    // Encodes a URL to a shortened URL.
    string encode(string longUrl) {
        if (l2s.count(longUrl)) return l2s[longUrl];
        string shortUrl = GenerateShortUrl();
        l2s[longUrl] = shortUrl;
        s2l[shortUrl] = longUrl;
        return shortUrl;
    }

    // Decodes a shortened URL to its original URL.
    string decode(string shortUrl) {
        return s2l[shortUrl];
    }
private:
    string GenerateShortUrl() {
        return ConvertTo62(l2s.size());
    }
    string ConvertTo62(int number) {
        string Encode62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        string res;
        for (int i = 0; i < 6; ++i) {
            res = Encode62[number%62] + res;
            number /= 62;
        }
        return res;
    }
    unordered_map<string, string> l2s, s2l;
};

// Your Solution object will be instantiated and called as such:
// Solution solution;
// solution.decode(solution.encode(url));

P3 what make your DB query slow

  • Slow server network connection
  • Inadequate memory / page faults
  • Missing indexes
  • Index columns in wrong order or sorted improperly
  • Indexes missing columns (covering index)
  • Need to create indexed views
  • Replacing cursors with set-based operations
  • Table partitioning
  • Poor table normalization / denormalization design
  • Buffer pool for recent pages too small

link

  • Firstly I’ll check whether all the queries are running slow or any particular query is taking long time to execute. If all the queries are running slow I’ll check any blocking are there in the database engine. If any particular query is running slow I’ll take the execution plan and create missing indexes.
  • I’ll check if any indexes are fragmented in the disk. If the index pages are fragmented the same needs to be reorganized or rebuild.

P3 troubleshooting slow page 网页打开慢

client side, server side 经典面试题:用户反映你开发的网站访问很慢可能会是什么原因 bandwidth

P4 设计一个方法来判断两端文字的相似度,我用的是cosine similarity. 然后有一些follow-up 问题,比如如何来设置threshold,怎么根据现有的来training。比较难。我没答好

余弦定理的应用:基于文字的文本相似度计算

P5 问了database, index。 如何用几句话讲明白什么是index

http://blog.codinglabs.org/articles/theory-of-mysql-index.html

P6 inverted index。

第一题是给你一个list of strings, 每个strings是一个句子包含lots of terms. 输出每个terms所出现的index of strings

# documents = ['I love to eat burger', 'burger is good to eat'......]
# 'burger' = [0, 1, ...]
# 'eat' = [0, 1, ...]
# 'love = [0, ...]

hashmap解, T: $$O(M*N)$$, M是length of documents, N是number of terms in doc Follow up, 如何判断两个term是similar?

倒排索引(英语:Inverted index),也常被称为反向索引、置入档案或反向档案,是一种索引方法,被用来存储在全文搜索下某个单词在一个文档或者一组文档中的存储位置的映射。它是文档检索系统中最常用的数据结构。 以英文为例,下面是要被索引的文本:

$$T_{0}$$ = "it is what it is"

$$T_{1}$$ = "what is it"

$$T_{2}$$ ="it is a banana"

我们就能得到下面的反向文件索引:

 "a":      {2}
 "banana": {2}
 "is":     {0, 1, 2}
 "it":     {0, 1, 2}
 "what":   {0, 1}

P7 讲intern project,问garbage collection如何实现? 问了database index 是什么?如果有一个query 非常慢,为什么? 怎么找到原因。如何优化?如果是因为index的问题,解释为什么index影响query 速度。

Java垃圾回收机制一 简介–Java Garbage Collection Introduction “地球人都知道,Java有个东西叫垃圾收集器,它让创建的对象不需要像c/cpp那样delete、free掉,你能不能谈谈,GC是在什么时候,对什么东西,做了什么事情?”

P8 Cross-site request forgery csrf attack http://baike.baidu.com/link?url=FXeEnZO4X4RXeHUA23saNcwJs5yLd9DledQpatBm6lasT2NhtJjB-RhZ8Xnp7-W6lXY5emCjJW_2R70oG7FrKa

其实就是我的浏览器存了A网站的cookie。然后你在B网站上挂了一个链接,作用是向A网站提交一个转账的request。如果我点了B网站的链接他就会利用我A网站的cookie把这笔钱划走。 下面的博文讲得特别好! http://www.cnblogs.com/hyddd/archive/2009/04/09/1432744.html

P9 Cross Site Scripting xss attack http://baike.baidu.com/item/XSS%E6%94%BB%E5%87%BB/954065?fr=aladdin

其实就是在公共区域插入恶意代码,引导浏览的用户去点击执行,然后就会获取用户的cookie呀!比如我在一个用户开的帖子下面加回复。然后回复里面带恶意的javascript。

P10 SQL vs NoSQL

http://www.thegeekstuff.com/2014/01/sql-vs-nosql-db/?utm_source=tuicool

硅谷之路5:NoSQL就是把东西放在一起 https://www.bittiger.io/classpage/LHySgzKxNtEtL5eLz

P11 SQL injection

https://zh.wikipedia.org/wiki/SQL%E8%B3%87%E6%96%99%E9%9A%B1%E7%A2%BC%E6%94%BB%E6%93%8A

Horizontal scaling means that you scale by adding more machines into your pool of resources whereas Vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.

P12 Rate Limiter

硅谷之路 55:如何设计Rate Limiter 看似简单的一个问题:请求速率限制问题

static class TokenBucket {

  private final int capacity;
  private final int tokensPerSeconds;
  private int tokens = 0;
  private long timestamp = System.currentTimeMillis();

  public TokenBucket(int tokensPerUnit, TimeUnit unit) {
    capacity = tokensPerSeconds = (int) (tokensPerUnit / unit.toSeconds(1L));
  }

  public boolean take() {
    long now = System.currentTimeMillis();
    tokens += (int) ((now - timestamp) * tokensPerSeconds / 1000);
    if (tokens > capacity) tokens = capacity;
    timestamp = now;
    if (tokens < 1) return false;
    tokens--;
    return true;
  }

}

public static void main(String[] args) throws InterruptedException {
  TokenBucket bucket = new TokenBucket(250, TimeUnit.MINUTES);
  Thread.sleep(1000L);
  for (int i = 0; i < 5; i++) {
    System.out.println(bucket.take());
  }
  Thread.sleep(1000L);
  for (int i = 0; i < 5; i++) {
    System.out.println(bucket.take());
  }
}

P15 dns, tcp

楼主能否介绍下tcp,dns,数据库问的基本问题是啥啊? 明天就要面试了但是不知道这些会问道什么程度,还要去背对应英文

问的是从在浏览器里面敲yelp.com到返回网页中间有哪些环节,然后每蹦一个名词就让你大概讲讲这个是什么,其实不难,就是基础cs知识。

在浏览器地址栏输入一个URL后回车,背后会进行哪些技术步骤? 当···时发生了什么?

P16 MapReduce

硅谷之路9:深入浅出理解GFS 硅谷之路11:六字口诀掌握 MapReduce

results matching ""

    No results matching ""