https://www.evernote.com/shard/s424/sh/abe1558e-b512-4ff0-929b-f305221166e0/3feb3686c7b488d341ddfec1629a836f

Web crawler design

https://codereview.stackexchange.com/questions/111183/multithreaded-webcrawler-in-java

  1. Design a logging system

https://www.careercup.com/question?id=5641031379845120

1) What kinds of alarms will be raised (emails, pageouts)?

We can define an alarm interface on the server side. Each alarm type (email, pageout) implements it, and a factory class returns the appropriate alarm instance.
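The interface-plus-factory idea can be sketched as below. This is only an illustration: the names `Alarm`, `EmailAlarm`, `PageoutAlarm`, and `AlarmFactory` are made up for these notes, and the alarms just format a string instead of sending anything.

```python
from abc import ABC, abstractmethod


class Alarm(ABC):
    """Common interface that every alarm channel implements (hypothetical name)."""

    @abstractmethod
    def raise_alarm(self, message: str) -> str:
        ...


class EmailAlarm(Alarm):
    def raise_alarm(self, message: str) -> str:
        # A real implementation would send mail; here we only format the text.
        return f"EMAIL: {message}"


class PageoutAlarm(Alarm):
    def raise_alarm(self, message: str) -> str:
        # A real implementation would page the on-call engineer.
        return f"PAGE: {message}"


class AlarmFactory:
    """Maps an alarm type name to the class that handles it."""

    _registry = {"email": EmailAlarm, "pageout": PageoutAlarm}

    @classmethod
    def create(cls, kind: str) -> Alarm:
        try:
            return cls._registry[kind]()
        except KeyError:
            raise ValueError(f"unknown alarm type: {kind}") from None


alarm = AlarmFactory.create("email")
print(alarm.raise_alarm("disk usage above 90%"))
```

Adding a new channel (for example SMS) would then only need a new class and one registry entry; callers keep using `AlarmFactory.create`.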

2) Does data need to be logged for future reference and how long?

3) How often do patterns change?

If they change frequently, we can use the observer pattern: the logging process running on each app server registers itself as an observer, and every new pattern added to the logging system is distributed to all app servers.
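A minimal in-process sketch of that observer setup follows. `PatternPublisher` and `AppServerLogger` are hypothetical names, patterns are treated as plain substrings (an assumption; the notes do not say what a pattern is), and in a real deployment the notification would travel over the network rather than a method call.

```python
class PatternPublisher:
    """Logging-system side: owns the patterns and notifies observers."""

    def __init__(self):
        self._patterns = []
        self._observers = []

    def register(self, observer):
        self._observers.append(observer)
        # Bring a late joiner up to date with the patterns already known.
        for pattern in self._patterns:
            observer.on_pattern_added(pattern)

    def add_pattern(self, pattern):
        self._patterns.append(pattern)
        for observer in self._observers:
            observer.on_pattern_added(pattern)


class AppServerLogger:
    """App-server side: keeps a local copy of the distributed patterns."""

    def __init__(self, name):
        self.name = name
        self.patterns = []

    def on_pattern_added(self, pattern):
        self.patterns.append(pattern)

    def matches(self, log_line):
        # Assumption: a pattern matches when it occurs as a substring.
        return [p for p in self.patterns if p in log_line]


publisher = PatternPublisher()
server_a = AppServerLogger("app-1")
publisher.register(server_a)
publisher.add_pattern("OutOfMemoryError")

server_b = AppServerLogger("app-2")   # registers after the pattern was added
publisher.register(server_b)
publisher.add_pattern("Timeout")
```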

4) To notify the logging system about events, we can use a simple messaging service.

Logging System

1) Pattern database and Pattern publisher.

2) Event Collector (messaging service)

3) Alarm Interface and implementing classes.

4) Event database

5) GUI to add new patterns and view event history.

App Server

1) Daemon process that monitors the patterns distributed by the logging system.

  2. Transactional KV store

Test-and-set, append-only file

https://redis.io/topics/persistence

https://redis.io/topics/transactions

https://redis.io/topics/persistence#append-only-file
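To make the two keywords concrete, here is a toy key-value store that logs every write to an append-only file and offers a test-and-set operation. It is a sketch under stated assumptions, not how Redis implements its AOF: the class name `AppendOnlyKV`, the JSON-lines record format, and the single-process, single-threaded usage are all made up for illustration, and a partially written last line after a crash is not handled.

```python
import json
import os
import tempfile


class AppendOnlyKV:
    def __init__(self, path):
        self.path = path
        self.data = {}
        # Rebuild the in-memory state by replaying the log from the start.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    self.data[record["key"]] = record["value"]

    def _append(self, key, value):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the write durable before acknowledging
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self._append(key, value)

    def test_and_set(self, key, expected, new):
        """Write `new` only if the current value equals `expected`."""
        if self.data.get(key) != expected:
            return False
        self._append(key, new)
        return True


path = os.path.join(tempfile.mkdtemp(), "kv.aof")
store = AppendOnlyKV(path)
store.set("balance", 100)
first = store.test_and_set("balance", 100, 80)    # succeeds: value was 100
second = store.test_and_set("balance", 100, 50)   # fails: value is now 80

recovered = AppendOnlyKV(path)                    # simulates a restart
print(first, second, recovered.get("balance"))
```

Only successful writes reach the file, so replaying the log after a restart reproduces the last committed state.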

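  3. System design: a fairly senior interviewer asked me to design Dropbox's sync pipeline, that is, how multiple clients sync with one server. Start from the simplest case (no conflicts): how to add a file, how to modify a file, how the API should be defined, how the server stores the data after a push, what it stores, what the data structures look like, and how other clients pull. I answered broadly along the lines of git. The interviewer was smiling the whole time, so I could not tell whether he was satisfied or not. https://blogs.dropbox.com/tech/2014/07/streaming-file-synchronization/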
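One way to picture the git-like answer for the no-conflict case is an append-only journal on the server plus a per-client cursor. The sketch below is an assumption-laden illustration, not Dropbox's actual design: `SyncServer`, `Client`, `Change`, `push`, and `pull` are hypothetical names, everything lives in memory, and conflicts and deletions are ignored.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    seq: int           # position in the server journal, starting at 1
    path: str
    content_hash: str  # blobs are stored by the hash of their content


class SyncServer:
    def __init__(self):
        self.journal = []   # append-only list of Change records
        self.blobs = {}     # content_hash -> bytes

    def push(self, path: str, content: bytes) -> int:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)   # identical content stored once
        change = Change(seq=len(self.journal) + 1, path=path, content_hash=digest)
        self.journal.append(change)
        return change.seq

    def pull(self, since: int):
        """Return every change with seq greater than the client's cursor."""
        return self.journal[since:]


class Client:
    def __init__(self, server):
        self.server = server
        self.cursor = 0     # seq of the last change this client has applied
        self.files = {}

    def save(self, path, content):
        self.files[path] = content
        self.server.push(path, content)

    def sync(self):
        for change in self.server.pull(self.cursor):
            self.files[change.path] = self.server.blobs[change.content_hash]
            self.cursor = change.seq


server = SyncServer()
laptop, phone = Client(server), Client(server)
laptop.save("notes.txt", b"v1")
phone.sync()
laptop.save("notes.txt", b"v2")
phone.save("todo.txt", b"v1")    # same bytes as the first notes.txt version
phone.sync()
laptop.sync()
```

A client re-applies its own changes on the next `sync`, which is harmless here because applying a change is idempotent.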

  4. Producer / consumer pattern

https://dzone.com/articles/producer-consumer-design

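Round 3: thumbnail system design

This round went badly. At first I did not realize it was a system design question, so I answered very passively and was mostly led along by the interviewer. The task is really just to design a simple system that uses a message queue and the producer/consumer pattern to accept an image-processing request and then process the given image.

How should the REST API on the producer side be designed? A simple POST to {domainName}/queue/{imageId}.

Which status codes should be used? 400, 403, 404 for client errors; 200 for success; 500 for server errors.

What should a message in the queue contain? imageId, jobId, metadata, timestamp.

What if the message cannot be put on the queue? Return 500 and log it.

What if the consumer cannot process a message? Log it, then discard or retry.

What if processing times out? Kill the thread.

How do you monitor the state of the consumers? Expose thread info through REST? Record the timestamp at which a thread picks up a job.

What problems can a consumer run into? The message is invalid and cannot be parsed; the image may no longer exist by the time it is processed.

If a consumer crashes, what happens to the images it had not finished? The producer should log an event whenever it puts a message on the queue; after a crash, compare the producer and consumer logs to find the unprocessed images.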


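What if the queue goes down? High availability: replicate the queue, and have the producer log its messages.

What needs to be considered when consumers run on multiple machines?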
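The questions above can be tied together in a small in-process sketch. It uses Python's standard `queue.Queue` and threads in place of a real message broker and REST framework; `enqueue`, `consume`, the in-memory `images` store, and the slicing "resize" are placeholders invented for illustration.

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue(maxsize=100)
images = {"img-1": b"raw-bytes-1", "img-2": b"raw-bytes-2"}  # stand-in image store
thumbnails = {}
failed = []   # messages the consumer could not process (would be logged)


def enqueue(image_id):
    """Producer side of POST /queue/{imageId}; returns an HTTP-style status code."""
    if image_id not in images:
        return 404
    message = {"imageId": image_id, "jobId": str(uuid.uuid4()),
               "metadata": {"size": "128x128"}, "timestamp": time.time()}
    try:
        jobs.put_nowait(message)
    except queue.Full:
        return 500            # message could not be queued: report and log
    return 200


def consume():
    while True:
        message = jobs.get()
        try:
            if message is None:           # sentinel: shut the worker down
                return
            image = images.get(message.get("imageId"))
            if image is None:             # image no longer exists
                failed.append(message)
                continue
            thumbnails[message["imageId"]] = image[:4]   # placeholder "resize"
        finally:
            jobs.task_done()


workers = [threading.Thread(target=consume) for _ in range(2)]
for w in workers:
    w.start()

statuses = [enqueue("img-1"), enqueue("img-2"), enqueue("missing")]
# Simulates an image that was deleted after its message was queued.
jobs.put({"imageId": "img-deleted", "jobId": "j-0"})

jobs.join()                   # wait until every queued message was handled
for _ in workers:
    jobs.put(None)
for w in workers:
    w.join()
```

With consumers on several machines, the in-process queue would be replaced by a broker, and two things become important: a message must be acknowledged only after its thumbnail is stored, and processing must be safe to repeat in case the same message is delivered twice.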

https://www.cloudamqp.com/blog/2015-09-03-part4-rabbitmq-for-beginners-exchanges-routing-keys-bindings.html

Message flow in RabbitMQ
  1. The producer publishes a message to an exchange. When you create the exchange, you have to specify its type. The different exchange types are described below.
  2. The exchange receives the message and is now responsible for routing it. Depending on the exchange type, it takes different message attributes into account, such as the routing key.
  3. Bindings have to be created from the exchange to queues. The exchange routes the message into the queues depending on message attributes.
  4. The messages stay in the queue until they are handled by a consumer.
  5. The consumer handles the message.

Exchange types:

  • Direct: A direct exchange delivers messages to queues based on a message routing key. In a direct exchange, the message is routed to the queues whose binding key exactly matches the routing key of the message. If the queue is bound to the exchange with the binding key pdfprocess, a message published to the exchange with a routing key pdfprocess will be routed to that queue.
  • Fanout: A fanout exchange routes messages to all of the queues that are bound to it.
  • Topic: The topic exchange does a wildcard match between the routing key and the routing pattern specified in the binding.
  • Headers: Headers exchanges use the message header attributes for routing.
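The routing rules for direct, fanout, and topic exchanges can be modelled in a few lines. This is an in-memory sketch for intuition only, not RabbitMQ client code: `route` and `topic_matches` are hypothetical helpers, headers exchanges are left out, and the topic matcher assumes the usual AMQP wildcards (`*` for exactly one dot-separated word, `#` for zero or more words).

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Match a dot-separated routing key against a topic binding pattern."""

    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb zero or more words of the routing key.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """bindings is a list of (queue_name, binding_key); returns target queues."""
    if exchange_type == "fanout":
        return [q for q, _ in bindings]            # routing key is ignored
    if exchange_type == "direct":
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")


bindings = [
    ("pdf_queue", "pdfprocess"),
    ("log_queue", "logs.#"),
    ("eu_queue", "*.eu.orders"),
]
print(route("direct", bindings, "pdfprocess"))
print(route("topic", bindings, "logs.app.error"))
print(route("topic", bindings, "shop.eu.orders"))
```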

AMQP (Advanced Message Queuing Protocol)

It provides flow-controlled, message-oriented communication with message-delivery guarantees such as at-most-once (where each message is delivered once or never), at-least-once (where each message is certain to be delivered, but may be delivered multiple times) and exactly-once (where the message will always arrive, and only once), and authentication and/or encryption based on SASL and/or TLS. It assumes an underlying reliable transport layer protocol such as Transmission Control Protocol (TCP).
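The difference between the guarantees is easiest to see with a lost acknowledgement. In the sketch below the sender retries until it gets an ack (at-least-once), and the consumer deduplicates by message id so each message takes effect only once. The names `IdempotentConsumer` and `send_at_least_once` are hypothetical, and the lost ack is simulated; nothing here is AMQP library code.

```python
class IdempotentConsumer:
    def __init__(self, lose_first_ack_for=()):
        self.seen = set()          # ids of messages already processed
        self.processed = []        # side effects, applied once per id
        self.deliveries = 0        # how many times the handler was invoked
        self._lose = set(lose_first_ack_for)

    def handle(self, msg_id, body):
        """Process a message; return True if the ack reaches the sender."""
        self.deliveries += 1
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.processed.append(body)
        if msg_id in self._lose:
            self._lose.discard(msg_id)
            return False           # simulate an ack lost in transit
        return True


def send_at_least_once(messages, consumer, max_attempts=5):
    """Redeliver each message until it is acknowledged."""
    for msg_id, body in messages:
        for _ in range(max_attempts):
            if consumer.handle(msg_id, body):
                break
        else:
            raise RuntimeError(f"message {msg_id} was never acknowledged")


consumer = IdempotentConsumer(lose_first_ack_for={"m2"})
send_at_least_once([("m1", "a"), ("m2", "b")], consumer)
print(consumer.deliveries, consumer.processed)
```

Sending each message exactly once without retrying would give at-most-once instead: the lost ack would go unnoticed, and a lost message would never be resent.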

  • Dropbox files at rest are encrypted using 256-bit Advanced Encryption Standard (AES)
  • Dropbox uses Secure Sockets Layer (SSL)/Transport Layer Security (TLS) to protect data in transit between Dropbox apps and Dropbox servers
  • SSL/TLS creates a secure tunnel protected by 128-bit or higher Advanced Encryption Standard (AES) encryption
  • Dropbox applications and infrastructure are regularly tested for security vulnerabilities and hardened to enhance security and protect against attacks
  • Two-step verification is available for an extra layer of security at login
  • If you use two-step verification, you can choose to receive security codes by text message or from an authenticator app
  • Public files are only viewable by people who have a link to the file(s)
